We are looking for a highly skilled GCP Data Engineer with a strong background in either Python or Java to join our team. As a GCP Data Engineer, you will design, develop, and maintain data processing systems on Google Cloud Platform (GCP) to handle large-scale datasets efficiently. You will work closely with cross-functional teams to build and deploy robust data pipelines, ensuring data availability, accuracy, and accessibility for business analytics and insights.
Key Responsibilities:
- Design and implement scalable data pipelines using GCP services such as BigQuery, Cloud Storage, Dataflow, Pub/Sub, and Cloud Composer (a brief illustrative sketch follows this list).
- Develop data integration solutions to extract, transform, and load (ETL) data from various structured and unstructured data sources.
- Collaborate with data scientists, analysts, and software engineers to optimize data workflows and ensure high-performance data processing.
- Build and maintain automated processes for data ingestion, transformation, and analysis.
- Write efficient, reusable, and reliable code primarily in Python or Java for data processing tasks.
- Optimize and troubleshoot data pipelines for maximum reliability and performance.
- Ensure data governance and security standards are implemented in all aspects of the data pipeline.
- Implement and manage CI/CD pipelines for deploying and monitoring data workflows.
- Work with the operations team to ensure the availability, reliability, and performance of data infrastructure.
- Perform data quality checks, validation, and documentation.
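To give a flavor of the pipeline work described above, here is a minimal sketch (not a template we use) of a streaming Dataflow job written with the Apache Beam Python SDK: it reads JSON events from Pub/Sub and appends parsed rows to BigQuery. The project, topic, table, and field names are all placeholders.

```python
import json

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions


def parse_event(message: bytes) -> dict:
    # Hypothetical payload shape: {"user_id": "...", "action": "..."}
    event = json.loads(message.decode("utf-8"))
    return {"user_id": event["user_id"], "action": event["action"]}


def run() -> None:
    # streaming=True because the source is an unbounded Pub/Sub topic.
    options = PipelineOptions(streaming=True)
    with beam.Pipeline(options=options) as pipeline:
        (
            pipeline
            | "ReadFromPubSub" >> beam.io.ReadFromPubSub(
                topic="projects/my-project/topics/events"  # placeholder topic
            )
            | "ParseJson" >> beam.Map(parse_event)
            | "WriteToBigQuery" >> beam.io.WriteToBigQuery(
                "my-project:analytics.events",  # placeholder table
                schema="user_id:STRING,action:STRING",
                write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
            )
        )


if __name__ == "__main__":
    run()
```

In day-to-day work the same pipeline would also carry error handling, dead-letter output, and tests; this sketch only shows the shape of the task.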
Qualifications:
- Bachelor's or Master's degree in Computer Science, Data Engineering, Information Technology, or a related field.
- 3+ years of experience in a data engineering role, ideally on a public cloud platform (Google Cloud Platform preferred).
- Proficiency in Python or Java for developing data pipelines and ETL processes.
- Strong experience with GCP services such as BigQuery, Dataflow, Cloud Storage, Pub/Sub, Cloud Functions, and Cloud Composer.
- Hands-on experience with SQL and query optimization for processing large datasets (see the example after this list).
- Knowledge of Apache Beam or Apache Kafka is a plus.
- Experience with version control (Git) and CI/CD pipelines (e.g., Jenkins, Cloud Build).
- Familiarity with data modeling, schema design, and data warehousing concepts.
- Strong problem-solving and troubleshooting skills.
- Excellent communication and collaboration skills, with the ability to work in a team-oriented environment.
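As a small example of the SQL and query-optimization work mentioned above, the sketch below uses the google-cloud-bigquery Python client. It assumes a table that is date-partitioned on an event_date column, so the WHERE clause lets BigQuery prune partitions and scan days of data rather than the whole table; every name here is a placeholder.

```python
from google.cloud import bigquery

client = bigquery.Client(project="my-project")  # placeholder project

# Filtering on the (assumed) partition column event_date prunes
# partitions, reducing both bytes scanned and query cost.
query = """
    SELECT user_id, COUNT(*) AS action_count
    FROM `my-project.analytics.events`
    WHERE event_date BETWEEN '2024-01-01' AND '2024-01-07'
    GROUP BY user_id
    ORDER BY action_count DESC
    LIMIT 100
"""

for row in client.query(query).result():
    print(row.user_id, row.action_count)
```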
Preferred Qualifications:
- Experience with other cloud platforms like AWS or Azure.
- Knowledge of Terraform or other infrastructure-as-code tools.
- Google Cloud Professional Data Engineer certification.
- Exposure to machine learning workflows or data science projects.