Lead Data Engineer

Clearpoint

Location: Houston, TX, USA

Date: 2024-11-16T07:34:42Z

Job Description:

TITLE: Lead Data Engineer
LOCATION: Houston, Texas
TYPE: Direct Hire
SALARY: $220,000 - $240,000

SUMMARY:
The Lead Data Engineer will play a crucial role in architecting, implementing, and managing robust, scalable data infrastructure. This position demands a blend of systems engineering, data integration, and data analytics skills to enhance data capabilities, supporting advanced analytics, machine learning projects, and real-time data processing needs.

DUTIES:
- Design and implement scalable and reliable data pipelines to ingest, process, and store diverse data at scale, using technologies such as Apache Spark, Hadoop, and Kafka.
- Work within cloud environments like AWS or Azure to leverage services including but not limited to EC2, RDS, S3, Lambda, and Azure Data Lake for efficient data handling and processing.
- Develop and optimize data models and storage solutions (SQL, NoSQL, Data Lakes) to support operational and analytical applications, ensuring data quality and accessibility.
- Utilize ETL tools and frameworks (e.g., Apache Airflow, Talend) to automate data workflows, ensuring efficient data integration and timely availability of data for analytics.
- Collaborate closely with data scientists, providing the data infrastructure and tools needed for complex analytical models, leveraging Python or R for data processing scripts.
- Ensure compliance with data governance and security policies, implementing best practices in data encryption, masking, and access controls within a cloud environment.
- Monitor and troubleshoot data pipelines and databases for performance issues, applying tuning techniques to optimize data access and throughput.
- Stay abreast of emerging technologies and methodologies in data engineering, advocating for and implementing improvements to the data ecosystem.

REQUIREMENTS:
- 7+ years of experience in data engineering, with a proven track record in designing and operating large-scale data pipelines and architectures
- Expertise in developing ETL/ELT workflows
- Comprehensive knowledge of platforms and services like Databricks, Dataiku, and AWS native data offerings
- Solid experience with big data technologies (Apache Spark, Hadoop, Kafka) and cloud services (AWS, Azure) related to data processing and storage
- Strong experience in AWS and Azure cloud services, with hands-on experience in integrating cloud storage and compute services with Databricks
- Proficient in SQL and programming languages relevant to data engineering (Python, Java, Scala)
- Hands-on RDBMS experience (data modeling, analysis, programming, stored procedures)
- Familiarity with machine learning model deployment and management practices is a plus
- Strong communication skills, capable of collaborating effectively across technical and non-technical teams

EDUCATION:
- Bachelor's Degree in computer science, MIS, or other business discipline and 1 Req, or Master's Degree in computer science, MIS, or other business discipline
- AWS Certified Solutions Architect: Preferred
- Databricks Certified Associate Developer for Apache Spark: Preferred
- Azure Data Engineer Associate or other relevant certifications: Preferred
