Location: San Francisco, CA, USA
Job Title: Sr. Data Engineer
Location: Remote or Hybrid (San Francisco, CA)
Duration: 6 Months
Job Description
Design, develop, and manage data pipelines and workflows to enable efficient and
accurate data processing using Trino SQL/Spark SQL on datasets warehoused in HDFS.
Perform code design effectively and review/approve test cases.
Implement data quality checks and audits to maintain high data accuracy and integrity.
Produce elegant, efficient designs and high-performance, scalable code that
allows for easy extension to future needs.
Collaborate with cross-functional teams, especially data engineering, to understand
data requirements and implement robust data solutions.
Work closely with data domain experts to gather data requirements, translate business
needs into technical specifications, and communicate data insights effectively to
support sales representative workflow efficiency.
Optimize data storage for performance and scalability, ensuring efficient data
extraction, transformation, and loading (ETL).
Develop and maintain documentation related to data pipelines, QA, metrics, and data
policy as it relates to best practice, compliance and GDPR.
Stay up to date with industry best practices and emerging trends in data engineering
and analytics, including Generative AI as it impacts our data operations.
Qualifications:
2+ years of experience using SQL and optimizing SQL databases for performance
(Trino SQL or Spark).
Demonstrated experience in managing data pipelines (like HDFS), data repositories (like
GitHub), workflows (like Apache Airflow), and ETL (best-practice coding).
Ability to communicate complex technical concepts to both technical and non-technical
individuals.
Experience working with multiple stakeholders, setting project priorities and delivering
on Objectives and Key Results (OKRs).
Experience automating script changes in Python.
Preferred Qualifications:
BA/BS in engineering, computer science, or a related technical field (such as statistics or
data science).
Excellent analytical skills, including designing data workflows, analyzing data for
anomalies, or setting data quality thresholds via automated solutions.
Familiarity with data governance principles.
Program Manager experience.
Demonstrated experience in managing data pipelines in HDFS.
Experience running a scrum team and using Jira.
Spark SQL
Suggested Skills:
Data Analysis
Project Management
Data Engineering