Senior Data Engineer

TENTH MOUNTAIN LLC

Location: New York, NY, USA

Date: 2024-10-01T11:31:05Z

Job Description:

Join KGES as a Senior Data Engineer

At KGES, we're dedicated to empowering veterans by connecting them with top-tier employers who value their unique military background. As a Senior Data Engineer, you'll have the opportunity to leverage your skills in data engineering, machine learning, and cloud platforms while making a meaningful impact. This is your chance to unlock your potential and embark on a fulfilling civilian career. Submit your information today and get ready to take the next step towards a brighter future with KGES!

Position: Senior Data Engineer

Location: Remote

Pay Rate: $120,000 annually

Role Overview:

As a Senior Data Engineer at KGES, you'll play a crucial role in migrating our existing data platform to an on-premises Hadoop environment. You'll be responsible for implementing standards, governance, and automation processes. This role requires strong coordination and communication skills, as you'll work closely with key stakeholders, data scientists, and other team members.

Key Responsibilities:

  • Design and Develop Data Pipelines:
    • Architect and implement scalable, efficient ETL (Extract, Transform, Load) pipelines using PySpark.
    • Optimize data processing workflows to handle large-scale datasets.
  • Machine Learning Model Development:
    • Develop and train machine learning models using appropriate algorithms and frameworks.
    • Collaborate with data scientists to translate models into production-ready code.
  • MLOps Implementation:
    • Establish and maintain automated CI/CD pipelines for machine learning models.
    • Implement version control for data, models, and code using tools like DVC, MLflow, or similar.
    • Monitor and automate the retraining of models as new data becomes available.
  • Data Quality and Governance:
    • Implement data validation, quality checks, and data governance best practices.
    • Ensure data lineage and documentation for reproducibility and compliance.
  • Performance Tuning:
    • Optimize PySpark jobs for performance, including tuning Spark configurations, optimizing shuffles, and managing memory.
    • Profile and debug PySpark applications to identify and resolve performance bottlenecks.
  • Integration with Cloud Platforms:
    • Deploy and manage data pipelines and machine learning models on cloud platforms such as AWS.
    • Utilize cloud-native services for data storage, processing, and orchestration.
  • Collaboration and Communication:
    • Work closely with data scientists, software engineers, and DevOps teams to integrate machine learning models into the broader software infrastructure.
    • Collaborate with business stakeholders to understand requirements and ensure the successful deployment of machine learning solutions.
    • Effectively communicate technical concepts and project status to non-technical stakeholders.

Qualifications:

  • 8+ years of engineering experience.
  • Strong experience with PySpark and Hadoop (on-prem).
  • Experience with MLOps tools like DVC, MLflow, or similar.
  • Proficiency with Jupyter, DataRobot, or similar tools.
  • Experience with AWS SageMaker.

Why KGES?

At KGES, we believe in your potential and are committed to helping you transition smoothly into a rewarding civilian career. Join our team and become part of a company that values your skills, experience, and dedication.

Ready to Apply?

Submit your information today and take the next step towards a brighter future with KGES!

