Distributed Systems Engineer - LLM Platform

Acceler8 Talent

Location: Sunnyvale, CA, USA

Date: 2025-01-06T04:10:07Z

Job Description:

Introduction:

As a Distributed Systems Engineer – LLM Platform, you'll develop scalable systems to deploy and manage large language models. If tackling complex distributed systems challenges and working on advanced AI platforms excites you, this role offers an ideal opportunity.

About the Company:

A leader in AI innovation, this company empowers enterprises to build secure, efficient AI systems. Backed by top-tier VCs, it focuses on solving real-world problems in LLM deployment and infrastructure.

About the Role:

You'll design Kubernetes-based platforms for LLM tuning and inference, manage GPU clusters, and resolve distributed systems issues. You'll work closely with machine learning engineers and product teams to build cutting-edge AI solutions.

What We Can Offer You

  • Competitive salary: $150,000–$200,000 per year.
  • Work on state-of-the-art AI platforms and infrastructure.
  • A collaborative, fast-paced environment with room for growth.

Key Responsibilities

  • Develop and maintain Kubernetes-based platforms for LLMs.
  • Manage GPU clusters in private data centers.
  • Debug complex distributed systems.
  • Collaborate across teams to deliver scalable enterprise solutions.

Keywords: Distributed Systems, Machine Learning, Software Development, GPU, Graphics Processing Unit, Kubernetes, Large Language Models, LLMs, Enterprise Software, ML Infra, ML Infrastructure, LLM Inference, LLM Tuning, GPU Clusters
