Location: Sunnyvale, CA, USA
Introduction:
As a Distributed Systems Engineer – LLM Platform, you'll develop scalable systems to deploy and manage large language models. If tackling complex distributed systems challenges and working on advanced AI platforms excites you, this role offers an ideal opportunity.
About the Company:
A leader in AI innovation, this company empowers enterprises to build secure, efficient AI systems. Backed by top-tier VCs, it focuses on solving real-world problems in LLM deployment and infrastructure.
About the Role:
You'll design Kubernetes-based platforms for LLM tuning and inference, manage GPU clusters, and resolve distributed system issues. Collaboration with machine learning engineers and product teams will be central to building cutting-edge AI solutions.
What We Can Offer You
Key Responsibilities
Keywords: Distributed Systems, Machine Learning, Software Development, GPU, Graphics Processing Unit, Kubernetes, Large Language Models, LLMs, Enterprise Software, ML Infra, ML Infrastructure, LLM Inference, LLM Tuning, GPU Clusters