Location: All cities, CA, USA
Our product is doing for LLMs what Google did for websites. In the early internet, the number of websites was exploding and it was hard to figure out what website you should use for what task. Google fixed that problem by building a search engine that aggregated websites across the internet. A similar problem exists in AI today; the number of models is exploding and it's hard to figure out what model you should use for what task. Our product fixes that problem through a model router: You give us your prompt, we run it on the best model in real time.
We can do this because we've learned how to predict the performance of a model without running it. That lets us find the model that can complete your request with the highest performance and lowest cost. The value proposition is simple: stop worrying about AI, start focusing on product.
That idea -- making it so that people can stop worrying about AI -- is the core of what we do. Model routing is just the first tool we're building to help understand how models behave. By pioneering techniques like this, we want to solve the most fundamental problem in AI: understanding why models behave the way they do, and creating guarantees they'll behave the way we want.
About the role:
As a research scientist with Martian, you will develop new techniques to understand how AI models work. This work will focus on exploring and improving a technique we call “model mapping”: converting transformers into more interpretable representations (such as programs). We are looking for people who can develop and scale up methods for making transformers more interpretable through model mapping and then understanding the transformers in the new domain we map to.
Responsibilities may include:
Designing experiments to measure the effectiveness of model mapping techniques
Studying models and how they can be turned into programs
Managing and exploring large datasets from interpretability experiments
Investigating the internal operations of large language models
Designing novel approaches for understanding how large language models work
AI Research Content Development
Produce in-depth technical content on model interpretability and LLM routing
Write research papers for top AI conferences and journals
Create detailed blog posts explaining complex AI concepts
Technical Community Engagement
Participate in technical discussions with AI researchers on social media
Represent Martian's technical perspective in AI forums and conferences
Develop and maintain relationships with key technical influencers in AI
You'll thrive in this role if you:
Are excited about Martian's mission to understand AI and build better AI tooling
Want to discover the algorithms underlying intelligence and are motivated to spend your career exploring how models work
Enjoy a fast-paced startup environment
Have experience implementing ML algorithms (e.g. in PyTorch) and running distributed training (e.g. with PyTorch Lightning or DeepSpeed)
Enjoy writing clean code and thinking about programming languages
Have an interest in mechanistic interpretability