Research Engineer, Trust & Safety

Menlo Ventures Management, L.P

Location: San Francisco, CA, USA

Date: 2025-01-08T00:20:12Z

Job Description:

About the role

We are looking for Research Engineers to help design and build safety and oversight algorithms for our AI models and products. As a Trust and Safety Research Engineer, you will design and train ML models, informed by research progress, that detect harmful user and model behaviors and help ensure society's well-being. You will apply your research skills to uphold our principles of safety, transparency, and oversight while enforcing our terms of service and acceptable use policies.

What you will be working on:

- Design, iterate on, and build ML models to detect unwanted or anomalous behaviors from both users and LLMs
- Work with T&S ML engineers to review and iterate on experiment ideas. Co-author the experiment success criteria and production deployment roadmaps
- Partner with T&S Policy and Enforcement cross-functional teams to understand emerging and sustained abuse patterns from user prompts and behaviors. Incorporate the insights into T&S research datasets
- Surface abuse patterns to sibling research teams in the company. Collaborate to harden Anthropic's LLMs at the pre-training and post-training stages
- Stay current with state-of-the-art research in AI and machine learning, and propose ways to apply these advancements to T&S systems

You may be a good fit if you:
- Have 4+ years of experience in a research engineering or an applied research scientist position, preferably with a focus on trust and safety
- Have significant Python programming experience and machine learning experience
- Have proficiency in building trustworthy and safe AI technology
- Have strong communication skills and ability to explain complex technical concepts to non-technical stakeholders
- Care about the societal impacts and long-term implications of your work and are results-oriented

Strong candidates may also:
- Have experience fine-tuning large language models with supervised learning or reinforcement learning
- Have experience with machine learning frameworks like scikit-learn, TensorFlow, or PyTorch
- Have experience authoring research papers in machine learning, NLP, or AI alignment, or similar industry experience
- Have developed evaluations for language models
