Research Engineer, Trust & Safety

Menlo Ventures Management, L.P

Location: San Francisco, CA, USA

Date: 2025-01-08T00:20:12Z

Job Description:
About the role

We are looking for Research Engineers to help design and build safety and oversight algorithms for our AI models and products. As a Trust and Safety Research Engineer, you will design and train ML models, informed by research progress, that detect harmful user and model behaviors and help ensure society's well-being. You will apply your research skills to uphold our principles of safety, transparency, and oversight while enforcing our terms of service and acceptable use policies.

What you will be working on:
  • Design, iterate on, and build ML models to detect unwanted or anomalous behaviors from both users and LLMs
  • Work with T&S ML engineers to review and iterate on experiment ideas. Co-author experiment success criteria and production deployment roadmaps
  • Partner with T&S Policy and Enforcement cross-functional teams to understand emerging and sustained abuse patterns from user prompts and behaviors. Incorporate the insights into T&S research datasets
  • Surface abuse patterns to sibling research teams in the company. Collaborate with them to harden Anthropic's LLMs at the pre- and post-training stages
  • Stay current with state-of-the-art research in AI and machine learning, and propose ways to apply these advancements to T&S systems

You may be a good fit if you:
  • Have 4+ years of experience in a research engineering or applied research scientist position, preferably with a focus on trust and safety
  • Have significant Python programming experience and machine learning experience
  • Have proficiency in building trustworthy and safe AI technology
  • Have strong communication skills and the ability to explain complex technical concepts to non-technical stakeholders
  • Care about the societal impacts and long-term implications of your work and are results-oriented

Strong candidates may also:
  • Have experience fine-tuning large language models with supervised learning or reinforcement learning
  • Have experience with machine learning frameworks like scikit-learn, TensorFlow, or PyTorch
  • Have experience authoring research papers in machine learning, NLP, or AI alignment, or similar industry experience
  • Have developed evaluations for language models