Research Engineer / Scientist – Safeguards

Company: Anthropic
Location: San Francisco, CA, USA; New York, NY, USA
Salary: $320,000 – $560,000
Type: Full-Time
Degrees: Bachelor’s
Experience Level: Mid Level, Senior

Requirements

  • Have significant software, ML, or research engineering experience
  • Have some experience contributing to empirical AI research projects
  • Have some familiarity with technical AI safety research
  • Education: at least a Bachelor’s degree in a related field, or equivalent experience

Responsibilities

  • Test the robustness of our safety techniques by training language models to subvert them, and measure how effective those adversarial models are at evading our interventions.
  • Run multi-agent reinforcement learning experiments to test techniques like AI Debate.
  • Build tooling to efficiently evaluate the effectiveness of novel LLM-generated jailbreaks.
  • Write scripts and prompts to efficiently produce evaluation questions to test models’ reasoning abilities in safety-relevant contexts.
  • Contribute ideas, figures, and writing to research papers, blog posts, and talks.
  • Run experiments that feed into key AI safety efforts at Anthropic, like the design and implementation of our Responsible Scaling Policy.

Preferred Qualifications

  • Have experience authoring research papers in machine learning, NLP, or AI safety
  • Have experience with LLMs
  • Have experience with reinforcement learning
  • Have experience with Kubernetes clusters and complex shared codebases