Research Engineer – Tokens ML Infra

Company: Anthropic
Location: San Francisco, CA, USA
Salary: $315,000 – $425,000
Type: Full-Time
Degrees: Master’s, PhD
Experience Level: Mid Level, Senior

Requirements

  • Strong software engineering skills with experience in building distributed systems
  • Expertise in Python and experience with distributed computing frameworks
  • Deep understanding of cloud computing platforms and distributed systems architecture
  • Experience with high-throughput, fault-tolerant system design
  • Strong background in performance optimization and system scaling
  • Excellent problem-solving skills and attention to detail
  • Strong communication skills and ability to work in a collaborative environment

Responsibilities

  • Design and implement high-performance ML training infrastructure for large language model research
  • Develop and maintain core ML framework primitives in frameworks such as JAX and PyTorch
  • Create robust automated evaluation and benchmarking systems for model performance
  • Implement comprehensive monitoring and debugging tools for ML workflows
  • Design and optimize data loading pipelines that maximize training throughput
  • Build MLOps tooling to support reproducible research and experimentation
  • Collaborate with research teams to prototype and scale novel training architectures
  • Develop infrastructure for efficient hyperparameter sweeps and architecture search

Preferred Qualifications

  • Advanced degree (MS or PhD) in Computer Science or related field
  • Experience with language model training infrastructure
  • Strong background in distributed systems and parallel computing
  • Expertise in tokenization algorithms and techniques
  • Experience building high-throughput, fault-tolerant systems
  • Deep knowledge of monitoring and observability practices
  • Experience with infrastructure-as-code and configuration management
  • Background in MLOps or ML infrastructure