Research Engineer, Interpretability

Anthropic's mission is to create reliable, interpretable, and steerable AI systems.
£230,000 - £515,000 GBP
Machine Learning
Senior Software Engineer
Hybrid
101 - 500 Employees
5+ years of experience
This job posting may no longer be active.

Description For Research Engineer, Interpretability

Anthropic is seeking a Research Engineer for its Interpretability team in London. The role focuses on reverse engineering how trained models work, with the goal of making advanced AI systems safe through mechanistic understanding. Key responsibilities include implementing research experiments, optimizing workflows, building tools for rapid experimentation, and developing infrastructure to support model safety improvements. The ideal candidate has 5-10+ years of software engineering experience, proficiency in at least one programming language (especially Python), and a strong ability to prioritize impactful work. Experience with machine learning, language modeling, and GPU optimization is beneficial. The role offers competitive compensation, including a salary range of £230,000 to £515,000 GBP, equity, and comprehensive benefits. Anthropic values diversity and encourages applications from underrepresented groups. The company operates on a hybrid work model, requiring at least 25% in-office presence, and offers visa sponsorship for eligible candidates.

Last updated a year ago

Responsibilities For Research Engineer, Interpretability

  • Implement and analyze research experiments, both quickly in toy scenarios and at scale in large models
  • Set up and optimize research workflows to run efficiently and reliably at large scale
  • Build tools and abstractions to support a rapid pace of research experimentation
  • Develop and improve tools and infrastructure to support other teams in using Interpretability's work to improve model safety

Requirements For Research Engineer, Interpretability

  • 5-10+ years of experience building software
  • Highly proficient in at least one programming language (e.g., Python, Rust, Go, Java) and productive with Python
  • Strong ability to prioritize and direct effort toward the most impactful work
  • Comfortable operating with ambiguity and questioning assumptions
  • Interest in learning about machine learning research and its applications
  • Willingness to collaborate closely with researchers
  • Care about the societal impacts and ethics of your work

Benefits For Research Engineer, Interpretability

  • Equity
  • Health insurance
  • Dental insurance
  • Vision insurance
  • 401k with 4% matching
  • 22 weeks paid parental leave
  • Unlimited PTO
  • Education stipend
  • Home office improvement stipend
  • Commuting stipend
  • Wellness stipend
  • Fertility benefits
  • Daily lunches and snacks in office
  • Relocation support