Machine Learning Engineer, Trust & Safety

Anthropic's mission is to create reliable, interpretable, and steerable AI systems. We want AI to be safe and beneficial for our users and for society as a whole.
$300,000 - $405,000
Machine Learning
Senior Software Engineer
Hybrid
101 - 500 Employees
4+ years of experience

Description For Machine Learning Engineer, Trust & Safety

Anthropic is seeking a Machine Learning Engineer for Trust & Safety to help build safety and oversight mechanisms for their AI systems. The role involves training models to detect harmful behaviors, ensure user well-being, and uphold principles of safety, transparency, and oversight while enforcing terms of service and acceptable use policies.

Key responsibilities include:

  • Building ML models to detect unwanted or anomalous behaviors from users and API partners
  • Improving automated detection and enforcement systems
  • Analyzing user reports of inappropriate accounts and building proactive detection models
  • Surfacing abuse patterns to research teams to harden models at the training stage

The ideal candidate should have:

  • 4+ years of experience in research/ML engineering or applied research, preferably in trust and safety
  • Proficiency in SQL, Python, and data analysis/mining tools
  • Experience building trust and safety AI/ML systems (e.g., behavioral classifiers, anomaly detection)
  • Strong communication skills to explain complex technical concepts
  • Concern for the societal impacts and long-term implications of their work

Additional valuable experience includes:

  • Familiarity with ML frameworks like scikit-learn, TensorFlow, or PyTorch
  • Experience with high-performance, large-scale ML systems
  • Knowledge of language modeling with transformers
  • Background in reinforcement learning
  • Experience with large-scale ETL

Anthropic offers a competitive compensation package, including salary, equity, and benefits. They provide a collaborative work environment, flexible hours, and the opportunity to work on impactful AI research and development.

The company values diversity and encourages applications from underrepresented groups. They offer visa sponsorship for eligible candidates and have a hybrid work policy requiring at least 25% in-office presence.

Join Anthropic to contribute to the development of safe and beneficial AI systems while working with a committed team of researchers, engineers, and experts in a rapidly growing field.

Last updated 9 months ago

Responsibilities For Machine Learning Engineer, Trust & Safety

  • Build ML models to detect unwanted behaviors
  • Improve automated detection and enforcement systems
  • Analyze user reports and build proactive detection models
  • Surface abuse patterns to research teams

Requirements For Machine Learning Engineer, Trust & Safety

  • 4+ years of experience in research/ML engineering or applied research, preferably in trust and safety
  • Proficiency in SQL, Python, and data analysis/mining tools
  • Experience building trust and safety AI/ML systems
  • Strong communication skills
  • Care about societal impacts of AI work

Benefits For Machine Learning Engineer, Trust & Safety

  • Equity
  • Health insurance
  • Dental insurance
  • Vision insurance
  • 401(k) with 4% matching
  • 22 weeks paid parental leave
  • Unlimited PTO
  • Education stipend
  • Wellness stipend
  • Fertility benefits
  • Daily lunches and snacks
  • Relocation support
