Machine Learning Engineer, Trust & Safety

Anthropic

Anthropic's mission is to create reliable, interpretable, and steerable AI systems. We want AI to be safe and beneficial for our users and for society as a whole.

San Francisco Bay Area, CA, USA • New York, NY, USA

$300,000 - $405,000

Machine Learning

Senior Software Engineer

Hybrid

101 - 500 Employees

4+ years of experience

Description For Machine Learning Engineer, Trust & Safety

Anthropic is seeking a Machine Learning Engineer for Trust & Safety to help build safety and oversight mechanisms for its AI systems. The role involves training models to detect harmful behaviors, protect user well-being, and uphold principles of safety, transparency, and oversight while enforcing the terms of service and acceptable use policies.

Key responsibilities include:

Building ML models to detect unwanted or anomalous behaviors from users and API partners
Improving automated detection and enforcement systems
Analyzing user reports of inappropriate accounts and building proactive detection models
Surfacing abuse patterns to research teams to harden models at the training stage

The ideal candidate should have:

4+ years of experience in research/ML engineering or applied research, preferably in trust and safety
Proficiency in SQL, Python, and data analysis/mining tools
Experience building trust and safety AI/ML systems (e.g., behavioral classifiers, anomaly detection)
Strong communication skills to explain complex technical concepts
Genuine concern for the societal impacts and long-term implications of their work

Additional valuable experience includes:

Familiarity with ML frameworks like scikit-learn, TensorFlow, or PyTorch
Experience with high-performance, large-scale ML systems
Knowledge of language modeling with transformers
Background in reinforcement learning
Experience with large-scale ETL

Anthropic offers a competitive compensation package, including salary, equity, and benefits. It provides a collaborative work environment, flexible hours, and the opportunity to work on impactful AI research and development.

The company values diversity and encourages applications from underrepresented groups. It offers visa sponsorship for eligible candidates and has a hybrid work policy requiring at least 25% in-office presence.

Join Anthropic to contribute to the development of safe and beneficial AI systems while working with a committed team of researchers, engineers, and experts in a rapidly growing field.

Last updated a year ago

Responsibilities For Machine Learning Engineer, Trust & Safety

Build ML models to detect unwanted behaviors
Improve automated detection and enforcement systems
Analyze user reports and build proactive detection models
Surface abuse patterns to research teams

Requirements For Machine Learning Engineer, Trust & Safety

4+ years of experience in research/ML engineering or applied research, preferably in trust and safety
Proficiency in SQL, Python, and data analysis/mining tools
Experience building trust and safety AI/ML systems
Strong communication skills
Concern for the societal impacts of AI work

Benefits For Machine Learning Engineer, Trust & Safety

Equity
Health insurance
Dental insurance
Vision insurance
401(k) with 4% matching
22 weeks paid parental leave
Unlimited PTO
Education stipend
Wellness stipend
Fertility benefits
Daily lunches and snacks
Relocation support