Anthropic is seeking a Machine Learning Engineer for its Safeguards team to help build safety and oversight mechanisms for AI systems. The role combines technical ML expertise with a focus on ensuring AI safety and beneficial outcomes. The position offers a salary range of $340,000-$425,000 USD and is based in either San Francisco or New York City with a hybrid work arrangement.
The role involves building and implementing ML models for detecting harmful behaviors and ensuring user well-being, while upholding Anthropic's principles of safety, transparency, and oversight. Key responsibilities include developing detection systems for unwanted behaviors, improving enforcement mechanisms, and working closely with research teams to enhance model safety at the training stage.
Ideal candidates have 4+ years of experience in ML engineering or applied research, with expertise in Python, SQL, and trust and safety systems. Strong communication skills are essential, as the role involves explaining complex technical concepts to a range of stakeholders. A bachelor's degree in a related field, or equivalent experience, is required.
Anthropic offers a strong benefits package, including competitive compensation, equity donation matching, generous vacation and parental leave, and flexible working hours. The company maintains a collaborative environment and treats AI research as a big-science endeavor, akin to physics and biology. It values diversity and encourages applications from candidates of all backgrounds, recognizing that AI systems have significant social and ethical implications.
Anthropic operates as a public benefit corporation and maintains a hybrid work model requiring at least 25% office presence. It offers visa sponsorship and has a strong commitment to advancing safe and beneficial AI. This role is an opportunity to work on cutting-edge AI safety challenges while contributing to Anthropic's mission of creating reliable, interpretable, and steerable AI systems.