Anthropic is seeking a Machine Learning Engineer for its Safeguards team to help build safety and oversight mechanisms for AI systems. The role combines technical ML expertise with a focus on ensuring AI safety and beneficial outcomes. The position offers a salary range of $340,000-$425,000 USD and is based in either San Francisco or New York City with a hybrid work arrangement.
The role involves building and implementing ML models for detecting harmful behaviors and ensuring user well-being, while upholding Anthropic's principles of safety, transparency, and oversight. Key responsibilities include developing detection systems for unwanted behaviors, improving enforcement mechanisms, and working closely with research teams to enhance model safety at the training stage.
Ideal candidates have 4+ years of experience in ML engineering or applied research, with expertise in Python, SQL, and trust and safety systems. Strong communication skills are essential, as the role involves explaining complex technical concepts to a range of stakeholders. A bachelor's degree in a related field, or equivalent experience, is required.
Anthropic offers a strong benefits package, including competitive compensation, equity donation matching, generous vacation and parental leave, and flexible working hours. The company maintains a collaborative environment and treats AI research as a big-science endeavor, akin to physics and biology. It values diversity and encourages applications from candidates of all backgrounds, recognizing that AI systems have significant social and ethical implications.
Anthropic operates as a public benefit corporation and maintains a hybrid work model requiring at least 25% office presence. It offers visa sponsorship and has a strong commitment to advancing safe and beneficial AI. This role is an opportunity to work on cutting-edge AI safety challenges while contributing to Anthropic's mission of creating reliable, interpretable, and steerable AI systems.