Trust and Safety Software Engineer

Anthropic creates reliable, interpretable, and steerable AI systems for safe and beneficial use.
$240,000 - $325,000
Security
Mid-Level Software Engineer
Hybrid
3+ years of experience
AI · Cybersecurity
This job posting may no longer be active.

Description For Trust and Safety Software Engineer

Anthropic is seeking a Trust and Safety Software Engineer to help build safety and oversight mechanisms for its AI systems. The role focuses on developing monitoring systems, abuse detection mechanisms, and multi-layered defenses to ensure the safe and ethical use of AI models. You'll work on detecting unwanted behaviors, preventing misuse, and protecting user well-being while enforcing terms of service and acceptable use policies.

Key responsibilities include:

  • Developing monitoring systems for API partners
  • Building abuse detection infrastructure
  • Surfacing abuse patterns to research teams
  • Implementing real-time safety mechanisms at scale
  • Analyzing user reports of inappropriate content

The ideal candidate has:

  • A Bachelor's degree in Computer Science or equivalent experience
  • 3-8+ years of software engineering experience, preferably in integrity or abuse detection
  • Proficiency in SQL, Python, and data analysis tools
  • Strong communication skills

Anthropic offers a competitive compensation package including salary, equity, and comprehensive benefits. The company provides a collaborative work environment focused on high-impact AI research and development. It values diversity and encourages applications from underrepresented groups.

Join Anthropic in its mission to create safe and beneficial AI systems that can positively impact society as a whole.

Last updated 6 months ago

Responsibilities For Trust and Safety Software Engineer

  • Develop monitoring systems to detect unwanted behaviors from API partners
  • Build abuse detection mechanisms and infrastructure
  • Surface abuse patterns to research teams
  • Build robust and reliable multi-layered defenses for real-time improvement of safety mechanisms
  • Analyze user reports of inappropriate content or accounts

Requirements For Trust and Safety Software Engineer

  • Bachelor's degree in Computer Science, Software Engineering or comparable experience
  • 3-8+ years of experience in a software engineering position
  • Proficiency in SQL, Python, and data analysis tools
  • Strong communication skills

Benefits For Trust and Safety Software Engineer

  • Equity donation matching
  • Health insurance
  • Dental insurance
  • Vision insurance
  • 401(k) with 4% matching
  • 22 weeks paid parental leave
  • Unlimited PTO
  • Education stipend
  • Home office improvement stipend
  • Commuting stipend
  • Wellness stipend
  • Fertility benefits
  • Daily lunches and snacks
  • Relocation support
