Software Engineer, Safeguards Intelligence

Anthropic creates reliable, interpretable, and steerable AI systems, focusing on safe and beneficial AI development.
$300,000 - $320,000
Security
Senior Software Engineer
Hybrid
501 - 1,000 Employees
3+ years of experience
AI · Cybersecurity

Description For Software Engineer, Safeguards Intelligence

Anthropic is seeking a Software Engineer for its Safeguards Intelligence team to help build safety and oversight mechanisms for AI systems. This role focuses on developing systems to monitor and understand how users interact with AI models, particularly to detect and prevent potential misuse. The position combines technical software engineering skills with safety and security expertise.

The role involves building sophisticated monitoring systems, data analysis tools, and infrastructure for detecting novel patterns of abuse. You'll work closely with data scientists to track usage patterns and with threat investigators to enhance their capabilities. This is a critical position in ensuring AI systems remain safe and beneficial for users and society.

Anthropic offers a collaborative environment focused on high-impact AI research and development. The company approaches AI research as an empirical science, similar to physics and biology. It values team-based work on large-scale research efforts rather than smaller, isolated projects. The company maintains a strong focus on safety, transparency, and responsible oversight in AI development.

The position offers competitive compensation ($300,000-$320,000), comprehensive benefits, and a hybrid work arrangement requiring at least 25% of time in the office. Anthropic sponsors visas and values diverse perspectives, encouraging applications from candidates of all backgrounds. The company's mission-driven approach, focus on beneficial AI development, and commitment to empirical research make this an opportunity to contribute to significant advancements in safe AI technology.

Last updated 19 days ago

Responsibilities For Software Engineer, Safeguards Intelligence

  • Develop monitoring systems to detect unwanted behaviors from users and take automated enforcement actions
  • Build robust and reliable internal tools for rich data understanding and exploration
  • Work with data scientists to maintain situational awareness of usage patterns and trends
  • Build integrations with third-party data-enrichment vendors
  • Create infrastructure to power large scale, unsupervised learning techniques to detect novel patterns of abuse

Requirements For Software Engineer, Safeguards Intelligence

  • Bachelor's degree in Computer Science, Software Engineering or comparable experience
  • 3-10+ years of experience in software engineering, preferably with focus on integrity, spam, fraud, or abuse detection
  • Proficiency in Python, SQL, and data analysis tools
  • Strong communication skills and ability to explain complex technical concepts to non-technical stakeholders
  • Experience building trust and safety mechanisms for AI/ML systems (preferred)
  • Experience with prompt engineering, jailbreak attacks, and adversarial inputs (preferred)
  • Experience working with threat intelligence or investigative teams (preferred)

Benefits For Software Engineer, Safeguards Intelligence

  • Medical insurance
  • Visa sponsorship
  • Parental leave
  • Competitive compensation and benefits
  • Optional equity donation matching
  • Generous vacation
  • Flexible working hours
  • Office space
