Evals Software Engineer (Infrastructure focus)

Apollo Research

Apollo Research focuses on behavioral model evaluations and auditing real-world AI models, specializing in studying deceptive alignment in AI systems.

London, UK

Backend

Senior Software Engineer

In-Person

11 - 50 Employees

5+ years of experience

AI · Cybersecurity

Description For Evals Software Engineer (Infrastructure focus)

Apollo Research is seeking a Senior Software Engineer to lead its infrastructure initiatives for AI evaluation systems. This role combines backend engineering with AI safety research, focusing on building secure and scalable systems for evaluating frontier AI models. The position offers an opportunity to work with a team of researchers and engineers in central London, contributing to critical work in AI safety and alignment.

The role involves designing and implementing infrastructure for running frontier language model evaluations, making key technical decisions about the stack, and building essential internal tools. You'll work closely with researchers to understand and support their needs while ensuring robust security practices across all systems. The position requires expertise in Python, Kubernetes, and AWS, with a strong focus on Infrastructure as Code and security best practices.

As part of Apollo Research's mission to understand and prevent deceptive alignment in AI systems, you'll be working on projects that directly impact the safety and reliability of frontier AI systems. The company offers a collaborative environment, working alongside team members like Mikita Balesni, Jérémy Scheurer, and others who are leading figures in AI safety research.

The position comes with excellent benefits, including competitive compensation, flexible working hours, unlimited vacation, and comprehensive relocation support through the UK Government's AI Futures Grants program. The role is based in London, sharing office space with the London Initiative for Safe AI (LISA), and offers visa sponsorship for international candidates.

This is an ideal opportunity for a senior engineer who wants to combine technical excellence with meaningful impact in AI safety, working on infrastructure that supports critical research in understanding and evaluating advanced AI systems. The role offers significant autonomy and the chance to shape the technical direction of an important player in the AI safety field.

Last updated 3 hours ago

Responsibilities For Evals Software Engineer (Infrastructure focus)

Design, implement, scale, and maintain infrastructure for running frontier LLM evals using Infrastructure as Code (IaC)
Choose and integrate appropriate technologies for the infrastructure stack
Build internal software tools for job orchestration, project access, and results storage
Collaborate with researchers to understand future infrastructure needs
Ensure evals run reliably on the infrastructure and debug issues throughout the technology stack
Administer and secure internal AWS accounts
Help set up and manage organisation-wide security processes
Co-create and lead the infrastructure team

Requirements For Evals Software Engineer (Infrastructure focus)

Python

Kubernetes

Linux

Strong software engineering background, preferably in Python
Experience leading infrastructure projects from start to finish
Strong hands-on experience with Kubernetes
Solid knowledge of AWS, including IAM and EKS
Experience implementing security best practices for cloud and containerised environments
Experience with Infrastructure as Code tools (e.g. Terraform)

Benefits For Evals Software Engineer (Infrastructure focus)

Visa Sponsorship

Education Budget

Relocation Benefits

Competitive UK-based salary
Flexible work hours and schedule
Unlimited vacation
Unlimited sick leave
Lunch, dinner, and snacks provided on workdays
Paid work trips, including staff retreats and conferences
$1,000 USD yearly professional development budget
Visa sponsorship available
Relocation support up to £10,000 (via AI Futures Grants)
