Site Reliability Engineer, Lead

Mistral AI develops high-performance, open-source AI models and solutions, including the Le Chat AI assistant, for enterprise use in cloud and on-premises environments.
Site Reliability
Staff Software Engineer
Hybrid
10+ years of experience
AI · Enterprise SaaS

Description For Site Reliability Engineer, Lead

Mistral AI is seeking a Lead Site Reliability Engineer (SRE) to lead its infrastructure team in building reliable, fault-tolerant, and scalable systems. The role combines leadership with hands-on technical work, split roughly equally among team leadership (33%), operations (33%), and development (33%). It involves managing high-performing teams while ensuring the reliability of critical distributed environments and improving how customers interact with the core products.

The role requires 10+ years of DevOps/SRE experience and proven leadership ability. You'll be responsible for designing and maintaining scalable infrastructure, implementing monitoring systems, and driving automation improvements. The position involves working with cutting-edge AI/ML technologies and contributing to open-source projects.

Mistral AI offers a flexible work environment with offices across Europe (Paris, London, Barcelona/Madrid, Berlin/Munich/Frankfurt). The company provides competitive compensation, including equity, and comprehensive benefits. Its culture emphasizes rigorous reasoning, audacious thinking, and customer success.

The ideal candidate will bring expertise in cloud computing, distributed systems, and modern DevOps tools, combined with strong leadership and communication skills. Experience with AI/ML environments and high-performance computing would be particularly valuable. This is an opportunity to shape the future of AI infrastructure at a pioneering company with a global presence.

Last updated 2 days ago

Responsibilities For Site Reliability Engineer, Lead

  • Lead and empower the infrastructure team
  • Design, build, and maintain scalable, highly available infrastructure
  • Ensure platform and inference environments are highly available
  • Implement monitoring, alerting, and incident response systems
  • Manage CI/CD, containerization, and orchestration workflows
  • Participate in on-call rotations
  • Drive infrastructure automation improvements
  • Collaborate with AI/ML researchers
  • Build a cloud-agnostic platform
  • Ensure infrastructure security compliance
  • Plan projects and collaborate with stakeholders

Requirements For Site Reliability Engineer, Lead

Python
Go
Linux
Kubernetes
  • 10+ years of experience in a DevOps/SRE role
  • Experience with building and leading high-performing teams
  • Experience with cloud computing and distributed systems
  • Experience with site reliability in critical environments
  • Experience with reliability KPIs
  • Hands-on experience with CI/CD, containerization, and orchestration tools
  • Proficiency in programming and scripting languages (Python, Go, Bash)
  • Understanding of networking, security, and system administration
  • Excellent problem-solving and communication skills
  • Self-motivated, with a startup mindset

Benefits For Site Reliability Engineer, Lead

Medical Insurance
Visa Sponsorship
Parental Leave
  • Competitive cash salary and equity
  • Health insurance (except Germany)
  • Transportation allowance
  • Sport allowance
  • Meal vouchers
  • Private pension plan
  • Parental leave (France only)
  • Visa sponsorship (France only)
  • 100% inter-country travel coverage
  • 50% intra-country travel coverage
