Site Reliability Engineer

Orgvue

Orgvue is an organisational design and planning platform that empowers businesses to transform their workforce by understanding work and skills.

London, UK

Site Reliability

Principal Software Engineer

Hybrid

101 - 500 Employees

8+ years of experience

Enterprise SaaS

Description For Site Reliability Engineer

Orgvue, headquartered in London with global offices, is seeking a Principal Site Reliability Engineer to join its team. The role combines technical leadership with hands-on expertise in AWS and Kubernetes infrastructure. As a senior technical leader, you'll be responsible for scaling and hardening the company's cloud infrastructure while building a strong reliability culture. You'll work across product, platform, and operations teams to ensure system reliability, observability, and resilience at scale, and you'll be instrumental in defining SLOs, implementing cloud infrastructure strategies, and mentoring teams on SRE practices. The company offers a comprehensive benefits package including hybrid working, healthcare, wellbeing programs, and various lifestyle perks. The role suits an experienced SRE leader who combines technical expertise, strategic vision, and strong communication skills with a passion for building robust, scalable systems and fostering a culture of operational excellence.

Last updated 2 months ago

Responsibilities For Site Reliability Engineer

Define and enforce SLOs, SLIs, and error budgets across critical services
Craft and implement cloud infrastructure and tooling strategy
Work across the organisation to level up SRE practices
Implement robust observability across metrics, logs, and traces
Guide the team in building automated, self-healing systems
Own and evolve incident response processes
Mentor engineers on best practices
Drive Infrastructure as Code using Terraform, Kubernetes, CloudFormation and GitOps practices
Collaborate with security, DevOps, and software teams
Evaluate and introduce tools for performance and reliability improvement

Requirements For Site Reliability Engineer

Kubernetes

Linux

Demonstrable experience leading SRE transformations
Deep hands-on expertise with Kubernetes (EKS preferred) in production environments
Strong experience with AWS core services
Expert in Infrastructure as Code using tools such as Terraform
Strong background in observability: metrics, visualization, logging, and tracing
Understanding of automation, the SDLC, CI/CD pipelines, and deployment automation
Proven experience with incident management, disaster recovery planning, and root cause analysis

Benefits For Site Reliability Engineer

Medical Insurance

Dental Insurance

Vision Insurance

Mental Health Assistance

Hybrid working - 1+ days a week in the London office
Sanctus Coaching
Virtual fitness sessions
Wellbeing webinars
Annual Wellbeing day
Subsidised Gym Membership
Private Medical Insurance (including Dental and Vision)
Life Assurance
25 days holiday (increasing to 30 days)
Summer Fridays (half-day Fridays for July and August)
5% employer pension contribution
Season Ticket Loan
Cycle to Work Scheme
Annual Discretionary Bonus
