Senior Software Engineer, Site Reliability Engineering

Google is a global technology company that builds innovative products and services used by billions of users.
Site Reliability
Senior Software Engineer
In-Person
5,000+ Employees
5+ years of experience
Enterprise SaaS

Job Description

Google's Site Reliability Engineering (SRE) team is looking for a Senior Software Engineer to join its mission of building and running large-scale, massively distributed, fault-tolerant systems. The role combines software and systems engineering to ensure that Google Cloud's services maintain reliability and appropriate uptime while continuously improving performance.

As an SRE, you'll work on optimizing existing systems, building infrastructure, and automating processes. The role offers unique challenges of scale specific to Google Cloud, requiring expertise in coding, algorithms, complexity analysis, and large-scale system design. You'll be part of a culture that values intellectual curiosity, problem-solving, and openness, working alongside people with diverse backgrounds and perspectives.

The Technical Infrastructure team is crucial in maintaining Google's architecture, from developing and maintaining data centers to building next-generation Google platforms. The role involves ensuring networks run optimally for the best user experience possible.

This position offers the opportunity to work with cutting-edge technology at massive scale, contribute to critical infrastructure, and be part of a team that values continuous learning and innovation. The ideal candidate will combine technical expertise with leadership skills to drive improvements in system reliability and performance.

Working at Google also means being part of a company committed to diversity, equality, and creating a culture of belonging. The role offers the chance to make a significant impact on systems used by billions of users while working with some of the industry's brightest minds in distributed systems and reliability engineering.

Last updated 3 days ago

Responsibilities For Senior Software Engineer, Site Reliability Engineering

  • Engage in and improve the whole lifecycle of services, from inception and design through deployment, operation, and refinement
  • Support services before they go live through activities such as system design consulting, developing software platforms and frameworks, capacity planning and launch reviews
  • Maintain services once they are live by measuring and monitoring availability, latency and overall system health
  • Scale systems sustainably through mechanisms like automation, and evolve systems by pushing for changes that improve reliability and velocity
  • Practice sustainable incident response and blameless postmortems

Requirements For Senior Software Engineer, Site Reliability Engineering

Linux
Kubernetes
  • Bachelor's degree in Computer Science, a related field, or equivalent practical experience
  • 5 years of experience with software development in one or more programming languages
  • 3 years of experience in designing, analyzing, and troubleshooting large-scale distributed systems
  • 2 years of experience leading projects and providing technical leadership

Benefits For Senior Software Engineer, Site Reliability Engineering

Medical Insurance
401k
Parental Leave
  • Comprehensive health benefits
  • Retirement plans
  • Parental leave
