Senior Software Engineer, Site Reliability Engineering

Google is a global technology company that builds products and services used by billions of people.
Site Reliability
Senior Software Engineer
In-Person
5,000+ Employees
5+ years of experience
Enterprise SaaS

Job Description

Google's Site Reliability Engineering (SRE) team keeps Google Cloud's massive distributed systems reliable and performant. As a Senior SRE, you'll combine software and systems engineering expertise to build and maintain fault-tolerant systems at scale. The role involves working on critical infrastructure that powers both internal and customer-facing services, with a focus on reliability, uptime, and continuous improvement.

The position offers unique challenges in managing complex systems at Google-scale while leveraging your expertise in coding, algorithms, and large-scale system design. You'll be part of a culture that values intellectual curiosity, problem-solving, and openness, working in a blame-free environment that encourages collaboration and innovation.

The Technical Infrastructure team, which this role is part of, is fundamental to Google's product portfolio, developing and maintaining data centers and building next-generation platforms. The team takes pride in being the engineers' engineers, focusing on keeping networks running optimally to ensure the best user experience.

This role is perfect for someone who enjoys the intersection of software development and systems engineering, with opportunities to work on system optimization, infrastructure development, and automation. You'll be involved in the entire service lifecycle, from design to deployment and refinement, while also participating in on-call rotations and incident response.

The position offers the chance to work with cutting-edge technology, solve complex distributed systems challenges, and make a significant impact on Google's infrastructure. You'll be part of a diverse team that brings together people with various backgrounds and perspectives, promoting self-direction while providing support and mentorship for continuous learning and growth.

Last updated a month ago

Responsibilities For Senior Software Engineer, Site Reliability Engineering

  • Engage in and improve the whole lifecycle of services, from inception and design through to deployment, operation, and refinement
  • Support services before they go live through activities such as system design consulting, developing software platforms and frameworks, capacity planning and launch reviews
  • Maintain services once they are live by measuring and monitoring availability, latency and overall system health
  • Scale systems sustainably through mechanisms like automation, and evolve systems by pushing for changes that improve reliability and velocity
  • Practice sustainable incident response and blameless postmortems

Requirements For Senior Software Engineer, Site Reliability Engineering

Linux
  • Bachelor's degree in Computer Science, a related field, or equivalent practical experience
  • 5 years of experience with software development in one or more programming languages
  • 3 years of experience in designing, analyzing, and troubleshooting large-scale distributed systems
  • 2 years of experience leading projects and providing technical leadership
  • Experience working in computing, distributed systems, storage, or networking
  • Expertise in designing, analyzing, and troubleshooting large-scale distributed systems
  • Ability to debug and optimize code and to automate routine tasks
  • Systematic problem-solving approach, coupled with effective verbal and written communication skills

Benefits For Senior Software Engineer, Site Reliability Engineering

Medical Insurance
401k
Parental Leave
  • Comprehensive health benefits
  • Retirement plans
  • Parental leave
