Senior Software Engineer, SRE, Cloud Incident Response

Google is a global technology company that builds innovative products and services used by billions of users.
Site Reliability
Senior Software Engineer
In-Person
5,000+ Employees
5+ years of experience
Enterprise SaaS · Cloud

Description For Senior Software Engineer, SRE, Cloud Incident Response

Google's Site Reliability Engineering (SRE) team is seeking a Senior Software Engineer to join its Cloud Incident Response team. This role combines software and systems engineering to build and maintain large-scale, distributed systems for Google Cloud Platform. The position focuses on ensuring service reliability, managing incidents, and driving continuous improvement through automation.

As an SRE, you'll tackle complex challenges unique to Google Cloud's scale, applying expertise in coding, algorithms, and system design. The role involves critical incident support, building tooling for incident response, and implementing processes to improve system reliability. You'll work in a culture that values intellectual curiosity and problem-solving, collaborating with diverse teams across Google's Technical Infrastructure organization.

The ideal candidate brings strong experience in distributed systems, incident management, and technical leadership. You'll be responsible for maintaining system stability, developing automation tools, and driving improvements in incident response processes. This is an opportunity to work on mission-critical systems that power Google's vast product portfolio while contributing to the evolution of cloud infrastructure.

Working at Google offers exposure to cutting-edge technology, collaboration with world-class engineers, and the chance to impact billions of users. The role provides opportunities for growth, learning, and technical leadership in a supportive environment that promotes self-direction and innovation. Join Google's SRE team to help build and maintain the future of cloud computing while solving some of the most interesting technical challenges in the industry.

Last updated 3 days ago

Responsibilities For Senior Software Engineer, SRE, Cloud Incident Response

  • Ensure Google Cloud Platform (GCP) stability and reliability through critical incident support
  • Create training and end-to-end processes for the incident management life cycle
  • Build systems and tooling to support the Incident Response team
  • Define and escalate risks in Cloud, and reduce the probability of major incidents
  • Ensure the scalability and reliability of systems throughout their life cycle

Requirements For Senior Software Engineer, SRE, Cloud Incident Response

  • Linux
  • Kubernetes
  • Bachelor's degree in Computer Science, a related field, or equivalent practical experience
  • 5 years of experience with software development in one or more programming languages
  • 5 years of experience with data structures or algorithms
  • 3 years of experience in designing, analyzing, and troubleshooting distributed systems
  • 2 years of experience leading projects and providing technical leadership
  • Experience in SRE or incident management/response environments

Benefits For Senior Software Engineer, SRE, Cloud Incident Response

  • Medical insurance
  • Dental insurance
  • Vision insurance
  • Comprehensive health benefits
  • Parental leave
