Staff Software Engineer, Site Reliability Engineering

Google is a global technology leader that specializes in internet-related services and products.
Site Reliability
Staff Software Engineer
In-Person
5,000+ Employees
8+ years of experience
Enterprise SaaS · AI

Description For Staff Software Engineer, Site Reliability Engineering

Site Reliability Engineering (SRE) at Google Cloud combines software and systems engineering to build and run large-scale, massively distributed, fault-tolerant systems. As a Staff Software Engineer in SRE, you'll ensure that Google Cloud's services have reliability, uptime appropriate to customers' needs, and a fast rate of improvement. You'll work on optimizing existing systems, building infrastructure, and eliminating manual work through automation.

The role requires expertise in coding, algorithms, complexity analysis, and large-scale system design. You'll manage complex challenges of scale unique to Google Cloud while working in a culture that values diversity, intellectual curiosity, problem-solving, and openness.

Key responsibilities include engaging in the entire lifecycle of services, supporting services pre-launch, scaling systems sustainably, working on critical Google Cloud services, and solving operations problems using software engineering principles. You'll collaborate with developer teams on design, architecture, and processes.

You'll be part of the Technical Infrastructure team, which develops and maintains Google's data centers and builds the next generation of Google platforms. This team keeps Google's networks running smoothly, providing users with the best and fastest experience possible.

Ideal candidates will have experience in computing, distributed systems, storage, or networking, with strong skills in designing, analyzing, and troubleshooting large-scale distributed systems. The ability to debug and optimize code and to automate routine tasks is essential, along with excellent problem-solving and communication skills.

Join Google's SRE team to work on meaningful projects, collaborate with diverse perspectives, and contribute to the architecture that powers Google's vast product portfolio.

Last updated 6 months ago

Responsibilities For Staff Software Engineer, Site Reliability Engineering

  • Engage in and improve the whole lifecycle of services, from inception and design through deployment, operation, and refinement
  • Support services before they go live through activities such as system design consulting, developing software platforms and frameworks, capacity planning and launch reviews
  • Scale systems sustainably through mechanisms like automation, and evolve systems by pushing for changes that improve reliability and velocity
  • Work on the availability, scalability, efficiency and latency of some of Google Cloud's most critical services
  • Solve operations problems by using software engineering principles and best practices
  • Collaborate with the developer teams on design, architecture and processes

Requirements For Staff Software Engineer, Site Reliability Engineering

Java · Python · Go
  • Bachelor's degree in Computer Science, a related field, or equivalent practical experience
  • 8 years of experience with data structures or algorithms
  • 5 years of experience with software development in one or more programming languages
  • 3 years of experience leading projects and designing, analyzing, and troubleshooting distributed systems

Benefits For Staff Software Engineer, Site Reliability Engineering

Equity
  • Equal opportunity employer
  • Accommodation for applicants with special needs
