Site Reliability Engineering (SRE) combines software and systems engineering to build and run large-scale, massively distributed, fault-tolerant systems. SRE ensures that Google's services (both internally critical and externally visible systems) have reliability and uptime appropriate to users' needs, along with a fast rate of improvement. SREs also keep an ever-watchful eye on system capacity and performance. Much of the software development focuses on optimizing existing systems, building infrastructure, and eliminating work through automation.
As an Engineering Manager, you'll lead a team and be responsible for products globally, providing technical leadership on key projects and empowering and developing teams to do the same. You'll manage on-call rotations across continents using a follow-the-sun model.
The Technical Infrastructure team builds the architecture behind everything users see online, from developing and maintaining data centers to building the next generation of Google platforms. The team ensures networks are up and running, providing users with the best and fastest experience possible.
This role offers the opportunity to work on unique challenges of scale at Google, using expertise in coding, algorithms, complexity analysis, and large-scale system design. SRE's culture values diversity, intellectual curiosity, problem-solving, and openness. It promotes self-direction to work on meaningful projects while providing support and mentorship for learning and growth.