Google's Site Reliability Engineering (SRE) team maintains and optimizes the large-scale distributed systems that power Google Cloud's services. This senior role combines software and systems engineering to ensure the reliability, appropriate uptime, and continuous improvement of both internal and external systems. As an SRE, you'll tackle unique scaling challenges by applying your expertise in coding, algorithms, and system design.
The position offers the opportunity to work with Google's Technical Infrastructure team, which underpins Google's entire product portfolio. You'll be part of a diverse and collaborative culture that encourages intellectual curiosity, problem-solving, and risk-taking in a blame-free environment. The role covers the complete lifecycle of services, from design and deployment to operation and refinement.
The ideal candidate brings strong technical expertise in distributed systems, software development, and technical leadership. In this role, you'll be responsible for system design consulting, capacity planning, and launch reviews, and for implementing automation so that systems scale sustainably. The role offers significant growth opportunities and the chance to work on some of the most complex technical challenges in the industry.
Working at Google means joining a company committed to diversity, equity, and inclusion. Google offers comprehensive benefits and a culture that promotes self-direction while providing the support and mentorship you need. This role is well suited to engineers who are passionate about reliability, scalability, and building robust systems that serve millions of users globally.