Site Reliability Engineering (SRE) at Google Cloud combines software and systems engineering expertise to build and maintain large-scale distributed systems. This senior role focuses on keeping Google Cloud's critical services reliable and performant. You'll work on challenges specific to Google's scale, applying your expertise in coding, algorithms, and system design to optimize existing systems, build infrastructure, and automate processes.
The role requires strong technical leadership abilities and hands-on experience with distributed systems. You'll be responsible for the entire service lifecycle, from design through deployment and maintenance. Key responsibilities include system design consulting, capacity planning, monitoring system health, and implementing automation to scale systems effectively.
Google offers a competitive compensation package that includes base salary, bonus, equity, and comprehensive benefits. The company promotes a culture of intellectual curiosity and problem-solving, and encourages collaboration and innovation in a blameless environment. You'll join a diverse team of professionals working on meaningful projects, with support and mentorship for continued growth.
This is an opportunity for experienced engineers who want to work on infrastructure at massive scale, contribute to Google Cloud's reliability, and lead technical initiatives. Because the role pairs hands-on engineering with technical leadership, it suits those who want to both write code and guide technical direction.