Google's Site Reliability Engineering (SRE) team ensures the reliability and performance of Google Cloud's massive distributed systems. As a Senior SRE, you'll combine software and systems engineering expertise to build and maintain fault-tolerant systems at scale. The role involves working on critical infrastructure that powers both internal and customer-facing services, with a focus on reliability, uptime, and continuous improvement.
The position offers unique challenges in managing complex systems at Google-scale while leveraging your expertise in coding, algorithms, and large-scale system design. You'll be part of a culture that values intellectual curiosity, problem-solving, and openness, working in a blame-free environment that encourages collaboration and innovation.
This role sits within the Technical Infrastructure team, which underpins Google's product portfolio by developing and maintaining data centers and building next-generation platforms. The team takes pride in being the engineers' engineers and focuses on keeping networks running optimally to ensure the best user experience.
This role suits someone who enjoys working at the intersection of software development and systems engineering, with opportunities in system optimization, infrastructure development, and automation. You'll be involved in the entire service lifecycle, from design through deployment and refinement, and you'll participate in on-call rotations and incident response.
The position offers the chance to work with cutting-edge technology, solve complex distributed systems challenges, and make a significant impact on Google's infrastructure. You'll be part of a diverse team that brings together people with various backgrounds and perspectives, promoting self-direction while providing support and mentorship for continuous learning and growth.