Google Cloud is seeking a Senior Software Engineer for its Site Reliability Engineering (SRE) team. This role combines software and systems engineering to build and run large-scale, massively distributed, fault-tolerant systems. As an SRE, you'll ensure that Google Cloud's services, both internally critical and externally visible, have reliability and uptime appropriate to customer needs while maintaining a fast rate of improvement.
Your responsibilities will include managing the complex challenges of scale unique to Google Cloud, optimizing existing systems, building infrastructure, and eliminating work through automation. You'll have the opportunity to apply your expertise in coding, algorithms, complexity analysis, and large-scale system design.
The ideal candidate will have a strong background in software development, data structures, and algorithms, along with experience designing, analyzing, and troubleshooting large-scale distributed systems. You should be able to engage in the entire lifecycle of services, from inception and design to deployment and refinement.
Google's SRE culture values diversity, intellectual curiosity, problem-solving, and openness. The organization brings together people with various backgrounds and perspectives, encouraging collaboration and innovation in a supportive environment.
Join Google's Technical Infrastructure team and be part of the backbone that makes Google's product portfolio possible. If you're passionate about solving complex problems at scale and want to work with cutting-edge technology, this role offers an exciting opportunity to grow your career in site reliability engineering.