Google is seeking a Staff Software Engineer for its Site Reliability Engineering (SRE) team. SRE combines software and systems engineering to build and run large-scale, massively distributed, fault-tolerant systems. The role involves ensuring that Google Cloud's services have reliability and uptime appropriate to customer needs while maintaining a fast rate of improvement.
Key responsibilities include managing the lifecycle of services, supporting pre-launch activities, scaling systems through automation, working on critical Google Cloud services, and solving operations problems using software engineering principles. The ideal candidate will have extensive experience with data structures, algorithms, software development, and leading projects involving distributed systems.
Google offers a culture of diversity, intellectual curiosity, and problem-solving. The Technical Infrastructure team, which includes SRE, is crucial in developing and maintaining data centers and building next-generation Google platforms. This role provides an opportunity to work on unique, large-scale challenges while collaborating with a diverse team in a supportive, mentorship-rich environment.
The position requires a bachelor's degree in Computer Science or related field (or equivalent experience), along with significant experience in software development and distributed systems. Preferred qualifications include expertise in computing, distributed systems, storage, or networking, as well as strong problem-solving and communication skills.
Google is an equal opportunity employer committed to building a diverse and inclusive workforce. It offers accommodations for applicants who need them and requires English proficiency for effective global collaboration.