Google is seeking a PhD-level Software Engineer to join its Cluster Management (Borg) team and focus on developing and maintaining the critical infrastructure that powers Google's vast service portfolio. The role involves working with distributed systems and cluster management, primarily using C++. As part of a versatile team, you'll be responsible for designing, implementing, and optimizing workload scheduling systems while ensuring reliability and maintainability at scale.
The position offers an opportunity to work on Google's core infrastructure, specifically the Borg system, which manages and schedules the company's massive computational workloads. You'll be involved in various aspects of the system, from machine-level agents to autoscaling applications and spatial flexibility solutions. The work combines data analysis, cross-team collaboration, coding, and debugging in a complex distributed systems environment.
This is an ideal role for candidates with a strong academic background in computer science or a related field, particularly those with experience in systems programming and distributed computing. You'll be working at the heart of Google's technical infrastructure, helping to ensure that billions of users can access Google's services reliably and efficiently. The role offers the chance to solve complex technical challenges at very large scale.
As part of Google's Technical Infrastructure team, you'll collaborate with other engineers to build and maintain the next generation of Google's platforms. The role requires a mix of technical expertise, problem-solving skills, and the ability to work effectively with teams across the organization. You'll have the opportunity to make significant contributions to one of the world's largest and most sophisticated distributed systems.