Google is seeking a Software Engineer III to join its ML, Systems, & Cloud AI (MSCA) organization, focusing on diagnostics and tools for Google Cloud Platform. This role combines software development with system health verification and performance optimization for machine learning and AI acceleration platforms. The position involves working with cutting-edge technologies, including TPUs and hyperscale computing infrastructure.
The ideal candidate will develop diagnostic tools and utilities that support system health verification, performance characterization, and reliability of ML/AI platforms. They will create software for parallel system execution and build analytical dashboards. The role requires collaboration with various teams across software, firmware, and hardware domains.
This is an excellent opportunity for someone with strong programming skills and experience in system diagnostics to work on Google's global infrastructure that powers services used by billions of users. The position offers exposure to advanced hardware solutions, AI/ML acceleration, and cloud computing at scale.
The team is part of Google's broader infrastructure organization, which designs and manages hardware, software, and ML systems for both Google services and Google Cloud customers. It plays a crucial role in ensuring the reliability, efficiency, and security of Google's global computing infrastructure.