Google is seeking a Software Engineer III specializing in AI/ML to join its ML, Systems, & Cloud AI (MSCA) organization. This role focuses on developing and optimizing GPU kernels for large language models, requiring expertise in NVIDIA GPU architecture and CUDA programming. The position involves working on critical projects that impact billions of users across Google's services and the Google Cloud platform.
The role combines deep technical expertise in machine learning infrastructure with hands-on development of high-performance computing solutions. You'll work specifically on improving LLM inference performance using tools such as Pallas/Mosaic. This position is part of Google's broader mission to advance hyperscale computing and AI technology.
The ideal candidate will have strong foundations in software development, specific expertise in GPU programming and machine learning infrastructure, and experience with large language models. The role offers competitive compensation including base salary, bonus, equity, and comprehensive benefits. You'll be part of a team that designs and implements infrastructure used by Google services like Search and YouTube, as well as Google Cloud's Vertex AI platform.
This is an excellent opportunity for someone passionate about AI/ML infrastructure who wants to work on technology that shapes the future of machine learning at scale. You'll work with state-of-the-art AI models and infrastructure, with the chance to impact products used by billions of people worldwide.