Staff Software Engineer, Cloud ML Compute Services

Google Cloud provides organizations with leading infrastructure, platform capabilities, and industry solutions, leveraging cutting-edge technology.
$189,000 - $284,000
Machine Learning
Staff Software Engineer
In-Person
5,000+ Employees
8+ years of experience
AI · Enterprise SaaS · Cloud

Description For Staff Software Engineer, Cloud ML Compute Services

Google Cloud is seeking a Staff Software Engineer to join its Cloud ML Compute Services team, focusing on building and supporting Google Cloud Platform's Cloud TPU and GPU services. This role suits experienced engineers who are passionate about machine learning infrastructure and high-performance computing.

The position offers an exciting opportunity to work at the intersection of cloud computing and machine learning, developing next-generation technologies that impact billions of users. As part of the Cloud ML Compute Services team, you'll be responsible for the ML frameworks, tools, models, and processes that achieve scale and performance in ML workloads in Google Cloud.

The ideal candidate brings 8+ years of software development experience, with deep expertise in machine learning algorithms and infrastructure. You'll work with cutting-edge technologies like PyTorch and JAX, optimizing LLM training and inference performance on TPUs. The role involves collaboration with various teams and direct interaction with Cloud TPU power users to solve complex technical challenges.

This is a unique opportunity to join Google Cloud, a trusted partner for organizations worldwide, offering competitive compensation ($189,000-$284,000 + bonus + equity + benefits) and the chance to work on infrastructure that powers the future of machine learning. You'll be part of a team that's pushing the boundaries of AI technology while building solutions that help companies operate more efficiently and adapt to changing needs.

The role combines technical leadership with hands-on development, requiring both deep technical expertise and the ability to collaborate across teams. You'll be at the forefront of AI infrastructure development, working with the latest models, tools, and techniques while contributing to Google Cloud's mission of accelerating digital transformation across industries.

Last updated 18 days ago

Responsibilities For Staff Software Engineer, Cloud ML Compute Services

  • Work across the tech stack to improve LLM training and inference performance on TPU
  • Add new features and publish high-performance open-source kernels
  • Partner with the XLA and PyTorch teams to design and implement new PyTorch features
  • Collaborate directly with Cloud TPU power users to solve tricky problems and enable new workloads
  • Enable smooth interoperability between JAX and PyTorch
  • Implement and benchmark reference PyTorch models and techniques

Requirements For Staff Software Engineer, Cloud ML Compute Services

Python
  • Bachelor's degree or equivalent practical experience
  • 8 years of experience in software development and with data structures/algorithms
  • 5 years of experience testing and launching software products
  • 3 years of experience with software design and architecture
  • 5 years of experience with machine learning algorithms, tools, and libraries
  • Experience with building high-quality and reusable AI infrastructure, compilers, or performance engineering
  • Experience with stack-spanning systems and tools, from high-level Python to low-level C++
  • Understanding of the full user experience

Benefits For Staff Software Engineer, Cloud ML Compute Services

  • Bonus
  • Equity
  • Benefits
