Staff Software Engineer, Cloud ML Compute Services

Google Cloud provides organizations with leading infrastructure, platform capabilities, and industry solutions, leveraging cutting-edge technology to help companies operate efficiently and adapt to changing needs.
$189,000 - $284,000
Machine Learning
Staff Software Engineer
In-Person
5000+ Employees
8+ years of experience
AI · Enterprise SaaS · Cloud

Description For Staff Software Engineer, Cloud ML Compute Services

Google Cloud is seeking a Staff Software Engineer for its Cloud ML Compute Services team. This role is crucial in building and supporting the Google Cloud Platform (GCP) Cloud TPU and GPU services, as well as related ML models and frameworks. The position involves working on projects that give ML infrastructure customers large-scale, cloud-based access to Google's ML supercomputers for running training and inference workloads using PyTorch and JAX.

Key responsibilities include improving LLM training and inference performance on TPU, adding new features, publishing high-performance open-source kernels, and collaborating with various teams to design and implement new PyTorch features. The ideal candidate will have extensive experience in software development, machine learning algorithms, and technical leadership.

Google Cloud accelerates digital transformation for organizations worldwide, delivering enterprise-grade solutions that leverage cutting-edge technology. The role offers a competitive salary range of $189,000-$284,000 plus bonus, equity, and benefits, depending on factors such as location, skills, and experience.

This position requires a blend of technical expertise and leadership skills, with a focus on machine learning infrastructure and high-performance computing. The successful candidate will play a vital role in advancing Google Cloud's ML capabilities and supporting customers in solving critical business problems through innovative cloud solutions.

Last updated 12 days ago

Responsibilities For Staff Software Engineer, Cloud ML Compute Services

  • Work across the tech stack to improve LLM training and inference performance on TPU
  • Add new features and publish high-performance open-source kernels
  • Partner with the XLA and PyTorch teams to design and implement new PyTorch features, and collaborate directly with Cloud TPU power users to solve tricky problems and enable new workloads
  • Create smooth interoperability between JAX and PyTorch (e.g., for data loading, hybrid models, or portability)
  • Implement and benchmark reference PyTorch models and techniques, and use the results to inform new PyTorch features and improvements

Requirements For Staff Software Engineer, Cloud ML Compute Services

Python
  • Bachelor's degree or equivalent practical experience
  • 8 years of experience in software development and with data structures/algorithms
  • 5 years of experience testing and launching software products, and 3 years of experience with software design and architecture
  • 5 years of experience with machine learning algorithms, tools, and libraries

Benefits For Staff Software Engineer, Cloud ML Compute Services

  • bonus
  • equity
  • benefits
