Principal Engineer, Cloud ML Compute Services

Google Cloud provides enterprise-grade solutions leveraging cutting-edge technology and tools for digital transformation.
$278,000 - $399,000
Machine Learning
Principal Software Engineer
In-Person
5,000+ Employees
15+ years of experience
AI · Enterprise SaaS · Cloud

Description For Principal Engineer, Cloud ML Compute Services

Google's Cloud ML Compute Services (CMCS) is seeking a Principal Engineer to drive technical strategy for ML Frameworks and Models. This role focuses on building the best Cloud ML platform for demanding and innovative ML workloads. You'll be responsible for enabling massive-scale ML services powered by GPUs and TPUs, working with cutting-edge AI technologies including large language models (LLMs), mixture-of-experts (MoE) models, and diffusion models.

The position requires deep expertise in machine learning, distributed systems, and cloud architecture. You'll work on real-time scalability through model and data parallelization, performance tuning, and low-latency serving of both first-party and third-party models. The role involves collaboration with teams across Google, including Core ML, GDM, Storage, GKE, and Vertex AI.

As a Principal Engineer, you'll be at the forefront of AI/ML innovation, working with state-of-the-art hardware like Google TPUs and NVIDIA GPUs. You'll drive the technical strategy for large-scale training and inference services on GCP, while working with popular ML frameworks such as PyTorch, JAX, and TensorFlow.

This is an exceptional opportunity for a seasoned engineer to shape the future of cloud-based machine learning infrastructure at one of the world's leading technology companies. The role offers competitive compensation, including a substantial base salary range of $278,000-$399,000, plus bonus, equity, and comprehensive benefits.

The ideal candidate will combine deep technical expertise with strong leadership abilities, building strategic alignment across organizations and delivering innovative solutions that meet the dynamic needs of AI/ML compute. If you're passionate about machine learning and distributed systems and want to make a significant impact on the future of cloud computing, this role offers an opportunity to work with cutting-edge technology at scale.

Last updated 9 days ago

Responsibilities For Principal Engineer, Cloud ML Compute Services

  • Design, build, and deploy solutions that leverage GPU, TPU, and highly scalable hardware and software infrastructure
  • Build strategic alignment with major organizations across Google contributing to the ML landscape
  • Work across Engineering teams that build, design, and implement both hardware and software
  • Provide leadership for cloud developer technology inside Google
  • Optimize the latest emerging ML model types, benchmarks, and common ML frameworks

Requirements For Principal Engineer, Cloud ML Compute Services

  • Python
  • Bachelor's degree in Computer Science, Electrical Engineering, or equivalent practical experience
  • 15 years of experience building software and distributed systems
  • 10 years of experience with machine learning algorithms and tools
  • 10 years of experience with hardware and software design, data structures and algorithms
  • 10 years of experience with private and public cloud design
  • Experience with PyTorch, TensorFlow, JAX
  • Experience with LLMs, NLP, and deep learning models
  • Excellent organization, problem-solving, and prioritization skills
  • Outstanding teamwork and communication skills

Benefits For Principal Engineer, Cloud ML Compute Services

  • Bonus
  • Equity
  • Benefits
