Staff Software Engineer, ML Performance

Google develops next-generation technologies that change how billions of users connect, explore, and interact with information and one another.
$197,000 - $291,000
Machine Learning
Staff Software Engineer
In-Person
5,000+ Employees
8+ years of experience
AI · Enterprise SaaS

Job Description

Google is seeking a Staff Software Engineer to join its Machine Learning (ML) Performance team, which focuses on optimizing the performance and efficiency of ML and AI workloads. This role is critical in driving Google's ML performance through deep fleet-scale analysis and auto-optimizations. The team is responsible for identifying performance opportunities in production and research ML workloads, demonstrating ML performance in the MLPerf competition, and pushing efficiency on trillion-parameter multipod ML models.

The position requires extensive experience in software development, testing, and performance analysis. The ideal candidate will have strong expertise in machine learning systems, compiler optimizations, and ML frameworks like TensorFlow, JAX, and PyTorch. They will work on scaling solutions for Google products and ML models, including model/data sharding, mesh sizes, quantization, and other optimization techniques.

This is an opportunity to work at the forefront of ML performance optimization at one of the world's leading technology companies. The role offers competitive compensation including base salary, bonus, equity, and comprehensive benefits. The position involves collaboration with various Google product teams and requires strong technical leadership skills in a matrixed organization.

The successful candidate will contribute to Google's mission of developing next-generation technologies that impact billions of users. They will work with cutting-edge ML infrastructure and have the opportunity to influence the performance and efficiency of Google's ML systems at scale. This role combines deep technical expertise with strategic thinking to solve complex performance challenges in machine learning systems.

Last updated 7 days ago

Responsibilities For Staff Software Engineer, ML Performance

  • Identify and maintain LLM/non-LLM training and serving benchmarks
  • Scale partitioning and algorithmic optimizations across Google products and ML models
  • Engage with Google product teams to solve their LLM performance problems
  • Analyze performance and efficiency metrics to identify bottlenecks, then design and implement solutions at Google fleet-wide scale

Requirements For Staff Software Engineer, ML Performance

Python
Java
  • Bachelor's degree or equivalent practical experience
  • 8 years of experience testing and launching software products
  • 5 years of experience with software development in one or more programming languages
  • 3 years of experience in performance analysis, including system architecture, performance benchmarking, and machine learning infrastructure

Benefits For Staff Software Engineer, ML Performance

Medical Insurance
401k
  • Bonus
  • Equity
  • Benefits package
