Staff Software Engineer, Machine Learning Performance, TPU

A global technology company that develops AI, search, cloud computing, software and online advertising technologies.
$197,000 - $291,000
Machine Learning
Staff Software Engineer
In-Person
5,000+ Employees
8+ years of experience
AI · Enterprise SaaS
This job posting may no longer be active.

Description For Staff Software Engineer, Machine Learning Performance, TPU

Google is seeking a Staff Software Engineer to lead machine learning performance optimization for its TPU (Tensor Processing Unit) systems. This role is part of the ML, Systems, and Cloud AI organization, which is responsible for the infrastructure powering Google's services and Cloud AI offerings. The position focuses on maximizing efficiency for ML/AI workloads, particularly Large Language Models (LLMs).

The ideal candidate will work at the intersection of machine learning infrastructure and performance optimization, driving improvements in how Google's ML models train and serve across its TPU fleet. Key responsibilities include maintaining LLM benchmarks, implementing optimization techniques, and collaborating with product teams to solve performance challenges.

This is an opportunity to impact Google's next-generation AI technologies, working with cutting-edge hardware like TPUs and software frameworks like TensorFlow and JAX. The role offers competitive compensation including base salary, bonus, equity, and comprehensive benefits.

The position requires deep expertise in both software engineering and machine learning systems, with opportunities to shape the future of AI infrastructure at scale. You'll be working with teams across Google to optimize critical ML workloads that power products used by billions of users worldwide.

Last updated 2 months ago

Responsibilities For Staff Software Engineer, Machine Learning Performance, TPU

  • Identify and maintain Large Language Model (LLM) training and serving benchmarks
  • Work on scaling numeric and algorithmic optimizations to Google products and ML models
  • Engage with Google product teams to solve their Large Language Model (LLM) performance problems
  • Analyze performance and efficiency metrics to identify bottlenecks

Requirements For Staff Software Engineer, Machine Learning Performance, TPU

  • Bachelor's degree or equivalent practical experience
  • 8 years of experience testing and launching software products
  • 5 years of experience with software development in one or more programming languages (e.g., Python, C, C++)
  • Experience in performance analysis, including system architecture, performance modeling, benchmarking, or machine learning infrastructure

Benefits For Staff Software Engineer, Machine Learning Performance, TPU

  • Medical Insurance
  • Equity
  • 401k