Staff Software Engineer, Machine Learning Infrastructure, Google Cloud

Google is a global technology company that develops cloud computing, search, software, and online advertising technologies.
Machine Learning
Staff Software Engineer
In-Person
5,000+ Employees
8+ years of experience
AI · Enterprise SaaS · Cloud

Description For Staff Software Engineer, Machine Learning Infrastructure, Google Cloud

Google Cloud is seeking a Staff Software Engineer to join its Machine Learning Infrastructure team within the Core ML organization. This role focuses on optimizing Google's Machine Learning resources, particularly the TPUs and GPUs used across all Google products. The position requires extensive experience in software development, machine learning, and system architecture. As part of the ML, Systems, & Cloud AI (MSCA) organization, you'll be responsible for developing monitoring tools and dashboards that track the performance and efficiency of ML resources, identifying areas for improvement, and driving efficiency gains across Google's products. The role involves working with technologies including TensorFlow, Kubernetes, and Google's custom TPUs, while contributing to Google Cloud's Vertex AI platform. This is an opportunity to impact billions of users while working on next-generation technologies in areas such as distributed computing, large-scale system design, and artificial intelligence. The position offers the chance to work with advanced ML infrastructure, lead junior engineers, and collaborate across teams to improve the efficiency of Google's ML fleet. The ideal candidate will combine technical expertise in ML systems with leadership abilities and a drive for innovation.


Responsibilities For Staff Software Engineer, Machine Learning Infrastructure, Google Cloud

  • Design, implement, and advance the telemetry capabilities needed to monitor and evaluate the fleet-wide efficiency of ML resources
  • Identify opportunities to improve the efficiency of the ML fleet and build solutions
  • Build reporting and analytic solutions with key partners
  • Drive collaboration with teams across different product areas (PAs)
  • Lead junior software engineers (SWEs) in delivering project goals

Requirements For Staff Software Engineer, Machine Learning Infrastructure, Google Cloud

Skills: Kubernetes, Python
  • Bachelor's degree in Computer Science or related technical field or equivalent practical experience
  • 8 years of experience with software development in one or more programming languages, and with data structures/algorithms
  • 5 years of experience testing and launching software products
  • 3 years of experience with software design and architecture
  • 5 years of experience with machine learning algorithms and tools
  • Experience with Kubernetes, Google Kubernetes Engine, GPU Programming, TensorFlow, and Cloud
  • Experience analyzing ML model performance, or working on LLM prompting, training, or development
  • Experience with and knowledge of CPU/GPU architecture or hardware accelerators
  • Ability to quickly adapt to new tools, frameworks, and languages

Benefits For Staff Software Engineer, Machine Learning Infrastructure, Google Cloud

  • Medical insurance
  • Parental leave
  • Equal opportunity employer
  • Comprehensive benefits package

Jobs Related To Google Staff Software Engineer, Machine Learning Infrastructure, Google Cloud

Senior Staff Software Engineer, AI/ML GenAI, Google Ads

Senior Staff Software Engineer position at Google focusing on AI/ML and GenAI technologies for Google Ads, offering competitive compensation and the opportunity to work on large-scale advertising solutions.

Staff Software Engineer, AI/ML Recommendations, Rankings, Predictions, YouTube

Lead AI/ML engineering role at YouTube focusing on recommendations and rankings systems, offering competitive compensation and the opportunity to impact billions of users.

Staff Software Engineer, Generative AI, Google Workspace

Senior technical role focusing on integrating generative AI capabilities into Google Workspace products, combining machine learning expertise with software engineering leadership.

Staff Software Engineer, Generative AI, Google Workspace

Lead software engineer position focusing on implementing Generative AI solutions for Google Workspace products, requiring extensive experience in machine learning and large-scale system design.
