Technical Program Manager, Machine Learning Operations and Maintenance

Google Cloud provides enterprise-grade solutions leveraging cutting-edge technology for digital transformation across industries.
Sunnyvale, CA, USA · New Albany, OH, USA
$168,000 - $252,000
Machine Learning
Staff Software Engineer
In-Person
5,000+ Employees
8+ years of experience
AI · Enterprise SaaS

Description For Technical Program Manager, Machine Learning Operations and Maintenance

Google Cloud is seeking a Technical Program Manager to lead Machine Learning Operations and Maintenance initiatives. This role sits within the Central Operations team, which manages programs and standards for data center site operations globally. The position requires technical expertise in ML infrastructure and data center operations combined with program management skills. You'll be responsible for developing and implementing maintenance policies, managing critical infrastructure dependencies, and ensuring the smooth operation of ML workloads. The role offers competitive compensation ($168,000-$252,000 + bonus + equity) and the opportunity to work with cutting-edge technology at scale. You'll be part of Google's mission to build products that create opportunities for everyone, working in an inclusive environment that values diversity and belonging. The position requires significant travel (40-50%) and involves collaboration with multiple stakeholders across the organization. This is an excellent opportunity for experienced professionals who want to have an impact on Google's global infrastructure while working with advanced ML systems and data center operations.

Last updated 16 days ago

Responsibilities For Technical Program Manager, Machine Learning Operations and Maintenance

  • Learn, document and align Machine Learning workload dependencies on power and cooling infrastructure
  • Develop and implement Maintenance SLO policy for Data Center Operations
  • Develop and implement global strategy for shutdown/turnaround maintenance for facilities operations
  • Design and implement planned downtime communications solution for internal and external Cloud customers
  • Work with partner teams to implement programmatic changes to supply chain and resource planning

Requirements For Technical Program Manager, Machine Learning Operations and Maintenance

Python
  • Bachelor's degree in a relevant field, or equivalent practical experience
  • 8 years of experience in critical operations, global change management, supply chain, risk management, or technical program management
  • Experience in managing and coordinating work with multiple vendors and external partners in a 24x7 environment
  • Experience in electrical/power and mechanical/cooling engineering and equipment
  • Experience with global change governance and maintenance in data centers
  • Ability to travel 40-50% of the time as needed
  • Excellent skills in problem-solving and advanced data analytics

Benefits For Technical Program Manager, Machine Learning Operations and Maintenance

  • Medical insurance
  • Bonus
  • Equity
  • Comprehensive benefits package
