Senior Software Development Engineer, ML Ops, AWS Infrastructure Science Engineering

AWS Infrastructure Services owns the design, planning, delivery, and operation of all AWS global infrastructure, supporting all AWS data centers worldwide.
$151,300 - $261,500
Machine Learning
Senior Software Engineer
In-Person
5,000+ Employees
5+ years of experience
AI · Enterprise SaaS · Cloud

Description For Senior Software Development Engineer, ML Ops, AWS Infrastructure Science Engineering

AWS Infrastructure Services (AIS) is seeking a Senior Software Development Engineer to join its ML Ops team within the Infrastructure Science Engineering group. The role sits at the intersection of infrastructure management and machine learning and focuses on optimizing power and cooling across AWS's global data center network.

The position offers a unique opportunity to work with a diverse team of scientists, program managers, and data engineers to build and scale machine learning workflows that directly impact AWS's infrastructure efficiency. You'll be part of the Lanner team, a tight-knit group of eight developers who combine technical excellence with a strong emphasis on work-life balance.

As a senior engineer, you'll lead the development of platforms for deploying and managing ML models, with particular focus on model retraining and monitoring systems. Your work will be crucial in ensuring the efficient placement of server demand by modeling power and cooling loads across AWS's global data center network.

The role combines technical leadership with hands-on development, requiring expertise in both software engineering and machine learning operations. You'll be responsible for building infrastructure that supports all phases of ML models, from R&D to production, while ensuring scalability and reliability.

AWS offers comprehensive benefits, including medical coverage, financial benefits, and career development opportunities. The company's culture strongly emphasizes work-life harmony and inclusive practices through various employee-led affinity groups and ongoing learning experiences.

This position is ideal for someone who combines strong software engineering fundamentals with ML expertise, enjoys solving complex infrastructure challenges, and wants to make a lasting impact on AWS's global infrastructure while working in a collaborative environment that values both technical excellence and personal growth.


Responsibilities For Senior Software Development Engineer, ML Ops, AWS Infrastructure Science Engineering

  • Lead the design and implementation of training and inference infrastructure for ML models
  • Collaborate with scientists and data engineers to develop improved training infrastructure
  • Engineer solutions for robust and fault-tolerant rack planning and forecasting workflows
  • Build and scale machine learning workflows and platform services
  • Develop platforms for deploying and productionizing ML models

Requirements For Senior Software Development Engineer, ML Ops, AWS Infrastructure Science Engineering

Python
Java
  • 5+ years of non-internship professional software development experience
  • 5+ years of programming experience in at least one programming language
  • 5+ years of leading design or architecture of new and existing systems
  • Experience as a mentor or tech lead, or leading an engineering team

Benefits For Senior Software Development Engineer, ML Ops, AWS Infrastructure Science Engineering

Medical Insurance
401k
Vision Insurance
Dental Insurance
Parental Leave
  • Full range of medical benefits
  • Financial benefits
  • Work-life balance
  • Career development and mentorship
  • Inclusive team culture
