Sr. Software Development Engineer, ML Infrastructure Team

An AWS subsidiary that builds the software and hardware that make ML on EC2 work
$151,300 - $261,500
Machine Learning
Senior Software Engineer
In-Person
5,000+ Employees
5+ years of experience
AI · Enterprise SaaS

Description For Sr. Software Development Engineer, ML Infrastructure Team

Join AWS's Machine Learning Infrastructure team at Annapurna Labs, where innovation meets scale in cloud computing. As a Senior Software Development Engineer, you'll lead the development of critical infrastructure that powers AWS's ML and High Performance Computing technologies. This role combines deep technical expertise with leadership opportunities, focusing on building and maintaining sophisticated monitoring and automation systems that ensure peak performance of AWS ML technologies.

The position offers a unique opportunity to work with cutting-edge technologies including AWS Trainium, Graviton, and Elastic Fabric Adapter (EFA). You'll be responsible for developing infrastructure that handles massive testing workloads, creating efficient automation systems, and building comprehensive monitoring solutions using advanced tools like AWS Managed Grafana and Athena.

Your work will directly impact the efficiency and reliability of AWS's ML and HPC offerings, as you develop solutions that help teams deliver better software faster. The role requires expertise in Python, TypeScript, and Linux, combined with strong experience in CI/CD pipelines and cluster management. You'll work in a collaborative environment where innovation is encouraged, and your ideas can shape the future of cloud computing.

AWS offers competitive compensation, comprehensive benefits, and a culture that values work-life harmony. You'll be part of a diverse, inclusive team that embraces continuous learning and professional growth. The position provides opportunities for mentorship, both giving and receiving, and allows you to work on projects that directly influence how customers implement ML and HPC workloads in the cloud.

If you're passionate about building scalable infrastructure, automating complex systems, and working with cutting-edge ML technologies, this role offers the perfect blend of technical challenge and career growth. Join us in making AWS the most efficient and cost-effective platform for AI at scale.

Responsibilities For Sr. Software Development Engineer, ML Infrastructure Team

  • Serve as lead engineer for the infrastructure team that builds and maintains monitoring systems
  • Automate delivery of software using CI/CD tools
  • Write Python code for large cluster management and benchmarks
  • Create dashboards using AWS Managed Grafana and Athena
  • Develop automatic mechanisms for regression detection
  • Manage complex infrastructure across multiple instance types and systems
  • Write technical documentation and communicate with stakeholders
  • Mentor other engineers

Requirements For Sr. Software Development Engineer, ML Infrastructure Team

Python
TypeScript
Linux
  • 5+ years of non-internship professional software development experience
  • 5+ years of leading design or architecture experience
  • 5+ years of full software development life cycle experience
  • Experience as a mentor, tech lead or leading an engineering team
  • 5+ years of experience coding in Python, TypeScript, and CDK
  • Experience developing highly automated CI/CD pipelines
  • Proficiency working with Linux, including containers
  • Experience with clustered ML or HPC applications or benchmarks

Benefits For Sr. Software Development Engineer, ML Infrastructure Team

  • Medical Insurance
  • 401k
