Sr. Software Engineer- AI/ML, AWS Neuron Distributed Training

Annapurna Labs designs silicon and software that accelerates innovation for AWS, creating cloud solutions and custom chips for machine learning.
Salary: $151,300 - $261,500
Category: Machine Learning
Level: Senior Software Engineer
Work arrangement: In-Person
Company size: 5,000+ Employees
Experience: 5+ years
Industry: AI · Enterprise SaaS

Description For Sr. Software Engineer- AI/ML, AWS Neuron Distributed Training

Annapurna Labs, an Amazon company, is seeking a Senior Software Engineer (AI/ML) to join its AWS Neuron Distributed Training team. This role focuses on developing and optimizing machine learning solutions for AWS's custom silicon accelerators, Trainium and Inferentia.

The position centers on distributed training of large-scale models, including LLMs like GPT and Llama, as well as Stable Diffusion and Vision Transformers. You'll collaborate with chip architects, compiler engineers, and runtime engineers to build efficient distributed training solutions.

As a senior engineer, you'll lead efforts to implement distributed training support in major frameworks like PyTorch and JAX, working with XLA, the Neuron compiler, and runtime stacks. The role requires deep expertise in both software development and machine learning, with a focus on performance optimization and system efficiency.

The team maintains a supportive environment that values knowledge-sharing and mentorship, with opportunities for career growth and skill development. AWS, as the world's leading cloud platform, offers the chance to work on innovative solutions that impact global customers from startups to Fortune 500 companies.

The position offers competitive compensation ranging from $151,300 to $261,500 based on location, plus equity and comprehensive benefits. You'll be part of an inclusive culture that embraces diverse experiences and promotes work-life harmony, with ongoing learning opportunities and career advancement resources.

This role is perfect for experienced engineers who are passionate about machine learning, distributed systems, and high-performance computing, offering the opportunity to work on next-generation AI infrastructure at scale.


Responsibilities For Sr. Software Engineer- AI/ML, AWS Neuron Distributed Training

  • Lead efforts to build distributed training support into PyTorch and JAX using XLA
  • Optimize models for peak performance on AWS custom silicon
  • Collaborate with chip architects, compiler engineers, and runtime engineers on efficient distributed training solutions
  • Create, build and tune distributed training solutions with Trainium instances
  • Develop and enable performance tuning of ML model families including LLMs

Requirements For Sr. Software Engineer- AI/ML, AWS Neuron Distributed Training

Skills: Python, Java
  • Bachelor's degree in computer science or equivalent
  • 5+ years of non-internship professional software development experience
  • 5+ years of programming experience
  • 5+ years of leading design or architecture experience
  • 5+ years of full software development life cycle experience
  • Experience as a mentor, tech lead or leading an engineering team
  • Experience in machine learning, data mining, statistics or natural language processing

Benefits For Sr. Software Engineer- AI/ML, AWS Neuron Distributed Training

  • Medical insurance, with a full range of medical benefits
  • 401k and other financial benefits
  • Work-life harmony
