Annapurna Labs, an Amazon company, is seeking a Senior Machine Learning Engineer to join its AWS Neuron Distributed Training team. The role focuses on developing and optimizing machine learning solutions for AWS's custom silicon accelerators, Trainium and Inferentia. The work spans Large Language Models (LLMs) such as GPT and Llama, as well as other model families such as Stable Diffusion and Vision Transformers.
The role requires expertise in distributed training frameworks and collaboration with cross-functional teams of chip architects, compiler engineers, and runtime engineers. Responsibilities include implementing distributed training support in major frameworks such as PyTorch and JAX and optimizing performance on AWS's custom silicon platforms.
Annapurna Labs, acquired by AWS in 2015, has a strong track record of delivering innovative infrastructure solutions including AWS Nitro, Graviton, and ML accelerators. The team culture emphasizes knowledge-sharing, mentorship, and continuous learning, with a strong focus on work-life harmony and career development.
The position offers competitive compensation ranging from $129,300 to $223,600 based on location and experience, plus additional benefits. The team values diverse experiences and backgrounds, fostering an inclusive environment through employee-led affinity groups and ongoing learning opportunities.
This is an opportunity for experienced software engineers with ML expertise to work on the technology that powers AWS's machine learning infrastructure, as part of a team that prioritizes both technical excellence and professional growth.