Annapurna Labs, an Amazon company, is seeking a Senior Machine Learning Engineer to join its AWS Neuron Distributed Training team. This role focuses on developing and optimizing machine learning solutions for AWS's custom silicon accelerators, Trainium and Inferentia.
The position centers on distributed training of large-scale models, including LLMs such as GPT and Llama as well as Stable Diffusion and Vision Transformers. You'll collaborate with chip architects, compiler engineers, and runtime engineers to create efficient distributed training solutions.
As a senior engineer, you'll lead efforts to implement distributed training support in major frameworks like PyTorch and JAX, working with XLA, the Neuron compiler, and runtime stacks. The role requires deep expertise in both software development and machine learning, with a focus on performance optimization and system efficiency.
The team maintains a supportive environment that values knowledge-sharing and mentorship, with opportunities for career growth and skill development. AWS, as the world's leading cloud platform, offers the chance to work on innovative solutions that impact global customers from startups to Fortune 500 companies.
The position offers competitive compensation ranging from $151,300 to $261,500 based on location, plus equity and comprehensive benefits. You'll be part of an inclusive culture that embraces diverse experiences and promotes work-life harmony, with ongoing learning opportunities and career advancement resources.
This role suits experienced engineers who are passionate about machine learning, distributed systems, and high-performance computing and who want to work on next-generation AI infrastructure at scale.