Annapurna Labs, an Amazon company, is seeking a Senior Machine Learning Engineer to join its AWS Neuron Distributed Training team. This role sits at the intersection of cutting-edge AI/ML technology and cloud infrastructure, focusing on developing and optimizing distributed training solutions for AWS's custom silicon accelerators.
The position involves working with state-of-the-art machine learning models, including large language models (LLMs) such as GPT and Llama, as well as Stable Diffusion and Vision Transformers. You'll be responsible for implementing distributed training support in major frameworks such as PyTorch and JAX, while collaborating closely with chip architects and compiler engineers to maximize performance on AWS's custom silicon platforms.
As a senior engineer, you'll lead technical initiatives and work with cross-functional teams to solve complex challenges in machine learning infrastructure. The role requires deep expertise in both software development and machine learning, with a focus on distributed systems and performance optimization.
The team operates within AWS's innovative culture, emphasizing mentorship, knowledge-sharing, and career growth. You'll be part of an organization that values diverse experiences and perspectives, with access to various employee-led affinity groups and ongoing learning opportunities.
AWS offers a comprehensive benefits package, including competitive base pay ranging from $151,300 to $261,500 depending on location, plus equity and other compensation components. The company emphasizes work-life harmony and provides extensive resources for professional development.
This is an excellent opportunity for experienced software engineers passionate about machine learning to work on cutting-edge technology that powers some of the world's most advanced AI infrastructure. You'll be contributing to solutions that help customers solve previously unimaginable challenges while working with a team that's dedicated to innovation and technical excellence.