AWS Neuron is seeking a Senior Software Development Engineer to join its Machine Learning Inference Model Enablement team. This role focuses on developing and optimizing large-scale machine learning models, particularly LLMs like the Llama family and DeepSeek, for AWS's cloud infrastructure.
The position involves working with AWS's proprietary Inferentia and Trainium accelerators and requires expertise in both software development and machine learning optimization. You'll collaborate closely with compiler and runtime engineers to create distributed inference solutions, using technologies like Python, PyTorch, and JAX.
As a senior engineer, you'll lead initiatives to build distributed inference support for PyTorch in the Neuron SDK, focusing on maximizing performance and efficiency for customer workloads. The role demands strong software development skills in Python and deep knowledge of machine learning systems.
The team operates in a startup-like environment, prioritizing high-impact solutions for AWS's large customer base. You'll participate in design discussions, code reviews, and cross-functional collaboration while working with cutting-edge ML infrastructure.
Amazon offers competitive compensation, including a base salary range of $151,300 to $261,500 depending on location, plus equity and comprehensive benefits. The position is based in Cupertino, CA, and offers opportunities for career growth through mentorship and hands-on experience with advanced ML systems.
This role is ideal for experienced engineers passionate about machine learning infrastructure who want to impact how large-scale AI models are deployed and optimized in production environments. Join a team that values knowledge-sharing, mentorship, and technical excellence while working on some of the most challenging problems in ML infrastructure.