AWS Neuron is seeking a Machine Learning Compiler Engineer II to join the team behind the SDK that optimizes ML models for AWS's custom Inferentia and Trainium chips. This role focuses on building the next-generation Neuron compiler, which transforms ML models from frameworks like PyTorch, TensorFlow, and JAX for deployment on AWS hardware. You'll tackle complex compiler optimization challenges across ML model families, including large language models, Stable Diffusion, and vision transformers.
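To give a flavor of the workflow this compiler sits behind, here is a minimal sketch of compiling a PyTorch model for Inferentia/Trainium via the Neuron SDK's torch_neuronx tracing entry point; the tiny model and input shapes are illustrative stand-ins, not anything from the role description:

```python
import torch
import torch_neuronx  # AWS Neuron SDK integration for PyTorch

# A tiny stand-in model; real workloads would be LLMs, diffusion models,
# or vision transformers.
class TinyClassifier(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(128, 10)

    def forward(self, x):
        return torch.softmax(self.linear(x), dim=-1)

model = TinyClassifier().eval()
example_input = torch.rand(1, 128)

# trace() captures the model graph, runs the Neuron compiler on it, and
# returns a module whose forward pass executes on NeuronCores.
neuron_model = torch_neuronx.trace(model, example_input)
neuron_model.save("tiny_classifier_neuron.pt")
```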
The position requires deep technical expertise in compiler design and optimization; you'll work closely with chip architects and ML teams to achieve optimal performance. You'll be responsible for implementing innovative solutions, collaborating with internal and external stakeholders, and contributing to pre-silicon design and new product features.
AWS Neuron is part of AWS's broader mission to democratize AI infrastructure and make deep learning accessible to everyday developers. You'll work in a startup-like environment within AWS Machine Learning, focusing on high-impact projects that directly influence the performance and usability of AWS's ML acceleration solutions.
The role offers excellent growth opportunities, with exposure to cutting-edge ML technologies and hardware acceleration. You'll be part of a team that values knowledge-sharing, mentorship, and continuous learning. The position includes competitive compensation, comprehensive benefits, and the chance to work on technology that's shaping the future of cloud computing and AI infrastructure.
Key technologies you'll work with include compiler optimization techniques, ML frameworks, open compiler infrastructure such as OpenXLA, StableHLO, and MLIR, and AWS's custom ML acceleration chips. The role combines software engineering excellence with specialized knowledge in ML systems and compiler technology, making it ideal for engineers passionate about both high-performance computing and machine learning.
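To make the OpenXLA/StableHLO layer concrete, here is a small sketch using JAX's public lowering API to inspect the StableHLO module a framework hands to an XLA-integrated compiler backend; the function itself is an arbitrary example:

```python
import jax
import jax.numpy as jnp

# An arbitrary small function standing in for a real model.
def predict(w, x):
    return jnp.tanh(x @ w)

w = jnp.ones((4, 2))
x = jnp.ones((3, 4))

# lower() produces the StableHLO representation that OpenXLA-integrated
# compiler backends consume and optimize for their target hardware.
lowered = jax.jit(predict).lower(w, x)
print(lowered.as_text())  # prints an MLIR module in the StableHLO dialect
```

Inspecting this text output shows ops like stablehlo.dot_general and stablehlo.tanh on typed tensors, which is the level of abstraction at which graph-level compiler optimizations operate.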