AWS Neuron is seeking a Machine Learning Compiler Engineer II to join the team building the SDK that optimizes ML models for AWS Inferentia and Trainium custom chips. The role focuses on the next-generation Neuron compiler, which transforms ML models from frameworks such as PyTorch, TensorFlow, and JAX for deployment on AWS accelerators.
The position involves solving complex compiler optimization problems to extract maximum performance from a range of ML model families, including large language models, Stable Diffusion, and vision transformers. You'll work closely with chip architects, runtime engineers, and ML teams to optimize state-of-the-art models for AWS accelerators.
As part of AWS's Utility Computing organization, you'll contribute to foundational services and product innovations that define AWS's industry leadership. The role combines deep technical work with collaborative partnerships across teams and external stakeholders.
Key responsibilities include designing and implementing compiler optimizations, building developer-friendly features, and working with open-source technologies like StableHLO, OpenXLA, and MLIR. You'll join a supportive team environment that values knowledge-sharing, mentorship, and career growth.
The position offers competitive compensation ranging from $129,300 to $223,600 based on location, plus equity and comprehensive benefits. This is an opportunity to be at the forefront of AI innovation while working in a startup-like environment within AWS, focusing on impactful projects that democratize access to AI infrastructure.
The ideal candidate will have strong programming skills in C++ or Java; compiler or ML framework experience is preferred but not required. The role is a chance to shape the future of machine learning infrastructure while working with cutting-edge technology and world-class teams.