The Annapurna Labs team at Amazon Web Services (AWS) builds AWS Neuron, the software development kit used to accelerate deep learning and GenAI workloads on Amazon's custom machine learning accelerators, Inferentia and Trainium. As an ML Compiler Engineer, you'll join the Acceleration Kernel Library team, working at the forefront of performance optimization for AWS's custom ML accelerators.
The role involves crafting high-performance kernels for core ML operations, combining deep hardware knowledge with ML expertise. The AWS Neuron SDK includes an ML compiler, runtime, and application framework that integrates with popular ML frameworks such as PyTorch. Working across multiple technology layers, from frameworks and compilers to runtime and collectives, you'll help shape the future of AI acceleration.
This position offers a unique opportunity to work on cutting-edge products at the intersection of machine learning, high-performance computing, and distributed architectures. You'll architect and implement business-critical features, publish research, and mentor experienced engineers. The team operates in a startup-like environment with small, agile teams focused on innovation and experimentation.
Key responsibilities include optimizing machine learning workloads, implementing compiler optimizations, and working directly with customers to enable and optimize their ML models on AWS accelerators. You'll collaborate across compiler, runtime, framework, and hardware teams, applying expertise in low-level optimization, system architecture, and ML model acceleration.
The role emphasizes work-life balance and professional growth, with opportunities for mentorship and career development. You'll be part of AWS's inclusive culture, working with diverse teams and contributing to innovative solutions that power businesses worldwide. The position offers flexibility in working hours and a supportive environment that celebrates knowledge sharing and continuous learning.