The Annapurna Labs team at Amazon Web Services (AWS) builds AWS Neuron, the software development kit used to accelerate deep learning and generative AI workloads on Amazon's custom machine learning accelerators, Inferentia and Trainium. The Acceleration Kernel Library team is at the forefront of maximizing performance on these accelerators. Working at the hardware-software boundary, our engineers craft high-performance kernels for core ML functions so that customer workloads run as fast as possible.
The role involves working at the intersection of machine learning, high-performance computing, and distributed architectures. As a Sr. ML Kernel Performance Engineer, you'll be responsible for architecting and implementing business-critical features, publishing cutting-edge research, and mentoring experienced engineers. The team operates in a startup-like environment with small, agile teams focused on innovation and experimentation.
Key responsibilities include designing and implementing high-performance compute kernels, optimizing performance across multiple hardware generations, conducting detailed performance analysis, and working directly with customers to optimize their ML models. You'll collaborate across compiler, runtime, framework, and hardware teams to deliver optimal performance for machine learning workloads.
The position offers unique opportunities to shape performance across multiple hardware generations, publish cutting-edge research, mentor experienced engineers, and work directly with customers on their most demanding ML workloads.
AWS values diverse experiences and an inclusive culture, offering flexibility in working hours and strong support for work-life balance. The team provides extensive mentorship and focuses on career growth through challenging projects and knowledge sharing.
The role is part of AWS's larger mission to pioneer cloud computing and continue pushing the boundaries of what's possible in AI acceleration. You'll be joining a team that combines deep hardware knowledge with ML expertise to deliver solutions for the most demanding AI workloads.