
ML Kernel Performance Engineer, AWS Neuron, Annapurna Labs

Amazon Web Services (AWS) is the world's most comprehensive and broadly adopted cloud platform, pioneering cloud computing innovation.
Machine Learning
Senior Software Engineer
Hybrid
5,000+ Employees
3+ years of experience
AI · Enterprise SaaS

Description For ML Kernel Performance Engineer, AWS Neuron, Annapurna Labs

The Annapurna Labs team at Amazon Web Services (AWS) builds AWS Neuron, the software development kit used to accelerate deep learning and GenAI workloads on Amazon's custom machine learning accelerators, Inferentia and Trainium. The Acceleration Kernel Library team is at the forefront of maximizing performance for AWS's custom ML accelerators. Working at the hardware-software boundary, our engineers craft high-performance kernels for ML functions, ensuring every FLOP counts in delivering optimal performance for our customers' demanding workloads.

The AWS Neuron SDK is a comprehensive toolkit that includes an ML compiler, runtime, and application framework, integrating seamlessly with popular ML frameworks such as PyTorch. As part of the Neuron Compiler organization, the team works across multiple technology layers, from frameworks and compilers to runtime and collectives, optimizing current performance and contributing to future architecture designs.

The role offers a unique opportunity to work at the intersection of machine learning, high-performance computing, and distributed architectures. Engineers collaborate across compiler, runtime, framework, and hardware teams to optimize machine learning workloads for global customers. The position involves designing and implementing high-performance compute kernels, analyzing and optimizing kernel-level performance, and working directly with customers to enable and optimize their ML models.

The team values work-life balance and offers flexibility in working hours. They embrace diversity and inclusion, with ten employee-led affinity groups reaching 40,000 employees globally. Career growth and mentorship are prioritized, with projects assigned to help team members develop into better-rounded professionals. The hybrid work model allows engineers to choose between full office presence and flexible arrangements near US Amazon offices.

This role is perfect for someone passionate about pushing the boundaries of AI acceleration technology, combining deep hardware knowledge with ML expertise to deliver optimal performance for demanding workloads.


Responsibilities For ML Kernel Performance Engineer, AWS Neuron, Annapurna Labs

  • Design and implement high-performance compute kernels for ML operations
  • Analyze and optimize kernel-level performance across multiple generations of Neuron hardware
  • Conduct detailed performance analysis using profiling tools
  • Implement compiler optimizations such as fusion, sharding, tiling, and scheduling
  • Work directly with customers to enable and optimize their ML models
  • Collaborate across teams to develop innovative kernel optimization techniques
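To make the techniques named above concrete: "tiling" breaks a large computation into blocks that fit in fast on-chip memory, and "fusion" merges adjacent operations so intermediate results never round-trip through slow memory. The sketch below is purely illustrative and is not Neuron or Trainium code (real kernels target the accelerator's instruction set, not NumPy); it shows a tiled matrix multiply with a ReLU fused into the epilogue of each tile.

```python
import numpy as np

def fused_tiled_matmul_relu(a, b, tile=64):
    """Tiled matmul with a fused ReLU epilogue (illustrative sketch only).

    Tiling bounds the working set of each inner step so it could live in
    fast local memory; fusing the ReLU into the tile write-back avoids a
    second full pass over the output array.
    """
    m, k = a.shape
    k2, n = b.shape
    assert k == k2, "inner dimensions must match"
    out = np.empty((m, n), dtype=a.dtype)
    for i0 in range(0, m, tile):
        for j0 in range(0, n, tile):
            # Accumulator for one output tile (handles ragged edges).
            acc = np.zeros((min(tile, m - i0), min(tile, n - j0)),
                           dtype=a.dtype)
            for k0 in range(0, k, tile):
                acc += a[i0:i0 + tile, k0:k0 + tile] @ \
                       b[k0:k0 + tile, j0:j0 + tile]
            # Fused epilogue: apply ReLU before writing the tile back,
            # instead of launching a separate elementwise pass later.
            out[i0:i0 + tile, j0:j0 + tile] = np.maximum(acc, 0)
    return out
```

The same blocked-loop structure underlies scheduling and sharding decisions as well: the tile size and loop order determine data reuse, and splitting the outer loops across devices is one way sharding arises.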

Requirements For ML Kernel Performance Engineer, AWS Neuron, Annapurna Labs

  • Python
  • 3+ years of non-internship professional software development experience
  • 3+ years of non-internship design or architecture experience
  • Experience programming with at least one software programming language
  • Experience with GPU kernel optimization and GPGPU computing
  • Proficiency in low-level performance optimization for GPUs
  • Knowledge of ML frameworks (PyTorch, TensorFlow)

Benefits For ML Kernel Performance Engineer, AWS Neuron, Annapurna Labs

  • Work-life balance
  • Flexible working hours
  • Mentorship & Career Growth
  • Hybrid work options
