The Annapurna Labs team at Amazon Web Services (AWS) builds AWS Neuron, the software development kit used to accelerate deep learning and GenAI workloads on Amazon's custom machine learning accelerators, Inferentia and Trainium. The Inference Enablement and Acceleration team is at the forefront of running a wide range of models, supporting novel architectures, and maximizing their performance on AWS's custom ML accelerators.
As a Senior Software Development Engineer, you will work across multiple technology layers, from frameworks and kernels to compiler, runtime, and collectives. You'll be responsible for development, enablement, and performance tuning of various LLM families, including massive-scale models. The role combines deep hardware knowledge with ML expertise to push the boundaries of AI acceleration.
Key responsibilities include architecting and implementing business-critical features, mentoring experienced engineers, and working directly with customers on model enablement. You'll collaborate with compiler and runtime engineers to create, build, and tune distributed inference solutions on Trainium and Inferentia.
The team operates in a startup-like development environment, emphasizing collaboration, technical ownership, and continuous learning. You'll work at the intersection of machine learning, high-performance computing, and distributed architectures, helping shape the future of AI acceleration technology.
The position offers competitive compensation ranging from $129,300 to $223,600 per year based on location, plus equity and comprehensive benefits. This is an excellent opportunity for someone passionate about AI/ML infrastructure and optimization who wants to make a significant impact in the field of machine learning acceleration.