Software Development Engineer - AI/ML, AWS Neuron Apps

Amazon is a global technology company known for e-commerce, cloud computing, and artificial intelligence.
$129,300 - $223,600
Machine Learning
Senior Software Engineer
Hybrid
5,000+ Employees
3+ years of experience
AI · Enterprise SaaS

Description For Software Development Engineer - AI/ML, AWS Neuron Apps

AWS Neuron is the complete software stack for the AWS Inferentia and Trainium cloud-scale machine learning accelerators and the Trn1 and Inf1 servers that use them. This role is for a software engineer on the Machine Learning Applications (ML Apps) team for AWS Neuron. You will be responsible for development, enablement, and performance tuning of a wide variety of ML model families, including massive-scale large language models like Llama 2, GPT-2, GPT-3, and beyond, as well as Stable Diffusion, Vision Transformers, and many more.

Key responsibilities include:

  • Leading efforts to build distributed inference support into PyTorch and TensorFlow using XLA and the Neuron compiler and runtime stacks
  • Tuning models to ensure the highest performance and maximize efficiency on AWS Trainium and Inferentia silicon and the Trn1 and Inf1 servers
  • Designing and coding solutions to drive efficiencies in software architecture
  • Creating metrics, implementing automation, and resolving root causes of software defects
  • Building high-impact solutions for a large customer base
  • Participating in design discussions, code reviews, and communicating with internal and external stakeholders
  • Working cross-functionally to drive business decisions with technical input

The ideal candidate will have strong software development skills using C++/Python and deep ML knowledge. Experience optimizing inference performance for both latency and throughput on large models using Python, PyTorch, or JAX is essential. Familiarity with DeepSpeed and other distributed inference libraries is crucial.

You'll be working in a startup-like development environment, always focusing on the most important tasks. The team is dedicated to supporting new members, with a mix of experience levels and tenures. They celebrate knowledge-sharing and mentorship, with senior members providing one-on-one mentoring and thorough code reviews.

Join AWS Neuron and be at the forefront of cloud-scale machine learning acceleration!

Last updated a month ago

Responsibilities For Software Development Engineer - AI/ML, AWS Neuron Apps

  • Develop and enable ML model families, including large language models
  • Build distributed inference support into PyTorch and TensorFlow using XLA
  • Tune models for performance on AWS Trainium and Inferentia silicon
  • Design and code solutions for software architecture efficiency
  • Create metrics and implement automation
  • Resolve root causes of software defects
  • Participate in design discussions and code reviews
  • Communicate with internal and external stakeholders
  • Work cross-functionally to drive business decisions

Requirements For Software Development Engineer - AI/ML, AWS Neuron Apps

  • 3+ years of non-internship professional software development experience
  • 2+ years of non-internship design or architecture experience
  • Experience programming in at least one programming language
  • Strong software development skills using C++/Python
  • Deep ML knowledge
  • Experience optimizing inference performance for large models using Python, PyTorch, or JAX
  • Familiarity with DeepSpeed and other distributed inference libraries

Benefits For Software Development Engineer - AI/ML, AWS Neuron Apps

  • Medical Insurance
  • 401k
  • Equity
