AWS Neuron is seeking a Senior Software Development Engineer to join its Neuron Inference (ML Apps) team. This role focuses on developing and optimizing machine learning model performance for the AWS Inferentia and Trainium cloud-scale accelerators. The position involves working with massive-scale language models such as Llama3 and DBRX, as well as other ML applications including Stable Diffusion and Vision Transformers.
The role is part of AWS Utility Computing (UC), which provides foundational services like S3 and EC2, along with continuous product innovations. You'll work alongside compiler and runtime engineers to create distributed inference solutions, focusing on both latency and throughput optimization.
As a senior engineer, you'll lead technical initiatives, participate in architecture decisions, and work in a fast-paced, startup-like environment. The position requires strong expertise in both software development and machine learning, with particular emphasis on performance optimization for large-scale ML models.
Amazon offers a comprehensive benefits package and values work-life harmony. The company promotes an inclusive culture through employee-led affinity groups and ongoing learning experiences. Career growth opportunities include mentorship programs and resources for professional development.
The role combines technical leadership with hands-on development, requiring expertise in C++/Python and a deep understanding of ML frameworks like PyTorch and TensorFlow. You'll be working at the cutting edge of ML acceleration technology, helping to shape the future of cloud-based machine learning infrastructure.