AWS Neuron is at the forefront of cloud-scale machine learning acceleration, providing the complete software stack for AWS Inferentia and Trainium accelerators. This Senior Software Engineering role is part of the Machine Learning Inference Applications team, where you'll be instrumental in pushing the boundaries of LLM performance optimization.
The position focuses on developing and optimizing core LLM inference components, including Attention, MLP, Quantization, and Speculative Decoding. You'll work with state-of-the-art models such as Llama 3.3 70B, Llama 3.1 405B, DBRX, and Mixtral, ensuring they perform optimally on Neuron devices.
What makes this role particularly appealing is the collaborative nature of the work: you'll work directly with chip architects, compiler engineers, and runtime engineers, bridging the gap between hardware capabilities and software optimization. The team culture strongly emphasizes knowledge-sharing and mentorship, making it an ideal environment for both personal and professional growth.
The compensation package is highly competitive: pay ranges from $151,300 to $261,500 depending on location and experience, and the package also includes equity, sign-on bonuses, and comprehensive medical coverage.
The role requires strong technical expertise with at least 5 years of software development experience and a deep understanding of machine learning fundamentals. You'll be joining a team that values both technical excellence and collaborative spirit, working on projects that directly impact the performance of AWS's machine learning infrastructure.
This position sits at the intersection of machine learning and high-performance computing, with a direct impact on how AI models are deployed and optimized in production environments. If you're passionate about ML performance, this role lets you advance your career while contributing to new developments in AI acceleration.