Join Annapurna Labs, a crucial part of AWS, as a Senior Software Development Engineer focusing on distributed AI/ML systems. This role puts you at the forefront of AI/ML development, working on features for the largest clusters and AI models. You'll be developing collective operations that enable AI to scale across multiple accelerators and servers, primarily using C/C++ in a low-level environment.
The position requires strong expertise in Linux, kernels, and performance optimization. Your work will directly impact AWS's EC2 infrastructure, as every instance runs on hardware designed by Annapurna Labs. You'll collaborate with a diverse, international team of infrastructure experts, hardware engineers, RTL engineers, scientists, and architects.
The role offers significant growth opportunities, as you will work alongside principal-level engineers and directors. The team values mentorship and knowledge-sharing and maintains a strong work-life balance. You'll be part of a fast-paced environment focused on the latest AI/ML advancements while enjoying flexible working hours and a supportive team culture.
Key responsibilities include developing networking solutions for machine learning and high-performance computing workloads, mentoring junior engineers, and contributing to the full software development lifecycle. The ideal candidate brings 3+ years of software development experience and strong system architecture skills, preferably with experience in embedded systems and high-speed networking.
This is an excellent opportunity for someone passionate about low-level systems programming, distributed computing, and AI/ML infrastructure. You'll be working on cutting-edge technology that powers some of the world's largest AI workloads while being part of a team that values continuous learning and professional development.