Microsoft's Azure HPC/AI Software team is expanding its engineering presence in Dublin to accelerate innovation in Azure HPC/AI Images and Microsoft HPC Pack. This role focuses on developing and maintaining performance-optimized OS images for high-performance computing and AI workloads.
As a Software Engineer II, you'll work with cutting-edge technologies including MPI libraries, GPU computing frameworks (CUDA, NCCL, ROCm, RCCL), high-speed networking (NVLink, InfiniBand, RDMA), and parallel file systems. You'll be responsible for integrating and optimizing these technologies for Azure customers while also supporting Microsoft HPC Pack's job scheduling and cluster management capabilities.
The position involves collaborating with experienced engineers and industry partners to power some of the world's most demanding workloads, from physics simulations and climate modeling to AI training on thousands of GPUs. You'll contribute to the development, testing, and maintenance of Azure HPC/AI Images, ensuring optimal performance and minimal setup time for customers.
Key responsibilities include determining user requirements, contributing to design documents, implementing code, breaking down work items, and participating in on-call rotations as a Designated Responsible Individual (DRI). The role requires staying current with technological developments to improve system availability, reliability, efficiency, and performance.
This is an excellent opportunity to work with enterprise-class cluster management systems and contribute to infrastructure supporting scientific research and AI innovation worldwide. The position offers comprehensive benefits including healthcare, educational resources, investment options, and parental leave, along with support for work-life balance.