Distinguished Software Architect - Deep Learning and HPC Communications

NVIDIA is the world leader in accelerated computing, pioneering GPU technology and AI solutions.
Principal Software Engineer
Remote
5,000+ Employees
15+ years of experience
AI · Enterprise SaaS

Job Description

NVIDIA, the pioneer in GPU technology and accelerated computing, is seeking a Distinguished Software Architect to lead its Deep Learning and HPC Communications initiatives. This role is crucial for scaling Deep Learning and HPC applications across massive GPU clusters. The position involves working with cutting-edge technologies such as NCCL, NVSHMEM, and GPUDirect, and co-designing next-generation data center platforms that can scale to thousands of GPUs.

The ideal candidate will be an industry-recognized leader in HPC/DL communications with a proven track record of innovation. They will be responsible for researching new communication technologies, designing features for communication libraries, and proposing innovative hardware and software solutions. The role requires deep expertise in parallel programming, high-performance networking, and GPU architecture.

Working at NVIDIA means joining one of technology's most desirable employers, alongside some of the industry's most forward-thinking professionals. The company is at the forefront of groundbreaking developments in Artificial Intelligence, High Performance Computing, and Visualization. Its work enables everything from artificial intelligence to autonomous cars.

This position offers the opportunity to shape the future of large-scale computing systems, working with cutting-edge technology and collaborating with diverse teams across the globe. The role combines deep technical expertise with strategic thinking, requiring both hands-on development skills and the ability to drive adoption of new technologies across different application verticals.

Last updated 2 days ago

Responsibilities For Distinguished Software Architect - Deep Learning and HPC Communications

  • Research new communication technologies and design new features for communication libraries
  • Propose innovative solutions in HW and SW for next-gen platforms
  • Co-design solutions with GPU, Networking, and SW architects
  • Inspire changes based on quantitative data and technical analysis
  • Drive adoption of new communication technologies across application verticals
  • Collaborate with DL researchers and customers
  • Keep up with the latest DL research

Requirements For Distinguished Software Architect - Deep Learning and HPC Communications

  • PhD in Computer Science, Computer Engineering, or a related field, or equivalent experience
  • 15+ years of relevant experience in academia or industry
  • Expertise in HPC, parallel programming models, and communication runtimes
  • Deep understanding of high performance networking
  • Strong knowledge of ML/DL fundamentals
  • Programming fluency with C or C++ for systems software development
  • Ability to work and communicate effectively across different teams and time zones