Senior System Software Engineer, NCCL - Partner Enablement

NVIDIA is the world leader in accelerated computing, pioneering GPU technology and AI solutions.
Senior Software Engineer
Remote
5,000+ Employees
5+ years of experience
AI · Enterprise SaaS

Description For Senior System Software Engineer, NCCL - Partner Enablement

NVIDIA is seeking a Senior System Software Engineer to join its NCCL team. NCCL (the NVIDIA Collective Communications Library) is crucial for scaling Deep Learning and HPC applications. This role focuses on partner enablement, working with NVIDIA's GPU communication libraries. The position offers an opportunity to understand the AI networking stack comprehensively while working with current high-performance computing technology.

The role involves engaging directly with partners and customers to optimize NCCL performance, troubleshoot issues, and provide expert guidance on HPC implementations. You'll be working with state-of-the-art GPU clusters, developing tools for issue isolation, and conducting performance analysis across various cloud platforms including Azure, AWS, and GCP.

As a senior engineer, you'll be instrumental in supporting NVIDIA's mission of advancing AI and HPC capabilities. The position requires strong expertise in C/C++ programming, parallel computing, and high-performance networking protocols. You'll work with modern technologies including containers, cloud infrastructure, and machine learning frameworks.

The ideal candidate will have 5+ years of relevant experience, strong communication skills, and a passion for technology innovation. The role contributes to NVIDIA's vision of transforming computing. It offers remote work options across multiple European locations and suits experienced engineers looking to make an impact in the AI and HPC space.

Responsibilities For Senior System Software Engineer, NCCL - Partner Enablement

  • Engage with partners and customers to root-cause functional and performance issues with NCCL
  • Conduct performance characterization and analysis of NCCL and DL applications on GPU clusters
  • Develop tools and automation to isolate issues on new systems and platforms
  • Share HPC knowledge with customers and support teams
  • Write documentation and conduct training sessions and webinars for NCCL
  • Engage with internal teams on networking, GPUs, storage, infrastructure and support

Requirements For Senior System Software Engineer, NCCL - Partner Enablement

Skills: Python, Linux, Kubernetes

  • B.S./M.S. degree in CS/CE or equivalent experience with 5+ years of relevant experience
  • Experience with parallel programming and communication runtimes
  • Excellent C/C++ programming skills
  • Experience working with an engineering or academic research community supporting HPC or AI
  • Practical experience with high performance networking
  • Expert in Linux fundamentals and Python
  • Familiar with containers, cloud provisioning and scheduling tools
  • Flexibility to work across different teams and timezones
