Senior System Software Engineer, NCCL - Partner Enablement

NVIDIA is the world leader in accelerated computing, pioneering GPU technology and AI solutions.
$148,000 - $287,500
Senior Software Engineer
Remote
5,000+ Employees
5+ years of experience
AI · Enterprise SaaS
This job posting may no longer be active.

Description For Senior System Software Engineer, NCCL - Partner Enablement

NVIDIA, the pioneer in GPU technology and AI solutions, is seeking a Senior System Software Engineer to join its GPU Communications Libraries and Networking team. This role focuses on NCCL (NVIDIA Collective Communications Library) partner enablement, working at the intersection of deep learning and high-performance computing.

The position offers an exceptional opportunity to work with cutting-edge technology in AI and HPC, specifically focusing on communication runtimes like NCCL and NVSHMEM for Deep Learning and HPC applications. You'll be working with large-scale GPU clusters utilizing high-speed networking technologies including InfiniBand, RoCE, and Ethernet.

As a Partner Enablement Engineer, you'll be responsible for guiding key partners and customers with NCCL implementation, conducting performance analysis, and developing tools for issue isolation across various platforms. The role requires strong technical expertise in C/C++ programming, parallel computing, and high-performance networking, combined with excellent communication skills to work effectively with partners and internal teams across different time zones.

The position offers competitive compensation with a base salary range of $148,000 to $287,500 USD, plus equity benefits. NVIDIA's work environment is at the forefront of technological innovation, where you'll contribute to groundbreaking developments in artificial intelligence, high-performance computing, and visualization technologies.

This role is perfect for someone who has a deep understanding of HPC systems, strong programming skills, and a passion for working with cutting-edge technology. The opportunity to work with NVIDIA's industry-leading GPU technology and contribute to the advancement of AI and HPC makes this an exciting position for the right candidate.

Last updated 2 months ago

Responsibilities For Senior System Software Engineer, NCCL - Partner Enablement

  • Engage with partners and customers to root-cause functional and performance issues reported with NCCL
  • Conduct performance characterization and analysis of NCCL and DL applications on GPU clusters
  • Develop tools and automation to isolate issues on new systems and platforms
  • Guide customers and support teams on HPC topics
  • Document and conduct trainings/webinars for NCCL
  • Engage with internal teams on networking, GPUs, storage, infrastructure and support

Requirements For Senior System Software Engineer, NCCL - Partner Enablement

  • B.S./M.S. degree in CS/CE, or equivalent experience, with 5+ years of relevant experience
  • Experience with parallel programming and communication runtimes
  • Excellent C/C++ programming skills
  • Experience working with an engineering or academic research community supporting HPC or AI
  • Practical experience with high performance networking
  • Expert in Linux fundamentals and Python
  • Familiar with containers, cloud provisioning and scheduling tools
  • Adaptability and passion to learn new areas and tools
  • Flexibility to work and communicate effectively across different teams and time zones

Benefits For Senior System Software Engineer, NCCL - Partner Enablement

  • Equity
