
Senior System Software Engineer, NCCL - Partner Enablement

NVIDIA is the world leader in accelerated computing and GPU technology.
$148,000 - $287,500
Senior Software Engineer
Remote
5,000+ Employees
5+ years of experience
AI · Enterprise SaaS

Description For Senior System Software Engineer, NCCL - Partner Enablement

NVIDIA, a pioneer in GPU technology and accelerated computing, is seeking a Senior System Software Engineer to join its GPU Communications Libraries and Networking team. This role focuses on the NCCL and NVSHMEM communication runtimes for Deep Learning and HPC applications. The position offers a unique opportunity to work at the intersection of AI and high-performance networking, supporting large-scale GPU clusters.

The role involves deep engagement with partners and customers, conducting performance analysis, and developing tools for cutting-edge GPU clusters. You'll work with networking technologies including InfiniBand, RoCE, and Ethernet, while supporting applications across major cloud platforms like Azure, AWS, and GCP.

This is an ideal position for someone with strong technical expertise in parallel programming, high-performance computing, and networking. The role requires excellent C/C++ programming skills, Linux expertise, and experience with container technologies. Knowledge of CUDA programming and deep learning frameworks is a plus.

Working at NVIDIA means being part of a company that's leading groundbreaking developments in Artificial Intelligence, High Performance Computing, and Visualization. You'll have the opportunity to contribute to technologies that power everything from artificial intelligence to autonomous cars. The company offers competitive compensation including a base salary range of $148,000 - $287,500, plus equity and comprehensive benefits.

The position offers flexibility with remote work options and locations in major tech hubs. You'll be part of a diverse, inclusive work environment where innovation and technical excellence are highly valued. This role is perfect for someone who wants to make an impact in the AI and HPC space while working with cutting-edge technology and industry-leading partners.

Last updated 2 days ago

Responsibilities For Senior System Software Engineer, NCCL - Partner Enablement

  • Engage with partners and customers to root cause functional and performance issues reported with NCCL
  • Conduct performance characterization and analysis of NCCL and DL applications on GPU clusters
  • Develop tools and automation to isolate issues on new systems and platforms
  • Share HPC knowledge with customers and support teams and guide them in applying it
  • Write documentation and conduct trainings and webinars for NCCL
  • Engage with internal teams in different time zones

Requirements For Senior System Software Engineer, NCCL - Partner Enablement

Python
Linux
Kubernetes
  • B.S./M.S. degree in CS/CE or equivalent experience with 5+ years of relevant experience
  • Experience with parallel programming and communication runtimes
  • Excellent C/C++ programming skills
  • Experience working with the engineering or academic research community supporting HPC or AI
  • Practical experience with high performance networking
  • Expert in Linux fundamentals and Python
  • Familiar with containers, cloud provisioning and scheduling tools
  • Flexibility to work and communicate effectively across different teams and time zones

Benefits For Senior System Software Engineer, NCCL - Partner Enablement

  • Equity
  • Benefits package
