Taro Logo

Senior HPC Performance Engineer

NVIDIA is the world leader in accelerated computing, pioneering GPU technology and AI solutions.
$148,000 - $287,500
Senior Software Engineer
In-Person
5,000+ Employees
3+ years of experience
AI · Enterprise SaaS

Description For Senior HPC Performance Engineer

NVIDIA, the pioneer in GPU technology and accelerated computing, is seeking a Senior HPC Performance Engineer to join their GPU Communications Libraries and Networking team. This role focuses on delivering critical libraries like NCCL, NVSHMEM, and UCX for Deep Learning and HPC applications. The position involves working with cutting-edge technology in AI and high-performance computing, specifically optimizing communication performance between GPUs across massive-scale systems.

The ideal candidate will be responsible for conducting detailed performance analysis on multi-GPU clusters, optimizing communication libraries, and solving complex performance challenges. This role requires expertise in parallel programming, HPC systems, and performance engineering. You'll work with state-of-the-art hardware including NVLink, PCIe, and high-speed networking technologies like Infiniband and Ethernet.

The position offers an opportunity to influence the roadmap of NVIDIA's communication libraries and directly impact the performance of large-scale AI and HPC applications. You'll be working with a dynamic team across multiple time zones, solving challenging problems in GPU communication and networking. The role combines technical depth in performance engineering with the excitement of working on next-generation AI and computing technologies.

NVIDIA offers a competitive compensation package including a base salary range of $148,000 to $287,500, equity, and comprehensive benefits. This is an excellent opportunity for someone passionate about high-performance computing and interested in pushing the boundaries of what's possible in GPU communication and networking technology.

Last updated 2 hours ago

Responsibilities For Senior HPC Performance Engineer

  • Conduct in-depth performance characterization and analysis on large multi-GPU and multi-node clusters
  • Study the interaction of libraries with HW and SW components in the stack
  • Evaluate proof-of-concepts, conduct trade-off analysis
  • Triage and root-cause performance issues reported by customers
  • Collect performance data and build tools for visualization and analysis
  • Collaborate with team across multiple time zones

Requirements For Senior HPC Performance Engineer

Python
Linux
Kubernetes
  • M.S. or PHD in Computer Science or related field
  • 3+ years experience with parallel programming and communication runtime
  • Experience with performance benchmarking on large scale HPC clusters
  • Understanding of computer system architecture and operating systems
  • Ability to implement micro-benchmarks in C/C++
  • Proficiency in Python
  • Experience with containers, cloud provisioning and scheduling tools
  • Strong communication skills and ability to work across different teams

Benefits For Senior HPC Performance Engineer

Equity
  • Competitive base salary
  • Equity compensation
  • Company benefits package

Interested in this job?

Jobs Related To NVIDIA Senior HPC Performance Engineer