Senior HPC Performance Engineer

NVIDIA

NVIDIA is the world leader in accelerated computing, pioneering GPU technology and AI solutions.

Santa Clara, CA, USA

$148,000 - $287,500

Senior Software Engineer

In-Person

5,000+ Employees

3+ years of experience

AI · Enterprise SaaS

Job Description

NVIDIA, the pioneer in GPU technology and accelerated computing, is seeking a Senior HPC Performance Engineer to join their GPU Communications Libraries and Networking team. This role sits at the intersection of high-performance computing and artificial intelligence, working on critical communication libraries like NCCL, NVSHMEM, and UCX that power massive-scale GPU deployments.

The position offers a unique opportunity to influence the roadmap of communication libraries that are essential for modern deep learning and HPC applications running on tens of thousands of GPUs. You'll be working with cutting-edge technology, including high-speed interconnects like NVLink and PCIe, and networking solutions such as Infiniband and Ethernet.

As a Senior HPC Performance Engineer, you'll be responsible for conducting detailed performance analysis on large-scale GPU clusters, optimizing communication between GPUs, and ensuring optimal performance at scale. The role requires a deep understanding of computer architecture, parallel programming, and system performance optimization.

The ideal candidate brings strong technical expertise in HPC and performance engineering, with at least 3 years of experience in parallel programming and communication runtimes. You should be comfortable with both low-level programming (C/C++) and modern cloud technologies (containers, Kubernetes, SLURM). Experience with CUDA programming, GPUs, and deep learning frameworks would be particularly valuable.

NVIDIA offers competitive compensation, with a base salary ranging from $148,000 to $287,500 USD depending on level and experience, plus equity and comprehensive benefits. You'll be joining a dynamic, global team that's pushing the boundaries of what's possible in AI and HPC, making this an excellent opportunity for someone passionate about high-performance computing and system optimization.

Last updated 4 days ago

Responsibilities For Senior HPC Performance Engineer

Conduct in-depth performance characterization and analysis on large multi-GPU and multi-node clusters
Study the interaction of libraries with hardware and software components
Evaluate proof-of-concepts and conduct trade-off analysis
Triage and root-cause performance issues reported by customers
Collect performance data and build tools for visualization and analysis
Collaborate with team across multiple time zones

Requirements For Senior HPC Performance Engineer

Python

Linux

Kubernetes

M.S. or PHD in Computer Science or related field
3+ years experience with parallel programming and communication runtime
Experience with performance benchmarking on large scale HPC clusters
Understanding of computer system architecture and operating systems
Ability to implement micro-benchmarks in C/C++
Proficiency in Python
Experience with containers, cloud provisioning and scheduling tools
Strong communication skills and ability to work across different teams

Benefits For Senior HPC Performance Engineer

Equity

Equity

NVIDIA

NVIDIA is the world leader in accelerated computing, pioneering GPU technology and AI solutions.

Santa Clara, CA, USA

$148,000 - $287,500

Senior Software Engineer

In-Person

5,000+ Employees

3+ years of experience

AI · Enterprise SaaS

Senior HPC Performance Engineer

NVIDIA

Job Description

Responsibilities For Senior HPC Performance Engineer

Requirements For Senior HPC Performance Engineer

Benefits For Senior HPC Performance Engineer

NVIDIA

Related Jobs