NVIDIA, the world leader in accelerated computing, is seeking a Senior Performance Research and Analysis Engineer to join its Performance group. The role focuses on profiling and analyzing AI workloads on large-scale GPU and CPU clusters, specifically distributed deep learning training of LLMs.
The position involves working with the latest hardware and platforms, including HCAs, switches, CPUs, GPUs, and systems. The engineer develops and implements analysis tools and methodologies for understanding the performance expectations, limitations, and bottlenecks of AI systems.
Key responsibilities include researching AI workloads and DL models for large-scale training, conducting comprehensive performance analysis, and collaborating across hardware and software teams. The role requires expertise in high-performance networking, with a focus on RDMA and MPI, along with strong programming skills in Python, Bash, and C.
The ideal candidate will have at least 5 years of experience in high-performance networking, a strong background in computer science or software engineering, and demonstrated expertise with NVIDIA GPUs and deep learning frameworks. The work contributes directly to advancing AI computing performance.
NVIDIA provides a diverse and inclusive work environment and the chance to work on technology that impacts major industries worldwide. The role can be performed remotely from multiple European locations, in collaboration with global teams on AI and computing challenges.