NVIDIA is seeking a Senior Performance Engineer to join its Deep Learning team, focusing on building and optimizing tools used by Deep Learning engineers worldwide. This role is at the forefront of AI platform development, working directly with premier frameworks like PyTorch, JAX, and TensorFlow.
The position offers an opportunity to work with NVIDIA's cutting-edge AI platform, developing and optimizing critical libraries such as Transformer Engine for Large Language Model training and TensorFlow Distributed Embeddings for recommender systems. You'll be part of an ambitious, forward-thinking team that influences all areas of NVIDIA's AI platform.
As a Senior Performance Engineer, you'll collaborate with multiple teams both internally and externally, including the open-source community, to optimize the world's leading AI platform. The role involves hands-on work with large-scale Deep Learning training workloads, performance optimization of modern AI models, and contribution to community benchmarks like MLPerf.
The ideal candidate should have strong programming skills in C++ and Python, with experience in parallel programming and GPU computing. Knowledge of computer architecture and proven experience with large software projects are essential. Experience with Deep Learning frameworks, language model training, and performance analysis would be particularly valuable.
This position offers competitive compensation with a base salary range of $184,000 - $425,500 USD, plus equity and comprehensive benefits. The role is based in Santa Clara, CA, with hybrid work options, and offers the chance to work on current AI technology while contributing to NVIDIA's mission of transforming industries through accelerated computing.