Senior AI Cluster Tools Developer

NVIDIA is the world leader in accelerated computing, pioneering solutions in AI and digital twins.
$148,000 - $287,500
Machine Learning
Senior Software Engineer
Hybrid
5,000+ Employees
5+ years of experience
AI · Enterprise SaaS
This job posting may no longer be active.

Description For Senior AI Cluster Tools Developer

NVIDIA, the world leader in accelerated computing, is seeking a Senior AI Cluster Tools Developer to join its software team. This role focuses on developing analysis and debugging tools that improve the performance and power efficiency of NVIDIA's products. The position involves working with several departments, including the Architecture and Software teams, to provide intuitive insights into workload and system performance.

The ideal candidate will have strong expertise in Python/Go/C++ development, deep understanding of AI frameworks, and extensive knowledge of cluster computing environments. You'll be responsible for building performance profiling tools, debugging solutions, and collaborating with architects to improve hardware features based on real-world use cases.

This is an excellent opportunity to work at the forefront of AI and GPU technology, developing tools that directly impact NVIDIA's product development and optimization. The role offers competitive compensation, including a substantial base salary range of $148,000 - $287,500, plus equity and comprehensive benefits.

Working in NVIDIA's dynamic environment, you'll be part of a forward-thinking team that values creativity and innovation. The hybrid work model offers flexibility while maintaining collaborative opportunities with some of the most brilliant minds in the technology industry. If you're passionate about AI, hardware optimization, and building sophisticated developer tools, this role presents an exciting opportunity to make a significant impact in the field of accelerated computing.

Last updated 6 months ago

Responsibilities For Senior AI Cluster Tools Developer

  • Build internal performance/power profiling and analysis tools and platforms for AI workloads at cluster scale
  • Build debugging tools for commonly encountered problems in GPU clusters
  • Work with users to build and calibrate performance/power models for next-generation hardware or systems
  • Partner with architects to propose new hardware features or improve existing features based on real-world use cases

Requirements For Senior AI Cluster Tools Developer

Skills: Python, Go, Linux, Kubernetes
  • BS+ in Computer Science or related (or equivalent experience) and 5+ years of software development
  • Strong software design and implementation ability with Python/Go/C++
  • Good understanding of deep learning and AI frameworks such as PyTorch and TensorFlow
  • Knowledge of AI cluster job scheduling, storage management, and network management
  • Knowledge of the Linux kernel
  • Excellent problem-solving and project management skills
  • Flexibility for working in an evolving environment with changing requirements

Benefits For Senior AI Cluster Tools Developer

  • Equity
  • Competitive Benefits Package