Senior Software Developer, HPC Cluster Management

NVIDIA is the world leader in accelerated computing, pioneering it to tackle challenges no one else can solve.
Backend
Senior Software Engineer
Hybrid
5,000+ Employees
7+ years of experience
AI · Enterprise SaaS
This job posting may no longer be active.

Description For Senior Software Developer, HPC Cluster Management

NVIDIA is seeking a Senior Software Developer for HPC Cluster Management. The role involves developing head node and compute node installation and provisioning processes, working on edge site deployment, integrating the product with the latest hardware, developing features for composable infrastructure management and for BIOS and firmware upgrade management, and improving cluster scalability. The ideal candidate has 7+ years of experience in software development, proficiency in Python and Linux, and knowledge of object-oriented design and concurrent programming. Experience with Ansible, high-performance computing, and system administration is a plus. The position offers the opportunity to work on cutting-edge technology in a supportive environment, contributing to NVIDIA's mission of transforming industries through AI and accelerated computing.

Last updated a year ago

Responsibilities For Senior Software Developer, HPC Cluster Management

  • Develop head node and compute node installation and provisioning processes
  • Work on edge site deployment functionality
  • Integrate the product with the latest hardware (GPUs, DPUs, accelerators, high-speed interconnects)
  • Develop features for composable infrastructure management
  • Develop new features for BIOS and firmware upgrade management
  • Improve cluster scalability
  • Add support for new Linux distributions
  • Improve support for alternative CPU architectures such as ARM
  • Work on Ansible collections for cluster installation and management
  • Assist the support team with customer requests

Requirements For Senior Software Developer, HPC Cluster Management

  • Degree in Computer Science or related field
  • 7+ years of experience in software development
  • Strong familiarity with the Linux operating system and networking concepts
  • Proficiency in Python
  • Knowledge of object-oriented software design, design patterns, and concurrent programming
  • Emphasis on high-quality work and clean code
  • Eagerness to learn and use new technologies