Principal Firmware Engineer - Data Center Server Management

NVIDIA is the world leader in accelerated computing, pioneering solutions for AI and digital twins that transform industries and society.
$272,000 - $471,500
Embedded
Principal Software Engineer
Hybrid
15+ years of experience
AI · Enterprise SaaS

Description For Principal Firmware Engineer - Data Center Server Management

NVIDIA, known as "the AI computing company," is seeking a Principal Firmware Engineer for Data Center Server Management. This role involves driving server management for large clusters and data centers deploying GPUs and Grace solutions. The ideal candidate will work with data center architects and cloud customers to define requirements, collaborate with internal teams for implementation, and own end-to-end manageability architecture for products in data centers.

Key responsibilities include:

  • Designing and building data center health management workflows
  • Driving reliability and optimization in firmware architecture
  • Working closely with cluster bring-up teams to resolve issues quickly
  • Owning firmware delivery to data centers in terms of quality, reliability, and telemetry performance

Requirements:

  • 15+ years of relevant experience in server firmware (BMC) and platform software development
  • BS, MS, or PhD in EE/CS or related field
  • Hands-on experience with data center health management workflows
  • Strong knowledge of data center management, server architecture, and server manageability
  • Proficiency in C/C++ and Python
  • Experience with SCM (e.g., Git, Perforce) and project management tools like Jira

The ideal candidate is a self-starter and creative problem-solver with excellent communication skills. Experience with x86 or ARM system architecture and proven leadership in driving solutions to large, complex problems across teams of 50+ engineers are considered advantages.

NVIDIA offers a competitive base salary range of $272,000 - $471,500 USD, along with equity and comprehensive benefits. Join NVIDIA at the forefront of technological advancement in AI and accelerated computing.

Last updated 25 days ago

Responsibilities For Principal Firmware Engineer - Data Center Server Management

  • Drive server management for large clusters and data centers deploying GPUs and Grace solutions
  • Work with data center architects and cloud customers to define requirements
  • Collaborate with internal teams for implementation
  • Design and build data center health management workflows
  • Drive reliability and optimization in firmware architecture
  • Work closely with cluster bring-up teams to resolve issues quickly
  • Own firmware delivery to data centers in terms of quality, reliability, and telemetry performance

Requirements For Principal Firmware Engineer - Data Center Server Management

Skills: Python, Linux
  • 15+ years of experience in server firmware (BMC) and platform software development
  • BS, MS, or PhD in EE/CS or related field
  • Hands-on experience with data center health management workflows
  • Strong knowledge of data center management, server architecture, and server manageability
  • Proficiency in C/C++ and Python
  • Experience with SCM (e.g., Git, Perforce) and project management tools like Jira
  • Excellent written and oral communication skills
  • Self-starter with creative problem-solving abilities
