Solutions Architect - AI and HPC Cloud

NVIDIA is the world leader in accelerated computing, tackling challenges no one else can solve.
Santa Clara, CA, USA
$208,000 - $391,000
Cloud
Staff Software Engineer
In-Person
5,000+ Employees
10+ years of experience
AI · Enterprise SaaS

Description For Solutions Architect - AI and HPC Cloud

NVIDIA is seeking a Solutions Architect for its IPP's Cloud Infrastructure Team. This role involves working with various NVIDIA groups to meet their infrastructure needs, providing cloud services that support almost half a million automated jobs daily on thousands of servers. The ideal candidate will be passionate about distributed infrastructure, ready to build the next generation of cloud services, and eager to solve sophisticated, critical issues.

Key responsibilities include:

  • Collaborating with NVIDIA Product Teams to understand new product requirements, especially in HPC and AI/ML.
  • Designing optimum solutions for product deployment in datacenter or lab environments.
  • Assisting in the roll-out of new development features supporting the latest NVIDIA hardware and technologies.
  • Defining and implementing full-scale solutions for product onboarding into hosted and private cloud environments.
  • Solving complex problems involving multi-site deployments of NVIDIA products.
  • Collaborating with cross-functional teams to deliver reliable and robust platforms from concept to deployment.
  • Integrating and optimizing cluster deployment methods and managing software stack deployments.

Requirements:

  • Bachelor's or Master's in Computer Science or Software Engineering, or equivalent experience.
  • 10+ years of relevant experience.
  • 5+ years of Linux and scripting experience.
  • Strong background in OS kernels and system engineering.
  • Experience in deploying complex systems in fast-paced environments.
  • Excellent understanding of embedded systems, orchestration & automation systems, data centers, and cloud architecture.
  • Strong problem-solving skills and experience in product engineering/failure analysis.

Preferred qualifications:

  • Experience with compute cluster administration and automation.
  • Background in large-scale QA environments for product bring-ups.
  • Skills in large-scale computing, cluster computing, and data center design.
  • Strong background in Windows & Linux administration.

NVIDIA offers a competitive base salary range of $208,000 - $391,000 USD, along with equity and benefits. The company values diversity and is an equal opportunity employer.

Last updated 24 days ago

Responsibilities For Solutions Architect - AI and HPC Cloud

  • Work with NVIDIA Product Teams to understand new product requirements
  • Design optimum solutions for product deployment in datacenter or lab environments
  • Assist in roll-out of new development features
  • Define and implement full-scale solutions for product onboarding into cloud environments
  • Solve complex problems involving multi-site deployments of NVIDIA products
  • Collaborate with cross-functional teams to deliver reliable platforms
  • Integrate and optimize cluster deployment methods and manage software stack deployments

Requirements For Solutions Architect - AI and HPC Cloud

Linux
Kubernetes
  • Bachelor's or Master's in Computer Science or Software Engineering, or equivalent experience
  • 10+ years of relevant experience
  • 5+ years of Linux and scripting experience
  • Strong background in OS kernels and system engineering
  • Experience in deploying complex systems in fast-paced environments
  • Understanding of embedded systems, orchestration & automation systems, data centers and cloud architecture
  • Strong problem-solving skills and experience in product engineering/failure analysis

Benefits For Solutions Architect - AI and HPC Cloud

  • Equity
