Senior Software Engineer, Infrastructure

CentML develops AI infrastructure to reduce the cost of developing and deploying ML models, enabling widespread access to AI technology.
Cloud
Senior Software Engineer
Hybrid
4+ years of experience
AI · Enterprise SaaS

Description For Senior Software Engineer, Infrastructure

CentML is revolutionizing the AI infrastructure landscape with a mission to democratize AI by significantly reducing the costs associated with ML model development and deployment. The company is led by a distinguished team of experts from leading tech companies and is spearheaded by co-founder and CEO Gennady Pekhimenko, a renowned expert in ML systems.

As a Senior Software Engineer in Infrastructure, you will play a pivotal role in shaping the future of ML infrastructure. You'll be responsible for designing and developing the CentML platform's deployment infrastructure, which manages ML training and inference across multiple cloud providers including AWS, GCP, Azure, CoreWeave, and OCI. This role combines deep technical expertise in containerization, cloud infrastructure, and GPU technologies with the opportunity to lead a team of engineers.

The position offers an exciting opportunity to work on cutting-edge technology that directly impacts the accessibility of AI technology. You'll be working with state-of-the-art GPU clusters, implementing sophisticated scheduling solutions, and ensuring the platform's scalability and performance. The role requires a strong background in containerized deployment systems, cloud infrastructure, and programming languages like Python, Java, and Go.

Working at CentML means joining a company that values diversity, inclusion, and work-life balance. The company offers competitive benefits including equity options, comprehensive healthcare, and professional development opportunities. Whether you're based in Toronto or San Francisco, you'll be part of a team that's pushing the boundaries of what's possible in AI infrastructure.

Last updated 2 months ago

Responsibilities For Senior Software Engineer, Infrastructure

  • Design and lead the development of the deployment infrastructure of the CentML platform
  • Implement GPU cluster scheduling solutions for large-scale ML training and inference workloads
  • Communicate with product teams and define new features and goals for improving the CentML platform

Requirements For Senior Software Engineer, Infrastructure

Python
Java
Go
Kubernetes
  • 4+ years of experience working with containerized deployment systems
  • Experience with deploying and managing cloud infrastructure on AWS, GCP, Azure
  • Strong coding skills in languages like Python, Java, Go, and/or C/C++
  • Knowledge of GPU architecture and NVIDIA GPU virtualization technologies is highly desirable
  • Past experience building GPU clusters for large-scale ML training and inference is desirable

Benefits For Senior Software Engineer, Infrastructure

Equity
Medical Insurance
Dental Insurance
Parental Leave
Education Budget
  • An open and inclusive work environment
  • Employee stock options
  • Best-in-class medical and dental benefits
  • Parental leave top-up
  • Professional development budget
  • Flexible vacation time
