Senior Platform Engineer

A research lab building open, state-of-the-art models for video generation towards unlocking the right brain of AGI.
DevOps
Senior Software Engineer
In-Person
5+ years of experience
AI

Description For Senior Platform Engineer

Genmo, an AI research lab, is seeking a Senior Platform Engineer to join its team in San Francisco. The company develops video generation models with the aim of advancing artificial general intelligence (AGI). The role sits at the intersection of infrastructure and AI: building and maintaining the systems that power next-generation video generation models.

As a Senior Platform Engineer, you'll be responsible for architecting and managing a sophisticated multi-cluster infrastructure that spans both cloud and on-premises GPU environments. Your work will be crucial in ensuring the reliable deployment and operation of AI models at scale, with a focus on zero-downtime deployments and optimal performance.

The ideal candidate brings strong expertise in distributed systems, with at least 5 years of experience building production-grade systems. You should be proficient in systems programming languages like Go or Rust, along with Python, and have deep knowledge of modern DevOps practices and tools including Kubernetes, infrastructure-as-code, and observability systems.

This role offers the opportunity to work on challenging technical problems at scale, including GPU capacity planning, global load balancing, and building robust observability systems for AI infrastructure. You'll also play a key leadership role, mentoring team members and influencing the technical direction of the platform.

Working at Genmo means being at the forefront of AI video generation, with the chance to shape the infrastructure behind its models. The company offers a collaborative environment where you can contribute directly to that work using state-of-the-art tools and systems.

Last updated 2 days ago

Responsibilities For Senior Platform Engineer

  • Architect multi-cluster infrastructure layer across clouds and on-prem GPU fleets
  • Automate deployment, rollout, and autoscaling workflows
  • Forecast & plan GPU capacity to meet latency SLOs while controlling cost
  • Shape traffic policy for secure, low-latency routing and global load balancing
  • Instrument & observe systems for end-to-end telemetry and debuggability
  • Standardize infrastructure automation, disaster-recovery, and CI/CD practices
  • Drive reliability through post-incident review and continuous improvement
  • Mentor & lead: share distributed-systems best practices and influence the roadmap

Requirements For Senior Platform Engineer

Go, Python, Rust, Kubernetes

  • BS/MS/PhD in CS, EE, or related field
  • 5+ years building production-grade distributed systems
  • Fluency in a systems language (Go or Rust) plus Python
  • Clear, concise communication and an ownership mindset
  • Experience with Kubernetes internals and multi-cluster operations
  • Knowledge of infrastructure-as-code tools and GitOps workflows
  • Familiarity with service-mesh frameworks
  • Experience with observability stacks and GPU telemetry
  • Understanding of CI/CD tooling
