
ML Systems Engineer

A research lab dedicated to building open, state-of-the-art models for video generation towards unlocking the right brain of AGI.
Machine Learning
Senior Software Engineer
In-Person
4+ years of experience
AI

Description For ML Systems Engineer

Genmo is at the forefront of AI innovation, focusing on developing cutting-edge video generation models to advance artificial general intelligence (AGI). As an ML Systems Engineer at Genmo, you'll play a crucial role in building and optimizing the infrastructure that powers its state-of-the-art models. The position combines deep technical expertise in machine learning systems with high-performance computing, requiring experience with modern ML frameworks and serving architectures.

You'll be responsible for designing and implementing sophisticated model serving systems that can handle millions of daily requests with sub-second latency. The role involves working with cutting-edge technology, including H100 GPUs, and requires expertise in model optimization techniques using tools like TensorRT and torch.compile. You'll be building automated pipelines for model deployment and optimization while ensuring robust monitoring and observability of system performance.

The ideal candidate brings 4+ years of ML engineering experience, with a strong focus on model serving at scale. You should have hands-on experience with high-performance frameworks, a deep understanding of Python and PyTorch, and a proven ability to optimize large-scale inference systems. Knowledge of transformer architectures and GPU optimization is essential.

Working at Genmo means being part of a team that's pushing the boundaries of what's possible in video generation and AI. The company values open-source contributions and experience with advanced serving optimizations. This is an opportunity to work on challenging technical problems while contributing to the development of next-generation AI technology. The position is based at Genmo's San Francisco headquarters, where you'll collaborate with researchers and engineers to shape the future of AI video generation.

Last updated 9 days ago

Responsibilities For ML Systems Engineer

  • Design and implement high-performance model serving infrastructure supporting streaming, batching, and multi-modal inputs
  • Build automated model compilation and optimization pipelines using TensorRT, torch.compile, and other compilers
  • Optimize serving systems for throughput, latency, and GPU utilization across an H100 fleet
  • Develop monitoring and observability for model-specific metrics
  • Collaborate with researchers to transition models from development to production
  • Implement A/B testing, canary deployments, and gradual rollout strategies for models
  • Integrate serving layer with platform infrastructure

Requirements For ML Systems Engineer

  • Bachelor's or Master's degree in Computer Science or related field
  • 4+ years of ML engineering experience, including 2+ years focused on model serving
  • Production experience with high-performance model serving frameworks
  • Strong Python proficiency and PyTorch experience
  • Experience with model compilation and optimization
  • Track record of building inference systems at scale (10K+ QPS)
  • Understanding of attention mechanisms and transformer architectures
  • Experience with containerized deployment and orchestration
