Engineering Manager, Distributed Systems

AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity.
$360,000 - $530,000
Distributed Systems
Principal Software Engineer
Hybrid
5+ years of experience
AI
This job posting may no longer be active.

Description For Engineering Manager, Distributed Systems

The Platform Runtime team at OpenAI builds low-level framework components to power ML training systems. As an Engineering Manager for Distributed Systems, you'll manage teams responsible for delivering powerful APIs that orchestrate thousands of computers moving and persisting vast amounts of data. This role requires optimizing end-to-end systems, understanding high-performance I/O, and scaling experiences to new supercomputers while maintaining stability and performance. The team uses Python and Rust, and works in a fast-paced, iterative environment. You'll build teams to bring technology to millions of users worldwide, ensuring safety and reliability.

The role is based in San Francisco, CA or remote in the United States (preferably West Coast), with a hybrid work model of 3 days in the office per week. Relocation assistance is offered.

Key responsibilities and qualifications:

  • 5+ years of management experience, including managing managers
  • Experience building large-scale systems to distribute workloads at industry-changing scale
  • Technical leadership with hands-on work and team management
  • Fostering a diverse, equitable, and inclusive culture
  • Collaborating with cross-functional teams on reliability and scalability
  • End-to-end problem ownership and willingness to learn
  • Excellent communication skills

OpenAI values diversity and is an equal opportunity employer, committed to providing reasonable accommodations to applicants with disabilities.

Last updated 7 months ago

Responsibilities For Engineering Manager, Distributed Systems

  • Manage teams responsible for distributed systems and APIs
  • Optimize end-to-end systems for performance and scalability
  • Lead technical teams to peak performance
  • Foster a diverse and inclusive culture
  • Collaborate with cross-functional teams on reliability and scalability
  • Ensure technology is delivered with safety and reliability

Requirements For Engineering Manager, Distributed Systems

  • Python
  • Rust
  • 5+ years of management experience, including managing managers
  • Experience building large-scale distributed systems
  • Technical leadership skills
  • Collaborative mindset
  • Problem-solving ability
  • Excellent communication skills

Benefits For Engineering Manager, Distributed Systems

Relocation Benefits
  • Relocation assistance
