Software Engineer, Inference - TL

OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits humanity.
$460,000 - $685,000
Backend
Staff Software Engineer
In-Person
1,000 - 5,000 Employees
8+ years of experience
AI
This job posting may no longer be active.

Description For Software Engineer, Inference - TL

OpenAI is seeking a Technical Lead Software Engineer for its Inference team, which focuses on high-performance model inference and infrastructure scaling. The role combines technical leadership with hands-on engineering: leading the design and implementation of core inference infrastructure for frontier AI models, optimizing CUDA-based systems, and collaborating with researchers to build scalable solutions. The ideal candidate has deep expertise in GPU computing, PyTorch, and NVIDIA's software stack, proven experience leading complex engineering initiatives, and the ability to mentor other engineers while working cross-functionally with research and product teams.

Compensation is $460K-$685K plus equity, with benefits including healthcare, 401(k) matching, and generous parental leave. The infrastructure built by this team powers products used by consumers and enterprises worldwide.

Last updated 3 days ago

Responsibilities For Software Engineer, Inference - TL

  • Lead the design and implementation of core inference infrastructure for serving frontier AI models in production
  • Own and optimize CUDA-based systems and kernels to maximize performance across our fleet
  • Partner with researchers to integrate novel model architectures into performant, scalable inference pipelines
  • Build tooling and observability to detect bottlenecks, guide system tuning, and ensure stable deployment at scale
  • Collaborate cross-functionally to align technical direction across research, infra, and product teams
  • Mentor engineers on GPU performance, CUDA development, and distributed inference best practices

Requirements For Software Engineer, Inference - TL

  • Python
  • Deep expertise in CUDA, including writing and optimizing high-performance kernels for inference or training workloads
  • Experience leading complex engineering efforts, particularly at the systems and performance layer of large-scale ML infrastructure
  • Understanding of the full inference stack - from model loading and memory management to communication libraries and deployment orchestration
  • Comfortable working in large, distributed GPU environments and debugging performance issues across hardware and software layers
  • Strong familiarity with PyTorch and NVIDIA's GPU software stack (NCCL, NVLink, MIG, etc.)
  • Systems-level view with ability to dive into low-level code for performance optimization

Benefits For Software Engineer, Inference - TL

  • Medical, dental, and vision insurance for you and your family
  • Mental health and wellness support
  • 401(k) plan with 50% matching
  • Generous time off and company holidays
  • 24 weeks of paid birth-parent leave and 20 weeks of paid parental leave
  • Annual learning & development stipend ($1,500 per year)
  • Equity compensation
