Luma AI is seeking a Principal Software Engineer specializing in Reliability to join its Infrastructure and Research teams. This role is crucial for managing and optimizing Luma's extensive GPU clusters, which consist of thousands of H100 GPUs across multiple providers. The person in this role will be responsible for ensuring cluster health, building monitoring and management tools, and solving complex performance and maintenance problems.
Key responsibilities include:
The ideal candidate will have:
Luma AI offers a competitive salary range of $200,000 to $250,000 per year, along with a significant equity grant. This is an exciting opportunity to work with cutting-edge technology and contribute to the growth of a rapidly scaling company in the AI space.