Crusoe builds sustainable, high-performance cloud infrastructure for AI. As a Senior Site Reliability Engineer focused on Compute, you'll develop and optimize the company's virtualization and compute infrastructure. The role combines deep expertise in Linux kernel internals, virtualization technologies, and system optimization, with a focus on supporting modern AI and HPC workloads.
You'll work with technologies including SmartNICs, BlueField devices, and TPUs, and you'll be responsible for critical infrastructure components from the kernel up to the orchestration layer. The position requires strong programming skills in a language such as C, Go, or Rust, and extensive experience with system-level debugging and performance optimization.
The company offers a comprehensive benefits package that includes equity, a competitive salary, and health and wellness benefits. The role is hybrid and based in San Francisco, on a team working to set new standards in sustainable AI infrastructure. This is an opportunity to make a significant impact at a well-funded technology company committed to both technological advancement and environmental responsibility.
The role suits an experienced SRE who is passionate about infrastructure optimization, has deep Linux expertise, and wants to work on challenging problems at the intersection of AI, cloud computing, and sustainability. You'll contribute to a platform described as the "gold standard" for reliability and performance in AI infrastructure.