Senior Site Reliability Engineer, Storage

AI-first Cloud infrastructure company pioneering vertically integrated, purpose-built AI infrastructure solutions powered by clean, renewable energy.
$183,000 - $210,000
Site Reliability
Senior Software Engineer
Hybrid
5+ years of experience
AI · Enterprise SaaS · Cloud

Description For Senior Site Reliability Engineer, Storage

Crusoe, an AI-first Cloud infrastructure company, is seeking a Senior Site Reliability Engineer specializing in Storage to join its Cloud Engineering and Product department. The role is central to maintaining Crusoe's AI-optimized cloud infrastructure, with a focus on the availability, performance, and scalability of its cloud storage products and services.

In this position you will build and optimize distributed, fault-tolerant storage systems at scale, including high-performance NVMe and SSD-backed volumes that support large-scale AI compute clusters.

The ideal candidate brings 5+ years of professional experience in SRE, systems, or storage engineering, with deep expertise in distributed storage systems, Linux internals, and containerized workloads. You'll work with Kubernetes, programming languages such as Python, Go, Java, or C, and infrastructure-as-code tools.

This role offers a competitive compensation package ranging from $183,000 to $210,000 annually, plus bonus and RSUs. The position is based in San Francisco, CA, with a hybrid work arrangement, and comes with comprehensive benefits including health insurance, 401(k) matching, and various other perks.

At Crusoe, you'll be part of a mission-critical team that's redefining AI cloud infrastructure while maintaining a strong focus on environmental sustainability. The company is well-funded and trusted by Fortune 500 companies, offering an excellent opportunity for professional growth in the rapidly evolving field of AI infrastructure.

Last updated 6 hours ago

Responsibilities For Senior Site Reliability Engineer, Storage

  • Build automation and self-healing tools for distributed cloud storage infrastructure
  • Drive reliability initiatives for data replication, encryption, and backup strategies
  • Implement and maintain high-performance NVMe and SSD-backed volumes
  • Support user-facing storage services
  • Investigate and resolve storage-related incidents
  • Partner with hardware and kernel teams to optimize I/O paths
  • Contribute to architecture of fault-tolerant storage backends
  • Maintain performance and reliability of AI-optimized cloud infrastructure

Requirements For Senior Site Reliability Engineer, Storage

Python · Go · Java · Linux · Kubernetes
  • 5+ years of professional experience in SRE, systems, or storage engineering
  • Hands-on experience with distributed storage systems
  • Proficiency in Python, Go, Java, or C
  • Experience with Infrastructure as Code and deployment tooling
  • Deep knowledge of Linux internals
  • Familiarity with storage protocols
  • Strong experience with containerized workloads and orchestration platforms
  • Excellent incident response, troubleshooting, and documentation practices
  • Excellent communication skills
  • Must be able to pass a background check
  • Embody the company values

Benefits For Senior Site Reliability Engineer, Storage

401k · Medical Insurance · Dental Insurance · Vision Insurance · Mental Health Assistance · Parental Leave · Education Budget · Commuter Benefits
  • Hybrid work schedule
  • Industry competitive pay
  • Restricted Stock Units
  • Health insurance package options (HDHP and PPO)
  • Employer contributions to HSA accounts
  • Paid Parental Leave
  • Paid life insurance, short-term and long-term disability
  • Teladoc
  • 401(k) with 100% match up to 4% of salary
  • Generous paid time off and holiday schedule
  • Cell phone reimbursement
  • Tuition reimbursement
  • Subscription to Calm app
  • MetLife Legal
  • Company paid commuter benefit ($50 per pay period)
