Software Engineer (Site Reliability Engineer)

Anyscale commercializes Ray, an open-source project that provides an ecosystem of libraries for scalable machine learning and makes distributed computing accessible to developers.
$180,600 - $200,900
Site Reliability
Senior Software Engineer
Hybrid
3+ years of experience
AI · Enterprise SaaS
This job posting is no longer active.

Job Description

Anyscale, which has raised more than $250 million from prominent investors, is revolutionizing distributed computing through Ray, its open-source project. The company is building a platform that lets developers and data scientists scale ML applications from a laptop to a cluster without deep distributed systems expertise.

As a Site Reliability Engineer at Anyscale, you'll be instrumental in maintaining the reliability and performance of the company's production systems and user-facing services. The role combines engineering excellence with operational expertise, focusing on building robust systems for monitoring, observability, and deployment automation.

Key responsibilities include developing a comprehensive view of cloud component utilization, implementing effective deployment methodologies, and building sophisticated monitoring and alerting systems. You'll also be responsible for establishing testing infrastructure and defining organization-wide SLOs.

The position offers an attractive compensation package ranging from $180.6K to $200.9K, complemented by equity and comprehensive benefits including healthcare, 401k, and various stipends. The hybrid work environment in either San Francisco or Palo Alto provides flexibility while maintaining collaborative opportunities.

This role is perfect for experienced SREs who want to work at the intersection of distributed systems and ML infrastructure, helping shape the future of AI application deployment. You'll be joining a company that powers the ML infrastructure of major tech companies like OpenAI, Uber, and Spotify, making a significant impact on the AI ecosystem.

The ideal candidate should have at least 3 years of relevant experience and a passion for building reliable, scalable systems. Anyscale values diversity and inclusion, welcoming applications from all backgrounds and providing equal opportunities for growth and success.

Last updated 3 days ago

Responsibilities For Software Engineer (Site Reliability Engineer)

  • Develop a unified view of cloud component utilization across the company
  • Ensure deployment methodologies align with reliability goals
  • Build systems for understanding and observing the production environment
  • Create monitoring and alerting systems at different levels
  • Establish testing infrastructure
  • Develop tools for measuring service level objectives (SLOs)
  • Implement best practices and on-call systems
  • Coordinate creation and deployment of cloud-based services

Requirements For Software Engineer (Site Reliability Engineer)

  • At least 3 years of relevant work experience in a similar role

Benefits For Software Engineer (Site Reliability Engineer)

  • Stock Options
  • Healthcare plans with 99% premium coverage
  • 401k Retirement Plan
  • Wellness stipend
  • Education stipend
  • Paid Parental Leave
  • Flexible Time Off
  • Commute reimbursement
  • 100% of in-office meals covered