Site Reliability Engineer (Data)

A platform helping millions of businesses scale with automation and AI, making automation work for everyone.
$141,100 - $185,300
Site Reliability
Senior Software Engineer
Remote
1,000 - 5,000 Employees
4+ years of experience
AI · Enterprise SaaS
This job posting may no longer be active.

Description For Site Reliability Engineer (Data)

Zapier is seeking a Site Reliability Engineer to join their Data Platforms team, focusing on enhancing the reliability and operational maturity of their modern data stack. This role combines traditional SRE responsibilities with specialized focus on data infrastructure, making it an exciting opportunity for experienced engineers passionate about building reliable systems at scale.

The position offers a competitive compensation package ranging from $141,100 to $185,300, plus equity and bonus opportunities. As a remote-first company, Zapier emphasizes strong async communication and collaboration across time zones while providing the flexibility of remote work.

The ideal candidate will bring 4+ years of SRE experience, strong cloud infrastructure knowledge (particularly AWS), and expertise in observability and incident response. You'll work with technologies like Databricks, Airflow, and various LLMOps tools, while implementing infrastructure as code and automation to reduce toil and improve system reliability.

Key responsibilities include evolving data platforms with reliability best practices, implementing comprehensive monitoring solutions, automating operations, and participating in on-call rotations. You'll also contribute to security compliance and work closely with various engineering teams to ensure platforms are both reliable and user-friendly.

Zapier offers a unique culture focused on automation and efficiency, where your work directly impacts millions of businesses globally. The company values diversity, provides comprehensive benefits, and maintains a transparent, equitable compensation philosophy. This role presents an excellent opportunity to work with cutting-edge technologies while helping scale a platform that enables business automation worldwide.

Last updated 3 days ago

Responsibilities For Site Reliability Engineer (Data)

  • Level up reliability for the modern data stack (Databricks, Airflow, LLMOps)
  • Improve observability and alerting systems
  • Automate and optimize operations through infrastructure-as-code
  • Participate in on-call rotation (one week per quarter)
  • Contribute to security and compliance readiness
  • Partner with Data Engineers, ML Engineers, and Backend Engineers
  • Implement monitoring and alerting systems
  • Maintain job orchestration logic and internal tooling

Requirements For Site Reliability Engineer (Data)

Python
TypeScript
Kubernetes
  • 4+ years of experience in Site Reliability Engineering roles
  • Experience with cloud-native architecture and services (AWS)
  • Knowledge of Terraform and experience making infrastructure decisions
  • Strong observability and incident response experience
  • Coding skills in Python, TypeScript, or Bash
  • Strong communication skills in a remote-first environment
  • Experience with Infrastructure as Code
  • Familiarity with monitoring and alerting systems

Benefits For Site Reliability Engineer (Data)

Equity
Medical Insurance
  • Competitive base salary
  • Equity
  • Annual bonus
  • Comprehensive benefits
