Great Question, a product-focused startup with 14 engineers, is seeking its first dedicated DevOps/Infra hire to take ownership of platform health, reliability, and scalability. In this Site Reliability Engineer role, you'll own critical infrastructure systems end to end and partner directly with the engineering team to improve systems and reduce toil.
The role spans observability, reliability, infrastructure management, capacity planning, developer experience, security compliance, and cloud cost optimization. You'll be responsible for maintaining service SLOs, improving incident response, managing Terraform infrastructure, and leading AWS migrations.
As a foundational hire, you'll have the opportunity to shape the systems and culture of how the company builds and runs software. The position offers clear growth paths into platform leadership, Head of Infra/SRE, or Principal Engineer roles as the company expands. The technical stack includes AWS, Terraform, GitHub Actions, Docker, Kubernetes, Datadog, PostgreSQL, Redis, and Rails.
The ideal candidate has 4-8+ years of experience in DevOps or SRE roles, strong AWS expertise, and proficiency with infrastructure-as-code tools. You'll work in a high-autonomy environment with a team that values thoughtfulness, speed, and care. The role offers significant potential for impact, trust in your decision-making, and opportunities to grow with the company.
This remote position combines technical challenges with strategic platform development, making it a strong fit for someone who views infrastructure as a product and wants to build lasting foundations for a growing company. You'll have support from leadership while keeping the freedom to chart your own path.