Senior Site Reliability Engineer

Jobgether is a Talent Matching Platform that partners with companies worldwide to connect top talent with opportunities through AI-driven job matching.
United States
Site Reliability
Senior Software Engineer
Remote
5+ years of experience
Enterprise SaaS · AI

Description For Senior Site Reliability Engineer

Jobgether, an AI-driven Talent Matching Platform, is seeking a Senior Site Reliability Engineer to join its team, working remotely from within the United States. The role is central to scaling, securing, and improving the company's cloud infrastructure. As an SRE, you'll be responsible for system reliability and scalability, achieved through proactive solutions and automation. The position requires expertise in Kubernetes, AWS services, and PostgreSQL administration, with a focus on implementing infrastructure as code and maintaining high-availability systems. You'll work closely with engineering teams, lead incident response, and drive continuous improvement in system performance and security. Benefits include equity, health coverage, unlimited PTO, and professional development opportunities. The role requires 5+ years of experience and suits a seasoned SRE who wants to make a significant impact at a growing technology company.

Responsibilities For Senior Site Reliability Engineer

  • Own initiatives related to system reliability and scalability
  • Participate in on-call rotations, respond to incidents, and perform root cause analysis
  • Design, deploy, and manage Kubernetes clusters
  • Architect and maintain AWS infrastructure
  • Automate infrastructure provisioning using tools like Crossplane and Terraform
  • Enhance observability by improving monitoring systems using Datadog
  • Conduct post-incident reviews and document lessons learned

Requirements For Senior Site Reliability Engineer

Key skills: Kubernetes, PostgreSQL, Python, Linux
  • Minimum of 5 years of experience in SRE, DevOps, or Infrastructure Engineering
  • Proficiency in Kubernetes, Helm, and network security practices
  • In-depth experience with AWS services
  • Expertise in PostgreSQL administration
  • Familiarity with CI/CD tools like GitHub Actions and ArgoCD
  • Strong understanding of, and hands-on experience with, Infrastructure as Code (IaC)
  • Experience in observability and monitoring with Datadog
  • Proficiency in Python and Bash scripting
  • Strong communication skills

Benefits For Senior Site Reliability Engineer

Benefit highlights: Medical Insurance, Dental Insurance, Vision Insurance, Mental Health Assistance, Parental Leave, 401k, Education Budget, Equity
  • Competitive base salary and equity options
  • Comprehensive health, dental, and vision coverage
  • Life insurance and mental wellness coverage
  • Flex Time Off (unlimited)
  • Paid family leave, medical leave, and bereavement leave
  • Retirement savings plans
  • Home office setup allowance
  • Annual professional development stipend
  • Flexible remote work options
