Senior Site Reliability Engineer

Kontakt.io

Building the platform that care operations run on, using AI, RTLS, and EHR data to enable self-learning agents to automate workflows in healthcare.

New York, NY, USA

Site Reliability

Senior Software Engineer

Remote

101 - 500 Employees

5+ years of experience

Healthcare · Enterprise SaaS · AI

Description For Senior Site Reliability Engineer

Kontakt.io is revolutionizing healthcare operations with its platform, which uses AI, RTLS, and EHR data to optimize care delivery. As a Senior Site Reliability Engineer, you'll play a crucial role in ensuring the reliability and performance of its cloud-based, real-time platform, which serves healthcare facilities with a commitment to 99.99% uptime.

The position offers an opportunity to work on mission-critical systems that directly impact healthcare delivery efficiency. You'll be responsible for designing and implementing self-healing, fault-tolerant systems, managing containerized environments, and developing robust monitoring solutions using cutting-edge technologies like Prometheus, Grafana, and OpenTelemetry.

The role combines technical challenges with meaningful impact: you'll be working on systems that help reduce waste, optimize resources, and improve patient care while delivering 10X ROI to healthcare facilities. You'll join a high-performing team of engineers, AI experts, and healthcare innovators solving real-world challenges.

Key technical aspects include working with AWS cloud infrastructure, Kubernetes orchestration, infrastructure as code using Terraform, and implementing comprehensive observability solutions. The position requires expertise in distributed systems, security compliance (HIPAA, SOC 2), and automated deployment processes.

This is a remote position based on the East Coast/New York City. You'll collaborate with cross-functional teams to align SRE initiatives with business goals. The role requires 5+ years of experience in SRE or Cloud Infrastructure, with a strong background in scaling high-traffic, mission-critical platforms.

If you're passionate about using technology to improve healthcare operations and want to work with cutting-edge automation and observability tools while ensuring critical healthcare services remain available 24/7, this role offers an excellent opportunity to make a significant impact in the healthcare technology sector.

Last updated 9 hours ago

Responsibilities For Senior Site Reliability Engineer

Ensure 99.99% uptime of cloud platform by maintaining highly reliable and resilient infrastructure
Design and implement self-healing, fault-tolerant systems
Define and maintain SLIs, SLOs, and SLAs
Architect and optimize scalable cloud infrastructure (AWS)
Improve and manage containerized environments (Kubernetes, Docker)
Implement and enhance infrastructure as code (Terraform)
Develop monitoring, alerting, and logging systems using Prometheus, Grafana, OpenTelemetry, and Datadog
Participate in incident response and on-call rotations
Conduct blameless postmortems
Automate deployment, scaling, and failover mechanisms
Contribute to disaster recovery and business continuity planning
Work with Product, Engineering, and Infrastructure teams
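
To put the 99.99% uptime target and the SLO work above in concrete terms, here is a minimal sketch of the usual error-budget arithmetic. The function names and the 30-day window are illustrative assumptions, not details taken from Kontakt.io's platform or tooling.

```python
# Illustrative sketch only: function names and window lengths are assumptions.

def allowed_downtime_minutes(slo: float, window_days: int) -> float:
    """Minutes of downtime an availability SLO permits over a window."""
    total_minutes = window_days * 24 * 60
    return (1.0 - slo) * total_minutes


def budget_remaining(slo: float, window_days: int, downtime_minutes: float) -> float:
    """Fraction of the error budget still unspent (negative if overspent)."""
    budget = allowed_downtime_minutes(slo, window_days)
    return (budget - downtime_minutes) / budget


if __name__ == "__main__":
    # A 99.99% target leaves roughly 4.32 minutes per 30 days.
    print(f"{allowed_downtime_minutes(0.9999, 30):.2f} minutes per 30 days")
    # After 1.08 minutes of downtime, about 75% of that budget remains.
    print(f"{budget_remaining(0.9999, 30, 1.08):.0%} of budget remaining")
```

At this target the budget is small enough that a single slow manual failover can consume it, which is why the responsibilities emphasize self-healing systems and automated failover.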

Requirements For Senior Site Reliability Engineer

Kubernetes

Redis

PostgreSQL

5+ years of experience in Site Reliability Engineering or Cloud Infrastructure
Proven success scaling high-traffic, mission-critical platforms in SaaS, IoT, or healthcare
Deep expertise in cloud platforms (AWS), Kubernetes, and distributed systems
Strong background in monitoring, logging, and observability with Prometheus and OpenTelemetry
Deep knowledge of CI/CD automation, GitOps, and infrastructure as code (Terraform)
Strong understanding of network security, access management, and compliance frameworks (HIPAA, SOC 2)
Experience with healthcare IT, including EHR data, FHIR, and HL7 interoperability (bonus)
Expertise in real-time distributed systems, event-driven architectures, or large-scale data pipelines (bonus)

