Cloud Site Reliability Engineer (SRE)

Promise

Promise empowers utilities and government agencies to create flexible, affordable solutions for individuals struggling with debt.

Washington, DC, USA

$149,000 - $195,000

Site Reliability

Senior Software Engineer

Hybrid

4+ years of experience

Enterprise SaaS · Finance

Description For Cloud Site Reliability Engineer (SRE)

Promise is an innovative company backed by $50M in funding that helps utilities and government agencies create flexible payment solutions for individuals struggling with debt. They're seeking a Cloud Site Reliability Engineer (SRE) to join their team and take charge of their infrastructure operations.

As an SRE at Promise, you'll be responsible for building, operating, and optimizing the cloud infrastructure that powers their products. This role combines traditional software engineering with systems administration, requiring expertise in both coding and infrastructure management. You'll work with cutting-edge technologies including Kubernetes, Terraform, and various cloud platforms to ensure high reliability, performance, and scalability of their systems.

The ideal candidate should have at least 4 years of experience in Linux system administration and be well-versed in cloud technologies, Infrastructure-as-Code, and containerization. You'll be working in a fast-paced environment where you'll need to balance technical excellence with business objectives, particularly in areas of security and compliance.

This is an excellent opportunity for an experienced SRE who wants to make a social impact while working with modern technologies. You'll be joining a team that includes alumni from prestigious companies like Palantir, Google, and Stripe, working on meaningful problems that help people manage their financial obligations more effectively. The role offers competitive compensation ($149K-$195K) and is based in Washington, D.C. with a hybrid work arrangement.

The position requires strong technical skills combined with excellent communication abilities, as you'll be bridging the gap between technical and non-technical stakeholders. You'll need to be self-sufficient, detail-oriented, and execution-driven, with a proven track record of maintaining and optimizing large-scale production environments. If you're passionate about using technology to solve social problems while working with a talented team on challenging technical problems, this role at Promise could be your next career move.

Last updated 6 days ago

Responsibilities For Cloud Site Reliability Engineer (SRE)

Design, implement, and manage cloud infrastructure to ensure reliability, scalability, and security
Automate infrastructure and operations using Terraform, scripting, and configuration management tools
Develop strong relationships with engineering teams to define system reliability goals and best practices
Troubleshoot and resolve complex network and system issues
Monitor and optimize system performance
Formalize the security design review process and liaise with the Engineering team through it
Ensure the security and stability of Linux-based production systems
Provide support in aligning technology projects with compliance requirements
Serve as a bridge between technical teams and non-technical stakeholders

Requirements For Cloud Site Reliability Engineer (SRE)

Linux

Python

Kubernetes

4+ years of experience in Linux system administration
Strong debugging skills
Hands-on experience with cloud platforms (AWS, Azure, or GCP)
Expertise in Infrastructure-as-Code (IaC) using Terraform or similar tools
Proficiency in monitoring tools (e.g., Prometheus, Datadog)
Experience with containerization and orchestration (Docker, Podman, Kubernetes)
Scripting experience (Python, Bash, or equivalent)
Knowledge of networking and security best practices for cloud environments
Must be a US person (US citizen or permanent resident)
Must reside in the US