Senior Site Reliability Engineer

Multicloud solutions experts providing end-to-end solutions that combine expertise with the world's leading technologies across applications, data, and security.
Giza, El Omraniya, Giza Governorate, Egypt
Site Reliability
Senior Software Engineer
Remote
5,000+ Employees
3+ years of experience
Enterprise SaaS · Cloud

Description For Senior Site Reliability Engineer

Rackspace Technology is seeking a Senior Site Reliability Engineer to join their Professional Services Center of Excellence focusing on Application Performance Monitoring Suites. This role combines modern SRE practices with observability using tools like Datadog, New Relic, AppDynamics, and Dynatrace to create exceptional customer experiences.

As an SRE at Rackspace, you'll be at the intersection of application performance, user experience, and business outcomes. You'll work with cutting-edge observability tools to help customers understand and optimize their applications. The role involves implementing sophisticated monitoring solutions, building scalable systems, and maintaining robust automation to support engineering goals.

The ideal candidate brings 3+ years of extensive experience in cloud infrastructure (AWS EKS, Azure AKS), Kubernetes, and observability tools. You'll need strong expertise in Kafka for large-scale environments, security operations, and disaster recovery strategies. The position requires proficiency in Python, Go, and bash scripting, along with deep knowledge of monitoring tools like Prometheus, Grafana, and Datadog.

Rackspace offers a collaborative environment where you'll work with development teams to implement new features while ensuring reliability and performance standards. The company has been consistently recognized as a best place to work by Fortune, Forbes, and Glassdoor, offering an inclusive culture that values diverse perspectives and innovative thinking.

This remote position offers the opportunity to shape the future of observability engineering while working with a leading multicloud solutions provider. You'll be part of a team that embraces technology and empowers customers to accelerate their digital transformation journey.


Responsibilities For Senior Site Reliability Engineer

  • Work with customers to implement observability solutions
  • Build and maintain scalable systems and robust automation
  • Develop and maintain monitoring tools, alerts, and dashboards
  • Analyze metric and log data for anomaly detection and performance tuning
  • Collaborate with development teams to implement and deploy new features
  • Document and share solutions
  • Maintain an understanding of each customer's business and technical environment
  • Identify performance bottlenecks and resolve the root causes of service issues

Requirements For Senior Site Reliability Engineer

Kubernetes
Python
Go
  • 3+ years of experience designing and maintaining AWS EKS and Azure AKS infrastructure with Terraform
  • 3 years of experience with Kafka in large-scale environments
  • 3+ years of experience designing and maintaining SaaS environments
  • 3+ years as an SRE with experience in Prometheus, Grafana, Datadog, and ELK
  • 3 years of experience building and running Kubernetes clusters
  • 3 years of experience with observability (monitoring: logging, tracing, metrics)
  • 3 years of experience with GitOps CI/CD processes
  • 3 years of experience scripting with Python, Go (Golang), Bash, and AWS CLI tools
  • 3 years of experience with security operations
  • 3 years of experience implementing disaster recovery strategies
