Senior Site Reliability Engineer II (Kafka)

Leading customer engagement platform that empowers brands to be absolutely engaging, helping companies build and maintain customer relationships that foster growth and loyalty.
Ontario, Canada
DevOps
Senior Software Engineer
Remote
1,000 - 5,000 Employees
5+ years of experience
Enterprise SaaS

Description For Senior Site Reliability Engineer II (Kafka)

Braze, a leading customer engagement platform, is seeking a Senior Site Reliability Engineer II with a focus on Kafka to join its team. This role combines software engineering and systems administration to ensure site reliability and infrastructure scalability. The platform operates at large scale, serving over 3.3 billion monthly active users and processing hundreds of billions of data points each month.

The role demands expertise in Kafka performance tuning, monitoring, and automation, with responsibilities spanning architecture design, debugging, and incident management. You'll work with a technology stack including Ruby on Rails, MongoDB, Redis, Kafka, and Kubernetes, creating infrastructure as code and developing deployment pipelines.

As an SRE at Braze, you'll be instrumental in maintaining high availability and meeting enterprise-grade SLAs. The position requires strong collaboration skills, as you'll work with engineering teams to architect scalable solutions and improve infrastructure reliability. You'll also participate in on-call rotations and contribute to incident prevention and resolution.

The ideal candidate brings 5+ years of SRE/DevOps experience, with specific expertise in Kafka streaming applications and performance tuning. You should be passionate about automation, have strong programming skills (particularly in Ruby/Go), and possess deep knowledge of Linux systems.

Braze offers an exceptional work environment with comprehensive benefits, including equity participation, flexible PTO, and extensive professional development opportunities. The company is recognized as a Great Place to Work® across multiple regions and consistently ranks among the best technology workplaces. This role offers the opportunity to make a significant impact while working with a passionate, collaborative team in a remote setting.

Responsibilities For Senior Site Reliability Engineer II (Kafka)

  • Partner with engineering teams on architecting scalable and reliable products
  • Debug reliability and scalability issues across all stack layers
  • Implement monitoring and alerting systems
  • Ensure strict enterprise-grade SLAs are met
  • Create infrastructure as code using Chef, Terraform, and Kubernetes
  • Develop deployment pipelines using Docker and Kubernetes
  • Manage incidents and be on PagerDuty rotation
  • Create system improvements and automation

Requirements For Senior Site Reliability Engineer II (Kafka)

Kafka, Kubernetes, MongoDB, Redis, Ruby
  • 5+ years of experience as a Software, DevOps, or Site Reliability Engineer
  • 3+ years of Data Streaming Reliability Engineering
  • 3+ years of Kafka performance tuning & automation
  • Strong background in scaling Kafka clusters
  • Experience with Docker, Kubernetes, Terraform
  • Experience with MongoDB, Redis, Kafka, Postgres
  • Strong programming skills - Ruby and/or Go preferred
  • Linux and Unix Shell expertise
  • Excellent ability to manage multiple tasks

Benefits For Senior Site Reliability Engineer II (Kafka)

401k
Dental Insurance
Education Budget
Equity
Medical Insurance
Mental Health Assistance
Parental Leave
Vision Insurance
  • Competitive compensation with equity
  • Retirement and Employee Stock Purchase Plans
  • Flexible paid time off
  • Comprehensive medical, dental, vision, life, and disability benefits
  • Family services including fertility benefits and parental leave
  • Professional development and yearly learning stipend
  • In-office employee experience
  • Volunteer opportunities and donation matching
  • Employee Resource Groups
