
Site Reliability Engineer

Tecsys is a global supply chain technology company that helps organizations achieve operational excellence through smarter supply chains.
Site Reliability
Senior Software Engineer
Hybrid
501 - 1,000 Employees
5+ years of experience
Enterprise SaaS · Logistics

Job Description

Tecsys, a global supply chain technology company, is expanding its presence with a new office in Bangalore, India, and is seeking a Site Reliability Engineer to join its Network and Security Operations Center (NSOC) team. The role involves working with a high degree of autonomy while collaborating with teams across different time zones, particularly in North America.

The position focuses on improving platform reliability and uptime through data-driven approaches, implementing automation, and maintaining critical infrastructure. The ideal candidate will have strong experience in systems engineering, cloud platforms (AWS/Azure), and automation tools. The role offers the opportunity to work on large-scale systems while contributing to the company's 24/7 "follow the sun" global support model. It requires flexible working hours to accommodate international collaboration and includes on-call responsibilities.

This is an opportunity for an experienced SRE to join a growing global team that is transforming supply chain technology, using modern tools and practices including CI/CD, monitoring systems such as Datadog, and cloud platforms. The role combines technical expertise with cross-functional collaboration, which suits someone who enjoys both technical challenges and team interaction.

Last updated a month ago

Responsibilities For Site Reliability Engineer

  • Collaborate with Engineering teams to support services through system design consulting and by developing platforms and frameworks
  • Maintain services by measuring and monitoring availability, latency and system health
  • Develop tools & automation on top of Azure & AWS
  • Scale systems through automation and improve reliability
  • Practice sustainable incident response and blameless postmortems
  • Implement CI/CD automation
  • Implement monitoring, logging, alerting, and SLA reporting
  • Create and maintain technical documentation
  • Take command of high-severity incidents
  • Collaborate with the Platform Engineering team

Requirements For Site Reliability Engineer

Java
Kubernetes
Linux
  • Bachelor's degree in computer science or related technical discipline
  • 5+ years of systems engineering experience
  • Experience designing and deploying large scale systems
  • Strong knowledge of system design and high-performance computing
  • Experience with full stack automation
  • Knowledge of Datadog or similar tools
  • Knowledge of and experience with AWS or Azure (required)
  • Basic knowledge of Java- or .NET-based development
  • Knowledge of GitLab or Jenkins
  • Proficient English communication skills
  • Experience at a SaaS company preferred
  • Experience with FedRAMP compliance is an asset