Site Reliability Engineer II

Last updated: June 5, 2026, 6:35 UTC

*Our roles are remote-first and can be based anywhere in India.

Responsibilities

Monitor and continually improve the capacity of our production environment
Design and implement scalable, reliable, and efficient infrastructure using Kubernetes, Terraform, and AWS resources.
Partner with development teams to improve services through rigorous testing and release procedures with CI pipelines (GitHub Actions, Dockerfiles)
Gain a deeper understanding of RudderStack infrastructure and help debug incidents
Proactively build software to help operations and support teams
Identify opportunities for process improvements, automation, and cost savings

Requirements

A Bachelor's or Master's degree in Computer Science, or equivalent experience, is required
5+ years of experience as a Site Reliability Engineer, Internal Platform Developer, or in a similar role
Strong understanding of cloud computing, containers, and DevOps practices
Demonstrated Linux experience
Excellent debugging skills
Experience with scripting and infrastructure automation
Familiarity with distributed systems design patterns using tools such as Kubernetes
Familiarity with AWS, Azure, or Google Cloud
Excellent verbal and written communication skills
Familiarity with networking concepts such as VPCs, proxies, and CDNs

Here are examples of things we’ve worked on:

Building and maintaining a Kubernetes platform to deploy all our applications with high availability
Building a Kubernetes operator to automate hundreds of deployments
Managing hundreds of PostgreSQL instances with HA for our deployments
Provisioning and managing air-gapped on-premises deployments in diverse environments
Managing a multi-region, multi-cluster environment with hundreds of customer deployments in single-tenant and multi-tenant models
Keeping our infrastructure fully defined as code, enforced through a GitOps model
Automating migrations of complex, highly available services
Working on compliance (SOC 2 Type 2, HIPAA), security, scalability, and many other aspects of delivering top-class, secure software
Following FinOps practices and continuously optimizing our cloud costs

How we achieve results:

Empathy for the problems encountered by our customers.
Collaboration with engineering teams to achieve results.
Deep care for the quality of your own and the team's code.
Curiosity and understanding when investigating causes and finding effective solutions.
An output-driven focus on providing value to our customers in a significant, measurable, and positive way.
A focus on writing testable, performant, bug-free code that provides the right solutions to problems.

$62,500 — $117,500/year

To apply for this job, please visit the application page