Site Reliability Engineer II

last updated June 5, 2026 6:35 UTC

RudderStack

*Our roles are remote-first and can be based anywhere in India.

Responsibilities

  • Monitor and continually improve the capacity of our production environment

  • Design and implement scalable, reliable, and efficient infrastructure using Kubernetes, Terraform, and AWS resources.

  • Partner with development teams to improve services through rigorous testing and release procedures with CI pipelines (GitHub Actions, Dockerfiles)

  • Gain a deeper understanding of RudderStack infrastructure and help debug incidents

  • Proactively build software to help operations and support teams

  • Identify opportunities for process improvements, automation, and cost savings

Requirements

  • A Bachelor's or Master's degree in Computer Science, or equivalent experience, is required

  • 5+ years of experience as a Site Reliability Engineer, Internal Platform Developer, or in a similar role

  • Strong understanding of cloud computing, containers, and DevOps practices

  • Demonstrated Linux experience

  • Excellent debugging skills

  • Experience with scripting and infrastructure automation

  • Familiarity with distributed systems design patterns using tools such as Kubernetes

  • Familiarity with AWS, Azure or Google Cloud Compute

  • Excellent verbal and written communication skills

  • Familiarity with networking concepts such as VPCs, proxies, and CDNs

Here are examples of things we’ve worked on:

  • Build and maintain a Kubernetes platform to deploy all our applications with high availability

  • Build a Kubernetes operator to automate hundreds of deployments

  • Manage hundreds of PostgreSQL instances with HA for our deployments

  • Provision and manage air-gapped on-premise deployments in diverse environments.

  • Manage a multi-region, multi-cluster environment with hundreds of customer deployments in single-tenant and multi-tenant models.

  • Complete Infrastructure as Code, enforced using a GitOps model

  • Automated migrations of complex, highly available services

  • Work on compliance (e.g., SOC 2 Type 2, HIPAA), security, scalability, and more to deliver top-class, secure software

  • We follow FinOps and continuously optimize our cloud costs.

How we achieve results:

  • Empathy for the problems encountered by our customers.

  • Collaboration with engineering teams to achieve results.

  • Care deeply about the quality of your own code and the team's

  • Curiosity and understanding for investigating causes and finding effective solutions.

  • An output-driven focus on providing value to our customers in a significant, measurable, and positive way.

  • Focus on writing testable, performant, bug-free code to provide the right solutions to the problems.

$62,500 — $117,500/year

Apply info ->

To apply for this job, please visit the application page
