Site Reliability Engineer

last updated June 30, 2021 16:15 UTC

HQ: Remote

Full-Time
Full-Stack Programming

We are looking for Site Reliability Engineers to join our new SRE team. As part of the team, you will be responsible for the availability, latency, performance, efficiency, change management, monitoring, emergency response, and capacity planning of customer applications and infrastructure.

What your day-to-day will look like:

Participate in your team’s effort to continuously improve our customers’ production environments
Own your team’s tech and tools stack and contribute to the relevant Open Source projects
Design, analyse and troubleshoot large-scale distributed systems
Be part of your team’s on-call rotation
Learn and share by being part of the Cloud Native community through blog posts and conference talks
Automate almost all the things

Skills and requirements:

Strong engineering OR operations background and the urge to master both disciplines
An analytical mind and strong debugging and problem-solving skills
Strong written and spoken technical communication
Flexibility to learn about and work with different technical environments and teams

Bonus Points (we value curiosity and ability to learn over previous experience):

Strong understanding of the Kubernetes API, core principles and components
Strong knowledge of Linux networking and security related to containers
Production experience with at least one common CI/CD system
Production experience with at least one major cloud provider
Production experience with at least one modern infrastructure automation or configuration management system
Ability to contribute to polyglot code bases

We are building a remote-first team across multiple time zones to allow a follow-the-sun on-call rotation. We are not hiring job descriptions. We hire humans. 🙂 We welcome applications from everybody, regardless of ethnic or national origin, religion, gender identity, sexual orientation or age.