You will be part of a team responsible for deploying, supporting, monitoring and troubleshooting large-scale, microservice-based distributed systems with high transaction volumes, hosted in AWS and on bare metal, and for documenting the IT infrastructure, policies and procedures.
Responsibilities
As a Senior SRE/DevOps engineer, you will support our team and augment our existing development infrastructure by implementing the automation needed to streamline our pipeline. Tooling includes Terraform, Kubernetes, ELK and AWS services.
Requirements
A minimum of 5 years' experience deploying, monitoring and troubleshooting large-scale distributed systems
Background in Linux administration
Scripting or programming knowledge, at minimum Unix shell scripting
Good understanding of networking (TCP/IP, DNS, routing, firewalls, etc.)
Good understanding of technologies such as Apache, Nginx, databases (relational and key-value), DNS servers, etc.
Understanding of cloud-based infrastructure, such as AWS
Experience with systems for automating deployment, scaling and management of containerised applications, such as Kubernetes
Experience with Terraform for provisioning infrastructure
Quick to learn and fast to adapt to changing environments
Excellent communication and documentation skills
Excellent troubleshooting and creative problem-solving abilities
Excellent communication and organisational skills in English
Ideally, candidates will also have
Experience deploying and supporting multiple staging/dev environments
Experience maintaining continuous integration and delivery pipelines with tools such as Jenkins and Spinnaker
Experience implementing, operating and supporting open-source tools for network and security monitoring and management on Linux/Unix platforms
Experience with Postgres
Experience with security in AWS