Senior Site Reliability Engineer – Data (REMOTE)

last updated April 24, 2026 12:36 UTC

Discogs Inc

HQ: Beaverton, Oregon

The Discogs Platform team is focused on several objectives: building and supporting performant, cost-effective, reliable infrastructure; developer experience tooling and mentorship; and creating "golden paths" for organization-wide standards and velocity. As a key member of the Platform team, the Senior Site Reliability Engineer – Data will be working closely with other Discogs engineering squads to develop and optimize scalable, well-planned relational database architectures, drive best practices and stability for our use of Kafka and change data capture, and contribute to the Platform team’s operations.

Location

This is a remote position, open to candidates located in OR, WA, CA, CO, TX, and IL.

Compensation

Starting Base Salary Range: $130,000 – $140,000 yearly

What You’ll Accomplish

Reasonable accommodations may be made to enable individuals with disabilities to perform the essential functions.

  • Stewarding Discogs’ data stores as a key subject matter expert
  • Leading efforts on the reliability and design patterns of our Kafka and Kafka Connect implementations
  • Establishing data contracts and clear communication standards between CDC producers and consumers
  • Working closely with engineering squads to refactor and re-architect MySQL database schema and indexing for long-term scalability, performance, and cost effectiveness
  • Mentoring engineering squads on Platform best practices for MySQL, Kafka, and other software development lifecycle areas
  • Writing documentation and runbooks that contribute to the engineering organization’s knowledge base
  • Working in a containerized, orchestrated environment
  • Contributing to the Platform team’s disciplines of site reliability and operations, supporting both our squads and Platform’s central infrastructure
  • Participating in on-call rotation, responding to incidents, and troubleshooting data and other operations issues

What You’ll Contribute

Minimum Education and Experience

  • A Bachelor’s Degree in Computer Science or similar area of focus, or equivalent relevant work experience.
  • 5+ years of experience working with Kafka and relational database management systems (RDBMS).
  • 6+ years of experience in Ops, DevOps, Site Reliability, Platform, or other systems roles.

Required Skills & Abilities:

  • Relational database schema design, query performance optimization, administration (MySQL, Percona Server, AWS RDS)
  • Kafka: Cluster administration (Strimzi), Kafka Connect (Debezium, JDBC)
  • CI/CD (GitHub Actions)
  • GitOps (ArgoCD)
  • Kubernetes (EKS, Kustomize, Karpenter, administration, application manifests)
  • AWS and cloud development (VPC, EKS, RDS, S3)
  • Observability (Datadog, Sentry)
  • Scripting (Shell, Python)
  • Track record of collaboration and mentorship
  • Excellent written communication and documentation skills
  • Continuous learning
  • Ownership and proactive approach to solving large problems

Preferred:

  • Infrastructure-as-code (Terraform)
  • Elasticsearch (ECK administration, scaling, performance)
  • Python (SQLAlchemy, FastAPI)
  • GraphQL (schema design, Apollo federation)
  • REST API
  • HashiCorp Vault
  • Redis
  • Memcached
  • NoSQL databases
  • Data lakes/warehouses
  • Data Governance
  • Data Security

The Platform team covers a wide range of technical topics and we’d love to hear about your skills beyond this list!


Apply info ->

To apply for this job, please visit the application page
