Principal Data Engineer

Last updated: March 4, 2021, 23:58 UTC

The API Services team is responsible for engineering and delivering cutting-edge services that aid in content delivery to end customers. These services support 110 news brands and more than 110 million unique monthly visitors.

The Principal Data Engineer will play a key role in architecting, developing, and maintaining the data architecture for Gannett’s new Content Platform, which supports the content production and delivery systems consumed by both our network of 3,000 journalists and our customer-facing products. You will be expected to design and consume large-scale, fault-tolerant, and highly available architectures. A large part of your role will be forward-looking, with an emphasis on optimizing content structures and relationships. If you have a passion for rapid development, automation, learning, and challenging and bettering your peers, along with a strong desire to operate in a full-stack environment, you’d probably fit in well here.

Responsibilities:

  • Collaborate with stakeholders and developers to identify data needs and the ideal implementation.

  • Contribute to the architecture and vision of Gannett’s content data pipeline.

  • Evolve complex data environments over time.

  • Continuously evaluate data usage patterns and identify areas of improvement.

  • Interface closely with data scientists and engineering teams to ensure the reliability and scalability of the data environment.

  • Drive future state technologies, designs and ideas across the organization.

  • Provide planning for two-week sprints.

  • Provide day-to-day operational support for our applications.

  • Establish and improve best practices around our application and infrastructure monitoring.

Automate everything:

  • Containerizing applications with Docker

  • Scripting new solutions/APIs/services to reduce toil

  • Research new tools to optimize cost, deployment speed and resource usage

  • Assist in improving our onboarding structure and documentation.

Responsibility Breakdown:

  • 30% – Data architecture design / review

  • 20% – Mentoring

  • 15% – Application Support

  • 15% – Planning / Documentation

  • 10% – Application design / recommendations / proofs of concept

  • 10% – New Technology Evaluation

Technologies:

Systems:

  • Linux

  • Couchbase

  • Elasticsearch

  • Solr

  • Neo4j

  • Other NoSQL Databases

Exciting things you get to do:

  • Engineering high-performance applications with an emphasis on concurrency

  • Agile

  • Amazon Web Services, Google Compute Engine

  • Google Cloud Datastore, Spanner, DynamoDB

  • Docker, Kubernetes

  • Database testing

  • GraphQL

  • Fastly

  • Terraform

  • Monitoring with New Relic

Minimum Qualifications:

  • Deep experience in ETL design, schema design and dimensional data modeling.

  • Ability to match business requirements to technical ETL design and data infrastructure needs.

  • Experience using search technologies like Elasticsearch and Solr and designing the integration of search with a persistent data store.

  • Deep understanding of data normalization methodologies.

  • Deep understanding of both relational and NoSQL databases.

  • Experience with data solutions like Hadoop, Teradata, Oracle.

  • Proven expertise with query languages such as SQL, T-SQL, NRQL, and Solr query syntax.

  • Self-starter who can operate in a remote-friendly environment.

  • Experience with Agile (Scrum), test-driven development, continuous integration, and version control (Git).

  • Experience deploying to cloud compute or container hosting platforms.

  • Experience working with data modeling tools.

  • Basic understanding of REST APIs, SDKs and CLI toolsets.

  • Understanding of web technologies.

  • Experience with data in the media industry is a plus.
