We’re seeking an elusive hybrid, the Data Engineer, to serve as our Engineering team’s lead data guru. This is a great opportunity for someone with software development experience who loves exploring data warehousing technologies (data warehouses, columnar data stores, et cetera, such as Redshift, Vertica, HBase, and Cassandra) to manage big data sets. We process 30+ million records a day, run distributed algorithms over terabytes of TV data on Redshift, and are changing how everyone from Fortune 100 companies to presidential candidates (Obama 2012!) targets their TV advertising.
Responsibilities
Develop and maintain data pipelines for ETL (extract, transform, load) processes
Build tools (in Ruby and Python) to help end users access and process their data more efficiently
Design efficient schemas for new and existing datasets
Participate in architectural decisions for new software components
Implement new product features
Implement performance improvements to existing products (especially where accessing data is a bottleneck)
Serve as a bridge between the Engineering and Analytics teams
Advise and support the Analytics team on how to conduct their work in a scalable, reliable, and efficient way