SemanticBits is looking for a backend Python developer to help us build modern data pipelines for digital health services. The project involves analyzing, transforming, and loading interesting and complex datasets in the health and life sciences domains. You will work with modern pipeline technologies like Luigi and MapReduce to efficiently process large numbers of datasets. This data will ultimately be loaded into ElasticSearch and exposed via Node.js APIs to researchers internationally.
Some of the responsibilities of this position include:
Participate with a diverse Scrum team on a daily basis to build a modern digital solution
Design and develop data structures in an ElasticSearch NoSQL database
Implement scripts in Python and integrate into Luigi for loading diverse datasets from numerous public sources
Create REST services in Node.js to provide access to the data
Write unit tests with Nose
Perform regular peer code reviews
Document lightweight design on User Stories
Leverage virtualization tools like Docker and Vagrant to manage environments and Chef to automatically provision software
Facilitate automated building, testing, and deployment through continuous integration and continuous deployment with Jenkins
Deploy to scalable, fault-tolerant AWS infrastructure

