Java ‘Big Data’ Engineer

last updated June 20, 2021 1:21 UTC

Spinn3r

HQ: Remote

Company

Spinn3r is a social media and analytics company looking for a talented Java “big data” engineer.

As a mature, ten-year-old company, Spinn3r provides high-quality news, blog, and social media data for analytics, search, and social media monitoring companies. We’ve recently completed a large business pivot, and we’re in the process of shipping new products, so it’s an exciting time to come on board!

Ideal Candidate

We’re looking for someone with a passion for technology, big data, and the analysis of vast amounts of content; someone with experience aggregating and delivering data derived from web content; and someone comfortable with a generalist and devops role. We require that you have knowledge of standard system administration tasks and a firm understanding of modern cluster architecture.

We’re a San Francisco company, and ideally there should be at least a 4-hour overlap with the Pacific Standard Time Zone (PST / UTC-8). If you don’t have a natural time overlap with UTC-8, you should be willing to work an alternative schedule so that you can communicate easily with the rest of the team.

Culturally, we operate as a “remote” company and require that you be generally available for communication, self-motivated, and consistently productive.

We are open to either a part-time or full-time independent contractor role.

Responsibilities

  • Understanding our crawler infrastructure;

  • Ensuring top-quality metadata for our customers. There’s a significant batch job component that analyzes the crawler output to verify data quality;

  • Making sure our infrastructure is fast, reliable, fault tolerant, etc. At times this may involve diving into the source of tools like ActiveMQ to understand how the internals work. We contribute to Open Source development to give back to the community; and

  • Building out new products and technology that will directly interface with customers. This includes cool features like full-text search, analytics, etc. It’s extremely rewarding to build something from the ground up and push it to customers directly.

Architecture

Our infrastructure consists of Java on Linux (Debian/Ubuntu) with the stack running on ActiveMQ, Zookeeper, and Jetty. We use Ansible to manage our boxes. We have a full-text search engine based on Elasticsearch which also backs our Firehose API.

Here are the cool products that you get to work with:

  • Large Linux / Ubuntu cluster running with the OS versioned using both Ansible and our own Debian packages for software distribution;

  • Large amounts of data indexed from the web and social media. We index 5-20 TB of data per month and want to expand to 100 TB of data per month; and

  • SOLR / Elasticsearch migration / install. We’re experimenting with bringing this up now so it would be valuable to get your feedback.

Technical Skills

We’re looking for someone who meets a number of the following requirements:

  • Experience in modern Java development and associated tools: Maven, IntelliJ IDEA, Guice (dependency injection);

  • A passion for testing, continuous integration, and continuous delivery;

  • ActiveMQ, which powers our queue server for scheduling crawl work;

  • A general understanding and passion for distributed systems;

  • Ansible or equivalent experience with configuration management;

  • Standard web API use and design. (HTTP, JSON, XML, HTML, etc.); and

  • Linux, Linux, Linux. We like Linux!

Cultural Fit

We’re a lean startup and very driven by our interaction with customers, as well as their happiness and satisfaction. Our philosophy is that you shouldn’t be afraid to throw away a week’s worth of work if our customers aren’t interested in moving in that direction.

We hold the position that our customers are our responsibility and we try to listen to them intently and consistently:

  • Proficiency in English is a requirement. Since you will have colleagues in various countries with different primary languages, we all need to use English as our common company language. You must also be able to work with email, draft proposals, etc. Internally we work like a large distributed Open Source project and use tools like email, Slack, Google Hangouts, and Skype;

  • Familiarity with working on a remote team and the ability (and desire) to work for a virtual company. You should have a home workstation, fast Internet access, etc.;

  • Must be able to manage your own time and your own projects. Self-motivated employees will fit in well with the rest of the team; and

  • It goes without saying, but being friendly and a team player is very important.

Compensation

  • Salary based on experience;

  • We’re a competitive, great company to work for; and

  • We offer the ability to work remotely, allowing for a balanced live-work situation.
