Data Architect / Data Engineer in Hive
Job Published on: September 22nd, 2020
Available to applicants based in: Anywhere
Our mission
Our company mission is to map the internet’s attention flows and create transparency about how society assigns credibility to information, people and institutions.
What we're working on
We are building an influence algorithm. In other words, we are trying to find ways to describe groups of people mathematically. Many have tried and failed before, but we think we can make it work. Our core hypothesis is that influence can be quantified by tracking attention flows. To do that, we ingest data streams from multiple sources (we started with Twitter, are now indexing podcasts, and will soon add more). We then cross-reference these datasets to continuously improve the accuracy of our algorithm. The accuracy of our work is verified by members of the groups that we aim to describe. We publish our results in real time and thousands of people already use our scores. It is hard to verify when we are right, but it is very easy to tell when we are wrong. This short feedback loop puts us in a unique position to work on problems that might be much harder or impossible to solve somewhere else.
Work setup
We are a small, VC-funded startup and a remote-first team. Most of the team is based in Europe (Berlin, London). You can make your own hours, but everybody is expected to be online during office hours in CET. We try to meet in person and work together for several days at least every 3 months. Other than that, the company ‘lives’ in Slack, Notion and other tools that enable effective communication.
About this role
Your main responsibility will be the design, implementation and continuous development of our data architecture. We leverage heterogeneous data streams (Twitter API, RSS feeds, ...) and therefore rely on non-relational databases as the central technology in data warehousing and processing. What we look for most in a candidate is deep familiarity with non-relational database systems (e.g. ArangoDB) in clustered architectures. As owner of our data architecture, you will develop a deep understanding of the problems we are trying to solve with data, as well as of our company's strategic direction, and make design and implementation decisions accordingly. This role is crucial for our company, and we will ensure that the successful applicant enjoys the full support of our experienced team of developers and algorithm architects. You will also work closely with our developers and algorithm architects on peripheral tasks upstream and downstream of our database systems, such as:
- Designing, launching and maintaining crawlers to tap into new data streams from various APIs
- Specifying data requirements and pre-processing routines as well as generating features from raw data
- Relating and matching entities from different data streams
- Measuring and ensuring data quality
- Developing solutions for automatic labeling of data based on machine learning
Requirements
- Proficiency in *nix and Python
- Extensive experience with relational and non-relational databases (ideally ArangoDB) in clustered architectures
- Experience with the AWS ecosystem
- Experience using third-party APIs for web scraping
- Good communication & writing skills
Great to have
- Experience working with APIs and RSS feeds
- Interest in and familiarity with the latest developments in deep learning and AI in general
Don't apply if
- You get defensive about your ideas
- You need somebody else to organize your work
- You'd rather not talk to people
Do apply if
- You use precise language and you insist that others do too
- You are happy to drop an idea if circumstances have changed and it's no longer the best solution
- You look for systemic flaws and you are proactive in preventing them
- If you work from Berlin, you can work from our Berlin office, located in Mitte.
- Salary: 50.000 € to 70.000 € / year
Interested? Apply now!