Software Engineer (Large-scale crawling)

Wrapidity Ltd.
28 Feb 2018
19 Mar 2018
Contract Type: Full Time
Job Description

Main duties and responsibilities: What we'll challenge you with

You will join a team of AI, machine learning, and big data graduates from the very top universities worldwide, who all share a deep passion for both engineering and science. We take a highly collaborative approach: team members are expected to help out wherever they feel they can contribute, and to enjoy doing so, while advancing the team's core mission. You will take ownership of a core part of that mission:

  • Own and continuously improve a modern, flexible web technology stack for crawling

  • Design and develop cutting-edge methods to scale up high-fidelity crawling based on full-fledged browsers by automatically translating such browser-based crawls into crawlers that operate on plain HTML

  • Develop effective strategies and services for scheduling and monitoring large-scale crawling systems (>100k sources, >1M requests per day)

  • Engage with the wider automated testing and crawling community and bring new trends and systems with the potential for real impact into the team

  • Continually define your role as we grow

What you'll get out of it

You will be working on some of the most challenging problems in data acquisition, knowledge base construction, and data integration, using cutting-edge techniques.

  • You will be challenged to expand your skill set constantly and, together with your team members, dive deep into challenging, often cutting-edge problems to find the best solution

  • A sense of ownership you won't easily find elsewhere

  • Great people on teams all over the world

  • Get in early as a senior member of a growing department

  • Be at the heart of the growing data science community in London

  • Enjoy a great work environment at Shack 15

  • Located in the heart of Shoreditch in London

Skills, qualifications and experience: Selection Criteria

Our ideal candidate has (essential criteria):

  • 3+ years of experience (or equivalent) in developing crawlers, automated testing solutions for web technologies, or similar applications, as evidenced by previous employment, involvement in open-source projects, or published peer-reviewed work

  • Experience in running such technologies in a cloud environment, as evidenced by a substantial contribution to at least one production-level system

  • Deep understanding of JavaScript, equivalent to 2+ years of experience in JavaScript front-end and back-end programming

  • Experience in, and passion for, finding solutions to problems that haven't been solved before

  • A passion for software quality and pride in their work

  • A belief in validation through software and rapid prototyping

  • Fluency in English (verbal and written) and a love of working, learning, and teaching others

Bonus points for (desirable criteria):

  • Experience with Selenium, WebDriver, as well as state-of-the-art headless browsers

  • Experience in large-scale web scraping technologies and/or automated wrapper generation

  • Strong background in Java, e.g., through contributions to an open-source project

  • Experience using continuous integration for crawlers or other web technology APIs to ensure continuous availability

  • Experience with big data systems

  • Experience in designing and developing REST APIs

Company Description

Do you believe that the web is a largely untapped resource to help make sense of our complex, fast-changing world? Have you been frustrated with the limited features of most search and shopping websites when you wanted to pick just the right purchase? Instead of tinkering with small-scale scripts, do you want to tackle these challenges at web scale with some of the most cutting-edge techniques from AI, big data processing, and natural language processing? Then come and join our team of scientists and engineers at Wrapidity. Together we will push the boundaries of data acquisition far beyond what people consider possible these days. Our mission within Meltwater is to automate the acquisition of data outside the corporate firewall to an unprecedented degree - whether in news, corporate information, or products. We are passionate about the value of this data in providing insights and leading indicators that will be critical to success for many businesses.
