Data Scientist
Job Title - Data Scientist
About the Guardian News & Media
Here at the Guardian, we believe the open exchange of information, ideas and opinions has the power to change the world for the better. Our independent journalism holds power to account across the globe and brings information that's suppressed into the public domain.
We are at a fascinating juncture in our history. Our audience has never been larger; our journalism never more successful; our brand never stronger.
The Data & Insight function works in collaboration across both the editorial and commercial organisations at the Guardian. Data & Insight is a diverse and welcoming area comprising Research & Insight, Analytics & Data Science, Data Solutions and Data Partners.
About Analytics
At the heart of the Data & Insight function, Analytics leads and innovates in analytics best practice and delivers data analytics in support of the business objectives of senior stakeholders across the Guardian. Analytics also plays an important role in using data to inform and guide strategic decisions and planning. The analytics and data science teams work both across and within Data & Insight communities. The communities are Journalism, Commercial, and Reader Revenues.
About the Data Science team
We work with the business to identify business problems and reframe them as data problems. Through a combination of maths and computer science, we mine the data and apply (or develop) suitable algorithms to resolve these problems and so deliver value. To make sure our solutions can be applied in the real world, we work closely with our engineering teams on productionisation.
The Guardian has some really great data which present the Data Science team with a fantastic opportunity to effect change across the organisation.
About the role
- Work proactively with the GNM business to identify business problems and questions, including:
  - how to acquire, retain and monetise readers
  - how to predict and classify reader behaviour
  - how to anticipate and forecast performance changes due to internal and external changes
- Work with the product management team to prioritise business problems and define solutions
- Identify appropriate data sources to answer the question. Combine, wrangle and clean them, conducting deep-dive exploratory analysis where required
- Identify the optimal analytical or machine learning technique to answer the question. This might include techniques such as regression, decision trees, factor analysis and clustering.
- Present the results of analysis in easily digestible formats, using appropriate visualisation, for both technical and non-technical audiences
- Help to foster a culture of data-driven decision-making throughout GNM
- Be identified as a product analytics/data science expert across the business
Knowledge/ Experience/ Skills - Required:
- Experience of working in a fast-paced digital organisation
- Strong experience in problem solving with data
- Expert knowledge in the various processes of mining structured and unstructured data (e.g. retrieval, combining, wrangling, cleansing).
- Expert knowledge in the application of analytical, statistical and machine learning techniques to real-world data problems.
- Comprehensive knowledge of standard machine learning algorithms: able to quickly and accurately recommend the best algorithm to solve any business problem, including the design of bespoke algorithms, and able to explain often complex ML algorithms in simple terms.
- An inquisitive mind and a strong desire to learn are essential
- High proficiency working with relational databases (SQL) along with strong Python skills
- Must be able to interact and communicate effectively with diverse groups of technical and non-technical people, both junior and senior
- Graduate in a numerate discipline (e.g. computer science, maths, statistics)
Knowledge/ Experience/ Skills - Highly desirable:
- An understanding of digital product development
- Experience of productionising machine learning in Spark
- Experience of designing and managing successful A/B and multivariate tests