Hedge Yourself From a Risky Data Science Job

This article covers why it's important to consider all the factors when being hired as a data scientist.

By Dror Atariah, Data scientist and thinker.

I recently came across a wonderful post by Talia Borodin titled "Think Your Company Needs a Data Scientist? You're Probably Wrong". If you haven't read it yet, make sure you do! It contains a wonderful collection of truths that anyone who has anything to do with data science should know by heart. Talia's post is aimed at entrepreneurs and other managers in companies. However, its messages and questions can, and should, be turned around, becoming a valuable tool in the hands of data science professionals themselves. A data scientist who is hired by a company that cannot answer all four questions properly is entering a world of pain.

How much data is available?

A racing car driver would not join a team that has no credit for gasoline. As data is the new oil, the data scientist should make sure that the prospective organization has data. The quality and value of the potential results of data science work are limited by the availability (and quality) of the data. Don't get confused: you will always want more and better data, and that's OK, but there has to be something to start with. When expectations exist but data does not, the road to failure is paved.

Are there established key performance indicators (KPIs) and regular business intelligence reporting?

Without good KPIs in place, the impact of the data scientist's work will not be measurable. Most likely, some KPIs are already defined and measured, but it is important to make sure that they are granular enough to effectively measure the impact of changes introduced by the data scientist. Note that while high-level KPIs are crucial for the business, they are not the best fit when it comes to evaluating data products. It is very unlikely that the first improvement resulting from a data science effort will increase bottom-line revenue. The first iteration is more likely to affect (and hopefully improve) a more "raw" KPI, something like click-through rate or average basket size. Thus it is important to have an established system that can monitor KPIs at various resolutions.
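To make "raw" KPIs at various resolutions a bit more concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the event log, its column layout, the numbers, and the helper name `raw_kpis` are hypothetical and do not come from any real system. It computes click-through rate and average basket size overall, per day, and per customer segment.

```python
from collections import defaultdict

# Hypothetical, made-up event log:
# (day, segment, impressions, clicks, orders, items_sold)
EVENTS = [
    ("2018-04-01", "mobile", 1000, 30, 10, 25),
    ("2018-04-01", "desktop", 500, 25, 8, 28),
    ("2018-04-02", "mobile", 1200, 48, 12, 30),
    ("2018-04-02", "desktop", 400, 16, 5, 15),
]


def raw_kpis(events, key):
    """Aggregate click-through rate and average basket size per group.

    `key` maps an event to the group it belongs to (a day, a segment, ...).
    Assumes every group has at least one impression and one order;
    a real system would need to handle empty groups.
    """
    totals = defaultdict(lambda: [0, 0, 0, 0])
    for event in events:
        bucket = totals[key(event)]
        for i, value in enumerate(event[2:]):
            bucket[i] += value
    return {
        group: {
            "ctr": clicks / impressions,
            "avg_basket_size": items / orders,
        }
        for group, (impressions, clicks, orders, items) in totals.items()
    }


# The same raw KPIs at three different resolutions.
overall = raw_kpis(EVENTS, key=lambda e: "all")
by_day = raw_kpis(EVENTS, key=lambda e: e[0])
by_segment = raw_kpis(EVENTS, key=lambda e: e[1])

for name, table in [("overall", overall), ("by day", by_day), ("by segment", by_segment)]:
    print(name)
    for group, kpis in sorted(table.items()):
        print(f"  {group}: CTR={kpis['ctr']:.3f}, basket={kpis['avg_basket_size']:.2f}")
```

If a company can produce something like the per-day and per-segment views on request, a change you ship to one segment has a chance of showing up in a number someone is already watching.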

What do they imagine you will do once hired?

This question is all about expectation management. If the company has no expectations of its future data scientist, which might sound nice, the new hire is likely to end up working on the wrong problems. On the other hand, expectations of the type "boost efficiency" and "improve customer satisfaction" are very hard to quantify, let alone achieve. Without an initial idea of which problems the data scientist is expected to tackle, mutual frustration is very likely to join the party rather quickly. Moreover, clear expectations are likely to be backed by KPIs, which also makes it easier to evaluate the quality of the work.

What support networks are provided?

Productization of data science projects is hardly ever a one-man show. At the very least, DevOps, developers and data engineers need to collaborate during the productization process. Without their support and involvement the data scientist will be blocked, and the amazing solutions will stay locked away without even a chance of moving the needle. Be careful: this is not the only kind of network you will want to have in your next position. You will also need someone who is ready to collaborate with you during the development process. Your fellow developers normally have a well-structured review process for their work. This process is there for a good reason; as you probably know, it is very hard to detect flaws, issues, and bugs in your own code. Rubber duck debugging is great, but not as great as a peer (who is not necessarily a data scientist). For example, a business stakeholder who is keen on solving the problem you were asked to solve and is ready to dive a little deeper into the code is a great start. Other aspects of the needed network may have to do with established technology and workflows that enable a smooth transition from a local POC to a full-fledged data science product.


The world is not black and white; shades of gray rule. Companies hire data scientists at different stages of their evolution, and in turn their needs, expectations, readiness and so on may vary. However, it is crucial to align on expectations. It may well be that some of the answers to the questions raised above are only partial. That is not the end of the world, but you and the company should be aligned on what it means. Hiring data scientists is not a magical solution and must not be treated as one. A data scientist can and should add value to the company's business, but the journey is very unlikely to be successful if it is treated as a one-sided process. The success of the data scientist depends heavily on the commitment of the company, and without it mutual frustration is waiting just around the corner.

Original. Reposted with permission.