
The most desired skill in data science

What is the biggest skill gap in data science, according to hiring managers looking to hire recent graduates? Hint: it’s not coding.

By Kaiser Fung, Founder, Principal Analytics Prep

This skills gap is typically described as the lack of “critical thinking.”

It’s hard to imagine that someone with a STEM degree lacks critical thinking, so let’s unpack what this means.

Critical thinking in data science can be broken down into two aspects. First is the ability to develop the question. In practice, this involves extensive interviewing with users of the data science or analytics results to truly grasp the problems that need to be solved. Many practitioners, including several speakers at a conference I attended recently, note that users are frequently unable to express the problem properly.


That’s not quite how I’d describe it. The process of developing the question requires collaboration between the data scientists, who know much more about the data and the analytical tools, and the business owners, who know much more about the business goals and metrics. The collaboration leads to sharing of knowledge and symbiotic problem solving.

The second aspect of critical thinking is the ability to question the data. Experienced analysts never dump raw data into software just to see what comes out. Experience tells us what adjustments might be necessary to remove potentially distracting or misleading features in the data.

STEM training is particularly lacking in these two aspects of critical thinking. A typical problem in a math, science or engineering class includes (a) a well-posed question, and (b) nicely-shaped data, and the student’s challenge is to figure out which formula or method can use the provided data to answer the specified question. There is no need to develop the question further; in fact, any student trying to change the question will be penalized! There is also no need to question the data. If the data should be questioned, then the problem will have no single correct answer, which doesn’t fit well with traditional academic STEM training. (By contrast, social science graduates are better trained to handle complexity and incomplete data.)

In a recent blog post, I showed how a data analyst can use critical thinking to question the data and avoid drawing embarrassing, erroneous conclusions. Analysts at the National Highway Traffic Safety Administration (NHTSA) failed to notice gaping holes in the data submitted by Tesla when they endorsed Tesla’s claim that the Autopilot feature would reduce crash rates by 40 percent. An independent consultant succeeded in getting the data released, and noticed a large number of blank entries. When the missing values were imputed using a standard method (mean imputation), the reported benefit of Autopilot vanished entirely.
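To make “questioning the data” concrete, here is a minimal Python sketch of the two steps mentioned above: counting blank entries, then filling them with the mean of the observed values. The function names and the mileage figures are hypothetical and purely illustrative; this is not the consultant’s actual analysis, nor the NHTSA or Tesla data.

```python
from statistics import mean


def count_missing(values):
    """Count blank entries, represented here as None."""
    return sum(v is None for v in values)


def mean_impute(values):
    """Replace each blank entry with the mean of the observed entries."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute: no observed values")
    fill = mean(observed)
    return [fill if v is None else v for v in values]


# Hypothetical mileage records with two blank entries (made-up numbers).
miles = [1200.0, None, 800.0, None, 1000.0]

print("blank entries:", count_missing(miles))
print("after mean imputation:", mean_impute(miles))
```

The first check is the important one: an analyst who counts the blanks before computing any rate knows how much of the result rests on assumptions about the missing records. Mean imputation is only one simple way to fill them, and a careful analysis would compare several.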

[Image: Junkcharts Tesla Imputed]

Bridging this skill gap was a key goal of mine when I started Principal Analytics Prep. We accomplish this by fostering in-class, hands-on learning with practitioners who have years of practical work experience, and by finding students with diverse backgrounds who bring both science and social science approaches to problem solving.

In Part 2, I provide materials to help you prepare for case interviews that hiring managers use to test critical thinking.

This post was originally published at Kaiser Fung’s blog (https://www.principalanalyticsprep.com/news/critical-thinking-the-most-desired-skill-in-data-science) and is slightly modified.

Bio: Kaiser Fung is the founder of Principal Analytics Prep, a leading data analytics bootcamp; best-selling author of Numbers Rule Your World; and the author of Junk Charts (https://www.junkcharts.com), the popular data visualization blog.

Twitter: @junkcharts

YouTube: bitly.com/fungwithdata-1


