The missing D in Data Science

Data science is often talked about in terms of tools, insights and emerging use cases, but one of its important pillars, domain expertise, is left out. Find out why you should be concerned about it.

By Debleena Roy (Bridgei2i).

Imagine being inside a child’s brain as she makes her major life decisions. Imagine the feelings of joy, sadness, fear, disgust and anger jostling with each other as she goes through one of the biggest changes in her life: leaving her home. Imagine taking the Train of Thought over Imagination Island and creating long-term core memories that define her growing personality traits (I mean, islands).

Well, that imagination just came alive in Disney’s latest movie, “Inside Out”. But of course, this is not meant to be a movie review or a recommendation. Though you must see the movie if you haven’t already. (That’s a subtle hint.)

Now imagine (actually, why imagine? You may well be doing it today) that you are just about to make a major decision for your company based on data science.

You believe in data with all your heart and in your dreams you quote Sherlock Holmes:

“Data! Data! Data!” he cried impatiently. “I can’t make bricks without clay.” ― Arthur Conan Doyle, The Adventure of the Copper Beeches

So what’s your character in this imagination game, you ask? Of course, you are the celebrated hero of today’s world. Not the superhero in the dark cloak, whose superpowers no longer awe even nursery kids, but the new superhero with a growing armor of superpowers like Cloud, Big Data, Python, IoT (the list continues): the much-talked-about and much-in-demand Data Scientist.

Now just imagine: what if you could go inside the mind of the data scientist, the same way Disney went inside the mind of the child in “Inside Out”?

What levers would you find? Which emotions would jostle with each other as the data scientist shows her superpowers and makes the decision that will spell impact and change for her company?

Is it all about data? Or are there other levers that you need to push to make sure you reach the right decision? Are anger, fear, sadness, disgust and joy jostling with each other inside her brain as she makes her data-driven decision?

It is here that I would like to make my main point.

Data Science is as much about change management as it is about solving business problems.

For data-science-driven decisions to succeed, then, knowing the data is not enough. The missing “D” in Data Science is domain, a word we do not often associate with the data science armor or with job descriptions.


Drew Conway spoke about it as “substantive expertise” in his Data Science Venn Diagram (http://drewconway.com/zia/2013/3/26/the-data-science-venn-diagram).

Dr. Vincent Granville spoke about it on Data Science Central, and so did others such as Jeff Heaton and Nathan Brixius. But it still seems to be a seldom-voiced and, at best, unresolved debate.

My point in this debate is that the best data scientists, the ones who can really drive business decisions and navigate their organizations through a culture of change, are experts not just in data but also in the domain. Let’s look at two such far-reaching, data-driven change decisions from history:

Florence Nightingale’s famous Coxcomb Chart showing deaths from diseases started with her understanding of the prevailing healthcare domain.

Alan Turing’s war-time breaking of the Enigma code owed its success as much to his reasoning powers as it did to his programming expertise.

The people mentioned may not be typical data scientists; they better fit the pattern of what Gartner calls the growing crowd of “Citizen Data Scientists”.

So what’s going on inside the data scientist’s mind now? Confusion? Joy? Fear? Just like each superhero, each data scientist could have her own route to discovering her superpowers, whether or not she has the official label and designation. The key point is that data scientists who are right now dreaming of the next big “Beer and Diaper” Eureka moment might do well to pause a while on the “Train of Thought” and look for the station called “Domain Central” before they continue their data journey.

Remember this dialogue from the same Mr. Holmes?

“‘Is there any point to which you would wish to draw my attention?’

‘To the curious incident of the dog in the night-time.’

‘The dog did nothing in the night-time.’

‘That was the curious incident,’ remarked Sherlock Holmes.”

Exchange between Inspector Gregory and Sherlock Holmes, Silver Blaze

That exchange brings home the point about domain in a very curious way. Sherlock Holmes knew his domain; hence he knew what data mattered and what didn’t. Knowing the domain helps us create the right hypotheses, which data science can then help us test and prove or disprove. Without domain understanding, data science could become a long fishing expedition in the ever-expanding data lake, and the Iceberg of Business could start melting long before the needed changes are actually implemented.

“The Game is On”. Let’s play this data science game, “Inside Out”. I would love to hear your thoughts on this debate.

Debleena Roy is an Analytics Leader at BRIDGEi2i Analytics Solutions in the Bengaluru area, India, focusing on Financial Services.