
Battle of the Data Science Venn Diagrams

First came Drew Conway's data science Venn diagram. Then came all the rest. Read this comparative overview of data science Venn diagrams for both the insight into the profession and the humor that comes along for free.

7. In February 2014, Michael Malak added a fourth bubble, claiming Conway didn't mean domain knowledge when he said Substantive Expertise.

According to Malak, he's Inigo Montoya and we're all Vizzini when it comes to Substantive Expertise: "You keep using that word. I do not think it means what you think it means." Malak split it into Domain Expertise, and...er, knowledge of a domain, like Social Sciences. Maybe I'm dense, but I don't get the distinction. I'm also not sure what he's getting at with Holistic Traditional Research that, unlike Traditional Research, according to its placement doesn't include knowledge of the science you're researching? Am I reading that wrong? Holistic science is a thing, but it's not that thing. Anyways, Data Science is once again back in the unicorn position, and there are three danger zones (one of them double danger). Everyone be hatin' on the hackers.

8. My next example comes via Vincent Granville in April 2014, but he's reposting something by Gartner; I don't know the date of the original.

This is a Venn Diagram of Data Science Solutions, not data science itself; as such, Data Science is one of the circles, with other expertises (often not residing in the same person, but hopefully on the same team) being IT Skills and Business Skills. It kinda bothers me that the text labels are pointing to very specific positions in each slice, but the actual positions are arbitrary. That's business infographics for you.

9. Shelly Palmer guest-blogged for the Huffington Post in 2015, including this figure from a book he wrote:

Pretty standard computer-math-domain triad straight from Conway, but there's one revolutionary element: no danger zone. Now computer-and-domain geeks without stats can do Data Processing without everything going all to hell. Seems reasonable. EDIT: Sorry Shelly, Geringer beat you to it, you're just not very noteworthy anymore.

10. In November 2015, StackExchange Data Science user Stephan Kolassa came up with my personal favorite, adding Communications to Conway and changing his Substantive Expertise to Business:

For all his effort, he was rewarded with only 21 upvotes (I'm one of them) in this beta-release forum. His categories are pretty good, too. I think I fall under The Good Consultant. Or possibly The Mediocre Consultant. The Consultant Who Tries Really Hard? And yes, that's what a four-set Venn diagram looks like, not four circles like Malak's above, which do not contain all the combinations of intersections.
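If you want to check the four-circles complaint for yourself, here is a minimal sketch of the counting argument. It assumes the standard result that n circles can cut the plane into at most n² − n + 2 regions (counting the outside), while a true Venn diagram of n sets needs 2ⁿ regions, one per combination of in/out. The function names are mine, made up for illustration.

```python
def regions_needed(n: int) -> int:
    """Regions a true Venn diagram of n sets must show, counting the outside."""
    # Every set is either "in" or "out" for a given region: 2 choices, n times.
    return 2 ** n


def max_regions_from_circles(n: int) -> int:
    """Most regions n circles can cut the plane into, counting the outside."""
    if n < 1:
        return 1  # no circles: just the plane itself
    # Two circles cross in at most 2 points, so the k-th circle adds
    # at most 2 * (k - 1) new regions; summing gives n^2 - n + 2.
    return n * n - n + 2


for n in (3, 4):
    print(n, regions_needed(n), max_regions_from_circles(n))
```

For three sets the two numbers match (8 and 8), which is why Conway's three circles work. For four sets you need 16 regions but circles top out at 14, so some combinations have to go missing unless you switch to ellipses or other shapes, as Kolassa did.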

11. In 2016, Matthew Mayo blogged a diagram by Gregory Piatetsky-Shapiro:

Okay, this owes a debt to Tierney from four years prior, and although it purports to be a Venn diagram of data science, (a) it's not a Venn diagram, and (b) Data Science is inside one of the circles. It's good to see Big Data acknowledged, though. But...Calibri? Really? You went with the default font?

12. Finally (and I'm sure I don't have them all; if you know of any Venn diagrams I missed, please let me know!), later in 2016 Gartner redid their busy Data Solutions diagram, and made it prettier and confined to data science, as blogged by Christi Eubanks:

We've come full circle, back to Conway, except again Danger Zone is replaced, this time by Data Engineer. I like the callouts pointing to the edges better than their previous mess, as well.

13. Data Science Venn diagrams of the future:

Wikipedia's page on data science has the following totally-not-a-Venn-diagram:

Really, in my opinion, this is the way to look at data science. Maybe not these exact skills, but it really is a synergy of different disciplines. Unfortunately, skill in one discipline can sometimes mask serious deficiencies in another and give data science a bad name. (I may or may not have contributed somewhat to this phenomenon in my misspent youth, like, last year.)

Of course, then you'd need a really complicated Venn diagram. They do exist: here's one for seven sets:

Anyone want to give it a try?

Original. Reposted with permission.