Gold Blog, Jul 2017
What Advice Would You Give Your Younger Data Scientist Self?

I was asked this question recently via LinkedIn message: "What advice would you give your younger data scientist self?" The best piece of advice I honestly think I can give is this: Forget about "data science."

I get asked for data science career advice on LinkedIn regularly. As much as I would like to use my limited knowledge to answer people's specific questions, it genuinely becomes too time-consuming. I have created posts in the past to try to shed light on topics of interest to wider numbers of people, such as this one, but individual, personalized requests are, unfortunately, not something I can usually partake in.

I hesitated to even write the above paragraph, since I think it may paint me as a bit of a jackass, quite honestly. I don't possess any special insight beyond what my limited experience has allotted me; however, my association with KDnuggets and my activity on LinkedIn perhaps make people think that the individual behind that activity might be worth reaching out to. I'm really the furthest thing from conceited that there is, and would be happy to share what I do know with everyone, time permitting.

Anyhow, moving on... I was asked this question recently via LinkedIn message:


What advice would you give your younger data scientist self?


While I did not think much about the question at the time, it has been popping into my head more and more over the past few days. I think that is because there is one piece of advice I honestly think I can give people, which is this:


Forget about "data science."


Yep, that's right. You should forget all about data science. Let me explain.

I am a "data scientist" by job title, and I am the editor of a "data science" website, and I deliberately pursued a career of "data science," and I am interested in all things "data science."

But I don't really like the term "data science."

Data science careers
From 5 Career Paths in Big Data and Data Science, Explained.

The reason I don't like the term "data science" is that it is so broad and unspecific.

Data science means nothing, and also everything.

So, what it comes down to is this: I would suggest you forget the data science forest and find a tree that interests you. What aspect of data science caught your attention? Is it machine learning? Is it descriptive statistics? Is it data visualization? Is it distributed computing? Do you like algorithm design? Do you have a background in business, but have sufficient knowledge to talk predictive analytics to clients, and would enjoy interfacing with "the back room"?

This is akin to separating the signal from the noise, and boosting the signal. Finding that personal signal should be your very first task in pursuing "data science."

Understanding what it is you are inherently bringing to "data science" is also valuable. Are you approaching data science from a computer science perspective? From a mathematics background? As a web developer? As a business analyst? A particle physicist?

Realistic expectations are a must, too. Do you have a PhD? A Master's? An undergraduate degree? Are you self-taught, with MOOCs, textbooks, and the like? Don't let anyone tell you that the world of "data science" is unreachable for someone who is self-taught... at the same time, don't expect that you will be doing research at DeepMind after a couple of MOOCs. Know where it is that you might fit in.

Data scientist Maurice Ewing, in his Quora answer to the question "Is data science too easy?", says:

Insofar as R and Hadoop, they’re just part of the data science toolkit. They don’t constitute “data science” any more than a scalpel constitutes “surgery.” In the same way that physics relies upon mathematics, data science relies upon statistical tools for handling large and small data sets, structured and unstructured data, etc. But the mathematics of physics is not a substitute for scientific thinking, analysis, approach or method—and neither are Hadoop and R substitutes for understanding behaviour in data.

He is, of course, absolutely correct. However, his point can also be made from the opposite direction: just as no single skill (or pair of skills, in this case) makes a data scientist, data scientists do not possess any single, unified body of skills; each individual brings their own set of competencies to the profession.

Calling "data science" a profession also strikes me as odd. It seems rather like grouping all of the possible roles and tasks that all doctors, nurses, physiotherapists, personal support workers, and all other health professionals could assume and labeling them all "health scientists" (yes, I know health science is a thing).

Data science puzzle
From The Data Science Puzzle, Revisited.

But I digress. To reiterate: forget data science. Focus on the piece of the data science puzzle in which you are most interested, and things will come together.

I like to equate data scientists to mixed martial artists: just as the MMA fighter may have Brazilian jiu-jitsu or Muay Thai at their core, and gradually add further combat disciplines and techniques as required (boxing, wrestling, etc.), the real-world data scientist may start out with a core competency of machine learning, gradually sharpening their visualization, distributed computing, storytelling, and statistical skills as they progress in their career.

Back to data sciencing...