
How to Become a Data Scientist – Part 2

Check out part 2 of this excellent series of articles on becoming a data scientist, written by someone who spends their day recruiting data scientists. This installment focuses on learning.


Other University Degrees

So a PhD is not for you – perhaps it is the cost, or perhaps you have not yet developed the expertise necessary for research of this nature. Whatever the reason, there is no need to panic, because many universities now offer Bachelor's degrees, Master's degrees and diplomas specifically designed for data science, where both computer science and mathematics/statistics are on the curriculum (the attentive reader will remember we discussed this in Chapter Two).

Courses like these will certainly take you in the right direction, but take note: they won't be enough to convert you into a ready-made data scientist, because as we know – that takes experience.

Learning Resources (Online Courses and Books)

In a similar sense, solely reading books or completing online courses will not make you an expert, and remember: this is an expert field. However, for argument’s sake, let’s say you come from another quantitative field, and hypothetically, this was all you needed to master a chosen subject. That’s great, but don’t forget: you will still face competition from people who, in all likelihood, will have far more practical and commercial experience in these areas. This is really important to be conscious of, and so we will return to this concept in Chapter Four.

All this being said, books and online courses are incredibly useful tools to help kick-start your journey, and begin learning new areas or technologies (e.g. deep learning and Spark, respectively). And this takes us back to Sean McClure, who has already been referenced several times in this series. After the release of Part One, we got speaking and he highlighted the following article, initially posted on Quora and since summarised on KDnuggets by Matthew Mayo: How to Learn Machine Learning.

With contributions from Sean and two other well-known machine learning personalities, it proved to be very popular and is an excellent resource for specific recommendations on books, online courses and general advice. If you don’t want to miss out on any valuable advice, I recommend reading the full post on Quora.

Outside of this, I also asked our resident panel of data scientists for their book recommendations, and they came back with:

  • Pattern Recognition and Machine Learning by Christopher Bishop
  • Machine Learning: A Probabilistic Perspective by Kevin P. Murphy
  • An Introduction to Statistical Learning by James, Witten, Hastie and Tibshirani, which, according to Dylan Hogg, “is a great introduction to statistical learning and is an accessible version of the more advanced classic”, The Elements of Statistical Learning
  • Why: A Guide to Finding and Using Causes by Samantha Kleinberg (if you want to know why this is important, take a look at Yanir Seroussi’s post on: Why You Should Stop Worrying About Deep Learning and Deepen Your Understanding of Causality Instead)
  • For a different suggestion, Will Hanninger recommended The Pyramid Principle by Barbara Minto. It does not cover data science specifically, but is valuable for problem solving and presenting
  • And finally, for improving those all-important soft skills, Yanir recommended the classic book by Dale Carnegie: How to Win Friends and Influence People

Of relevance to this section, it is also worth noting that Experfy has launched a Data Science Certification Course out of the Harvard Innovation Launch Lab.

Presenting / Communicating

Even if you have a natural disposition for communicating with different groups of people, including the non-technical, this should not be taken for granted. Quite frankly, an absence of effective communication in a commercial environment can be a death sentence for your work, and harm your chances of gaining employment in the first place. In short: it is one of the most crucial aspects of data science, yet it is often overlooked.


All this being said, how can you actually develop and improve your communication, outside of reading the relevant books outlined in the previous section? Ultimately, it goes back to the notion of experience, experience, experience, so grasp every opportunity you can to gain practice, and obtain feedback from others – it really is the only way.

To add to this, I would like to direct you to a post that was written by yours truly in response to feedback received after this was originally published via Experfy. Essentially, the first iteration lacked sufficient detail to do this topic justice, so I returned to our favourite data scientists and spun the answers into a short article dedicated solely to this fundamental skill: Data Science: The Art of Communication.

Stay tuned for Part Three, which examines ‘The Job Market’ and is relevant not just to those aspiring to enter the field, but to any data scientist seeking a career move.

Bio: Alec Smith is a specialist recruiter within the field of data science and engineering. The position of an agency recruiter offers a unique, cross-sector perspective of commercial analytics and he leverages this viewpoint to write about various topics within data science, technology and hiring. Originally from the UK, he is currently plying his trade in Sydney, Australia. Follow Alec on Twitter @dataramblings.

This post was originally published on Experfy's Blog.

