5 Ways Data Scientists Keep Learning After College

Drawn from the answers experts gave, here is a list of 5 essential actions and attitudes that keep data scientists learning after their degrees.

By Daniel Levine

Data science not only requires many areas of expertise, but is also constantly changing. As Amy Heineike explained, “The technologies and what we’re building on is evolving so so rapidly that even if you have a really good understanding of something right now, in two years you’re going to be out of date.”

In our interviews, data science experts made it clear that staying current in the field takes a good deal of upkeep. Randy Bartlett believes that even after a master’s degree, “you have to learn about 50% of data science on your own.” Edwin Chen went even further, saying that “data science still isn’t really something you learn in school, though more and more schools are offering data science programs.”

To learn more about what’s required of data scientists, we spoke with a number of experts in the field:

  • Randy Bartlett has held analyst roles at Citibank, Wells Fargo, PwC, and AstraZeneca, has authored A Practitioner’s Guide to Business Analytics, and holds two patents for predictive modeling.
  • Edwin Chen has worked on ads quality at Twitter, quantitative analysis at Google, and data science at Dropbox. His blog is a must-read among data enthusiasts.
  • Jason Dolatshahi created a data science curriculum for General Assemb.ly and taught the first session of the introduction to data science course. He is currently the Manager of Data Science at Bonobos.
  • Amy Heineike co-authored Data Scientists at Work, was the Head of Mathematics at Quid, and is now the VP of technology at a stealth startup.
  • Rob Hyndman has written more than 100 research papers and 5 books. He is currently the Editor-in-Chief of the International Journal of Forecasting.
  • Mark Madsen has received numerous information management awards including the Smithsonian/Computerworld award for innovative use of information technology. He is the President at Third Nature.
  • Andreas Weigend is the former Chief Scientist at Amazon. He’s written over 100 scientific papers on machine learning techniques and is currently a professor at the UC Berkeley Social Lab.

From the answers our panel gave when asked whether you can teach yourself data science, we’ve compiled a list of five essential actions and attitudes that keep data scientists learning long after their degrees.

1. Go to events and join communities

Rob Hyndman provided insight into why it’s crucial for data scientists to connect with their peers through data communities and conferences:

Hanging around Q&A sites like crossvalidated.com is really useful. Typically someone who’s practicing in data science will also be attending conferences like useR! or their local data science meetup group or their local R user group. There’s often speakers coming through that they’re getting new ideas from, or they’re discussing some package that they’ve heard of. There’s a lot of self-learning happening that way.

At these events and on Q&A sites, data enthusiasts can connect with one another and discuss their latest findings and roadblocks.

2. Focus on asking the right question, not how to use the right tool

Weigend believes that to keep learning, it’s important to avoid getting bogged down in software:

Don’t be swayed by consultants that tell you Hadoop is data science. It’s not about the plumbing; it’s what you do with it. Many consultants make money by selling you systems, but instead you should ask the right questions. That’s why data scientists come from other fields like physics; they are used to carrying out experiments, forming hypothesis. Know what tool to pick for a given problem, and formulate the question.

If you’re stuck on which software to use, read Which Big Data, Data Mining, and Data Science Tools go together?

3. Participate in Kaggle competitions

Kaggle is a platform where data scientists take data posted by companies and compete to see who can produce the best models for that data. Chen spoke about how open competitions like these are a fantastic way to practice your skills:

There’s plenty of data online just waiting to be analyzed (e.g., Kaggle competitions for machine learning, interesting public datasets through a bunch of initiatives), so just start doing it.

If you’re looking to start competing, read the Quora answers to What do top Kaggle competitors focus on?