5 steps to actually learn data science

Data science is a broad and varied field, so the path to becoming a unicorn can be hard to see. To light the way, here are 5 simple steps to follow.

4. Learn from peers

It’s amazing how much you can learn from working with others. In data science, teamwork can also be very important in a job setting.

Some ideas here:

  • Find people to work with at meetups.
  • Contribute to open source packages.
  • Message people who write interesting data analysis blogs to see if you can collaborate.
  • Try out Kaggle, a machine learning competition site, and see if you can find a teammate.

5. Constantly increase the degree of difficulty

Are you completely comfortable with the project you’re working on? Has it been more than a week since you last used a new concept? It’s time to work on something more difficult. Data science is a steep mountain to climb, and if you stop climbing, it’s easy to never reach the top.

If you find yourself getting too comfortable, here are some ideas:

  • Work with a larger dataset. Learn to use Spark.
  • See if you can make your algorithm faster.
  • How would you scale your algorithm to multiple processors? Can you do it?
  • Understand the theory behind the algorithm you’re using in more depth. Does this change your assumptions?
  • Try to teach a novice to do the same things you’re doing now.
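
As a small illustration of the "make your algorithm faster" idea, here is a minimal Python sketch. It is only one hypothetical example, not a prescribed exercise: the duplicate-detection task and the names has_duplicates_slow, has_duplicates_fast, and time_call are made up for this post. The point is the habit of checking that two versions agree and then measuring them, rather than guessing which one is faster.

```python
import random
import timeit


def has_duplicates_slow(values):
    """Compare every pair of items: O(n^2) comparisons."""
    for i in range(len(values)):
        for j in range(i + 1, len(values)):
            if values[i] == values[j]:
                return True
    return False


def has_duplicates_fast(values):
    """Track items already seen in a set: O(n) on average."""
    seen = set()
    for value in values:
        if value in seen:
            return True
        seen.add(value)
    return False


def time_call(func, data, repeats=3):
    """Return the best wall-clock time in seconds over several runs."""
    return min(timeit.repeat(lambda: func(data), number=1, repeat=repeats))


# Worst case for both versions: no duplicates, so neither can stop early.
data = random.sample(range(1_000_000), 2_000)

# Confirm both versions give the same answer before comparing speed.
assert has_duplicates_slow(data) == has_duplicates_fast(data)

print(f"slow: {time_call(has_duplicates_slow, data):.4f} s")
print(f"fast: {time_call(has_duplicates_fast, data):.4f} s")
```

Try increasing the size of the sample and watch how the two timings grow. Explaining why they diverge is a good way to connect the theory of an algorithm to what you actually observe.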

The bottom line

This is less a roadmap of exactly what to do than it is a rough set of guidelines to follow as you learn data science. If you do all of these things well, you’ll find that you’re naturally developing data science expertise.

I generally dislike the “here’s a big list of stuff” approach, because it makes it extremely hard to figure out what to do next. I’ve seen a lot of people give up learning when confronted with a giant list of textbooks and MOOCs.

I personally believe that anyone can learn data science if they approach it with the right frame of mind.

I’m also the founder of Dataquest, a site that helps you learn data science in your browser. It encapsulates a lot of the ideas discussed in this post to create a better learning experience. You learn by analyzing interesting datasets like CIA documents and NBA player stats. You also complete projects and build a portfolio. It’s not a problem if you don’t know how to code – we teach you Python. We teach Python because it’s the most beginner-friendly language, is used in a lot of production data science work, and can be used for a variety of applications.

Some helpful resources

As I worked on projects, I found these resources helpful. Remember, resources on their own aren’t useful – find a context for them:

This post is adapted from my Quora answer on how to become a data scientist.

Bio: Vik Paruchuri is a self-taught data scientist, and the founder of Dataquest.io, a platform for learning data science in your browser.