
5 Things to Know Before Rushing to Start in Data Science


A solid understanding of math, computing skills, critical thinking, and presentation skills provide a strong foundation for a career in Data Science.



Here are 5 important things that I wish I had known a year ago, when I decided to start my Data Science journey:

 

1. High school Math is fundamental for Data Science.

Matrix calculations, derivatives, eigenvalues, set theory, functions, vectors, linear transformations, etc. are extremely important for understanding the theory behind statistical methods and programming. Therefore, before starting your next MOOC or Machine Learning book, it’s crucial to review all those concepts again. Most schools require students to be proficient in these topics in order to graduate, so the silver lining is that it won’t take too much of your time to refresh or acquire this knowledge.

 

There are plenty of resources to start with, but what worked for me was The Manga Guide to Linear Algebra, which is very simple and graphic, and provides a great foundation before getting into more complex material.

 


Fig. 1: Inversion of a 3x3 matrix
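
As a small illustration of the kind of exercise worth reviewing, here is a minimal sketch of inverting a 3x3 matrix with the cofactor (adjugate) method taught in school. It is not taken from any particular book; the function names and the example matrix `A` are my own assumptions, and exact fractions are used so that no rounding gets in the way.

```python
from fractions import Fraction


def minor(m, i, j):
    """Return the matrix m with row i and column j removed."""
    return [[v for c, v in enumerate(row) if c != j]
            for r, row in enumerate(m) if r != i]


def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def det3(m):
    """Determinant of a 3x3 matrix, expanded along the first row."""
    return sum((-1) ** j * m[0][j] * det2(minor(m, 0, j)) for j in range(3))


def inverse3(m):
    """Invert a 3x3 matrix as adjugate / determinant, using exact fractions."""
    m = [[Fraction(v) for v in row] for row in m]
    d = det3(m)
    if d == 0:
        raise ValueError("matrix is singular, so it has no inverse")
    # Entry (i, j) of the inverse is the cofactor of entry (j, i), divided by d.
    return [[(-1) ** (i + j) * det2(minor(m, j, i)) / d for j in range(3)]
            for i in range(3)]


def matmul(a, b):
    """Multiply two 3x3 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


# Hypothetical example matrix, chosen only for illustration.
A = [[2, 0, 1],
     [1, 1, 0],
     [0, 3, 1]]
A_inv = inverse3(A)
print(A_inv)
print(matmul(A, A_inv))  # multiplying back should give the identity matrix
```

Multiplying `A` by its computed inverse and checking that the result is the identity matrix is a good habit: it verifies the calculation without needing to trust any single step.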

 

My suggestion is to schedule a few weeks to review these concepts, and to use the Feynman Technique so that you can explain each of these topics in simple terms.

 

2. Although there are many useful internet resources, books are still one of the best tools to learn from.

One of the issues people face today when trying to get into a field such as Data Science is information overload: the effect of having too many resources at one’s disposal. There are hundreds of MOOCs, online courses, specialisations, videos, etc., but the best use of the most valuable resource that we have, “time”, is to pick a book and work from the basics up to new concepts, and then keep filling the gaps with other books.

 

Learning Data Science should be seen as a game of building blocks.

Fig. 2: Lego Blocks

 

I believe this analogy works for learning most things, and it is especially useful in our Data Science journey:

  • First, you need to select the toy model you would like to build.
  • Open all the plastic bags and lay all the different pieces on a flat surface, so you can see all the different parts.
  • Understand how each part can be used. Learn about the characteristics: dimension, color, weight, shape.
  • Start building small chunks until you’ve mastered all the uses.
  • Finally, after you’ve followed the instruction manual and built the model you wanted, take all the pieces apart and start experimenting.

The same should be done with the techniques in each area of Data Science. Learn what most of the blocks are, learn how to use them, and then, when you want to create something more complex, look for the missing parts that you don’t have.

 

3. Computing skills are essential, not just for Data Science but for tomorrow’s world.

Not until I started studying for my master’s in Data Science did I realize something that has been whispered for some time across blog posts, books, and news:

“Computer code accounts for more than 80% of our lives today.”

 

Code is in our smartphones, websites, cars, televisions, health system, public transportation, manufacturing of goods, etc.

 

Fig. 3: Word cloud of Programming Languages

 

Almost every job or profession in industry is directly impacted by some program that handles the input, transformation, and output of information. Learning about programming and how code works is not only about making software, building apps, or creating a great website. Learning how to program will help you understand how technology impacts our lives. Instead of blaming the computer program for “not working”, you will think systematically and understand where the problem may be. And who knows, maybe you’ll come up with better ideas to improve technology from a user’s perspective.

4. Your critical and analytical skills are very important.

I am a big fan of TV shows related to crime and problem solving. One example is Scorpion, which tells the story of a group of geniuses who solve a wide range of problems using technology and math skills. The highlight of this type of show, apart from all the action, jokes, and hero scenes, is the “Critical Thinking” used by the characters to find the solution to different kinds of problems. This is one thing that is not mentioned in most Data Science resources. The ability to find the correct angle from which to approach a problem will help you identify not only which tools to use, but sometimes also the most efficient solution.

5. Everyone likes a TED talk, everyone shares good keynotes about leaders. However, YOU must prepare to deliver your findings.

There are many visualization packages (seaborn, ggplot, matplotlib) and software tools (Tableau, Excel) that can help create wonderfully crisp charts, so avoid getting saturated with too many options. The most important thing is how the message is delivered. Sometimes the simplest tools will generate a clear, relevant outcome.
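
To make the point about simple tools concrete, here is a minimal sketch of a plain-text bar chart that uses nothing beyond the Python standard library. The function name `text_bar_chart` and the numbers in `survey` are hypothetical, invented purely for illustration; they are not real survey results.

```python
def text_bar_chart(data, width=40):
    """Return a plain-text horizontal bar chart, one line per (label, value)."""
    if not data:
        return ""
    top = max(value for _, value in data)
    label_width = max(len(label) for label, _ in data)
    lines = []
    for label, value in data:
        # Scale each bar so that the largest value fills the full width.
        bar_len = round(width * value / top) if top > 0 else 0
        lines.append(f"{label.ljust(label_width)} | {'#' * bar_len} {value}")
    return "\n".join(lines)


# Hypothetical numbers, made up only to show the output format.
survey = [("Python", 50), ("R", 25), ("SQL", 40)]
print(text_bar_chart(survey, width=20))
```

A chart like this will not win any design awards, but it can carry a single clear message straight from a terminal or an email, which is often all a finding needs.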
