How to Become a Data Scientist: The Definitive Guide
Data science educator Jose Portilla provides this definitive guide on becoming a data scientist, which includes everything from resources for acquiring specific skills, to searching for the first job, to mastering the interview.
The Community
“Good company in a journey makes the way seem shorter.” — Izaak Walton
The job search for data scientist positions can take a while, so it's best to begin building out your network early!
One of the best ways to begin building out your network is to attend meetups that involve data science! You don't need to limit yourself strictly to data science, though. Attend meetups on any related topic, such as Python meetups or visualization meetups.
Conferences are another great way to connect with data scientists. While many conferences can be prohibitively expensive, they will often have a career fair as part of the event, and if you only intend to visit the career fair you can often get a discounted or even free pass. Conferences also often host workshops where you can learn new skills!
You should also begin to check out online communities and resources. The O'Reilly data newsletter, Kaggle, and KDnuggets are great ways to plug yourself into what is happening in the data science community. Podcasts are another great way to get to know the community. I recommend checking out Talking Machines, Partially Derivative, and the O'Reilly Data Show.
It is also worth exploring general technology communities, such as Quora and HackerNews!
The Job Search and the Interview
“If we have data, let’s look at data. If all we have are opinions, let’s go with mine.” — James L. Barksdale
So you’ve learned your skills, networked, and are now ready to begin working as a data scientist!
The Job Search
The first step is to begin your search for a new job. A lot of this will vary depending on your personal circumstances and goals, so I'll try to keep the advice as general as possible.
One of the best ways to begin your search and practice your skills at the same time is to participate in Kaggle challenges and blog about your experience with them. Some Kaggle challenges can even lead directly to interviews as part of the prize! Even if nothing comes of the prize, it's still valuable experience on a real data set! Note that Kaggle also has its own job board for data scientists.
Freelancing through sites like Upwork, contributing to open-source projects, and answering questions on Stack Overflow are other great ways to make your presence known to recruiters.
You will also want to make sure that your CV, LinkedIn, and GitHub are all updated to reflect your new skills and projects.
Make use of sites like Indeed or DataJobs for a general job search, or try out sites like Triplebyte that give you a series of technical interviews up front, so you can get through the initial interview phase for many companies at once. You can also check out startup jobs on the AngelList job board and the Hacker News job board.
The Interview
For better or for worse, many companies still rely on classic interview questions that involve data structures and algorithms. To prepare for these sorts of questions, review topics such as arrays, graphs, recursion, linked lists, and stacks. Work from a book or course, and go through lots of practice problems! I have courses on these topics, and you can view some of the material for free in my popular GitHub repository, which contains lots of Jupyter notebooks with practice questions and solutions:
jmportilla/Python-for-Algorithms--Data-Structures--and-Interviews
Python-for-Algorithms--Data-Structures--and-Interviews - Files for Udemy Course on Algorithms and Data Structures (github.com)
You can also check out a list of practice problems on LeetCode:
Level up your coding skills and quickly land a job. This is the best place to expand your knowledge and get prepared... (leetcode.com)
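To give a flavor of the classic data structure questions mentioned above, here is a minimal sketch of one common warm-up: reversing a singly linked list in place. The `Node` class and the helper names are my own illustration, not taken from any particular course or problem set, and an interviewer may well expect a different interface.

```python
class Node:
    """A single node in a singly linked list (illustrative helper)."""

    def __init__(self, value, next_node=None):
        self.value = value
        self.next_node = next_node


def reverse_list(head):
    """Reverse the list in place and return the new head."""
    previous = None
    current = head
    while current is not None:
        following = current.next_node   # remember the rest of the list
        current.next_node = previous    # point this node backwards
        previous = current              # advance the "reversed so far" head
        current = following             # move on to the next original node
    return previous


def to_python_list(head):
    """Collect node values into a regular list, for easy inspection."""
    values = []
    while head is not None:
        values.append(head.value)
        head = head.next_node
    return values


# Build 1 -> 2 -> 3, then reverse it.
head = Node(1, Node(2, Node(3)))
reversed_head = reverse_list(head)
print(to_python_list(reversed_head))
```

The loop visits each node once and uses a constant amount of extra memory, which is the kind of trade-off you should be ready to talk through out loud in an interview.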
For more specific data science questions, you'll need to familiarize yourself with a wide variety of topics, such as probability, programming in R or Python, SQL queries, and possibly big data tools (such as Spark). You should also be comfortable with modeling and the reasoning behind choosing parameters, for example the differences between L1 and L2 regularization.
Many companies also give take-home tasks. These can be a great opportunity to get some extra practice in, even if the job offer itself doesn't pan out.
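As one way to think about the L1 versus L2 question, here is a small sketch under a simplifying assumption: a single coefficient whose unpenalized estimate is `z`, so each penalty has a closed-form answer. The function names and the sample numbers are made up for illustration. With an L1 penalty, minimizing `0.5*(b - z)**2 + lam*abs(b)` gives soft-thresholding, while with an L2 penalty, minimizing `0.5*(b - z)**2 + 0.5*lam*b**2` gives proportional shrinkage.

```python
def l1_shrink(z, lam):
    """Minimizer of 0.5*(b - z)**2 + lam*abs(b): soft-thresholding."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0  # small estimates are set exactly to zero


def l2_shrink(z, lam):
    """Minimizer of 0.5*(b - z)**2 + 0.5*lam*b**2: proportional shrinkage."""
    return z / (1.0 + lam)


# Hypothetical unpenalized coefficient estimates and penalty strength.
estimates = [3.0, -0.4, 0.1, -2.0]
lam = 0.5

l1_result = [l1_shrink(z, lam) for z in estimates]
l2_result = [l2_shrink(z, lam) for z in estimates]

print("L1:", l1_result)
print("L2:", l2_result)
```

In this simplified setting, L1 zeroes out the two small coefficients entirely, while L2 pulls every coefficient toward zero without eliminating any of them. That is the usual intuition for why L1 tends to produce sparse models and L2 tends to produce small but nonzero weights.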
If you are looking for corporate in-person training, feel free to contact me at: training AT pieriandata.com
Bio: Jose Portilla is a Data Science consultant and trainer who currently teaches online courses on Udemy. He also conducts training as the Head of Data Science for Pierian Data Inc.
Original. Reposted with permission.