Platforms for data science portfolio building. Image by Benjamin O. Tayo.
In the modern age of information technology, there is an enormous number of free resources for data science self-study. In fact, you can even design your own data science curriculum from the many resources available. While knowledge acquired from coursework is essential for laying a good foundation in data science, you need to remember that data science is a practical field. As such, hands-on skills are very important, especially if you are interested in working outside academia as a practicing data scientist.
This article will discuss 4 important platforms that will enable you to build a portfolio to showcase your experience in data science. A strong portfolio will give you an edge over the competition when you apply for data science positions. Keep in mind that employers interested in hiring you are going to ask you to provide evidence of completed data science projects. This famous quote from Elon Musk summarizes the mindset of many employers in technical disciplines, including data science:
“Generally, look for things that are evidence of exceptional ability. I don’t even care if somebody graduated from college or high school or whatever… Did they build some really impressive device? Win some really tough competition? Come up with some really great idea? Solve some really tough problem?”
A strong portfolio highlighting a list of completed projects, recognitions, and awards will serve as evidence of your competence in data science.
Before delving into the topic of building a good data science portfolio, let’s first discuss 5 reasons why a data science portfolio is important.
5 Reasons Why a Data Science Portfolio Is Important
- A portfolio helps you showcase your data science skills.
- A portfolio enables you to network with other data science professionals and leaders in the field.
- A portfolio is good for bookkeeping. You can use it to keep a record of your completed projects, including datasets, code, and sample output files. That way, if you have to work on a similar project, you can reuse code that you have already written, with only minor modifications.
- By building a portfolio and networking with other data science professionals and leaders, you get exposed to technological changes in the field. Data science is a field that is ever-changing due to advances in technology. To keep up with the latest changes and developments in the field, it is important to join a network of data science professionals.
- A portfolio increases your chances of getting a job. For instance, I’ve had numerous recruiters reach out to me on LinkedIn about job opportunities in data science.
Let’s now discuss 4 important platforms for creating a data science portfolio.
Platforms for Building a Data Science Portfolio
GitHub is a very useful platform for displaying your data science projects. As a data science aspirant, you should make GitHub the first platform you use as a repository of completed projects throughout your data science journey. These could include weekly assignments or capstone projects. This platform enables you to share your code with other data scientists or data science aspirants. Employers interested in hiring you may check your GitHub portfolio to assess some of the projects you’ve completed, so it’s important to build a strong and professional portfolio on GitHub.
To establish a GitHub portfolio, the first thing to do is create a GitHub account. Once your account is created, you can go ahead and edit your profile. When editing your profile, it’s a good idea to add a short biography and a professional profile picture. You can find an example of a GitHub profile here: https://github.com/bot13956.
Now let’s assume that you’ve completed an important data science project and you would like to create a GitHub repository for your project.
Tips for creating a repository: Choose a suitable title for your repository. Include a README file that provides a synopsis of your project. Then upload your project files, including the dataset, Jupyter notebook, and sample outputs.
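As a quick sanity check before you upload, the tips above can be sketched as a small Python script. This is only a minimal sketch: the checklist, the file patterns, and the function names `check_repository` and `missing_items` are assumptions made for illustration, not a GitHub requirement. It looks only at the top level of the project folder.

```python
from pathlib import Path

# Hypothetical checklist (an assumption for illustration, not a GitHub rule):
# each item maps to the glob patterns that would satisfy it.
CHECKLIST = {
    "README": ["README*"],
    "notebook": ["*.ipynb"],
    "dataset": ["*.csv"],
}


def check_repository(folder):
    """Map each checklist item to True if a matching top-level file exists."""
    root = Path(folder)
    return {
        item: any(next(root.glob(pattern), None) is not None for pattern in patterns)
        for item, patterns in CHECKLIST.items()
    }


def missing_items(folder):
    """List the checklist items that have no matching file in the folder."""
    return [item for item, present in check_repository(folder).items() if not present]


if __name__ == "__main__":
    # Check the current directory and report what is still missing.
    print(missing_items("."))
```

If your datasets are stored in another format, or in a subfolder, you would adjust the patterns accordingly.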
Here is an example of a GitHub repository for a machine learning project:
Repository Name: bot13956/ML_Model_for_Predicting_Ships_Crew_Size
Repository URL: https://github.com/bot13956/ML_Model_for_Predicting_Ships_Crew_Size
Author: Benjamin O. Tayo
We build a simple model using the cruise_ship_info.csv data set for predicting a ship's crew size. This project is organized as follows:
(a) data preprocessing and variable selection;
(b) basic regression model;
(c) hyper-parameters tuning; and
(d) techniques for dimensionality reduction.
cruise_ship_info.csv: dataset used for model building.
Ship_Crew_Size_ML_Model.ipynb: the Jupyter notebook containing code.
As the sample README shows, the file provides a good summary of the project, including its goals and objectives, the dataset, and the Jupyter notebook containing the code. When preparing a repository, always keep in mind that it is public and other users will have access to it, so prepare it in a way that is easy to understand.
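If you maintain several repositories, a README with this structure can be generated from a few project details. The following is a minimal sketch, not the method used for the repository above: the function name `build_readme`, its parameters, and the Markdown heading are assumptions for illustration. The sample values are taken from the README shown earlier.

```python
import string


def build_readme(title, author, summary, steps, files):
    """Assemble README text: title, author, summary, lettered steps, file list."""
    lines = [f"# {title}", "", f"Author: {author}", "", summary, ""]
    # Label each project step (a), (b), (c), ... as in the sample README.
    for letter, step in zip(string.ascii_lowercase, steps):
        lines.append(f"({letter}) {step}")
    lines.append("")
    # Describe every file in the repository on its own line.
    for name, description in files.items():
        lines.append(f"{name}: {description}")
    return "\n".join(lines) + "\n"


readme_text = build_readme(
    title="ML_Model_for_Predicting_Ships_Crew_Size",
    author="Benjamin O. Tayo",
    summary=(
        "We build a simple model using the cruise_ship_info.csv data set "
        "for predicting a ship's crew size. This project is organized as follows:"
    ),
    steps=[
        "data preprocessing and variable selection;",
        "basic regression model;",
        "hyper-parameters tuning; and",
        "techniques for dimensionality reduction.",
    ],
    files={
        "cruise_ship_info.csv": "dataset used for model building.",
        "Ship_Crew_Size_ML_Model.ipynb": "the Jupyter notebook containing code.",
    },
)
print(readme_text)
```

You could then save `readme_text` to a README file in the project folder and edit it by hand before uploading.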
Kaggle is the world’s largest data science community with powerful tools and resources to help you achieve your data science goals. Kaggle allows users to find and publish datasets, explore and build models in a web-based data science environment, work with other data scientists and machine learning engineers, and enter competitions to solve data science challenges. On this platform, you have access to datasets, courses, notebooks, and competitions. As with GitHub, you start by creating an account and then setting up your profile, including a profile picture and a short bio.
One of the primary purposes of joining Kaggle is to network with other data science professionals. Whether you are new to data science or a seasoned data scientist, you can find a suitable forum on Kaggle where you can discover content and engage in discussion around topics that interest you. Your end goal should be to enter and participate in data science competitions launched on this platform. Because most competitions encourage teamwork, it is important to build a network with other data science aspirants who can serve as team members for Kaggle competitions. As you participate in Kaggle competitions, you can showcase your completed projects, including your datasets, Jupyter notebooks, and project reports, on your public profile.
LinkedIn is a very powerful platform for showcasing your skills and for networking with other data science professionals and organizations. LinkedIn is now one of the most popular platforms for posting data science jobs and for recruiting data scientists. I have landed numerous data science interviews through LinkedIn.
Make sure your profile is up-to-date at all times. List your data science skill sets, as well as your experience, including projects that you’ve completed. It is also worthwhile to list awards and honors, and to let recruiters know that you are actively searching for a job. You can also keep up-to-date on LinkedIn by following data science influencers and publications such as KDnuggets, Towards Data Science, and Towards AI. These publications post updates on interesting data science articles on various topics, including machine learning, deep learning, and artificial intelligence.
You can find an example of my posts on LinkedIn here: https://www.linkedin.com/in/benjamin-o-tayo-ph-d-a2717511/detail/recent-activity/shares/
Medium is now considered one of the fastest-growing platforms for portfolio building and for networking. If you are interested in using this platform for portfolio building, the first step is to create a Medium account. You can create a free account or a member account. With a free account, there is a limit on the number of member articles that you can access per month. A member account requires a subscription fee of $5 per month or $50 per year. You can find out more about becoming a Medium member here: https://medium.com/membership.
Once you’ve created an account, you can go ahead and create a profile. Make sure to include a professional picture and a short bio. Here is an example of a Medium profile: https://medium.com/@benjaminobi.
On Medium, a good way to network with other data science professionals is to follow them. You can also follow specific Medium publications that are focused on data science. Two of the top data science publications are Towards Data Science and Towards AI.
One of the best ways to enhance your portfolio on Medium is to become a Medium writer.
Why should you consider writing data science articles on Medium?
Writing Medium articles has 5 main advantages:
- It provides a means for you to showcase your knowledge and skills in data science.
- It motivates you to work on challenging data science projects, thereby improving your data science skills.
- It enables you to improve your communication skills. This is useful because it enables you to convey information in a way that the general public can understand.
- Every article published on Medium is considered intellectual property, so you can add a Medium article to your resume.
- You can make money from your articles. By means of the Medium Partner Program, anyone who publishes on Medium can make their articles eligible for earning money.
If you are interested in becoming a data science Medium writer, here are some resources that can get you started:
Beginner’s Guide to Writing Data Science Blogs on Medium
Choose the Right Featured Image For Your Data Science Articles
In summary, we’ve discussed 4 important platforms that can be used for building a data science portfolio. A portfolio is a very important way for you to showcase your skills and to network with other data science professionals. A good portfolio will not only help you keep up-to-date with new developments in the field, but it will also help increase your visibility to potential recruiters.