My Data Science Online Learning Journey on Coursera
Check out the author's informative list of courses and specializations on Coursera taken to get started on their data science and machine learning journey.
By Ruben Winastwan, Data Science Enthusiast
An Introduction: My Background
In 2016, I began my Master’s Degree in Computational Mechanics straight after finishing my Bachelor’s Degree in Mechanical Engineering. Back then, I had limited programming knowledge and no idea what data science or machine learning were.
The kick-start came when I was given a classic Computer Vision project during my Master’s studies, in which I needed to build object detection and object tracking algorithms using Python, C++, and OpenCV. That project forced me to learn Python and C++ the hard way, as well as how to write clean code.
Long story short, I became fascinated by the field of Computer Vision, which led to an obsession: I wanted to become a Computer Vision Engineer.
But those high hopes turned to dust after I read the job requirements in the Computer Vision Engineer vacancies I found: they expected candidates to know Machine Learning and Deep Learning, in particular Convolutional Neural Networks (CNN).
Back then, I didn’t even know what machine learning was, let alone CNNs. Although my study program touched on programming, math, and statistics, we never talked about machine learning.
After some research, I found out that to learn about CNNs, you first need to know about Deep Neural Networks (DNN) in general. To understand DNNs, you first need classical Neural Networks. To understand Neural Networks, you first need machine learning. And to understand machine learning in general, you first need the fundamentals of data science.
It’s like a video game: I needed to level up step by step until I reached the topics I wanted. Plus, I’m a big fan of the bottom-up approach, so I decided to learn the fundamentals of data science first.
The question back then: how could I learn all of this when my study program didn’t offer any related courses?
I needed to learn all of it by myself.
And that’s when I first learned that Coursera existed.
Why Coursera, though?
First of all, I don’t mean to endorse Coursera in this article. I just find it the best online learning platform for me, as it offers plenty of data science and machine learning courses from reputable institutions. Plus, you have the option to audit a course for free and still get access to the learning materials.
On top of that, if you really want to pursue a specialization certificate, the overall cost is much lower than that of a Udacity Nanodegree, especially if you’re still a student.
Okay, enough about that. Let’s jump into my learning pathway.
My Data Science Learning Pathway
I think we can all agree that the hardest part of anything is the beginning. It was the same for me when I wanted to get my hands dirty with data science. I kept asking myself: where do I start?
After some research, I finally came up with my online learning curriculum. Here is the list of courses and specializations that I took on Coursera, in chronological order.
I decided to start learning data science at a very basic level because I didn’t want to miss any important concepts. That’s why I chose IBM Data Science as my very first specialization.
You don’t need any prior knowledge of data science, statistics, machine learning, or programming before taking this specialization. Its very first course is literally called ‘What is Data Science?’. I mean, it doesn’t get any more basic than this, right?
There are 9 courses in this specialization. It starts with the concepts and methodology of data science before delving into programming with Python and SQL. Next, it introduces you to the meat of data science: Statistics, Data Analysis, Data Visualization, and Machine Learning.
You won’t be an expert in data science after completing this specialization, as it doesn’t cover each topic in great detail. However, it gave me a very good overview of data science and of what I should learn next.
Thanks to this specialization, I was able to create a roadmap for my data science and machine learning online learning journey as follows:
- Data Visualization
- Machine Learning
- Deep Learning
That led me to the next specialization I took.
This is a specialization offered by Cloudera that focuses on using SQL for Big Data analysis. In total, there are 3 courses in this specialization.
Data volumes today are often too large to store in a single traditional DBMS, so knowledge of and hands-on experience with data in distributed clusters are very important. This specialization teaches you exactly that.
What I really like about this specialization is how hands-on it is. With the Virtual Machine from Cloudera, you get to write SQL queries that retrieve or store data with Apache Hive, Apache Impala, MySQL, or PostgreSQL. You can revisit the Virtual Machine even after you have finished the specialization, so you can always brush up on your SQL skills and play around with the data.
Don’t worry if you know nothing about SQL, as this specialization will teach you from the basics.
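To give a feel for the store-and-retrieve workflow described above, here is a minimal sketch that uses Python’s built-in sqlite3 module as a stand-in. It does not use Hive, Impala, or the Cloudera Virtual Machine, and the table, columns, and sample rows are made up purely for illustration.

```python
import sqlite3


def top_customers(rows, limit=2):
    """Store (customer, amount) rows, then return the top spenders via SQL."""
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database
    try:
        # Hypothetical table used only for this illustration.
        conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
        conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)
        query = """
            SELECT customer, SUM(amount) AS total
            FROM orders
            GROUP BY customer
            ORDER BY total DESC
            LIMIT ?
        """
        return conn.execute(query, (limit,)).fetchall()
    finally:
        conn.close()


sample = [("ana", 20.0), ("bo", 5.0), ("ana", 15.0), ("cy", 30.0)]
print(top_customers(sample))
```

The same SELECT, GROUP BY, and ORDER BY pattern carries over to the engines mentioned above, although each has its own dialect details.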
I took this specialization to complement what I had learned from the previous specialization from Cloudera. While the Cloudera specialization focused on applying SQL in distributed clusters, this one gave me the chance to apply SQL in the cloud.
This specialization teaches you how to retrieve or store data in BigQuery on Google Cloud Platform (GCP). You’ll get to play around with Google public datasets, such as Google Analytics data, and write the SQL queries yourself.
Aside from that, what I like about this specialization is that you’ll learn more than just SQL and BigQuery. You’ll also learn how to use Google Data Studio to create an interactive data visualization dashboard and how to create a simple regression or classification machine learning model directly in BigQuery.
After taking this specialization, I moved on to one of the most important concepts, if not the most important, behind data science and machine learning: statistics.
We can agree that statistics is the heart of data science. As I already had some background in statistics, I took this specialization expecting to refresh the fundamental theory. But in the end, I got more than I expected.
The specialization covers the fundamentals thoroughly: probability, inferential statistics, and regression, from both frequentist and Bayesian perspectives.
There are two things that I like about this specialization:
- All of the final projects are portfolio-worthy, which means that you need to do real statistical data analysis work, so don’t expect to finish them in 1 or 2 hours. After you finish the specialization, you will have 3 or 4 portfolio-worthy projects that you can put on your resume.
- You need to use R to finish the project in each course. This was good for me because I had never used R before. I think learning a new programming language is beneficial in the long run, and R is definitely a nice data science and statistical toolbox to add to your skillset.
After finishing the specialization, I felt that I wanted to dig a little deeper into Bayesian statistics, in particular Markov chain Monte Carlo. That’s why I took one more statistics course after this specialization, which was…
If you want to understand Bayesian statistics in a comprehensive way, I think this is the right course for you. In this course, you’ll learn the concepts behind Markov chain Monte Carlo as well as how to solve regression problems with a Bayesian approach.
What I really like about this course is the balance between the theory and practical aspects.
For every topic, the theory is covered first, followed by a demonstration in which the lecturer shows you how to implement the theory you’ve just learned in code. In this course, you’ll learn how to implement Bayesian statistics in R and JAGS.
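To illustrate the idea behind Markov chain Monte Carlo, here is a minimal random-walk Metropolis sketch. It is written in plain Python rather than the R and JAGS used in the course, and the example problem (a coin that shows 7 heads in 10 flips, with a uniform prior on its bias), the step size, and the function names are my own assumptions for illustration.

```python
import math
import random


def metropolis(log_density, start, n_samples, step=0.2, seed=0):
    """Random-walk Metropolis sampler for a one-dimensional target."""
    rng = random.Random(seed)
    x = start
    log_p = log_density(x)
    samples = []
    for _ in range(n_samples):
        proposal = x + rng.gauss(0.0, step)  # symmetric Gaussian proposal
        log_p_new = log_density(proposal)
        # Accept with probability min(1, p_new / p_old).
        accept_prob = math.exp(min(0.0, log_p_new - log_p))
        if rng.random() < accept_prob:
            x, log_p = proposal, log_p_new
        samples.append(x)
    return samples


def log_posterior(theta):
    """Unnormalized log posterior: 7 heads in 10 flips, uniform prior."""
    if not 0.0 < theta < 1.0:
        return float("-inf")  # zero posterior mass outside (0, 1)
    return 7 * math.log(theta) + 3 * math.log(1.0 - theta)


draws = metropolis(log_posterior, start=0.5, n_samples=20000)
kept = draws[1000:]  # discard early draws as burn-in
posterior_mean = sum(kept) / len(kept)
print(round(posterior_mean, 3))
```

For this toy problem the exact posterior is Beta(8, 4) with mean 2/3, so the sampled mean should land near that value. That makes it a handy sanity check before moving to models without a closed-form answer.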
The final project for this course is also portfolio-worthy and similar to those in the Statistics with R specialization above. You will be asked to do statistical analysis work with Bayesian methods in R.
After finishing the course, I decided to move forward to the next topic, which is data visualization.
I would normally use Python to visualize data, with the help of Matplotlib, Seaborn, or Plotly. However, I wanted to learn something new: how to visualize data using Business Intelligence tools such as Power BI or Tableau. And then I found this specialization.
I would recommend this specialization if you are new to Tableau and want to learn to visualize the data with it.
There are 5 courses, including a Capstone project, in this specialization. The first three courses give you a theoretical understanding of data visualization best practices and of how to tell a story with your data. The fourth course is where you get your hands dirty with Tableau, as you learn how to create an interactive data visualization dashboard and story with it.
What I really like about this specialization is that enrolling gives you free access to Tableau Desktop for 6 months.
This means that you can explore a lot of Tableau’s functionality on your local machine and create a lot of interesting visualizations with it. If the license expires after 6 months, you’ll have a chance to extend it for a further 6 months.
At this point, I had covered an overview of data science, Big Data analysis using SQL, statistics, and data visualization best practices. Next, it was finally time for me to learn about machine learning.
As a total beginner in machine learning, I decided to take Andrew Ng’s Machine Learning course, knowing that it is probably the best-known machine learning course on Coursera.
And its reputation is totally justified. I don’t believe I could have found a better machine learning course for a beginner than this one.
The course teaches you the concepts behind classical supervised and unsupervised machine learning algorithms like Linear Regression, Logistic Regression, SVM, and K-means clustering, as well as artificial neural networks. Not only that, Andrew also gives tips and tricks for applying machine learning systems in practice.
Basically, I liked everything about this course.
I liked how passionate Andrew Ng is about teaching the different types of machine learning algorithms. I liked how easily he explains and simplifies difficult machine learning concepts. I also liked the programming assignments and the opportunity to implement Neural Network algorithms from scratch.
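To show what implementing an algorithm from scratch can look like, here is a minimal sketch of Linear Regression trained with batch gradient descent in plain Python. It is not the course’s own assignment code; the function name, learning rate, epoch count, and toy data are assumptions chosen for illustration.

```python
def fit_line(xs, ys, lr=0.05, epochs=2000):
    """Fit y = w * x + b by batch gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Prediction error for every training example.
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        # Gradients of the cost (1 / 2n) * sum(error^2) w.r.t. w and b.
        grad_w = sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(errors) / n
        # Step against the gradient.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Toy data generated from y = 2x + 1, so the fit should recover w near 2, b near 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = fit_line(xs, ys)
print(round(w, 3), round(b, 3))
```

Writing the update rule by hand like this makes it easier to see what a library call is doing for you later on.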
If you’re new to machine learning, for me this is the best course to get you started.
Finally, I was getting closer and closer to my initial goal: learning about Convolutional Neural Networks.
I still remember how excited I was when I found out that Andrew Ng is the teacher of this Deep Learning specialization. It was not a difficult decision for me to take this specialization right after I finished the Machine Learning course.
The specialization is very well structured. The first course teaches you the concepts behind Deep Neural Networks, building on the classic Neural Networks from the previous Machine Learning course. Next, it covers the important concepts of Convolutional Neural Networks and Sequence Models.
Andrew Ng is, as usual, excellent at teaching difficult concepts in deep learning. The programming assignments are interesting and let you implement various deep learning algorithms with TensorFlow, one of the most widely used deep learning frameworks in the industry right now.
However, most of the programming assignments in this specialization are still implemented in TensorFlow 1, which is pretty much outdated now.
I believe that this specialization was called TensorFlow in Practice before DeepLearning.AI changed its name to TensorFlow Developer Professional Certificate.
Anyway, the main reason I took this specialization straight after finishing the Deep Learning specialization is that I wanted to learn how to use TensorFlow 2 to implement various deep learning algorithms. And this specialization totally delivered that.
This specialization is purely hands-on. You won’t find any deep learning theory in it, as its focus is on implementing deep learning algorithms with the help of TensorFlow. It is therefore recommended that you already know the deep learning concepts before taking it.
It gives you hands-on experience on how to build deep learning models for image classification, sentiment analysis, poetry generation, and time series forecasting.
As a bonus, if you want to take the TensorFlow Developer Certificate exam in the future, this specialization is also a great way to prepare. I recently took the certification, and this specialization was the best preparation source I found. If you’re interested in my experience of taking the certification, you can read about it in the link below.
My Story of Taking the TensorFlow Developer Certification Exam
I assume you already know that taking data science and machine learning courses alone isn’t enough to achieve your goal, whether that is getting a data science job or mastering certain data science concepts. The same goes for me: having taken courses related to CNNs doesn’t mean that I have mastered them.
These courses are a great source to give you the foundational knowledge of whatever topics you are interested in. Taking the course is just a starting point and whatever happens next is totally up to you.
Put the knowledge you’ve gained from a course into practice to really solidify your new skill. Do some pet projects while or after taking these courses, upload the code to GitHub, and share the projects or what you’ve learned with other people in a blog post.
All the best for your data science learning journey!
Bio: Ruben Winastwan is a data science enthusiast, with interests in machine learning and computer vision.
Original. Reposted with permission.