A 15-Minute Guide to Choosing Effective Courses for Machine Learning and Data Science

Advice for young professionals in non-CS fields who want to learn and contribute to data science/machine learning. Curated from personal experience.

Do as much side-study of mathematical basics as possible

This aspect of learning cannot be over-emphasized, especially for non-CS graduates and IT engineers who have not been in touch with rigorous mathematics for some years of their professional lives. I even wrote a Medium article on what mathematics knowledge is necessary for machine learning and data science.

Mathematics necessary to learn/refresh for gaining foothold in data science/machine learning

For this, I chose a few courses from Coursera and edX. A few of them stand out for their depth and rigor:

  • Statistical Thinking for Data Science and Analytics (Columbia Univ.): A foundational statistics course from Columbia University, part of their Data Science Executive certificate program on edX. Rigorous, but it drills the concepts in very well and in a structured manner.
  • Computational Probability and Inference (MIT): This is a hard one from MIT, be warned! It covers advanced topics like Bayesian models and graphical models in unparalleled depth.
  • Statistics with R Specialization (Duke Univ.): This is a 5-course specialization from Duke University (the last one is a capstone project, which you can ignore) that strengthens your statistics foundation along with hands-on programming exercises. Recommended for its balanced difficulty level and rigor.
  • LAFF: Linear Algebra - Foundations to Frontiers (UT Austin): This is an amazing course on linear algebra foundations (along with a deep discussion of high-performance computing for linear algebra routines) that you must try. It is offered by the University of Texas at Austin on the edX platform. Trust me when I say that, after taking this course, you will never want to invert a matrix to solve a linear system of equations, however tempting and easy to understand that approach is. Instead, you will look for a QR factorization or Cholesky decomposition to reduce the computational cost.
  • Optimization Methods in Business Analytics (MIT): This is a course from MIT on optimization/operations research methods for business analytics. I signed up because this was the only highly rated course on a good platform (edX) that I could find about linear and dynamic programming techniques. I believed that learning those techniques could be immensely helpful, since optimization problems turn up in almost every machine learning algorithm.
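To make the LAFF point above concrete, here is a minimal sketch of solving a linear system through a Cholesky factorization instead of an explicit matrix inverse. It is an illustration only, written in plain Python and assuming the matrix is symmetric positive-definite; the function names `cholesky` and `solve_spd` and the small example system are my own choices, not something taken from the course.

```python
import math


def cholesky(a):
    """Return lower-triangular L such that L * L^T == a.

    Assumes `a` is a symmetric positive-definite matrix (list of lists).
    """
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(a[i][i] - s)
            else:
                L[i][j] = (a[i][j] - s) / L[j][j]
    return L


def solve_spd(a, b):
    """Solve a x = b without ever forming the inverse of a."""
    n = len(a)
    L = cholesky(a)
    # Forward substitution: solve L y = b
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    # Back substitution: solve L^T x = y
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x


# Hypothetical example system, chosen only for illustration
A = [[4.0, 2.0], [2.0, 3.0]]
b = [2.0, 1.0]
x = solve_spd(A, b)
print(x)
```

In day-to-day work you would normally call a tested library routine rather than hand-roll the factorization; the sketch only shows the two triangular solves that replace the inverse.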

Please note that I did not search and sign up for any calculus course as I was comfortable with the level of knowledge I could remember (from college days) and what I expected to be useful for any machine learning or data science study and practice. If you are rusty in that area, please search for a good one.

Machine Learning — various personalities make it a colorful affair

Somewhere among all these side-studies, I managed to complete the course that is considered one of the pioneers of all MOOCs: Andrew Ng’s machine learning course on Coursera. Plenty of articles have been written about it already, so I will not waste any more of your time describing it. Just take it, do all the homework and programming assignments, learn to think in terms of vectorized code for all the major machine learning algorithms that you know of, and save the notes as a ready reference for your future work.

Oh, by the way, if you want to brush up on MATLAB or learn it from scratch (you will need to write MATLAB code for this course, not R or Python), then you can check out this course: Introduction to Programming with MATLAB.

Now, I want to talk about personalities.

I took multiple machine learning courses and the aspect I enjoyed most was realizing how the treatment of the same fundamental subject becomes a function of the personality and worldview of different instructors :) This was a fascinating experience.

Here are the various machine learning MOOCs I signed up for and covered:

  • Machine Learning (Stanford Univ.): Andrew Ng’s widely known course. Talked about it in the paragraph above.
  • Machine Learning Specialization (Univ. of Washington): This comes with a different flavor than Ng’s. Emily Fox and Carlos Guestrin present the concepts from a statistician’s and a practitioner’s perspective, respectively. I could not install the Python package that Carlos’ company offers under a free license, but this specialization is worth completing for its theory lectures alone. The proofs and discussions of some of the fundamental concepts, like the bias-variance trade-off, cost computation, and the comparison of analytic vs. numerical approaches to cost function minimization, are presented more intuitively and carefully than even in Prof. Ng’s course (and that’s saying something, given the superb quality of Prof. Ng’s teaching).
  • Machine Learning for Data Science and Analytics (Columbia Univ.): This course had a slightly unusual syllabus for a general machine learning course, devoting the full first half to lectures on conventional algorithms. It covered essential sorting, searching, graph traversal, and scheduling algorithms. There is not much one-to-one discussion of exactly how these algorithms are used in machine learning problems, but studying them gives you an idea of the traditional computer science knowledge necessary to appreciate how large-scale data science problems are tackled. Think O(n^3) whenever you are about to multiply two matrices, or O(n log n) whenever you are sorting a list. You may not use this knowledge explicitly in your day-to-day job, but knowing these nuts and bolts of the computation process certainly broadens your view of the problem at hand.
  • Data Science: Data to Insights (MIT xPro 6-week online course): This one is among the very few paid courses I have taken (I generally go the audit route for MOOCs). It is not available on the public edX website, although it uses the edX platform for delivering content. The 6-week course is well structured and full of interesting content that opens up the wide world of data science and machine learning to the uninitiated. The case studies are very interesting but reasonably hard and time-consuming to code. The lectures are very engaging and illustrate those case studies well. My particular favorite module was the one about recommendation systems. I literally started viewing the Netflix screen on my laptop in terms of an adjacency matrix after taking this class!
  • Neural Networks for Machine Learning (Univ. of Toronto): This is a somewhat underrated course on Coursera, even with the neural network pioneer Geoffrey Hinton as the instructor. I realize that Andrew Ng’s new Deep Learning specialization will directly compete with this course, and I would not be surprised if Coursera removes it in the near future. However, while it is there, a deep learning enthusiast should sit through this one, even if just to gauge the pattern of the historical development of deep networks.
  • Deep Learning Specialization (deeplearning.ai): This is the newest kid on the block, but it stands on the very broad shoulders of Andrew Ng, and therefore boasts very strong legs :) I have finished the 2nd course and am on to the 3rd now. The jury is still out, but you should definitely consider completing this series if you want to brush up on the latest trends in deep learning. Even if the programming assignments look hard and you want to stay out of programming a deep network by hand (you can argue there are always excellent open-source packages like TensorFlow, Keras, and Theano out there to take care of the nuts and bolts under the hood), it is imperative to have a deep understanding of essential concepts such as regularization, exploding gradients, hyperparameter tuning, batch normalization, etc. to use those high-level deep learning frameworks effectively.
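As a quick check on the "think O(n^3)" remark in the Columbia course entry above, the sketch below counts the scalar multiplications performed by the textbook triple-loop matrix product. It is only an illustration: the helper names `matmul_with_count` and `identity` are hypothetical, and optimized libraries use blocked or otherwise cleverer routines than this naive loop.

```python
def matmul_with_count(a, b):
    """Naive matrix product that also counts scalar multiplications."""
    rows, inner, cols = len(a), len(b), len(b[0])
    c = [[0.0] * cols for _ in range(rows)]
    mults = 0
    for i in range(rows):
        for j in range(cols):
            for k in range(inner):
                c[i][j] += a[i][k] * b[k][j]
                mults += 1
    return c, mults


def identity(n):
    """n x n identity matrix, used here only as convenient test input."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


# Doubling n multiplies the work by 8, as expected for n^3 growth
counts = {}
for n in (2, 4, 8):
    _, mults = matmul_with_count(identity(n), identity(n))
    counts[n] = mults
print(counts)  # {2: 8, 4: 64, 8: 512}
```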

Two umbrella data science MOOCs with R and Python

As we draw closer to the end of this long article, I want to list two multi-course MOOCs that I found interesting and useful to go along with the specific subject areas mentioned above.

  • Data Science Specialization (Johns Hopkins Univ.): This is a well-known 10-course specialization offered on Coursera. Not every course will appeal to every learner. I personally completed only 5 of the 10. The key thing is the timing, i.e. when to start this specialization. It often comes up at the top of the Google results when one researches MOOCs for data science, and therefore it becomes the first MOOC for many new learners. Personally, I would have had trouble getting the full value from this specialization if I had done that. The introductory Microsoft and Udemy courses on R, and a few statistics and linear algebra courses taken before this, helped me immensely in extracting the full benefit from this set of courses. As the specialization is taught by professors from the biostatistics department of JHU, one gets an excellent treatment of two aspects of data science that are under-represented in many curricula: research study and design of experiments.
  • Data Science MicroMasters certificate program (UC San Diego): I have just enrolled and started the 1st of the 4 courses in this certificate program. I like the fact that it is similar in breadth and goals to the Johns Hopkins specialization, except that it chooses Python as the working language for the hands-on portion. The structure and content seem well thought out, covering the basics of Python, Git, and Jupyter all the way up to big data processing with the Apache Spark framework (with statistics and machine learning courses thrown in the middle). The case studies and hands-on examples are drawn from real-world applications of data science such as wildfire modeling, cholera outbreaks, or world development indicator analysis. One of the lead instructors is Ilkay Altintas, who has created an amazing platform for wildfire dynamics prediction and is putting the fruits of data science research to work for societal good. I am sure my journey with this program will be an exciting and rewarding one. You are welcome to join the party!

Learning is pretty democratized — take advantage of it

With the advent of MOOCs, open-source programming platforms, collaboration tools, and virtually unlimited free cloud-based storage, learning is as democratized, ubiquitous, and universally accessible as it can get. If you are not a specialist in data science/machine learning but want to learn the subject, write some code for higher productivity at work, strive for career enhancement, or just have some fun, now is the time to start learning. A few parting comments:

  • You are a data scientist: Do not let any so-called expert demoralize you by saying something like “MOOCs are for kids, you won’t learn real data science like that”. The very fact that you are trying to learn data science by enrolling in a MOOC means two things: (a) you already deal with data in your professional life and (b) you want to learn a scientific, structured way of extracting maximum value from your data and of generating intelligent questions around that data. That means you, my friend, are already a data scientist. If you are still not convinced, read this blog by Brandon Rohrer, one of the most admired and inspirational data scientists that I know of.
  • You don’t have to spend a large sum for this learning: I know that I listed a lot of courses and they may look expensive to you. Fortunately, most (if not all) of them can be taken free of cost. edX courses are always free to enroll in, and they generally don’t have any restrictions on course content, i.e. you can view, execute, and submit all the graded assignments (unlike Coursera, which lets you watch all the videos but hides the graded material). If you think some certificate is worth showcasing on your resume, you can always pay for it in the middle of the course, after you have completed some videos and judged its merit and utility.
  • Practice, code, and build things to supplement your online learning: There is a real technique called ‘online learning’ in the context of machine learning. Instead of processing a full matrix of millions of data points, the algorithm works with the latest few data points and updates its prediction. You can work in this mode too. The halting problem/parking problem is always a fascinating one, and it applies to learning too: we always wonder how much to study and assimilate before building things, i.e. where to halt the learning and start implementing. Don’t hesitate, don’t procrastinate. Learn a concept and test it with some simple coding. Work with the latest trick or technique you watched a video about; don’t wait to achieve mastery over the entire topic. You will be amazed by how a simple 20 lines of code can give you solid practice (and make you sweat enough) on the most complex concept you learned from that video.

  • There is plenty of data out there: You will also be amazed by how many rich sources of free data are out there on the web. Don’t go to Kaggle; try something different for fun. Try data.gov or the United Nations data portal. Go to the UCI machine learning repository. Feeling more adventurous? What about downloading data about various countries from the CIA and trying all the cool visualizations that you learned in the latest Matplotlib or ggplot2 lecture? If nothing else, download your own electricity usage data from your energy provider and analyze whether you could save a few bucks by turning on the AC or dishwasher at a different time.
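In the spirit of the "20 lines of code" advice above, here is a small sketch of the online learning idea mentioned in that bullet: a model that updates its parameters one data point at a time instead of processing the full dataset at once. I am assuming plain stochastic gradient descent on a one-variable linear model; the function name `online_linear_regression`, the learning rate, and the synthetic noise-free data (y = 3x + 1) are all my own illustrative choices.

```python
import random


def online_linear_regression(stream, lr=0.1):
    """Fit y ~ w * x + b by updating (w, b) after every single sample."""
    w, b = 0.0, 0.0
    for x, y in stream:
        err = (w * x + b) - y   # prediction error on the newest point only
        w -= lr * err * x       # gradient step for the slope
        b -= lr * err           # gradient step for the intercept
    return w, b


# Synthetic stream (an assumption for illustration): y = 3x + 1, no noise
random.seed(0)
xs = [random.uniform(-1.0, 1.0) for _ in range(5000)]
stream = [(x, 3.0 * x + 1.0) for x in xs]

w, b = online_linear_regression(stream)
print(round(w, 3), round(b, 3))
```

The model never sees more than one point at a time, yet the estimates settle close to the true slope and intercept. Try adding noise to the data or changing `lr` to see how the convergence behaves.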

The opinions expressed in this article about various courses/instructors are entirely the author’s own. If you have any questions or ideas to share, please contact the author at tirthajyoti[AT]gmail.com. You can also check the author’s GitHub repositories for other fun code snippets in Python, R, or MATLAB and for machine learning resources, or follow the author on LinkedIn.

Bio: Tirthajyoti Sarkar is a semiconductor technologist, machine learning/data science zealot, Ph.D. in EE, blogger and writer.

Original. Reposted with permission.