Free From MIT: Intro to Computational Thinking and Data Science

This free course from MIT will help in your transition to thinking computationally, and ultimately solving complex data science problems.

Programming is an important part of data science, as are the underlying concepts of computer science. If we plan to implement computational solutions to data science problems, programming is a necessity. To help those looking to establish or solidify these skills, we recently shared a great free introductory course from MIT's OpenCourseWare.

After one learns the basics of programming, learning to think computationally is a good next step toward solving complex real-world problems, including those in data science. Today we share Computational Thinking and Data Science, another top-notch MIT OpenCourseWare offering freely available to anyone interested in learning.

Computational thinking


The course website describes the course as:

[T]he continuation of 6.0001 Introduction to Computer Science and Programming in Python and is intended for students with little or no programming experience. It aims to provide students with an understanding of the role computation can play in solving problems and to help students, regardless of their major, feel justifiably confident of their ability to write small programs that allow them to accomplish useful goals. The class uses the Python 3.5 programming language.


The Fall 2016 iteration of this course is taught by Eric Grimson, John Guttag, and Ana Bell. Python is the only programming language used in the course.

The lecture topics are shown below, taken from the syllabus:

  1. Introduction and Optimization Problems
  2. Optimization Problems
  3. Graph-theoretic Models
  4. Stochastic Thinking
  5. Random Walks
  6. Monte Carlo Simulation
  7. Confidence Intervals
  8. Sampling and Standard Error
  9. Understanding Experimental Data
  10. Understanding Experimental Data (cont.)
  11. Introduction to Machine Learning
  12. Clustering
  13. Classification
  14. Classification and Statistical Sins
  15. Statistical Sins and Wrap Up

I particularly like how this course is split into a few distinct sections. The first section (up to lecture 6) focuses on computational concepts; the next section (lectures 7-10) is statistical in nature; and the remaining lectures make up a final section on machine learning, though it never strays far from statistics, and appropriately circles back to it at the very end.

This structure gives students the opportunity to learn these distinct concepts without confusing them. Thinking computationally is not inherently tied to machine learning; it is about separating a problem into smaller problems and thinking about the most efficient ways to solve each of them. It's a great skill to develop in any aspect of your life or work. However, though not intrinsically linked to machine learning, it does provide practitioners with the insight needed to understand the inner workings of machine learning algorithms, the solutions to problems using these algorithms, and how to iterate and improve on these solutions to make them more efficient, accurate, and useful.
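As a small illustration of that decomposition idea, here is a minimal Python sketch. The task (finding the most frequent words in a piece of text) and every function name in it are hypothetical choices made for this example, not material taken from the course. The point is only that each sub-problem gets its own small function, and that the counting step uses a single linear pass rather than re-scanning the text for every word.

```python
from collections import Counter
import re


def tokenize(text):
    """Sub-problem 1: split raw text into lowercase words."""
    return re.findall(r"[a-z']+", text.lower())


def count_words(words):
    """Sub-problem 2: tally every word in a single pass over the list."""
    return Counter(words)


def most_frequent(counts, k=1):
    """Sub-problem 3: select the k words with the highest counts."""
    return counts.most_common(k)


def top_words(text, k=1):
    """Combine the three smaller solutions to solve the original problem."""
    return most_frequent(count_words(tokenize(text)), k)


print(top_words("the cat and the hat and the bat", k=2))
# prints [('the', 3), ('and', 2)]
```

Each piece can be tested, replaced, or optimized on its own, which is the practical payoff of breaking the problem apart.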

Statistics is never far from the center of a data science problem, or its solution. Sampling and standard error, confidence intervals, understanding experimental data, and the potential misuse of statistics are not generally given the attention they deserve in an introductory data science course. The focus on these topics sets Intro to Computational Thinking and Data Science apart from many others.
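To give a flavor of what sampling, standard error, and confidence intervals look like in code, here is a rough Python sketch. It is not taken from the course materials: the simulated population, the sample size, and the function name are all assumptions made for illustration, and the interval uses the common normal approximation (mean plus or minus 1.96 standard errors for roughly 95% confidence).

```python
import random
import statistics


def mean_confidence_interval(sample, z=1.96):
    """Return (mean, low, high) for an approximate confidence interval.

    Uses the normal approximation; z=1.96 corresponds to roughly 95%.
    """
    n = len(sample)
    mean = statistics.mean(sample)
    # Standard error of the mean: sample standard deviation / sqrt(n)
    std_err = statistics.stdev(sample) / n ** 0.5
    return mean, mean - z * std_err, mean + z * std_err


# Hypothetical population: 100,000 values drawn from a normal distribution
rng = random.Random(0)
population = [rng.gauss(50, 10) for _ in range(100_000)]

# Estimate the population mean from a random sample of 200 values
sample = rng.sample(population, 200)
mean, low, high = mean_confidence_interval(sample)
print(f"sample mean = {mean:.2f}, approx. 95% CI = ({low:.2f}, {high:.2f})")
```

Drawing a larger sample shrinks the standard error, and with it the width of the interval.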

The OpenCourseWare version of this course includes lecture slides and required files, problem sets, readings (unfortunately, the course text is not free), and, of particular note, lecture videos. In this sense, the freely offered course can truly be thought of as complete.

This material also forms the basis of the edX course of the same name. If you are interested in a more structured learning environment, or in a verified certificate once you finish the course material, you can enroll there instead.

When paired with MIT's Intro to Computer Science and Programming in Python, these free courses offer a powerful start for someone learning the fundamentals of programming, computer science, Python, computation, statistics, and machine learning, which are many of the ingredients of a successful data science career.