KDnuggets News, September 2015 (15:n30), Courses, Education

Top 20 Data Science MOOCs


Looking for your next data science MOOC? Check out our extensive list of MOOCs, offered by leading organizations and covering all data science disciplines.



Data science is a big landscape, and self-learning is a necessary skill for anyone who wants to become a good data scientist. MOOCs have been a major treasure trove for data scientists. Though many sites offer MOOCs, Coursera, edX and Udacity have been the leaders. Whether your language is R, Python, Java or C/C++, we have covered all of them. Whether you are a beginner trying to understand what exactly data science is, or an expert looking for your next frontier, you can search through this exhaustive list as needed.

Is your favorite MOOC missing from this list? Please let us know in the comments below.

In this post we have included only single courses; we will write a separate post on the specializations and degrees related to data science. The list also includes some promising upcoming courses.

Some general guidelines about the course details:

  • The level of a course is decided by considering its prerequisites, the effort required and its duration.
  • All courses assume a basic background in statistics.
  • The courses are arranged by level of expertise, i.e. beginner courses are listed ahead of expert-level courses.
  • Tools are the programming languages or software tools used in the course.

The Analytics Edge (MIT)

Level: Beginners-Expert                                       Effort: 10-15 hrs/week
Status: Archived                                                   Duration: 12 weeks
Prerequisite: None                                               Tools: R

One of the best courses for learning data science and analytics using R. The course provides in-depth lectures on multiple business cases, along with extensive exercises. Keep in mind that it is a very demanding course in terms of time commitment, but it is worth it. The examples include Moneyball, eHarmony, the Framingham Heart Study, Twitter, IBM Watson, and Netflix. Through these examples and many more, the course teaches the following analytic methods: linear regression, logistic regression, trees, text analytics, clustering, visualization, and optimization.

Machine Learning (Stanford University)

Level: Beginners-Expert                                    Effort: 7-12 hrs/week
Status: On-demand                                            Duration: 11 weeks
Prerequisite: Programming                               Tools: Octave

Whenever machine learning MOOCs are discussed, this course has to be mentioned. It is an excellent course taught by one of the best professors in the machine learning domain, Andrew Ng. The complete course is well organized and covers all the core concepts of machine learning. Topics include: (i) Supervised learning (parametric/non-parametric algorithms, support vector machines, kernels, neural networks). (ii) Unsupervised learning (clustering, dimensionality reduction, recommender systems, deep learning). (iii) Best practices in machine learning (bias/variance theory; innovation process in machine learning and AI).

Data Science and Machine Learning Essentials (Microsoft) (24 Sep 2015 onwards)

Level: Beginners-Intermediate                                 Effort: 3-4 hrs/week
Status: Upcoming                                                    Duration: 5 weeks
Prerequisite: None                                                 Tools: R

Learn data science essentials with experts from MIT and the industry, partnering with Microsoft to help develop your career as a data scientist. By the end of this course, you will know how to build and derive insights from data science and machine learning models. You will learn key concepts in data acquisition, preparation, exploration and visualization, along with examples of how to build a cloud data science solution using Azure Machine Learning, R and Python. The course is organized into 5 weekly modules, each concluding with a quiz.

Databases (Stanford University)

Level: Beginners                                                    Effort: 8-10 hrs/week
Status: Self-paced                                                 Duration: 10 weeks
Prerequisite: None                                                Tools: SQL, XML query

If you are dealing with data, databases are inevitable. This course covers database design and the use of database management systems for applications. It includes extensive coverage of the relational model, relational algebra, and SQL. It also covers XML data, including DTDs and XML Schema for validation, and the query and transformation languages XPath, XQuery, and XSLT. The course includes database design in UML, and relational design principles based on dependencies and normal forms.

Coding the Matrix: Linear Algebra through Computer Science Applications (Brown University)

Level: Beginner-Intermediate                                Effort: 10-14 hrs/week
Status: Archived                                                  Duration: 10 weeks
Prerequisite: None                                              Tools: Python

Linear algebra is one of the important building blocks of not only computer science, but also machine learning, graphics and statistics. This brilliant course guides you through real examples and excellent Python assignments. You will write programs to implement basic matrix and vector functionality and algorithms, and use these to process real-world data to achieve such tasks as: two-dimensional graphics transformations, face morphing, face detection, image transformations such as blurring and edge detection, image perspective removal, classification of tumors as malignant or benign, integer factorization, error-correcting codes, and secret-sharing. Another, more basic course is LAFF by The University of Texas at Austin.

Learning From Data (California Institute of Technology)

Level: Intermediate-Expert                                    Effort: 10-14 hrs/week
Status: Archived                                                    Duration: 10 weeks
Prerequisite: probability, matrices, calculus         Tools: No restriction

One of the best MOOCs ever for machine learning enthusiasts. This is an introductory course in machine learning (ML) that covers the basic theory, algorithms, and applications, but it requires a good background in linear algebra, calculus and probability, along with coding skills. The course is taught by Yaser S. Abu-Mostafa, Professor of Electrical Engineering and Computer Science at the California Institute of Technology. He is the co-author of Amazon’s machine learning bestseller Learning From Data and a great professor who simplifies the learning.

CSCI E-109 Data Science (Harvard Extension School)

Level: Beginners-Expert                                          Effort: 7-12 hrs/week
Status: Archived                                                     Duration: 16 weeks
Prerequisite: None                                                 Tools: Python, d3

An excellent course, recommended to all data science aspirants. This course introduces methods for five key facets of an investigation: data wrangling, cleaning, and sampling to get a suitable data set; data management to be able to access big data quickly and reliably; exploratory data analysis to generate hypotheses and intuition; prediction based on statistical methods such as regression and classification; and communication of results through visualization, stories, and interpretable summaries.

Introduction to Data Science (University of Washington)

Level: Beginner-Intermediate                                  Effort: 10-14 hrs/week
Status: Archived                                                    Duration: 10 weeks
Prerequisite: Programming                                   Tools: Python, R, SQL

Introduce yourself to the basics of data science and leave armed with practical experience extracting value from big data. This course teaches the basic techniques of data science, including both SQL and NoSQL solutions for massive data management (e.g., MapReduce and contemporaries), algorithms for data mining (e.g., clustering and association rule mining), and basic statistical modelling (e.g., linear and non-linear regression).

Networks, Crowds and Markets (Cornell University)

Level: Beginners-Expert                                           Effort: 4-8 hrs/week
Status: Archived                                                     Duration: 10 weeks
Prerequisite: None                                                  Tools: None

The course examines the interconnectedness of modern life through an exploration of fundamental questions about how our social, economic, and technological worlds are connected. Students will explore game theory, the structure of the Internet, social contagion, the spread of social power and popularity, and information cascades. Another important source of knowledge for link analysis is SNAP.

Data Analysis: Take It to the MAX() (DelftX) (1 Sep 2015 onwards)

Level: Intermediate                                                  Effort: 4-6 hrs/week
Status: Upcoming                                                     Duration: 8 weeks
Prerequisite: Basic Spreadsheet exp.                    Tools: MS-Excel, Python

Even in the era of big data, a huge number of data analysts still rely heavily on spreadsheets to gather insights, so spreadsheet skills remain relevant. This is an excellent course for those who want to enhance their analytical skills using Excel. You will take a deep dive into data analysis with spreadsheets: PivotTables, VLOOKUPs, named ranges, what-if analyses, and making great graphs, all of which are covered in the first weeks of the course. After that, you will investigate the quality of the spreadsheet model, and especially how to make sure your spreadsheet remains error-free and robust. Finally, you will also look into how Python, a programming language, can help with analyzing and manipulating data in spreadsheets.

Text Mining and Analytics (University of Illinois at Urbana-Champaign)

Level: Intermediate-Expert                                         Effort: 5-10 hrs/week
Status: Archived                                                       Duration: 5 weeks
Prerequisite: Programming                                      Tools: C++

This course will cover the major techniques for mining and analyzing text data to discover interesting patterns, extract useful knowledge, and support decision making, with an emphasis on statistical approaches that can be generally applied to arbitrary text data in any natural language with no or minimum human effort. You will learn the basic concepts, principles, and major algorithms in text mining and their potential applications.