KDnuggets Top Blog Winner

3 Free Statistics Courses for Data Science

Statistics is one of the most in-demand data science skills. Master it for free with these online courses.




 

Statistics is the core of predictive modeling, and knowledge of the subject is required when analyzing data to solve a business problem.

In this article, I recommend 3 free courses you can take to learn statistics for data science:

 

Statistical Learning (edX)

 

Statistical Learning is taught by renowned Stanford professors Trevor Hastie and Rob Tibshirani. The program teaches the theory behind machine learning models and gives you an intuition for how they work.

After taking this course, you will be able to answer questions such as:

  • When should random forests be used over decision trees?
  • Why can’t linear regression be applied to classification problems?
  • How do you perform feature selection for regression problems?
  • How do you mitigate overfitting?
  • How do you resample data when you don’t have enough data points to train your predictive model?

A solid understanding of the above will help you identify the best approach to prepare your data, perform feature engineering, and select a predictive modeling technique.

Here are some topics covered in this course:

  1. Regression and Classification
  2. Regularization
  3. Generalized Additive Models
  4. Tree Based Methods
  5. Support Vector Machines
  6. Principal Component Analysis
  7. Clustering

The course starts with a theoretical explanation of the concepts above with an example or two, followed by a code tutorial. 

All the programming lectures are conducted in R. However, you don’t need to know the language before enrolling in this course. The instructors will teach you how to code in R before taking you through the practical implementation.

Statistical Learning is offered on an e-learning platform called edX, and you can audit it for free. This means that all the course material is available to you at no cost unless you want to purchase a certificate of completion. 

 

Probability and Statistics: To P or Not To P? (Coursera)

 

I recommend taking this probability and statistics course if you don’t come from a math or statistics background.

James Abdey, the instructor of this program, teaches every concept in plain English, free from complex mathematical notation.

This course will introduce you to topics like probability distributions, descriptive statistics, inferential statistics, hypothesis testing, confidence intervals, and the central limit theorem.
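To give a feel for one of the topics listed above, here is a minimal sketch of the central limit theorem in Python. It is not taken from the course material. The function name `sample_means`, the choice of an exponential distribution, and the sample sizes are all assumptions made for illustration. The sketch draws many samples from a skewed distribution and checks that the sample means cluster around the true mean, with a spread close to the theoretical value of 1/sqrt(n).

```python
import math
import random
import statistics


def sample_means(sample_size, n_samples=5_000, seed=0):
    """Return the means of n_samples samples drawn from an Exponential(1) distribution.

    Exponential(1) is strongly skewed, with mean 1 and standard deviation 1.
    (Hypothetical helper, written only for this illustration.)
    """
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(sample_size))
        for _ in range(n_samples)
    ]


means = sample_means(sample_size=50)

# The central limit theorem says the sample means are approximately normal,
# centered on the population mean (1.0) with spread 1 / sqrt(sample_size).
print("mean of sample means:", round(statistics.fmean(means), 3))
print("std of sample means: ", round(statistics.stdev(means), 3))
print("theoretical std:     ", round(1 / math.sqrt(50), 3))
```

Plotting a histogram of `means` would show a roughly bell-shaped curve even though the underlying data is far from normal.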

Similar to the Statistical Learning course mentioned above, this program can be audited for free. 

This course will teach you the basics of probability and statistics, and will help you dip your toes into the subject as a beginner. However, to gain a stronger understanding of the topics covered, you should complement this course with one of the more rigorous courses on this list.

 

Stat 110: Harvard University (YouTube)

 

Stat 110 is one of the most popular statistics courses offered by Harvard University. All the lectures in this class have been recorded and uploaded on YouTube for public access.

This course will provide you with an intuitive and mathematical explanation of statistical concepts. You need to be familiar with matrix manipulation and some calculus (derivatives and integrals) before taking this course.

If you don’t have the required math background, I suggest taking these two courses before starting Stat 110: Linear Algebra and Introduction to Calculus.

Stat 110 covers topics such as probability, the Monty Hall problem, random variables, probability distributions, statistical tests, and Markov chains.
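The Monty Hall problem is a good example of how probability can defy intuition, and it is easy to check by simulation. The following is a minimal sketch in Python, not something taken from the Stat 110 lectures. The function names `play_monty_hall` and `win_rate`, the number of trials, and the seed are assumptions chosen for illustration. It assumes the standard rules: three doors, one car, and a host who always opens a losing door that the contestant did not pick.

```python
import random


def play_monty_hall(switch, rng):
    """Play one round and return True if the contestant wins the car."""
    doors = [0, 1, 2]
    car = rng.choice(doors)
    pick = rng.choice(doors)
    # The host opens a door that is neither the contestant's pick nor the car.
    opened = rng.choice([d for d in doors if d != pick and d != car])
    if switch:
        # Switch to the single remaining closed door.
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == car


def win_rate(switch, trials=100_000, seed=0):
    """Estimate the probability of winning under a fixed strategy."""
    rng = random.Random(seed)
    wins = sum(play_monty_hall(switch, rng) for _ in range(trials))
    return wins / trials


print("win rate when switching:", round(win_rate(switch=True), 3))
print("win rate when staying:  ", round(win_rate(switch=False), 3))
```

The estimates should land close to the theoretical values of 2/3 for switching and 1/3 for staying.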

This course is the most in-depth resource on this list and can be time-consuming to work through. Joe Blitzstein, the professor of Stat 110, also suggests taking the edX Introduction to Probability program to complement the YouTube lectures.

Statistics can be an intimidating subject to learn at first, especially if you come from a non-math background. However, if you want to work as a data scientist to solve real-world business problems, it is sufficient to learn applied statistics. You don’t need to go deep into calculations or proofs. Instead, you should be able to apply the right tool to solve a problem with data.

The courses above will teach you to do just that by providing you with an intuitive understanding of statistical concepts.

 
 
Natassha Selvaraj is a self-taught data scientist with a passion for writing. You can connect with her on LinkedIn.