Data Science Cheat Sheet 2.0

Check out this helpful, 5-page data science cheat sheet to assist with your exam reviews, interview prep, and anything in between.



By Aaron Wang, Master of Business Analytics @ MIT | Data Science.

This Data Science cheat sheet covers over a semester of introductory machine learning and is based on MIT's Machine Learning courses 6.867 and 15.072. You should have at least a basic understanding of statistics and linear algebra, although beginners may still find this resource helpful.

Inspired by Maverick's Data Science Cheatsheet (hence the 2.0 in the name), located here.

Topics covered:

  • Linear and Logistic Regression
  • Decision Trees and Random Forest
  • SVM
  • K-Nearest Neighbors
  • Clustering
  • Boosting
  • Dimension Reduction (PCA, LDA, Factor Analysis)
  • Natural Language Processing
  • Neural Networks
  • Recommender Systems
  • Reinforcement Learning
  • Anomaly Detection
  • Time Series
  • A/B Testing

This cheat sheet will occasionally be updated with new and improved info, so consider following or starring the GitHub repo to stay up to date.

Future additions (ideas welcome):

  • Data Imputation
  • Generative Adversarial Networks

Download the Data Science Cheat Sheet 2.0

Why is Python/SQL not covered in this cheat sheet?

 

I planned for this resource to cover mainly algorithms, models, and concepts, as these rarely change and are common throughout industries. Technical languages and data structures often vary by job function, and refreshing these skills may make more sense on a keyboard than on paper.

 

License

 

Feel free to share this resource in classes, in review sessions, or with anyone who might find it helpful :)

This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License

Images are used for educational purposes, created by me, or borrowed from my colleagues here.

 

Original. Reposted with permission.

 

Bio: Aaron Wang is currently pursuing a Master of Business Analytics at MIT, focusing on the intersection of business and data science. Aaron is passionate about the future of AI/ML in the tech space, and would love to chat about opportunities.
