Open Source Data Science Masters Curriculum
Tags: MS in Data Science, Open Source
A collection of open-source resources for a Data Science Master's curriculum, covering Math, Algorithms, Databases, Data Mining, Machine Learning, Natural Language Processing, Data Analysis and Visualization, and Python.
Here is a list of useful open-source resources for self-study toward a Data Science MS, assembled by data scientist Clare Corthell (@clarecorthell).
See also
- Harvard CS109 Data Science Course, Resources Free and Online
- Learning from Data at edX, taught by Caltech professor Yaser Abu-Mostafa
- KDnuggets Home :: FAQ :: Learning Data Mining and Data Science
Intro to Data Science
UW / Coursera. Topics: Python NLP on Twitter API, Distributed Computing Paradigm, MapReduce/Hadoop & Pig Script, SQL/NoSQL, Relational Algebra, Experiment design, Statistics, Graphs, Amazon EC2, Visualization.
Math
- Linear Algebra / Levandosky Stanford / Book
- Linear Programming (Math 407) University of Washington / Course
- Statistics: Stats in a Nutshell / Book
- Forecasting: Principles and Practice Monash University / Book (uses R)
- Problem-Solving Heuristics: "How To Solve It" Polya / Book
- Coding the Matrix: Linear Algebra through Computer Science Applications Brown / Coursera
- Think Bayes Allen Downey / Book
Computing
- Algorithms
  - Algorithms: Design & Analysis I Stanford / Coursera
  - Algorithm Design Kleinberg & Tardos / Book
- Databases
  - Introduction to Databases Stanford / Coursera
  - SQL Tutorial W3Schools / Tutorials
- Data Mining
  - Mining Massive Data Sets Stanford / Book
  - Mining The Social Web O'Reilly / Book
  - Introduction to Information Retrieval Stanford / Book
- Machine Learning
  - Machine Learning / Ng Stanford / Coursera
  - Programming Collective Intelligence O'Reilly / Book
  - Statistics: The Elements of Statistical Learning
- Probabilistic Graphical Models
  - Probabilistic Programming and Bayesian Methods for Hackers Github / Tutorials
  - PGMs / Koller Stanford / Coursera
- Natural Language Processing: NLP with Python O'Reilly / Book
- Data Analysis
  - Python for Data Analysis O'Reilly / Book
  - Big Data Analysis with Twitter UC Berkeley / Lectures
  - Social and Economic Networks: Models and Analysis / Stanford / Coursera
  - Information Visualization: "Envisioning Information" Tufte / Book
- Learning Python: Learn Python the Hard Way, Google's Python Class
- Python (Libraries for Data Science)
  - Basic Packages: Python, virtualenv, NumPy, SciPy, matplotlib and IPython
  - Data Science in IPython Notebooks (Linear Regression, Logistic Regression, Random Forests, K-Means Clustering)
  - Bayesian Inference: pymc
  - Labeled data structures, statistical functions, etc.: pandas (See: Python for Data Analysis)
  - Python wrapper for the Twitter API: twython
  - Tools for Data Mining & Analysis: scikit-learn
  - Network Modeling & Viz: networkx
  - Natural Language Toolkit: NLTK
For a final project, do a competition - plenty to choose on
For a full list and additional resources, see