Free Data Management with Data Science Learning with CS639
Learn Data Management with Data Science for FREE with CS639.
There are many elements of Data Science to learn, and which you choose depends on the areas you want to dive deeper into. One, in particular, is Data Management. The University of Wisconsin-Madison has a course called Data Management for Data Science.
The course is broken down into 6 sections covering 26 lectures. You can download the lectures as PDFs, and extra reading material is recommended to help you.
If this is something you are interested in, keep reading!
CS639 Prerequisites
Course Prerequisites
To get the most out of this Data Management with Data Science course, you will need CS 300, which is an essential prerequisite. CS 400 will also be helpful.
Being proficient in Python is also necessary. If you are not, the university recommends you use the resources described here.
Textbooks
This course does not follow a textbook; however, these books are recommended:
- Python for Data Analysis, Wes McKinney, 2012
- Doing Data Science, Cathy O'Neil and Rachel Schutt, 2013
CS639 Lecture Plan
The course is broken down into 6 sections, with a final exam at the end.
Introduction to Data Science
- Lecture 1: Intro to Data Science and Class Logistics/Overview
- Lecture 2: Statistical Inference and Exploratory Data Analysis
- Class Demo: Getting Started with Data Analytics
Relational Databases and Relational Algebra
- Lecture 3: Principles of Data Management
- Lecture 4: Relational Algebra
- Lecture 5: SQL for Data Science
- Lecture 6: Key Principles of RDBMS
- Lecture 7: Wrapping up SQL and Databases
The MapReduce Model and NoSQL Systems
- Lecture 8: Reasoning about Scale & The MapReduce Abstraction
- Lecture 9: Algorithms in MapReduce 1
- Lecture 10: Algorithms in MapReduce 2
- Lecture 11: Spark
- Lecture 12: NoSQL Systems: Key-Value Stores and Document Stores
After this topic, you will have 2 mid-term reviews.
Predictive Analytics
- Lecture 13: Statistical Inference
- Lecture 14: Sampling
- Lecture 15: Bayesian Methods
- Lecture 16: Intro to Machine Learning and Decision Trees
- Lecture 17: Wrap up from Lecture 16 and Linear Classifiers and Support Vector Machines
- Lecture 18: Evaluation of Machine Learning Models
- Lecture 19: Other Learning Methods: Unsupervised Learning & Ensemble Learning
- Lecture 20: Optimization/Gradient Descent
Information Extraction and Data Integration
- Lecture 21: Information Extraction
- Lecture 22: Data Integration and Entity Resolution
- Lecture 23: Data Cleaning
Communicating Insights
- Lecture 24: Intro to Visualization
- Lecture 25: Data Visualization/EDA
- Lecture 26: Data Privacy
Final Exam
Once you have worked through all the Data Management with Data Science content, you will go through a 3-part final review. Here are some Sample Questions along with the solutions.
After this, you will have an open-ended bonus project in which you are asked to create cool visualizations with data from the city of Milwaukee.
Conclusion
Structured curriculums are always great for your learning. And with these resources from the University of Wisconsin, you will be able to get that structure as well as university-quality resources - for FREE!
If you want more information on becoming a Data Scientist, I recommend reading: The Complete Data Science Study Roadmap
Nisha Arya is a Data Scientist and Freelance Technical Writer. She is particularly interested in providing Data Science career advice, tutorials, and theory-based knowledge around Data Science. She also wishes to explore the different ways Artificial Intelligence can benefit the longevity of human life. A keen learner, she seeks to broaden her tech knowledge and writing skills while helping guide others.