Data Mining Course

Here are the teaching modules for a one-semester introductory course on Data Mining, suitable for advanced undergraduates or first-year graduate students. The teaching modules were created by Dr. Gregory Piatetsky-Shapiro and Prof. Gary Parker, Connecticut College.

Course introduction | For prospective students | For faculty

Course materials

Data Mining Course Modules

To get the presentations, prepend www.kdnuggets.com/data_mining_course/ to the ppt file names below.
  • DM1: Introduction: Machine Learning and Data Mining, updated May 31, 2006.
  • DM2: Machine Learning and Classification, updated June 7, 2006.
  • DM3: Input: Concepts, Instances, Attributes.
  • DM4: Output: Knowledge Representation, updated June 7, 2006.
  • DM5: Classification - Basic Methods.
  • DM6: Classification: Decision Trees.
  • DM7: Classification: C4.5.
  • DM8: Classification: CART.
  • DM9: Classification: Rules, Regression, K-Nearest Neighbour.
  • DM10: Evaluation and Credibility, updated May 31, 2006.
  • DM11: Evaluation - Lift and Costs, updated May 31, 2006.
  • DM12: Data Preparation for Knowledge Discovery, updated June 7, 2006.
  • DM13: Clustering, updated May 31, 2006.
  • DM14: Association Rules, updated May 31, 2006.
  • DM15: Data Mining and Visualization (3.2MB), updated May 31, 2006.
  • DM16: Summarization and Deviation Detection.
  • DM17: Applications: Targeted Marketing, KDD Cup, and Customer Modeling, updated Oct 18, 2004.
  • DM18: Applications: Genomic Microarray Data Mining.
  • DM19: Data Mining and Society; Future Directions.

Assignments, Mid-term Quiz, and Final Exam


Additional Publications

  • KEFIR summarization system for health-care data: sample HTML report and KEFIR book chapter (PDF, 357KB, 19 pages). Used in Module 16.
  • Capturing Best Practice for Microarray Gene Expression Data Analysis (PDF, 850KB, 9 pages), G. Piatetsky-Shapiro, T. Khabaza, S. Ramaswamy, KDD-03 Conference. Used in Module 18 and in the final project.

Additional Lectures

  • Introductory Data Mining Tutorial (90 slides).
  • Introduction to Data Mining (notes), a 30-minute unit appropriate for an "Introduction to Computer Science" or similar course.
  • Data Mining Module for a course on Artificial Intelligence: Decision Trees, appropriate for one or two classes. (See Data Mining course notes for Decision Tree modules.)
  • Data Mining Module for a course on Algorithms: Decision Trees, appropriate for one or two classes. See also data mining algorithms introduction and Data Mining Course notes (Decision Tree modules).
  • From Data Mining to Knowledge Discovery: an Introduction, Connecticut College, Oct 2003.
  • Data Mining in Genomics: The Dawn of Personalized Medicine, Connecticut College, Oct 2003.

How to get all this on a CD

Sorry, the CD is no longer available.

Acknowledgments and Funding

This project was funded by a grant from the W. M. Keck Foundation, Los Angeles, CA, and the Howard Hughes Medical Institute, Chevy Chase, MD, as part of the Connecticut College Series of Modules in Emerging Fields.

People involved in the course

Acknowledgments and Thanks

Web Mining Course unit on web log analysis
