|
Data Mining is one of the hottest fields in Computer Science. Data have
been accumulating throughout the computer age in many forms, including
database systems, spreadsheets, text files, and, more recently, web pages.
These data have been stored on hard drives and temporary storage media.
Database programs can query for specific information such as "how many
patients are over age 70," but there is potentially much more in the
data than such specific information. The real treasure could be
interesting new patterns that we do not even know we should ask
for; for example, "the best predictor of Alzheimer's disease for patients
over 70 is the ratio of Tau and Ab42 proteins".
Data mining programs are
intended to search through data for hidden relationships and patterns.
This is particularly pertinent to marketing companies that
want to know what made a specific group of people buy their product. It
can also be very important in scientific fields such as medicine, where
finding correlations in groups of people who are affected by a similar
disease could be very helpful. Data mining is needed to make sense of,
and make use of, the rapidly growing body of data, and it is an essential field of the 21st century.
Made possible through a generous grant from the Howard Hughes
Medical Institute and the W. M. Keck Foundation to Connecticut
College, this CD and website contain a set of modules for a complete
one-semester course in data mining. In addition, there are modules
for individual lectures on data mining in the context of courses on
Algorithms, Artificial Intelligence, and Introduction to Computer
Science.
The grant provided support for
Dr. Piatetsky-Shapiro,
one of the leading Data Mining researchers in the world, to spend a
concentrated period of time at Connecticut College, co-teaching and
developing the course modules.
This period was followed by adjustments to the original modules and the
development of modules to cover one or two sessions for other courses.
Dr. Piatetsky-Shapiro and three computer science faculty members from
Connecticut College worked in conjunction with an instructional
designer to create these teaching modules. These modules are presented
in PowerPoint to facilitate individual modifications and are
distributed on CD and via the website, free of charge, to interested
professors and instructors.
Data Mining is a good field for computer science students to study.
It is both an active area of research and a field with strong employment
opportunities. It is our hope that schools that do not have faculty
with the expertise to teach data mining will now have the opportunity
to offer a data mining course or at least be able to cover some part
of it in their curriculum.
The main part of this CD and website is a set of modules for a complete 300-level Machine Learning / Data Mining course, with instructional material for nineteen 75-minute classes.
In addition, there is instructional material for:
- a 30-minute segment on Data Mining as part of an Introduction to Computer Science class.
- one or two units on Decision Trees (depending on how much advanced material is covered) as part of a 300-level class on Algorithms, focusing on decision tree algorithms.
- one or two units on Decision Trees as part of a 300-level class on Artificial Intelligence, focusing on decision tree usage and application.
Happy Discoveries!
|