KDnuggets : News : 2001 : n18 : item17    (previous | next)

Publications


From: Gregory Piatetsky-Shapiro
Subject: Teaching Data Mining to Business students

The dm-list recently had a very good discussion in response to this question:

I teach information systems courses to business students. I want to do more with data mining -- take the instruction past basic database education.

How is data mining being taught to undergraduates? Is there a well known text or texts? Are there educational versions of commercial data mining software packages available?

Joe Brady College of Business University of Delaware

Here is a summary of the responses, which may be useful to many KDnuggets readers as well.

--- From: Monte Hancock Date: Thu, 30 Aug 2001 13:52:23 -0400

I just finished teaching a data mining course to undergraduates this past summer term, and had great success with the book I co-wrote with Rhonda Delmater: "Data Mining Explained: A Manager's Guide to Customer-Centric Business Intelligence". The book contains lots of tutorial material, as well as some business-oriented philosophy for DM apps. There are nine case studies in different vertical markets/industries. The foreword is by Dorian Pyle.

The students liked the book, found it to be at about the right level of technical depth, and gave it high marks.

As of August 30, the book was 9th in sales out of 141 data mining titles at Barnes & Noble, and 29th out of 561 business intelligence titles on Amazon.

Published by Digital Press January, 2001 ISBN 1-55558-231-1

Cost is around $30.

--- From: Sergei Ananyan Date: Mon, 3 Sep 2001

The best introductory textbook to use to teach data mining to undergraduates might be "Data Mining Techniques: for Sales, Marketing, and Customer Support" by Berry and Linoff.

Regarding educational versions of commercial data mining software, you can download a free evaluation copy of PolyAnalyst 4.4, a multi-strategy data mining system from Megaputer Intelligence, and use it for 60 days in your class. PolyAnalyst was recently selected as one of "Analyst's Top 5 Picks" in the enterprise infrastructure category (see http://www.infoworld.com/articles/tc/xml/01/06/11/010611tcmario.xml ). When teaching a course on data mining, you might find it useful to use one or more of the eight tutorials provided with the system. You can download PolyAnalyst from the Megaputer web site, www.megaputer.com.

In addition to the evaluation version of the system, Megaputer offers special discounts on the full version of PolyAnalyst to universities that use PolyAnalyst in their courses. Please feel free to contact Megaputer directly for details.

Finally, when teaching a course on data mining, you might want to present some material on text mining as well. Text mining is going to become a very important business technology in the near future. A good introduction to the subject is Dan Sullivan's book "Document Warehousing and Text Mining", published this year.

Good luck with your course!

Sincerely,

Sergei Ananyan Megaputer Intelligence ----

From: Tom Munro Sent: Friday, August 31, 2001 3:25 p.m. Subject: DM: Teaching Data Mining to Business students

A colleague forwarded your message to me regarding teaching data mining to business students. I taught a data mining course last year to a group of graduate business students and will be interested in hearing about your experience as you move forward.

As far as textbooks go, I used two data mining texts:

Han, Jiawei and Micheline Kamber, "Data Mining: Concepts and Techniques," Morgan Kaufmann Publishers, August 2000, 500 pp., ISBN 1558604898

Weiss, Sholom M. and Nitin Indurkhya, "Predictive Data Mining: A Practical Guide," Morgan Kaufmann Publishers, August 1997, 225 pp., ISBN 1558604030

I like both of these books, but they may be better suited to computer-science or information technology students than to business students. When I teach the course this year, I'll more likely use something like:

Delmater, Rhonda and Monte Hancock, "Data Mining Explained, A Manager's Guide to Customer-Centric Business Intelligence," Digital Press, January 15, 2001, 352 pp., ISBN: 1555582311

Liautaud, Bernard and Mark Hammond, "E-Business Intelligence: Turning Information into Knowledge into Profit," McGraw-Hill Professional Publishing, October 12, 2000, 306 pp., ISBN: 0071364781

Hughes, Arthur Middleton, "Strategic Database Marketing: The Masterplan for Starting and Managing a Profitable, Customer-Based Marketing Program," McGraw-Hill Professional Publishing, 2nd edition (May 12, 2000), 400 pp., ISBN: 0071351825

Although it's difficult to discuss data mining without discussing algorithms, the students I had responded much better to hands-on work than to discussions of data manipulation. Further, the data mining course was designed to be part of a developing "e-curriculum," and many of the students were more interested in CRM-type mining than in fraud detection, for instance.

I'm not sure what to recommend for software. I had good success introducing the principles of handling large quantities of data by designing classroom exercises that used Microsoft Excel's built-in filtering and crosstab tools (pivot tables). Demonstrating slice and dice with Seagate's Crystal Analysis (which was a free download) also worked well.
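For readers who would like to try the same kind of filtering and crosstab exercise outside Excel, here is a minimal sketch in Python. It is not taken from the course described above: the sales records, the column layout, and the function names (filter_rows, crosstab, print_crosstab) are all made up for illustration, and it uses only the standard library.

```python
from collections import defaultdict

# Hypothetical classroom data: (region, product, units_sold).
SALES = [
    ("East", "Books", 12),
    ("East", "Music", 7),
    ("West", "Books", 5),
    ("West", "Music", 9),
    ("East", "Books", 3),
    ("West", "Video", 4),
]


def filter_rows(rows, region):
    """Keep only the rows for one region (the 'slice' in slice and dice)."""
    return [row for row in rows if row[0] == region]


def crosstab(rows):
    """Sum units by region (rows) and product (columns), like a pivot table."""
    table = defaultdict(lambda: defaultdict(int))
    for region, product, units in rows:
        table[region][product] += units
    # Convert to plain dicts so the result is easy to print and compare.
    return {region: dict(products) for region, products in table.items()}


def print_crosstab(table):
    """Print the crosstab as a small text grid, filling empty cells with 0."""
    products = sorted({p for row in table.values() for p in row})
    print("Region".ljust(8) + "".join(p.rjust(8) for p in products))
    for region in sorted(table):
        cells = "".join(str(table[region].get(p, 0)).rjust(8) for p in products)
        print(region.ljust(8) + cells)


if __name__ == "__main__":
    print_crosstab(crosstab(SALES))
    print(filter_rows(SALES, "West"))
```

Filtering first and then building the crosstab on the filtered rows mirrors what students do in Excel when they apply a filter before inserting a pivot table.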

There is (or was) a student pack of Clementine available from SPSS. I didn't use it, but I have considered it. The Clementine package is good and I ran a demo for the students during class. Jiawei Han, author of one of the books mentioned above, also has a package called DBMiner (http://www.dbminer.com). Last year, there was an education version available. I'm not sure that's still the case. I didn't use it for class because it required Win NT or 2000 and SQL Server 7. Nevertheless, it's a good package (I did review it) and considerably more affordable than Clementine. Sholom Weiss and Nitin Indurkhya, authors of another book mentioned above, also have a data mining toolkit called DMSK (http://www.data-miner.com). This one's not flashy and doesn't help much with visualization, but it is capable of some very powerful transformations.

One other tactic worked well for me . . . Since I had 3+ hour class sessions to work with, I invited outside speakers who presented data mining topics from a variety of perspectives. While matching speakers with the topics we were discussing in class was problematic, the students gave the speakers very high marks when it came time to review the strengths and weaknesses of the course.

This year I plan to spend more time with fundamental data-based decision-making using hands-on approaches in class. I plan to do this before jumping into the data mining topic.

I haven't decided yet on a data mining toolset. Excel is limited to conventional analysis; Clementine is relatively expensive; last year's version of DBMiner set too high a standard for hardware; and DMSK doesn't help with visualization or rapid model development.

This is a really interesting area. It was tempting to develop a tool-specific curriculum, but I decided that teaching the principles behind the techniques is the more important challenge.

I'll be curious to hear about your experiences with this . . . Good luck!

Regards,

Tom Munro Crummer Graduate School of Business Rollins College



Copyright © 2001 KDnuggets.