KDD Nuggets -- August, 1993

Contents:
  * KDD successes -- example: Pattern Discovery in Data at IRS
  * Suggestion to define fundamental KDD terms
  * Description of MLNET activities
  * KDD-93 proceedings are available from AAAI

The KDD Nuggets is an informal list for the dissemination of
information relevant to Knowledge Discovery in Databases (KDD), such as
announcements of conferences/workshops, tool reviews, application
success/failure stories, interesting ideas, outrageous opinions, etc.
If you have such a contribution, please email it to kdd%eureka@gte.com

Mail requests to be added to or deleted from the list to
kdd%eureka@gte.com. (Note: If you received this message you are on
the list!)

----------------------------------------------------------------------
Date: Wed, Aug 11, 1993
From: Gregory Piatetsky-Shapiro (GTE Laboratories) gps0@gte.com

I am interested in collecting examples of KDD successes. These could
be of two forms: specific one-time successes, such as the one described
below, or tools actually used for discovery in data, such as SKICAT
(JPL) or Spotlight (A. C. Nielsen). If you have any KDD successes to
report, please send them to kdd%eureka@gte.com and I will distribute
them to kdd-list.

from ComputerWorld, Aug 11, 1993, p. 15.
----------
in "IRS uncovers bogus access to tax records" article.

... Deputy Commissioner Michael P. Dolan said the IRS Atlanta operation
discovered the illegal activity by using software able to detect
suspicious use patterns over three years from a database of audit trail
information. He said widespread use of the detection technique has been
inhibited by the difficulty of processing tape archives, which grow by
100 million transactions per month.

However, Dolan said, each regional IRS center will soon get the
"pattern detection" software with high capacity optical disc hardware
to allow line managers to easily monitor system use -- and possible
misuse -- by their employees.
-------------------------------------------
From Willi Kloesgen (GMD, Germany) kloesgen@gmd.de

We should elaborate clear definitions of fundamental terms of KDD
(e.g. "pattern"), clarify some vague concepts (e.g. "interestingness
of finding"), identify the main subgroups of applications (discovery in
databases (relational, time series ...), scientific discovery, image
database discovery ...) and describe the (methodical) differences
between these subgroups.

Perhaps an informal group could work on these and other questions and
give a report at the next KDD Workshop. Such a group could be announced
on the KDD mailing list to invite participation. Any suggestions are
welcome.

--------------------

The European Community (EEC) has set up 12 "networks of excellence" to
coordinate Research and Development throughout Europe and to be a
source of information. A network on Machine Learning and Knowledge
Acquisition (MLNET) has recently been funded. D. Sleeman of the
Aberdeen Computer Science Department (sleeman@csd.abdn.ac.uk) is
MLnet's Academic Coordinator. The network has 33 nodes (10 main nodes
and 23 associated nodes) and a budget of 620,000 ecus (European
Currency Unit, 1 ecu = $1.20).

The aim of this network is to coordinate ML research, "to ensure that
the subject achieves a solid scientific basis and becomes an important
technology on which the European IT industry can build future
intelligent systems".

MLnet has set up a number of Technical Committees to promote
Industrial Liaison (Y. Kodratoff, France), Research Organisation and
Coordination (L. Saitta, Italy), Training (K. Morik, Germany) and
Electronic Communication (B. Wielinga, Netherlands). There is a
quarterly newsletter edited at Aberdeen. The first events are the
"familiarisation workshop" in Blanes (Spain) and a Summer School in
Eastern Europe in 1994.
The MLNET workshop in Blanes received approximately 20 papers, the
majority on scientific discovery (modelling the discovery process of a
scientist, discovery of quantitative laws) and some 5 papers on
database discovery. Most papers deal with dependencies: tackling
feature interaction, comparison of methods based on constraints on
correlations (tetrad equations) and on conditional independencies,
additional knowledge to guide the search for plausible rules.

Willi
----------------------------------------------
Willi Kloesgen, GMD, D-53757 Sankt Augustin
Phone ++49/2241-14-2723, Fax ++49/2241-14-2618
E-mail: kloesgen@gmd.de
-----------------------------------------------

******** Knowledge Discovery in Databases (KDD-93) **************

Proceedings of this AAAI-93 Workshop are now available as an AAAI
technical report.

Knowledge Discovery is an area of common interest for researchers in
machine learning, statistics, intelligent databases, knowledge
acquisition, and expert systems, focusing on unifying themes such as
the use of domain knowledge, managing uncertainty, interactive
discovery, and transition from research to application.

This workshop brought together over 60 researchers from 10 countries.
28 selected papers are included in the proceedings.

Contents:
  Part 1. Real World Applications (9 papers)
  Part 2. Discovery of Dependencies and Models (8 papers)
  Part 3. Integrated and Interactive Systems (6 papers)
  Part 4. Database-Specific Techniques (3 papers)
  Part 5. Discovery in Textual Documents (2 papers)

(send mail to gps@gte.com to get full contents)

To order contact
  Daphne Black
  AAAI
  445 Burgess Drive
  Menlo Park, CA 94025-3496
  Tel: 415-328-3123
  Fax: 415-321-4457
  e-mail: info@aaai.org

Cost: $20 + shipping
  Within the US and Canada (shipped by UPS): $3.50 for the first book,
  and $1.00 for each additional book.
  Outside of the US and Canada: $6.50 per book surface and $15.25 per
  book airmail.

*Please allow 4-6 weeks for delivery.
California residents must pay 8.25% sales tax in addition to the cost
of shipping the reports.

------------------------------------------------