KDD Nugget 95:9, e-mailed 95-04-21 Contents: * S. Hirtle, CSNA WWW pages moved to http://www.pitt.edu/~csna/ * G. Melli, Announce - Synthetic Classification Data Sets program Siftware: * R. Almond, Software for Belief Networks Page http://bayes.stat.washington.edu/almond/belief.html * R. Segal, Brute -- a system for induction of conjunctive rules * GPS, Siftware: SAS homepage at http://www.sas.com/ WinViz homepage http://www.iti.gov.sg/factsheet/ia/winviz.html * W. Taylor, Classes from Data! Autoclass C is here! CFPs: Calls for Papers and Participation: * H. Blockeel, ILP95 - final CFP The KDD Nuggets is a moderated mailing list for news and information relevant to Knowledge Discovery in Databases (KDD), also known as Data Mining, Knowledge Extraction, etc. Relevant items include tool announcements and reviews, summaries of publications, information requests, interesting ideas, clever opinions, etc. Please include a descriptive subject line in your submission. Nuggets frequency is approximately bi-weekly. 
Back issues of Nuggets, a catalog of S*i*ftware (data mining tools), references, FAQ, and other KDD-related information are available at Knowledge Discovery Mine, URL http://info.gte.com/~kdd/ or by anonymous ftp to ftp.gte.com, cd /pub/kdd, get README E-mail add/delete requests to kdd-request@gte.com E-mail contributions to kdd@gte.com -- Gregory Piatetsky-Shapiro (moderator) ********************* Official disclaimer *********************************** * All opinions expressed herein are those of the writers (or the moderator) * * and not necessarily of their respective employers (or GTE Laboratories) * ***************************************************************************** ~~~~~~~~~~~~ Quotable Quote ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ "I know half the money I spend on advertising is wasted, but I can never find out which half" --- Lord Leverhulme Unilever, an Anglo-Dutch Consumer Product Company (sent by hongbo@buck.ac.uk Hongbo Duh) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Return-Path: Date: Mon, 3 Apr 1995 16:05:58 +0200 From: Stephen Hirtle Subject: CSNA WWW pages have moved To: Multiple recipients of list CLASS-L As the CSNA information officer, I would like to mention a few changes to the World Wide Web (WWW) pages of the CSNA. First and foremost, CSNA now has its own WWW address. The new address is http://www.pitt.edu/~csna/ From this home page, you can access a number of classification-related services including several on-line bibliographies and the CSNA newsletters. The most recent additions are (1) the 1994 volume of the Classification Literature Automated Search Service, known as CLASS, and (2) the February 1995 issue of the CSNA newsletter. If you do not have access to a web browser, such as Mosaic, Lynx, or Netscape, you can retrieve some of the information through an email server.
To retrieve the CSNA WWW page via email, send a message to the address "listserv@mail.w3.org" with the single line of send http://www.pitt.edu/~csna/csna.html The hypertext document will be mailed back to you with hypertext links in brackets. You may request further documents by replying to the message with the numbers of the requested documents in the body of the message, e.g., 2 3 14 You can also send the message 'help' to the listserv address above to learn more about this service. I am also in the process of setting up a CSNA ftp server, where you can deposit technical reports, papers, or software code. Please let me know of any information, or pointers, that you would like to see on either the CSNA ftp server or CSNA WWW pages. I would also like to include pointers to WWW pages of classification labs or classification researchers. If you have a home page, or your own classification depository on the WWW, and would like to make it available for others, please send me the address and I will include it in the upcoming list of classification labs. Stephen Hirtle, CSNA Webmaster -------------------------------------------------------------------- Stephen C. Hirtle Associate Professor Dept. of Computer Science Dept. of Information Science Molde College, Box 308 University of Pittsburgh N-6401 Molde, Norway Email: hirtle+@pitt.edu Phone: +47 71 21 40 00 http://www.pitt.edu/~hirtle/ Fax: +47 71 21 41 00 -------------------------------------------------------------------- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Return-Path: From: Gabor_Melli@cs.sfu.ca Date: Mon, 10 Apr 95 09:15:26 PDT To: kdd@gte.com Subject: Announce - Synthetic Classification Data Sets program Cc: Gabor_Melli@cs.sfu.ca Content-Type: text Content-Length: 1113 One important way to test learning-from-example algorithms is to evaluate their performance against well understood synthetic data sets.
The Synthetic Classification Data Sets (SCDS) program has been created to generate synthetic data sets which are particularly useful to test Knowledge Discovery from Database (KDD) algorithms. SCDS will first generate a synthetic set of rules and proceed to generate a relation which abides by this rule base. Several characteristics of the process can be customized. Rule bases may be in Conjunctive Normal Form, with easy customization of the quantity and complexity of the rules. Data sets can also be customized, for example, to include some interesting real-world characteristics such as irrelevant attributes, missing attributes, noisy data and missing values. The project files and information are located at: URL http://fas.sfu.ca/cs/people/GradStudents/melli/SCDS While the ANSI C source code for version 1.0 is available, you should also test the user-friendly interactive WWW Form interface! Gabor Melli School of Computing Science Simon Fraser University ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ From <@watstat.uwaterloo.ca:almond@statsci.com> Thu Mar 9 19:17:10 1995 To: ai-stats@watstat.uwaterloo.ca Subject: Software for Belief Networks Page (Version 1.2) Content-Length: 987 Well, I've finally done an update on my page for software for belief networks. The new version has a new (and hopefully permanent) address: http://bayes.stat.washington.edu/almond/belief.html Please update your pointers. Thanks to David Madigan for hosting the page. There are lots of links to free software and demo versions of commercial software. As always, please send any updates to me at almond@statsci.com. I'm going to compile this version shortly for the final submission of my book (as soon as I finish with the other corrections) so if you want your software mentioned in my book, please send me the update NOW. In the next version, I'm going to try and include software for fitting belief networks. Please send me any pointers you might have (or post them to this list). Enjoy.
Russell Almond StatSci (a division of MathSoft) 1700 Westlake Ave., N Suite 500, Seattle, WA 98109 (206) 283-8802 x234 FAX: (206) 283-6310 Email: almond@statsci.com ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Return-Path: Date: Tue, 28 Mar 1995 13:35:58 -0800 From: segal@cs.washington.edu (Richard Segal) To: kdd@gte.com Subject: Addition to the siftware catalog Content-Type: text Content-Length: 1350
*Name: Brute
*Description: Brute is an inductive system for performing both data mining and classification tasks. At the core of Brute is an algorithm for efficiently searching all the conjunctive IF...THEN rules up to a user-specified length. Brute's use of massive search allows it to avoid many of the pitfalls associated with greedy search and makes it possible to find better rules. Brute can process 100,000 rules a second on a SPARC 10 when run on databases containing 500 examples. Brute's speed allows it to process fairly large databases in a reasonable amount of time.
*Discovery methods: Induction of conjunctive rules using massive search.
*Comments: Brute has previously been used to analyze Boeing manufacturing data and improve manufacturing efficiency.
*Source: Original author.
*Platform(s): Portable ANSI C.
*Contact:
          Richard Segal
          Dept. of Computer Science and Engineering, FR-35
          University of Washington
          Seattle, WA 98195
          segal@cs.washington.edu
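
[Editor's sketch] The idea of searching all conjunctive IF...THEN rules up to a user-specified length, as described in this entry, can be illustrated roughly as below. This is not Brute's code, and nothing here reflects how Brute achieves its speed. The function name enumerate_rules, the toy DATA table, the majority-class prediction, and the ranking by accuracy and then coverage are all assumptions made for illustration.

```python
from itertools import combinations


def enumerate_rules(examples, target, max_len):
    """Score every conjunctive rule with at most max_len attribute=value tests.

    examples: list of dicts mapping attribute name -> value; the class label
    is stored under the key `target`.
    Returns (conditions, predicted_class, coverage, accuracy) tuples, best first.
    """
    attributes = sorted(k for k in examples[0] if k != target)
    # Every attribute=value test that actually occurs in the data.
    tests = sorted(
        {(a, ex[a]) for ex in examples for a in attributes},
        key=lambda t: (t[0], str(t[1])),
    )
    results = []
    for length in range(1, max_len + 1):
        for conds in combinations(tests, length):
            # Skip conjunctions that test the same attribute twice.
            if len({a for a, _ in conds}) < length:
                continue
            covered = [ex for ex in examples
                       if all(ex[a] == v for a, v in conds)]
            if not covered:
                continue
            # The rule predicts the majority class among covered examples.
            counts = {}
            for ex in covered:
                counts[ex[target]] = counts.get(ex[target], 0) + 1
            label = max(sorted(counts), key=counts.get)
            accuracy = counts[label] / len(covered)
            results.append((conds, label, len(covered), accuracy))
    # Rank by accuracy, then coverage, then shorter rules first.
    results.sort(key=lambda r: (-r[3], -r[2], len(r[0])))
    return results


# Hypothetical toy data, invented for this sketch.
DATA = [
    {"shape": "round", "color": "red", "label": "apple"},
    {"shape": "round", "color": "orange", "label": "orange"},
    {"shape": "long", "color": "yellow", "label": "banana"},
    {"shape": "round", "color": "red", "label": "apple"},
]

if __name__ == "__main__":
    for conds, label, coverage, accuracy in enumerate_rules(DATA, "label", 2):
        body = " AND ".join(f"{a}={v}" for a, v in conds)
        print(f"IF {body} THEN {label}  (covers {coverage}, accuracy {accuracy:.2f})")
```

Because every conjunction up to the length limit is scored, no candidate rule is pruned early by a greedy choice; the cost is that the number of candidates grows quickly with max_len and with the number of attribute values.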

*Updated: 1995-03-28 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Date: Wed, April 1995 From: gps@gte.com (Gregory Piatetsky-Shapiro) Subject: Siftware: SAS package *Name: SAS
*Description: The SAS System is a modular, integrated, hardware-independent system of software for enterprise-wide information delivery. Distinguishing the software is its ability to:
  • make enterprise data, regardless of source, a generalized resource available to any user or application that requires it.
  • transform enterprise data into meaningful information for a broad range of applications.
  • deliver that critical information to those who need it when they need it through a variety of interfaces tailored to the needs and experience of the individual computer user.
  • perform consistently across and cooperatively among a broad range of hardware environments while exploiting the particular advantages of each.

*Contact: software@sas.sas.com
*Status: product
*URL: http://www.sas.com/
*Updated: 1995-04-19 ---- Also, I added a URL pointer, http://www.iti.gov.sg/factsheet/ia/winviz.html, for the WinViz homepage -- data analysis and visualization software. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Date: Wed, 19 Apr 95 10:10:54 PDT From: taylor@ptolemy-ethernet.arc.nasa.gov (Will Taylor) To: kdd@gte.com Subject: Classes from Data! Autoclass C is here! Elvis weds Alien! Reply-To: taylor@ptolemy-ethernet.arc.nasa.gov *Name: AutoClass C (AC C) *Description: C implementation of AutoClass: an unsupervised Bayesian classification system that seeks a maximum posterior probability classification. *Discovery methods: Classification, Bayesian Statistics, Clustering *Comments: programmed in ANSI C with GNU gcc 2.6.3; source code provided. *URL: http://ic-www.arc.nasa.gov/ic/projects/bayes-group/group/html/autoclass-c-program.html or send e-mail to taylor@ptolemy.arc.nasa.gov *Platform(s): at a minimum, SunOS - others untested. *Contact: Will Taylor, NASA Ames Research Center MS 269-2, Moffett Field, CA 94035-1000 taylor@ptolemy.arc.nasa.gov (415)604-3364, (415)604-3594 *Status: public domain *Updated by: Will Taylor on 1995-04-15 ------------------------------------------------------------------------ ------------------------------------------------------------------------ Announcing AUTOCLASS C, the mostest Bayesian Classifier ever! You describe a bunch of cases with as many attributes as you like, and before you can factor a 500 digit number, out pops a set of classes defined by a small set of class parameters. FEATURES: * Finds the optimum number of classes automatically. * Can deal with real-valued, discrete, or missing values. * Can leap tall buildings in a single bound. (scratch that, wrong ad) * Class/Case assignments are probabilistic, not just either/or. * Indicates which attributes are most influential for which classes. * Described in respectable publications (references provided).
* You can stop the search and get the current best answer at any time. * Estimates rate of progress in search, to aid deciding when to quit. * Fully Bayesian - uses priors and finds max posterior classification. How much would you pay for all this? Don't answer! There's more. If you act now you also get: * In ANSI C, with source code as standard equipment. * Organically grown and bug-free (we call them features). * Our cheerful support staff will answer all your calls 3:00-3:01am. * Invented by NASA, the guys who gave you the moon. * Programmed by academics: Diane Cook and Joe Potts at UTexas-Arlington. * Price: FREE! Or make an offer! * Available via anonymous ftp. Questions? Comments? Want the Movie Rights? Contact us at: http://ic-www.arc.nasa.gov/ic/projects/bayes-group/group/html/autoclass-c-program.html or send e-mail to taylor@ptolemy.arc.nasa.gov --------------- The Bayes Boys, Peter Cheeseman, John Stutz, Robin Hanson, Will Taylor (No No, REALLY! We're Serious!) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ~~~ CFPs: Calls for Papers and Participation: ~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Return-Path: Date: Wed, 19 Apr 1995 15:06:13 +0200 From: Hendrik Blockeel To: H42Gyi%HUELLA.BITNET@cc1.kuleuven.ac.be, Steve.Muggleton@comlab.ox.ac.uk, ashwin@comlab.ox.ac.uk, bergadan@di.unito.it, calle@dsv.su.se, celine@lri.fr, claude@cs.unsw.oz.au, compunode@ecrc.de, dpage@comlab.ox.ac.uk, flach@kub.nl, furukawa@icot.or.jp, gunetti@di.unito.it, h1001tur@ella.hu, h159dom@ella.hu, h42gyi@ella.hu, henke@dsv.su.se, Hilde.Ade@CS.kuleuven.ac.be, ilp@lri.fr, ivan.bratko@ijs.si, kdd@gte.com, kietz@gmdzi.gmd.de, Luc.DeRaedt@CS.kuleuven.ac.be, Maurice.Bruynooghe@CS.kuleuven.ac.be, ml@CS.kuleuven.ac.be, mlnet@csd.abdn.ac.uk, mooney@cs.utexas.edu, morik@kilo.informatik.uni-dortmund.de, nada.lavrac@ijs.si, numao@cs.titech.ac.jp, pazzani@ics.uci.edu, pi@dsv.su.se, quinlan@ml2.cs.su.oz.au, saso.dzeroski@ijs.si, stahl@is.informatik.uni-stuttgart.de, 
stan@csi.uottawa.ca, steffo@kibosh.informatik.uni-dortmund.de, tausend@is.informatik.uni-stuttgart.de, u11557%uicvm.BITNET@cc1.kuleuven.ac.be, volker@kiste.informatik.uni-dortmund.de, wcohen@research.att.com, wirth@faw.uni-ulm.de, wrobel@gmdzi.gmd.de Subject: ILP95 - final CFP Content-Type: text Content-Length: 3036 5th International Workshop on Inductive Logic Programming 4-6 September 1995, Leuven, Belgium Announcement and Call for Papers General Information : ILP-95 is the 5th annual meeting of researchers in and practitioners of inductive learning in first order logic. Previous meetings have been organized in Viana de Castello (91), Tokyo (92), Bled (93), and Bad Honnef (94). Program : The scientific program will include invited talks, presentations of selected papers, poster and demo sessions. The program will be complemented by and overlap with an area meeting on Knowledge Representation and Reasoning of the ESPRIT Network of Excellence Compulog from 6-8 September 1995. It is our intention to have Wednesday 6 September recognized as a joint workshop of the ESPRIT Networks of Excellence in Computational Logic (Compulog) and Machine Learning (MLnet). Further information will become available in January. Submission of Papers : ILP solicits papers addressing inductive machine learning within the representation offered by computational logic. ILP-95 especially wishes to encourage submissions of interest to both researchers in computational logic and inductive machine learning. This includes (but is not limited to) topics such as inductive synthesis of logic programs, applications of abductive logic programming to induction, meta-programming approaches to induction, applications of inductive logic programming to software engineering, deductive databases, database design, theory revision, etc. The proceedings will be distributed at the workshop, and will appear as a technical report of the K.U.Leuven, Computer Science Department. 
There are also plans for publishing a post-conference volume with a major publishing company. Full papers are limited to 5000 words. Submissions should be made in 5 copies to: Luc De Raedt (ILP-95) Department of Computer Science, Katholieke Universiteit Leuven Celestijnenlaan 200A, B-3001 Heverlee (Belgium) Whenever possible, authors should send their title page using email to ilp95@cs.kuleuven.ac.be. Important Dates : Submission deadline 1 May 1995 Notification of acceptance/rejection 1 July 1995 Camera ready copy 1 August 1995 Program Chair : Luc De Raedt (Katholieke Universiteit Leuven, Belgium). Program Committee : F. Bergadano (Italy) I. Bratko (Slovenia) W. Cohen (USA) S. Džeroski (Slovenia) P. Flach (The Netherlands) P. Idestam-Almquist (Sweden) N. Lavrač (Slovenia) S. Matwin (Canada) R. Mooney (USA) S. Muggleton (U.K.) M. Numao (Japan) D. Page (U.K.) J.R. Quinlan (Australia) A. Srinivasan (U.K.) C. Rouveirol (France) C. Sammut (Australia) S. Wrobel (Germany) To receive further information about ILP-95 : send email to ilp95@cs.kuleuven.ac.be, or see (via WWW) http://www.cs.kuleuven.ac.be/~ilp95/. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~