KDD Nuggets 94:3, 1994-02-09

Contents:
 * Peter Edwards: Report on European Machine Discovery Workshop
 * GPS: Reminder -- KDD-94 Submissions Due March 1
 * Roberto Zicari -- CFP: Theory and Practice of Object Systems Journal

The KDD Nuggets is a moderated list for the exchange of information relevant to Knowledge Discovery in Databases (KDD), e.g. application descriptions, conference announcements, tool reviews, information requests, interesting ideas, outrageous opinions, etc.

Contributions to kdd@gte.com; Add/delete requests to kdd-request@gte.com

-- Gregory Piatetsky-Shapiro (moderator)

------------------------------------
From: ML List (ml@ics.uci.edu)
Date: January 26, 1994

Report on WS3: Machine Discovery Workshop at Blanes, Spain, Sept 1993
by Peter Edwards (University of Aberdeen)

The workshop attracted a considerable number of submissions, 12 of which were accepted for presentation. A broad range of discovery-related activities was represented, including law discovery (Dzeroski, Van Laer, Moulet, Cheng), knowledge discovery in databases/large datasets (Klosgen, Richeldi, Carpineto, Wallis), experimentation (Wendel), and theory refinement (Alberdi & Sleeman, Gordon, Metaxas). A wide variety of applications were described: chemical kinetics (Dzeroski), evaluation of DBMS performance (Moulet), communication network management (Richeldi), neurophysiology (Wendel), botany (Alberdi), solution chemistry (Gordon), NMR of carbohydrates (Metaxas), Alzheimer's disease (Wallis).

The workshop began with an invited talk by Jan Zytkow of Wichita State University, USA. In his talk, entitled "Putting Together a Machine Discoverer: Basic Building Blocks", Zytkow gave an overview of the field of Machine Discovery, before addressing issues such as the difference between discovery and learning, and the requirements for a discoverer. The main thrust of his first argument was that discoverers are autonomous, whereas learners depend on a teacher.
According to Zytkow, the aim of Machine Discovery is thus to limit the amount of "external" assistance. Statements such as "All good learners are still discoverers" and "We were discoverers before we became learners" led to some interesting discussion! Zytkow listed the following techniques which he felt were required to build a powerful discoverer: linkage to empirical systems, experimentation strategies, theory formation from data, recognition of the unknown, identification of similar patterns.

The first session contained four papers, all of which addressed the issue of law discovery. Dzeroski described the LAGRANGE system, which extends the scope of discovery techniques to deal with dynamic systems. LAGRANGE is able to find a set of differential and/or algebraic equations which govern the behaviour of a dynamic system. This is in contrast to existing systems such as BACON, which have focused on laws describing static situations. The second paper (Van Laer) described an extension to the CLAUDIEN inductive logic programming system to allow it to handle numerical information. A simple conjunctive learning algorithm is employed, capable of finding inequalities such as: X * Y <= 5.17. An application-oriented view of numerical discovery was presented by Moulet. The ARC.2 system has been applied to the evaluation of Database Management System (DBMS) performance. The system attempts to discover a "cost model", relating the time spent executing a database query to the characteristics of the query. ARC.2 has discovered a number of simple cost models for the GeoSabrina DBMS, which are in agreement with those derived by human experts. The HUYGENS system (Cheng) uses a different approach to discover quantitative laws. A search through a space of diagrammatic representations of the problem is performed, rather than a search through algebraic formulas. Cheng described a series of diagrammatic operators and heuristics which were used to simulate the discovery of Black's Law.
The second session focused on discovery of knowledge in databases (and large datasets). Klosgen described the Explora KDD system, which employs a number of techniques to reduce the potentially huge search space encountered when searching for regularities in databases. The system constructs a hierarchical search space of hypotheses based on a user-defined pattern and organises and controls the search for interesting patterns within this space. The system also supports the presentation of discovered information through a graphical user interface. In the next presentation, Richeldi described a comparison of statistical and connectionist approaches for detection of relevant features in a large database containing information on telephone network maintenance activities. The data contain large numbers of irrelevant and redundant features, as well as inter-dependent features. The GALOIS system, for incremental determination of concept lattices, was described by Carpineto. A concept lattice is a set of conceptual clusters linked by the general/specific relation. A number of applications of concept lattices were presented, including discovery of dependencies in databases. A large-scale medical application was discussed by Wallis (25,000 examples, approx. 300 attributes). A pre-processing method employing transformations such as elimination of irrelevant features, application of functional dependencies and aggregation of attributes was outlined.

The final session focused on experimentation and theory refinement. The MOBIS system (Wendel) is a case-based tool designed to assist neurophysiologists in the design and analysis of simulation experiments with biological neural networks. The system performs experiments with varying parameter settings in order to identify surprising new phenomena in a network's behaviour.
Wendel described techniques used to transform numerical data derived from the simulation into a symbolic description of neuronal behaviour which is compared with previous experimental cases. The remaining presentations addressed issues in theory refinement; Alberdi described a psychological study to determine the search strategies and heuristics employed by expert botanists when performing plant classification tasks. Subjects were presented with puzzling phenomena and their refinement strategies were studied. A computational model of theory revision in the context of scientific classification was proposed based on the psychological results. The next speaker (Gordon) described the HUME system which employs simple qualitative models as part of a multistrategy architecture which integrates both data and theory-driven discovery methods. Gordon presented an overview of experimental results from eighteenth and nineteenth century solution chemistry and demonstrated the utility of qualitative models in guiding the theory construction process. The final workshop paper (Metaxas) discussed the CRITON system - an incremental concept learning system operating in the domain of NMR of carbohydrates. CRITON is able to deal with an incomplete instance language by introducing new descriptors. It does this by incorporating descriptors from a library or through interaction with an oracle (user). 
---------------------------------------
From: Gregory Piatetsky-Shapiro
Date: Wed, 9 Feb 94
Subject: Reminder -- KDD-94 Submissions due March 1

============================================================================
                       C a l l   F o r   P a p e r s
============================================================================

         KDD-94: AAAI Workshop on Knowledge Discovery in Databases
                Seattle, Washington, July 31-August 1, 1994
                ===========================================

Knowledge Discovery in Databases (KDD) is an area of common interest for researchers in machine learning, machine discovery, statistics, intelligent databases, knowledge acquisition, data visualization and expert systems. The rapid growth of data and information created a need and an opportunity for extracting knowledge from databases, and both researchers and application developers have been responding to that need. KDD applications have been developed for astronomy, biology, finance, insurance, marketing, medicine, and many other fields. Core Problems in KDD include representation issues, search complexity, the use of prior knowledge, and statistical inference.

This workshop will continue in the tradition of the 1989, 1991, and 1993 KDD workshops by bringing together researchers and application developers from different areas, and focusing on unifying themes such as the use of domain knowledge, managing uncertainty, interactive (human-oriented) presentation, and applications.
The topics of interest include:

   Applications of KDD Techniques
   Interactive Data Exploration and Discovery
   Foundational Issues and Core Problems in KDD
   Machine Learning/Discovery in Large Databases
   Data and Knowledge Visualization
   Data and Dimensionality Reduction in Large Databases
   Use of Domain Knowledge and Re-use of Discovered Knowledge
   Functional Dependency and Dependency Networks
   Discovery of Statistical and Probabilistic models
   Integrated Discovery Systems and Theories
   Managing Uncertainty in Data and Knowledge
   Machine Discovery and Security and Privacy Issues

We also invite working demonstrations of discovery systems. The workshop program will include invited talks, a demo and poster session, and panel discussions. To encourage active discussion, workshop participation will be limited. The workshop proceedings will be published by AAAI. As in previous KDD Workshops, a selected set of papers from this workshop will be considered for publication in journal special issues and as chapters in a book.

Please submit 5 *hardcopies* of a short paper (a maximum of 12 single-spaced pages, 1 inch margins, and 12pt font, cover page must show author(s) full address and E-MAIL and include 200 word abstract + 5 keywords) to reach the workshop chairman on or before March 1, 1994.

   Usama M. Fayyad (KDD-94)            |  Fayyad@aig.jpl.nasa.gov
   AI Group M/S 525-3660               |
   Jet Propulsion Lab                  |  (818) 306-6197 office
   California Institute of Technology  |  (818) 306-6912 FAX
   4800 Oak Grove Drive                |
   Pasadena, CA 91109                  |

******************** I m p o r t a n t   D a t e s ********************
*   Submissions Due:     March 1, 1994                                *
*   Acceptance Notice:   April 8, 1994                                *
*   Final Version due:   April 29, 1994                               *
***********************************************************************

Program Committee
=================

Workshop Co-Chairs:

   Usama M. Fayyad (Jet Propulsion Lab, California Institute of Technology)
   Ramasamy Uthurusamy (General Motors Research Laboratories)

Program Committee:

   Rakesh Agrawal (IBM Almaden Research Center)
   Ron Brachman (AT&T Bell Laboratories)
   Leo Breiman (University of California, Berkeley)
   Nick Cercone (University of Regina, Canada)
   Peter Cheeseman (NASA AMES Research Center)
   Greg Cooper (University of Pittsburgh)
   Brian Gaines (University of Calgary, Canada)
   Larry Kerschberg (George Mason University)
   Willi Kloesgen (GMD, Germany)
   Chris Matheus (GTE Laboratories)
   Ryszard Michalski (George Mason University)
   Gregory Piatetsky-Shapiro (GTE Laboratories)
   Daryl Pregibon (AT&T Bell Laboratories)
   Evangelos Simoudis (Lockheed Research Center)
   Padhraic Smyth (Jet Propulsion Laboratory)
   Jan Zytkow (Wichita State University)

============================================================================

--------------------------
From: zicari@informatik.uni-frankfurt.de (Roberto Zicari)
Subject: TAPOS
Date: Wed, 9 Feb 94 14:47:34 MEZ

                              Call for Papers

               Theory and Practice of Object Systems (TAPOS)
               =============================================

Editors in chief:

   Karl Lieberherr, Northeastern University, Boston, Massachusetts
   Roberto Zicari, Johann Wolfgang Goethe University, Frankfurt, Germany

**************
Aims and Scope
**************

Theory and Practice of Object Systems is an archival, peer reviewed journal dedicated to publishing high quality research results selected primarily in areas of Object Technology including but not limited to:

 - Programming Languages and Models
 - Foundations, Semantics, Type Theory
 - Database Management Systems and Database Languages
 - Concurrency
 - Distribution
 - Software Engineering and Software Development Tools and Environments
 - Formal Specification
 - Metrics and Evaluation
 - Analysis and Design Methods
 - Novel Applications
 - Operating Systems

Contributions in other areas of object-based computing are also welcome.
Research contributions on these aspects will be collected under the interdisciplinary umbrella of the object-oriented approach they have in common rather than from the point of view of the parent discipline.

Theoretical papers should either break significant new ground or unify and extend existing theories. Systems papers should emphasize the underlying principles and important discoveries, backed up by architectural and implementation details.

Published quarterly, Theory and Practice of Object Systems (TAPOS) disseminates new, but long lasting concepts and results of high quality useful to researchers and practitioners of object technology. The main goal of TAPOS is to make a fundamental contribution to the growth and consolidation of a scientific object community with high intellectual standards. The journal is a service to the object community which provides a forum for stringently refereed, noteworthy, and relevant results.

*********************
How to submit a paper
*********************

The editors-in-chief encourage the submission of contributions from all parts of the world. Five (5) copies of submitted articles should be sent to one of the editors-in-chief. The editors-in-chief will assign the article to an Associate Editor whose subject area expertise is appropriate to the article's subject. Published papers will include the name of the Associate Editor who managed the refereeing process.

A special transfer of copyright agreement, signed and executed by the author, must be enclosed with each manuscript submission. (If the article is a work made for hire, the agreement must be signed by the employer.) If a paper is not accepted, the copyright agreement will be destroyed. Copies of the copyright agreement may be obtained from the editors-in-chief through e-mail.

The corresponding author will receive 25 free reprints. There is no page charge to authors.
Papers are processed with the understanding that they have not been published, submitted or accepted for publication elsewhere.

Please submit your paper to either of the following addresses:

   Professor Karl Lieberherr
   Editor, TAPOS
   Northeastern University
   College of Computer Science
   125 Cullinane Hall
   Boston, MA 02115-9959
   U.S.A.
   lieber@CCS.neu.EDU

   Professor Roberto Zicari
   Editor, TAPOS
   Johann Wolfgang Goethe-Universitaet
   Fachbereich Informatik (20)
   Robert Mayer Strasse 11-15
   D-60325 Frankfurt am Main, Germany
   zicari@informatik.uni-frankfurt.de

All other correspondence concerning reprints, subscriptions, etc. should be sent to Ms. Diana Cerra, Professional and Trade Division, John Wiley and Sons, Inc., 605 Third Avenue, New York, NY 10158, USA. Ms. Cerra's email address is dcerra@jwiley.com.

******************
Associate editors:
******************

Professor Gul Agha
Department of Computer Science
1304 W Springfield Ave
University of Illinois
Urbana, IL 61801
Areas: concurrent programming languages, semantics, parallel computing
-----------------------------------------------------------
Dr. H. V. Jagadish
Computing Systems Research Laboratory, MH 2T204
AT&T Bell Laboratories
600 Mountain Ave.
Murray Hill, NJ 07974
Areas: object-oriented databases
-----------------------------------------------------------
Professor TSE Maibaum
Head, Department of Computing
Imperial College of Science Technology and Medicine
180 Queen's Gate
London SW7 2BZ
UK
Areas: formal methods, specification and implementation, concurrency and real time, modularisation.
-----------------------------------------------------------
Dr. Jose Meseguer
Computer Science Laboratory
SRI International
333 Ravenswood Avenue
Menlo Park, CA 94025, USA
Areas: mathematical foundations of OOP, formal specification of OO systems, declarative approaches to concurrent OOP
-----------------------------------------------------------
Professor Atsushi Ohori
Research Institute for Mathematical Sciences
Kyoto University
Sakyo-ku, Kyoto 606-01
Japan
Areas: type systems, data models, database programming language
-----------------------------------------------------------
Dr. Harold Ossher, H1-B26
IBM T. J. Watson Research Center
P. O. Box 704
Yorktown Heights, NY 10598
Areas: software composition, system structure, software development environments, object-oriented languages
-----------------------------------------------------------
Dr. Remo Pareschi
Rank Xerox Research Centre
6, chemin de Maupertuis
F-38240 Meylan
France
Areas: concurrency and distribution, object-oriented logic programming languages, object coordination schemas
-----------------------------------------------------------
Professor Michael I. Schwartzbach
Computer Science Department
Aarhus University
Ny Munkegade
DK-8000 Aarhus C
Denmark
Areas: type systems, semantics, theory, implementation.
-----------------------------------------------------------
Professor Mario Tokoro
Department of Computer Science
Keio University
3-14-1 Hiyoshi, Kohoku-ku, Yokohama 223
Japan
Areas: Concurrent and Distributed Computation Models, Programming Languages, Operating Systems, and MultiAgent Systems.
-----------------------------------------------------------
Professor Akinori Yonezawa
Dept. of Information Science
Faculty of Science
University of Tokyo
Hongo, Bunkyo-ku
Tokyo 113
Japan
Areas: concurrency, algorithms, language design, language implementation.
-----------------------------------------------------------

Editorial board:

   Abiteboul Serge, INRIA, Paris, France
   Bertino Elisa, University of Milano, Milano, Italy
   Bruce Kim, Williams College, Williamstown, MA
   Cardelli Luca, Digital, Systems Research Center, Palo Alto, CA
   Freeman-Benson Bjorn, Carleton University, Ottawa Canada
   Gehani Narain, AT&T Bell Labs, Murray Hill, NJ
   Ghezzi Carlo, Politecnico di Milano, Milano, Italy
   Gutknecht Juerg, Swiss Federal Institute of Technology, Zurich, Switzerland
   King Roger, University of Colorado, Boulder, CO
   Koskimies Kai, University of Tampere, Tampere, Finland
   Mandrioli Dino, Politecnico di Milano, Milano, Italy
   Mitchell John, Stanford University, Palo Alto, CA
   Palsberg Jens, Northeastern University, Boston, MA
   Pirahesh Hamid, IBM Almaden, Almaden, CA
   Reif John, Duke University, Durham, NC
   Reuter Andreas, University of Stuttgart, Stuttgart, Germany
   Scholl Marc, University of Ulm, Ulm, Germany
   Soley Richard, Object Management Group, Framingham, MA
   Zdonik Stanley, Brown University, Providence, RI

==================================================================

Free Sample issue and Subscription Order Form
Theory and Practice of Object Systems

Please enter my subscription to Theory and Practice of Object Systems
Volume 1, 1995, 4 issues, ISSN 1074-3227 at the rate I have selected

   Personal rate        __ $60 US and Can.    __ $80 Outside North America
   Institutional rate   __ $170 US and Can.   __ $210 Outside North America

Prices include shipping, handling, and packing charges worldwide. Air service included in the subscription price outside the U.S. Personal rate subscriptions are available to individuals and must be prepaid. Subscriptions are entered on a calendar-year basis only. The latest issue, as well as all published issues of the current volume, will be shipped after your payment is received.

__Please send me a FREE sample issue.

Method of Payment:

   __Check enclosed. All checks must be drawn on a U.S. bank and payable to John Wiley & Sons.
   __Purchase order enclosed.
   __Charge my credit card   __MasterCard   __Visa   __American Express

Card Number _______________________________ Expiry _____

Signature______________________________________________

Name____________________________________________________________

John Wiley and Sons, att. Diane Cerra, 605 Third Avenue, New York, NY 10158, USA