Knowledge Discovery Nuggets Index


To KD Mine: main site for Data Mining and Knowledge Discovery.
Here is how to subscribe to KD Nuggets
Past Issues: 1997 Nuggets, 1996 Nuggets, 1995 Nuggets, 1994 Nuggets, 1993 Nuggets


Knowledge Discovery Nuggets(TM) 97:28, e-mailed 97-09-30

News:
* ComputerWorld: Data Mining lifts ski marketing,
  http://www.computerworld.com:8080/home/online9697.nsf/All/970922ski
* R. Kohavi, Two large census datasets available from SGI,
  http://reality.sgi.com/ronnyk/

Publications:
* Randall Caldwell, J. of Computational Intelligence in Finance:
  Special issue on Complexity and Dimensionality Reduction,
  http://ourworld.compuserve.com/homepages/ftpub/call.htm
* Wolfgang Banzhaf, Genetic Programming Book Announcement,
  http://ls11-www.informatik.uni-dortmund.de/people/banzhaf/publications.html

Siftware:
* Jie Cheng, belief network learning system,
  http://193.61.148.131/jcheng/bnpc.htm
* S. Ananyan, Megaputer Intelligence Update,
  http://www.megaputer.ru

Positions:
* Kenneth D Kopple, Research position at SmithKline Beecham
  Pharmaceuticals,
  http://www.sb.com/careers

Meetings:
* R. Zicari, Bill Gates to keynote COMDEX Internet & Object World,
  Oct 7-10, 1997, Frankfurt, Germany,
  http://www.ltt.de
* ECML-98, European Conf. on Machine Learning,
  Chemnitz, Germany, April 21-24 1998,
  http://www.tu-chemnitz.de/informatik/ecml98/
* Trish Carbone, CFP: AFCEA Federal Data Mining Symposium,
  Washington, D.C., December 16-17, 1997,
  http://www.afcea.org
* Ed Rigdon, 1998 Summer American Marketing Association meeting,
  August 15-18, 1998, Boston, MA,
  http://www.ama.org/conf/summer/98scall.htm

    --
    KD Nuggets is a newsletter for the
    Data Mining and Knowledge Discovery community, focusing on the
    latest research and applications.

    Submissions are most welcome and should be emailed, with a
    DESCRIPTIVE subject line (and a URL) to gps.
    Please keep CFP and meetings announcements short and provide
    a URL for details.

    To subscribe, see
    http://www.kdnuggets.com/subscribe.html

    KD Nuggets frequency is 2-3 times a month.
    Back issues of KD Nuggets, a catalog of data mining tools
    ('Siftware'), pointers to Data Mining Companies, Relevant Websites,
    Meetings, and more are available at the Knowledge Discovery Mine site
    at
    http://www.kdnuggets.com/


    -- Gregory Piatetsky-Shapiro (editor)
    gps

    ********************* Official disclaimer ***************************
    All opinions expressed herein are those of the contributors and not
    necessarily of their respective employers (or of KD Nuggets)
    *********************************************************************

    ~~~~~~~~~~~~ Quotable Quote ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    Rearranging the letters of 'Data Mining' gives:

    A giant mind.
    In mad giant.
    Am giant din.
    A tin, dim nag.
    -- perhaps descriptions of Data Mining researchers?
    GPS (thanks to Genius2000 anagram software)


    Date: Sat, 20 Sep 1997 13:09:33 -0400
    From: computerworld_weekly (cwflash_mail@computerworld.com)
    Subject: COMPUTERWORLD on Data Mining the Ski Market

    [Here, finally, is a data mining project with really good lifts.
    GPS]

    CORPORATE STRATEGIES

    'Data mining lifts ski marketing'

    Until American Skiing Co. started building a data warehouse and
    data mining tools, its marketing efforts were headed downhill.
    The ski resort conglomerate was having a hard time keeping track
    of critical information -- as basic as who its customers were.
    But under its new, centralized approach, American Skiing is able
    to keep track of 2 million skiers, offer bonuses to repeat
    customers and relate to every one of them on a more personalized
    basis.

    Computerworld: page 43
    @Computerworld:
    http://www.computerworld.com:8080/home/online9697.nsf/All/970922ski



    From: Ronny Kohavi (ronnyk@starry.engr.sgi.com)
    Date: Thu, 18 Sep 1997 23:11:23 -0700
    Subject: Two large census datasets available from SGI

    A year ago, we at Silicon Graphics created the 'adult' dataset, which
    is now available at UCI. Thanks to Terran Lane, who interned here this
    summer, we now have two larger files based on two years of real US
    census data (unlike the previous dataset, these are not filtered
    to adults only).

    The files are challenging for scale-up experiments because the
    training sets are larger than common UCI files: 101MB and 47MB respectively.

    The files are available at:
    http://reality.sgi.com/ronnyk/census-income.tar.gz
    http://reality.sgi.com/ronnyk/census-year.tar.gz

    The files are in the standard UCI/C4.5 format with some documentation
    on the attributes.
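
    For readers unfamiliar with the UCI/C4.5 layout, the following is a
    minimal Python sketch of a reader for it, assuming the usual
    conventions: a .names file that lists the classes first and then one
    'attribute: type' line per attribute, '|' starting a comment, and a
    comma-separated .data file with the class label last and '?' marking
    a missing value. The function names and the toy sample are invented
    for illustration; they are not the attributes of the census files.

```python
def _clean(raw):
    """Drop a '|' comment, surrounding whitespace and the trailing period."""
    return raw.split("|", 1)[0].strip().rstrip(".")


def parse_names(text):
    """Parse a C4.5-style .names description into (classes, attributes)."""
    classes, attributes = None, []
    for raw in text.splitlines():
        line = _clean(raw)
        if not line:
            continue
        if classes is None:
            # The first non-comment line lists the class labels.
            classes = [c.strip() for c in line.split(",")]
            continue
        name, _, spec = line.partition(":")
        spec = spec.strip()
        if spec == "continuous":
            attributes.append((name.strip(), "continuous"))
        else:
            # Discrete attribute: keep its list of allowed values.
            attributes.append((name.strip(), [v.strip() for v in spec.split(",")]))
    return classes, attributes


def parse_data(text, attributes):
    """Parse C4.5-style .data lines into (record, label) pairs."""
    rows = []
    for raw in text.splitlines():
        line = _clean(raw)
        if not line:
            continue
        *values, label = [f.strip() for f in line.split(",")]
        record = {}
        for (name, kind), value in zip(attributes, values):
            if value == "?":
                record[name] = None          # missing value
            elif kind == "continuous":
                record[name] = float(value)
            else:
                record[name] = value
        rows.append((record, label))
    return rows


# Toy sample, invented for this sketch (NOT the real census attributes).
SAMPLE_NAMES = """\
| hypothetical two-attribute description
low, high.
age: continuous.
education: school, college, graduate.
"""
SAMPLE_DATA = """\
39, college, high
25, ?, low
"""

classes, attributes = parse_names(SAMPLE_NAMES)
rows = parse_data(SAMPLE_DATA, attributes)
```

    For scale-up experiments on files of this size one would read the
    .data file line by line rather than holding the whole text in memory;
    the sketch keeps everything in memory only for brevity.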

    --

    Ronny Kohavi (ronnyk@sgi.com,
    http://robotics.stanford.edu/~ronnyk)
    Engineering Manager, Analytical Data Mining.


    Date: Sat, 20 Sep 1997 17:27:16 -0400
    Subject: CFP: J. of Computational Intelligence in Finance: Special issue on
    Complexity and Dimensionality Reduction in Finance
    From: Randall Caldwell

    URL:
    http://ourworld.compuserve.com/homepages/ftpub/call.htm

    Journal of Computational Intelligence in Finance
    Call for Papers
    Special Issue on
    'Complexity and Dimensionality Reduction in Finance'


    The Journal of Computational Intelligence in Finance, a peer-reviewed
    technical journal, published by Finance & Technology Publishing, is
    seeking papers for review and publication in 1998 on 'Complexity and
    Dimensionality Reduction in Finance'.

    The Journal of Computational Intelligence in Finance publishes applied
    research and practical applications of high quality that are based on
    sound theoretical, empirical or quantitative analysis. It provides the
    international forum for the convergence of the new multi-disciplined
    field of computational intelligence in finance.

    Papers published in the Journal are eligible for entry in an Annual
    Essay Award Contest. The Editorial Advisory Board of the Journal
    selects the best paper for which a cash award is presented each year.

    SPECIAL TOPIC

    Complexity and Dimensionality Reduction in Finance

    PUBLICATION DATE

    May 1998

    PAPER SUBMISSION DEADLINE

    December 15, 1997

    SCOPE

    In the broad sense, all intelligent perception and data
    understanding seeks to reduce redundancy in data and, thus,
    its complexity and dimensionality. This special issue of JCIF
    focuses on a narrower scope: the theories, methods and
    algorithms for mapping financial data from its original
    representation into another form with reduced complexity and/or
    dimensionality that appear beneficial to financial applications.

    Of particular interest are techniques which can serve as
    preprocessors to data-driven models and data mining technologies,
    including those which address or utilize one or more of the
    following: complexity and dimensionality characterization,
    identification and analysis; data compression; feature extraction
    techniques; regularity discovery; inductive reasoning; randomness
    tests; algorithmic entropy; informational distance; minimal
    description length; adaptive and nonlinear PCA and other alternatives
    to standard forms of linear PCA; finite sequence statistics; variable
    combining methods; data filtering; categorical versus continuously-
    valued inputs; high-dimensional visualization analysis; and input
    space reduction techniques.

    MOTIVATION

    In finance, we encounter an unavoidable dilemma: an
    interest in collecting and utilizing as much data as possible in its
    original form so that potentially useful information is not lost,
    although this often results in data with high complexity and/or
    dimensionality that increases costs and reduces performance. Despite
    this, the notion that more input data is better persists.

    The need for managing complexity and dimensionality arises from eroding
    profit margins, diminishing arbitrage opportunities, lowered barriers to
    entry, increasingly segmented markets, increased costs, and, in general,
    the reduced performance (e.g., generalization ability) of tools applied,
    such as data-driven models and data mining technologies. Thus, the topic
    of this special issue represents very important areas of applied research
    across multiple disciplines relevant to computational intelligence in
    finance.

    ABSTRACTS AND PAPERS

    For submission requirements on this CFP and further details, see:
    http://ourworld.compuserve.com/homepages/ftpub/call.htm

    or contact: Editors, JCIF, P.O. Box 764, Haymarket, VA 20168, USA

    or send inquiry to: ftpub@compuserve.com



    Subject: Genetic Programming Book Announcement
    Date: Tue, 23 Sep 1997 15:36:57 PDT
    From: Wolfgang Banzhaf (banzhaf@ICSI.Berkeley.EDU)

    Wolfgang Banzhaf, Peter Nordin, Robert E. Keller, Frank D. Francone
    Genetic Programming --- An Introduction
    On the Automatic Evolution of Computer Programs and Its Applications

    With a Foreword
    by

    John R. Koza

    Publication date: November 1997
    Joint publication by: Morgan Kaufmann Publishers, San Francisco
    dpunkt.verlag, Heidelberg

    approx. 480 pp., 130 figures
    appr. US $ 50,-
    ISBN: 1-55860-510-X / 3-920993-58-6

    For details, see
    http://ls11-www.informatik.uni-dortmund.de/people/banzhaf/publications.html

    FROM THE FOREWORD BY J.R. KOZA

    Genetic programming addresses the problem of automatic programming,
    namely the problem of how to enable a computer to do useful things
    without instructing it, step by step, on how to do it. The rapid growth
    of the field of genetic programming reflects the growing recognition
    that, after half a century of research in the fields of artificial
    intelligence, machine learning, adaptive systems, automated logic, expert
    systems, and neural networks, we may finally have a way to achieve
    automatic programming. Genetic programming is fundamentally different
    from other approaches in terms of (i) its representation (namely,
    programs), (ii) the role of knowledge (none), (iii) the role of logic
    (none), and (iv) its mechanism (gleaned from nature) for getting to a
    solution within the space of possible solutions.


    FROM THE FIRST SECTION OF THE BOOK

    Automated programming will be one of the most important areas of
    computer science research over the next twenty years. Hardware
    speed and capability has leapt forward exponentially. Yet software
    consistently lags years behind the capabilities of the hardware. The
    gap appears to be ever increasing. Demand for computer code
    keeps growing but the process of writing code is still mired in the
    modern day equivalent of the medieval 'guild' days. Like swords in
    the 15th century, muskets before the early 19th century and books
    before the printing press, each piece of computer code is, today,
    handmade by a craftsman for a particular purpose.

    The history of computer programming is a history of attempts to move
    away from the 'craftsman' approach -- structured programming, object
    oriented programming, object libraries, rapid prototyping. But each of
    these advances leaves the code that does the real work firmly in the
    hands of a craftsman, the programmer. The ability to enable computers
    to learn to program themselves is of the utmost importance in freeing
    the computer industry and the computer user from code that is obsolete
    before it is released.


    Wolfgang Banzhaf
    Department of Computer Science
    University of Dortmund
    GERMANY
    http://ls11-www.informatik.uni-dortmund.de/people/banzhaf/



    Date: Mon, 22 Sep 1997 12:53:34 -0000
    From: 'Jie Cheng' (j.cheng@ulst.ac.uk)
    Subject: A belief network learning system
    URL:
    http://193.61.148.131/jcheng/bnpc.htm

    A belief network learning system is now available for download. It includes
    a wizard-like interface and a construction engine.

    Name: Belief Network Power Constructor
    Version: 1.0 Beta 1
    Platforms: 32-bit Windows systems (Windows 95/NT)
    Input: A data set with discrete values in the fields (attributes) and
    optional domain knowledge (attribute ordering, partial ordering, direct
    causes and effects).
    Output: A network structure of the data set.

    Main Features:

    1. Easy to use. It gathers the necessary input information through 5
    simple steps.

    2. Accessibility. Supports most of the popular desktop database and
    spreadsheet formats, including MS Access, dBase, FoxPro, Paradox, Excel
    and text file formats. It also supports remote database servers such as
    Oracle and SQL Server through ODBC.

    3. Reusable. The engine is an ActiveX DLL, so you can easily integrate
    it into your belief network, data mining or knowledge base system
    for Windows 95/NT.

    4. Efficient. The engine constructs belief networks by using conditional
    independence (CI) tests. In general, it requires O(N^4) CI tests;
    when the attribute ordering is known, it requires O(N^2) CI tests,
    where N is the number of attributes (fields).

    5. Reliable. A modified mutual information calculation is used as the CI
    test to make it more reliable when the data set is not large enough.

    6. Supports domain knowledge. Complete ordering, partial ordering and
    causes and effects can be used to constrain the search space and
    therefore speed up the construction process.

    7. Running time is linear in the number of records.
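
    To make the idea of a mutual-information CI test concrete, here is a
    minimal Python sketch. It computes the plain empirical conditional
    mutual information I(X;Y|Z) from discrete records and compares it with
    a cutoff. The announcement does not describe the system's 'modified'
    calculation, so this is the textbook estimate and not the method used
    by the Power Constructor; the function names, the 0.01-bit threshold
    and the toy records are assumptions made for illustration.

```python
import math
from collections import Counter


def conditional_mutual_information(records, x, y, given=()):
    """Empirical I(X;Y|Z) in bits; `records` is a list of dicts of discrete values."""
    n = len(records)
    xyz, xz, yz, z = Counter(), Counter(), Counter(), Counter()
    for r in records:
        zv = tuple(r[g] for g in given)      # joint value of the conditioning set
        xyz[(r[x], r[y], zv)] += 1
        xz[(r[x], zv)] += 1
        yz[(r[y], zv)] += 1
        z[zv] += 1
    total = 0.0
    for (xv, yv, zv), c in xyz.items():
        # p(x,y,z) * log2( p(z) p(x,y,z) / (p(x,z) p(y,z)) ), written with counts
        total += (c / n) * math.log2(c * z[zv] / (xz[(xv, zv)] * yz[(yv, zv)]))
    return total


def looks_independent(records, x, y, given=(), threshold=0.01):
    """Treat X and Y as conditionally independent when the estimate is below
    `threshold` bits (an arbitrary cutoff chosen for this sketch)."""
    return conditional_mutual_information(records, x, y, given) < threshold


# Toy chain A -> B -> C in which each attribute copies its parent.
records = [{"A": 0, "B": 0, "C": 0}, {"A": 1, "B": 1, "C": 1}]

marginal = conditional_mutual_information(records, "A", "C")            # A and C dependent
conditional = conditional_mutual_information(records, "A", "C", ("B",))  # independent given B
```

    In the toy chain, A and C share one full bit of information, but the
    estimate drops to zero once B is given, which is the kind of evidence
    a CI-based learner can use to decide that no direct A-C edge is
    needed. One pass over the records fills the count tables, so the cost
    of a single test grows linearly with the number of records.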


    The system can be downloaded from the web site:
    http://193.61.148.131/jcheng/bnpc.htm

    Suggestions and comments are welcome.

    ----------------------------------------------------
    Jie Cheng email: j.cheng@ulst.ac.uk
    16J24, Faculty of Informatics, UUJ, UK. BT37 0QB
    Tel: 44 1232 366500 Fax: 44 1232 366068
    http://193.61.148.131/jcheng/

    ----------------------------------------------------


    From: ananyan@proteus.iucf.indiana.edu
    Date: Mon, 29 Sep 1997 17:59:46 -0500
    Subject: Megaputer Update

    Our company, Megaputer Intelligence Ltd. (MPI) based in Moscow, Russia, is
    a leading provider of advanced data mining and decision support
    solutions.

    Below I provide some updated information about the company and products.
    I would like to inform you that MPI has recently rolled out new data
    mining systems:

    -- PolyAnalyst 3.0 for Windows NT

    -- PolyAnalyst Knowledge Server (Client/Server architecture).

    PolyAnalyst 3.0 is a next generation automated knowledge discovery
    system. PolyAnalyst utilizes the newest AI technology: Evolutionary
    Programming and Symbolic Knowledge Acquisition. It presents discovered
    knowledge explicitly as rules and algorithms or predicting tables.
    PolyAnalyst achieves phenomenal results at major international contests
    of systems for data mining. Among users of the system are securities
    traders, bankers, doctors, and utility companies.

    An evaluation copy of PolyAnalyst 3.0 for the Windows NT platform
    can now be downloaded free of charge from our new Web site at
    http://www.megaputer.ru

    The corresponding user manual can be requested directly from
    megaputer@glas.apc.org

    Megaputer Intelligence recently opened an office in the USA:

    tel: 812-325-3026
    fax: 812-339-1646
    Megaputer Intelligence
    1518 E Fairwood Drive
    Bloomington IN 47408

    Sergei Ananyan
    Megaputer Intelligence



    Date: Thu, 25 Sep 1997 09:42:51 -0400
    Subject: Research position at SmithKline Beecham Pharmaceuticals
    From: Kenneth D Kopple @ SB_PHARM_RD


    EXCEPTIONAL OPPORTUNITY IN CHEMINFORMATICS

    At SmithKline Beecham Pharmaceuticals the application of combinatorial
    chemistry and high-throughput screening is resulting in an
    extraordinary increase in the numbers of compounds and corresponding
    data being generated for drug discovery. We are expanding our
    transnational Cheminformatics group to work closely with medicinal
    chemists and screening scientists in the UK and US in the collection,
    transfer, manipulation and exploitation of these data.

    This expansion opens an opportunity (based at either our US or UK
    state-of-the-art facilities) in Knowledge Discovery in Databases,
    covering the development and application of tools to find
    relationships within and among large chemical and biological
    databases:

    GROUP LEADER - KNOWLEDGE DISCOVERY

    Requirements include a PhD in physical, chemical, biological or
    computer sciences or statistics, with at least 5 years' experience in
    pattern recognition, machine learning or chemometrics and a proven
    record of performance in chemical or biological database analysis.

    Job Code H7-0273

    As part of our commitment to attract and retain the best, SmithKline
    Beecham provides a fully competitive salary/benefits/relocation
    package. To be considered for this outstanding opportunity,
    mail or e-mail your curriculum vitae, indicating job code, to the
    address below. For more information on SmithKline Beecham, visit
    our Web site at www.sb.com/careers.

    We are an Equal Opportunity Employer.

    SmithKline Beecham
    Job Code H7-0273
    P.O. Box 2646
    Bala Cynwyd, PA 19004, USA
    e-mail: smithkline@jwtworks.com

    From: 'Prof. Zicari' (zicari@informatik.uni-frankfurt.de)
    Date: Thu, 18 Sep 1997 16:49:25 +0200 (METDST)
    URL:
    http://www.ltt.de

    ***********************************************************************
    Bill Gates to keynote COMDEX Internet & Object World Frankfurt '97.

    Microsoft Chairman & CEO to keynote on Internet - Intranet Communications
    and new Trends.

    ***********************************************************************

    Frankfurt -- August 1997. Bill Gates, Chairman & CEO, Microsoft
    Corporation, will present a keynote address at COMDEX Internet & Object
    World Frankfurt '97. Gates will focus on new trends in Internet and
    Intranet Communications.

    The Internet markets are exploding, and Dataquest analysts are
    predicting a worldwide EDI market of US $1.9 billion in the year 2000.
    The Internet will play a major role in the way business will be done in the
    future. This challenge will be addressed in Bill Gates' keynote.

    The keynote address will take place on Wednesday, October 8, 1997,
    at the Sheraton Conference Center Frankfurt (Airport). Admission is
    free to all attendees of COMDEX Internet & Object World Frankfurt '97.

    COMDEX Internet & Object World Frankfurt will take place
    October 7-10, 1997, in the Sheraton Conference Center Frankfurt (Airport),
    Frankfurt/Main, Germany.
    The show will consist of two conferences side by side and one combined
    exhibition with over 100 exhibitors.
    COMDEX Internet is the premier COMDEX event in Germany, and it focuses
    on the business use of the Internet. Object World Frankfurt, now in its
    sixth year, has experienced constant growth and is recognized as the most
    important event for object technology in Europe.


    Show management is expecting more than 4,000 visitors to the show.

    The complete conference program of COMDEX Internet & Object World
    Frankfurt '97 is available at:
    http://www.ltt.de

    COMDEX Internet and Object World Frankfurt '97 are produced by SOFTBANK
    COMDEX Inc. and LogOn Technology Transfer GmbH.

    NOTE: For media registration and accreditation to COMDEX Internet &
    Object World Frankfurt '97, call LogOn PR at +49-6173-955855.
    (Annegret Claushues)


    Date: Wed, 10 Sep 1997 18:32:19 +0200
    From: crecml98@lri.fr (Conf ECML98)
    Subject: ECML'98 - Call for Papers

    TENTH EUROPEAN CONFERENCE ON MACHINE LEARNING (ECML'98)
    Chemnitz, Germany, April 21-24 1998

    Up-to-date information and the full call for papers are at
    http://www.tu-chemnitz.de/informatik/ecml98/

    GENERAL INFORMATION:

    The 10th European Conference on Machine Learning (ECML-98) will be
    held in Chemnitz (formerly Karl-Marx-Stadt, near Dresden), Germany,
    from April 21 to 24, 1998.

    PROGRAM

    The scientific program (April 21 - 23) will include invited talks,
    presentations of accepted papers, summary and commenting sessions on
    current and upcoming issues in machine learning, tutorials, an
    industrial session as well as poster and demonstration
    sessions. Friday, April 24, will be devoted to workshops.

    Separate calls for proposals will be issued (please consult the ECML'98
    web page or contact the ECML'98 chairpersons at ecml98@lri.fr for
    details).


    RELEVANT RESEARCH AREAS

    Submissions are invited that describe empirical and theoretical
    research in all areas of machine learning. In addition, papers from
    related disciplines that deal with adaptive intelligence,
    (semi-)automated knowledge acquisition, or (semi-)automated knowledge
    organization are welcome.

    Submissions that describe the application of machine learning methods
    to real-world problems are encouraged, but such submissions should
    speak of general issues of machine learning, perhaps illustrating
    novel learning methods or demonstrating the utility of established
    methods in previously unexplored settings.

    IMPORTANT DATES:

    Submission deadline:         31 October 1997
    Notification of acceptance:  13 January 1998
    Camera ready copy:           9 February 1998
    Conference:                  21-24 April 1998

    IMPORTANT ADDRESS

    Submitted papers and poster/demonstration descriptions should be
    sent to:

    Claire Nedellec and Celine Rouveirol (ECML'98)
    LRI, Bat 490
    Universite Paris-Sud
    F-91405 Orsay Cedex FRANCE
    E-mail: ecml98@lri.fr



    Date: Fri, 19 Sep 1997 09:45:59 -0400
    From: Trish Carbone (carbone@mitre.org)
    Subject: CFP: AFCEA Federal Data Mining Symposium

    FIRST FEDERAL DATA MINING SYMPOSIUM
    J. W. Marriott Hotel in Washington, D.C.
    December 16-17, 1997


    On behalf of AFCEA and the participating commands, we are pleased to invite
    you to the first Federal Data Mining Symposium!

    The Federal Data Mining Symposium will spotlight the technical advances in
    and applications of Data Mining in the government community. The need for
    better and more automated methods of analysis is particularly important as
    the amount of data being collected and stored increases dramatically.
    Analysts must be knowledgeable about new types of techniques, both
    statistical and artificial intelligence techniques, in order to better find
    patterns, correlations, trends, and summaries in the wealth of data.

    The major goals of the Symposium are to exchange information and ideas on
    the role of data mining and present requirements and proposed solutions,
    provide discussions on the broad range of applicable technologies, provide
    policy guidance applicable to DoD and civil agency information resource
    managers, and identify and encourage service-unique and government-wide
    knowledge and use of the important technology.

    The Federal Data Mining Symposium will focus on three overall areas:
    * User requirements for better analysis methods, including data mining
    techniques
    * Applications of data mining that have been constructed and fielded,
    both for structured data and for textual and other multimedia data
    * Technology for addressing the requirements

    Topics of interest include:
    * User requirements for data mining
    * Applications of data mining
    * Data mining from multimedia data (e.g., text, imagery, geospatial)
    * Lessons learned from constructing data mining systems
    * Solutions to data mining problems (noisy data, uncertain data, incomplete
    data, dynamic data)
    * Data cleansing as part of the data mining process
    * Visualization as part of the data mining process
    * Validation and verification of discovered knowledge
    * Security/privacy concerns and solutions
    * Employment of discovered knowledge in decision support or other systems

    Data users, analysts, administrators, managers, developers, researchers,
    theoreticians, and vendors are cordially invited to attend and to submit
    papers for presentation at the Federal Data Mining Symposium. Papers will
    be selected based on relevance to the conference and technical quality.
    Selected papers will be presented at the conference and/or published in the
    proceedings. Exhibit Space Available!

    Papers Due No Later Than October 31, 1997

    For More Information:

    Telephone: (703) 631-6126
    Toll Free: (800) 336-4583
    Fax: (703) 631-6133
    E-mail:
    Web: http://www.afcea.org

    Mail: AFCEA International
    Events Department, Attn: Data Mining Registration
    4400 Fair Lakes Court
    Fairfax, VA 22033-3899


    Date: Mon, 29 Sep 1997 10:14:02 -0500
    From: Ed Rigdon (MKTEER@langate.gsu.edu)
    Subject: Re: CFP: 1998 Summer AMA

    The American Marketing Association's 1998 Summer Educators
    Conference will be held August 15-18 in Boston. The CFP is at:
    http://www.ama.org/conf/summer/98scall.htm

    The CFP for the 'Research Methodology' track especially invites
    papers and special session proposals dealing with 'automated data
    analysis/data mining.' As chair of this track, I would like to see
    presentations that facilitate an 'outbreak' of KDD work within the
    academic Marketing community. Along with competitive submissions, I
    also need volunteers to serve as reviewers for these submissions.

    Thanks.--Ed Rigdon (erigdon@gsu.edu)

