Knowledge Discovery Nuggets 97:18, e-mailed 97-05-27

News:
* Ronny Kohavi, Silicon Graphics' MineSet used in Incyte's LifeTools 3D
  http://www.incyte.com/press/1997/PR9712-LT3D.html
* R. Zicari, COMDEX Internet Application Awards,
  http://www.ltt.de
* Brij Masand, HPCwire: Robert Grossman discusses managing, mining
  large data sets

Publications:
* GPS, First Issue of DMKD journal is available on-line in PDF format,
  http://www.wkap.nl/kapis/CGI-BIN/WORLD/kaphtml.htm?DAMISAMPLE
* Andy Pryke, Bibliography of KDD and Data Mining Papers,
  http://www.cs.bham.ac.uk/~anp/papers.html

Meetings:
* D. Fischer, COLT/ICML Early Registration deadline June 2,
  http://cswww.vuse.vanderbilt.edu/~mlccolt/
* Jan Komorowski, PKDD'97 -- Call For Participation,
  http://www.idi.ntnu.no/pkdd97/
* David Heckerman, Summer School on PROBABILISTIC GRAPHICAL MODELS,
  http://www.newton.cam.ac.uk/programs/nnm.html
* Vasant Honavar, CFP: Workshop on Automata Induction,
  Grammatical Inference, and Language Acquisition at ICML-97,
  http://www.cs.iastate.edu/~honavar/mlworkshop.html
* Honghua Dai, KDEX-97: IEEE Knowledge and Data Engineering
  Exchange Workshop,
  http://www.sd.monash.edu.au/kdex-97
* Gordon, CFP: ICML-97 workshop on Reinforcement Learning,
  http://www.cs.cmu.edu/~ggordon/ml97ws

--
KD Nuggets is a newsletter for the Data Mining and Knowledge
Discovery community, focusing on the latest research and applications.

Submissions are most welcome and should be emailed, with a
DESCRIPTIVE subject line (and a URL), to gps.
Submissions may be edited for length.
Please keep CFP and meeting announcements short and provide
a URL for details.

To subscribe, see
http://www.kdnuggets.com/subscribe.html

KD Nuggets frequency is 3-4 times a month.
Back issues of KD Nuggets, a catalog of data mining tools
('Siftware'), pointers to Data Mining Companies, Relevant Websites,
Meetings, and more are available at the Knowledge Discovery Mine site
at
http://www.kdnuggets.com/

-- Gregory Piatetsky-Shapiro (editor)
gps

    ********************* Official disclaimer ***************************
    All opinions expressed herein are those of the contributors and not
    necessarily of their respective employers (or of KD Nuggets)
    *********************************************************************

    ~~~~~~~~~~~~ Quotable Quote ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    'When you come to a fork in the road, take it.'
    - Yogi Berra -

    Date: Thu, 15 May 1997 22:22:53 -0700
    From: Ronny Kohavi (ronnyk@starry.engr.sgi.com)
    Subject: Silicon Graphics' MineSet used in Incyte's LifeTools 3D

    A recent press release by Incyte Pharmaceuticals Inc. announces
    LifeTools 3D, powerful data mining and visualization software based
    on Silicon Graphics' MineSet(tm) software suite of data analysis and
    visualization tools. In collaboration with Silicon Graphics, Incyte
    created customized functions that are specifically designed to help
    researchers view, explore, and identify novel genes within LifeSeq.

    See
    http://www.incyte.com/press/1997/PR9712-LT3D.html
    for details.

    --
    Ronny Kohavi (ronnyk@sgi.com, http://robotics.stanford.edu/~ronnyk)
    Engineering Manager, Analytical Data Mining.

    From: 'Prof. Zicari' (zicari@informatik.uni-frankfurt.de)
    Date: Fri, 9 May 1997 23:39:14 +0200 (METDST)
    Subject: COMDEX Internet Application Awards.

    News Release

    First COMDEX Internet Application Awards
    IBM, Microsoft and SUN to sponsor Awards Program for the new generation of
    Internet applications

    Frankfurt -- April 1997. The three leading IT companies IBM, MICROSOFT and
    SUN Microsystems will jointly support an international Awards Program
    designed for the new generation of Internet-based applications for
    business.

    The first COMDEX Internet Application Awards will be given out in the
    following three categories:

    * Best Intranet-based application for enterprise usage
      Focus: Use of an Intranet for institutional/corporate knowledge
      for competitive advantage.
    * Most Innovative Web Site
      Focus: Best or most innovative Web site with respect to user
      interface, ease of use, and innovative content.
    * Best Transactional Internet Application
      Focus: Database and interactive applications.

    The Award winners will be selected from the submissions by a jury of
    international experts. The Awards ceremony will take place on October 8,
    1997 at the trade show COMDEX Internet & Object World Frankfurt'97 (October
    7-10,1997, Sheraton Conference Center, Frankfurt/Main Airport).

    'Successful Internet technologies like Java confirm our view of the
    Internet as the future base for enterprise computing. The COMDEX Internet
    Application Awards program provides an excellent forum for honoring and
    supporting outstanding Internet applications. We are looking forward to an
    exciting contest', says Gert Haas, Marketing Director, SUN Microsystems,
    Germany.

    Microsoft's commitment to the Awards Program is explained by Karl-Heinz
    Breitenbach, Customer Unit Manager Internet & Developer Customer Unit,
    Microsoft Germany: 'The availability of all relevant information at work is
    the base for a fast and successful decision in a company. We therefore have
    taken the challenge of providing 'information at your fingertips' very
    early, and this is reflected by our current product line. Internet
    technology today makes it possible to rapidly and reliably represent
    information distributed across all branches of a company via a so-called
    Intranet
    solution. With the sponsorship of the COMDEX Internet Application Awards,
    Microsoft confirms its commitment to innovative Internet technologies which
    perfectly match our company goals.'

    Sanyaya Addanki, General Manager of Network Computing Solutions, IBM EMEA,
    explains IBM's motivation for a sponsorship: 'IBM is committed to providing
    companies with solutions that link business critical applications and data
    with the global reach and easy access of the web. We are proud to sponsor
    the COMDEX Internet Application Awards Program, which fosters the
    development of electronic business applications. Electronic business is the
    cornerstone of IBM's network computing vision.'

    To obtain the entry kit:

    * download it from the web at http://www.ltt.de
    * send an e-mail to LogOn@omg.org
    * call LogOn at +49-6173-9558-51

    COMDEX Internet and Object World Frankfurt '97 are produced by SOFTBANK
    COMDEX Inc. and LogOn Technology Transfer GmbH.
    The show is sponsored by: Object Management Group (OMG), A1-Solutions,
    Business Online, Computer Associates, Computer Zeitung, MID and redmond's.
    Internet and Wireless are sponsored by Omnilink Internet Service Center and
    ARtem.

    Information on Conferences and Exhibition:
    Christiane Sattler
    LogOn Technology Transfer GmbH
    Burgweg 14, D-61476 Kronberg/Ts., Germany
    phone: +49-6173-9558-53
    fax: +49-6173-9404-20
    e-mail: LogOn@omg.org
    Web: http://www.ltt.de


    [the following article is included with the permission of HPCwire. GPS]

    Date: Fri, 23 May 1997 14:00:49 -0400
    From: Brij Masand (brij@gte.com)
    Subject: ROBERT GROSSMAN DISCUSSES MANAGING, MINING LARGE DATA SETS

    [From H P C w i r e *** May 23, 1997: Vol. 6, No. 20 ***]

    ROBERT GROSSMAN DISCUSSES MANAGING, MINING LARGE DATA SETS
    by Alan Beck, editor in chief, HPCwire 05.23.97
    =============================================================================

    Chicago, Ill. -- Issues raised in the effective archiving, managing and
    mining of very large data sets have significant pragmatic repercussions
    throughout both commercial and scientific computing. To learn more about the
    state of the art in this area, HPCwire interviewed Robert Grossman, professor
    of mathematics, statistics and computer science at the University of Illinois
    at Chicago, president of Magnify, and principal researcher in the Terabyte
    Challenge.

    -------------------

    HPCwire: Please give an overview of the current status of the Terabyte
    Challenge, including funding sources and participants.

    GROSSMAN: 'The Terabyte Challenge is an open, distributed test bed for
    managing and mining massive data sets. The infrastructure for the Terabyte
    Challenge is provided by the NSF-sponsored National Scalable Cluster Project
    (NSCP) and its industrial partners. The NSCP philosophy is to use commodity
    components with high performance networking to build virtual platforms with
    supercomputing power. The software tools developed for the Terabyte Challenge
    seek to balance high performance computing with the high performance
    input/output required by data intensive and data mining applications.

    'Currently, the NSCP consists of approximately 25 nodes and 500 Gigabytes
    of disk at both UIC and UPenn, together with smaller clusters at the
    participating partners. The infrastructure will be more than doubling over
    the next few months to over 100 nodes and 2 Terabytes of disk. Unlike
    other centers, the NSCP is configured for managing and mining large data
    sets, ranging in size from 100 to 500 Gigabytes.

    'We are currently planning the third Annual Terabyte Challenge, which
    will take place at SC 97. The first two took place at Supercomputing 95 and
    96 (both won High Performance Computing Challenge Awards).

    'Currently, the University of Illinois at Chicago, the University of
    Pennsylvania, and the University of Maryland form the core academic team. Two
    industrial partners, HUBS (Philadelphia) and Magnify, Inc. (Chicago), will
    also be working closely on this year's Terabyte Challenge. Funding is provided
    by NSF to the NSCP Consortium, by DOE to UIC and UPenn, and by DOD to Magnify.
    We expect additional partners to join us. If interested, please contact RLG.

    'Current applications include mining scientific data (UIC and UPenn),
    mining medical data (UIC and UPenn), detecting network intrusions with data
    mining (Magnify, Inc.), and data intensive computing in support of virtual
    reality (HUBS).

    'The web site http://www.lac.uic.edu will contain additional information
    shortly.'

    HPCwire: What progress has been made in scaling algorithms for very large
    data sets?

    GROSSMAN: 'I use the 10x rule: one can expect to archive 10-100x more data
    than one can manage, and manage 10-100x more data than one can mine. This
    makes sense since archiving requires a simple retrieval of files or objects,
    managing requires the ability to perform simple queries, and mining requires
    statistically and numerically intensive queries. At SC 96, we mined data sets
    that were roughly 100-250 Gigabytes in size using 10-25 nodes. At SC 97, we
    hope to mine 500-1000 Gigabytes of data on 50-100 nodes. I want to emphasize
    that one can manage and perform simple queries of much larger data sets (up
    to tens of Terabytes), but the detailed data mining of even a few hundred
    gigabytes of data is a challenge today.'

    'Parallelizing data mining algorithms can be done in several ways. Most
    data mining algorithms are sufficiently compute-intensive that they work best
    when the data and the working space required for the algorithm fit into
    memory. For large data sets this is clearly not possible, and the
    challenge is to balance the i/o requirements of the algorithm with the cpu
    requirements. Several approaches are possible:

    'For the purposes here, we assume that the data mining process consists of
    several steps, including 1) extracting patterns, 2) using these patterns
    automatically to build predictive models, and 3) selecting or combining
    multiple predictive models to produce a single decision. In each of the four
    methods described next, one or more subsets of the data are chosen and mined.
    The methods differ in how the subsets are chosen: the subsets may be created
    by random draws, by a partition of the data, by a cover of the data, or by a
    range based query of the data.

    'In sample-based data mining, one samples a large data set and then
    extracts patterns or builds a model. This is the most common approach. It
    works well for patterns that are still easily found after down sampling. It
    has the advantage that the compute time is vastly reduced (since the data to
    be mined is vastly smaller) and the disadvantage that the patterns obtained
    are often not indicative of the whole data set -- this is closely related to
    the problem of over-fitting. This approach is most often not parallelized,
    although sometimes sampling can be done in parallel and the results combined
    into one model using model averaging techniques.
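
    [A minimal Python sketch of the sample-based approach, for
    illustration only; the toy fit_model (per-column means) stands in
    for a real mining algorithm and is not one of the NSCP tools:

      import random

      def fit_model(rows):
          # Toy stand-in for a real mining algorithm: summarize
          # each column by its mean.
          cols = list(zip(*rows))
          return [sum(c) / len(c) for c in cols]

      def sample_based_mine(data, sample_size):
          # Down-sample so the working set fits in memory, then
          # mine only the sample. Cheap, but patterns that appear
          # only in the full data set may be missed.
          sample = random.sample(data, min(sample_size, len(data)))
          return fit_model(sample)

      data = [[random.gauss(0, 1), random.gauss(5, 2)]
              for _ in range(100000)]
      print(sample_based_mine(data, sample_size=1000))

    GPS]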

    'In partition-based data mining, the data set is partitioned into distinct
    subsets which fit into memory, each partition is separately mined to produce
    a collection of predictive models, and then the predictive models are
    combined using model selection and model averaging techniques. This type of
    data mining is easily parallelized, since one (or more) processors can be
    assigned to each partition.
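
    [A sketch of the partition-and-average idea, again with the toy
    per-column-means fit_model above; a real system would spread the
    partitions across cluster nodes rather than local worker processes:

      from multiprocessing import Pool
      import random

      def fit_model(rows):
          # Same toy stand-in as above: per-column means.
          cols = list(zip(*rows))
          return [sum(c) / len(c) for c in cols]

      def partition_based_mine(data, n_parts):
          # Disjoint, memory-sized partitions; each is mined
          # independently (one worker per partition), and the
          # per-partition models are combined by simple averaging.
          parts = [data[i::n_parts] for i in range(n_parts)]
          with Pool(n_parts) as pool:
              models = pool.map(fit_model, parts)
          return [sum(m[j] for m in models) / len(models)
                  for j in range(len(models[0]))]

      if __name__ == '__main__':
          data = [[random.random()] for _ in range(10000)]
          print(partition_based_mine(data, n_parts=4))

    GPS]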

    'Cover-based data mining is similar to partition-based data mining, but
    the different subsets to be mined can be overlapping. This is closely
    related to what is called local mining, in which the patterns extracted use
    data which is localized in some fashion, say based on the N closest data
    points to a fixed reference point.

    'Attribute-based data mining creates different subsets to be mined by using
    an attribute-based query of the underlying data set. For example, all objects
    whose first attribute is less than 1.1 and whose second attribute is equal to
    'A', etc. are selected and then mined.
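
    [The subset-construction step for the last two methods can be
    sketched in a few lines; the record layout here is hypothetical,
    chosen to mirror the example in the text:

      def attribute_subset(data, max_first=1.1, second_value='A'):
          # Attribute-based selection: a range/equality query over
          # the record fields picks out the subset to mine.
          return [rec for rec in data
                  if rec[0] < max_first and rec[1] == second_value]

      def local_subset(data, ref, n):
          # Cover-based / local selection: the n records closest to
          # a fixed reference point; different reference points
          # yield overlapping subsets.
          return sorted(data, key=lambda rec: abs(rec[0] - ref))[:n]

      data = [(0.5, 'A'), (1.3, 'A'), (0.9, 'B'), (1.05, 'A')]
      print(attribute_subset(data))   # [(0.5, 'A'), (1.05, 'A')]
      print(local_subset(data, ref=1.0, n=2))

    GPS]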

    'For more information, see R. L. Grossman, Scaling Data Mining Algorithms
    Using Cover-based Learning with Model Selection and Model Averaging,
    http://www.magnify.com'

    HPCwire: How is the TC approaching the mining of highly distributed data?

    GROSSMAN: 'On the systems side, we have made good progress in this area.
    The NSCP clusters at UIC and UPenn have been connected for several weeks now
    by the vBNS at OC-3 (155 Mbps) speeds. Using this infrastructure we have
    experimented with wide area data mining of scientific and medical data. We
    are currently using this experience to develop new algorithms for wide area
    data mining and to develop new generations of our data management and data
    mining tools. The challenge is to develop a new class of algorithms for
    extracting patterns from widely distributed data without the necessity of
    first warehousing the data.'

    HPCwire: What progress has been made in better understanding dynamical
    systems via data mining?

    GROSSMAN: 'Not as much as we would have liked. Data mining algorithms
    today, by and large, work with data which is flat and static. The core
    dynamical system concepts of a state vector and its evolution in time are
    missing in most data mining algorithms. Hybrid systems is an emerging field
    which combines dynamical systems with discrete structures such as rule
    systems and automata. The latter can express the patterns discovered in data
    mining. Researchers working in the NSCP are actively investigating exploiting
    hybrid systems and related techniques to develop next generation data mining
    algorithms which can utilize state information and work with time varying
    data.'

    HPCwire: How is TC research being made available to the commercial sector?
    Have any new products or partnerships resulted from TC-generated technology?

    GROSSMAN: 'The NSCP and the Terabyte Challenge have 1) published the core
    ideas they have developed for data mining and data intensive computing, 2)
    developed reference architectures and implementations for software tools to
    support data mining (the UIC software tools PTool, JTool, and DMTool), and 3)
    encouraged companies to exploit this technology for data intensive computing
    and data mining.

    'To date, HUBS in Philadelphia and Magnify, Inc. in Chicago have begun to
    employ some of these ideas in the products and services they offer.
    Currently, regional data mining centers are in the planning process in both
    Chicago and Philadelphia.'

    HPCwire: How do you see the TC evolving over the next five years?

    GROSSMAN: 'The most exciting development is the expected transformation of
    the NSCP into two regional data mining centers with very strong industrial
    ties: one in Chicago and one in Philadelphia. This has three important
    consequences: 1) the compute, i/o, and networking infrastructure which
    we can dedicate to data mining projects is expected to double this year and
    hopefully to double again in about two years. 2) With our industrial
    partners, we are actively working to demonstrate the practical feasibility of
    mining massive data sets and to establish open standards for managing,
    mining, and modeling massive data sets. 3) Using the vBNS network connecting
    the centers in Chicago and Philadelphia, we are finding it easy to experiment
    with the type of wide area data mining issues which we expect to take on an
    increasingly important role for scientific, engineering, medical, and
    business data mining applications.

    'To summarize, during the next five years, we expect the Terabyte Challenge
    not only to continue to push the boundaries of massive data mining through an
    annual competition, but also, together with its industrial partners, to be
    actively involved with establishing data mining standards and reference
    implementations of software tools for managing, mining, and modeling massive
    data sets.

    'Additional participants for the 1997 competition are welcome. Please
    contact one of the organizers if interested. Additional information
    can be found at http://www.nscp.uic.edu'

    --------------------
    Alan Beck is editor in chief of HPCwire. Comments are always welcome and
    should be directed to editor@hpcwire.tgc.com

    Copyright 1997 HPCwire. Redistribution of this article is forbidden by law
    without the expressed written consent of the publisher. For a free trial
    subscription to HPCwire, send e-mail to trial@hpcwire.tgc.com.
    H P C w i r e
    The Text-on-Demand E-zine for High Performance Computing
    ***************************************************************************




    Date: Thu, 22 May 1997 15:05:54 -0400
    From: Gregory Piatetsky-Shapiro (gps)
    Subject: First Issue of DMKD journal is available on-line in PDF format

    The premiere issue of Data Mining and Knowledge Discovery journal
    is available on-line, in PDF format, at
    http://www.wkap.nl/kapis/CGI-BIN/WORLD/kaphtml.htm?DAMISAMPLE

    To read this very good (in my biased opinion) issue you need an Acrobat
    reader, which you can download from
    http://www.adobe.com/acrobat/


    Only the first issue will be freely available on-line, but you can
    subscribe to the journal at the $50 individual rate (institutional
    rates are higher) -- see
    http://www.wkap.nl/kapis/CGI-BIN/WORLD/journalhome.htm?1384-5810
    for subscription information. Please support this journal!


    From: A.N.Pryke@cs.bham.ac.uk
    Date: Fri, 23 May 97 22:12:09 BST
    Subject: Nuggets: Bibliography of KDD and Data Mining Papers

    The Master Bibliography of KDD and Data Mining Papers is a
    bibliography of over 400 papers on the topics of Data Mining and
    Knowledge Discovery in Databases (this includes closely related papers
    on visualisation and machine learning). More than 70 of the papers are
    online.

    It is available in either BibTeX or HTML-annotated BibTeX format
    from:

    http://www.cs.bham.ac.uk/~anp/papers.html

    A search interface is also available at:

    http://www.cs.bham.ac.uk/~anp/bibtex/search.html



    Any additional references or corrections are gratefully
    received. Please email them to me, Andy Pryke, at
    A.N.Pryke@cs.bham.ac.uk. Only references in machine-readable format
    (e.g. refer or, preferably, BibTeX) can be added, due to time
    constraints.

    Note that all the information I have about the papers is in the
    bibliography, and many (330ish) of the papers are not available
    online.

    Please read the _collection_ copyright statement at
    http://www.cs.bham.ac.uk/~anp/bibtex/copyright.html


    If you find the bibliography useful, you may wish to send me a
    postcard (details in the copyright statement).

    Andy Pryke
    --
    Andy Pryke, Research Student, Computer Science, Birmingham University
    Data Mining Information -
    http://www.cs.bham.ac.uk/~anp/TheDataMine.html



    Date: Fri, 16 May 1997 19:09:05 -0500
    From: dfisher@vuse.vanderbilt.edu (Douglas H. Fisher)
    Subject: COLT/ICML Early Registration

    Early registration for the Tenth Annual Conference on
    Computational Learning Theory (COLT-97) and/or the Fourteenth
    International Conference on Machine Learning (ICML-97)
    concludes June 2, 1997. Room blocks at area hotels and on campus
    are also 'released' June 2 (though rooms will likely still be available
    after that date). See
    http://cswww.vuse.vanderbilt.edu/~mlccolt/
    for more information.


    Date: Fri, 16 May 1997 16:44:56 +0200 (MET DST)
    From: Jan Komorowski (Jan.Komorowski@idi.ntnu.no)
    Subject: PKDD'97 -- Call For Participation

    1st European Symposium on Principles of
    Data Mining and Knowledge Discovery in Databases
    Trondheim, Norway
    June 24-27, 1997

    Tutorials: June 24-25
    Symposium: June 26-27

    This is an invitation to the 1st European Symposium on Principles of
    Data Mining and Knowledge Discovery in Databases.

    PKDD'97 is the first symposium in an intended series of meetings of
    the data mining and knowledge discovery from databases (KDD) community
    in Europe. The goal of the PKDD series is to provide a European-based
    forum for interaction among all theoreticians and practitioners
    interested in data mining and knowledge discovery. Fostering an
    interdisciplinary collaboration is one desired outcome, but the main
    long-term focus is on theoretical principles for the emerging
    discipline of KDD, especially those new principles that go beyond each
    of the contributing areas.

    There were 50 papers submitted to PKDD'97. After the selection by the
    program committee, the papers were assigned to three categories: 14
    plenary papers, 13 parallel session papers, and 11 poster papers with
    spotlight presentations in the plenary sessions. In
    addition, four tutorials were selected: Rough Sets for Data Mining and
    Knowledge Discovery, Techniques and Applications of KDD, High
    Performance Data Mining, and Data Mining in the Telecommunications
    Industry.

    The proceedings are published by Springer Verlag.

    The invited speakers include Evangelos Simoudis, USA, and Bjarne Foss,
    Norway. They will provide their different perspectives on the field:
    one on data mining for businesses and the other on data mining seen from
    the point of view of control theory. Panel discussions on the present
    situation and the future development of the field are planned.

    There will be software exhibitions of both commercial and academic
    software.

    Please look at the PKDD'97 Homepage
    http://www.idi.ntnu.no/pkdd97/
    for detailed information and news about the symposium.


    From: David Heckerman (heckerma@MICROSOFT.com)
    Subject: Summer School on PROBABILISTIC GRAPHICAL MODELS
    Date: Fri, 16 May 1997 08:08:00 -0700

    A Newton Institute EC Summer School

    PROBABILISTIC GRAPHICAL MODELS

    1 - 5 September 1997

    Isaac Newton Institute, Cambridge, U.K.

    Organisers: C M Bishop (Aston) and J Whittaker (Lancaster)


    Probabilistic graphical models provide a very general framework for
    representing complex probability distributions over sets of
    variables. A powerful feature of the graphical model viewpoint is that
    it unifies many of the common techniques used in pattern recognition
    and machine learning including neural networks, latent variable
    models, probabilistic expert systems, Boltzmann machines and Bayesian
    belief networks. Indeed, the increasing interactions between the
    neural computing and graphical modelling communities have resulted in
    a number of powerful new ideas and techniques. The conference will
    include several tutorial presentations on key topics as well as
    advanced research talks.
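
    [As a concrete illustration of the factorization such graphical
    models encode (not part of the announcement; a minimal Python
    sketch with made-up probabilities): a three-variable belief
    network rain -> sprinkler, (rain, sprinkler) -> wet factorizes
    the joint as P(r,s,w) = P(r) P(s|r) P(w|r,s):

      # All numbers are invented for illustration.
      P_rain = {True: 0.2, False: 0.8}
      P_sprinkler = {True: 0.01, False: 0.4}    # P(s=on | rain)
      P_wet = {(True, True): 0.99, (True, False): 0.90,
               (False, True): 0.90, (False, False): 0.0}

      def joint(r, s, w):
          # Product of per-node conditionals -- the factorization
          # that a graphical model makes explicit.
          p_s = P_sprinkler[r] if s else 1.0 - P_sprinkler[r]
          p_w = P_wet[(r, s)] if w else 1.0 - P_wet[(r, s)]
          return P_rain[r] * p_s * p_w

      # Marginalize by brute-force summation; message propagation
      # does this efficiently on large networks.
      p_wet = sum(joint(r, s, True)
                  for r in (True, False) for s in (True, False))
      print(round(p_wet, 4))

    GPS]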


    Provisional themes:

    Conditional independence; Bayesian belief networks; message
    propagation; latent variable models; variational techniques; mean
    field theory; learning and estimation; model search; EM and MCMC
    algorithms; axiomatic approaches; causality; decision theory; neural
    networks; information and coding theory; scientific applications and
    examples.


    Provisional list of speakers:

    C M Bishop (Aston) D J C MacKay (Cambridge)
    R Cowell (City) J Pearl (UCLA)
    A P Dawid (UCL) M D Perlman (Washington)
    D Geiger (Technion) M Piccioni (Aquila)
    E George (Texas) R Shachter (Stanford)
    W Gilks (Cambridge) J Q Smith (Warwick)
    D Heckerman (Microsoft) M Studeny (Prague)
    G E Hinton (Toronto) M Titterington (Glasgow)
    T Jaakkola (UCSC) J Whittaker (Lancaster)
    M I Jordan (MIT) S Lauritzen (Aalborg)
    B Kappen (Nijmegen) D Spiegelhalter (Cambridge)
    M Kearns (AT&T) S Russell (Berkeley)

    This instructional conference will form a component of the Newton
    Institute programme on Neural Networks and Machine Learning, organised
    by C M Bishop, D Haussler, G E Hinton, M Niranjan and L G Valiant.
    Further information about the programme is available via the WWW at

    http://www.newton.cam.ac.uk/programs/nnm.html


    Location and Costs:

    The conference will take place in the Isaac Newton Institute and
    accommodation for participants will be provided at Wolfson Court,
    adjacent to the Institute. The conference package costs 270 UK pounds,
    which includes accommodation from Sunday 31 August to Friday 5
    September, together with breakfast, lunch on the days that the
    lectures take place, and evening meals.


    Applications:

    To participate in the conference, please complete and
    return an application form and, for students and postdoctoral fellows,
    arrange for a letter of reference from a senior scientist. Limited
    financial support is available for participants from appropriate
    countries.

    Application forms are available from the conference Web Page at

    http://www.newton.cam.ac.uk/programs/nnmec.html


    Completed forms and letters of recommendation should be sent to Heather
    Dawson at the Newton Institute, or by e-mail to
    h.dawson@newton.cam.ac.uk

    *Closing Date for the receipt of applications and
    letters of recommendation is 16 June 1997*


    From: Vasant Honavar (honavar@cs.iastate.edu)
    Subject: Call for Participation: Workshop on Automata Induction,
    Grammatical Inference, and Language Acquisition
    Date: Thu, 8 May 1997 10:53:48 -0500 (CDT)

    Workshop on
    Automata Induction, Grammatical Inference, and Language Acquisition
    The Fourteenth International Conference on Machine Learning (ICML-97)
    July 12, 1997, Nashville, Tennessee

    The Automata Induction, Grammatical Inference, and Language Acquisition
    Workshop will be held on Saturday, July 12, 1997 during the Fourteenth
    International Conference on Machine Learning (ICML-97) which will be
    co-located with the Tenth Annual Conference on Computational Learning Theory
    (COLT-97) at Nashville, Tennessee from July 8 through July 12, 1997.
    Additional information on ICML-97 and COLT-97 can be found at
    http://www.cs.iastate.edu/~honavar/mlworkshop.html



    Date: Wed, 21 May 1997 12:23:13 +1000
    From: Honghua Dai (dai@cs.monash.edu.au)
    Subject: KDEX-97 Final Call for Papers

    1997 IEEE Knowledge and Data Engineering Exchange Workshop (KDEX-97)
    --------------------------------------------------------------------
    Sponsored by the IEEE Computer Society and Co-located with
    the 9th IEEE Tools with Artificial Intelligence Conference

    November 4, 1997, Newport Beach, California, U.S.A.
    ===================================================

    Call for Papers

    The 1997 IEEE Knowledge and Data Engineering Exchange Workshop
    (KDEX-97) will provide an international forum for researchers,
    educators and practitioners to exchange and evaluate information and
    experiences related to state-of-the-art issues and trends in the areas
    of artificial intelligence and databases. The goal of this workshop
    is to expedite technology transfer from researchers to practitioners,
    to assess the impact of emerging technologies on current research
    directions, and to identify emerging research opportunities.
    Educators will present material and techniques for effectively
    transferring state-of-the-art knowledge and data engineering
    technologies to students and professionals. The workshop is currently
    scheduled for a one-day duration, but depending on the final program
    it might be extended to a second day.

    Submissions can be in the form of survey papers, experience reports,
    and educational material to facilitate technology transfer. Accepted
    papers will be published in the workshop proceedings by the IEEE
    Computer Society. A selected number of the accepted papers will
    possibly be expanded and revised for publication in the IEEE
    Transactions on Knowledge and Data Engineering (IEEE-TKDE) and the
    International Journal of Artificial Intelligence Tools. Educational
    material related to papers published in the IEEE-TKDE will be posted
    on the IEEE-TKDE home page.

    The theme of the workshop is 'AI MEETS DATABASES'. Topics of interest
    include, but are not limited to:

    - Computer supported cooperative processing and interoperable
    systems
    - Data sharing, data warehousing and meta-data management
    - Distributed intelligent mediators and agents
    - Distributed object management
    - Dynamic knowledge
    - Evaluation and measurement of knowledge and database systems
    - High-performance issues (including architectures, knowledge
    representation techniques, inference mechanisms, algorithms and
    integration methods)
    - Information structures and interaction
    - Intelligent search, data mining and content-based retrieval
    - Knowledge and data engineering systems
    - Quality assurance for knowledge and data engineering systems
    (correctness, reliability, security, survivability and
    performance)
    - Software re-engineering and intelligent software information
    systems
    - Spatio-temporal, active, mobile and multimedia data
    - Emerging applications (biomedical systems, decision support,
    geographical databases, Internet technologies and applications,
    digital libraries, etc.)

    All submissions should be limited to a maximum of 5,000 words. Six
    hardcopies should be forwarded to the following address:

    Xindong Wu (KDEX-97)
    Department of Software Development
    Monash University
    900 Dandenong Road
    Caulfield East, Melbourne 3145
    Australia

    Phone: +61 3 9903 1025
    Fax: +61 3 9903 1077
    E-mail: xindong@insect.sd.monash.edu.au

    Please include a cover page containing the title, authors (names,
    postal and email addresses, telephone and fax numbers), and an
    abstract. This cover page must accompany the paper.

    ************ I m p o r t a n t D a t e s *****************
    * 6 copies of full papers received by: June 15, 1997 *
    * acceptance/rejection notices: July 31, 1997 *
    * final camera-readies due by: August 31, 1997 *
    * workshop: November 4, 1997 *
    ************************************************************

    Further Information
    ===================

    WWW:
    http://www.sd.monash.edu.au/kdex-97


    From: gordon@AIC.NRL.Navy.Mil
    Date: Tue, 20 May 97 10:30:38 EDT
    Subject: CFP: ICML-97 workshop on REINFORCEMENT LEARNING: TO MODEL OR
    NOT TO MODEL, THAT IS THE QUESTION

    Workshop at the Fourteenth
    International Conference on Machine
    Learning (ICML-97)

    Vanderbilt University, Nashville, TN
    July 12, 1997

    www.cs.cmu.edu/~ggordon/ml97ws

    Recently there has been some disagreement in the reinforcement
    learning community about whether finding a good control policy
    is helped or hindered by learning a model of the system to be
    controlled. Recent reinforcement learning successes
    (Tesauro's TD-gammon, Crites' elevator control, Zhang and
    Dietterich's space-shuttle scheduling) have all been in
    domains where a human-specified model of the target system was
    known in advance, and have all made substantial use of the
    model. On the other hand, there have been real robot systems
    which learned tasks either by model-free methods or via
    learned models. The debate has been exacerbated by the lack
    of fully satisfactory algorithms on either side for
    comparison.
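
    [To make the two camps concrete (an illustrative sketch, not any
    workshop contributor's algorithm; state and action names are made
    up), here is a model-free Q-learning update next to a model-based
    alternative that learns transition counts and plans against them:

      from collections import defaultdict

      # Direct (model-free): fold dynamics and goals into one
      # object, the value function Q.
      def q_update(Q, s, a, r, s2, actions, alpha=0.1, gamma=0.9):
          target = r + gamma * max(Q[(s2, b)] for b in actions)
          Q[(s, a)] += alpha * (target - Q[(s, a)])

      # Indirect (model-based): estimate the model from the same
      # experience, then back up values through it.
      def model_update(counts, rsum, s, a, r, s2):
          counts[(s, a)][s2] += 1
          rsum[(s, a)] += r

      def planned_value(counts, rsum, Q, s, actions, gamma=0.9):
          # One certainty-equivalence backup using the learned model.
          vals = []
          for a in actions:
              n = sum(counts[(s, a)].values())
              if n == 0:
                  continue
              nxt = sum(c / n * max(Q[(s2, b)] for b in actions)
                        for s2, c in counts[(s, a)].items())
              vals.append(rsum[(s, a)] / n + gamma * nxt)
          return max(vals) if vals else 0.0

      Q = defaultdict(float)
      counts = defaultdict(lambda: defaultdict(int))
      rsum = defaultdict(float)
      acts = ['left', 'right']
      q_update(Q, 's0', 'right', 1.0, 's1', acts)
      model_update(counts, rsum, 's0', 'right', 1.0, 's1')
      print(Q[('s0', 'right')], planned_value(counts, rsum, Q, 's0', acts))

    GPS]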

    Topics for discussion include (but are not limited to)

    o Case studies in which a learned model either contributed to
    or detracted from the solution of a control problem. In
    particular, does one method have better data efficiency?
    Time efficiency? Space requirements? Final control
    performance? Scaling behavior?
    o Computational techniques for finding a good policy, given a
    model from a particular class -- that is, what are good
    planning algorithms for each class of models?
    o Approximation results of the form: if the real system is in
    class A, and we approximate it by a model from class B, we
    are guaranteed to get 'good' results as long as we have
    'sufficient' data.
    o Equivalences between techniques of the two sorts: for
    example, if we learn a policy of type A by direct method B,
    it is equivalent to learning a model of type C and computing
    its optimal controller.
    o How to take advantage of uncertainty estimates in a learned
    model.
    o Direct algorithms combine their knowledge of the dynamics and
    the goals into a single object, the policy. Thus, they may
    have more difficulty than indirect methods if the goals change
    (the 'lifelong learning' question). Is this an essential
    difficulty?
    o Does the need for an online or incremental algorithm interact
    with the choice of direct or indirect methods?

    Full information at
    www.cs.cmu.edu/~ggordon/ml97ws
    Contact: Geoff Gordon (ggordon@cs.cmu.edu)

