KDD Nuggets Index




Data Mining and Knowledge Discovery Nuggets 96:9, e-mailed 96-03-15

Contents:
News:
* U. Fayyad, CNN story on 'data mining'
http://www-cgi.cnn.com/TECH/9602/information_overload/index.html
* J. Han, KDD-96 submission logistics -- only 3 days left!
Publications:
* K. Thearling, 2 white papers on Data Mining at
http://www.santafe.edu/~kurt
Siftware:
* C. Volinsky, logistic regression: Bayesian Model Averaging
via bic.logit, http://lib.stat.cmu.edu/S/bic.logit
Positions:
* Staff, Vacancy: Lectureship in Trinity College Dublin, Ireland
http://www2.tcd.ie/statistics
Meetings:
* P. Chapnick, Data Mining Summit, Chicago, May 2-3, http://www.vldb.com
* S. Wrobel, ICML 96 Workshop announcement,
http://nathan.gmd.de/persons/stefan.wrobel/ICML96/workshops.html

--
Nuggets is a newsletter for the Data Mining and Knowledge Discovery community,
focusing on the latest research and applications.

Contributions are most welcome and should be emailed,
with a DESCRIPTIVE subject line (and a URL, when available) to (kdd@gte.com).
E-mail add/delete requests to (kdd-request@gte.com).

Nuggets frequency is approximately weekly.
Back issues of Nuggets, a catalog of S*i*ftware (data mining tools),
and a wealth of other information on Data Mining and Knowledge Discovery
are available at the Knowledge Discovery Mine site, URL http://info.gte.com/~kdd.

-- Gregory Piatetsky-Shapiro (moderator)

********************* Official disclaimer ***********************************
* All opinions expressed herein are those of the writers (or the moderator) *
* and not necessarily of their respective employers (or GTE Laboratories) *
*****************************************************************************

~~~~~~~~~~~~ Quotable Quote ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
'Users don't want metadata -- they want betta' data'
Michael Cohn, in ComputerWorld, Feb 26, 1996

>~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
From: Usama Fayyad (fayyad@MICROSOFT.com)
Subject: FW: CNN story on 'data mining'
Date: Fri, 1 Mar 1996 10:12:58 -0800

Text from:
http://www-cgi.cnn.com/TECH/9602/information_overload/index.html

How to beat information overload


February 23, 1996
Web posted at: 3:40 p.m. EST
From Technology Correspondent Brian Nelson

(CNN) -- If the Internet is a haystack, finding a 'needle' of
information may be just a keystroke away. Search engines, data bases
and 'knowbots' may have what you're looking for.

An IBM solution called data mining is touted in a television ad
featuring runway models at a fashion show. It's a research tool that can
do everything from locating business sales figures to uncovering moves
to beat a chess opponent. Data mining has even helped professional
basketball coaches identify the right players and the right plays to
beat an opposing NBA team.

Search tools (also known as 'engines') such as Lycos and Yahoo allow web
surfers to search for something by topic by typing in the requested
information. AT&T Interchange, an on-line service for business
customers, goes further. In addition to providing access to numerous
databases on the Internet, it keeps on running well after you've shut
off the computer. You just have to ask to be updated. When you log back
on, new information will be waiting for you.

Search tools for networks and databases are becoming increasingly
sophisticated. The most advanced, called 'knowbots', are software tools
that strive to know what you want, and then find it for you.

For example, Firefly [http://www.ffly.com/ - Carl], a web site for music
lovers, uses knowbot software to help visitors expand their music
collections. By typing in answers to
a series of questions, the knowbot produces a list of recording artists
similar to those the visitor already enjoys.



>~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
From: han@cs.sfu.ca
Date: Sun, 10 Mar 96 12:44:39 PST
Subject: Last Call for Papers (KDD-96)

Last Call For Papers: KDD-96

The Second International Conference on Knowledge Discovery and Data
Mining (KDD-96), Portland, Oregon, USA, August 2-4, 1996.

(Please visit the WEB site of the KDD'96 conference for more information:
http://www-aig.jpl.nasa.gov/kdd96)


Sponsored by AAAI and Co-located with AAAI-96 and UAI-96

SUBMISSION:

Please submit 5 *hardcopies* of a short paper (a maximum of 9 single-spaced
pages not including cover page but including bibliography, 1 inch margins,
and 12pt font) to be received by March 18, 1996. A cover page must include
author(s) full address, E-MAIL, paper title and a 200 word abstract, and up
to 5 keywords. This cover page must accompany the paper. In addition, an
ASCII version of the cover page should be sent electronically via email to
kdd96@almaden.ibm.com by March 18th 1996 (preferably earlier for e-mail).
For the electronic title page, authors are required to use the template
as follows.

Please mail the 5 hardcopies of the full papers to:
AAAI (KDD-96)
445 Burgess Drive
Menlo Park, CA 94025-3496
U.S.A.
Phone: (+1 415) 328-3123; Fax: (+1 415) 321-4457
Email: kdd@aaai.org

Please e-mail your ASCII version of the cover page to
kdd96@almaden.ibm.com by March 18th 1996 (preferably earlier for e-mail).

Important Dates

5 copies of full papers received by: March 18, 1996
(in addition to an electronic ASCII title page)
Acceptance notices: April 19, 1996
Final camera-readies due to AAAI by: May 20, 1996

Program Co-chairs:

Evangelos Simoudis, IBM Almaden Research Center
JiaWei Han, Simon Fraser University

-----------------------------------------------------------
KDD-96 Electronic Abstract Submission Template.
-----------------------------------------------------------

Please observe the following:

1. Electronic submission is for abstract only. Hardcopies of the FULL
papers must be submitted as per instructions in the Call for Papers.

2. Please DO NOT REMOVE the special Keywords beginning with 'KDDT___'
prefix. They are needed for processing of submission.

3. Please fill in all fields, replace the example text with your
own information, and e-mail this entire file to kdd96@almaden.ibm.com

4. Please REPLACE every line that begins with 'enter' with your own text;
do not touch the KEYWORDS IN CAPS.

KDDT___BEGIN
KDDT___TITLE
enter paper title here, can take multiple lines
KDDT___TITLEND
KDDT___AUTHORS
enter 2 lines per author, name and affiliation (no address here) on one line
enter email address of author on previous line on the second line.
enter 2n lines if you have n authors
KDDT___AUTHORSEND
KDDT___CONTACT
enter the contact name and address here, this should be a single address, include phone and fax,
enter as many lines as needed for full address.
KDDT___CONTACTEND
KDDT___EMAIL
enter single e-mail here that will serve as primary contact e-mail
KDDT___ABSTRACT
enter your abstract here
enter as many lines as needed, please limit to 200 words or less
KDDT___ABSTRACTEND
KDDT___KEYWORDS
enter 5 to 10 subject keywords here, separate by commas,
enter as many lines as needed
KDDT___KEYWORDSEND
KDDT___END

-----------------------------------------------------------
This abstract template can also be fetched from the conference URL:
http://www-aig.jpl.nasa.gov/kdd96
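
For authors who want to check a filled-in template before e-mailing it, the
following is a minimal Python sketch of how the KDDT___ markers could be read
back. It is not the organizers' processing script; the function names and the
sample values are hypothetical. It assumes each field ends at the next KDDT___
line, which also covers KDDT___EMAIL (the template shows no matching end
marker for it).

```python
FIELDS = ("TITLE", "AUTHORS", "CONTACT", "EMAIL", "ABSTRACT", "KEYWORDS")
PREFIX = "KDDT___"


def parse_kddt(text):
    """Return a dict mapping each field name to its list of content lines."""
    fields = {}
    current = None   # field currently being collected, if any
    inside = False   # True once KDDT___BEGIN has been seen
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith(PREFIX):
            tag = line[len(PREFIX):]
            if tag == "BEGIN":
                inside = True
            elif tag == "END":
                break
            elif inside and tag in FIELDS:
                current = tag
                fields[current] = []
            else:
                # Any other marker (TITLEND, AUTHORSEND, ...) closes the field.
                current = None
        elif inside and current is not None and line:
            fields[current].append(line)
    return fields


def split_authors(lines):
    """Group author lines in pairs: (name and affiliation, e-mail)."""
    return [tuple(lines[i:i + 2]) for i in range(0, len(lines), 2)]


def split_keywords(lines):
    """Split comma-separated keywords that may span several lines."""
    return [k.strip() for k in ",".join(lines).split(",") if k.strip()]


def abstract_word_count(lines):
    """Count words in the abstract, to compare against the 200-word limit."""
    return len(" ".join(lines).split())


# Hypothetical filled-in template, for illustration only.
sample = """KDDT___BEGIN
KDDT___TITLE
Mining Example Data
KDDT___TITLEND
KDDT___AUTHORS
A. Author, Example University
author@example.org
KDDT___AUTHORSEND
KDDT___EMAIL
author@example.org
KDDT___ABSTRACT
A short abstract.
KDDT___ABSTRACTEND
KDDT___KEYWORDS
data mining, example
KDDT___KEYWORDSEND
KDDT___END
"""

record = parse_kddt(sample)
```

Checking abstract_word_count(record["ABSTRACT"]) against 200 and the length of
split_keywords(record["KEYWORDS"]) against the stated keyword limits would
catch the most common formatting slips before submission.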



>~~~Publications:~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Date: Fri, 1 Mar 96 14:46:35 MST
From: Kurt Thearling (kurt@santafe.edu)
To: kdd@gte.com
Subject: New DIG white papers available at http://www.santafe.edu/~kurt

The following two Data Intelligence Group (DIG) white papers have
recently been converted to HTML and are now available via the
WWW. DIG was created by Dun & Bradstreet to focus its data mining
activities and is part of Pilot Software, a D&B company. Both
papers can be found on my home page at http://www.santafe.edu/~kurt.
Hardcopies are also available (just send me email).

- kurt

White Paper 95/01: An Overview of Data Mining at Dun & Bradstreet

This document is a survey by DIG of data mining projects and
opportunities throughout the Dun & Bradstreet organization. Data
mining, the extraction of hidden predictive information from large
databases, is a powerful new technology with great potential to help
D&B 'preemptively define the information market of tomorrow.' D&B
companies already know how to collect and refine massive quantities of
data to deliver relevant and actionable business information. In this
sense, D&B has been 'mining' data for years. Today, some D&B units are
already using data mining technology to deliver new kinds of answers
that rank high in the business value chain because they directly fuel
return-on-investment decisions. In the D&B units DIG surveyed, we
found strong interest and a wide range of activities and research in
data mining.

White Paper 95/02: From Data Mining to Database Marketing

The market for data mining - if you believe the hype - will be
billions of dollars by the turn of the century. Unfortunately, much of
what is now considered data mining will be irrelevant, since it is
disconnected from the business world. In general, marketing analysts'
predictions that the technology will be very relevant to businesses in
the future are correct. The key to making a successful data mining
software product is to embrace the business problems that the
technology is meant to solve, not to incorporate the hottest
technology. In this report I will address some of the issues related
to the development of data mining technology as it relates to business
users.



>~~~Siftware:~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Sender: volinsky@stat.washington.edu
Date: Fri, 01 Mar 1996 10:47:39 -0800
From: CTV (volinsky@stat.washington.edu)
Subject: logistic regression: Bayesian Model Averaging via bic.logit

######################################################################
# Revised 'bic.logit' now available at Statlib!
######################################################################

'bic.logit' is an S-Plus function that does Bayesian Model
Averaging for logistic regression. The function accounts for model
uncertainty by selecting a reduced set of models and averaging over
that set to produce posterior model probabilities and posterior
probabilities that a parameter is non-zero.

bic.logit was first submitted to Statlib by Adrian Raftery in
Nov. 1994. The function takes as input logistic regression data with
p potential covariates. First, the function reduces the model space of
2^p models using the leaps and bounds algorithm (Furnival and Wilson 1974).
Next, it calculates posterior model probabilities using a BIC
approximation and reduces model space via Occam's Window (Madigan and
Raftery 1994). The posterior probabilities are used in calculating
posterior parameter estimates and standard deviations.
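
The BIC and Occam's Window steps can be illustrated with a short Python
sketch. This is not the bic.logit code (which is written in S-Plus). It
assumes a uniform prior over models, approximates each posterior model
probability as proportional to exp(-BIC/2), and uses only the simple form of
Occam's Window, which drops any model far less probable than the best one.
The cutoff ratio of 20, the function names, and the BIC values are all
made up for illustration.

```python
import math


def bma_from_bic(models, occam_ratio=20.0):
    """Approximate posterior model probabilities from BIC values.

    models: list of (variables, bic) pairs, where variables is a tuple of
    covariate names and a lower BIC indicates a better model.
    Returns (variables, posterior_probability) for the models kept.
    """
    best = min(bic for _, bic in models)
    # Weight relative to the best model; subtracting `best` avoids underflow.
    weights = [(v, math.exp(-0.5 * (bic - best))) for v, bic in models]
    # Simple Occam's Window: drop models more than occam_ratio times
    # less probable than the best model.
    kept = [(v, w) for v, w in weights if w >= 1.0 / occam_ratio]
    total = sum(w for _, w in kept)
    return [(v, w / total) for v, w in kept]


def inclusion_probabilities(posterior):
    """Posterior probability that each covariate's parameter is non-zero:
    the summed probability of the kept models that contain it."""
    probs = {}
    for variables, p in posterior:
        for name in variables:
            probs[name] = probs.get(name, 0.0) + p
    return probs


# Made-up BIC values for four candidate models on covariates x1 and x2.
candidates = [
    (("x1",), 100.0),
    (("x1", "x2"), 101.0),
    (("x2",), 110.0),
    ((), 120.0),
]

posterior = bma_from_bic(candidates)
inclusion = inclusion_probabilities(posterior)
```

With these numbers only the first two models stay inside the window, so x1
appears in every retained model while x2 appears in only one of them.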

The revised version of bic.logit retains the same structure but
contains some new features. These are:

1) More accurate fitting of the full set of models via leaps and
bounds by using an adjustment suggested by Lawless and Singhal (1978)
for nonlinear models.

2) Ability to include prior probabilities that the parameters are
non-zero. The user can either set a uniform prior over model
space (the default), ensure that a particular variable be included in
every model, or incorporate prior knowledge of the importance of a
variable.

3) Ability to change the nbest parameter which is passed to the leaps
and bounds function. This parameter controls the number of models for
each model size which leaps and bounds returns. A default of 150
assures the user that even for large p, no significant models will be
lost. For most practical purposes, nbest can be reduced to 30 or
less, which reduces computer run time without sacrificing good models.

4) We condensed the code and made it more efficient.
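
To give a feel for how much the nbest setting in item 3 shrinks the search,
here is a small Python sketch comparing the full model space of 2^p models
with an upper bound on the number of models retained when at most nbest are
kept per model size. It assumes the cap applies to sizes 1 through p and does
not count the null model; the function names are hypothetical and this is
arithmetic only, not the leaps and bounds algorithm itself.

```python
from math import comb


def full_model_space(p):
    """Number of subsets of p potential covariates."""
    return 2 ** p


def max_models_returned(p, nbest):
    """Upper bound on models kept when at most nbest are retained per size.

    For size k there are only comb(p, k) distinct models, so the cap is
    min(nbest, comb(p, k)).
    """
    return sum(min(nbest, comb(p, k)) for k in range(1, p + 1))


p = 20
space = full_model_space(p)                 # 1048576
kept_default = max_models_returned(p, 150)  # 2591
kept_reduced = max_models_returned(p, 30)   # 551
```

Under these assumptions, with 20 covariates the default of 150 passes at most
2591 of roughly a million models on to the later steps, and nbest = 30 passes
at most 551.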

To access the function, either send the message 'send bic.logit from S'
to statlib@stat.cmu.edu, or access it via the World Wide Web at
http://lib.stat.cmu.edu/S/bic.logit.
An example is included.

-------------------------------------------------------------------
References:

Raftery, A.E. (1994). 'Bayesian model selection in social research'.
Working Paper 94-12, Center for Studies in Demography and Ecology,
University of Washington. A revised version was published in
Sociological Methodology 1995 (Peter V. Marsden, editor), Blackwells.

This is available via the World Wide Web at
http://www.stat.washington.edu/tech.reports/bic.ps
It is also available via regular ftp using the following commands:


Furnival, G.M. and R.W. Wilson (1974) 'Regression by leaps and bounds'
Technometrics 16, 499-511.

Lawless, J. and K. Singhal (1978) 'Efficient screening of nonnormal
regression models' Biometrics 34, 318-327.


>~~~Positions:~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Date: Wed, 06 Mar 1996 18:36:43 +0100
From: statistics.office@tcd.ie (Statistics Office, TCD)
Subject: Vacancy: Lectureship in Trinity College Dublin
To: kdd@gte.com

Trinity College
Department of Statistics

Full time lecturing position


The Department of Statistics is seeking to appoint one full time lecturing
position. The Department of Statistics at Trinity was founded in 1966 and
has grown to be the largest such department in Ireland, with 11 academic
staff with a wide range of interests, ranging from statistical theory
through applied statistics and operations research to the management of
information systems. It is responsible for an undergraduate degree in
Management Science and Information Systems Studies (MSISS), for the
provision of a number of courses within the mathematics degree and for
service teaching in many academic disciplines. It is well equipped with
modern computing facilities for teaching and research. The Department is
seeking to appoint one full time lecturing position in order to strengthen
its support for the MSISS degree and to participate in the development of
research and postgraduate initiatives in this area.

Research efforts in the Department are oriented towards applications and
full advantage has been taken of the latest developments in information
technology. Members of staff are actively involved in information systems,
data mining, spatial modelling, interactive graphics, quality control, and
forecasting. The Department has been receiving research support for these
from a variety of Irish and international bodies. Currently, the Department
has 3 research students registered for the PhD and 9 for the M.Sc.

Applications are sought from candidates active in any area of management
science. The Department is particularly keen to receive applications from
those who are active in the field of information systems.

Salary will be within the range IRpounds 14,242 - IRpounds 20,096

Application forms and further particulars may be obtained from the:

Establishment Officer,
Staff Office,
Trinity College,
Dublin 2,
Ireland.

Telephone + (353) 1 608 1678
Facsimile + (353) 1 677 2169
email recruit@tcd.ie

to whom formal application may be made, preferably before March 31st, 1996.
Enquiries may also be made directly to Prof. J Haslett. Further details
on the Department of Statistics may be found at http://www2.tcd.ie/statistics

Trinity College is an equal opportunity employer.


>~~~Meetings:~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Date: Mon, 04 Mar 96 14:44:39 PST
From: 'Chapnick, Philip' (pchapnick@mfi.com)

Data Mining Summit
May 2 - 3, 1996
Chicago, Ill USA

For more information contact Miller Freeman at:

(415) 905-2267; Fax (415) 905-2218

or find us on the World Wide Web: http://www.vldb.com

Hosted by Database Programming & Design, DBMS Magazine, and Miller Freeman Inc.,
a United News & Media company

Announcing the first-annual Data Mining Summit. For many organizations, the
ability to analyze large amounts of data to derive valuable information is a
major competitive advantage. Along with relational databases themselves,
decision support systems (DSSs), online analytical processing (OLAP), and
multidimensional databases (MDDs) are important technologies designed to support
this effort. At the cutting edge, however, is data mining.

Data mining is about the use of knowledge discovery, pattern recognition, data
analysis, and expert systems technology to automate the search for information
locked up in (typically) very large databases. Data mining algorithms and
techniques have long been the focus of artificial intelligence and expert
database research. More recently, organizations such as American Express, GTE
Mobilnet, Eli Lilly, A.C. Nielsen, and MCI Corp. have deployed data mining tools
and technology. These and other companies find data mining critical to sales and
inventory analysis, database marketing, fraud detection, financial prediction,
and pharmaceutical development. Now, research, product development, and user
demand are maturing at the same time: The field is hitting critical mass.

The Data Mining Summit brings you a unique opportunity to learn about this topic
in a way that will enable you to evaluate the tools and techniques for business
purposes. Our goal in developing the program is to highlight not just what data
mining is--but how you can apply data mining today.

As you will see by the program, we have assembled the leading lights
of the data mining industry. Our illustrious group of speakers are
known for their development of tools and commercial systems--and for
their ability to explain this often complex topic. On Thursday, May 2,
three of the founders of the data mining and knowledge discovery field, Usama
Fayyad, Gregory Piatetsky-Shapiro, and Evangelos Simoudis, will
deliver a full-day session, called ''Data Mining and Knowledge
Discovery in Databases: A Tutorial for IS Managers, Practitioners, and
Database Developers.'' Friday, May 3 will feature a series of
case-study oriented presentations about how data mining is being
applied to various applications.

Thursday's Data Mining Summit Features an All-Day Session:

Thursday, 8:30 am - 4:00 pm

Data Mining and Knowledge Discovery in Databases: A Tutorial for IS Managers,
Practitioners, and Database Developers

Usama M. Fayyad, Microsoft Research
Gregory Piatetsky-Shapiro, GTE Laboratories
Evangelos Simoudis, IBM

Though growing rapidly, business databases and data warehouses hide
strategically valuable knowledge that users cannot understand and analyze
sufficiently with traditional ad-hoc query and reporting tools. This tutorial
will explore a new generation of automated tools and techniques for data mining
and knowledge discovery in databases. Data mining is the underlying technology
behind cutting-edge decision support applications in database marketing, fraud
detection, financial prediction, sales and inventory analysis, banking, and
many other fields. The presenters will begin by defining the field
and explaining the relationship between data mining and knowledge discovery.
Then, they will offer guidelines for selecting a successful data mining
application; examine fundamental data mining methods, including clustering,
summarization, classification, dependency analysis, and deviation detection; and
indicate which method is most appropriate for the task at hand.

In this session, you will learn:

How to find strategic knowledge hidden in large databases

Key application challenges--such as noisy and inadequate data, random patterns,
changing knowledge, and privacy protection--and how to address them

Step-by-step approach to the knowledge discovery process

Case studies of successful mining applications

Usama Fayyad is a Senior Researcher at Microsoft Research. Prior to
joining Microsoft in 1996, he headed the Machine Learning Systems Group
at the Jet Propulsion Laboratory (JPL), California Institute of Technology,
where he developed data mining systems for automated science data analysis.
He remains affiliated with JPL as a Distinguished Visiting Scientist.
Fayyad is a recipient of the 1993 Lew Allen Award for Excellence in
Research, and has also received the 1994 NASA Exceptional Achievement
Medal. His research interests include knowledge discovery in large
databases, data mining, machine learning theory and applications,
statistical pattern recognition, and clustering. He was program co-chair
of KDD-94 and KDD-95 (the First International Conference on Knowledge
Discovery and Data Mining). He is general chair of KDD-96, an
editor-in-chief of the journal Data Mining and Knowledge Discovery, and
co-editor of the new MIT Press book (1996) Advances in Knowledge
Discovery and Data Mining.


Gregory Piatetsky-Shapiro is a Principal Member of Technical Staff at GTE
Laboratories and the Principal Investigator of the Knowledge Discovery in
Databases project. He led the development of Key Findings Reporter (KEFIR), a
system for analysis of healthcare costs, which won the Leslie H. Warner
technical achievement award (GTE's highest), and is currently working on
knowledge discovery in customer databases and on discovery systems which
integrate multiple approaches. Piatetsky-Shapiro is one of the founders of the
knowledge discovery field; he organized and chaired the first three KDD
workshops in 1989, 1991, and 1993, was part of the organizing committees of the
KDD-95 and KDD-96 conferences, and is the chair of the KDD steering committee.
Piatetsky-Shapiro co-edited Knowledge Discovery in Databases (AAAI/MIT Press,
1991) and Advances in Knowledge Discovery and Data Mining (AAAI/MIT Press,
1996), and is an editor of the Data Mining and Knowledge Discovery journal.
Gregory is also a founder and a moderator of Nuggets, an electronic
newsletter for the Data Mining and Knowledge Discovery Community
(e-mail to kdd@gte.com) and a webmaster of Knowledge Discovery Mine
http://info.gte.com/~kdd/. He has a Ph.D. from New York University.

Evangelos Simoudis is Director, Data Mining Solutions at IBM, where he
is responsible for the development and deployment of data mining
solutions to IBM's customers worldwide. Prior to joining IBM, Dr.
Simoudis was a Group Leader of the Data Comprehension Group at the
Lockheed AI Center where, since 1991, he led the development and
market introduction of the Recon data mining system, and led research
on knowledge discovery in databases, machine learning, case-based
reasoning and their application to financial, retail, and fraud
detection problems. In 1994 Dr. Simoudis and his team were awarded
Lockheed's Pursuit of Excellence Award for their work on the Recon
system. Simoudis is also an adjunct assistant professor at the
Computer Engineering department of Santa Clara University where he
teaches graduate courses on machine learning and case-based reasoning.
Dr. Simoudis holds a Ph.D. in Computer Science from Brandeis
University, an M.S. in Computer Science from the University of
Oregon, a B.S. in Electrical Engineering from the California Institute
of Technology, and a B.A. in Physics from Grinnell College. Prior to
joining Lockheed, Dr. Simoudis was a principal software engineer at
Digital Equipment Corporation's Artificial Intelligence Center where
he led work on case-based reasoning, learning, and distributed AI.




Friday's Data Mining Sessions:

8:30 am: Principles of Data Mining
by Kamran Parsaye, Information Discovery, Inc.

9:50 am: Data Mining in Finance: Best Practices by Rick Makos, Unisys Corp.

11:00 am: Mining Data from Marketing Databases by Evangelos Simoudis, IBM Corp.

1:15 pm: A Human-Centered Approach to Knowledge Discovery by Tej Anand, NCR
Corp.

2:30 pm: Fuzzy Logic in Business: Data Mining in Fraud Detection and Insurance
by Earl Cox, The Metus Systems Group

3:40 pm: Predicting Sales: Case Studies from the Pharmaceutical Industry by
Peter Politakas, Digital Equipment Corp.



8:30 - 9:30 am
Principles of Data Mining
Kamran Parsaye, Information Discovery, Inc.

Driven by business users, who want to take the initiative regarding data access,
data mining is having a tremendous effect on data analysis. This session will
discuss a framework and conceptual structure for viewing decision support
technologies that relate to data mining. The goal will be to provide attendees
with a better overall sense of the principles and purposes of data mining. By
looking at case studies of how data mining has worked in financial services,
retail business, and other applications, Parsaye will compare and contrast the
key methodologies for successful data mining.

In this session, you will learn:

How discovery, forensic analysis, and predictive modeling form the three vectors
of data mining

The four ''spaces'' of decision support: Query, Online Analytical Processing
(OLAP), Discovery, and Mapping

Why these decision support spaces are orthogonal and how they derive from
relational domains

Which data mining methodologies have worked best with particular
commercial applications

Kamran Parsaye, Ph.D., is CEO of Information Discovery, Inc. He has been
developing commercial data mining applications since the mid-1980s. Parsaye has
a wide range of experience in the software industry, both as a research
scientist and in business, and has provided guidance to top-level management of
leading industrial, financial, and government organizations. He is the co-author
of Intelligent Database Tools & Applications (John Wiley & Sons, 1993).

9:50 - 10:55 am

Data Mining in Finance: Best Practices
Rick Makos, Unisys Corp.


Data warehouse and other data analysis solutions are rapidly increasing in
importance to the financial services community. In this presentation, Makos will
highlight the key business issues driving data analysis in finance. He will
detail a framework for implementation based on the best practices of leaders
within the financial services marketplace, who are using data mining solutions
to fundamentally change their businesses. Makos will discuss how attendees can
use best-practice techniques to reap economic benefits for their organizations.
Just as important, businesses can use data mining and data warehousing as
positive catalysts to update their core organizational paradigms.

In this session, you will learn:

The business drivers behind the growing interest in data mining in financial
services

Best practices based on the experience of key users of data mining and data
warehouse techniques

Using these techniques, how to reap business and organization benefits from
previously untapped corporate data resources

Rick Makos has over 12 years of business experience in the financial services
industry. He has worked in a variety of situations from business and technical
consulting to major financial services firms on a worldwide basis. He has
provided strategic advice on large-scale retail delivery solutions; delivery
channel evolution (private banking and telephone delivery); and data warehouse,
decision-support solutions. In the last three years, Makos has worked as an
industry consultant in the financial services industry, where he has assisted
institutions in leveraging their existing corporate data into an enterprise-wide,
cross-functional data warehouse. He has worked with Bank of America, Fidelity
Investments, Royal Bank of Canada, Citibank, and other corporations in these
applications.

11:00 am - 12:00 pm

Mining Data from Marketing Databases
Evangelos Simoudis, IBM Corp.


Fierce competition in retail, finance, media, and a range of other markets means
that IS must enable business users to employ timely and in-depth understanding
of current customers and all other consumers. This service has become even
more critical because other opportunities to improve a company's bottom line are
harder to find than in the past--and frequently have a shorter duration. Many
companies today collect and purchase a variety of consumer and customer data,
which they need to analyze quickly. With the introduction of data mining
technologies, analysts have an exciting new tool for rapid, in-depth analysis.
Simoudis will describe information discovery technologies for database marketing
tasks. Using case studies, he will discuss applications such as customer
segmentation, which analysts can use to decrease customer attrition and improve
cross-selling opportunities.


In this session, you will learn:

How data mining improves database marketing efforts

Business drivers for bringing data mining to bear on customer segmentation efforts
and other database marketing applications

Which information discovery techniques are best for database marketing

Evangelos Simoudis is IBM's director of Data Mining Solutions. Before joining
IBM, Simoudis led Lockheed Corp.'s data mining research, and was responsible for
the commercial introduction and marketing of Lockheed's Recon data mining system
for financial and retail markets. Simoudis also spent six years as a member of
the principal research staff at Digital Equipment Corp.'s Artificial
Intelligence Center. He conducted research on machine learning, pattern
recognition, knowledge-based systems, and distributed artificial intelligence;
Digital has incorporated his research work in products for engineering design
and diagnostics. Simoudis has written extensively on data mining and machine
learning, and is the North American editor of the Artificial Intelligence
Review.

1:15 - 2:15 pm

A Human-Centered Approach to Knowledge Discovery
Tej Anand, NCR Corp.


Knowledge discovery efforts have typically focused on learning algorithms, which
provide the core capability for generalizing useful, high-level rules from large
numbers of specific facts. While learning algorithms hold much excitement and
are having a substantive effect particularly in scientific applications,
real-world discovery tasks are extremely complex. Low-level data mining is only
one small part of the overall process. In this presentation, Anand will identify
the building blocks of a real-world knowledge discovery process. After a careful
elucidation of these steps, Anand will offer a framework for comparing
different systems with a better understanding of some of the ''human'' aspects of
knowledge discovery that have not received adequate attention.


In this session, you will learn:

Why a human-centered approach is key to successful data mining applications

How to see the big picture in a real-world knowledge discovery process

Within a clean framework, where to apply current techniques and technologies

Tej Anand established the Knowledge Discovery From Databases team at AT&T
Global Information Solutions' (now NCR Corp.) Human Interface Technology Center
in 1993 to enable commercial enterprises to realize business insights hidden in
their operational data. This team provides business and technical consulting in
the retail, insurance, and consumer packaged-goods industries. Before joining
AT&T, Anand developed two data mining tools for A. C. Nielsen, called
''Spotlight'' and ''Opportunity Explorer,'' which extract business insights from
retail point-of-sale data. He has also been a member of the research staff at
Philips Laboratories, Briarcliff Manor, N. Y., where he did work in artificial
intelligence software systems.

2:30 - 3:30 pm

Fuzzy Logic in Business: Data Mining in Fraud Detection and Insurance
Earl Cox, The Metus Systems Group


In this presentation, Cox will examine three real-world systems based on
knowledge mining technologies. First is a provider fraud and abuse detection
system for managed health care; this shows a data mining example that generates
a case-based reasoning repository from underlying data relationships. Second,
Cox will discuss a portfolio safety and suitability analysis system for
insurance applications. The third application is a project management selection
and risk assessment system, which uses fuzzy-neural networks and clustering to
rank capital-intensive projects based on risk and net present value. In each
application, Cox will explain the actual system design and development process.


In this session, you will learn:

How fuzzy logic works successfully with core data mining applications

How to formulate knowledge mining objectives

How to apply clustering, neural nets, and rule discovery to real-world business
systems

Earl Cox is founder and CEO of The Metus Systems Group, an advanced systems
science consulting and software services organization specializing in the
integration and application of emerging technologies in the corporate
environment. He has in-depth and hands-on experience in various areas including
complex system architecture design, programming systems, enterprise and decision
support modeling, and database and object-oriented macro and microsystem
development. A regular speaker and contributor to various publications, he is
also the author of The Fuzzy Systems Handbook, Fuzzy Logic for Business and
Industry, and Fuzzy Models in Knowledge-Based Systems.

Usama Fayyad heads the
Machine Learning Systems Group at the Jet Propulsion Laboratory (JPL),
California Institute of Technology. He is P.I. of the Science Data Analysis and
Visualization Task targeting applications of data mining techniques for the
analysis of large science databases, as well as other tasks involving industrial
applications of machine learning. He is also adjunct assistant professor in
Computer Science at the University of Southern California, where he teaches
courses in artificial intelligence (AI). Fayyad is a recipient of the 1993 Lew
Allen Award for Excellence, the highest honor JPL awards to researchers in the
early years of their professional careers, and has also received the 1994 NASA
Exceptional Achievement Medal. His research interests include knowledge
discovery in large databases, data mining, machine learning theory and
applications, statistical pattern recognition, clustering, and non-linear
regression. He has served on the program committees of several major
conferences, including KDD-94 and the First International Conference on Knowledge
Discovery and Data Mining (KDD-95).

3:40 - 4:40 pm

Predicting Sales: Case Studies from the Pharmaceutical Industry
Peter Politakas, Digital Equipment Corp.


In this presentation, Politakas will use examples from pharmaceutical sales and
marketing applications to show how analysts can apply data mining technology
successfully to a range of end-user decision-making priorities. Pharmaceutical
manufacturers, for example, need to decide how best to allocate their sales forces
in a complex, competitive marketplace. Data mining techniques can produce
decision criteria to predict physicians' future prescription behavior. With this
additional information, a company's sales force can gain a competitive advantage
by targeting prospective customers. Politakas will show how key parts of the
data mining process--case composition and rule representation of the decision
criteria--address specific business needs.


In this session, you will learn:

How to compose cases, that is, how to create features for data
mining from multiple sources

Methods for using decision criteria produced by the data mining engine to
establish rules

How to use rules to enable users to refine and apply knowledge to new data

Peter Politakas is a senior technical consultant in Digital's Data Mining and
Knowledge-based Solutions Services Group. He has developed and consulted on
numerous data mining and expert systems, including applications in finance,
medicine, hardware troubleshooting, manufacturing, and pharmaceuticals.
His research interest is in generalizing knowledge acquisition and
validation methods for data mining and knowledge-based development. He joined
Digital in 1982 after completing his Ph.D. on expert system development at
Rutgers University. He has written several technical publications in the
artificial intelligence field.



>~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Subject: ICML 96 Workshop Announcement and CfP - corrected
Date: Fri, 01 Mar 96 16:54:48 +0100
From: Stefan.Wrobel@gmd.de
[Submission dates were incorrect in the first announcement.]
[Please accept our apologies if you receive this several times.]

**************************************************
ICML'96
13th International Conference on Machine Learning
Bari (Italy), July 3-6th, 1996
**************************************************
Announcement of Workshops
Call for Workshop Contributions
**************************************************
This information is also accessible at
http://nathan.gmd.de/persons/stefan.wrobel/ICML96/workshops.html


The 13th International Conference on Machine Learning (ICML'96) will
feature five pre-conference workshops to be held on July 3, 1996 (some
starting on July 2, 1996). The titles, organizers, and WWW home pages
of these workshops are:

Evolutionary computing and Machine Learning
Terry Fogarty, Gilles Venturini
CfP: http://zen.btc.uwe.ac.uk/evol/cfp.html

Learning in context-sensitive domains
Miroslav Kubat, Gerhard Widmer
CfP: http://www.ai.univie.ac.at/icml_ws/ws.html

Synergy between scientific knowledge discovery and knowledge
discovery in databases
Derek Sleeman, Patricia Riddle
CfP: http://www.csd.abdn.ac.uk/~sleeman/cfp-iml96ws.html

Machine Learning meets human computer interaction (HCI meets ML)
Juergen Herrmann, Vassilis Moustakis
CfP: http://www.ics.forth.gr/~moustaki/ICML96_HCI_ML/hci_ml.html

Machine Learning and Databases
Christel Vrain, D. Laurent, Yves Kodratoff
CfP: http://web.univ-orleans.fr/~vrain/MLDB_ICML_cfp.html

Submissions should be made to the workshop organizers as specified in
the workshop's call for papers. There is a common timetable for all
workshops as follows.

Submission deadline: April 23, 1996
Notification of acceptance: May 14, 1996
Final version due: June 4, 1996
Workshop: July 3, 1996

Please address all questions concerning individual workshops to the
workshop's organizers. General inquiries should go to any of the
members of the Organizing Committee or to the address:

icml96@di.unito.it

ICML'96 has its own page on the World-Wide Web at:
http://www.di.unito.it/pub/WWW/ICML96/home.html
Up-to-date workshop information will always be accessible from this address
http://nathan.gmd.de/persons/stefan.wrobel/ICML96/workshops.html
or from the workshops' home pages.
====================================================================
Program Chair
-------------
Lorenza Saitta                saitta@di.unito.it
Universita di Torino          Phone: (+39) 11 - 7429.214
Dipartimento di Informatica   Fax:   (+39) 11 - 751.603
Corso Svizzera 185
10149 Torino (Italy)

Local Chair
-----------
Floriana Esposito             esposito@vm.csata.it
Universita di Bari            Phone: (+39) 80 - 5443.264
Dipartimento di Informatica   Fax:   (+39) 80 - 5443.196
Via Orabona 4
70125 Bari (Italy)

Workshop Chair
--------------
Stefan Wrobel (stefan.wrobel@gmd.de)
GMD, FIT.KI
Schloss Birlinghoven
53754 Sankt Augustin (Germany)

Publicity Chair
---------------
Jeff Schlimmer (schlimme@eecs.wsu.edu)
School of Electrical Engineering and Computer Science
Washington State University
Pullman, WA 99164-2752 (USA)

Organizing Committee
--------------------
Giovanni Semeraro (Italy)              semeraro@vm.csata.it
Marco Botta and Filippo Neri (Italy)   {botta, neri}@di.unito.it
====================================================================


>~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~