
Data Mining and Knowledge Discovery Nuggets 96:37, e-mailed 96-11-27

News:
* GPS, LA Times: Bill Gates says Microsoft's competitive advantage is ..
Publications:
* R. Kohavi, Data Mining Using MLC++ won TAI's Best Paper award,
ftp://starry.stanford.edu/pub/ronnyk/mlc96.ps.Z
* F. Famili, CFP: Intelligent Data Analysis - new journal,
http://www.elsevier.com/locate/ida
* H. Motoda, CFP: IEEE Expert Spec. Issue on
feature transformation and subset selection
* T. Dietterich, Statistical Tests for Comparing Supervised
Classification Learning Algorithms,
ftp://ftp.cs.orst.edu/pub/tgd/papers/stats.ps.gz
Siftware:
* GPS, new entries for data mining companies
DAZSI and Neural Technologies
Positions:
* M. Singh, NCSU Position: Workflow Management and Data Mining,
http://www.csc.ncsu.edu
Meetings:
* H. Liu, Advance Program PAKDD-97: Pacific-Asia Conf. on
Knowledge Discovery and Data Mining, Singapore, 23-24 February, 1997
http://www.iscs.nus.sg/conferences/pakdd97
--
KDD Nuggets is a newsletter for the Data Mining and Knowledge
Discovery in Databases (KDD) community, focusing on the latest research and
applications.

Submissions are most welcome and should be emailed,
with a DESCRIPTIVE subject line (and a URL, when available), to kdd@gte.com.
To subscribe, email 'subscribe kdd' to kdd-request@gte.com.

Nuggets frequency is approximately 3 times a month.
Back issues of Nuggets, a catalog of S*i*ftware (data mining tools),
and a wealth of other information on Data Mining and Knowledge Discovery
are available at the Knowledge Discovery Mine site, http://info.gte.com/~kdd

-- Gregory Piatetsky-Shapiro, Editor.

********************* Official disclaimer ***********************************
* All opinions expressed herein are those of the writers (or the moderator) *
* and not necessarily of their respective employers (or GTE Laboratories) *
*****************************************************************************

~~~~~~~~~~~~ Quotable Quote ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The man who never alters his opinions is like standing water,
and breeds reptiles of the mind.
William Blake

>~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Date: Fri, 22 Nov 1996 15:31:49 -0500
From: gps@gte.com (Gregory Piatetsky-Shapiro)
Subject: LA Times: Bill Gates says Microsoft's competitive advantage is
in 'Bayesian networks.'

(I got this from uai list. GPS)

Los Angeles Times, October 28, 1996

When Microsoft Senior Vice President Steve Ballmer first heard his
company was planning to make a huge investment in an Internet service offering
movie reviews and local entertainment information in major cities across the
nation, he went to Chairman Bill Gates with his concerns.

After all, Ballmer has billions of dollars of his own money in
Microsoft stock, and entertainment isn't exactly the company's strong point.

But Gates dismissed such reservations. Microsoft's competitive
advantage, he responded, was its expertise in 'Bayesian networks.'

Asked recently when computers would finally begin to understand human
speech, Gates began discussing the critical role of 'Bayesian' systems.

Ask any other software executive about anything 'Bayesian' and you're
liable to get a blank stare.

Is Gates onto something? Is this alien-sounding technology
Microsoft's new secret weapon?

Quite possibly.

Bayesian networks are complex diagrams that organize the body of
knowledge in any given area by mapping out cause-and-effect
relationships among key variables and encoding them with numbers that
represent the extent to which one variable is likely to affect
another.

Programmed into computers, these systems can automatically generate
optimal predictions or decisions even when key pieces of information are
missing.
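
For illustration, here is a minimal Python sketch of that idea: a tiny network
with two possible causes of a printing problem and one observed symptom. The
variable names and probabilities are invented for this example and are not
taken from any Microsoft model; inference is done by brute-force enumeration.

# A minimal sketch of a Bayesian network (illustrative only; the variables and
# probabilities are invented). Two binary causes of a printing problem point at
# one observed symptom; the network stores P(cause) and P(symptom | causes),
# and enumeration gives P(cause | symptom).

from itertools import product

# Prior probabilities of each cause being present (assumed numbers).
P_DRIVER_BAD = 0.02      # P(printer driver is misconfigured)
P_CABLE_LOOSE = 0.05     # P(printer cable is loose)

# Conditional probability of the symptom given each combination of causes.
P_NO_OUTPUT = {          # P(nothing prints | driver_bad, cable_loose)
    (True,  True):  0.99,
    (True,  False): 0.90,
    (False, True):  0.80,
    (False, False): 0.01,
}

def prior(driver_bad, cable_loose):
    """Joint prior over the two causes (assumed independent)."""
    p = P_DRIVER_BAD if driver_bad else 1 - P_DRIVER_BAD
    p *= P_CABLE_LOOSE if cable_loose else 1 - P_CABLE_LOOSE
    return p

# P(driver bad | nothing prints): sum the joint over the other, unknown cause.
numerator = sum(prior(True, c) * P_NO_OUTPUT[(True, c)] for c in (True, False))
evidence = sum(prior(d, c) * P_NO_OUTPUT[(d, c)]
               for d, c in product((True, False), repeat=2))

print("P(bad driver | nothing prints) = %.3f" % (numerator / evidence))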

When Microsoft in 1993 hired Eric Horvitz, David Heckerman and Jack
Breese, pioneers in the development of Bayesian systems, colleagues in
the field were surprised. The field was still an obscure, largely
academic enterprise.

Today the field is still obscure. But scratch the surface of a range
of new Microsoft products and you're likely to find Bayesian networks embedded
in the software. And Bayesian nets are being built into models that are used to
predict oil and stock prices, control the space shuttle and diagnose disease.

Artificial intelligence (AI) experts, who saw their field
discredited in the early 1980s after promising a wave of 'thinking'
computers that they ultimately couldn't produce, believe widening
acceptance of the Bayesian approach could herald a renaissance in the
field.

Bayesian networks provide 'an overarching graphical framework' that
brings together diverse elements of AI and increases the range of its likely
application to the real world, says Michael Jordan, professor of brain
and cognitive science at the Massachusetts Institute of Technology.

Microsoft is unquestionably the most aggressive in exploiting the new
approach. The company offers a free Web service that helps customers
diagnose printing problems with their computers and recommends the quickest way
to resolve them. Another Web service helps parents diagnose their
children's health problems.

The latest version of Microsoft Office software uses the technology
to offer a user help based on past experience, how the mouse is being moved and
what task is being done.

'If his actions show he is distracted, he is likely to need help,'
Horvitz says. 'If he's been working on a chart, chances are he needs help
formatting the chart.'

'Gates likes to talk about how computers are now deaf, dumb, blind
and clueless. The Bayesian stuff helps deal with the clueless part,' says
Daniel T. Ling, director of Microsoft's research division and a former IBM
scientist.

Bayesian networks get their name from the Rev. Thomas Bayes, who
wrote an essay, posthumously published in 1763, that offered a
mathematical formula for calculating probabilities among several
variables that are causally related but for which--unlike calculating
the probability of a coin landing on heads or tails--the relationships
can't easily be derived by experimentation.
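
In modern notation, that formula is Bayes' rule, P(H|E) = P(E|H) P(H) / P(E).
A small numeric illustration (the disease and test numbers below are invented,
not from the article):

# Bayes' rule on invented numbers: how likely is a disease given a positive
# diagnostic test?

p_disease = 0.01            # prior: 1% of patients have the disease (assumed)
p_pos_given_disease = 0.95  # test sensitivity (assumed)
p_pos_given_healthy = 0.05  # false positive rate (assumed)

# Total probability of a positive test (the evidence term).
p_pos = (p_pos_given_disease * p_disease
         + p_pos_given_healthy * (1 - p_disease))

# Posterior probability of disease given the positive test.
p_disease_given_pos = p_pos_given_disease * p_disease / p_pos
print("P(disease | positive test) = %.3f" % p_disease_given_pos)  # ~0.161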

Early students of probability applied the ideas to discussions about
the existence of God or efforts to improve their odds in gambling. Much
later, social scientists used the approach to help clarify the key factors influencing a
particular event.

But it was the rapid progress in computer power and the development
of key mathematical equations that made it possible for the first time, in the
late 1980s, to compute Bayesian networks with enough variables that they were
useful in practical applications.

The Bayesian approach filled a void in the decades-long effort to add
intelligence to computers.

In the late 1970s and '80s, reacting to the 'brute force' approach
to problem solving by early users of computers, proponents of the
emerging field of artificial intelligence began developing software
programs using rule-based, if-then propositions. But the systems took
time to put together and didn't work well if, as was frequently the
case, you couldn't answer all the computer's questions clearly.

Later companies began using a technique called 'neural nets' in which
a computer would be presented with huge amounts of data on a particular
problem and programmed to pull out patterns. A computer fed with a big stack of
X-rays and told whether or not cancer was present in each case would pick out
patterns that would then be used to interpret X-rays.

But the neural nets won't help predict the unforeseen. You can't
train a neural net to identify an incoming missile or plane because you could
never get sufficient data to train the system.

In part because of these limitations, a slew of companies that popped
up in the early 1980s to sell artificial intelligence systems virtually all
went bankrupt.

Many AI techniques continued to be used. Credit card companies, for
example, began routinely using neural networks to pick out transactions that
don't look right based on a consumer's past behavior. But increasingly, AI was
regarded as a tool with limited use.

Then, in the late 1980s--spurred by the early work of Judea Pearl,
a professor of computer science at UCLA, and breakthrough mathematical
equations by Danish researchers--AI researchers discovered that
Bayesian networks offered an efficient way to deal with the lack or
ambiguity of information that has hampered previous systems.

Horvitz and his two Microsoft colleagues, who were then classmates at
Stanford University, began building Bayesian networks to help diagnose
the condition of patients without turning to surgery.

The approach was efficient, says Horvitz, because you could combine
historical data, which had been meticulously gathered, with the less
precise but more intuitive knowledge of experts on how things work to get the
optimal answer given the information available at a given time.

Horvitz, who with two colleagues founded Knowledge Industries to
develop tools for building Bayesian networks, says he and the others left the
company to join Microsoft in part because they wanted to see their theoretical
work more broadly applied.

Although the company did important work for the National Aeronautics
and Space Administration and on medical diagnostics, Horvitz says, 'It's not
like your grandmother will use it.'

Microsoft's activities in the field are now helping to build a
groundswell of support for Bayesian ideas.

'People look up to Microsoft,' says Pearl, who wrote one of the key
early texts on Bayesian networks in 1988 and has become an unofficial
spokesman for the field. 'They've given a boost to the whole area.'

A researcher at German conglomerate Siemens says Microsoft's work has
drawn the attention of his superiors, who are now looking seriously at
applying Bayesian concepts to a range of industrial applications.

Scott Musman, a computer consultant in Arlington, Va., recently
designed a Bayesian network for the Navy that can identify enemy
missiles, aircraft or vessels and recommend which weapons could be
used most advantageously against incoming targets.

Musman says previous attempts using traditional mathematical
approaches on state-of-the-art computers would get the right answer
but would take two to three minutes.

'But you only have 30 seconds before the missile has hit you,' says
Musman.

General Electric is using Bayesian techniques to develop a system
that will take information from sensors attached to an engine and,
based on expert opinion built into the system as well as vast amounts
of data on past engine performance, pinpoint emerging problems.

Microsoft is working on techniques that will enable the Bayesian
networks to 'learn' or update themselves automatically based on new
knowledge, a task that is currently cumbersome.

The company is also working on using Bayesian techniques to improve
upon popular AI approaches such as 'data mining' and 'collaborative
filtering' that help draw out relevant pieces of information from
massive databases. The latter will be used by Microsoft in its new
online entertainment service to help people identify the kind of
restaurants or entertainment they are most likely to enjoy.
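
As a rough illustration of the collaborative-filtering idea (not Microsoft's
system; the users, restaurants, and ratings below are invented), an unrated
restaurant can be scored from the ratings of users with similar tastes:

# A minimal sketch of user-based collaborative filtering on invented data:
# recommend restaurants to a user from the ratings of similar users, with
# similarity measured by cosine over co-rated items.

from math import sqrt

ratings = {   # user -> {restaurant: rating on a 1..5 scale}
    "alice": {"thai_palace": 5, "burger_barn": 2, "sushi_go": 4},
    "bob":   {"thai_palace": 4, "burger_barn": 1, "noodle_hut": 5},
    "carol": {"burger_barn": 5, "sushi_go": 1},
}

def cosine(u, v):
    """Cosine similarity over the restaurants both users rated."""
    common = set(u) & set(v)
    if not common:
        return 0.0
    dot = sum(u[r] * v[r] for r in common)
    nu = sqrt(sum(u[r] ** 2 for r in common))
    nv = sqrt(sum(v[r] ** 2 for r in common))
    return dot / (nu * nv)

def recommend(user):
    """Score unrated restaurants by similarity-weighted ratings of others."""
    scores, weights = {}, {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        sim = cosine(ratings[user], other_ratings)
        for rest, rating in other_ratings.items():
            if rest not in ratings[user]:
                scores[rest] = scores.get(rest, 0.0) + sim * rating
                weights[rest] = weights.get(rest, 0.0) + sim
    return sorted(((s / weights[r], r) for r, s in scores.items()
                   if weights[r] > 0), reverse=True)

print(recommend("alice"))   # noodle_hut, ranked via Bob's similar taste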

Still, as effective as they are proving to be in early use, Bayesian
networks face an uphill battle in gaining broad acceptance.

'An effective solution came just as the bloom had come off the AI
rose,' says Peter Hart, head of Ricoh's California Research Center at
Menlo Park, a pioneer of AI.

And skeptics insist any computer reasoning system will always fall
short of people's expectations because of the computer's tendency to
miss what is often obvious to the human expert.

Still, Hart believes the technology will catch on because it is
cost-effective. Hart developed a Bayesian-based system that enabled
Ricoh's copier help desk to answer twice the number of customer questions in
almost half the time.

Hart says Ricoh is now looking at embedding the networks in
products so customers can see for themselves what the likely problems
are. He believes auto makers will soon build Bayesian nets into cars
that predict when various components of a car need to be repaired or
replaced.


>~~~Publications:~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Date: Thu, 21 Nov 1996 22:50:58 -0800
From: Ronny Kohavi (ronnyk@starry.engr.sgi.com)
Subject: Data Mining Using MLC++ won TAI's Best Paper award

The paper 'Data Mining using MLC++, A Machine Learning Library in C++'
won the IEEE Tools with Artificial Intelligence Best Paper Award.

The paper is available at ftp://starry.stanford.edu/pub/ronnyk/mlc96.ps.Z
or under publications off http://robotics.stanford.edu/~ronnyk

MLC++ is now being used in SGI's data mining product, MineSet 1.1.
For more information, see: http://www.sgi.com/Products/software/MineSet


Data Mining using MLC++
A Machine Learning Library in C++

Ron Kohavi, Dan Sommerfield                      James Dougherty
Data Mining and Visualization Platform Group     Sun Microsystems
Silicon Graphics, Inc.                           jamesd@eng.sun.com
{ronnyk,sommda}@engr.sgi.com


ABSTRACT

Data mining algorithms including machine learning, statistical
analysis, and pattern recognition techniques can greatly improve our
understanding of data warehouses that are now becoming more
widespread. In this paper, we focus on classification algorithms and
review the need for multiple classification algorithms. We describe a
system called MLC++, which was designed to help choose the appropriate
classification algorithm for a given dataset by making it easy to
compare the utility of different algorithms on a specific dataset of
interest. MLC++ not only provides a workbench for such comparisons,
but also provides a library of C++ classes to aid in the development
of new algorithms, especially hybrid algorithms and multi-strategy
algorithms. Such algorithms are generally hard to code from scratch.
We discuss design issues, interfaces to other programs, and
visualization of the resulting classifiers.
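
The comparison workbench idea in the abstract can be sketched in a few lines;
the snippet below is a Python/scikit-learn analogue of my own, not MLC++'s
actual interface, and simply runs several classification algorithms on one
dataset and reports cross-validated accuracy:

# Compare several classification algorithms on the same dataset via 10-fold
# cross-validation (an analogue of the MLC++ workbench idea, not its API).

from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

candidates = {
    "decision tree": DecisionTreeClassifier(random_state=0),
    "naive Bayes":   GaussianNB(),
    "k-NN (k=5)":    KNeighborsClassifier(n_neighbors=5),
}

# 10-fold cross-validated accuracy for each candidate on the same dataset.
for name, clf in candidates.items():
    scores = cross_val_score(clf, X, y, cv=10)
    print("%-15s mean accuracy = %.3f (+/- %.3f)"
          % (name, scores.mean(), scores.std()))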

>~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Date: Thu, 14 Nov 1996 14:03:18 -0500
From: fazel@ai.iit.nrc.ca (Fazel Famili)
Subject: Intelligent Data Analysis - New Journal

C A L L F O R P A P E R S (New Journal)
=============================================

Intelligent Data Analysis - An International Journal

An electronic, Web-based journal
Published by Elsevier Science

******** Launch Date : January 15, 1997 ********


URL: http://www.elsevier.com/locate/ida
http://www.elsevier.nl/locate/ida


Important e-mail addresses:

Editor-in-Chief: Dr. A. Famili (famili@ai.iit.nrc.ca)
Subscription Information: USDirect@elsevier.com



Introduction
------------

As science and engineering disciplines become more and more computerized,
the volume and complexity of the data produced on a day-to-day basis quickly
becomes overwhelming. Traditional data analysis approaches have proven
limited in their ability to generate useful information. In a wide variety
of disciplines (as diverse as financial management, engineering, medical/
pharmaceutical research and manufacturing) researchers are adapting
Artificial Intelligence techniques and using them to conduct intelligent
data analysis and knowledge discovery in large data sets.


Aims/Scope
~~~~~~~~~~

The journal of Intelligent Data Analysis will provide a forum for the
examination of issues related to the research and applications of
Artificial Intelligence techniques in data analysis across a variety of
disciplines. These techniques include (but are not limited to): all areas of
data visualization, data pre-processing (fusion, editing, transformation,
filtering, sampling), data engineering, database mining techniques, tools
and applications, use of domain knowledge in data analysis, machine
learning, neural nets, fuzzy logic, statistical pattern recognition,
knowledge filtering, and post-processing. In particular, we prefer papers
that discuss development of new AI architectures, methodologies, and
techniques and their applications to the field of data analysis. Papers
published in this journal will be geared heavily towards applications, with
an anticipated split of 70% of the papers published being applications-
oriented, and the remaining 30% containing more theoretical material.

Intelligent Data Analysis will be a fully electronic, refereed quarterly
journal. It will contain a number of innovative features not available in
comparable print publications. These features include:

- An alerting service notifying subscribers of new papers in the journal
- Links to large-scale data collections
- Links to secondary collection of data related to material presented in
the journal
- The ability to test new search mechanisms on the collection of journal
articles
- Links to related bibliographic material

Information for Authors:
~~~~~~~~~~~~~~~~~~~~~~~~

General

Intelligent Data Analysis invites submission of research and application
articles that comply with the Aims and Scope of the journal. In particular,
articles that discuss development of new AI architectures, methodologies,
and techniques and their applications to the field of data analysis are
preferred. Manuscripts are received with the understanding that their
content is unpublished material and is not being submitted for publication
elsewhere. Further, it is understood that each co-author has made
substantial contributions to the work described and that each accepts joint
responsibility for publication.

Manuscripts

The manuscript should be submitted in the following format. The first page
of the article should contain the title (preferably less than 10 words),
the name(s), address(es), affiliation(s), and e-mail address(es) of the
authors. The first page should also contain an abstract of approximately
200 words, followed by 3 to 5 keywords. Manuscripts should not exceed the
equivalent of 35-40 double-spaced typed pages of text (or the compressed/
encoded PostScript file should not be more than 1.0 Mb.).

Submission

In the interest of rapid publication, authors should submit the text of
original papers in PostScript (compressed files) to the Editor-in-Chief,
Dr. A. Famili, at famili@ai.iit.nrc.ca. Graphic and/or tabular files should
be sent in separate files in Encapsulated PostScript or GIF format. Please
make sure to (i) COMPRESS your file, (ii) UUENCODE your file, (iii) ATTACH
your file to your e-mail (do not include it in the body of your e-mail). If you prepare
your file in any environment other than UNIX, please specify the environment
and the steps taken. The corresponding author will receive an acknowledgment
via e-mail.

The Review Process

Each article will be reviewed by at least two reviewers. The authors
will receive results of the review process via e-mail.

Final Manuscripts

Upon acceptance for publication, the publisher requires an electronic
copy of the manuscript in one of the following formats, along with originals
of figures and tables: FrameMaker (UNIX), WordPerfect, Microsoft Word.
For articles produced in TeX, authors may submit ASCII versions of the
text along with EPS files for any maths, tables, or diagrams.



For questions about these instructions or more information about submitting
a paper to Intelligent Data Analysis contact:

Dr. A. Famili at famili@ai.iit.nrc.ca

or refer to the journal home page http://www.elsevier.com/locate/ida.


>~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
From: motoda@sanken.osaka-u.ac.jp
Subject: Call for papers for the IEEE Expert special issue on feature
transformation and subset selection
Date: Fri, 15 Nov 96 11:11:47 +0900

Could you post this CFP to the KDD nuggets?

Hiroshi Motoda
Professor
Division of Intelligent Systems Science,
The Institute of Scientific and Industrial Research,
Osaka University
8-1 Mihogaoka, Ibaraki, Osaka 567
Japan

E-mail motoda@sanken.osaka-u.ac.jp
Phone : 81-6-879-8540

Fax : 81-6-879-8544
URL : http://www.sanken.osaka-u.ac.jp/labs/hr/


--------------------


Call For Papers

IEEE Expert

Special Issue on

Feature Transformation and Subset Selection

Guest Editors: Huan Liu and Hiroshi Motoda

I. BACKGROUND

As computer and database technologies have advanced, human
beings rely more and more on computers to accumulate data,
process data, and make use of data. Machine learning, knowledge
discovery, and data mining are some of the Artificial
Intelligence (AI) tools that help humans accomplish these tasks.
Researchers and practitioners realize that, in order to use these
tools effectively, an important step is pre-processing, in which
data is prepared before it is presented to any learning,
discovery, or visualization algorithm. In many discovery
applications (for example, marketing data analysis), a key
operation is to find subsets of the population that behave
enough alike to be worthy of focused analysis. Feature
transformation and subset selection are applied frequently in
data pre-processing.

Feature transformation (FT) is a process through which a new set
of features is created. The variants of FT are feature
construction, feature discovery, and feature extraction.
Assuming the original set consists of A1, A2,..., An features,
these variants can be defined below.

Feature construction is a process that augments the space of
features by inferring or creating additional features. After
feature construction, we may have additional m features
An+1,An+2,...,An+m. For example, a new feature Ak (n < k <=
n+m) could be constructed by performing a logical operation on
Ai and Aj from the original set.

Feature discovery is a process that discovers missing
information about the relationships between features and forms
a new set of features. An example of feature discovery is as
follows: a two-dimensional problem (say, A1=width and
A2=length) may be transformed to a one-dimensional problem
(B1=area) after feature discovery.

Feature extraction is a process that extracts a set of new
features from the original features through some functional
mapping. After feature extraction, we have B1, B2,..., Bm (m <
n), Bi = Fi(A1,A2,...,An), and Fi is a mapping function. For
instance, B1=c1A1+c2A2 where c1 and c2 are constants.

Subset selection (SS) is different from FT in that no new
features will be generated, but only a subset of original
features is selected and the feature space is reduced. As to FT,
feature construction expands the feature space, whereas feature
discovery and feature extraction reduce the feature space.
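
A small NumPy sketch of the four operations just defined, on an invented data
matrix with features A1..A3 (the constants and the choice of features to keep
are arbitrary illustrations, not prescriptions from the call):

# Feature construction, discovery, extraction, and subset selection on an
# invented dataset (NumPy only).

import numpy as np

rng = np.random.default_rng(0)
A = rng.uniform(1.0, 10.0, size=(5, 3))    # rows = examples, columns = A1..A3

# Feature construction: augment the space with a new feature derived from
# existing ones (here A4 = A1 > A2, a logical combination as in the text).
A4 = (A[:, 0] > A[:, 1]).astype(float).reshape(-1, 1)
constructed = np.hstack([A, A4])           # now A1..A4

# Feature discovery: replace two related features by one that captures their
# relationship (the width/length -> area example: B1 = A1 * A2).
discovered = (A[:, 0] * A[:, 1]).reshape(-1, 1)

# Feature extraction: map the original features to fewer new ones through a
# functional mapping (the B1 = c1*A1 + c2*A2 example, with assumed constants).
c = np.array([0.7, 0.3, 0.0])
extracted = (A @ c).reshape(-1, 1)

# Subset selection: keep a subset of the original features, create nothing new.
selected = A[:, [0, 2]]                    # keep A1 and A3 only

print(constructed.shape, discovered.shape, extracted.shape, selected.shape)
# (5, 4) (5, 1) (5, 1) (5, 2)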

There is a wide and strong interest in FT and SS among
practitioners from Statistics, Pattern Recognition, Data Mining,
and Knowledge Discovery to Machine Learning since data
preprocessing is an essential step in the knowledge discovery
process for real-world applications.

II. OBJECTIVE and SCOPE

The objective of this special issue is to report on the recent
studies in FT and SS. The main goal is to increase the AI
community's awareness of research on FT and SS, which is currently
conducted in isolation. Through this special issue, we hope to
produce a contemporary overview of modern solutions, to create
synergy among these seemingly different branches but with a
similar goal - facilitating data processing and knowledge
discovery, and to point to future research directions.

Papers are expected to cover the following aspects of FT and SS;
in all cases, authors are strongly encouraged to use real-world
examples to show that their work is scalable and applicable to
practical problems:

. Theories and Methodologies of novel approaches to feature
transformation and subset selection

. Applications of feature transformation and subset selection
to real-world problems: practice, experiments, and lessons
learned

. Combinations of different methods such as machine learning,
statistics as well as neural networks in feature
transformation

. Future directions and important issues in unifying this
currently diversified field

III. SUBMISSION REQUIREMENTS and SCHEDULE

High quality, original papers that deal with real-world problems
are solicited. All submitted manuscripts will be subject
to a rigorous review process. Manuscripts should be prepared in
accordance with the IEEE Expert 'submission guidelines'.
Manuscripts should be approximately 5,000 words long, with preferably
no more than 10 references. This special issue is scheduled to
appear in late 1997.

Important Dates:

Submission April 30 (FIRM DEADLINE)

Notification June 30


Prospective authors should submit six copies of the completed
manuscript to one of the guest editors:

Huan Liu                                   Hiroshi Motoda
Dept of Info Sys & Comp Sci                Institute of Scientific & Industrial Research
National University of Singapore           Osaka University
S16 #4-17, Kent Ridge, Singapore 119260    Ibaraki, Osaka 567, Japan
liuh@iscs.nus.sg                           motoda@sanken.osaka-u.ac.jp

>~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

From: Tom Dietterich (tgd@chert.cs.orst.edu)
Date: Thu, 17 Oct 96 08:52:50 PDT
Subject: Statistical Tests for Comparing Supervised Classification Learning Algorithms

(This paper was announced in ML list a while ago, but I think it would
also be useful to many data miners, since it addresses a fundamental
problem. GPS)

The following paper is available from
ftp://ftp.cs.orst.edu/pub/tgd/papers/stats.ps.gz

Statistical Tests for Comparing Supervised Classification Learning Algorithms
Thomas G. Dietterich
Department of Computer Science
Oregon State University
Corvallis, OR 97331

Abstract:

This paper reviews five statistical tests for determining whether one
learning algorithm out-performs another on a particular learning task.
These tests are compared experimentally to determine their probability
of incorrectly detecting a difference when no difference exists (Type
I error). Two widely-used statistical tests are shown to have high
probability of Type I error in certain situations and should never be
used. These tests are (a) a test for the difference of two
proportions and (b) a paired-differences $t$ test based on taking
several random train/test splits. A third test, a paired-differences
$t$ test based on 10-fold cross-validation, exhibits somewhat elevated
probability of Type I error. A fourth test, McNemar's test, is shown
to have low Type I error. The fifth test is a new test, 5x2cv, based
on 5 iterations of 2-fold cross-validation. Experiments show that
this test also has good Type I error. The paper also measures the
power (ability to detect algorithm differences when they do exist) of
these tests. The 5x2cv test is shown to be slightly more powerful
than McNemar's test. The choice of the best test is determined by the
computational cost of running the learning algorithm. For algorithms
that can be executed only once, McNemar's test is the only test with
acceptable Type I error. For algorithms that can be executed ten
times, the 5x2cv test is recommended, because it is slightly more
powerful and because it directly measures variation due to the choice
of training set.
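
For readers who want to try the simplest of the recommended tests, here is a
minimal sketch of McNemar's test with the usual continuity correction; the
disagreement counts below are invented, and the paper itself should be
consulted for the full experimental comparison:

# McNemar's test, the low-Type-I-error test recommended when a learner can be
# run only once (the counts below are invented).

def mcnemar(n01, n10):
    """McNemar's test statistic with continuity correction.

    n01: test examples misclassified by algorithm A but not by B
    n10: test examples misclassified by B but not by A
    Returns the chi-square statistic (1 degree of freedom).
    """
    if n01 + n10 == 0:
        return 0.0
    return (abs(n01 - n10) - 1) ** 2 / (n01 + n10)

chi2 = mcnemar(n01=12, n10=27)
# 3.841 is the 95th percentile of the chi-square distribution with 1 df.
print("chi2 = %.3f -> %s at the 0.05 level"
      % (chi2, "difference detected" if chi2 > 3.841
         else "no significant difference"))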

Thomas G. Dietterich Voice: 541-737-5559
Department of Computer Science FAX: 541-737-3014
Dearborn Hall, 303 URL: http://www.cs.orst.edu/~tgd
Oregon State University
Corvallis, OR 97331-3102


>~~~Siftware:~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Date: Tue, 26 Nov 1996
From: gps@gte.com (Gregory Piatetsky-Shapiro)
Subject: New Data Mining-related Companies


In the Companies:Consulting page, added DAZ Systems Incorporated,
provider of Agent-Oriented Database Technology for Data-Mining and Knowledge
Discovery.

In the Companies:Software and Services page, added Neural Technologies,
providing a range of products/services associated with data mining.



>~~~Positions:~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
From: mpsingh@eos.ncsu.edu (Munindar Singh)
Subject: NCSU Position: Workflow Management and Data Mining
Date: Fri, 22 Nov 1996 08:28:06 -0500 (EST)

Hi,

We are advertising the following position. Please feel free to
forward this ad to students and colleagues. Early applications are
especially welcome.

Thanks,
Munindar Singh

=========================================================================
North Carolina State University
Department of Computer Science
Workflow Management and Data Mining

The Department of Computer Science at North Carolina State University
seeks an assistant professor in the broad area of workflow management
and data mining, although candidates with a background in other areas
of distributed systems will also be considered. We especially seek
candidates with interests and qualifications that complement or
strengthen our present faculty. The successful candidate will have a
Ph.D. in Computer Science and an extensive research record.

This new position has been created to further strengthen our
activities in software systems. Our present activities involve a
number of relevant areas, such as workflows and relaxed transaction
management, software process modeling, formal methods and tools for
concurrency, heterogeneous database access, scientific data
management, multiagent systems, machine learning, distributed
computing, multimedia, and human interfaces. Applications of interest
include education, healthcare, process control, and scientific
decision-support.

The new faculty member will find a lively and collegial work
environment. The department is in a period of rapid growth and
advancement, and is positioning itself to be at the forefront of
selected areas in computer science. We attract research sponsorship
from a variety of sources, including ARPA, AFOSR, EPA, NASA, NIH, NSF,
and ONR. Industrial sources include IBM, Fujitsu, Glaxo-Wellcome, and
others. The candidate will have access to our state-of-the-art
high-performance ATM-based networking, computational, and multimedia
facilities. In December, the department's Multimedia Laboratory, with
which the successful candidate will be affiliated, will move to a
5,000 sq. ft. space in the new $41 million Engineering Graduate
Research Center.

The university is located in Raleigh, which forms one vertex of the
world-renowned Research Triangle. The Research Triangle area was
recently recognized as one of the 'best places to live in' in the U.S.
It boasts a large concentration of high technology companies, such as
Alcatel, Data General, Ericsson, Fujitsu, Glaxo-Wellcome, IBM, Nortel
(Northern Telecom, Bell Northern Research) and the SAS Institute, and
research institutions such as EPA, NIEHS/NIH, and RTI.

Interested candidates should send their resume (including citizenship
and visa status) and the names of four references to:

Chair, Workflow Recruitment Committee
Department of Computer Science
226 Withers Hall
North Carolina State University
Raleigh, NC 27695-8206, USA

Prospective candidates are encouraged to access the department's
homepage http://www.csc.ncsu.edu and to send email to
workflow@csc.ncsu.edu. The university is an Equal Opportunity,
Affirmative Action employer.


>~~~Meetings:~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
From: Liu Huan (liuh@iscs.nus.sg)
Subject: Advance Program - PAKDD97 and Conference Registration Form
Date: Mon, 25 Nov 1996 09:09:48 +0800 (GMT-8)

FIRST PACIFIC-ASIA CONFERENCE on

KNOWLEDGE DISCOVERY and DATA MINING (PAKDD97)

Singapore, 23-24 February, 1997

(Co-located with 2nd Pacific-Asia Conference on Expert Systems/
3rd Singapore International Conference on Intelligent Systems)

http://www.iscs.nus.sg/conferences/pakdd97
or
http://www.iscs.nus.sg/conferences/pakdd97.html

************************************************************************

Advance Program at a Glance - PAKDD '97

23 Feb (Sunday)

0900 Opening Ceremony
0930 Invited Talk by Evangelos Simoudis (IBM)

1030 Tea Break
1100 Session 1: Clustering Techniques

1220 Lunch (provided)

1400 Session 2A: Classifiers & Classification Techniques I
     Session 2B: Concept Learning & Rule Discovery

1520 Tea Break

1550 Session 3A: Classifiers & Classification Techniques II
     Session 3B: KDD Applications

1730 Conference Dinner

24 Feb (Monday)

0900 Invited Talk by Tejwesh Anand (NCR)
1000 Session 4A: Visual Data Exploration
     Session 4B: DM Tools I - Pilot Software, Software AG
1040 Tea Break

1110 Session 5A: Learning Association Rules & Patterns
     Session 5B: DM Tools II - SAS Institute, Integral Solutions Ltd

1230 Lunch (Provided)

1400 Tutorial

An Introduction to Effective Data Mining
by Dr. Evangelos Simoudis (IBM)

1500 Tea Break

1530 Tutorial continued

1700 Conference Ends

************************************************************************

EXHIBITION

An exhibition of the latest data mining tools and products will be held in
conjunction with the conference. We have already received numerous
requests for exhibition space. Potential exhibitors and corporate sponsors
are invited to contact the PAKDD '97 Exhibition Chair, Ms. Lynica Foo (Tel:
770 5948, fax: 779 1827) for further information.

23 - 24 Feb 97 0900hrs - 1700hrs

************************************************************************

TECHNICAL PROGRAM

Invited Talks by Evangelos Simoudis and Tejwesh Anand

Papers to Be Presented (please check the conference web page)

Tutorial by Evangelos Simoudis

************************************************************************

For further information

Ms. Hwee-Leng Ong

PAKDD '97 Secretariat

Japan-Singapore AI Centre
Information Technology Institute
11 Science Park Road
Singapore Science Park II
Singapore 117685


Fax: (+65) 770 5951

Email: pakdd97@iti.gov.sg

WWW: http://www.iscs.nus.sg/conferences/pakdd97

***************CUT HERE********************CUT HERE**************************

Conference Registration Form - PAKDD'97


Name (please underline surname): ____________________________(Prof/Dr/Mr/Ms)*

Designation: ________________________________________________________________

Organisation: _______________________________________________________________

Mailing Address: ____________________________________________________________

_____________________________________________ Country: _____________________

Tel: _________________ Fax: _________________ Email: _____________________



For members of co-operating societies, please indicate:

Society Name: __________________ Membership No. _________________



For student registration, please indicate:

Supervisor Name: _______________ Supervisor's Signature _____________



Circle fees     Early Registration Rates (S$)       Late Registration Rates (S$)
that apply      (postmarked by 30 Nov 1996)         (after 30 Nov 1996)
                Member*  Non-Member  Student+       Member*  Non-Member  Student+

Conference       342.00      360.00    180.00        360.00      400.00    200.00

Tutorial         110.00      128.00     68.00        121.00      135.00     75.00

Conference       362.00      390.00    198.00        385.00      428.00    220.00
cum Tutorial

Joint Conf       974.00      990.00    432.00       1062.00     1098.00    450.00


*Members of co-operating societies, academic staff, or speakers.

+Student registration does not include conference dinner.

Notes: - Tutorial-only registration includes tea and tutorial notes.
- Joint Conference registration includes the PAKDD'97 Conference +
Tutorial and the PACES conference.
- Conference registration includes daily lunches, teas, conference
dinner, and proceedings.
- Rates are inclusive of Goods and Services Tax (GST).

Special Meal Requirements (please tick):
[ ] Vegetarian
[ ] Muslim
[ ] Others (please specify) ____________________


Method of Payment

All payments should be made in Singapore dollars. Only personal or company
cheques from a Singapore bank will be accepted. Bank drafts must be payable
in Singapore.

The full refund request deadline is 1 Jan 1997. There will be no refund
after this date.

Payment by cheque (local delegates only), bank draft or cashier's order
made payable to

NATIONAL COMPUTER BOARD - ITI


I enclose a cheque/bank draft/cashier's order for
the total sum of S$ ___________


Cheque/Bank Draft/Cashier's Order Number: _____________________________


Payment by credit cards

[ ] Visa [ ] MasterCard

Credit Card No:___________________________Expiry Date:_________________

Card Holder's Name _______________________Signature:___________________

Please send completed registration to:

PAKDD '97 Secretariat, Japan-Singapore AI Centre, Information Technology
Institute, 11 Science Park Road, Singapore Science Park II, Singapore
117685. Fax: (65) 779 1827.


>~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~