Knowledge Discovery Nuggets 97:09



Knowledge Discovery Nuggets 97:09, e-mailed 97-03-10

News:
* P. Domingos, Re: Looking for phrase matching tool
* R. Jain, Tandem Data Mining Announcement,
http://www.tandem.com
Siftware:
* R. Quinlan, C5.0: Successor to C4.5,
http://www.rulequest.com
Positions:
* P. Norvig, Job offered in information extraction and learning,
data mining, http://www.junglee.com
* M. Bramer, Research Fellowship in Knowledge Discovery
* X. Liu, Research Studentship in Intelligent Data Analysis,
http://web.dcs.bbk.ac.uk/~hui/IDA/home.html
* D. Sleeman, University of Aberdeen, Chair of Computing Science
http://www.csd.abdn.ac.uk/people/chair_fp.html
--
Knowledge Discovery Nuggets is a free electronic newsletter for the
Data Mining and Knowledge Discovery community, focusing on the latest
research and applications.

Submissions are most welcome and should be emailed, with a DESCRIPTIVE
subject line (and a URL) to gps.
To subscribe, see http://www.kdnuggets.com/subscribe.html

KD Nuggets frequency is 3-4 times a month.
Back issues of KD Nuggets, a catalog of data mining tools ('Siftware'),
and a wealth of other information on Data Mining and Knowledge Discovery are
available at the Knowledge Discovery Mine site, http://www.kdnuggets.com/

-- Gregory Piatetsky-Shapiro (editor)

********************* Official disclaimer *****************************
All opinions expressed herein are those of the contributors and not
necessarily of their respective employers, or of KD Nuggets
***********************************************************************

~~~~~~~~~~~~ Quotable Quote ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
There is no security, only opportunity
General MacArthur

>~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
To: David.Throop@sw.boeing.com
cc: kdd@gte.com, pedrod@ruffles.ICS.UCI.EDU
Subject: Re: Looking for phrase matching tool
Date: Fri, 28 Feb 1997 13:43:11 -0800
From: 'Pedro M. Domingos' (pedrod@ruffles.ICS.UCI.EDU)

Alvaro Monge and Charles Elkan of UC San Diego (amonge@cs.ucsd.edu,
elkan@cs.ucsd.edu) have one such program. They have a paper in the
proceedings of KDD-96 (p. 267) that describes their system, and also gives
references to other work in the area.

Pedro


>~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[Note: the following is a commercial announcement. GPS]
From: JAIN_ROHIT%t16@fedex
Date: 28 Feb 97 15:08:00 -0600
To: gps
Cc: mehta_abhay@tandem.com, rohrberg_lars@tandem.com
Subject: Tandem's Feb. 11 announcement

Hi folks,

It seems that Nuggets covers announcements made by many companies.
I am wondering what would be needed on Tandem's part to have you include
that announcement in Nuggets. You can get to the announcement from our home
page at http://www.tandem.com. I have also included parts of it in this
message.

Rohit Jain

Contact:
Kristine Austin
Tandem Computers Incorporated
Tel: +1 (408) 285 6645
World Wide Web Home Page Address: http://www.tandem.com

Tandem Object Relational Data Mining Architecture Drives Next Generation of
Knowledge Discovery

Cupertino, CA, February 11, 1997. Tandem Computers Incorporated today
announced a revolutionary approach in bringing complete knowledge discovery
to business users through its Object Relational Data Mining technology. For
the first time, the complete warehouse data set is available for real-time
data mining, resulting in reduced processing time, more complete results,
and significantly easier management. This new architecture establishes a
standard SQL interface between client data mining tools and both object
relational and relational database engines. The database engine will perform
specialized data manipulation functions required by the data mining algorithms.
Tandem's Object Relational Data Mining architecture takes full advantage of
the capabilities of relational database engines, resulting in better
performance and the ability to mine larger volumes of data.

By integrating the best-of-breed data mining software with a relational
database, Tandem's Object Relational Data Mining will enable business
professionals to more effectively uncover and exploit valuable patterns and
trends hidden in their data. This architecture will enhance knowledge
discovery in solutions such as credit card marketing, claims analysis,
retail basket analysis, and others.

The interface between data mining tools and the database engine is enabled
through the use of SQL extensions, ultimately allowing customers to enjoy a
much wider range of data mining clients. Tandem will promote the
establishment of de facto standards for these extensions with other database
vendors and data mining tool providers. 'Initially, the use of SQL
extensions will greatly enhance the way traditional alphanumeric data types
are mined today,' said Abhay Mehta, Tandem's director of Object Relational
Data Mining Development. As technology evolves, this architecture will
enable the fast, efficient mining of more complex data types such as image,
voice, video, and other multimedia objects. In the second half of 1997,
Tandem's ServerWare database will be the first to combine all of the
elements into a powerful knowledge discovery business environment.

'Tandem will be able to build on its success in the data warehouse
marketplace to position itself well in the high-end macromining segment of
the data mining arena,' said Dr. Wolfgang Martin, program director, META
Group. Tandem's approach is unique in that it opens up the powerful
ServerWare database, and other database management systems, to a wide range
of data mining functions while accommodating future data mining developments
and complex data types.

Tandem's data mining partners have been selected so that customers can
benefit from their combined breadth of data mining algorithms and for the
ability of their tools to work in a high-performance parallel environment
necessary to take advantage of this new architecture. Data mining partners
include leading companies such as Angoss Software International Limited,
Data Distilleries B.V., Magnify Incorporated, NeoVista Solutions
Incorporated, and Syllogic B.V.

ANGOSS Software International Limited

ANGOSS KnowledgeSEEKER excels in applications including fraud detection,
target marketing, process control, and risk management.
KnowledgeSEEKER displays results in a decision tree format by uncovering
valuable relationships and correlations in the dataset, and by writing
predictive rules. This format can be easily understood by any business end
user. KnowledgeSEEKER turns data into valuable business knowledge.

Data Distilleries B.V.

Data Distilleries Data Surveyor uses highly efficient decision tree based
search strategies and database optimization techniques, enabling it to take
into account hundreds of variables to mine finance, retail, insurance, and
database marketing databases. At the end of the data mining process, Data
Surveyor produces a graphical representation of the discovered relationships
and an overview of all actions and results during the mining process.

Magnify Incorporated

Magnify's PATTERN software is an open set of modular software tools for
mining, managing, and analyzing very large data sets. The PATTERN system
includes several specialized applications, such as PATTERN:Detect for
detecting fraud, anomalies, and rare events and PATTERN:Profit for
predicting the delinquency, bankruptcy, credit usage, and profitability of
customers. The PATTERN system incorporates algorithms for parallel and
distributed variants of classification, regression, and optimization trees,
and a variety of other data mining algorithms.

NeoVista Solutions Incorporated

NeoVista Solutions' Decision Series suite of knowledge discovery tools is
directed towards solving data mining challenges in a variety of markets,
including retail, insurance, telecommunications, and healthcare.
The Decision Series suite includes pattern discovery tools based on neural
networks, clustering, genetic algorithms, and association rules.

Syllogic B.V.

The Syllogic Data Mining Tool supports all stages in the data mining
process, including data selection, data cleaning, enrichment, coding,
discovery, and visualization. Using a toolbox approach, the tool combines
various database analysis techniques, such as decision trees, association
rules, k-nearest neighbor, clustering, and visualization to solve business
challenges in the finance, transportation, government, and system and
network management segments.

To help customers stay on the leading edge of data mining, Tandem is also
partnering with key universities such as Simon Fraser University in order to
benefit from the results of their on-going research. This alliance includes
parallelizing existing and next-generation data mining algorithms and
techniques.

'Tandem is making a major investment in data mining and in driving its
widespread deployment as a business tool,' said Bill Heil, senior vice
president and general manager of Tandem's ServerWare business unit. 'By
focusing on the Tandem ServerWare database engine and partnering with
best-of-breed solutions providers and researchers, we are able to supply
customers with the industry's most advanced and comprehensive range of data
mining solutions. What we are offering is an extensible approach designed to
keep customers at the forefront of the latest developments in knowledge
discovery.'

Availability

Tandem's Object Relational Data Mining solutions will be available starting
in the third quarter of 1997. With these solutions, customers will be able
to take advantage of the industry's most scalable performance for mining
databases residing on either Microsoft Windows NT Server based platforms
(including Tandem's recently introduced S-series servers based on Windows NT
Server) or on Tandem's massively scalable NonStop Himalaya servers.

About Tandem

Founded in 1974, Tandem Computers Incorporated designs and delivers
technology solutions that companies rely on to compete in a business world
that runs 24 hours a day. A US$1.9 billion company headquartered in
Cupertino, California, Tandem has offices, strategic partners, and providers
in more than 50 countries around the world.


Tandem, Himalaya, NonStop, Object Relational Data Mining, ServerWare, and
the Tandem logo are trademarks or registered trademarks of Tandem Computers
Incorporated in the United States and/or other countries. Microsoft and
Windows NT are either trademarks or registered trademarks of Microsoft
Corporation in the United States and other countries. All other brand or
product names are trademarks or registered trademarks of their respective
companies.



Tandem Introduces Object Relational Data Mining Solutions and Services for
Vertical Markets

Business-driven offerings target card marketing, micromerchandising,
claims analysis, and other key applications

Cupertino, CA, February 11, 1997. Applying its vertical market expertise and
new Object Relational Data Mining architecture to real-world business
problems, Tandem Computers Incorporated today launched a series of Object
Relational Data Mining solutions packages for card marketing,
micromerchandising, and insurance claims analysis. Tandem also announced new
consulting services designed to allow companies to quickly enjoy low-risk,
discovery-driven decision making.

The solutions and services are based on Tandem's revolutionary new Object
Relational Data Mining architecture. This enables customers to efficiently
mine their entire database, not merely samples, for useful patterns and
trends. The result is a more effective realization of the full business
value of data. 'Object Relational Data Mining solutions add significant new
functionality to customer segmentation and predictive modeling techniques,'
said Jonathan Kalman, managing director of MRJ Technology Solutions, a
leading specialty systems integrator. Tandem is taking a profoundly
different approach by integrating its powerful database, capable of handling
an entire organization's data, with leading data mining tools.

Delivering full value of business data

The new solutions packages will comprise the cross-platform Tandem
ServerWare database, appropriate integrated data mining and other analysis
tools from leading solutions partners, Tandem S-series massively scalable
Himalaya and/or Microsoft Windows NT Server based hardware platforms,
application and reporting templates, data
models, and Directional Consulting services. Though specially tested and
packaged, the solutions are all easily customizable. Initial solutions include:

Card Marketing

Aimed at card acquirers and issuers, this solutions package applies Object
Relational Data Mining architecture and other decision support technology to
improve the effectiveness of cardholder retention and acquisition efforts.
This provides a better understanding of when certain customers are likely to
leave and why, leading to more effective customer segmentation, increased
response rates to marketing promotions, and improved margins through
targeted product development and pricing.

Micromerchandising

This package enables retailers to mine immense volumes of detailed
merchandising data, resulting in improved in-stock positions, reduced
markdowns by better understanding buying patterns and trends, enhanced
promotional effectiveness, and improved store profitability through more
precise forecasting.

Claims Analysis

Aimed at insurance providers looking to contain underwriting costs and
improve loss ratios, this package uses Object Relational Data Mining
technology to support new product development, fraud profiling and
detection, better service provider alliances, and more exact underwriting
experience comparisons.

Immediate customer reaction to these benefits is positive. Said Juan
Verastigui, director of Claims System Development at USAA, a leading
insurance company, 'Tandem's Object Relational Data Mining architecture and
the way it leverages the parallel ServerWare database will provide USAA with
the ability to derive full value from all our claims data, and not just
subsets. The resulting faster and more complete answers to our business
queries will have a very positive effect on our bottom line.'

Looking ahead, Object Relational Data Mining architecture will enable the
mining of complex data types that include voice, video and images.
Said MRJ's Jonathan Kalman, 'Object Relational Data Mining solutions provide
immediate value with traditional data types, and extensibility to meet
future multimedia analysis needs.'

Directional Consulting, new Object Relational Data Mining services

Tandem's Directional Consulting services are an integral part of the new
solutions packages and are also available separately. These services define
a low-risk, high-return methodology proven over many Tandem based data
warehousing implementations for exploring and understanding how data mining
can support particular business initiatives.

Directional Consulting services use a phased approach to get data mining
production environments up and running within 90 days. The process begins
with establishing priorities for implementation of Object Relational Data
Mining and proceeds to a proof of concept phase to verify that the
selected data mining solutions will meet expectations.
System design, data modeling, and implementation then follow, culminating
with the establishment of a robust, scalable operational environment that
supports application evolution and growth.

Availability

Tandem Card Marketing, Micromerchandising, and Claims Analysis solutions
will be available beginning in the first quarter of 1997. These will be
enhanced to take advantage of Object Relational Data Mining technology in
the third quarter of 1997.


>~~~Siftware:~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Date: Wed, 5 Mar 1997 23:31:07 -0500 (EST)
From: quinlan@linux42.dn.net (Ross Quinlan)
Subject: Successor to C4.5

I have developed a new inductive program called C5.0. Its main advantages are:

* new, faster methods for generating rules
* support for boosting
* optional non-uniform misclassification costs

Further information and free demonstration versions are available from

http://www.rulequest.com

Ross Quinlan


>~~~Positions:~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Date: Fri, 28 Feb 1997 15:33:40 -0800
From: norvig@junglee.com (Peter Norvig)
Organization: Junglee Corp.
To: gps, connectionists@cs.cmu.edu, ai+lisp-jobs@cs.cmu.edu,
ml@ics.uci.edu, ai-stats@watstat.uwaterloo.ca, uai@ghost.CS.ORST.EDU
Subject: Job offered in information extraction and learning, data mining

Junglee is looking for full-time employees and summer interns to work on
information discovery and data mining from text documents. We're
looking for creative hard-working people with experience in some of the
following: agents, databases, information extraction, parsing, regular
expressions, language design, statistics, machine learning, and GUI
design.

Junglee develops Internet and Intranet information technology for the
future and pushes it to market today. Technology that raises eyebrows
and drops barriers. Founded in 1996 by four PhD students from the
Stanford University Computer Science Department and a Silicon Valley
veteran, Junglee Corporation has excellent funding, high-profile
customers, and a strong revenue plan.

Our Virtual DataBase (VDB) engine is fueled by our ability for data
source description, extraction, and attribute mapping. Imagine
capturing data from hundreds of disparate unstructured web sites,
mixing that with data from other heterogeneous, distributed database
and non-database sources and turning it all into a relational aggregate
with the power of full SQL queries and the ease and portability of
HTML user interfaces. We call these applications PALs - powerful
information sites where people can ask for and get an answer.
Several of our PALs are up on the web today at www.junglee.com and
www.washingtonpost.com; we are currently building more of them for
some well-known companies.

One of the key aspects of the technology is discovering/mining
information from text. The project is led by Peter Norvig, who has done
extensive work on Natural Language Processing, Machine Learning, and
other Artificial Intelligence problems. While this project involves
significant ground-breaking research, it is definitely a development
project, not just research.

Please send responses to jobs@junglee.com or by fax to 408-522-9470
and mention this posting.


--
Peter Norvig, norvig@junglee.com
Junglee Corporation
1250 Oakmead Parkway, Suite 310
Sunnyvale CA 94086
phone: 408-522-9482
fax: 408-522-9470
http://www.junglee.com
http://www.norvig.com



>~~~Positions:~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
From: 'Max Bramer' (bramerma@sis.port.ac.uk)
Organization: University of Portsmouth
To: ai-sges@mailbase.ac.uk, jimc@cogs.susx.ac.uk, ai-cbr@mailbase.ac.uk,
cphc-jobs@ukc.ac.uk, kaw@swi.psy.uva.nl, kdd@gte.com
Date: Sat, 1 Mar 1997 17:05:45 +0000
Subject: Research Fellowship in Knowledge Discovery
Reply-to: bramerma@sis.port.ac.uk

UNIVERSITY OF PORTSMOUTH

DEPARTMENT OF INFORMATION SCIENCE

RESEARCH FELLOWSHIP IN KNOWLEDGE DISCOVERY

Salary: stlg17,472 - stlg20,381 (Pay award pending)

Closing Date: 21 March, 1997
(Note: This is an extension to the previously announced closing date.)

Reference: RTEC 0149 (G)

Applications are invited for a two-year Research Fellowship in the
Department of Information Science to commence as soon as possible.

The successful candidate will work closely with Professor Max Bramer (Head
of the Department of Information Science) to develop research in the area of
Knowledge Discovery and Data Mining. The Department currently has projects
in the sub-areas of automatic induction of classification rules from
examples, Case Based Reasoning, Neural Networks, Genetic Algorithms and
related statistical techniques.

Applicants should have a good honours degree in Computer Science or related
subject. Preference will be given to candidates who have (or expect soon to
receive) a higher degree in a relevant discipline.
Relevant commercial experience would also be an advantage.

Informal enquiries may be made to Professor Bramer, either by telephone
(01705) 844444 or by electronic mail (bramerma@sis.port.ac.uk), or to Simon
Thompson on (01705) 844097 (thompsonsg@sis.port.ac.uk). Further information
about the department is also available from the World Wide Web at
http://www.sis.port.ac.uk.

Further particulars are available from:

Personnel Office
University House
Winston Churchill Avenue
Portsmouth PO1 2UP
England

Telephone: (01705) 843421 (24 hour answerphone)
E-mail: jobs@pers.port.ac.uk
http://www.port.ac.uk/

IMPORTANT NOTE: All applications should be sent (preferably on paper not by
email) to the Personnel Office NOT to the Department of Information Science.
_______________________________________________________

Professor Max Bramer
Department of Information Science
University of Portsmouth
Milton, Southsea PO4 8JF, England
Tel: +44-(0)1705-844444
Fax: +44-(0)1705-844006
email: bramerma@sis.port.ac.uk


>~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
From: hui@dcs.bbk.ac.uk (Xiaohui Liu)
Date: Tue, 4 Mar 97 12:17:57 GMT
To: gps
Subject: Re: EPSRC CASE Research Studentship in Intelligent Data Analysis

BIRKBECK COLLEGE
DEPARTMENT OF COMPUTER SCIENCE
UNIVERSITY OF LONDON


EPSRC CASE Research Studentship in Intelligent Data Analysis


Applications are invited for an EPSRC CASE PhD studentship, within the
Intelligent Data Analysis (IDA) Group, at the Department of Computer
Science, Birkbeck College. The three-year studentship is for the
investigation of intelligent data analysis techniques for research
problems in process industries, funded by Honeywell Hi-Spec
Solutions, UK and Honeywell Technology Center, USA. The successful
candidate will have a tax-free salary of at least 10,000 pounds (with
additions for experience, age and dependants), and will be expected to
work on a joint research project between Birkbeck and Honeywell on 'Causal
Modeling for Time Series Data'.

The IDA Group at Birkbeck conducts research into the application of
computationally intelligent techniques to data analysis problems.
The group has enjoyed successful collaboration with several external
organisations in industry and medicine on a variety of IDA research
projects, funded by government agencies, industrial sponsorships and
charity organisations. The group is to host the second IDA conference in
London this August.

Applicants should have at least a 2(i) in Computer Science or related
subject, with a good background in Artificial Intelligence or Statistics,
or a 2(i) in Chemical Engineering with strong computing background.
Please submit a CV as soon as possible, but not later than 31 March 1997,
to Dr X Liu, Department of Computer Science, Birkbeck College, Malet
Street, London WC1E 7HX, UK. Phone Dr Liu on 0171-631 6711 or email him
(hui@dcs.bbk.ac.uk) if you wish to make an informal enquiry.

Information regarding this project and research activities of the IDA
Group at Birkbeck can be accessed on the World Wide Web via URL:

http://web.dcs.bbk.ac.uk/~hui/IDA/home.html



>~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
From: Derek Sleeman (sleeman@csd.abdn.ac.uk)
Date: Sun, 2 Mar 1997 15:01:52 GMT
To: ai-cbr@mailbase.ac.uk, ai-sges@mailbase.ac.uk, cphc-jobs@ukc.ac.uk,
jimc@cogs.susx.ac.uk, kaw@swi.psy.uva.nl, kdd@gte.com, ml@ics.uci.edu
Cc: sleeman@csd.abdn.ac.uk
Subject: CHAIR VACANCY (for Posting)

Announcement of Post (Closing date: early MARCH)

University of Aberdeen

Chair of Computing Science


Applications are invited for the post of Professor of Computing
Science. The new Professor will play a key role in strengthening
the teaching and research activities of the Department of
Computing Science. The new Professor will provide academic
leadership in the development of the Department's existing areas
of interest, Artificial Intelligence and Databases. Candidates
should have an international reputation with an excellent record
of innovative research as measured by publications and grant
income. Applications from academics, research managers and others
from Industry and public sector Institutions will be considered.
Further, as the University of Aberdeen has recently made a major
research investment in the Institute of Medical Sciences, it would
be an advantage if the person had experience of working with
Medical/Healthcare professionals. The person appointed will be
expected to acquire a significant role in the management of the
Department.

Informal enquiries may be directed to Professor A R Forrester,
Vice-Principal and Dean of the Faculty of Science and Engineering:

Email: a.r.forrester@admin.abdn.ac.uk
Tel: +44 (0)1224 272081
Fax: +44 (0)1224 272082

More details of the Department's research activities can be found
on our research pages at http://www.csd.abdn.ac.uk/research/index.html
or contact Professor Derek Sleeman, Head of Department:

Email: dsleeman@csd.abdn.ac.uk
Tel: +44 (0)1224 272295/6
Fax: +44 (0)1224 273422

For further particulars of this post, see:

http://www.csd.abdn.ac.uk/people/chair_fp.html


>~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~