Knowledge Discovery Nuggets Index
Knowledge Discovery Nuggets 97:22, e-mailed 97-07-22
News:
* Dorothy Firsching, New Datamining discussion list
* Andy Pryke, 'Savagely networked bad decisions' is an anagram for ..
Publications:
* GPS, July '97 Datamation Article on Data Mining,
  http://www.datamation.com/PlugIn/issues/1997/july/07mine.html
* Michael Beddows, Infoworld 97-07-07, Data Mining articles
  http://www.infoworld.com/cgi-bin/displayStory.pl?/features/970707mining.htm
* Gerhard Widmer, MLJ Spec Issue on Context Sensitivity,
  Deadline extended to September 20, 1997.
  http://www.ai.univie.ac.at/mlj_specissue/
Siftware:
* Mike Bell, Q-Why, a rule finding data mining product
  http://www.qwhy.com/qwhy/
* Stuart Inglis, WEKA 2.2 Machine Learning workbench,
  http://www.cs.waikato.ac.nz/~ml
Positions:
* Donal Lyons, Ireland: Experienced researcher in Data Mining
* G. John, USA: IBM Data Mining Analyst and Data Engineer Positions
* Brij Masand, KDD Job at GTE Laboratories, Waltham, Ma
Meetings:
* Claire Nedellec, ECML'98, Chemnitz, Germany, April 21-24 1998
--
KD Nuggets is a newsletter for the
Data Mining and Knowledge Discovery community, focusing on the
latest research and applications.
Submissions are most welcome and should be emailed, with a
DESCRIPTIVE subject line (and a URL) to gps.
Please keep CFP and meetings announcements short and provide
a URL for details.
To subscribe, see http://www.kdnuggets.com/subscribe.html
KD Nuggets frequency is 3-4 times a month.
Back issues of KD Nuggets, a catalog of data mining tools
('Siftware'), pointers to Data Mining Companies, Relevant Websites,
Meetings, and more are available at the Knowledge Discovery Mine site
at http://www.kdnuggets.com/
-- Gregory Piatetsky-Shapiro (editor)
gps
********************* Official disclaimer ***************************
All opinions expressed herein are those of the contributors and not
necessarily of their respective employers (or of KD Nuggets)
*********************************************************************
~~~~~~~~~~~~ Quotable Quote ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Anagrams of 'knowledge discovery in databases' :
Evoke badly answered diagnostics
Savagely networked bad decisions.
thanks (??) to Andy Pryke
Date: Tue, 10 Jul 1997
From: Dorothy Firsching (firschng@nautilus-systems.com)
Subject: New Data Mining / Knowledge Discovery / Data Warehousing List
A new discussion list, datamine-l, has been set up to provide a
world-wide e-mail forum for discussing practical applications of
data mining, data warehousing, and knowledge discovery. Topics
can range from specific questions on tools to broad discussion
of issues.
This is an independent mailing list operated and maintained by
Nautilus Systems, and is not officially affiliated with any tool
vendor. We hope to provide an open and unbiased forum.
To subscribe, simply send to:
datamine-l-request@nautilus-sys.com
the text:
subscribe datamine-l
in the body of your message (leave the subject blank).
To post to the list, send your messages to:
datamine-l@nautilus-sys.com
We hope that you enjoy and profit from this open exchange of
techniques and ideas!
Sincerely,
Nautilus Systems, Inc.
firschng@nautilus-systems.com
http://www.nautilus-systems.com/
Dorothy Firsching
CEO
Nautilus Systems, Inc.
3867 Alder Woods Court
Fairfax, VA 22033
http://www.nautilus-systems.com/
firschng@nautilus-systems.com
From: 'A.N.Pryke@cs.bham.ac.uk' (A.N.Pryke@cs.bham.ac.uk)
Subject: KDD Anagrams
Date: Fri, 11 Jul 1997 14:07:26 -0400
Gregory,
thought you might find the following interesting (or dire!)
Anagrams of 'knowledge discovery in databases' are (lines with '*'):
Data mining, the perfect solution?
*Glory be! Advanced idiot's weakness.
*Evoke badly answered diagnostics.
Credit checks...
*Receives OK. Wasn't badly diagnosed.
Maybe we should reconsider that database interface...
*Nasty-looking, bad views decreased.
*Second-rate view badly assigned OK.
Application to text retrieval:
*Keywords distance as enviable god.
Fraud detection:
*Wait! Badly envisaged crookedness.
Techniques are improving:
*Bravo! Steadily decoding weakness.
Computers tell us what to do...
*So dawned evangelistic keyboards.
Visualisation helps decision making?
*Okay, let's! Bad drawings evidence so.
Soon your computer will know what you're thinking...
*Envisage its scaled-down keyboard.
Privacy:
*Knew inadvisable secrets. Good day!
Problems net searching for 'data mining'?
*Inadvisable, as congested keyword.
Misc:
*Badly coordinated gives weakness.
*Okay! wild observances designated.
*A keyboards advising now selected.
Weird:
*Naked good-bye invalidates screws.
And finally...
*Savagely networked bad decisions.
Anagrams courtesy of 'Anagram Genius' http://www.genius2000.com/anagram.html
Andy
--
Andy Pryke, Research Student, Computer Science, Birmingham University
Data Mining Information - http://www.cs.bham.ac.uk/~anp/TheDataMine.html
Date: Tue, 22 Jul 1997 09:44:29 -0500 (EST)
From: GPS (gps)
Subject: Datamation July 1997 on Data Mining
July 1997 Datamation has an article on data mining,
'Datamining unearths dollars from data', by Eva Freeman.
The article describes an application of data mining at
Milwaukee-based Firstar bank, and profiles 5 data mining
tools: TMC Darwin, DataMind Professional Edition, IBM Intelligent Miner,
Angoss KnowledgeSEEKER, and HNC Marksman.
See full article at
http://www.datamation.com/PlugIn/issues/1997/july/07mine.html
Here is the main part of the article, without the tool reviews.
----
Have you learned anything new from your
data lately? Datamining will help you find
subtle, unexpected patterns hidden in your
database, which could lead to increased sales
and healthier profits.
By Eva Freeman
It was obvious to everyone in the marketing department
at Firstar Bank that bombarding customers with
advertisements was a waste of resources. But how could
the $20 billion bank holding company, based in
Milwaukee, know when a customer was ready to look at a new product?
Enter datamining.
Ted Bratanow, Firstar Bank's director of market
research and database marketing, found that the corporate database
held a tremendous amount of information about every customer.
The trick was to find patterns in the data that
would reveal why customers had moved into new products, and to exploit
those patterns through targeted direct mailings.
For the analyses, Firstar used the Marksman tool
from HNC Software of San Diego. 'Marksman can read 800 to 1,000
variables and can attach scores to each one,' says Bratanow. 'We
rank-ordered customers into different groups, according to whether
they had home-equity loans, charge cards, certificates of deposit or
other savings accounts, or investment products. Then we used the
datamining process to predict which products would be right to offer
to each customer at which time.'
The bottom line? 'Direct marketers are usually
pleased when they can increase response rates by a few percent,' says
Bratanow. 'Our response rate improved by a factor of four.'
Define the problem first
'We love to tell stories about correlations no one would have
expected,' says Tricia Beardslee, principal at Fairfax, Va.-based AMS
Center for Advanced Technologies and the lead analyst at the
datamining and modeling laboratory. 'Did you know that...chain saws
and beds sell well together in Minnesota in October? That sounds
pretty strange, until you think about all the people getting their
vacation homes ready for hunting season.'
But Beardslee cautions that datamining is not a
panacea. 'Success comes when the business
problem has been specified clearly. You can't
expect to find anything useful if you just throw a
data warehouse into a datamining tool.'
Other experts advise users to temper their
expectations. 'Even if you don't get a killer
insight, you can see a huge return on your
investment if you find something that will, for
example, increase the response to direct
mailings by 2 or 3%,' says Herb Edelstein,
president of Two Crows, a datamining
consulting firm based in Potomac, Md.
Use the right tool
Once the value of datamining for a business
problem has been identified, IT execs must
determine the best approach. Aaron Zornes,
executive VP for application delivery strategies
with the Burlingame, Calif., office of the META
Group, divides datamining tools into
micromining and macromining.
Micromining tools are inexpensive, have short learning curves, and
usually run on PCs. A good example of a micromining product would be
Angoss Software's KnowledgeSEEKER. These tools are not universally
useful, however, because they offer only a single algorithm, and that
algorithm may not work equally well in different business
applications.
Zornes contrasts the less-expensive datamining tools with macromining
products like IBM's Intelligent Miner. Macromining tools can operate
in massively parallel architectures as well as in other types of
servers. They offer a full suite of algorithms: statistical, decision
tree, and neural network. But these tools cost more and, on top of
that, you'll need outside help.
Before embarking on a datamining project, you should make sure your
data's worth mining in the first place, warns Zornes. 'Just remember
that the cost of datamining tools may not be all that large a factor
in a datamining activity,' he says. 'About 60 to 80% of the investment
usually is in data preparation.'
Data preparation is, in fact, the key to success in datamining. Without clean data and good models, all you'll have is garbage in, garbage out--even if the results are calculated to three decimal places.
Eva Freeman is a freelance high-technology writer based in Bellevue, Wash. She can be reached at freeman@real.com.
Datamation also has a special section on Data Mining
at http://www.datamation.com/PlugIn/workbench/datamine/datamine.htm
From: 'Michael R. Beddows' (mbeddows@kstream.com)
Date: Fri, 11 Jul 1997 13:55:33 -0400
Subject: Infoworld 97-07-07 data mining articles
Infoworld has several data mining related articles:
* Users find tangible rewards digging into data mines
* Data mining defined
* Know your customers
* U.S. Department of Energy finds clues to terrorist activities
at http://www.infoworld.com/cgi-bin/displayStory.pl?/features/970707mining.htm
Here is the first one.
Users find tangible rewards digging into data mines
Although the term is often confused with OLAP, data mining proves
its worth in retail and banking
By Steve Alexander
The truth about data mining is as elusive
as the nuggets of information it is designed to find.
Data mining is called the next step beyond online analytical processing
(OLAP) for querying data warehouses. Rather than seek out known
relationships -- such as a list of all catalog customers who recently
moved -- it sifts through data for unknown relationships, such as a
previously unsuspected link between gourmet food purchases and
motorcycle ownership.
'If you say, `How many widgets did we sell in the spring of 1996 in
sales region A vs. sales region B?' that's OLAP,' says Mark Brown,
program manager for data mining at SAS Institute, in Cary, N.C. 'If you
say, `What are the drivers that caused people to buy these widgets from
my catalog?' that's data mining.'
But is data mining still unproven and overhyped? Or is it more
successful than is generally acknowledged and simply kept quiet because
it provides users with a competitive advantage?
Aaron Zornes, executive vice president of the Meta Group, in Burlingame,
Calif., says it's a little of both. The value of data mining is proven,
but it remains difficult to use.
'Data mining is a deep, dark, secret weapon within corporations that is
providing such a competitive advantage they don't want the world to know
what they're doing. But the tools are not easy enough for most
corporations to use,' Zornes says.
MICRO MINES. There is some evidence that Zornes is right. In
most corporations, traditional, server-based data mining remains mainly
in the hands of IS professionals. Client-based data mining that
analyzes a subset of the data-warehouse contents and touts ease of use
is new, and its effectiveness has yet to be determined. Among the major
data-mining players are IBM, Thinking Machines, DataMind, Pilot
Software, Business Objects, SAS Institute, Angoss International,
NeoVista Solutions, Magnify, and Cognos.
Zornes says that high-end, server data-mining software licenses
typically cost $150,000 to $200,000, while desktop or 'micromining'
software licenses typically cost $500 to $50,000.
Although these high fees may put you off, some vendors and users say
data mining offers a clear return on investment and a clear competitive
weapon.
'Data mining is quickly becoming a necessity, and those who do not do it
will soon be left in the dust,' Brown says. 'Data mining is one of the
few software activities with measurable return on investment associated
with it. The banking and catalog industries are making lots of [returns
on investment] today.'
The SAS server-based Enterprise Miner license fee costs $45,000 or
more.
Brant Davison, product manager of business-intelligence software
solutions for IBM's Software Group, in Somers, N.Y., says companies can
most easily use data mining if they've invested in a data warehouse,
although they can get along without one if they're willing to assemble
the data from various database sources. IBM's Intelligent Miner can
deal with data warehouses containing hundreds of gigabytes or terabytes
of information.
IBM entered high-end, server-based data mining a year ago, but its
efforts in low-end desktop data mining -- where it deals with mining PC
spreadsheets containing as much as hundreds of megabytes of data --
remain a research venture. IBM's Intelligent Miner licenses range in
price from $25,000 to $150,000 for systems from Risc/6000 Unix machines
to System/390 mainframes.
'I think data mining is taking off, although it's not taking off in all
segments of the marketplace,' Davison says. 'Where a lot of customers
are finding value is in developing models for customer buying behavior.
In particular, we are seeing initial acceptance of data mining in
insurance, finance, retail, and telephone companies. Those industries
have a lot of customers, products, and transactions, and they need a
system to help them understand the value within that information.'
BEHAVIOR. Banks, for example, may use data mining to identify
their most profitable credit-card customers or their highest-risk loan
applicants, Davison says. They also may seek to prevent fraud by using
a data-mining technique called 'deviation detection': Rather than
finding relationships between different groups of data records, it finds
events that are outside the norm that could be a sign of fraudulent
activities.
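As a rough illustration of deviation detection (not taken from the
article or from any vendor's product), the Python sketch below flags
transaction amounts that sit far from a customer's typical spending,
using a modified z-score built on the median. The function name, the
sample amounts, and the 3.5 cutoff are all assumptions chosen for the
example.

```python
from statistics import median


def flag_deviations(amounts, threshold=3.5):
    """Return indices of amounts whose modified z-score exceeds threshold."""
    med = median(amounts)
    # Median absolute deviation: a spread estimate that outliers barely affect.
    mad = median(abs(a - med) for a in amounts)
    if mad == 0:
        return []  # no spread to measure deviations against
    # 0.6745 rescales the MAD so the score is comparable to a standard z-score.
    return [
        i for i, a in enumerate(amounts)
        if 0.6745 * abs(a - med) / mad > threshold
    ]


# Hypothetical card transactions (in dollars) for a single customer.
history = [42.0, 38.5, 51.0, 45.25, 40.0, 47.5, 39.0, 980.0, 44.0]
suspicious = flag_deviations(history)
print(suspicious)  # indices of the transactions that look out of the norm
```

A real fraud system would look at many more variables than the amount
alone; the point here is only that the method searches for records that
depart from the norm rather than for relationships between groups.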
Brown says companies that use data mining to study their customers
usually are focused on how to retain customers, separate profitable
customers from unprofitable ones, uncover fraud, sell existing customers
new products, and understand why some customers leave.
Davison sees data mining and OLAP working together.
'Using data mining, you may come up with a model to find who are the
most profitable customers. Then you may do more traditional OLAP
analysis of that subset of data to see what the impact would be if you
lost those customers, how it would affect your bottom line,' Davison
says.
But how accurate is data mining? When it comes to analyzing data, it's
pretty accurate, says Alex Moissis, director of product marketing for
Business Objects, which has headquarters in Paris and in San Jose,
Calif.
'But, to the extent to which the tool makes a prediction based on this
data set, you start getting into the issue of accuracy,' Moissis
says.
In other words, there is no guarantee that all gourmet food eaters will
want motorcycles just because the ones in your data set did; 'garbage
in, garbage out' applies to data mining, as well.
Moissis says desktop data mining on a PC strikes at the three major
obstacles to wider adoption of mining technology: the customer
perception that it is too difficult, too expensive, and conceptually
difficult to understand (and therefore difficult for management decision
makers to believe in).
'The audience for desktop data mining is the mainstream business user,'
Moissis says.
For example, although an automobile manufacturer probably has a large,
server-based data-mining application for corporate use, its individual
auto dealers may need desktop data mining to look at buying patterns for
just their own customers. As a result, desktop data mining will
complement rather than replace server-based data mining and offer far
lower prices to spur adoption.
Business Objects' Business Miner desktop data-mining software,
introduced earlier this year, costs $995 as a stand-alone application on
a Windows-based PC.
REALLY CLEAN DATA. In general, data mining involves refining
data so that it uses the same variables, then searching for patterns in
the data using statistical software models. Users report that preparing
data for mining is frequently 80 percent of the work. Among the
data-mining techniques are 'neural networks' (programs that mimic the
brain's ability to learn from its mistakes), 'time-series analysis'
(year-to-year comparisons), and 'tree-based models' (branching systems
that show relationships in the form of a hierarchy, such as an
organizational chart).
Although banks, financial services, and direct-marketing companies have
been doing something similar to data mining for the past 15 to 20 years,
many have relied on data-service companies to provide them with
predictive statistical models, Brown says. Now new data-mining software
allows those corporations to do the work in-house while coupling
traditional statistical techniques with software-industry technologies,
such as neural networks and decision trees.
Early users of server-based data mining report that the somewhat
esoteric technology is bringing back some tangible results.
At Pittsburgh's Mellon Bank, IBM's Intelligent Miner is being used with
S/390 and RS/6000 servers and DB2 databases to study as much as 10GB of
information on consumer bank customers, with an eye to retaining the
most profitable ones. Based on historical data, the bank tries to
predict which customers will be profitable in the future. The bank also
uses data mining to project which customers are likely to switch from a
Mellon credit card to another bank's card based on historical patterns
of use.
Data mining showed that the best predictive factors for Mellon
credit-card customer attrition are the frequency of card use and the
types of purchases that are or aren't made, says Pete Johnson, vice
president of Mellon's advanced technology group. The bank tracks broad
categories of credit-card purchases, such as whether they were made at a
grocery store, a department store, or a gas station. Johnson declines
to name the purchase categories that data mining showed were leading
indicators of customer attrition because he considers it valuable
competitive information.
'Data mining helps us play in the national credit-card market more
effectively. That's what all the national banks are trying to do with
their data-mining efforts,' Johnson says. 'Small credit-card companies
that are unable to embrace this technology can no longer survive because
they can't manage their customer bases.'
Although banks such as Mellon have been using statistical models for
many years, 'current data-mining software is more scalable and can
analyze bigger quantities of data. And the engineering of the software
is such that we don't have to write lines and lines of code to do it,'
Johnson says.
ROCKET SCIENCE. But data mining remains complex. Although
Mellon's goal is to have nontechnical business analysts use the IBM
data-mining software, today it is used only by IT people and
sophisticated business analysts.
Fingerhut Companies, a Minneapolis-based direct-mail catalog company
using SAS' server-based Enterprise Miner on IBM mainframes and RS/6000
Unix machines, is sifting through a database of 10 million to 12 million
current customers to find which are most likely to buy products from one
of the company's many catalogs.
Fingerhut, which has 9,500 employees and mails 130 different catalogs
each year, is among the true believers in data mining: All catalog
mailings, credit-granting decisions, and inventory-stocking decisions
are based on it, says Andy Johnson, Fingerhut's senior vice president of
marketing.
Fingerhut wants to find out which customers it could profitably mail
catalogs to. It recently used data mining to study past purchases of
customers who had changed residences to see if they had preferences.
Data mining showed those customers were three times more likely to buy
items such as tables, fax machines, phones, and decorative products, but
that they were not more likely to purchase high-end consumer
electronics, Johnson says. Fingerhut used that information to create a
special catalog that it mailed only to those customers who had recently
moved.
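A finding such as "three times more likely to buy" is a ratio of
purchase rates between two customer groups. The Python sketch below
shows that arithmetic; the function name and all the counts are
hypothetical and are not Fingerhut's figures.

```python
def purchase_lift(segment_buyers, segment_size, others_buyers, others_size):
    """Ratio of a segment's purchase rate to the rate among everyone else."""
    segment_rate = segment_buyers / segment_size
    others_rate = others_buyers / others_size
    return segment_rate / others_rate


# Hypothetical counts: 600 of 10,000 recent movers bought a furniture item,
# against 2,000 of 100,000 customers who had not moved.
lift = purchase_lift(600, 10_000, 2_000, 100_000)
print(round(lift, 2))  # a value near 3 means "about three times more likely"
```

A ratio well above 1 for a product category suggests it belongs in the
targeted catalog; a ratio near 1 (as the article reports for high-end
consumer electronics) suggests it does not.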
Johnson's only caveat: You need good data that has been properly
prepared in order to make money using data mining.
'People who can't see the value in data mining as a concept either don't
have the data or don't have data with integrity. We've spent a lot of
time, money, and energy getting those two things.'
Although Fingerhut has used statistical modeling for about 20 years, new
data-mining software allows the company to look at a broader range of
information and larger databases, says Bill Flach, Fingerhut's director
of marketing analysis and research. For example, before data mining,
Fingerhut's statistical analysis was limited to taking samples of 10
percent to 20 percent of its customers. With data mining, it can
examine 300 specific characteristics of each of the 10 million to 12
million customers in a much more focused way.
Flach believes new data-mining software will be easier to use by people
who are not IS employees or statisticians. But for now, data mining
remains in the hands of about 30 Fingerhut people trained in statistics
plus another 100 IS users.
But even data mining's present accomplishments are impressive, says
Mellon's Johnson. Data-mining results have caused the bank to rethink
its view of transaction data.
'Traditional information-management systems are not designed to collect
transactions as information assets. We didn't know until we did data
mining that transaction detail is very valuable,' Johnson says. 'I like
to describe data mining as the carrot that justifies the expensive stick
of building a data warehouse.'
Steve Alexander is a free-lance writer based in Edina, Minn.
Copyright © 1997 InfoWorld Publishing Company
From: Gerhard Widmer (gerhard@ai.univie.ac.at)
Subject: MLJ Special Issue: Deadline Extension
Date: Wed, 16 Jul 1997 06:02:09 -0400
This is to announce that for organizational reasons,
the deadline for submissions to the
SPECIAL ISSUE ON CONTEXT SENSITIVITY AND CONCEPT DRIFT
of the
MACHINE LEARNING JOURNAL
Miroslav Kubat and Gerhard Widmer, Guest Editors
has been extended to SEPTEMBER 20, 1997.
Full information is at http://www.ai.univie.ac.at/mlj_specissue/
Date: Mon, 23 Jun 1997 11:57:44 -0400
From: Mike Bell (mbell@qwhy.com)
Subject: New Siftware Entry (Q-Why)
*URL: http://www.qwhy.com/qwhy/
*Description: Q-Why is a rule finding data mining product for small to medium sized databases.
*Discovery tasks: Classification/Rule Discovery Approach
*Comments:
Q-Why uses a heuristic search to find possible explanations as to why one
set of records (e.g. customers who bought a particular product) are different
from other records in the database. This has a number of applications, in
market segmentation, survey analysis, direct marketing, etc. There is a
Q-Why Light freeware version for academic/personal use, and higher end versions
available for commercial use on larger databases.
*Platform(s): Windows (95, NT)
*Contact:
Les Horn, President
Quintillion Corporation
address: 380 Pinhey Point Road, Dunrobin, Ontario K0A 1T0, Canada
phone +1 (613) 832-4894
fax +1 (613) 832-0547
email les@qwhy.com
*Status: Commercial Product + Freeware version for academic/personal use
*Source of information: vendor
*Updated: 1997-06-23 by Mike Bell, (mbell@qwhy.com)
Mike Bell (mbell@qwhy.com)
R&D Manager
Quintillion Corporation
From: Stuart Inglis (singlis@lucy.cs.waikato.ac.nz)
Subject: WEKA 2.2
Date: Wed, 16 Jul 1997 21:58:25 -0400
The WEKA software system for machine learning has
been updated.
It includes M5' and K*, two of the best tools available.
cheers
--
Stuart Inglis,
Department of Computer Science
University of Waikato, Hamilton, New Zealand
===========================================================================
The WEKA Machine Learning workbench
-----------------------------------
WEKA 2.2 is now available from http://www.cs.waikato.ac.nz/~ml
for downloading and experimentation. WEKA is a software workbench
for applying machine learning techniques to practical problems.
It integrates many different machine learning tools within a
common framework and a uniform user interface. It runs on a
Unix/X system using Tcl8.0/Tk8.0.
WEKA 2.2 includes:
Uniform user interface
Tutorial
1R and T2 programs for simple rules
Induct program for more complex rules
IB1-4, PEBLS and K* programs for instance-based learning
M5' program for regression model trees
FOIL program for relational rules
Sample data sets in WEKA format
Programs for processing WEKA data files
Rule evaluator.
Makefile based experiment editor for experiments
(includes t-tests and other stats)
It can be extended by adding modules (which need additional
software and/or licences) for:
C5.0
Dotty tree visualization
XGOBI data visualization
Autoclass and classweb clustering programs
More comprehensive rule evaluator
(using Eclipse Prolog)
More details are at http://www.cs.waikato.ac.nz/~ml
--------------------
* WEKA stands for Waikato Environment for Knowledge Analysis. Found only
on the islands of New Zealand, the weka is a flightless bird with an
inquisitive nature.
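For readers unfamiliar with the 1R program listed among the simple rule
learners above, here is a simplified Python sketch of the idea behind
it: build a one-attribute rule for each attribute and keep the one that
makes the fewest errors on the training data. This is not WEKA's code;
it assumes nominal attributes with no missing values, and the function
name and toy data set are invented for the example.

```python
from collections import Counter, defaultdict


def one_r(rows, target):
    """Pick the single attribute whose value -> majority-class rule errs least."""
    best = None
    attributes = [a for a in rows[0] if a != target]
    for attr in attributes:
        # Count how often each class occurs for each value of this attribute.
        counts = defaultdict(Counter)
        for row in rows:
            counts[row[attr]][row[target]] += 1
        # The rule predicts the most common class for each attribute value.
        rule = {value: c.most_common(1)[0][0] for value, c in counts.items()}
        errors = sum(row[target] != rule[row[attr]] for row in rows)
        if best is None or errors < best[2]:
            best = (attr, rule, errors)
    return best


# Hypothetical toy data, not one of the sample data sets shipped with WEKA.
data = [
    {"outlook": "sunny", "windy": "no", "play": "no"},
    {"outlook": "sunny", "windy": "yes", "play": "no"},
    {"outlook": "overcast", "windy": "no", "play": "yes"},
    {"outlook": "rainy", "windy": "no", "play": "yes"},
    {"outlook": "rainy", "windy": "yes", "play": "no"},
    {"outlook": "overcast", "windy": "yes", "play": "yes"},
]
attribute, rule, errors = one_r(data, "play")
print(attribute, errors)
```

On this toy data the rule built on "outlook" misclassifies one row,
while the rule built on "windy" misclassifies two, so "outlook" is kept.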
From: Donal Lyons (dlyons@stats.tcd.ie)
Subject: Experienced researcher in Data Mining
Date: Wed, 16 Jul 1997 11:55:22 -0400
The School of Systems and Data Studies, Trinity College, Dublin, Ireland is
interested in discussing a visiting researcher position with an experienced
researcher in the Data Mining field. The intention is to apply for EU
funding for this position, as outlined below.
If any EU (or Associated State) researcher is interested in exploring this,
please contact me.
============================================================================
EU funding is available to enable experienced researchers to come to less
favoured regions such as Ireland for one year - salary and mobility costs
are provided by the EU - there is a high success rate.
One of the priorities of the fourth framework is the training and mobility
of researchers - through the training and mobility of researchers programme
(TMR), also called Marie Curie Fellowships. Several categories are
open: pre-doctoral, post-doctoral, return grant and experienced
researcher.
The category of experienced researcher is very under utilised and good
proposals will be funded. Engineering and Mathematics are underrepresented.
A proposal for the School of Systems and Data Studies:
The School has an interest in forming an ongoing Data Mining Interest
Group. This would consist of a number of staff whose research interests
lie in this area and some graduate students. To set up this group, an
experienced researcher with expertise in the area of Data Mining would be
of great value. The specific tasks most needed are:
1) Setting up the Data Mining Interest Group;
2) Technology transfer to staff, post-graduate and undergraduate
students via Seminars and Courses;
3) Development of course material for subsequent use;
4) Data Mining consultancy with Irish companies;
5) Co-authorship of research papers;
6) Development of algorithms for use on parallel processing
Supercomputer.
The exact mix of these and other tasks could be varied depending on the
interests and experience of the visitor.
What is an Experienced Researcher?
The experienced researcher category is reserved for scientists who wish to
join a research team in a less-favoured region of a country other than
their own. These researchers, who will have at least 8 years of
full-time research experience at postgraduate level, will, in the capacity
of 'visiting professors', impart their knowledge and research experience.
The researcher must be a citizen of an EU Member (or Associated) State.
Funding available:
Negotiable with the commission - will cover salary and mobility costs for
the experienced researcher for a year (based on what the researcher was
earning in his home country - it is normally generous).
Deadline:
15-12-1997 for a decision date of 15-05-1998.
Donal Lyons, Phone (1000-1700 GMT) +353 1 608 1919
Lecturer (Information Systems) Phone Messages +353 1 608 1767
School of Systems & Data Studies
Trinity College, Dublin 2, FAX on request
Ireland.
http://www2.tcd.ie/Statistics/staff/dlyons.html
From: gjohn@almaden.ibm.com
Subject: IBM Data Mining Jobs
Date: Wed, 16 Jul 1997 01:39:22 -0400
IBM DATA MINING ANALYST and DATA ENGINEER POSITIONS
IBM's data mining group is growing again! We need 12 more analysts
and data engineers for our highly successful data mining group.
Join our team in an exciting multi-faceted career in data mining!
SENIOR DATA MINING ANALYST
Duties:
Interact directly with customers, help them understand data mining,
define a project, analyze their data, and present results; teach data
mining classes and develop course materials; travel (eg, Maui, Paris,
Tokyo, Sydney, Rio de Janeiro,... and maybe a few less exciting
places); interact with researchers and product developers, discuss
ideas for new data mining algorithms, new business applications, new
product features, and new visualizations; assist sales reps in
customer visits; assist marketers in developing demos and brochures.
The ideal candidate has:
Excellent understanding of the data analysis process; experience in
using data mining to solve problems; PhD in statistics, machine
learning, neural nets, pattern recognition, or related, or MS/MBA with
several years' experience; excellent communication and presentation
skills; Unix & PC skills.
DATA ENGINEER
Duties:
Assist senior analysts on data mining projects: data extraction,
loading, cleaning, transformation; work with databases, create tables,
extract data from tables; work with statistical tools, transform
variables, reduce variables, calculate summary statistics and plots;
run data mining tools; assist with other Senior Analyst duties if
interested and capable.
The ideal candidate has:
BS or MS in computer science, statistics, or related fields, or several
years' related experience; good UNIX and Win95 skills; good SQL, PERL,
AWK, or statistical tool skills; experience in data cleaning and
transformation; experience working with large databases; interest
in learning more about data mining.
Salaries are competitive, and based on experience. IBM's data mining
group is growing quickly, and offers excellent career opportunities.
Candidates for both positions should be energetic, fast learners,
dedicated to quality, and fun to work with.
For more information on data mining at IBM, see the webpage for IBM
Global Business Intelligence Solutions (our parent organization) at
http://www.ibm.com/bi
To apply for a position, send resume to
Dr. Ion Ratiu
Manager, Data Mining Analytical Services
IBM Global Business Intelligence Solutions
11400 Burnet Rd / IBM Zip 9661
Austin, TX 78758
Email: ratiu@us.ibm.com
FAX: 512-838-2457
ASCII (plain text) via email is *strongly* preferred. Please put
'DMJOBS-97.2:' then your name in the subject.
IBM is an equal opportunity employer.
--George H. John, PhD gjohn@almaden.ibm.com
--Senior Analyst, Data Mining Solutions, IBM Almaden
--(408) 927-2088 FAX (408) 927-2100 IBM Tie: 8-457-
Date: Mon, 21 Jul 1997 10:23:31 -0400
From: Brij Masand (brij@gte.com)
Subject: KDD Job at GTE Laboratories, Waltham, Ma
**** An Applied Researcher/Developer with Database experience *****
**** needed for The Knowledge Discovery in Databases group at *****
**** GTE Laboratories *****
Description: Participate in the design and development of
state-of-the-art systems for data mining and knowledge discovery.
The focus of the job is on integrating relational databases with
KDD systems, including development of prototypes to demonstrate
innovative business applications of KDD.
The candidate will join one of the leading R&D teams in the
area of data mining and knowledge discovery. Principal
responsibilities will include managing our relational database system
and related support for the various KDD activities. Our current
projects include predictive customer modeling for GTE's cellular
telephone markets (4 million customers), analysing web usage data, and
discovering interesting changes in customer databases. We are applying
multiple learning and discovery methods to large, high-dimensional
real-world databases, involving millions of records and Gbytes of data
and have created KDD-based solutions that are being deployed in the
field.
The ideal candidate will have a Ph.D. or M.S. in Machine
Learning/Databases/related fields and 2-3 years of experience. The
candidate should have significant experience with relational database
systems and be proficient in SQL. Experience with machine learning
algorithms and statistical techniques is expected. Experience with Web
programming and proficiency with HTML, Javascript, Java would be a
plus. Excellent coding skills in C/Unix environment and an ability to
quickly pick up new languages and technologies are needed. Good
communication skills, the ability to work in a team, and good system
maintenance practices are very desirable.
GTE Laboratories Incorporated, located in Waltham, MA, is the
central research facility for GTE. GTE is among the largest local
telephone carriers and provides local, long distance, cellular and
internet services. Our research facility is located in a quiet 50-acre
campus-like setting in Waltham, MA, 20 minutes from downtown
Boston. Our salaries are competitive, and our outstanding benefits
include medical/life/dental insurance, saving and investment plans,
and an on-site fitness center.
Please send a resume and a cover letter
(preferably by e-mail, in ASCII) to:
kddjob@gte.com
or by fax to 617.466.3342 (Attn: Brij Masand)
Date: Wed, 09 Jul 1997 11:44:29 +0200
From: Claire Nedellec (Claire.Nedellec@lri.fr)
Subject: ECML'98 - 1st announcement
TENTH EUROPEAN CONFERENCE ON MACHINE LEARNING (ECML-98)
Chemnitz, Germany, April 21-24 1998
-------------------------------------------------------------------------
GENERAL INFORMATION:
The 10th European Conference on Machine Learning (ECML-98) will be
held in Chemnitz (formerly Karl-Marx-Stadt, near Dresden), Germany, from
April 21 to 24, 1998.
Submissions are invited that describe empirical or theoretical research
in all areas of machine learning. In addition, papers from related
disciplines (for instance, information retrieval, pattern recognition,
cognitive modeling, evolutionary computation, artificial neural
networks, grammatical inference, reinforcement learning, etc.) that
deal with adaptive intelligence, (semi-)automated knowledge
acquisition, or (semi-)automated knowledge organization are welcome.
Submissions that describe the application of machine learning methods
to real-world problems are encouraged (for instance, natural language
processing, robotics, data mining, etc.), but such submissions should
address general issues of machine learning, perhaps illustrating
novel learning methods or demonstrating the utility of established
methods in previously unexplored settings.
PROGRAM CHAIRPERSONS:
Claire Nedellec and Celine Rouveirol (University of Paris-Sud, France)
LOCAL CHAIR:
Andreas Ittner (Chemnitz University of Technology, Germany)
PROGRAM COMMITTEE:
A. Aamodt (Norway)            N. Lavrac (Slovenia)
D. Aha (USA)                  R. Lopez de Mantaras (Spain)
F. Bergadano (Italy)          S. Matwin (Canada)
I. Bratko (Slovenia)          K. Morik (Germany)
P. Brazdil (Portugal)         G. Nakhaeizadeh (Germany)
W. Daelemans (Netherlands)    D. Page (UK)
L. De Raedt (Belgium)         L. Saitta (Italy)
M. Dorigo (Italy)             D. Sleeman (UK)
F. Esposito (Italy)           M. Van Someren (Netherlands)
T. Fogarty (UK)               P. Vitanyi (Netherlands)
J. Fuernkranz (Austria)       S. Wrobel (Germany)
Y. Kodratoff (France)         G. Widmer (Austria)
IMPORTANT DATES:
Submission deadline: 31 October 1997
Conference: 21-24 April 1998
IMPORTANT ADDRESS:
Submitted papers should be sent to:
Claire Nedellec and Celine Rouveirol
LRI, Bat 490            e-mail: cn/celine@lri.fr
Universite Paris-Sud    Tel: +33 (0)1 69 15 66 26
F-91405 Orsay           Fax: +33 (0)1 69 15 65 86
FRANCE