Knowledge Discovery Nuggets Index




Knowledge Discovery Nuggets 97:22, e-mailed 97-07-22

News:
* Dorothy Firsching, New Datamining discussion list
* Andy Pryke, 'Savagely networked bad decisions' is an anagram for ...
Publications:
* GPS, July '97 Datamation Article on Data Mining,
  http://www.datamation.com/PlugIn/issues/1997/july/07mine.html
* Michael Beddows, Infoworld 97-07-07, Data Mining articles
  http://www.infoworld.com/cgi-bin/displayStory.pl?/features/970707mining.htm
* Gerhard Widmer, MLJ Spec Issue on Context Sensitivity,
  Deadline extended to September 20, 1997.
  http://www.ai.univie.ac.at/mlj_specissue/
Siftware:
* Mike Bell, Q-Why, a rule finding data mining product
  http://www.qwhy.com/qwhy/
* Stuart Inglis, WEKA 2.2 Machine Learning workbench,
  http://www.cs.waikato.ac.nz/~ml
Positions:
* Donal Lyons, Ireland: Experienced researcher in Data Mining
* G. John, USA: IBM Data Mining Analyst and Data Engineer Positions
* Brij Masand, KDD Job at GTE Laboratories, Waltham, Ma
Meetings:
* Claire Nedellec, ECML'98, Chemnitz, Germany, April 21-24 1998
    --
    KD Nuggets is a newsletter for the
    Data Mining and Knowledge Discovery community, focusing on the
    latest research and applications.

    Submissions are most welcome and should be emailed, with a
    DESCRIPTIVE subject line (and a URL) to gps.
    Please keep CFP and meetings announcements short and provide
    a URL for details.

    To subscribe, see
    http://www.kdnuggets.com/subscribe.html

    KD Nuggets frequency is 3-4 times a month.
    Back issues of KD Nuggets, a catalog of data mining tools
    ('Siftware'), pointers to Data Mining Companies, Relevant Websites,
    Meetings, and more are available at the Knowledge Discovery Mine site
    at
    http://www.kdnuggets.com/

    -- Gregory Piatetsky-Shapiro (editor)
    gps

    ********************* Official disclaimer ***************************
    All opinions expressed herein are those of the contributors and not
    necessarily of their respective employers (or of KD Nuggets)
    *********************************************************************

    ~~~~~~~~~~~~ Quotable Quote ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    Anagrams of 'knowledge discovery in databases' :
    Evoke badly answered diagnostics
    Savagely networked bad decisions.
    thanks (??) to Andy Pryke


    Previous  1 Next   Top
    Date: Tue, 10 Jul 1997
    From: Dorothy Firsching (firschng@nautilus-systems.com)
    Subject: New Data Mining / Knowledge Discovery / Data Warehousing List

    A new discussion list, datamine-l, has been set up to provide a
    world-wide e-mail forum for discussing practical applications of
    data mining, data warehousing, and knowledge discovery. Topics
    can range from specific questions on tools to broad discussion
    of issues.

    This is an independent mailing list operated and maintained by
    Nautilus Systems, and is not officially affiliated with any tool
    vendor. We hope to provide an open and unbiased forum.

    To subscribe, simply send to:

    datamine-l-request@nautilus-sys.com

    the text:

    subscribe datamine-l

    in the body of your message (leave the subject blank).

    To post to the list, send your messages to:

    datamine-l@nautilus-sys.com

    We hope that you enjoy and profit from this open exchange of
    techniques and ideas!

    Sincerely,

    Nautilus Systems, Inc.
    firschng@nautilus-systems.com
    http://www.nautilus-systems.com/

    Dorothy Firsching
    CEO
    Nautilus Systems, Inc.
    3867 Alder Woods Court
    Fairfax, VA 22033
    http://www.nautilus-systems.com/
    firschng@nautilus-systems.com


    Previous  2 Next   Top
    From: 'A.N.Pryke@cs.bham.ac.uk' (A.N.Pryke@cs.bham.ac.uk)
    Subject: KDD Anagrams
    Date: Fri, 11 Jul 1997 14:07:26 -0400

    Gregory,
    thought you might find the following interesting (or dire!)

    Anagrams of 'knowledge discovery in databases' are (lines with '*'):

    Data mining, the perfect solution?
    *Glory be! Advanced idiot's weakness.
    *Evoke badly answered diagnostics.

    Credit checks...
    *Receives OK. Wasn't badly diagnosed.

    Maybe we should reconsider that database interface...
    *Nasty-looking, bad views decreased.
    *Second-rate view badly assigned OK.

    Application to text retrieval:
    *Keywords distance as enviable god.

    Fraud detection:
    *Wait! Badly envisaged crookedness.

    Techniques are improving:
    *Bravo! Steadily decoding weakness.

    Computers tell us what to do...
    *So dawned evangelistic keyboards.

    Visualisation helps decision making?
    *Okay, let's! Bad drawings evidence so.

    Soon your computer will know what you're thinking...
    *Envisage its scaled-down keyboard.

    Privacy:
    *Knew inadvisable secrets. Good day!

    Problems net searching for 'data mining'?
    *Inadvisable, as congested keyword.

    Misc:
    *Badly coordinated gives weakness.
    *Okay! wild observances designated.
    *A keyboards advising now selected.

    Weird:
    *Naked good-bye invalidates screws.

    And finally...
    *Savagely networked bad decisions.
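
Claims like these are easy to check mechanically. The short Python sketch below is an illustration only, not part of Andy's message and unrelated to how 'Anagram Genius' works; the helper names `letter_counts` and `is_anagram` are made up for the example. It compares letter counts while ignoring case, spaces, and punctuation.

```python
from collections import Counter


def letter_counts(text: str) -> Counter:
    """Count letters only, ignoring case, spaces, and punctuation."""
    return Counter(ch for ch in text.lower() if ch.isalpha())


def is_anagram(a: str, b: str) -> bool:
    """True when both strings use exactly the same multiset of letters."""
    return letter_counts(a) == letter_counts(b)


source = "knowledge discovery in databases"
for candidate in ("Savagely networked bad decisions.",
                  "Evoke badly answered diagnostics."):
    print(candidate, is_anagram(source, candidate))
```

Any other starred line above can be passed to `is_anagram` in the same way.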

    Anagrams courtesy of 'Anagram Genius'
    http://www.genius2000.com/anagram.html

    Andy

    --
    Andy Pryke, Research Student, Computer Science, Birmingham University
    Data Mining Information -
    http://www.cs.bham.ac.uk/~anp/TheDataMine.html


    Previous  3 Next   Top
    Date: Tue, 22 Jul 1997 09:44:29 -0500 (EST)
    From: GPS (gps)
    Subject: Datamation July 1997 on Data Mining

    July 1997 Datamation has an article on data mining,
    'Datamining unearths dollars from data', by Eva Freeman.

    The article describes an application of data mining at
    Milwaukee-based Firstar bank, and profiles 5 data mining
    tools: TMC Darwin, DataMind Professional Edition, IBM Intelligent Miner,
    Angoss KnowledgeSEEKER, and HNC Marksman.

    See full article at
    http://www.datamation.com/PlugIn/issues/1997/july/07mine.html

    Here is the main part of the article, without the tool reviews.

    ----
    Have you learned anything new from your
    data lately? Datamining will help you find
    subtle, unexpected patterns hidden in your
    database, which could lead to increased sales
    and healthier profits.

    By Eva Freeman


    It was obvious to everyone in the marketing department
    at Firstar Bank that bombarding customers with
    advertisements was a waste of resources. But how could
    the $20 billion bank holding company, based in
    Milwaukee, know when a customer was ready to look at a new product?
    Enter datamining.

    Ted Bratanow, Firstar Bank's director of market
    research and database marketing, found that the corporate database
    held a tremendous amount of information about every customer.
    The trick was to find patterns in the data that
    would reveal why customers had moved into new products, and to exploit
    those patterns through targeted direct mailings.

    For the analyses, Firstar used the Marksman tool
    from HNC Software of San Diego. 'Marksman can read 800 to 1,000
    variables and can attach scores to each one,' says Bratanow. 'We
    rank-ordered customers into different groups, according to whether
    they had home-equity loans, charge cards, certificates of deposit or
    other savings accounts, or investment products. Then we used the
    datamining process to predict which products would be right to offer
    to each customer at which time.'

    The bottom line? 'Direct marketers are usually
    pleased when they can increase response rates by a few percent,' says
    Bratanow. 'Our response rate improved by a factor of four.'
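
To make the response-rate arithmetic concrete, here is a small Python sketch. It is not Firstar's or HNC's method, which the article does not describe; the scores and responses are invented, and `response_rate` and `lift_at` are hypothetical helpers. It rank-orders customers by a model score and compares the response rate in the top slice with the overall rate, a ratio usually called lift.

```python
def response_rate(responses):
    """Fraction of mailed customers who responded (1) rather than not (0)."""
    return sum(responses) / len(responses)


def lift_at(scored, fraction):
    """Response rate among the top `fraction` of customers by score,
    divided by the overall response rate (assumes at least one response)."""
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    top_n = max(1, int(len(ranked) * fraction))
    top = [responded for _, responded in ranked[:top_n]]
    overall = [responded for _, responded in ranked]
    return response_rate(top) / response_rate(overall)


# Invented (score, responded) pairs for ten customers.
scored = [(0.9, 1), (0.8, 1), (0.7, 0), (0.6, 0), (0.5, 0),
          (0.4, 0), (0.3, 0), (0.2, 0), (0.1, 0), (0.05, 0)]
print(lift_at(scored, 0.2))
```

A lift of 4 in the mailed segment would correspond to the factor-of-four improvement Bratanow describes, assuming the comparison is against an untargeted mailing.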

    Define the problem first

    'We love to tell stories about correlations no one would have
    expected,' says Tricia Beardslee, principal at Fairfax, Va.-based AMS
    Center for Advanced Technologies and the lead analyst at the
    datamining and modeling laboratory. 'Did you know that...chain saws
    and beds sell well together in Minnesota in October? That sounds
    pretty strange, until you think about all the people getting their
    vacation homes ready for hunting season.'

    But Beardslee cautions that datamining is not a
    panacea. 'Success comes when the business
    problem has been specified clearly. You can't
    expect to find anything useful if you just throw a
    data warehouse into a datamining tool.'

    Other experts advise users to temper their
    expectations. 'Even if you don't get a killer
    insight, you can see a huge return on your
    investment if you find something that will, for
    example, increase the response to direct
    mailings by 2 or 3%,' says Herb Edelstein,
    president of Two Crows, a datamining
    consulting firm based in Potomac, Md.

    Use the right tool

    Once the value of datamining for a business
    problem has been identified, IT execs must
    determine the best approach. Aaron Zornes,
    executive VP for application delivery strategies
    with the Burlingame, Calif., office of the META
    Group, divides datamining tools into
    micromining and macromining.

    Micromining tools are inexpensive, have short learning curves, and
    usually run on PCs. A good example of a micromining product would be
    Angoss Software's KnowledgeSEEKER. These tools are not universally
    useful, however, because they offer only a single algorithm, and that
    algorithm may not work equally well in different business
    applications.

    Zornes contrasts the less-expensive datamining tools with macromining
    products like IBM's Intelligent Miner. Macromining tools can operate
    in massively parallel architectures as well as in other types of
    servers. They offer a full suite of algorithms: statistical, decision
    tree, and neural network. But these tools cost more and, on top of
    that, you'll need outside help.

    Before embarking on a datamining project, you should make sure your
    data's worth mining in the first place, warns Zornes. 'Just remember
    that the cost of datamining tools may not be all that large a factor
    in a datamining activity,' he says. 'About 60 to 80% of the investment
    usually is in data preparation.'

    Data preparation is, in fact, the key to success in datamining. Without clean data and good models, all you'll have is garbage in, garbage out--even if the results are calculated to three decimal places.

    Eva Freeman is a freelance high-technology writer based in Bellevue, Wash. She can be reached at freeman@real.com.


    Datamation also has a special section on Data Mining
    at
    http://www.datamation.com/PlugIn/workbench/datamine/datamine.htm


    Previous  4 Next   Top
    From: 'Michael R. Beddows' (mbeddows@kstream.com)
    Date: Fri, 11 Jul 1997 13:55:33 -0400
    Subject: Infoworld 97-07-07 data mining articles

    Infoworld has several data mining related articles:

    * Users find tangible rewards digging into data mines
    * Data mining defined
    * Know your customers
    * U.S. Department of Energy finds clues to terrorist activities

    at
    http://www.infoworld.com/cgi-bin/displayStory.pl?/features/970707mining.htm

    Here is the first one.

    Users find tangible rewards digging into data mines



    Although the term is often confused with OLAP, data mining proves
    its worth in retail and banking



    By Steve Alexander



    The truth about data mining is as elusive
    as the nuggets of information it is designed to find.



    Data mining is called the next step beyond online analytical processing
    (OLAP) for querying data warehouses. Rather than seek out known
    relationships -- such as a list of all catalog customers who recently
    moved -- it sifts through data for unknown relationships, such as a
    previously unsuspected link between gourmet food purchases and
    motorcycle ownership.



    'If you say, `How many widgets did we sell in the spring of 1996 in
    sales region A vs. sales region B?' that's OLAP,' says Mark Brown,
    program manager for data mining at SAS Institute, in Cary, N.C. 'If you
    say, `What are the drivers that caused people to buy these widgets from
    my catalog?' that's data mining.'



    But is data mining still unproven and overhyped? Or is it more
    successful than is generally acknowledged and simply kept quiet because
    it provides users with a competitive advantage?



    Aaron Zornes, executive vice president of the Meta Group, in Burlingame,
    Calif., says it's a little of both. The value of data mining is proven,
    but it remains difficult to use.



    'Data mining is a deep, dark, secret weapon within corporations that is
    providing such a competitive advantage they don't want the world to know
    what they're doing. But the tools are not easy enough for most
    corporations to use,' Zornes says.





    MICRO MINES. There is some evidence that Zornes is right. In
    most corporations, traditional, server-based data mining remains mainly
    in the hands of IS professionals. Client-based data mining that
    analyzes a subset of the data-warehouse contents and touts ease of use
    is new, and its effectiveness has yet to be determined. Among the major
    data-mining players are IBM, Thinking Machines, DataMind, Pilot
    Software, Business Objects, SAS Institute, Angoss International,
    NeoVista Solutions, Magnify, and Cognos.



    Zornes says that high-end, server data-mining software licenses
    typically cost $150,000 to $200,000, while desktop or 'micromining'
    software licenses typically cost $500 to $50,000.



    Although these high fees may put you off, some vendors and users say
    data mining offers a clear return on investment and a clear competitive
    weapon.



    'Data mining is quickly becoming a necessity, and those who do not do it
    will soon be left in the dust,' Brown says. 'Data mining is one of the
    few software activities with measurable return on investment associated
    with it. The banking and catalog industries are making lots of [returns
    on investment] today.'



    The SAS server-based Enterprise Miner license fee costs $45,000 or
    more.



    Brant Davison, product manager of business-intelligence software
    solutions for IBM's Software Group, in Somers, N.Y., says companies can
    most easily use data mining if they've invested in a data warehouse,
    although they can get along without one if they're willing to assemble
    the data from various database sources. IBM's Intelligent Miner can
    deal with data warehouses containing hundreds of gigabytes or terabytes
    of information.



    IBM entered high-end, server-based data mining a year ago, but its
    efforts in low-end desktop data mining -- where it deals with mining PC
    spreadsheets containing as much as hundreds of megabytes of data --
    remain a research venture. IBM's Intelligent Miner licenses range in
    price from $25,000 to $150,000 for systems from Risc/6000 Unix machines
    to System/390 mainframes.



    'I think data mining is taking off, although it's not taking off in all
    segments of the marketplace,' Davison says. 'Where a lot of customers
    are finding value is in developing models for customer buying behavior.
    In particular, we are seeing initial acceptance of data mining in
    insurance, finance, retail, and telephone companies. Those industries
    have a lot of customers, products, and transactions, and they need a
    system to help them understand the value within that information.'





    BEHAVIOR. Banks, for example, may use data mining to identify
    their most profitable credit-card customers or their highest-risk loan
    applicants, Davison says. They also may seek to prevent fraud by using
    a data-mining technique called 'deviation detection': Rather than
    finding relationships between different groups of data records, it finds
    events that are outside the norm that could be a sign of fraudulent
    activities.
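
The article does not say how any vendor's tool implements deviation detection. As a rough illustration of the idea only, the Python sketch below flags values that sit far from the norm using a robust score based on the median absolute deviation (MAD). The function name, the sample amounts, and the 3.5 cutoff (one common convention for this score) are all assumptions made for the example.

```python
from statistics import median


def flag_deviations(amounts, threshold=3.5):
    """Return the values whose MAD-based score exceeds `threshold`."""
    center = median(amounts)
    mad = median(abs(x - center) for x in amounts)
    if mad == 0:
        return []  # no spread to measure deviations against
    # 0.6745 rescales the MAD so the score is comparable to a
    # standard z-score when the data are roughly normal.
    return [x for x in amounts if 0.6745 * abs(x - center) / mad > threshold]


# Invented card-transaction amounts; one is far outside the usual range.
amounts = [42.0, 38.5, 45.0, 40.0, 39.0, 41.5, 950.0, 43.0]
print(flag_deviations(amounts))
```

The median and MAD are used instead of the mean and standard deviation because a single extreme value inflates the latter and can hide itself in a small sample.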



    Brown says companies that use data mining to study their customers
    usually are focused on how to retain customers, separate profitable
    customers from unprofitable ones, uncover fraud, sell existing customers
    new products, and understand why some customers leave.



    Davison sees data mining and OLAP working together.



    'Using data mining, you may come up with a model to find who are the
    most profitable customers. Then you may do more traditional OLAP
    analysis of that subset of data to see what the impact would be if you
    lost those customers, how it would affect your bottom line,' Davison
    says.



    But how accurate is data mining? When it comes to analyzing data, it's
    pretty accurate, says Alex Moissis, director of product marketing for
    Business Objects, which has headquarters in Paris and in San Jose,
    Calif.



    'But, to the extent to which the tool makes a prediction based on this
    data set, you start getting into the issue of accuracy,' Moissis
    says.



    In other words, there is no guarantee that all gourmet food eaters will
    want motorcycles just because the ones in your data set did; 'garbage
    in, garbage out' applies to data mining, as well.



    Moissis says desktop data mining on a PC strikes at the three major
    obstacles to wider adoption of mining technology: the customer
    perception that it is too difficult, too expensive, and conceptually
    difficult to understand (and therefore difficult for management decision
    makers to believe in).



    'The audience for desktop data mining is the mainstream business user,'
    Moissis says.



    For example, although an automobile manufacturer probably has a large,
    server-based data-mining application for corporate use, its individual
    auto dealers may need desktop data mining to look at buying patterns for
    just their own customers. As a result, desktop data mining will
    complement rather than replace server-based data mining and offer far
    lower prices to spur adoption.



    Business Objects' Business Miner desktop data-mining software,
    introduced earlier this year, costs $995 as a stand-alone application on
    a Windows-based PC.





    REALLY CLEAN DATA. In general, data mining involves refining
    data so that it uses the same variables, then searching for patterns in
    the data using statistical software models. Users report that preparing
    data for mining is frequently 80 percent of the work. Among the
    data-mining techniques are 'neural networks' (programs that mimic the
    brain's ability to learn from its mistakes), 'time-series analysis'
    (year-to-year comparisons), and 'tree-based models' (branching systems
    that show relationships in the form of a hierarchy, such as an
    organizational chart).



    Although banks, financial services, and direct-marketing companies have
    been doing something similar to data mining for the past 15 to 20 years,
    many have relied on data-service companies to provide them with
    predictive statistical models, Brown says. Now new data-mining software
    allows those corporations to do the work in-house while coupling
    traditional statistical techniques with software-industry technologies,
    such as neural networks and decision trees.



    Early users of server-based data mining report that the somewhat
    esoteric technology is bringing back some tangible results.



    At Pittsburgh's Mellon Bank, IBM's Intelligent Miner is being used with
    S/390 and RS/6000 servers and DB2 databases to study as much as 10GB of
    information on consumer bank customers, with an eye to retaining the
    most profitable ones. Based on historical data, the bank tries to
    predict which customers will be profitable in the future. The bank also
    uses data mining to project which customers are likely to switch from a
    Mellon credit card to another bank's card based on historical patterns
    of use.



    Data mining showed that the best predictive factors for Mellon
    credit-card customer attrition are the frequency of card use and the
    types of purchases that are or aren't made, says Pete Johnson, vice
    president of Mellon's advanced technology group. The bank tracks broad
    categories of credit-card purchases, such as whether they were made at a
    grocery store, a department store, or a gas station. Johnson declines
    to name the purchase categories that data mining showed were leading
    indicators of customer attrition because he considers it valuable
    competitive information.



    'Data mining helps us play in the national credit-card market more
    effectively. That's what all the national banks are trying to do with
    their data-mining efforts,' Johnson says. 'Small credit-card companies
    that are unable to embrace this technology can no longer survive because
    they can't manage their customer bases.'



    Although banks such as Mellon have been using statistical models for
    many years, 'current data-mining software is more scalable and can
    analyze bigger quantities of data. And the engineering of the software
    is such that we don't have to write lines and lines of code to do it,'
    Johnson says.





    ROCKET SCIENCE. But data mining remains complex. Although
    Mellon's goal is to have nontechnical business analysts use the IBM
    data-mining software, today it is used only by IT people and
    sophisticated business analysts.



    Fingerhut Companies, a Minneapolis-based direct-mail catalog company
    using SAS' server-based Enterprise Miner on IBM mainframes and RS/6000
    Unix machines, is sifting through a database of 10 million to 12 million
    current customers to find which are most likely to buy products from one
    of the company's many catalogs.



    Fingerhut, which has 9,500 employees and mails 130 different catalogs
    each year, is among the true believers in data mining: All catalog
    mailings, credit-granting decisions, and inventory-stocking decisions
    are based on it, says Andy Johnson, Fingerhut's senior vice president of
    marketing.



    Fingerhut wants to find out which customers it could profitably send
    catalogs to. It recently used data mining to study past purchases of
    customers who had changed residences to see if they had preferences.
    Data mining showed those customers were three times more likely to buy
    items such as tables, fax machines, phones, and decorative products, but
    that they were not more likely to purchase high-end consumer
    electronics, Johnson says. Fingerhut used that information to create a
    special catalog that it mailed only to those customers who had recently
    moved.



    Johnson's only caveat: You need good data that has been properly
    prepared in order to make money using data mining.



    'People who can't see the value in data mining as a concept either don't
    have the data or don't have data with integrity. We've spent a lot of
    time, money, and energy getting those two things.'



    Although Fingerhut has used statistical modeling for about 20 years, new
    data-mining software allows the company to look at a broader range of
    information and larger databases, says Bill Flach, Fingerhut's director
    of marketing analysis and research. For example, before data mining,
    Fingerhut's statistical analysis was limited to taking samples of 10
    percent to 20 percent of its customers. With data mining, it can
    examine 300 specific characteristics of each of the 10 million to 12
    million customers in a much more focused way.



    Flach believes new data-mining software will be easier to use by people
    who are not IS employees or statisticians. But for now, data mining
    remains in the hands of about 30 Fingerhut people trained in statistics
    plus another 100 IS users.



    But even data mining's present accomplishments are impressive, says
    Mellon's Johnson. Data-mining results have caused the bank to rethink
    its view of transaction data.



    'Traditional information-management systems are not designed to collect
    transactions as information assets. We didn't know until we did data
    mining that transaction detail is very valuable,' Johnson says. 'I like
    to describe data mining as the carrot that justifies the expensive stick
    of building a data warehouse.'



    Steve Alexander is a free-lance writer based in Edina, Minn.



    Copyright © 1997 InfoWorld Publishing Company


    Previous  5 Next   Top
    From: Gerhard Widmer (gerhard@ai.univie.ac.at)
    Subject: MLJ Special Issue: Deadline Extension
    Date: Wed, 16 Jul 1997 06:02:09 -0400

    This is to announce that for organizational reasons,
    the deadline for submissions to the

    SPECIAL ISSUE ON CONTEXT SENSITIVITY AND CONCEPT DRIFT
    of the
    MACHINE LEARNING JOURNAL

    Miroslav Kubat and Gerhard Widmer, Guest Editors

    has been extended to SEPTEMBER 20, 1997.

    Full information is at
    http://www.ai.univie.ac.at/mlj_specissue/


    Previous  6 Next   Top
    Date: Mon, 23 Jun 1997 11:57:44 -0400
    From: Mike Bell (mbell@qwhy.com)
    Subject: New Siftware Entry (Q-Why)

    *URL: http://www.qwhy.com/qwhy/
    *Description: Q-Why is a rule finding data mining product for small to medium sized databases.
    *Discovery tasks: Classification/Rule Discovery Approach
    *Comments:
    Q-Why uses a heuristic search to find possible explanations as to why one
    set of records (e.g. customers who bought a particular product) differs
    from other records in the database. This has a number of applications, in
    market segmentation, survey analysis, direct marketing, etc. There is a
    Q-Why Light freeware version for academic/personal use, and higher end versions
    available for commercial use on larger databases.

    *Platform(s): Windows (95, NT)
    *Contact:
    Les Horn, President
    Quintillion Corporation
    address: 380 Pinhey Point Road, Dunrobin, Ontario K0A 1T0, Canada
    phone +1 (613) 832-4894
    fax +1 (613) 832-0547
    email les@qwhy.com

    *Status: Commercial Product + Freeware version for academic/personal use
    *Source of information: vendor
    *Updated: 1997-06-23 by Mike Bell, (mbell@qwhy.com)

    Mike Bell (mbell@qwhy.com)
    R&D Manager
    Quintillion Corporation


    Previous  7 Next   Top
    From: Stuart Inglis (singlis@lucy.cs.waikato.ac.nz)
    Subject: WEKA 2.2
    Date: Wed, 16 Jul 1997 21:58:25 -0400

    WEKA, a software system for machine learning, has
    been updated.

    It includes M5' and K*, two of the best tools available.

    cheers
    --
    Stuart Inglis,
    Department of Computer Science
    University of Waikato, Hamilton, New Zealand

    ===========================================================================

    The WEKA Machine Learning workbench
    -----------------------------------

    WEKA 2.2 is now available from
    http://www.cs.waikato.ac.nz/~ml
    for downloading and experimentation. WEKA is a software workbench
    for applying machine learning techniques to practical problems.
    It integrates many different machine learning tools within a
    common framework and a uniform user interface. It runs on a
    Unix/X system using Tcl8.0/Tk8.0.

    WEKA 2.2 includes:

    Uniform user interface
    Tutorial
    1R and T2 programs for simple rules
    Induct program for more complex rules
    IB1-4, PEBLS and K* programs for instance-based learning
    M5' program for regression model trees
    FOIL program for relational rules
    Sample data sets in WEKA format
    Programs for processing WEKA data files
    Rule evaluator.
    Makefile based experiment editor for experiments
    (includes t-tests and other stats)
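
As an illustration of the simplest learner in the list above, the idea behind 1R (one-rule) classification fits in a few lines of Python. This is not WEKA's code: the function name `one_r` and the toy data are invented for the example, and the sketch handles nominal attributes only. For each attribute it maps every value to its majority class, then keeps the attribute whose rule makes the fewest errors on the training data.

```python
from collections import Counter, defaultdict


def one_r(rows, target):
    """Return (attribute, rule, errors) for the single attribute whose
    value -> majority-class rule makes the fewest training errors."""
    best = None
    attributes = [name for name in rows[0] if name != target]
    for attr in attributes:
        # Tally class counts for each value of this attribute.
        counts = defaultdict(Counter)
        for row in rows:
            counts[row[attr]][row[target]] += 1
        # Predict the majority class for each attribute value.
        rule = {value: tally.most_common(1)[0][0]
                for value, tally in counts.items()}
        errors = sum(1 for row in rows if rule[row[attr]] != row[target])
        if best is None or errors < best[2]:
            best = (attr, rule, errors)
    return best


# Invented toy data set with two nominal attributes and a class label.
rows = [
    {"outlook": "sunny", "windy": "no", "play": "no"},
    {"outlook": "sunny", "windy": "yes", "play": "no"},
    {"outlook": "overcast", "windy": "no", "play": "yes"},
    {"outlook": "rainy", "windy": "no", "play": "yes"},
    {"outlook": "rainy", "windy": "yes", "play": "no"},
    {"outlook": "overcast", "windy": "yes", "play": "yes"},
]
attr, rule, errors = one_r(rows, "play")
print(attr, rule, errors)
```

On this toy data the rule is built on `outlook`, which misclassifies one training row, against two for `windy`.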

    It can be extended by adding modules (which need additional
    software and/or licences) for:

    C5.0
    Dotty tree visualization
    XGOBI data visualization
    Autoclass and classweb clustering programs
    More comprehensive rule evaluator
    (using Eclipse Prolog)

    More details are at
    http://www.cs.waikato.ac.nz/~ml

    --------------------

    * WEKA stands for Waikato Environment for Knowledge Analysis. Found only
    on the islands of New Zealand, the weka is a flightless bird with an
    inquisitive nature.


    Previous  8 Next   Top
    From: Donal Lyons (dlyons@stats.tcd.ie)
    Subject: Experienced researcher in Data Mining
    Date: Wed, 16 Jul 1997 11:55:22 -0400

    The School of Systems and Data Studies, Trinity College, Dublin, Ireland is
    interested in discussing a visiting researcher position with an experienced
    researcher in the Data Mining field. The intention is to apply for EU
    funding for this position, as outlined below.

    If any EU (or Associated State) researcher is interested in exploring this,
    please contact me.

    ============================================================================

    EU funding is available to enable experienced researchers to come to less
    favoured regions such as Ireland for one year - salary and mobility costs
    are provided by the EU - there is a high success rate.

    One of the priorities of the fourth framework is the training and mobility
    of researchers - through the training and mobility of researchers programme
    (TMR), also called Marie Curie Fellowships. Several categories are
    open: pre-doctoral, post-doctoral, return grant, and experienced
    researcher.

    The category of experienced researcher is very underutilised and good
    proposals will be funded. Engineering and Mathematics are underrepresented.


    A proposal for the School of Systems and Data Studies:
    The School has an interest in forming an ongoing Data Mining Interest
    Group. This would consist of a number of staff whose research interests
    lie in this area and some graduate students. To set up this group, an
    experienced researcher with expertise in the area of Data Mining would be
    of great value. The specific tasks most needed are:

    1) Setting up the Data Mining Interest Group;
    2) Technology transfer to staff, post-graduate and undergraduate
    students via Seminars and Courses;
    3) Development of course material for subsequent use;
    4) Data Mining consultancy with Irish companies;
    5) Co-authorship of research papers;
    6) Development of algorithms for use on a parallel-processing
    supercomputer.

    The exact mix of these and other tasks could be varied depending on the
    interests and experience of the visitor.


    What is an Experienced Researcher?
    The experienced researcher category is reserved for scientists who wish to
    join a research team in a less-favoured region of a country other than
    their own. These researchers, who will have at least 8 years of
    full-time research experience at postgraduate level, will, in the capacity
    of 'visiting professors', impart their knowledge and research experience.

    The researcher must be a citizen of an EU Member (or Associated) State.


    Funding available:
    Negotiable with the commission - will cover salary and mobility costs for
    the experienced researcher for a year (based on what the researcher was
    earning in his home country - it is normally generous).

    Deadline:
    15-12-1997 for a decision date of 15-05-1998.

    Donal Lyons,                        Phone (1000-1700 GMT) +353 1 608 1919
    Lecturer (Information Systems)      Phone Messages        +353 1 608 1767
    School of Systems & Data Studies
    Trinity College, Dublin 2,          FAX on request
    Ireland.
  • http://www2.tcd.ie/Statistics/staff/dlyons.html



    From: gjohn@almaden.ibm.com
    Subject: IBM Data Mining Jobs
    Date: Wed, 16 Jul 1997 01:39:22 -0400

    IBM DATA MINING ANALYST and DATA ENGINEER POSITIONS

    IBM's data mining group is growing again! We need 12 more analysts
    and data engineers for our highly successful data mining group.
    Join our team in an exciting multi-faceted career in data mining!

    SENIOR DATA MINING ANALYST

    Duties:
    Interact directly with customers, help them understand data mining,
    define a project, analyze their data, and present results; teach data
    mining classes and develop course materials; travel (e.g., Maui, Paris,
    Tokyo, Sydney, Rio de Janeiro,... and maybe a few less exciting
    places); interact with researchers and product developers, discuss
    ideas for new data mining algorithms, new business applications, new
    product features, and new visualizations; assist sales reps in
    customer visits; assist marketers in developing demos and brochures.

    The ideal candidate has:
    Excellent understanding of the data analysis process; experience in
    using data mining to solve problems; PhD in statistics, machine
    learning, neural nets, pattern recognition, or related, or MS/MBA with
    several years' experience; excellent communication and presentation
    skills; Unix & PC skills.


    DATA ENGINEER

    Duties:
    Assist senior analysts on data mining projects: data extraction,
    loading, cleaning, transformation; work with databases, create tables,
    extract data from tables; work with statistical tools, transform
    variables, reduce variables, calculate summary statistics and plots;
    run data mining tools; assist with other Senior Analyst duties if
    interested and capable.

    The ideal candidate has:
    BS or MS in computer science, statistics, or related fields, or several
    years' related experience; good UNIX and Win95 skills; good SQL, PERL,
    AWK, or statistical tool skills; experience in data cleaning and
    transformation; experience working with large databases; interest
    in learning more about data mining.


    Salaries are competitive, and based on experience. IBM's data mining
    group is growing quickly, and offers excellent career opportunities.
    Candidates for both positions should be energetic, fast learners,
    dedicated to quality, and fun to work with.

    For more information on data mining at IBM, see the webpage for IBM
    Global Business Intelligence Solutions (our parent organization) at
  • http://www.ibm.com/bi


    To apply for a position, send resume to
    Dr. Ion Ratiu
    Manager, Data Mining Analytical Services
    IBM Global Business Intelligence Solutions
    11400 Burnet Rd / IBM Zip 9661
    Austin, TX 78758
    Email: ratiu@us.ibm.com
    FAX: 512-838-2457

    ASCII (plain text) via email is *strongly* preferred. Please put
    'DMJOBS-97.2:' then your name in the subject.

    IBM is an equal opportunity employer.


    --George H. John, PhD gjohn@almaden.ibm.com
    --Senior Analyst, Data Mining Solutions, IBM Almaden
    --(408) 927-2088 FAX (408) 927-2100 IBM Tie: 8-457-


    Date: Mon, 21 Jul 1997 10:23:31 -0400
    From: Brij Masand (brij@gte.com)
    Subject: KDD Job at GTE Laboratories, Waltham, Ma

    **** An Applied Researcher/Developer with Database experience *****
    **** needed for The Knowledge Discovery in Databases group at *****
    **** GTE Laboratories *****

    Description: Participate in the design and development of
    state-of-the-art systems for data mining and knowledge discovery.
    The focus of the job is on integrating relational databases with
    KDD systems, including development of prototypes to demonstrate
    innovative business applications of KDD.
    The candidate will join one of the leading R&D teams in the
    area of data mining and knowledge discovery. Principal
    responsibilities will include managing our relational database system
    and related support for the various KDD activities. Our current
    projects include predictive customer modeling for GTE's cellular
    telephone markets (4 million customers), analysing web usage data, and
    discovering interesting changes in customer databases. We are applying
    multiple learning and discovery methods to large, high-dimensional
    real-world databases, involving millions of records and Gbytes of data
    and have created KDD-based solutions that are being deployed in the
    field.
    The ideal candidate will have a Ph.D. or M.S. in Machine
    Learning/Databases/related fields and 2-3 years of experience. The
    candidate should have significant experience with relational database
    systems and be proficient in SQL. Experience with machine learning
    algorithms and statistical techniques is expected. Experience with Web
    programming and proficiency with HTML, JavaScript, and Java would be a
    plus. Excellent coding skills in a C/Unix environment and an ability to
    quickly pick up new languages and technologies are needed. Good
    communication skills, the ability to work in a team, and good system
    maintenance practices are very desirable.
    GTE Laboratories Incorporated, located in Waltham, MA, is the
    central research facility for GTE. GTE is among the largest local
    telephone carriers and provides local, long distance, cellular and
    internet services. Our research facility is set on a quiet 50-acre
    campus in Waltham, MA, 20 minutes from downtown
    Boston. Our salaries are competitive, and our outstanding benefits
    include medical/life/dental insurance, saving and investment plans,
    and an on-site fitness center.

    Please send a resume and a cover letter
    (preferably by e-mail, in ASCII) to:

    kddjob@gte.com

    or by fax to 617.466.3342 (Attn: Brij Masand)


    Date: Wed, 09 Jul 1997 11:44:29 +0200
    From: Claire Nedellec (Claire.Nedellec@lri.fr)
    Subject: ECML'98 - 1st announcement

    TENTH EUROPEAN CONFERENCE ON MACHINE LEARNING (ECML-98)
    Chemnitz, Germany, April 21-24 1998

    -------------------------------------------------------------------------

    GENERAL INFORMATION:

    The 10th European Conference on Machine Learning (ECML-98) will be
    held in Chemnitz (formerly Karl-Marx-Stadt, near Dresden), Germany, from
    April 21 to 24, 1998.

    Submissions are invited that describe empirical or theoretical research
    in all areas of machine learning. In addition, papers from related
    disciplines (for instance, information retrieval, pattern recognition,
    cognitive modeling, evolutionary computation, artificial neural
    networks, grammatical inference, reinforcement learning, etc.) that
    deal with adaptive intelligence, (semi-)automated knowledge
    acquisition, or (semi-)automated knowledge organization are welcome.

    Submissions that describe the application of machine learning methods
    to real-world problems are encouraged (for instance, natural language
    processing, robotics, data mining, etc.), but such submissions should
    address general issues of machine learning, perhaps illustrating
    novel learning methods or demonstrating the utility of established
    methods in previously unexplored settings.



    PROGRAM CHAIRPERSONS

    Claire Nedellec and Celine Rouveirol (University of Paris-Sud, France)


    LOCAL CHAIR:

    Andreas Ittner (Chemnitz University of Technology, Germany)



    PROGRAM COMMITTEE:

    A. Aamodt (Norway)            N. Lavrac (Slovenia)
    D. Aha (USA)                  R. Lopez de Mantaras (Spain)
    F. Bergadano (Italy)          S. Matwin (Canada)
    I. Bratko (Slovenia)          K. Morik (Germany)
    P. Brazdil (Portugal)         G. Nakhaeizadeh (Germany)
    W. Daelemans (Netherlands)    D. Page (UK)
    L. De Raedt (Belgium)         L. Saitta (Italy)
    M. Dorigo (Italy)             D. Sleeman (UK)
    F. Esposito (Italy)           M. Van Someren (Netherlands)
    T. Fogarty (UK)               P. Vitanyi (Netherlands)
    J. Fuernkranz (Austria)       S. Wrobel (Germany)
    Y. Kodratoff (France)         G. Widmer (Austria)



    IMPORTANT DATES:

    Submission deadline: 31 October 1997
    Conference: 21-24 April 1998


    IMPORTANT ADDRESS

    Submitted papers should be sent to:

    Claire Nedellec and Celine Rouveirol
    LRI, Bat 490            e-mail: cn/celine@lri.fr
    Universite Paris-Sud    Tel: +33 (0)1 69 15 66 26
    F-91405 Orsay           Fax: +33 (0)1 69 15 65 86
    FRANCE

