KDD-CUP-98

The Second International Knowledge Discovery and
Data Mining Tools Competition
Held in Conjunction with KDD-98
The Fourth International Conference on Knowledge
Discovery and Data Mining


Sponsored by the
American Association for Artificial Intelligence (AAAI)
Epsilon Data Mining Laboratory
Paralyzed Veterans of America (PVA)


KDD-CUP is a knowledge discovery and data mining (KDDM) tools competition held in conjunction with the Fourth International Conference on Knowledge Discovery and Data Mining.

Last year, KDD-CUP-97 enjoyed worldwide participation of 45 data mining tools. The Gold Miner award was shared by UCSD's BNB (Boosted Naive Bayes Classifier) software and Urban Science's GainSmarts software. SGI's MineSet was the runner-up and earned the Bronze Miner award. For more information on KDD-CUP-97, please refer to the URL: www.epsilon.com/new. Some of the highlights from last year's competition are as follows:

KDD-CUP-98 will build on the success of last year's competition. The CUP is again open to all KDDM tool vendors, academics with research prototypes and corporations with significant applications. Attendance at the KDD-98 conference is not required to participate in the CUP.

KDD-CUP Process and Important Dates

KDD-CUP Data Set

The data set for this year's Cup has been generously provided by the Paralyzed Veterans of America (PVA). PVA is a not-for-profit organization that provides programs and services for US veterans with spinal cord injuries or disease. With an in-house database of over 13 million donors, PVA is also one of the largest direct mail fund raisers in the country.

Participants in the CUP will demonstrate the performance of their tool by analyzing the results of one of PVA's recent fund raising appeals. This mailing was dropped in June 1997 to a total of 3.5 million PVA donors. It included a gift "premium" of personalized name & address labels plus an assortment of 10 note cards and envelopes. All of the donors who received this mailing were acquired by PVA through premium-oriented appeals like this. The analysis data set will include:

Unlike last year, all available information about the fields will be provided in the project documentation.

The objective of the analysis will be to identify response to this mailing -- a classification or discrimination problem.

Performance Evaluation Criteria

The CUP is aimed at recognizing the most accurate, innovative, efficient and methodologically advanced data mining tools in the marketplace.

The participants will again be evaluated based on the performance of their algorithm on the validation or hold-out data set. The KDD-CUP program committee will consider the following metrics in their evaluations:

Last year, performance in the top 10 percent of the file was considered a measure of precision, while performance in the top 40 percent of the file was considered a measure of stability and marketing coverage. The average performance up to the 40th percentile was also examined as a measure of overall performance.
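
As an illustration only, the sketch below shows one way such top-of-file measures could be computed. It assumes that "performance" means the response rate among the highest-scored records, expressed as lift over the response rate of the whole file, and that the "average performance up to the 40th percentile" is the mean lift at the 10, 20, 30 and 40 percent cutoffs. These definitions, and the function names, are assumptions made for the example; the committee's exact formulas are not stated here.

```python
def top_fraction_response_rate(scores, responded, fraction):
    """Response rate among the top `fraction` of records, ranked by score.

    scores    -- model scores, higher means more likely to respond
    responded -- 1 if the donor responded to the mailing, else 0
    fraction  -- share of the file to keep, e.g. 0.10 for the top 10 percent
    """
    if len(scores) != len(responded):
        raise ValueError("scores and responded must have the same length")
    if not scores:
        raise ValueError("the file must contain at least one record")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")

    # Rank the file from the highest score to the lowest.
    ranked = sorted(zip(scores, responded), key=lambda pair: pair[0], reverse=True)

    # Keep at least one record so the rate is always defined.
    cutoff = max(1, round(len(ranked) * fraction))
    top = ranked[:cutoff]
    return sum(flag for _, flag in top) / cutoff


def lift(scores, responded, fraction):
    """Top-fraction response rate divided by the overall response rate."""
    overall = sum(responded) / len(responded)
    if overall == 0:
        raise ValueError("lift is undefined when nobody responded")
    return top_fraction_response_rate(scores, responded, fraction) / overall


def average_lift(scores, responded, fractions=(0.1, 0.2, 0.3, 0.4)):
    """Mean lift over several cutoffs (assumed reading of 'average performance')."""
    return sum(lift(scores, responded, f) for f in fractions) / len(fractions)
```

Under these assumptions, lift(scores, responded, 0.1) would play the role of the precision measure, lift(scores, responded, 0.4) the stability and coverage measure, and average_lift(scores, responded) the overall measure.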

KDD-CUP-97 Program Committee

Participants

Last year, the CUP enjoyed worldwide participation of 45 data mining tools. This year, it is enjoying worldwide participation of 57 contestants. Eighteen of the 57 participants have elected to stay anonymous. The software status of those that elected anonymity is as follows:

The following 39 participants wish to be identified.
 SOFTWARE/TOOL/RESEARCH PROTOTYPE        VENDOR/INSTITUTION
 --------------------------------------- --------------------------------------
 APN (Adaptive Probabilistic Networks)   Berkeley/SRI/Stanford
 BAYDA/PRO                               Complex Systems Computation Group   
                                           (CoSCo), University of Helsinki   
 BNB (Boosted Naive Bayes Classifier)    University of California San Diego  
 BPSOM                                   Eindhoven University of Technology  
 CARRL                                   Austrian Research Institute for AI  
 DataBase Mining Marksman                HNC Software Inc.                    
 DataDetective                           Sentient Machine Research            
 DataLamp                                University of East Anglia            
 Discovery Board                         Rutgers University                   
 DMZ                                     Yongwon Lee, Lockheed Martin ATC
                                           (tool not affiliated with Lockheed
                                           Martin ATC)
 DTI v5.0                                ECCI-University of Costa Rica        
 Enterprise Miner                        SAS Institute                        
 Fragment-Potential                      QueryObject Systems, NY &
                                           Institute for Information
                                           Transmission Problems, Moscow
 GainSmarts                              Urban Science Applications, Inc.    
 ICL                                     Katholieke Universiteit Leuven       
 IGLUE                                   CRIL                                 
 Information Network                     Tel Aviv University                  
 JABC                                    University of Constance, Germany    
 JAM                                     Florida Institute of Technology &
                                           Columbia University
 JAWS                                    University of Waikato, New Zealand  
 Kepler                                  Dialogis Software & Services GmbH   
 KnowledgeMiner                          Frank Lemke, Script Software         
 KnowMan DataMiner(research version)     Intellix / Riso National Laboratory 
 LPDT                                    Rensselaer Polytechnic Institute    
 MineSet                                 Silicon Graphics, Inc.               
 Mixtures of Trees                       Massachusetts Institute of Technology
 Model 1                                 Unica Technologies, Inc.             
 ModelQuest Enterprise                   AbTech Corp.                         
 Otis                                    Randy Kerber, NCR (tool not affiliated
                                           with NCR)
 PolyAnalyst                             Megaputer Intelligence Ltd.          
 QS                                      Iona Corp.                           
 Rdt/Db                                  Informatik LS VIII, Universitaet Dortmund
 SENN Sales                              Siemens Nixdorf Business Service         
 The Shrunken-Belly Method               Edward Malthouse, Northwestern University
 TILDE                                   Katholieke Universiteit Leuven            
 Tutti 0.1                               Tampere University of Technology         
 WARMR                                   Katholieke Universiteit Leuven            
 WhiteCross HeatSeeker                   MRJ Technology Solutions/WhiteCross      
 WizWhy                                  WizSoft                                   

REGISTRATION BROCHURE

All participants are required to complete the application form below and send it in plain ASCII format to (e-mail preferred):
+-----------------------------+
| Ismail Parsa                |
|                             |
| Epsilon                     |
| 50 Cambridge Street         |
| Burlington MA 01803 USA     |
|                             |
| E-MAIL: iparsa@epsilon.com  |
| V-MAIL: (781) 273-0250*6734 |
| FAX:    (781) 272-8604      |
+-----------------------------+
Participants will receive the NDA (non-disclosure agreement) before the July 15, 1998 deadline. Please contact Ismail Parsa if you have not received the NDA by July 15.

Last year, the KDD-CUP program committee publicly announced the names of only the top 3 performing tools. The names of the 45 participants were not released. This year, although we will again only announce the names of the top 3 performing tools, we will make the list of participants publicly available UNLESS THE PARTICIPANTS INDICATE THAT THEY WILL PRESERVE THEIR ANONYMITY BY CHECKING THE APPROPRIATE BOX IN THE REGISTRATION BROCHURE. We think it's fair for everyone to know who they are competing with. Here is the Registration Brochure in ASCII