KDnuggets : News : 2006 : n19 : item4

Features

From: Colin Shearer
Date: 09 Oct 2006
Subject: First CRISP-DM 2.0 Workshop Held

As reported in KDnuggets News (06:n14, Jul 25, 2006), an initiative was launched earlier this year to update the CRISP-DM data mining methodology. CRISP-DM 1.0 has been unchanged since its publication in 1999, and this revision aims to address the requirements of today’s data mining projects and to bring CRISP-DM into alignment with the latest best practice.

CRISP-DM 2.0 follows the structure that successfully produced CRISP-DM 1.0: a core consortium tasked with delivering the methodology documents, supported by an open-membership Special Interest Group (SIG) of interested parties who provide input and ideas, and who review, discuss and critique draft work on the methodology. In CRISP-DM 1.0, the core Consortium comprised ISL/SPSS, Teradata, automotive giant Daimler-Benz and Dutch insurer OHRA. SPSS and Teradata continue to lead CRISP-DM 2.0, and suitable end-user partners are being sought to join them in the Consortium. The SIG consists of vendors, service providers, researchers and end-users. Over the three years of the CRISP-DM 1.0 project, the SIG membership grew to just over 200; since the CRISP-DM 2.0 SIG was announced in July, over 300 members have already signed up.

On 26th September, the first CRISP-DM 2.0 SIG Workshop was held in Chicago. SIG members from a cross-section of the membership were present: vendors, service organisations and consultants, academics, and end users from the telco, finance and energy sectors. The Workshop content included: background on CRISP-DM 2.0 and review of SIG status; presentations by both the Consortium and SIG members on priorities for CRISP-DM 2.0; and discussion on content priorities, timing and approach for producing the revision, and ways to best engage the SIG in the revision process.

Presentations were stimulating and thought-provoking, and the discussion was wide-ranging and enthusiastic. While many detailed points were discussed, and topics such as the fit of CRISP-DM with other business and IT methodologies and processes aroused much interest, there was consensus that two important areas to address are:

  • The proliferation of new data types – text, web data, etc. – that need dedicated preprocessing and appropriate consideration in analysis.
  • Deployment in the broadest sense – including delivery of results/models to operational systems, integration with business processes, and ongoing monitoring and updating of solutions.

There was also agreement that the best approach is to bring out CRISP-DM 2.0 quickly, fixing the main points that are needed to bring the methodology up to date, and defer possibly more radical changes to a longer-term 3.0 version. The Consortium hopes to complete CRISP-DM 2.0 within approximately six months.

A further Workshop is planned for Europe early in 2007. In the meantime, a SIG Members area is being established on the CRISP-DM website to facilitate dissemination, discussion and review. A SIG webinar will also be scheduled in the near future.

SIG membership remains open and is free. Members can register at http://www.crisp-dm.org/new.htm, or contact Colin Shearer (cshearer@spss.com) for further information.



Copyright © 2006 KDnuggets.