KDnuggets : News : 2009 : n01 : item8

Software

From: Graham Williams
Date: Mon, 5 Jan 2009
Subject: Rattle version 2.4.0 released - open source data mining

Version 2.4.0 of the free data mining software, Rattle, has been released.

Rattle (the R Analytic Tool To Learn Easily) provides a Gtk2-based graphical interface for data mining with R, a free software environment for statistical computing and graphics.

The aim of Rattle is to provide a simple and intuitive interface that allows a user to quickly load data from a CSV file (or via ODBC), transform and explore the data, build and evaluate models, and export models as PMML (Predictive Model Markup Language) or as scores, all with little knowledge of R.

All R commands are logged, with comments, in the Log tab. The log is therefore available to the user as a script file for repeatable data mining, or as an aid for users who want to interact directly with R itself.
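As a rough illustration of the kind of R script this workflow corresponds to, the sketch below loads a CSV file, builds a decision tree with rpart, and evaluates it on held-out data. It is not the script Rattle itself writes to its log: the iris data, the temporary CSV file, the 70/30 split, and all variable names are assumptions chosen only to keep the example self-contained.

```r
# Illustrative sketch only: not the script Rattle itself generates.
library(rpart)  # recommended package shipped with R

set.seed(42)  # make the random train/test split repeatable

# Write a built-in dataset to a temporary CSV so the example is self-contained.
csv_file <- tempfile(fileext = ".csv")
write.csv(iris, csv_file, row.names = FALSE)

# Load the data from the CSV file.
ds <- read.csv(csv_file, stringsAsFactors = TRUE)

# Split into training (70%) and test (30%) sets (split ratio is an assumption).
train_idx <- sample(nrow(ds), round(0.7 * nrow(ds)))
train <- ds[train_idx, ]
test  <- ds[-train_idx, ]

# Build a classification tree on the training data.
model <- rpart(Species ~ ., data = train, method = "class")

# Evaluate the tree on the held-out test data.
pred <- predict(model, newdata = test, type = "class")
confusion <- table(actual = test$Species, predicted = pred)
accuracy <- sum(diag(confusion)) / sum(confusion)

print(confusion)
cat("Accuracy:", accuracy, "\n")
```

Saving such a script and re-running it with the same seed reproduces the same split and model, which is the sense in which a logged script supports repeatable data mining.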

The rattle package runs on multiple platforms (GNU/Linux, Mac OS X, MS Windows). It has undergone a lot of development over the past year. A companion book introducing data mining using Rattle is under development, with a draft available for review.

See the Installing Rattle page of the Rattle website for a guide to installing the software. See the Getting Started page for a quick start to data mining, in 4 clicks.

Recent updates include:

  • Many bug fixes, GUI simplifications, improved colour usage (through the vcd package).
  • Streamlined handling of datasets (combined the Data and Select tabs), including support for CSV, ARFF, and ODBC data sources.
  • Addition of Test and Transforms tabs, with scripting support for transforms, both in building a model and in scoring.
  • Experimental support for automatic report generation using odfWeave (and hence, generation of OpenOffice documents).
  • Supported modelling includes:
    • Cluster (kmeans, hclust)
    • Association Rules (arules)
    • Linear Models (lm, glm)
    • Trees (rpart, party)
    • Neural Nets (nnet)
    • Support Vector Machines (ksvm)
    • Boosting (ada)
    • Random Forests (randomForest)
  • Supported data exploration tools include GGobi and PlayWith (latticist).
  • ROC curves, CostCurves, and many standard plots are supported.
  • Export to PMML is supported for many models, allowing R (and hence Rattle) models to be readily exported to other tools. (See Zementis for an example consumer of PMML.)
  • The Business Intelligence vendor, Information Builders, will soon release RStat (sharing Rattle's open source code base) for data mining within WebFOCUS.
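The PMML export mentioned in the list above can also be done directly from R. The following is a minimal sketch, assuming the pmml and XML packages are installed; it uses the iris data and a temporary output file purely for illustration, and it skips the export with a message when either package is missing. How Rattle itself performs the export may differ.

```r
# Illustrative sketch only: exporting an R model as PMML outside the Rattle GUI.
library(rpart)  # recommended package shipped with R

# Build a small classification tree to export (iris is an assumed example dataset).
model <- rpart(Species ~ ., data = iris, method = "class")

# The pmml and XML packages are optional add-ons, so check for them first.
if (requireNamespace("pmml", quietly = TRUE) &&
    requireNamespace("XML", quietly = TRUE)) {
  doc <- pmml::pmml(model)               # convert the model to a PMML document
  out_file <- tempfile(fileext = ".xml") # hypothetical output location
  XML::saveXML(doc, file = out_file)     # write the PMML to disk
  exported <- file.exists(out_file)
  cat("PMML written to", out_file, "\n")
} else {
  message("Packages 'pmml' and 'XML' are needed for this example; skipping export.")
  exported <- NA
}
```

The resulting XML file is what a PMML consumer would read in order to score new data without needing R.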

Comments, bug reports, and suggestions are always welcome.

Regards, Graham Williams



Copyright © 2009 KDnuggets.