KDnuggets Home » News :: 2013 :: Sep :: News, Software :: Mikut Data Mining Tools Big List - Update ( 13:n22 )

Mikut Data Mining Tools Big List – Update

An update of the Excel table describing 325 recent and historical data mining tools is now online (Excel format); 31 of them were added since the last update in November 2012. The additions include newly published tools and some well-established tools with a statistical background.

Here is the full updated table of tools (XLS format), which contains additional material to the paper:

R. Mikut, M. Reischl: "Data Mining Tools". Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, Vol. 1, September/October 2011. DOI: 10.1002/widm.24

Please help the authors to improve this Excel table:

Here are parts of the table with the active tools:

License code: CO - commercial, OS - open source.

Data Mining Systems:

| Tool | Company | License | Remarks |
|---|---|---|---|
| 11 Ants | 11Ants Analytics | CO | family of data mining tools with a focus on business applications |
| ADAPA | Zementis Inc. | CO | develops the ADAPA decision engine, a framework to deploy, integrate, and execute predictive models in PMML; add-ins for Excel, IBM cloud solution (Software as a Service - SaaS) |
| Coheris SPAD Data Mining | Coheris | CO | company also provides solutions for text mining, former company SPAD |
| D2K - Data to Knowledge | U. of Illinois | CO/OS | additional tools for EA and text mining, tool I2K for images under development, free academic version, see Alcala09, no developments since 2004 |
| Data Applied | Data Applied | CO | web service for data analysis, SaaS |
| DataDetective | Sentient | CO | with tools for fuzzy matching, applications in CRM, crime analysis, fraud detection |
| GhostMiner | FQS Poland / Fujitsu | CO | multi-model support |
| IBM SPSS Modeler | IBM | CO | formerly Clementine, now in cooperation with IBM, Predictive Analytics Software (PASW), SPSS has been an IBM company since 2009 |
| InfiniteInsight | KXEN | CO | (Knowledge eXtraction ENgines) providing predictive software tools (based on Vapnik Learning Theory) to application providers and system integrators |
| JMP | SAS Institute | CO | free trial, additional special tools for genomics |
| KnowledgeStudio | ANGOSS Software | CO | PMML support and code generation |
| Model Builder | FICO | CO | company's former name Fair Isaac Corporation |
| Oracle Data Mining (ODM) | Oracle | CO | provides GUI, PL/SQL interface, and Java interface to Attribute Importance, Bayes Classification, Association Rules, Clustering, SVM |
| Partek Discovery Suite | Partek Incorporated | CO | additional special solutions for genomics, free demos |
| PolyAnalyst | Megaputer | CO | from Goebel99, support for text mining |
| Predixion Enterprise Insight | Predixion Software | CO | data mining suite with a focus on standard workflows, big data support, cloud options, OEM options possible |
| RapidAnalytics | Rapid-I GmbH | CO/OS | server built on top of RapidMiner, focused on client-server solutions, user and user rights management, web interfaces, web services, process scheduler, reports, dashboards; collaborative access for teams and companies with many users |
| RapidMiner | Rapid-I GmbH | OS | formerly YALE, more than 1000 algorithms and operators for data mining, text mining, web mining, time series analysis and forecasting, audio mining, image mining, predictive analytics, ETL, reporting, integrates Weka, R, and Hadoop (Radoop), repository under |
| Revolution R Enterprise | Revolution Analytics | OS/CO | based on the open source software R with many additional tools for big data (e.g. Hadoop support) and database coupling, some commercial parts also free for academic use |
| Salford Predictive Modeling Suite (SPM) | Salford Systems | CO | includes the formerly separate tools CART, MARS, TreeNet, Random Forests |
| SAS Enterprise Miner | SAS Institute | CO | one of the world's leading tools, enterprise oriented |
| SQL Server Analysis Service | Microsoft | CO | special coupling to SAP software |
| Stata | StataCorp LP | CO | originally a statistics package, many methods included |
| STATISTICA | StatSoft | CO | additional tools for text mining |
| Think Enterprise Data Miner (EDM) | thinkAnalytics | CO | massively scalable, embeddable, Java-based real-time data mining platform, former name K.wiz |
| TIBCO Spotfire Miner | TIBCO | CO | coupling to S-Plus, R |
| scikit-learn | various | OS | Python-based collection of data mining tools |
| WEKA | U. of Waikato | OS | most well-known software, integrated in many other tools, different extensions, e.g. WEKA-CG for human genetics |

Libraries for Data Mining

| Name | Company | License | Remarks |
|---|---|---|---|
| Fast Artificial Neural Network Library (FANN) | various | OS | multilayer artificial neural networks in C |
| JAVA Data Mining Package | various | OS | Java-based, alpha version, no update since 2009 |
| Julia | various | OS | open source language for technical computing, still under development (started in 2012), includes some data mining libraries (e.g. decision trees, clustering, LIBSVM), aims at fast analysis for big data, parallel processing etc. |
| LibSVM | National Taiwan University | OS | for support vector classification and regression, C++, Java-based |
| MLC++ | Silicon Graphics, U. of Stanford | OS | C++ library for supervised learning, included in SGI's MineSet |
| NAG Data Mining Components | Numerical Algorithms Group Ltd (NAG) | CO | components in C++ |
| Neurofusion | Alyuda Research | CO | general-purpose ANN C++ library that can be used to create, train and apply constructive neural networks for solving both regression and classification problems |
| OpenNN | various | OS | open ANN library, multilayer perceptron neural network in C++, former name Flood |
| OpenPR | various | OS | library for image processing, pattern recognition, computer vision and natural language processing, based on C++, Scilab support |
| Orange | U. Ljubljana | OS | Python scripts, extensions for text mining and bioinformatics, see Chen07, Alcala09 |
| ROOT | CERN | OS | C++ support, LGPL license, general parallel processing framework |
| SMILE | U. of Pittsburgh | OS | specialized in Bayesian networks, developed since 1998 |
| Waffles | various | OS | C++ library, additional command line functionality, some exotic methods |
| XELOPES Library | Prudsys | CO/OS | in Java, C++, different license models, PMML support |
| WEKA | U. of Waikato | OS | most well-known software, integrated in many other tools, different extensions, e.g. WEKA-CG for human genetics |
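
Because some tools carry combined license codes such as CO/OS or OS/CO, filtering the list by license takes slightly more than a string comparison. The following is a minimal Python sketch, not part of the original table or paper: the helper names `license_codes` and `filter_by_license` are hypothetical, and the sample rows are a handful of entries copied from the tables above rather than the full Excel file.

```python
# Illustrative subset of rows from the tables above: (tool, company, license).
# The helper names below are hypothetical and chosen for this sketch only.
TOOLS = [
    ("RapidMiner", "Rapid-I GmbH", "OS"),
    ("RapidAnalytics", "Rapid-I GmbH", "CO/OS"),
    ("Revolution R Enterprise", "Revolution Analytics", "OS/CO"),
    ("SAS Enterprise Miner", "SAS Institute", "CO"),
    ("WEKA", "U. of Waikato", "OS"),
    ("XELOPES Library", "Prudsys", "CO/OS"),
]


def license_codes(field):
    """Split a license field such as 'CO/OS' into a set of codes."""
    return {part.strip().upper() for part in field.split("/") if part.strip()}


def filter_by_license(rows, code, exclusive=False):
    """Return the rows whose license field includes `code`.

    With exclusive=True, keep only rows licensed under `code` alone,
    so combined entries such as 'CO/OS' are dropped.
    """
    code = code.upper()
    selected = []
    for row in rows:
        codes = license_codes(row[2])
        if exclusive and codes == {code}:
            selected.append(row)
        elif not exclusive and code in codes:
            selected.append(row)
    return selected


if __name__ == "__main__":
    # Tools with any open source offering, including dual-licensed ones.
    for tool, company, lic in filter_by_license(TOOLS, "OS"):
        print(f"{tool} ({company}): {lic}")
```

Calling `filter_by_license(TOOLS, "OS", exclusive=True)` narrows the result to purely open source entries, which in this sample are RapidMiner and WEKA.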

Get the full table at

The color code for the Excel tools table is:

  • green: active and relevant tools
  • yellow: less active and/or less relevant tools
  • red: historical tools or not yet available tools
