KDnuggets Home » News :: 2013 :: Oct :: Publications :: Rexer Analytics 2013 Data Miner Survey Highlights ( 13:n24 )

Rexer Analytics 2013 Data Miner Survey Highlights

          

The top 5 most used tools were R (used by 70% of data miners), IBM SPSS Statistics, RapidMiner, SAS, and Weka, while STATISTICA, KNIME, SAS JMP, IBM SPSS Modeler, and RapidMiner had the highest satisfaction. Big Data is actually used in only a small fraction of projects.

By Gregory Piatetsky, Oct 5, 2013.

Last week at Predictive Analytics World in Boston, Karl Rexer, the president of Rexer Analytics, presented the initial results of the very popular Data Miner Survey that his company has conducted since 2007. I attended his talk, and he kindly shared his findings for publication in KDnuggets.

Full results will be published later in 2013, and results of all past surveys are freely available at www.rexeranalytics.com/ .

This was the 6th survey since 2007, and over 1,200 data miners from 75 countries responded to 68 questions. The breakdown of respondents by occupation was:

  • 35%, Corporate
  • 26%, Consultants
  • 18%, Vendors
  • 15%, Academics
  • 6%, NGO/Government

The geographic distribution was:

  • 41% North America
  • 41% Europe
  • 11% Asia/Pacific
  • 4% Central & South America
  • 3% Middle East and Africa

Some of the highlights from the survey:

  • Over 85% of data miners working in corporate and consulting settings foresee increases in the number of projects
  • Data miner job satisfaction is high: highest among vendors, lowest in government/NGO settings
  • The most common self-descriptions were Data Scientist, Researcher, Data Analyst, and Business Analyst

The average data miner reports using 5 different software tools. The top 10 most used tools were R (used by 70% of data miners), IBM SPSS Statistics, RapidMiner, SAS, Weka, Matlab, Microsoft SQL, IBM SPSS Modeler, SAS Enterprise Miner, and KNIME.

Here is the chart:

Rexer Analytics 2013 Data Miner Survey - Tools Usage

The top 10 tools ranked by usage as the primary tool were:

  1. R
  2. STATISTICA
  3. RapidMiner
  4. SAS
  5. KNIME
  6. IBM SPSS Modeler
  7. IBM SPSS Statistics
  8. Weka
  9. SAS Enterprise Miner
  10. Matlab

The survey also measured tool satisfaction (with vendors excluded) and STATISTICA, KNIME, SAS JMP, IBM SPSS Modeler, and RapidMiner received the highest satisfaction ratings - see chart below.

Rexer Analytics 2013 Data Miner Survey - Tools Satisfaction

The survey also looked at Big Data. While reported data volumes have increased since 2007, only about 8% of respondents work with really big data (over 100,000,000 records), vs. 7% in 2007. Only 13% report having an Active Big Data program.

Rexer Analytics 2013 Data Miner Survey - Data Size

Full results will be freely available at www.rexeranalytics.com/ .

