Computing Platforms for Analytics, Data Mining, Data Science

The poll results suggest a split between the majority of data miners and data scientists, who work with growing but still "PC-size" data of a few GB, and a smaller group of Big Data analysts who work with cloud-sized data. Cloud computing, Unix, and especially Mac gained in popularity.

How much memory your typical data mining job actually uses:
< 1 GB (28)  10%
1-2 GB (44)  16%
2-4 GB (58)  21%
5-16 GB (94)  33%
17-64 GB (33)  12%
> 64 GB (25)  9%

The median memory size falls in the 5-16 GB range (say 8 GB), more than double the 3 GB median in the 2010 Computer configuration poll.
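The median bin can be checked directly from the counts in the table above. The following is a minimal Python sketch; the names `MEMORY_BINS`, `median_bin`, and `shares` are illustrative and not part of the poll, and the counts are copied from the table.

```python
from itertools import accumulate

# Memory-usage bins and response counts, copied from the poll table above.
MEMORY_BINS = [
    ("< 1 GB", 28),
    ("1-2 GB", 44),
    ("2-4 GB", 58),
    ("5-16 GB", 94),
    ("17-64 GB", 33),
    ("> 64 GB", 25),
]


def median_bin(bins):
    """Return the label of the first bin whose cumulative count reaches half the total."""
    total = sum(count for _, count in bins)
    running = accumulate(count for _, count in bins)
    for (label, _), cumulative in zip(bins, running):
        if 2 * cumulative >= total:
            return label
    raise ValueError("bins must contain at least one response")


def shares(bins):
    """Return each bin's share of all responses as a whole-number percentage."""
    total = sum(count for _, count in bins)
    return {label: round(100 * count / total) for label, count in bins}


print(median_bin(MEMORY_BINS))
print(shares(MEMORY_BINS))
```

With these counts, the first three bins cover 130 of 282 responses, just under half, so the median lands in the 5-16 GB bin. The counts alone do not pin down a value inside that bin, so the 8 GB figure remains an estimate.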

Fig 2. Data Mining Computing Resource Distribution: Processors (cores) vs Memory (GB). Circle size corresponds to the number of responses.

The most common memory size is 5 to 16 GB, and the most common processor count is 3-4 cores. The bulk (~78%) of data mining analysis is done within a PC-style range, with up to 16 GB of memory and up to 16 cores. However, there is also a small but noticeable group of Big Data analysts who use over 64 GB of memory and over 64 processors.

The next poll question was on the operating system.

Operating System for your typical data mining job is:
Windows (164)  58%
Unix/Linux (123)  44%
Apple/Mac OS (47)  17%
Other (4)  1.4%

Compared to the 2010 poll, the Windows share went down from 67% to 58%, while the Apple/Mac share more than doubled, from 7.6% to 17%. The Unix/Linux share also increased, from 28% to 44%.

The average number of operating systems used was only 1.2, and 67% of users used only one OS.
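Because respondents could select more than one operating system, the shares in the table above add up to more than 100%, and the average number of operating systems per respondent is the total number of selections divided by the number of respondents. The sketch below illustrates that arithmetic in Python. It assumes 282 respondents, a figure taken from the sum of the memory-question counts rather than stated for this question; the function names are illustrative.

```python
# Operating-system response counts, copied from the poll table above.
OS_COUNTS = {
    "Windows": 164,
    "Unix/Linux": 123,
    "Apple/Mac OS": 47,
    "Other": 4,
}

# ASSUMPTION: 282 respondents, the sum of the memory-question counts.
# The poll text does not state the respondent count for this question.
RESPONDENTS = 282


def os_shares(counts, respondents):
    """Return each operating system's share of respondents, in percent."""
    return {name: 100 * n / respondents for name, n in counts.items()}


def average_os_per_user(counts, respondents):
    """Return total selections divided by respondents."""
    return sum(counts.values()) / respondents


for name, share in os_shares(OS_COUNTS, RESPONDENTS).items():
    print(f"{name}: {share:.1f}%")
print(f"average OS per respondent: {average_os_per_user(OS_COUNTS, RESPONDENTS):.1f}")
```

Under that assumption, the 338 selections from 282 respondents give an average of about 1.2, matching the figure reported above.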
Data mining operating systems: Windows, Unix/Linux, Mac (Venn diagram)
The Venn diagram above approximately shows the overlaps: the strongest affinity is between Unix/Linux and Windows users, and the weakest is between Windows and Mac users.
23% of Windows users also use Unix/Linux, and 31% of Unix/Linux users also use Windows.
32% of Mac users also use other Unix/Linux, and 12% of Unix/Linux users also use Mac.
Only 2% of Windows users also use Mac, and only 6% of Mac users also use Windows.
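Each pair of percentages above describes the same overlap from two sides, so multiplying each percentage by its group's response count should give roughly the same number of users. The Python sketch below runs that consistency check, using the counts from the operating system table; the names `OVERLAPS` and `implied_overlap` are illustrative, and the results are approximate because the published percentages are rounded.

```python
# Response counts per operating system, from the poll table.
OS_COUNTS = {"Windows": 164, "Unix/Linux": 123, "Mac": 47}

# (group A, group B, % of A also using B, % of B also using A),
# copied from the overlap statements above.
OVERLAPS = [
    ("Windows", "Unix/Linux", 23, 31),
    ("Mac", "Unix/Linux", 32, 12),
    ("Windows", "Mac", 2, 6),
]


def implied_overlap(a, b, pct_a, pct_b, counts=OS_COUNTS):
    """Return the overlap size implied by each of the two percentages."""
    return counts[a] * pct_a / 100, counts[b] * pct_b / 100


for a, b, pct_a, pct_b in OVERLAPS:
    from_a, from_b = implied_overlap(a, b, pct_a, pct_b)
    print(f"{a} & {b}: about {from_a:.1f} (from {a}) vs {from_b:.1f} (from {b})")
```

The two estimates for each pair agree to within one user, which is what rounding to whole percentages would allow. The Windows and Unix/Linux overlap comes to roughly 38 users, the Mac and Unix/Linux overlap to roughly 15, and the Windows and Mac overlap to roughly 3.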

Regional Participation
  • US/Canada, 44%
  • Europe, 38%
  • Asia, 9.2%
  • Latin America, 4.6%
  • Australia/NZ, 2.1%
  • Africa/MidEast, 1.8%

