Computing Platforms for Analytics, Data Mining, Data Science
The poll results suggest a split between a majority of data miners and data scientists, who work with growing but still "PC-size" data of a few GB, and a smaller group of Big Data analysts, who work with cloud-sized data. Cloud computing, Unix, and especially Mac gained in popularity.
|How much memory your typical data mining job actually uses:|
|< 1 GB (28)|
|1-2 GB (44)|
|2-4 GB (58)|
|5-16 GB (94)|
|17 - 64 GB (33)|
|> 64 GB (25)|
The median memory size falls in the 5-16 GB range (say 8 GB), roughly double the 3 GB found in the 2010 Computer configuration poll.
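The median bucket can be checked directly from the response counts in the table above. The following is a minimal sketch, not part of the original poll analysis; the function name `median_bucket` is hypothetical, and the counts are copied from the table.

```python
# Response counts per memory bucket, copied from the poll table above.
buckets = [
    ("< 1 GB", 28),
    ("1-2 GB", 44),
    ("2-4 GB", 58),
    ("5-16 GB", 94),
    ("17-64 GB", 33),
    ("> 64 GB", 25),
]


def median_bucket(counts):
    """Return the label of the bucket that contains the median response."""
    total = sum(n for _, n in counts)
    half = total / 2
    running = 0
    for label, n in counts:
        running += n
        # The median lies in the first bucket whose cumulative count
        # reaches half of all responses.
        if running >= half:
            return label
    raise ValueError("empty histogram")


total_responses = sum(n for _, n in buckets)
print(total_responses, median_bucket(buckets))  # prints "282 5-16 GB"
```

With 282 responses in total, the cumulative count reaches 130 at the end of the 2-4 GB bucket and 224 at the end of the 5-16 GB bucket, so the median response lies in the latter. The bucket alone does not pin down a single value such as 8 GB; that figure is an estimate.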
Fig 2. Data Mining Computing Resource Distribution: Processors (cores) vs Memory (GB). Circle size corresponds to the number of responses.
The most common memory size is 5 to 16 GB, and the most common processor count is 3-4 cores. The bulk (~78%) of data mining analysis is done within a PC-style range, with up to 16 GB of memory and up to 16 cores. However, there is also a small but noticeable number of Big Data analysts who use over 64 GB of memory and over 64 processors.
The next poll question was about the operating system.
|Operating System for your typical data mining job is:|
|Apple/Mac OS (47)|
Compared to the 2010 poll, the Windows share went down from 67% to 58%, while the Apple/Mac share more than doubled, from 7.6% to 17%. The Unix/Linux share also increased, from 28% to 44%.
The average number of operating systems used was only 1.2, and 67% of users used only one OS.
The Venn diagram above approximately shows the overlaps: the strongest affinity is between Unix/Linux and Windows users, and the weakest is between Windows and Mac users.
23% of Windows users also use Unix/Linux, and 31% of Unix/Linux users also use Windows.
32% of Mac users also use other Unix/Linux, and 12% of Unix/Linux users also use Mac.
Only 2% of Windows users also use Mac, and only 6% of Mac users also use Windows.
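These overlap percentages can be sanity-checked against the overall OS shares: the fraction of all respondents who use both A and B should come out about the same whether it is computed as (share of A) × (fraction of A users who also use B) or from the B side. The sketch below is an illustrative check, not part of the original analysis; it uses the rounded percentages quoted in the text, and the helper `joint_estimates` is a hypothetical name.

```python
# Approximate OS shares quoted in the text (fractions of all respondents).
share = {"Windows": 0.58, "Unix/Linux": 0.44, "Mac": 0.17}

# (A, B, fraction of A users who also use B, fraction of B users who also use A)
overlaps = [
    ("Windows", "Unix/Linux", 0.23, 0.31),
    ("Mac", "Unix/Linux", 0.32, 0.12),
    ("Windows", "Mac", 0.02, 0.06),
]


def joint_estimates(a, b, a_also_b, b_also_a):
    """Estimate the share of all respondents using both a and b, from each side."""
    return share[a] * a_also_b, share[b] * b_also_a


for a, b, p_ab, p_ba in overlaps:
    from_a, from_b = joint_estimates(a, b, p_ab, p_ba)
    print(f"{a} & {b}: {from_a:.1%} vs {from_b:.1%}")
```

For each pair, the two estimates agree to within about half a percentage point, which is what one would expect given that the inputs are rounded. Windows and Unix/Linux share the largest group of joint users (roughly 13% of respondents), and Windows and Mac the smallest (roughly 1%).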
Regional distribution of poll participants:
- US/Canada, 44%
- Europe, 38%
- Asia, 9.2%
- Latin America, 4.6%
- Australia/NZ, 2.1%
- Africa/MidEast, 1.8%