KDnuggets Home » Polls » Computing resources for analytics, data mining, data science work or research Poll (Mar 2015)

Computing resources for analytics, data mining, data science work or research Poll

The tables below summarize the results of the KDnuggets Poll:
Computing resources for your analytics, data mining, data science work or research, based on 282 voters.

The Venn diagram below shows the relative popularity of PC/Laptop (85%), Server (30%), and Cloud platforms (24%), and also the overlaps.
Interestingly, PC remains the most popular platform for data mining and analytics work, although a significant part is also done in the cloud and on a server.

Fig 1: Platform Popularity for Analytics / Data Mining: PC vs Server vs Cloud.

The average data miner in this poll used 1.4 platforms.
54% used only a PC/laptop, 9% used only a department server, and 5% used only the cloud.
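
The average quoted above follows directly from the platform shares, since each voter adds one count for every platform used. The short sketch below reproduces that arithmetic from the rounded percentages in this post. It assumes every voter selected at least one platform, and the variable names are illustrative only.

```python
# Share of voters using each platform, as reported (multiple answers allowed).
platform_share = {"PC/Laptop": 0.85, "Server": 0.30, "Cloud": 0.24}

# The mean number of platforms per voter equals the sum of the shares.
avg_platforms = sum(platform_share.values())

# Share of voters who used exactly one platform, as reported.
single_only = {"PC/Laptop": 0.54, "Server": 0.09, "Cloud": 0.05}
one_platform = sum(single_only.values())

# Assumption: every voter picked at least one platform,
# so everyone else used two or three.
multi_platform = 1.0 - one_platform

# Average number of platforms among voters who used more than one.
avg_among_multi = (avg_platforms - one_platform) / multi_platform

print(f"average platforms per voter: {avg_platforms:.2f}")
print(f"used exactly one platform:   {one_platform:.0%}")
print(f"used two or more platforms:  {multi_platform:.0%}")
print(f"average among multi-users:   {avg_among_multi:.1f}")
```

With the rounded inputs this gives about 1.39 platforms per voter, 68% single-platform users, and roughly 2.2 platforms on average among the remaining 32%.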

Among those who used cloud computing, 59% used private cloud, 46% used public cloud, and 6% used both.

The next poll question was on processing power.

How many processors/cores your typical data mining job actually uses?
1 core (69)  24%
2 cores (45)  16%
3-4 cores (86)  30%
5-16 cores (51)  18%
17-64 cores (17)  6.0%
> 64 cores (14)  5.0%


Median number of cores is 3-4
Although my note on the poll said "if you have 8-core processor but your job only uses 1, choose 1", I suspect most voters ignored this. The most common answer is 3-4 cores, which probably corresponds to the number of cores in the CPU (many popular laptops/PCs now have 4 cores). I doubt that data mining jobs are so well parallelized that they actually use all 4 cores.
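
For readers who want to check the median, here is a minimal sketch of how a median bin can be found from binned vote counts: walk the bins in order and stop at the first one where the running total reaches half of all votes. The counts are taken from the table above; the function name `median_bin` is just an illustrative choice.

```python
# Vote counts per answer, in increasing order of cores, copied from the table.
core_bins = [
    ("1 core", 69),
    ("2 cores", 45),
    ("3-4 cores", 86),
    ("5-16 cores", 51),
    ("17-64 cores", 17),
    ("> 64 cores", 14),
]


def median_bin(bins):
    """Return the label of the first bin where the running count reaches half the total."""
    total = sum(count for _, count in bins)
    running = 0
    for label, count in bins:
        running += count
        if 2 * running >= total:
            return label
    raise ValueError("histogram has no votes")


total_votes = sum(count for _, count in core_bins)
print(total_votes, median_bin(core_bins))
```

The first two answers together account for 114 of 282 votes, which is less than half, so the median falls in the 3-4 cores bin.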

Comparing with a similar 2010 KDnuggets poll: Computer configuration for your main Analytics / Data Mining machine, we see that the median number of cores only doubled, to 3-4 in 2015 from 2 in 2010.

The next poll question was on memory.

How much memory your typical data mining job actually uses:
< 1 GB (28)  10%
1-2 GB (44)  16%
2-4 GB (58)  21%
5-16 GB (94)  33%
17 - 64 GB (33)  12%
> 64 GB (25)  9%

Median memory size is in the 5-16 GB range (say 8 GB), more than double the 3 GB median in the 2010 Computer configuration poll.

Fig 2. Data Mining Computing Resource Distribution: Processors (cores) vs Memory (GB). Circle size corresponds to the number of responses.

The most common memory size is 5 to 16 GB, and the most common core count is 3-4. We see that the bulk (~ 78%) of data mining analysis is done within a PC-style range, with up to 16 GB of memory and up to 16 cores.
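
The 78% figure comes from the joint distribution of cores and memory, which is not listed in the tables. As a rough plausibility check, the two separate tables still bound it: the share of voters inside both limits cannot exceed the smaller marginal share, and cannot be lower than the sum of the two marginal shares minus one (the Fréchet bounds). The sketch below computes those bounds from the published counts; the dictionary keys are shorthand labels chosen for this example.

```python
VOTERS = 282

# Vote counts copied from the cores and memory tables in this post.
core_counts = {"1": 69, "2": 45, "3-4": 86, "5-16": 51, "17-64": 17, ">64": 14}
memory_counts = {"<1": 28, "1-2": 44, "2-4": 58, "5-16": 94, "17-64": 33, ">64": 25}

# Marginal shares of voters at or below 16 cores and at or below 16 GB.
cores_le_16 = sum(core_counts[k] for k in ("1", "2", "3-4", "5-16")) / VOTERS
memory_le_16 = sum(memory_counts[k] for k in ("<1", "1-2", "2-4", "5-16")) / VOTERS

# Frechet bounds on the share of voters satisfying both conditions.
upper_bound = min(cores_le_16, memory_le_16)
lower_bound = max(0.0, cores_le_16 + memory_le_16 - 1.0)

print(f"up to 16 cores: {cores_le_16:.1%}")
print(f"up to 16 GB:    {memory_le_16:.1%}")
print(f"joint share is between {lower_bound:.1%} and {upper_bound:.1%}")
```

The bounds come out at roughly 68% and 79%, so the reported ~78% is consistent with the two tables and sits near the upper end, meaning nearly everyone using up to 16 GB also used up to 16 cores.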

The next poll question was on operating system.

Operating System for your typical data mining job is:
Windows (164)  58%
Unix/Linux (123)  44%
Apple/Mac OS (47)  17%
Other (4)  1.4%

The average number of operating systems used was only 1.2, and 67% of users used only one OS.
Venn diagram: Operating systems for data mining (Windows, Unix/Linux, Mac)
The Venn diagram above approximately shows the overlaps. The strongest affinity is between Unix/Linux and Windows users, and the weakest is between Windows and Mac users.
23% of Windows users also use Unix/Linux, and 31% of Unix/Linux users also use Windows.
32% of Mac users also use other Unix/Linux, and 12% of Unix/Linux users also use Mac.
Only 2% of Windows users also use Mac, and only 6% of Mac users also use Windows.
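
These overlap percentages can be cross-checked against the vote counts: each pair of percentages describes the same group of people from two sides, so both should imply about the same number of voters. The sketch below is a minimal consistency check using only the counts and rounded percentages reported in this post; the helper name `implied_overlaps` is illustrative.

```python
VOTERS = 282

# Number of voters selecting each operating system, from the table above.
os_users = {"Windows": 164, "Unix/Linux": 123, "Mac": 47, "Other": 4}

# Average number of operating systems per voter: total selections / voters.
avg_os = sum(os_users.values()) / VOTERS

# (OS A, OS B, share of A users who also use B, share of B users who also use A)
reported_overlaps = [
    ("Windows", "Unix/Linux", 0.23, 0.31),
    ("Mac", "Unix/Linux", 0.32, 0.12),
    ("Windows", "Mac", 0.02, 0.06),
]


def implied_overlaps(overlaps, users):
    """Estimate each overlap's head count from both sides of the pair."""
    results = []
    for os_a, os_b, share_a, share_b in overlaps:
        from_a = share_a * users[os_a]
        from_b = share_b * users[os_b]
        results.append((os_a, os_b, from_a, from_b))
    return results


estimates = implied_overlaps(reported_overlaps, os_users)

print(f"average operating systems per voter: {avg_os:.1f}")
for os_a, os_b, from_a, from_b in estimates:
    print(f"{os_a} and {os_b}: about {from_a:.0f} voters (other side: {from_b:.0f})")
```

Both sides of each pair agree to within one voter (about 38 for Windows and Unix/Linux, 15 for Mac and Unix/Linux, and 3 for Windows and Mac), which is as close as the rounded percentages allow.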

Regional Participation
  • US/Canada, 44%
  • Europe, 38%
  • Asia, 9.2%
  • Latin America, 4.6%
  • Australia/NZ, 2.1%
  • Africa/MidEast, 1.8%
