KDnuggets Home » News » 2012 » May » Poll Results: Largest Dataset Analyzed  (  12:n11 | Next > )

Poll Results: Largest Dataset Analyzed


 
  
The median largest dataset is now in the 11-100 GB range for almost all regions, and about 20% of all analysts now have experience with terabyte-size datasets.


The previous KDnuggets Poll asked: What was the largest database / dataset you analyzed?

The results show that the peak of the distribution shifted from 1.1-10 GB in 2011 to 11-100 GB in 2012. The median answer in 2012 can be estimated to be in the 20-40 GB range, compared to a median in the 10-20 GB range in 2011.

What was the largest database / dataset you analyzed? [347 votes]
Largest dataset analyzed in 2012 vs 2011

We note that more than twice as many analysts participated in the 2012 poll (347 people) as in the similar 2011 poll, Largest Dataset Analyzed / Data Mined (148 people).
Note: In 2011 the maximum range given in the poll was 1 PB and over, so no direct comparison for the petabyte range is possible between the 2011 and 2012 results.

The percentage of analysts with experience in the upper range of datasets (over 100 GB) has remained around 35%, the same as in 2011, but this could be due to the larger participation in the 2012 poll. In 2010, about 32% of respondents worked with databases of 100 GB and larger.

The regional breakdown (below) shows that almost every region now has a median in the 11-100 GB range.

Region (voters)            Largest dataset analyzed (median)    % analyzed TB+ data
US/Canada (154)            11-100 GB                            24.0%
Europe (106)               11-100 GB                            18.4%
Asia (46)                  11-100 GB                            17.4%
Latin America (21)         11-100 GB                            23.8%
AU/New Zealand (10)        11-100 GB                            20.0%
Africa/Middle East (10)    1-10 GB                              20.0%
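
As a rough consistency check, the regional percentages above can be combined into one overall figure by weighting each region by its number of voters. The following is a minimal sketch in Python, not part of the original poll analysis; the variable names are illustrative, and it assumes the published percentages are rounded, so the result is approximate.

```python
# (region, voters, percent of voters who analyzed TB+ data),
# copied from the regional table above.
regions = [
    ("US/Canada", 154, 24.0),
    ("Europe", 106, 18.4),
    ("Asia", 46, 17.4),
    ("Latin America", 21, 23.8),
    ("AU/New Zealand", 10, 20.0),
    ("Africa/Middle East", 10, 20.0),
]

# The regional voter counts should add up to the poll total.
total_voters = sum(voters for _, voters, _ in regions)

# Weight each regional percentage by its number of voters.
overall_tb_share = sum(voters * pct for _, voters, pct in regions) / total_voters

print(total_voters)                # 347
print(round(overall_tb_share, 1))  # 21.2
```

The regional voter counts sum to the 347 votes reported for the poll, and the weighted share of about 21% is in line with the "about 20%" of analysts with terabyte-size experience quoted in the summary.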

