The median largest dataset analyzed is now in the 11-100 GB range for almost all regions, and about 20% of all analysts now have experience with terabyte-size datasets.
The previous KDnuggets Poll asked:
What was the largest database / dataset you analyzed?
The results show that the peak of the distribution shifted from
1.1-10 GB in 2011 to 11-100 GB in 2012. The median answer in 2012 can be estimated to be in the 20-40 GB range, compared with a median in the 10-20 GB range in 2011.
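Because poll answers are collected as ranges, a median can only be estimated by interpolating within the range that contains the middle respondent. The poll write-up does not say which method was used, so the following is only a minimal sketch of one common approach (linear interpolation within the median bin). The function name `grouped_median` and the bin counts are hypothetical placeholders for illustration, not the actual poll results.

```python
def grouped_median(bins):
    """Estimate a median from binned counts by linear interpolation.

    bins: list of (lower, upper, count) tuples, sorted by lower bound.
    Assumes responses are spread evenly inside each bin.
    """
    total = sum(count for _, _, count in bins)
    if total == 0:
        raise ValueError("no responses to summarize")
    half = total / 2
    cumulative = 0
    for lower, upper, count in bins:
        # The median bin is the first one whose cumulative count reaches half.
        if count > 0 and cumulative + count >= half:
            fraction = (half - cumulative) / count
            return lower + fraction * (upper - lower)
        cumulative += count


# Hypothetical counts per size range in GB (NOT the real poll data).
HYPOTHETICAL_BINS = [
    (0, 1, 10),
    (1, 10, 20),
    (10, 100, 40),
    (100, 1000, 20),
    (1000, 10000, 10),
]

print(grouped_median(HYPOTHETICAL_BINS))
```

Since the size ranges grow by a factor of ten, interpolating on a logarithmic scale may be the more natural choice and would give a lower estimate within the same bin; the linear version is shown only because it is the simplest.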
What was the largest database / dataset you analyzed? [347 votes]
We note that more than twice as many analysts participated in the 2012 poll (347 people) as in the similar
2011 Poll: Largest Dataset analyzed/ data mined (148 people).
Note: In 2011 the maximum range given in the poll was
1 PB and over, so no direct comparison for the petabyte range is possible between the 2011 and 2012 results.
The percentage of analysts with experience in the upper range of datasets (over 100 GB) has remained
around 35%, the same as in 2011, but this could be due to the larger participation in the 2012 poll.
In 2010, about 32% of respondents worked with databases of 100 GB and larger.
The regional breakdown (below) shows that almost every region now has a median in the 11-100 GB range.
|Region (voters)||Largest Dataset Analyzed (median)||% analyzed TB+ data|
|Latin America (21)
|AU/New Zealand (10)
|Africa/Middle East (10)