KDnuggets Home » News » 2011 » May » Poll results: Largest dataset analyzed  ( < Prev | 11:n12 | Next > )

Poll results: Largest dataset analyzed


 
  
Globally, 21% of data miners worked with terabyte or larger datasets, rising to 30% in the US/Canada. The median was in the 10-20 GB range.



The latest KDnuggets Poll asked:
What was the largest database / dataset you analyzed?

Comparing the results of the 2011 poll with a similar 2010 Poll: Largest Database Data Mined / Analyzed, we see that the median dataset size in 2011 is in the 10-20 GB range, while the median in 2010 was in the 8-10 GB range.

Largest dataset analyzed in 2011 vs 2010

We note steady growth in the share of analysts with experience of web-scale datasets.
In 2011, 35.4% reported analyzing databases over 100 GB (vs 32.2% in 2010), and 21.4% reported analyzing over 1 terabyte (vs 18.3% in 2010).

The regional breakdown shows that the US/Canada region leads in the percentage of data miners who worked with terabyte-range datasets (about 30%).
(Note: the Australia/NZ region is not included, since not enough responses were received.)

Region (voters)          Largest dataset analyzed (median)   % who analyzed TB+ data
US/Canada (53)           11-100 GB                           30.2%
Europe (49)              11-100 GB                           18.4%
Asia (20)                1-10 GB                             10%
Latin America (15)       1 GB                                6.7%
Africa/Middle East (7)   1-10 GB                             28.6%
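
As a rough sanity check on the regional table, the percentages can be converted back into approximate voter counts. The sketch below is not part of the original poll analysis: the `implied_count` helper and the pooled share are illustrative assumptions, and the counts are estimates obtained by rounding, not reported figures. The pooled share covers only the five listed regions, so it need not match the global figure exactly.

```python
# Regional rows from the table: (region, voters, percent who analyzed TB+ data).
rows = [
    ("US/Canada", 53, 30.2),
    ("Europe", 49, 18.4),
    ("Asia", 20, 10.0),
    ("Latin America", 15, 6.7),
    ("Africa/Middle East", 7, 28.6),
]


def implied_count(voters, percent):
    """Hypothetical helper: round a percentage back to a whole number of voters."""
    return round(voters * percent / 100)


# Estimated number of TB+ respondents per region (an inference, not poll data).
counts = {region: implied_count(voters, pct) for region, voters, pct in rows}

total_voters = sum(voters for _, voters, _ in rows)
total_tb = sum(counts.values())
pooled_share = 100 * total_tb / total_voters

for region, n in counts.items():
    print(f"{region}: about {n} voters")
print(f"Pooled: {total_tb} of {total_voters} voters ({pooled_share:.1f}%)")
```

Under these assumptions, the 28.6% for Africa/Middle East corresponds to about 2 of 7 voters, so that figure rests on a very small sample.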

Here is another breakdown of Largest Dataset Analyzed by region.

Largest dataset analyzed in 2011 by region

Comments:

Gregory Piatetsky
See the next KDnuggets Poll: Which data mining/analytic tools you used in the past 12 months for a real project
www.kdnuggets.com/2011/05/new-poll-analytics-data-mining-tools.html

Zoltán Prekopcsák
It would be interesting to know what tools they use for analyzing TBs of data.

