KDnuggets Home » Polls » largest dataset analyzed / data mined? Poll (Aug 2015)

What was the largest dataset you analyzed / data mined?


 
  

What was the largest dataset you analyzed / data mined? [459 voters]

less than 1 MB (9)  2.0%
1.1 to 10 MB (19)  4.1%
11 to 100 MB (16)  3.5%
101 MB to 1 GB (55)  12.0%
1.1 to 10 GB (90)  19.6%
11 to 100 GB (80)  17.4%
101 GB to 1 Terabyte (85)  18.5%
1.1 to 10 TB (52)  11.3%
11 to 100 TB (21)  4.6%
101 TB to 1 Petabyte (11)  2.4%
1.1 PB to 10 Petabyte (4)  0.9%
11 to 100 PB (5)  1.1%
over 100 PB (12)  2.6%
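
The percentages above can be re-derived from the raw vote counts. Below is a minimal sketch, assuming the bins are treated as ordered from smallest to largest; the names `COUNTS`, `percentages`, and `median_bin` are illustrative and not part of the original poll.

```python
# Vote counts copied from the poll results above, smallest size bin first.
COUNTS = [
    ("less than 1 MB", 9),
    ("1.1 to 10 MB", 19),
    ("11 to 100 MB", 16),
    ("101 MB to 1 GB", 55),
    ("1.1 to 10 GB", 90),
    ("11 to 100 GB", 80),
    ("101 GB to 1 Terabyte", 85),
    ("1.1 to 10 TB", 52),
    ("11 to 100 TB", 21),
    ("101 TB to 1 Petabyte", 11),
    ("1.1 PB to 10 Petabyte", 4),
    ("11 to 100 PB", 5),
    ("over 100 PB", 12),
]


def percentages(counts):
    """Return (label, share of all votes in percent, rounded to 0.1)."""
    total = sum(n for _, n in counts)
    return [(label, round(100 * n / total, 1)) for label, n in counts]


def median_bin(counts):
    """Return the label of the bin that contains the median voter."""
    total = sum(n for _, n in counts)
    running = 0
    for label, n in counts:
        running += n
        # The first bin whose cumulative count reaches half of all votes.
        if running * 2 >= total:
            return label
    raise ValueError("counts must not be empty")


for label, pct in percentages(COUNTS):
    print(f"{label}: {pct}%")
print("median bin:", median_bin(COUNTS))
```

Under these assumptions the median voter falls in the 11 to 100 GB bin, which matches the GB-range reading of the results.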


Here is a more detailed analysis of the poll results:

Where is Big Data? For most, Largest Dataset Analyzed is in laptop-size GB range
Comments

kevin, 101GB-1TB
Seems like the 101GB to 1TB group should be broken down into more granular groups, particularly because of limitations such as in-memory analytics. There's still a substantial difference between 128GB RAM and 1TB RAM, and more granularity in this group might reveal meaningful insights. Something to keep in mind for next time.

Gregory Piatetsky, Editor, Data size - RAW
This poll looks at the raw data size, at the beginning of the analysis process. After preprocessing, even the largest dataset may shrink to only a few useful rules.

Laura Squier, Largest dataset mined
Are we considering the size of the original raw data? Or the size of the dataset once it is prepped and ready for modeling?
Two very different answers.

For comparison, here are results of previous polls:
