KDnuggets Home » Polls » Largest Database Data Mined / Analyzed Poll (Jun 2010)

Largest Database Data Mined / Analyzed Poll

What was the largest database or dataset you data-mined / analyzed? [296 votes total]
less than 1 MB (14)  5%
1.1 to 10 MB (19)  6%
11 to 100 MB (25)  8%
101 MB to 1 GB (36)  12%
1.1 to 10 GB (69)  23%
11 to 100 GB (37)  13%
101 GB to 1 Terabyte (TB) (42)  14%
1.1 to 10 TB (29)  10%
11 to 100 TB (7)  2%
101 TB to 1 Petabyte (6)  2%
over 1 Petabyte (12)  4%

We can estimate the median answer as 8-10 GB (the upper end of the 1.1 to 10 GB range). We also note that 32% reported analyzing databases over 100 GB,
and 18% over 1 Terabyte. New in this poll was a petabyte (1,000,000 GB) range, and 12 people had worked with such humongous databases.
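
The figures above can be checked against the vote counts with a short script. This is only a sketch: the poll does not say how the median was estimated, so the interpolation inside the median bin (linear and logarithmic), the decimal bin edges (1 TB = 1,000 GB), and all names in the code are our assumptions.

```python
import math

# (label, lower bound in GB, upper bound in GB, votes), copied from the poll table.
# Assumption: decimal units, so 1 TB = 1,000 GB and 1 PB = 1,000,000 GB.
BINS = [
    ("less than 1 MB", 0.0, 0.001, 14),
    ("1.1 to 10 MB", 0.001, 0.01, 19),
    ("11 to 100 MB", 0.01, 0.1, 25),
    ("101 MB to 1 GB", 0.1, 1.0, 36),
    ("1.1 to 10 GB", 1.0, 10.0, 69),
    ("11 to 100 GB", 10.0, 100.0, 37),
    ("101 GB to 1 TB", 100.0, 1_000.0, 42),
    ("1.1 to 10 TB", 1_000.0, 10_000.0, 29),
    ("11 to 100 TB", 10_000.0, 100_000.0, 7),
    ("101 TB to 1 PB", 100_000.0, 1_000_000.0, 6),
    ("over 1 PB", 1_000_000.0, math.inf, 12),
]


def median_bin(bins):
    """Return (label, lo, hi, frac) for the bin containing the median vote.

    frac is how far into that bin the median falls, as a share of its votes.
    """
    total = sum(votes for _, _, _, votes in bins)
    half = total / 2
    below = 0
    for label, lo, hi, votes in bins:
        if below + votes >= half:
            return label, lo, hi, (half - below) / votes
        below += votes
    raise ValueError("poll has no votes")


def share_at_or_above(bins, lower_gb):
    """Share of respondents whose bin starts at lower_gb or higher."""
    total = sum(votes for _, _, _, votes in bins)
    tail = sum(votes for _, lo, _, votes in bins if lo >= lower_gb)
    return tail / total


label, lo, hi, frac = median_bin(BINS)
linear_median_gb = lo + frac * (hi - lo)     # interpolate on a linear scale
log_median_gb = lo * (hi / lo) ** frac       # interpolate on a log scale

print(f"median bin: {label}")
print(f"linear estimate: {linear_median_gb:.1f} GB, log estimate: {log_median_gb:.1f} GB")
print(f"over 100 GB: {share_at_or_above(BINS, 100.0):.0%}")
print(f"over 1 TB:   {share_at_or_above(BINS, 1_000.0):.0%}")
```

Under these assumptions the median falls in the 1.1 to 10 GB bin, with a linear estimate of about 8 GB and a logarithmic estimate of about 6 GB; the shares over 100 GB and over 1 TB come out at 32% and 18%.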

In comparison, a similar 2009 poll, "What was the largest database or dataset you data-mined?", had the median answer in the slightly larger 10-20 GB range (probably because fewer people participated).
However, in 2009 only 20% had experience working with databases over 100 GB, and only 10% had experience working with databases over 1 Terabyte.

Here is a breakdown of responses by region. Note that Asia and Australia/New Zealand have a higher proportion of terabyte/petabyte data miners than the US and Europe.

[Figure: Largest Database Data Mined / Analyzed by region]
