KDnuggets : News : 2002 : n16 : item22

Briefs

Petabyte (a thousand terabytes) Data Mining coming in a decade?

A deluge of digital data in life sciences and astronomy has scientists at Johns Hopkins University and Microsoft concluding that the titan of supersized data storage, the petabyte, may be as commonplace as the megabyte in less than a decade.

In a yet-to-be-published paper entitled "Petabyte Scale Data Mining: Dream or Reality?", Jim Gray, a distinguished engineer in Microsoft's research division, said, "Today's astronomy data sets -- with tens of millions of galaxies -- already present substantial challenges for data mining. In less than 10 years, the catalogs are expected to grow to billions of objects, and image archives will reach petabytes."

Computer storage is measured in bytes. A petabyte is 2 to the fiftieth power bytes, or approximately a million gigabytes. At present, only the largest database operations use petabyte-size data storage. "Google uses two petabytes and many others are in the petabyte range, such as AOL, Hotmail and Yahoo," Gray told NewsFactor.
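
As a rough check on these figures, the short Python sketch below works through the arithmetic, assuming the binary convention stated above (a petabyte as 2 to the fiftieth power bytes, a terabyte as 2 to the fortieth, a gigabyte as 2 to the thirtieth). The constant and function names are illustrative only and do not come from the article or the paper.

```python
# Binary-prefix arithmetic behind the figures quoted in the article.
# Names here are hypothetical, chosen only for this illustration.
PETABYTE = 2 ** 50  # bytes, per the article's definition
TERABYTE = 2 ** 40  # bytes (binary convention, assumed)
GIGABYTE = 2 ** 30  # bytes (binary convention, assumed)


def petabyte_in(unit_bytes: int) -> int:
    """Return how many units of the given size make up one petabyte."""
    return PETABYTE // unit_bytes


if __name__ == "__main__":
    print(f"1 PB = {PETABYTE:,} bytes")
    print(f"1 PB = {petabyte_in(TERABYTE):,} TB")
    print(f"1 PB = {petabyte_in(GIGABYTE):,} GB")
```

Under this convention a petabyte comes to 1,024 terabytes and 1,048,576 gigabytes, which is consistent with both "a thousand terabytes" in the headline and "approximately a million gigabytes" in the text.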

Today, routinely managing -- or mining -- such vast volumes of data is cost-prohibitive for most firms. However, envisioning future data management at the Sloan Digital Sky Survey, Gray and Johns Hopkins University astronomers Alexander Szalay and Jan vandenBerg predicted that six years from now, petabyte-sized data will be as "trivial to manipulate as operating a laptop computer."

See full story at

http://story.news.yahoo.com/news?tmpl=story&u=/nf/20020826/tc_nf/19162


Copyright © 2002 KDnuggets.