KDnuggets Home » News » 2010 » Nov » Software » Google Refine  ( < Prev | 10:n27 | Next > )

Google Cleans Up Messy Data with Refine

Google Refine is a power tool for working with messy data, cleaning it up, transforming it from one format into another, extending it with web services, and linking it to databases like Freebase.

Mashable, Jolie O'Dell, Nov 10, 2010


If you live for data, slave over spreadsheets and constantly find yourself sifting through endless rows and columns of facts and figures, Google's got a lovely new product just for you - and it's free and open-source, too.

For example, if you're writing an academic paper, government study or news article that requires you to download and parse spreadsheets from Data.gov or a similar source of free information, you might notice all kinds of inconsistencies when you try to sort the data. This is a particular problem when you're using free, open-to-the-public data that no one has maintained or cleaned up in the past.

Google Refine builds on its Gridworks roots by helping its users correct inconsistencies, change data formats, extend data sets with data from web sources and other databases, and much more. Refine also brings "a new extensions architecture, a reconciliation framework for linking records to other databases (like Freebase) and a ton of new transformation commands and expressions," according to the official Google Open Source blog.
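To give a feel for what "correcting inconsistencies" can mean in practice, here is a minimal Python sketch of one common approach: reduce each value to a normalized key and group the values whose keys collide. This is an illustration only, not Refine's own code; the function names `fingerprint` and `cluster` and the sample agency names are made up for the example.

```python
import re
import unicodedata
from collections import defaultdict


def fingerprint(value: str) -> str:
    """Build a normalized key so near-duplicate spellings collide."""
    # Strip accents, lowercase, and trim surrounding whitespace.
    text = unicodedata.normalize("NFKD", value)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = text.strip().lower()
    # Replace punctuation with spaces, then sort the unique tokens.
    tokens = re.sub(r"[^\w\s]", " ", text).split()
    return " ".join(sorted(set(tokens)))


def cluster(values):
    """Group values whose fingerprints are identical."""
    groups = defaultdict(list)
    for value in values:
        groups[fingerprint(value)].append(value)
    # Only keys with more than one distinct spelling need attention.
    return {key: vals for key, vals in groups.items() if len(set(vals)) > 1}


# Hypothetical agency names, as they might appear in a public spreadsheet.
rows = ["Dept. of Energy", "dept of energy ", "Energy, Dept. of", "NASA"]
clusters = cluster(rows)
```

Here the three spellings of the energy department end up in one group, while "NASA" is left alone because nothing else shares its key. A person would still choose which spelling to keep for each group.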

Here is the first introductory video

More at code.google.com/p/google-refine/

Read more of the Mashable review.
