KDnuggets : News : 2009 : n12 : item23

Briefs

Extracting Meaning from Millions of Pages

University of Washington software pulls facts from 500 million Web pages.

By David Talbot

A software engine that pulls together facts by combing through more than 500 million Web pages has been developed by researchers at the University of Washington. The tool extracts information from billions of lines of text by analyzing basic relationships between words.
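To make "relationships between words" concrete, here is a toy sketch of pulling (entity, relation, entity) triples out of plain sentences. It is not TextRunner's method, which this brief does not describe. The regular expression, the function names `extract_triples` and `count_relations`, and the sample sentences are all assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical pattern (an assumption, not TextRunner's extractor):
# a capitalized name, one to three lowercase words, then another capitalized name.
TRIPLE = re.compile(
    r"(?P<arg1>[A-Z][a-z]+(?: [A-Z][a-z]+)*) "
    r"(?P<rel>[a-z]+(?: [a-z]+){0,2}) "
    r"(?P<arg2>[A-Z][a-z]+(?: [A-Z][a-z]+)*)"
)


def extract_triples(sentence):
    """Return (arg1, relation, arg2) tuples found in one sentence."""
    return [
        (m.group("arg1"), m.group("rel"), m.group("arg2"))
        for m in TRIPLE.finditer(sentence)
    ]


def count_relations(sentences):
    """Count how often each relation phrase appears across sentences."""
    counts = Counter()
    for sentence in sentences:
        for _, rel, _ in extract_triples(sentence):
            counts[rel] += 1
    return counts


if __name__ == "__main__":
    # Illustrative sentences, not data from the article.
    corpus = [
        "Seattle is located in Washington.",
        "Tacoma is located in Washington.",
    ]
    for sentence in corpus:
        print(extract_triples(sentence))
    print(count_relations(corpus))
```

No relation names are listed in advance in this sketch: whatever short phrase sits between two names is recorded and counted. A real system would need far more robust linguistic analysis than a single regular expression.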

...

"The significance of TextRunner is that it is scalable because it is unsupervised," says Peter Norvig, director of research at Google, which donated the database of Web pages that TextRunner analyzes. "It can discover and learn millions of relations, not just one at a time. With TextRunner, there is no human in the loop: it just finds relations on its own."




Copyright © 2009 KDnuggets.