KDnuggets : News : 2005 : n16 : item39 < PREVIOUS | NEXT >

Briefs

Automated mining of Deep Web

Aug. 17, 2005, By Michael Bazeley, Mercury News

The Web is made up of hundreds of billions of Web documents -- far more than the 8 billion to 20 billion claimed by Google or Yahoo. But most of these pages are largely out of reach of search engines because they are stored in databases that Web crawlers cannot access.

Now a San Mateo start-up called Glenbrook Networks says it has devised a way to tunnel far into the "deep web" and extract this previously inaccessible information.

Glenbrook, run by a father-daughter team, demonstrated its technology by building a search engine that scoops up job listings from the databases of various Web sites (see www.glendor.com), something the company claims most search engines cannot do. But there are myriad other applications as well, the founders say.

"Most of the information out there, people want you to see," said Julia Komissarchik, Glenbrook Networks' vice president of products. "But it's not designed to be accessed by a machine like a search engine. It requires human intervention."




Copyright © 2005 KDnuggets.