KDnuggets : News : 2006 : n19 : item8


Subject: Open source search technology goes beyond keywords

September 25, 2006, By: Michael Stutz

For several years a group of academic researchers has been quietly working on a new kind of search engine -- one that recognizes the semantic meaning of a query instead of only taking input as a keyword to be literally matched. The technology is licensed under the GPL, and a desktop version is imminent.

In its simplest form, semantic indexing can recognize synonyms and related terms; for example, a search in an inventory database for "fruit" could turn up documents listing "apples" and "oranges."
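The "fruit" example can be sketched in a few lines of Python. This is only a toy illustration, not the Semantic Engine's method: the hand-written RELATED_TERMS table, the sample DOCUMENTS, and both function names are assumptions made up for this sketch, standing in for whatever relationships a real semantic index would derive from the documents themselves.

```python
# Toy illustration only: a hand-written table of related terms stands in
# for whatever a real semantic index would learn from the documents.
# All names and data below are hypothetical.
RELATED_TERMS = {
    "fruit": {"apples", "oranges"},
}

DOCUMENTS = {
    "inv-1": "Crate of apples received at the north warehouse",
    "inv-2": "Oranges shipped to the retail store",
    "inv-3": "Box of hammers returned by customer",
}


def keyword_search(query, documents):
    """Return ids of documents that contain the query word literally."""
    word = query.lower()
    return sorted(
        doc_id
        for doc_id, text in documents.items()
        if word in text.lower().split()
    )


def expanded_search(query, documents, related=RELATED_TERMS):
    """Return ids of documents containing the query word or a related term."""
    word = query.lower()
    terms = {word} | related.get(word, set())
    return sorted(
        doc_id
        for doc_id, text in documents.items()
        if terms & set(text.lower().split())
    )


if __name__ == "__main__":
    print(keyword_search("fruit", DOCUMENTS))
    print(expanded_search("fruit", DOCUMENTS))
```

A literal keyword match for "fruit" finds nothing in this sample inventory, while the expanded search returns the two records that mention apples and oranges.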

Aaron Coburn, lead developer of the Semantic Indexing Project at Middlebury College, says that his team is currently documenting its open source search toolkit and finishing up a new desktop search application that should be released later this month.

All of the source code is available for download, published under the terms of the GNU General Public License. The project's core technology is the Semantic Engine, which is distributed with its C++ code, Perl bindings, and all the necessary code for building the GUI. There's also a Subversion archive for development versions. The new desktop application, called the Standalone Engine, will be available later this month.

... Coburn added some software to visualize the semantic data in the database, and the search software became a powerful tool for plot visualization. He began using it to make visualizations of characters in Jane Austen novels, charting their various interactions through the course of the narrative. "And the algorithms seemed to do a really good job of detecting how the characters interacted!"




Copyright © 2006 KDnuggets.