KDnuggets : News : 2004 : n15 : item18
Software
From: Ke Wang
Date: 7 Aug 2004
Subject: Document clustering software - FIHC 1.0 (Free)

New document clustering software, FIHC 1.0, is available to the academic and research community: http://www.cs.sfu.ca/~ddm. The package includes executable code, source code, and sample data.

FIHC (Frequent Itemset-based Hierarchical Clustering) is a program that constructs a document cluster hierarchy from a set of unlabeled documents based on "frequent itemsets". As an abstraction of "English sentences", frequent itemsets serve as a natural measure of the cohesiveness of a cluster: documents in the same cluster are expected to share more itemsets than those in different clusters. FIHC produces the cluster hierarchy as an XML file that can be browsed interactively using the cluster descriptions, which are themselves frequent itemsets.
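To make the notion of "frequent itemsets" concrete, the following is a minimal Python sketch, not FIHC's code or algorithm. It assumes that an itemset is a set of words, that an itemset is frequent when it occurs in at least a `min_support` fraction of the documents, and that brute-force enumeration is acceptable for a toy corpus. The function names, the sample documents, and the threshold are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(docs, min_support=0.5, max_size=2):
    """Return {itemset: support} for word sets found in at least
    a min_support fraction of docs (brute force, illustration only)."""
    doc_sets = [set(d.lower().split()) for d in docs]
    counts = Counter()
    for words in doc_sets:
        # Count every word subset of size 1..max_size once per document.
        for size in range(1, max_size + 1):
            for combo in combinations(sorted(words), size):
                counts[frozenset(combo)] += 1
    n = len(doc_sets)
    return {items: c / n for items, c in counts.items() if c / n >= min_support}


def shared_itemsets(doc_a, doc_b, itemsets):
    """Count the frequent itemsets contained in both documents."""
    a = set(doc_a.lower().split())
    b = set(doc_b.lower().split())
    return sum(1 for s in itemsets if s <= a and s <= b)


# Hypothetical toy corpus: two documents on each of two topics.
docs = [
    "data mining clustering",
    "document clustering mining",
    "stock market prices",
    "market prices rise",
]

itemsets = frequent_itemsets(docs, min_support=0.5)
print(shared_itemsets(docs[0], docs[1], itemsets))  # same topic: 3
print(shared_itemsets(docs[0], docs[2], itemsets))  # different topics: 0
```

In this toy corpus the two clustering documents share three frequent itemsets, while documents on different topics share none, which is the cohesiveness intuition described above. A real tool would use a proper frequent itemset mining algorithm and text preprocessing in place of this enumeration.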
Copyright © 2004 KDnuggets.