Date: Jun 8, 2010
I have manually annotated all 139 abstracts from the ACM KDD'09 proceedings. Every mention of a data mining concept in these abstracts is linked to a data mining ontology. The annotated corpus could serve as a good benchmark dataset for text mining tasks.
The following paper summarizes both the corpus and ontology.
The ICDM'09 abstracts are currently being annotated, and the data mining ontology is being updated.
Contact gmelli AT gabormelli DOT com if interested in participating in this knowledge discovery benchmark effort.