ECML/PKDD 2012 Discovery Challenge:
Large Scale Hierarchical Text Classification

Web site: lshtc.iit.demokritos.gr/
Email: lshtc_info@iit.demokritos.gr

This year's discovery challenge hosts the third edition of the successful PASCAL challenges on large scale hierarchical text classification. The challenge comprises three tracks and is based on two large datasets created from the ODP web directory (DMOZ) and Wikipedia. The datasets are multi-class, multi-label and hierarchical. The number of categories ranges from roughly 13,000 to 325,000, and the number of documents from roughly 380,000 to 2,400,000.

The tracks of the challenge are organized as follows:

1. Standard large-scale hierarchical classification
a) On a medium-size collection from Wikipedia
b) On a large collection from Wikipedia

2. Multi-task learning, based on both DMOZ and Wikipedia category systems

3. Refinement-learning
a) Semi-Supervised approach
b) Unsupervised approach

To register for the challenge and gain access to the datasets, you must have an account at the challenge Web site.

Important dates:

March 30, start of the challenge
April 20, opening of the evaluation
June 29, closing of the evaluation
July 20, paper submission deadline
August 3, paper notifications

For more information and to participate, visit lshtc.iit.demokritos.gr/
