Large-Scale Hierarchical Classification Workshop

Related to the Pascal2 LSHTC challenge

32nd European Conference on Information Retrieval (ECIR)

The Open University in Milton Keynes, UK

Sunday, March 28th, 2010 (8:30am - 6:00pm)


Hierarchies are becoming ever more popular for the organization of documents, particularly on the Web (Web directories are an example of such hierarchies). Along with their widespread use comes the need for automated classification of new documents into the categories of the hierarchy. As the size of the hierarchy grows and the number of documents to be classified increases, a number of interesting problems arise. In particular, this is one of the rare situations where data sparsity remains an issue despite the vastness of available data. The reasons for this are the simultaneous increase in the number of classes and their hierarchical organization. The latter leads to a very high imbalance between the classes at different levels of the hierarchy. Additionally, the statistical dependence between the classes poses both challenges and opportunities for learning methods.

Research on large-scale classification has so far focused on situations involving a large number of documents and/or a large number of features, with a limited number of categories. However, this is not the case in hierarchical category systems, such as DMOZ or the International Patent Classification, where, in addition to the large number of documents and features, there is a large number of categories, on the order of tens or hundreds of thousands. To approach this problem, either existing large-scale classifiers can be extended or new methods need to be developed. The goal of this workshop is to discuss and assess some of these strategies, covering all or part of the issues mentioned above.

Workshop Format

The workshop will last one day. All participants will be asked to prepare papers, which will be presented either as oral presentations or as posters. Oral presentations will last 30 minutes, including questions. All papers will be 12 pages long and will be published in the workshop proceedings. Additionally, the program will include one invited talk and a round-table discussion.

Submissions to the workshop are solicited through an open call for papers and will undergo peer review by the programme committee. We expect submissions from the various teams that have actively participated in the LSHTC challenge. At the same time, we strongly encourage submissions from researchers not participating in the challenge.

Invited Speaker

Yiming Yang

Important dates

  • Paper submission - Jan 18
  • Acceptance notification - Feb 15
  • Final paper - Mar 1


Organizers

Eric Gaussier, LIG, Grenoble, France
George Paliouras, NCSR "Demokritos", Athens, Greece
Aris Kosmopoulos, NCSR "Demokritos" and AUEB, Athens, Greece
Sujeevan Aseervatham, LIG, Grenoble, France