Discovery Challenge Workshop on Large Scale Hierarchical Classification

ECML/PKDD 2012, Discovery Challenge Workshop on Large Scale Hierarchical Classification

Bristol, UK on September 28th, 2012


Hierarchies are becoming ever more popular for the organization of documents, particularly on the Web (Web directories are an example of such hierarchies). Along with their widespread use comes the need for automated classification of new documents into the categories of the hierarchy. As the size of the hierarchy grows and the number of documents to be classified increases, a number of interesting problems arise. In particular, this is one of the rare situations where data sparsity remains an issue despite the vastness of available data. The reasons for this are the simultaneous increase in the number of classes and their hierarchical organization. The latter leads to a very high imbalance between the classes at different levels of the hierarchy. Additionally, the statistical dependence between the classes poses both challenges and opportunities for learning methods.

Research on large-scale classification has so far focused on situations involving a large number of documents and/or a large number of features, with a limited number of categories. However, this is not the case in hierarchical category systems, such as DMOZ, the International Patent Classification or Wikipedia, where, in addition to the large number of documents and features, there is a large number of categories, on the order of tens or hundreds of thousands. To approach this problem, either existing large-scale classifiers can be extended or new methods need to be developed. The goal of this workshop, which follows two previous editions, is to discuss and assess some of these strategies, covering all or part of the issues mentioned above.

Workshop Format

The workshop will last one day. All participants will be asked to prepare papers, which will be presented either as oral presentations or as posters. Submissions must be written in English, follow the LNCS guidelines, and not exceed 12 pages including references and figures. Additionally, the program will include one invited talk and a round-table discussion.

Submissions to the workshop are solicited through an open call for papers and will undergo peer review by the programme committee. We encourage submissions on all aspects of large-scale categorization, from purely theoretical work to practical developments of large-scale categorizers.

Invited Speaker

Zaid Harchaoui: Large-scale learning with gauge regularization for visual classification

Important dates

  • Paper submission - July 27
  • Notification - August 10
  • Final paper - August 24
  • Workshop - September 28


Ion Androutsopoulos, AUEB, Athens, Greece
Thierry Artières, LIP6, Paris, France
Patrick Gallinari, LIP6, Paris, France
Eric Gaussier, LIG, Grenoble, France
Aris Kosmopoulos, NCSR "Demokritos" & AUEB, Athens, Greece
George Paliouras, NCSR "Demokritos", Athens, Greece
Ioannis Partalas, LIG, Grenoble, France