Finding Good URLs Evaluation Dataset

Queries, corpus information, and relevance judgements for evaluating the task of mapping entities in a knowledge base to public web documents. The dataset provides graded relevance judgements in a format that can be processed directly by the trec_eval tool. Two additional files map document IDs to URLs and topic IDs to human-readable queries.
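
As a rough illustration of the judgement format, the sketch below reads lines in the standard trec_eval qrels layout (topic ID, iteration, document ID, relevance grade, separated by whitespace). The sample rows and the IDs in them are invented for illustration and are not taken from the dataset; the function name parse_qrels is likewise hypothetical, and the actual file names inside the ZIP archive are not assumed here.

```python
from collections import defaultdict


def parse_qrels(lines):
    """Parse trec_eval-style qrels lines into {topic_id: {doc_id: grade}}.

    Each line is expected to hold four whitespace-separated fields:
    topic ID, iteration (ignored), document ID, relevance grade.
    """
    judgements = defaultdict(dict)
    for line in lines:
        parts = line.split()
        if len(parts) != 4:
            continue  # skip blank or malformed lines
        topic_id, _iteration, doc_id, relevance = parts
        judgements[topic_id][doc_id] = int(relevance)
    return judgements


# Invented sample rows (hypothetical IDs, not from the dataset).
sample = [
    "1 0 doc-17 2",
    "1 0 doc-42 0",
    "2 0 doc-03 1",
]
qrels = parse_qrels(sample)
```

To read a real judgement file, the same function could be applied to an open file handle, since iterating over a text file yields its lines.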

The dataset was introduced in a paper at the ISWC workshop on Web of Linked Entities 2012.

License

The URL collection, relevance judgements and query selection by Christian Hachenberg and Thomas Gottron are licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.

If you use the files in scientific work that leads to a publication, please cite the related publication listed below.

Dataset

Compressed in ZIP format: wole-2012-dataset.zip

Related Publications

C. Hachenberg and T. Gottron, “Finding Good URLs: Aligning Entities in Knowledge Bases with Public Web Document Representations” in WoLE’12: Proceedings of the ISWC workshop on Web of Linked Entities, Nov. 2012.