Web Retrieval

Knowledge is of two kinds. We know a subject ourselves, or we know where we can find information upon it.

(Samuel Johnson)


Information Retrieval (IR) deals with the storage, representation and management of information items. In the classical setting, these items are text documents. With the advent of the World Wide Web, the methods of IR have been transferred to retrieval on the web, which poses different challenges and has spawned the area of Web Retrieval.

The lecture gives an introduction to established retrieval models for text-based documents, models that exploit the graph structure of the WWW, the evaluation of retrieval system performance, and related tasks such as classification and clustering of web documents.


Lecture Material

Slides and additional material will be provided as the lecture progresses. We will try to publish the material at least one day before each lecture.

Lecture slides

Tutorials and Exercises

Recommended Reading

C. Manning, P. Raghavan, H. Schütze. Introduction to Information Retrieval. Cambridge University Press, 2008. Online version available

R. Baeza-Yates, B. Ribeiro-Neto. Modern Information Retrieval. Addison-Wesley, 1999. (A second, updated edition of the book is also available.)

C.J. van Rijsbergen. Information Retrieval. Butterworths, 1979. Online version available