Web Information Retrieval

Information Retrieval (IR) deals with the storage, representation and management of information items. In the classical setting, the information items are text documents. With the advent of the World Wide Web, IR methods have been transferred to retrieval on the web. This poses different challenges and has spawned the area of Web Retrieval.

The lecture will give an introduction to established retrieval models for text-based documents, models that exploit the graph structure of the WWW, the evaluation of retrieval system performance, and related tasks such as classification and clustering of web documents.

Web Information Retrieval (6 ECTS-Credits) is a lecture given in English that

  • is a mandatory course for master students of Web Science
  • can be taken as an elective course by bachelor and master students of Informatik and Computervisualistik, and by master students of Wirtschaftsinformatik and Information Management

This course has the following prerequisites, which we expect students to be familiar with. If you feel you need to refresh any of these topics, do so in the first few weeks:

  • For the theoretical parts, students should have knowledge of Algorithms and Data Structures and Linear Algebra.
  • For the programming assignments, students should be familiar with Java. (For building and executing, we will use Maven; the important parts will be shown in the tutorial. As an IDE, we recommend Eclipse.)


  • The second-attempt oral exam for Web Retrieval will take place on November 15th, 13:00-16:00. Students who need to register for the exam should send an email to Dr. Kumar.
  • The written exam will take place on August 15, 2016, in lecture room E 011 from 16:00 (s.t.) to 18:00. Prior registration in KLIPS is required to participate in the exam.
  • No lecture on April 25, 2016.
  • The tutorials will start April 15, 2016.
  • The lectures will start April 11, 2016.

Organisational Information

Lecture (Klips)

Lecturer Dr. Chandan Kumar
Date Mon 16:00 - 18:00
Room B 016, KO Building B

Tutorial (Klips)

Instructor Lukas Schmelzeisen
Date Fri 10:00 - 12:00
Room B 016, KO Building B

If you have any questions or want to discuss something about the lecture or the assignments, feel free to ask

  • your colleagues
  • the Web Science newsgroup infko.webscience
  • our Facebook group (Don't worry if you don't have or want Facebook: joining this group is by no means required. All news will be published here and in the newsgroup.)
  • the teachers via e-mail


Slides and additional material will be provided as the lecture progresses.


  1. Organization (PDF)
  2. Introduction (PPT) (PDF)
  3. Preprocessing (PPT) (PDF)
  4. Evaluation (PPT) (PDF)
  5. Boolean Model (PPT) (PDF)
  6. Vector Space Model (PPT) (PDF)
  7. Probabilistic Language Model (PPT) (PDF)
  8. Web Search Characteristics  (PPT) (PDF)
  9. Web Crawling (PPT) (PDF)
  10. Authority Ranking - PageRank  (PPT) (PDF)
  11. User interfaces, Visualizations, Eyetracking  (PPT) (PDF)
  12. Probability Retrieval Principle (PPT) (PDF)
  13. Geographic IR (PDF), Personalized (PDF), Multimedia (PDF)


Assignment    Submission deadline         Programming solution  Tutorial slides
-             -                           -                     Tutorial 0
Assignment 1  April 21, 2016, 10:00 a.m.  Solution 1            Tutorial 1
Assignment 2  May 4, 2016, 10:00 a.m.     Solution 2            Tutorial 2
Assignment 3  May 11, 2016, 10:00 a.m.    Solution 3            Tutorial 3
Assignment 4  May 25, 2016, 10:00 a.m.    Solution 4            Tutorial 4
Assignment 5  June 1, 2016, 10:00 a.m.    Solution 5            Tutorial 5
Assignment 6  June 15, 2016, 10:00 a.m.   -                     Tutorial 6
Assignment 7  June 22, 2016, 10:00 a.m.   -                     Tutorial 7
Assignment 8  July 6, 2016, 10:00 a.m.    Solution 8            Tutorial 8
Assignment 9  July 20, 2016, 10:00 a.m.   -                     Tutorial 9

It is highly recommended that you follow a textbook while taking the lecture. The textbooks will probably address most questions you might have about the content of the lecture:

  • Introduction to Information Retrieval. Manning, Raghavan, Schütze, Cambridge University Press, 2008.
    Free, electronic versions available at
  • Web Data Mining. Liu. Springer, 2007.
  • Modern Information Retrieval. Baeza-Yates, Ribeiro-Neto, ACM Press, 2012.

Additional Material:


You will be expected to solve mandatory weekly assignments in order to gain admission to the exam. The assignments will consist of theoretical parts and programming parts in Java. Follow this guide to install the software for the programming assignments.

You should complete the assignments in groups of 2-3 people. Please register your group by April 19th at


To obtain the 6 ECTS-Credits, you need to both gain admission to the exam and pass the exam. The exam is passed if you obtain a score of at least 50% in it.

Only students who have gained admission are allowed to participate in the exam. Admission is gained by obtaining a total score of at least 50% over all exercise assignments. Admissions from previous semesters are not recognized, with one exception: students who failed the exam in SS 2015 and are thus required to take it again. Even in that case, we strongly recommend participating in the lecture and exercises.

The exam will take place on Monday, August 15, 2016, 16:00-18:00 in E 011. Registration in Klips is open, mandatory, and possible until August 14. A second exam (most probably oral) will be offered at the start of the winter semester.

The second-attempt oral exam for Web Retrieval will take place on November 15th, 13:00-16:00. Students who need to register for the exam should send an email to Dr. Kumar.

For your reference, the exam from last year can be found here.

Dr. Chandan Kumar

Lukas Schmelzeisen