Course Information Retrieval
Winter semester 2006/2007
The results of the oral exam are available on WebCT.
- Time and location: Wednesdays, 4 p.m. c.t. (i.e. 16:15), in A-120
- Course volume: 2 academic hours per week.
- Target audience: Computer Science, CV.
- Recommended prerequisites: successful participation in the course "Database Systems"; background knowledge of linear algebra, probability theory, and stochastics. The course covers the essential basics of these disciplines, as they are crucial for several course topics. However, a deeper understanding of the material may require self-study of the recommended related literature.
- Examination: oral exam at the end of the winter semester
Information Retrieval is a collective term for methods and technologies for searching, analysing, and automatically organising different types of data collections: text documents, multimedia collections, and structured or semi-structured representations of knowledge. The lecture gives deeper insight into the mathematical models and algorithms behind search engines for the World Wide Web, intranets, and digital libraries. It builds on mathematical tools from linear algebra and regression analysis (such as Singular Value Decomposition) and from probability theory and statistics (such as Markov chains and Bayesian networks).
Goals of the course: a better understanding of the internals of modern search systems and engines and of their limitations; the ability to design and improve IR systems using state-of-the-art methods.
- Motivation, overview, system architectures
- Technical basics: linear algebra, probability theory, and stochastics
- Classical IR systems: vector space models, link analysis and authority ranking, multimedia retrieval, architecture and operation of modern search engines, organisation and ranking of search results
- Advances in IR systems: advanced link analysis, top-k retrieval algorithms, ontologies and concept-based information search, focused crawling, Deep Web information sources, search and ranking for semi-structured data and XML
- Automatic knowledge acquisition from web data and heterogeneous document collections
- State-of-the-art peer-to-peer search systems and algorithms