
Big Data


This course will cover a variety of topics relating to managing Big Data, underlying algorithms and practices, as well as methods for analysis. 


Students are expected to (i) have background knowledge of databases, (ii) have background knowledge of data analysis, e.g. data science or machine learning, and (iii) know how to program. Students without programming experience are not welcome.


The lectures will be given by Prof. Steffen Staab.

The lecture recordings are available at Videoakademie.

KLIPS entry:


Students will deepen their understanding of this course during the practical programming exercises.
The source code of the tutorials and the example datasets can be found in the SVN repository.
Please register as a member of a group in Teams by 2019-04-21.

Time and Location

Tuesday, 14:00-16:00, weekly, 09.04.2019 to 15.07.2019, G 310
Friday, 12:00-14:00, weekly, 12.04.2019 to 19.07.2019, H 009


Jure Leskovec, Anand Rajaraman & Jeffrey D. Ullman. Mining of Massive Datasets.
Bill Chambers & Matei Zaharia. Spark: The Definitive Guide. O'Reilly 2018. Additional material/code:
Pramod J. Sadalage & Martin Fowler. NoSQL Distilled: A Brief Guide to the Emerging World of Polyglot Persistence.
Dmitriy Lyubimov & Andrew Palumbo. Apache Mahout: Beyond MapReduce. Distributed Algorithm Design. 2016.


The exam is announced in KLIPS.
Date: Friday, July 26, 2019, 12:00-14:00
Location: M001

A second exam will take place in October. Time and place are yet to be determined.


When What Who Slides Assignment Tutorial
April 9 - Tue Lecture Steffen 0-introduction.pptx
April 12 - Fr Tutorial Daniel slides    
April 16 - Tue Lecture Steffen 1-cloudcomputing.pptx
April 19 - Fr No lecture / tutorial - public holiday
April 23 - Tue No lecture 
April 26 - Fr Tutorial Daniel slides    
April 30 - Tue Lecture Steffen 2-spark-intro.pptx
2-spark-intro.pdf (updated May 7, 2019)
May 3 - Fr Tutorial Daniel slides    
May 7 - Tue Lecture Steffen 3-OLAP.pptx
May 10 - Fr Tutorial Daniel slides    
May 14 - Tue Lecture Steffen 3-OLAP.pptx (updated May 20, 2019)
3-OLAP.pdf (updated May 20, 2019)
May 17 - Fr Tutorial Daniel slides    
May 21 - Tue Lecture Steffen 4-Joins+More.pptx
May 24 - Fr Tutorial Daniel slides    
May 28 - Tue Lecture Claudia 4-Joins+More.pptx
May 31 - Fr Tutorial Daniel slides    
June 4 - Tue Lecture Claudia      
June 7 - Fr Tutorial Daniel slides    
June 11 - Tue No lecture / tutorial - public holiday
June 14 - Fr No lecture / tutorial - public holiday
June 18 - Tue Lecture Steffen      
June 21 - Fr Tutorial Daniel      
June 25 - Tue Lecture Steffen      
June 28 - Fr Lecture Daniel      
July 2 - Tue Lecture Claudia      
July 5 - Fr Tutorial Daniel      
July 9 - Tue Lecture Steffen      
July 12 - Fr Tutorial Daniel      
July 16 - Tue Lecture Sarah      
July 19 - Fr Q&A        



  1. Overview
  2. Cloud computing
  3. Spark and MapReduce
  4. OLAP
  5. Algorithm Engineering for Big Data
  6. NoSQL Stores
  7. Graph Stores

Prof. Dr. Steffen Staab

Dr. Claudia Schon

Daniel Janke