Institute for Web Science and Technologies · Universität Koblenz - Landau

Big Data


Introduction

This course covers a variety of topics in managing Big Data, including the underlying algorithms and practices as well as methods for analysis.

Prerequisite

Students are expected to have background knowledge of (i) databases, (ii) data analysis, e.g. data science or machine learning, and (iii) programming. The course is not suitable for students without programming experience.

Lecture

The lectures will be given by Prof. Steffen Staab.

Lecture recordings are available at Videoakademie.

KLIPS entry: https://klips.uni-koblenz-landau.de/v/112828

Exercise

Students will deepen their understanding of the course material in the practical programming exercises.
The source code of the tutorials and the example datasets can be found in the SVN.
Please register as a member of a group in Teams by 2019-04-21.

Time and Location

Tuesday, 14:00-16:00, weekly from 09.04.2019 to 15.07.2019, G 310
Friday, 12:00-14:00, weekly from 12.04.2019 to 19.07.2019, H 009

Literature

Jure Leskovec, Anand Rajaraman & Jeffrey D. Ullman. Mining of Massive Datasets. http://mmds.org/
Bill Chambers & Matei Zaharia. Spark: The Definitive Guide. O'Reilly 2018. Additional material/code: https://github.com/databricks/Spark-The-Definitive-Guide/
Pramod J. Sadalage & Martin Fowler. NoSQL Distilled: A Brief Guide to the Emerging World of Polyglot Persistence.
Dmitriy Lyubimov & Andrew Palumbo. Apache Mahout: Beyond MapReduce. Distributed Algorithm Design. 2016.

Exam

The exam is announced in KLIPS.
Date: Friday, July 26, 2019, 12:00-14:00
Location: M001

The second exam will take place on October 15, 2019, 10:00-12:00.

Schedule

When | What | Who | Slides | Assignment | Tutorial
April 9 - Tue | Lecture | Steffen | 0-introduction.pptx, 0-introduction.pdf
April 12 - Fr | Tutorial | Daniel | slides
April 16 - Tue | Lecture | Steffen | 1-cloudcomputing.pptx, 1-cloudcomputing.pdf
April 19 - Fr | No lecture / tutorial - public holiday
April 23 - Tue | No lecture
April 26 - Fr | Tutorial | Daniel | slides
April 30 - Tue | Lecture | Steffen | 2-spark-intro.pptx, 2-spark-intro.pdf (updated May 7, 2019)
May 3 - Fr | Tutorial | Daniel | slides
May 7 - Tue | Lecture | Steffen | 3-OLAP.pptx, 3-OLAP.pdf
May 10 - Fr | Tutorial | Daniel | slides
May 14 - Tue | Lecture | Steffen | 3-OLAP.pptx (updated May 20, 2019), 3-OLAP.pdf (updated May 20, 2019)
May 17 - Fr | Tutorial | Daniel | slides
May 21 - Tue | Lecture | Steffen | 4-Joins+More.pptx, 4-Joins+More.pdf
May 24 - Fr | Tutorial | Daniel | slides
May 28 - Tue | Lecture | Claudia | 4-Joins+More.pptx, 4-Joins+More.pdf, 5-NoSQL.pptx, 5-NoSQL.pdf
May 31 - Fr | Tutorial | Daniel | slides
June 4 - Tue | Lecture | Claudia
June 7 - Fr | Tutorial | Daniel | slides
June 11 - Mo | No lecture / tutorial - public holiday
June 15 - Fr | No lecture / tutorial - public holiday
June 18 - Tue | Lecture | Steffen | 6-Distribution+Consistency.pptx, 6-Distribution+Consistency.pdf
June 21 - Fr | Tutorial | Daniel | slides
June 25 - Tue | Lecture | Steffen | 7-ScalableAnalysis-AlgebraicModeling.pptx (updated June 28, 2019), 7-ScalableAnalysis-AlgebraicModeling.pdf (updated June 28, 2019)
June 28 - Fr | Tutorial | Daniel | slides
July 2 - Tue | Lecture | Claudia | 8-AdvancedAnalytic-and-ML-spark.pptx, 8-AdvancedAnalytic-and-ML-spark.pdf
July 5 - Fr | Tutorial | Daniel | slides
July 9 - Tue | Lecture | Steffen | 9-streaming.pptx, 9-streaming.pdf
July 12 - Fr | Tutorial | Daniel | slides
July 16 - Tue | Q&A | Steffen
July 19 - Fr | Q&A | Daniel | slides

 

Plan

  1. Overview
  2. Cloud computing
  3. Spark and MapReduce
  4. OLAP
  5. Algorithm Engineering for Big Data
  6. NoSQL Stores
  7. Graph Stores

Lecturers

  • staab@uni-koblenz.de, Professor, B 108, +49 261 287-2761
  • schon@uni-koblenz.de, Scientific Employee, B 114, +49 261 287-2773
  • danijank@uni-koblenz.de, Scientific Employee, B 103, +49 261 287-2747