Big Data
Summer Term 2019
Introduction
This course will cover a variety of topics relating to managing Big Data, underlying algorithms and practices, as well as methods for analysis.
Prerequisite
Students are expected to have background knowledge of (i) databases and (ii) data analysis, e.g. data science or machine learning, and (iii) must know how to program. Students without programming experience are not welcome.
Lecture
The lectures will be held by Prof. Steffen Staab.
You can find the lecture recordings at Videoakademie.
KLIPS entry: https://klips.uni-koblenz-landau.de/v/112828
Exercise
Students will deepen their understanding of the course material in the practical programming exercises.
The source code of the tutorials as well as the example datasets can be found in the SVN.
Please register by 2019-04-21 as a member of a group in Teams.
Time and Location
Tuesday | 14:00 ~ 16:00 | weekly | 09.04.2019 to 15.07.2019 | G 310 |
Friday | 12:00 ~ 14:00 | weekly | 12.04.2019 to 19.07.2019 | H 009 |
Literature
Jure Leskovec, Anand Rajaraman, Jeffrey D. Ullman. Mining Massive Data Sets http://mmds.org/
Bill Chambers & Matei Zaharia. Spark. The Definitive Guide. O'Reilly 2018. Additional material/code: https://github.com/databricks/Spark-The-Definitive-Guide/
Pramod J. Sadalage & Martin Fowler. NoSQL Distilled. A Brief Guide to the Emerging World of Polyglot Persistence
Dmitriy Lyubimov & Andrew Palumbo. Apache Mahout: Beyond MapReduce. Distributed Algorithm Design. 2016
Exam
Announced in KLIPS.
Date: Friday, July 26, 2019, 12.00-14.00hrs
Location: M001
The second exam will take place on October 15, 2019, 10.00-12.00hrs. Registration ends on Friday, October 10, 2019.
Schedule
When | What | Who | Slides | Assignment | Tutorial |
April 9 - Tue | Lecture | Steffen | 0-introduction.pptx 0-introduction.pdf | ||
April 12 - Fr | Tutorial | Daniel | slides | ||
April 16 - Tue | Lecture | Steffen | 1-cloudcomputing.pptx 1-cloudcomputing.pdf | ||
April 19 - Fr | No lecture / tutorial - public holiday | ||||
April 23 - Tue | No lecture | ||||
April 26 - Fr | Tutorial | Daniel | slides | ||
April 30 - Tue | Lecture | Steffen | 2-spark-intro.pptx 2-spark-intro.pdf (updated May 7, 2019) | ||
May 3 - Fr | Tutorial | Daniel | slides | ||
May 7 - Tue | Lecture | Steffen | 3-OLAP.pptx 3-OLAP.pdf | ||
May 10 - Fr | Tutorial | Daniel | slides | ||
May 14 - Tue | Lecture | Steffen | 3-OLAP.pptx (updated May 20, 2019) 3-OLAP.pdf (updated May 20, 2019) | ||
May 17 - Fr | Tutorial | Daniel | slides | ||
May 21 - Tue | Lecture | Steffen | 4-Joins+More.pptx 4-Joins+More.pdf | ||
May 24 - Fr | Tutorial | Daniel | slides | ||
May 28 - Tue | Lecture | Claudia | 4-Joins+More.pptx 4-Joins+More.pdf 5-NoSQL.pptx 5-NoSQL.pdf | ||
May 31 - Fr | Tutorial | Daniel | slides | ||
June 4 - Tue | Lecture | Claudia | |||
June 7 - Fr | Tutorial | Daniel | slides | ||
June 11 - Tue | No lecture / tutorial - public holiday | ||||
June 14 - Fr | No lecture / tutorial - public holiday | ||||
June 18 - Tue | Lecture | Steffen | 6-Distribution+Consistency.pptx 6-Distribution+Consistency.pdf | ||
June 21 - Fr | Tutorial | Daniel | slides | ||
June 25 - Tue | Lecture | Steffen | 7-ScalableAnalysis-AlgebraicModeling.pptx (updated June 28, 2019) 7-ScalableAnalysis-AlgebraicModeling.pdf (updated June 28, 2019) | ||
June 28 - Fr | Tutorial | Daniel | slides | ||
July 2 - Tue | Lecture | Claudia | 8-AdvancedAnalytic-and-ML-spark.pptx 8-AdvancedAnalytic-and-ML-spark.pdf | ||
July 5 - Fr | Tutorial | Daniel | slides | ||
July 9 - Tue | Lecture | Steffen | 9-streaming.pptx 9-streaming.pdf | ||
July 12 - Fr | Tutorial | Daniel | slides | ||
July 16 - Tue | Q&A | Steffen | |||
July 19 - Fr | Q&A | Daniel | slides | ||
Plan
- Overview
- Cloud computing
- Spark and Map-Reduce
- OLAP
- Algorithm Engineering for Big Data
- NoSQL Stores
- Graph Stores