Big data is an emerging technology that now appears in almost every new scientific and industrial work related to data science and applied scalable analytics. It brings massive scalability to analytics, including machine learning approaches. Scalable machine learning libraries such as Apache Mahout have been built on top of big data platforms using MapReduce and Hadoop. According to the Gartner Hype Cycle of 2016, most emerging technologies of the next 10 years rely on innovations made by big data technologies.
This seminar is organized by Prof. Dr. Steffen Staab and Dr. Mahdi Bohlouli. It deepens students' knowledge by providing an introduction to big data analysis, MapReduce, the Hadoop Distributed File System, NoSQL and NewSQL databases, and scalable machine learning. The seminar consists of a theoretical part, a practical (research) part and a student presentation part. The theoretical part covers the first 5 weeks of the course and is followed by implementation and research work carried out by the students. The practical part may include some small implementation exercises. Students will present their results in individual talks.
The seminar is intended for master's students in Web Science, Computer Science and related fields. It will be held in English and will consist of individual student talks. Each student will research, implement and prepare one topic in the area of Big Data Analytics, give a 30-minute presentation on it, and write a 12-page technical report that includes the results of the practical work. For both the presentation and the technical report, it is recommended to follow the guidelines outlined in the document "Seminar Presentations and Technical Reports - Guidelines".
Prior knowledge of big data analytics is not required to take part in the seminar. However, attending the Data Science course is recommended. Students who are unsure whether they meet this recommendation can contact the tutors.
- Introductory meeting: April 18th, 08:30 AM, room number E523.
- Seminar block: at the end of the lecture period
Registration both by email and in the Klips system is mandatory. Please send a short email to Dr. Mahdi Bohlouli for further information, and register through Klips via the following link (https://klips.uni-koblenz-landau.de/v/94069).
 Mohammed Guller, Big Data Analytics with Spark: A Practitioner's Guide to Using Spark for Large Scale Data Analysis, Apress, 2015.
 Bello-Orgaz, Gema, Jason J. Jung, and David Camacho. "Social big data: Recent achievements and new challenges." Information Fusion 28 (2016): 45-59.
 Landset, Sara, et al. "A survey of open source tools for machine learning with big data in the Hadoop ecosystem." Journal of Big Data 2.1 (2015): 24.