Institute for Web Science and Technologies · Universität Koblenz - Landau
This course is from a past semester. If you are looking for current courses, go to the course overview.

Data Science

Winter Term 2018/2019

Lecture:
2:15 pm - 3:45 pm G310

Topics that will be covered in this course:

  • Data Collection Methods and Ethics
  • Data Analytics: Mainly Statistics & Probability Theory (Descriptive Statistics, Bayesian versus Frequentist Thinking, Statistical Inference, Causal Inference)
  • Data Visualizations, Interpretations and Data Storytelling

Exercise:
4:00 pm - 5:30 pm E313

  • You will get hands-on experience and learn practical data science skills (e.g. how to do data science in Python).
  • Pen-and-paper exercises, small programming exercises, and reading homework.
  • Try to solve the problems yourself before the exercise class. This serves as a self-assessment (how well have I understood the content of the lecture?), and you can use the exercise class to ask open questions. You do not have to hand in exercise sheets, but bring them to class; we correct them together during class.

 

Lecture Schedule and Materials

Date LECTURE (14:15) EXERCISE (16:00)
24.10 Introduction to Data Science (pdf) Python tutorial, Pandas tutorial
31.10 Descriptive Stats & Probability (pdf) slides, thinkstats, roulette, dice
07.11 Data Collection Methods & Ethics (pdf) slides, HTML parsing, Web scraping, sample HTML, sample wiki
14.11 Sampling Distributions & Confidence Intervals (pdf) slides, homework exercise sheet
21.11 Parameter Estimation (pdf) slides, homework on slide 8
28.11 GMM, K-Means, Graphical Models (pdf) sampling distribution, sampling simulation, probability distributions, homework
05.12 Hypothesis testing (pdf) slides, Distribution fitting, homework - due 19.12
12.12 Non Param Tests (pdf) & Regressions (pdf) homework - due 19.12
19.12 Tutorial for hypothesis and non parametric homework - due 09.01 (last 3 questions due 16.01)
26.12 Holiday  
02.01 Holiday  
09.01 Causal relations (pdf) unstatistik slides, 09-regression.ipynb
16.01 Bayesian Stats (pdf) Bayesian Stats 
23.01 Bayesian Stats 2 (pdf) notebooks, slides-bayesian, RDD-GPA notebook, RDD-school notebook
30.01 No lecture Exam prep, notebook-solution
06.02 Data Visualisations & Data Stories No Tutorial
08.02 First Exam 4pm D028  
13.03 First Exam Review 2pm IMPORTANT: Location: GESIS, Cologne (GESIS Address). Please send an email to Daniel.Kostic@gesis.org if you are coming to this review.
27.03 Second Exam 2pm D028  
27.03 First Exam Review 3:30 pm B006
24.04 Second Exam Review 13.00-14.00 B006  

 

Prerequisites:
A basic understanding of programming that will allow you to manipulate data and implement basic algorithms. Python will be the “official” programming language used during the hands-on sessions. We will use IPython Notebook as the environment. A basic understanding of statistics and algebra will help too.
 

Books & Learning Material:

  • Think Stats: Probability and Statistics for Programmers by Downey (available for FREE as pdf)
  • Grinstead and Snell’s Introduction to Probability (FREE pdf) or A Modern Introduction to Probability and Statistics (pdf)
  • Dive into Python (FREE) or Python Data Science Handbook by VanderPlas (buy online ~30 EUR, pdf)
  • Storytelling With Data: A Data Visualization Guide for Business Professionals by Nussbaumer Knaflic (~30 EUR)
  • Computer Age Statistical Inference by Efron and Hastie (FREE pdf)
  • Pattern Recognition and Machine Learning by Bishop (Springer, ~75 EUR)

Lecturers

  • clwagner@uni-koblenz.de
  • Professor
  • B 006