Data science (cf. the Wikipedia definition of data science) describes an approach to problems that draws on a set of capabilities not rooted in any single classical community; instead, these capabilities cut across disciplines such as physics, biology, the social sciences, and economics. It relies on advanced computer science paradigms and requires a background in statistics. Its results feed both the new and the classical economy, as well as the medical field.
(Preliminary) Lecture Schedule
|19.04.||Intro to DS||Claudia|
|26.04.||Probability Theory and Statistics||Claudia|
|Exercise: Python stats||Claudia|
|03.05.||Hypothesis testing & Descriptive Statistics||Claudia|
|Exercise: Probability Theory, p-values||Claudia|
|Exercise: Hyp testing||Claudia|
|17.05.||Hypothesis testing & Nonparametric Stats||Claudia|
|31.05.||Relationships & Regression||Claudia|
|28.06.||Graphical Models 1||Claudia|
|05.07.||Graphical Models 2||CCK|
|12.07.||Sampling Methods? Inference Methods?||CCK|
|19.07.||Visualizations and Telling Data Stories||Claudia|
The exercises will be done in groups of X students. To be admitted to the exam, a group must submit solutions to all but one exercise. For this purpose, each group will get its own SVN repository.
Programming will be done in IPython, using IPython notebooks :)
- Vasant Dhar. Data Science and Prediction. In: Communications of the ACM, December 2013, Vol. 56, No. 12, pp. 64-73
- Anand Rajaraman, Jeffrey Ullman, Jure Leskovec, Mining of Massive Datasets, Cambridge University Press (free download)
- Jeffrey Stanton, Introduction to Data Science (free download)
- John Hopcroft. Foundations of Data Science.
- http://www.wolframscience.com/thebook.html
- Alon Halevy, Peter Norvig, Fernando Pereira. The Unreasonable Effectiveness of Data. In: IEEE Intelligent Systems, March/April 2009.