Large Network Collections: The Power of Many Datasets
In this talk, I will present several analysis methods that can be applied to network dataset collections containing a large number of different datasets. Although network datasets (both social and otherwise) are used in a large fraction of studies in data mining and other areas, many papers base their work on only a single dataset or a small number of datasets. While this is adequate for answering research questions about a specific community or a specific dataset, the use of individual datasets cannot give insight into network analysis problems as a whole. In particular, statements such as "social networks are scale-free", "hyperlink networks have larger diameters than social networks" and "the matrix exponential is a good algorithm for predicting the formation of new links" cannot be verified in a generalizable way using only a few datasets. This talk will therefore review several recent results from research performed at the Institute for Web Science and Technologies (WeST) at the University of Koblenz-Landau based on the Koblenz Network Collection (KONECT), a collection of 230+ network datasets of varying sizes, covering many different graph types and application areas. These results cover the verification of well-known graph models, structural differences between graphs from different categories (for example, hyperlink vs. social networks), and the performance of algorithms for link prediction and similar tasks.
06.10.16 - 10:15