Developing a Dataset Retrieval Model

Finding suitable datasets is a major task in many scientific fields. The social sciences in particular rely heavily on existing studies: creating new data with hundreds of participants from different countries over longer periods of time is resource-intensive, so the reuse and recombination of existing studies is crucial. Finding relevant datasets, however, is a complicated task. Datasets themselves contain little to no textual information that would help a researcher decide whether she or he could use the data. Most data repository services rely on metadata such as tags to fill this textual gap, but that metadata is often too high-level.

In this talk, we present our DFG proposal, prepared together with the Leibniz-Institut für Sozialwissenschaften, to develop a general retrieval model for datasets, applied to and evaluated on social science datasets. The talk presents the problem in greater detail, explains the project plan, and discusses our approach to solving the problem with a combination of semantic web technologies and statistical models.

15.09.16 - 10:15
B 016