Microtask crowdsourcing has become an extremely useful method for solving problems that are difficult for machines. However, there are still many challenges to overcome. One way to identify the problems that should be tackled is to pay attention to what the users of crowdsourcing have to say. In this thesis, I studied messages posted on Twitter and forums that referred to four major microtask crowdsourcing platforms. I analysed who posted the messages, the step of the crowdsourcing process they refer to, the purpose of the messages, and their sentiment. This is a first step toward identifying the strengths and weaknesses of crowdsourcing platforms.
As a free and collaborative knowledge base, created and maintained by volunteers, Wikidata may represent knowledge from different points of view. These points of view can be conflicting, especially when the data comes from different information sources.
Finding suitable datasets is a major task in many scientific fields. The social sciences in particular rely heavily on existing studies: creating new data with hundreds of participants from different countries over longer periods of time is resource-intensive, so the reuse and recombination of existing studies is crucial.
Wikimedia Deutschland emerged as an association from the Wikipedia movement and aims, among other things, to advocate for free knowledge. Free education is an integral part of free knowledge, yet it receives too little attention in the struggling Wikimedia projects Wikibooks and Wikiversity.
Semantic data fuels many different applications, but is still lacking proper integration into programming languages. Untyped access is error-prone while mapping approaches cannot fully capture the conceptualization of semantic data.
Part one gives an update on recent developments in the SemGIS project concerning automated approaches for mapping geospatial data into the Semantic Web.
The efficiency of SPARQL query evaluation against Linked Open Data may benefit from schema-based indexing. However, many data items come with incomplete schema information or lack schema descriptions entirely. We outline an approach to indexing linked data graphs based on schemata induced through Formal Concept Analysis.
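To illustrate the general idea of inducing a schema with Formal Concept Analysis, the following is a minimal sketch and not the approach described in the abstract. It assumes that subjects act as objects and the properties they use act as attributes; the function name `induce_concepts` and the example data are hypothetical. Each resulting concept pairs a set of properties (the intent) with the subjects that carry all of them (the extent), which could serve as a lookup structure.

```python
from typing import Dict, FrozenSet, Iterable, Set, Tuple

Triple = Tuple[str, str, str]


def induce_concepts(triples: Iterable[Triple]) -> Dict[FrozenSet[str], FrozenSet[str]]:
    """Map each concept intent (property set) to its extent (subject set).

    Assumption: the formal context is subject x property, i.e. a subject
    "has" an attribute if it appears with that property in some triple.
    """
    # Build the formal context: subject -> set of properties it uses.
    props: Dict[str, Set[str]] = {}
    for s, p, _ in triples:
        props.setdefault(s, set()).add(p)

    # Intents are all intersections of object intents, plus the full attribute set.
    all_attrs = frozenset(a for ps in props.values() for a in ps)
    intents = {all_attrs}
    for ps in props.values():
        intents |= {frozenset(ps) & intent for intent in intents}

    # The extent of an intent is every subject that has all of its properties.
    return {
        intent: frozenset(s for s, ps in props.items() if intent <= ps)
        for intent in intents
    }


example = [
    ("ex:a", "ex:name", "A"), ("ex:a", "ex:email", "a@example.org"),
    ("ex:b", "ex:name", "B"),
    ("ex:c", "ex:name", "C"), ("ex:c", "ex:email", "c@example.org"),
    ("ex:c", "ex:homepage", "http://example.org/c"),
]
concepts = induce_concepts(example)
```

A query that asks for subjects with both `ex:name` and `ex:email` could then read the extent of that intent directly instead of scanning all triples. This naive enumeration is only meant for small inputs.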
The Resource Description Framework (RDF) is a triple-based representation of directed graphs with labelled edges. With the emergence of RDF graphs, special databases called RDF stores were developed. To query the graphs stored in these RDF stores, the SPARQL Protocol and RDF Query Language (SPARQL) is used.
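As a small illustration of the triple model, the sketch below represents a graph as a set of (subject, predicate, object) tuples and matches a single triple pattern, which is the basic building block of a SPARQL query. It is a toy in plain Python, not an RDF store or a SPARQL engine; the `match` helper and the `ex:` identifiers are hypothetical.

```python
from typing import Iterator, Optional, Set, Tuple

Triple = Tuple[str, str, str]

# Each triple is one labelled edge: subject --predicate--> object.
graph: Set[Triple] = {
    ("ex:alice", "ex:knows", "ex:bob"),
    ("ex:bob", "ex:knows", "ex:carol"),
    ("ex:alice", "ex:name", "Alice"),
}


def match(
    g: Set[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> Iterator[Triple]:
    """Yield triples matching a pattern; None acts like a SPARQL variable."""
    for ts, tp, to in g:
        if (s is None or s == ts) and (p is None or p == tp) and (o is None or o == to):
            yield (ts, tp, to)


# Roughly analogous to: SELECT ?s ?o WHERE { ?s ex:knows ?o }
knows = sorted(match(graph, p="ex:knows"))
```

Here `knows` holds the two `ex:knows` edges. A real store would answer such patterns from indexes rather than a linear scan.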
Our objective is to identify and assess gender bias related to professions in the results of collaborative community work. We present the results of studies in which we characterize the gender inequality present along three dimensions: redirections, images, and people mentioned in the articles.
To help in overcoming a shortage of citation information for the German social sciences, we contribute an approach for extracting author names from reference sections. Instead of relying on small amounts of manually labeled data, we use a distantly supervised approach in combination with the widely used probabilistic framework of conditional random fields.