Remark: This research lab will be organised together with the Projektpraktikum on Linked Data Analytics.
Linked Open Data
Linked Data is the collective term for data that follows a paradigm based on four simple rules: (a) use URIs to identify entities, (b) use HTTP URIs so that entities can be dereferenced and looked up, (c) provide useful information at these URIs using established standards (e.g. RDF, SPARQL) and (d) link to other entities to allow for further exploration. To fully qualify as Linked Open Data (LOD), the data should additionally be publicly available on the web under an open license.
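The four rules can be illustrated with a minimal sketch in Python. It assumes that RDF statements are held in memory as plain (subject, predicate, object) tuples; the `http://example.org/` entities, the sample data and the helper names `describe` and `linked_entities` are made up for illustration, and no data is actually fetched from the web.

```python
# Well-known vocabulary URIs (rdf:type and the FOAF namespace).
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
FOAF = "http://xmlns.com/foaf/0.1/"

# Hypothetical entities; example.org is used purely for illustration.
EX = "http://example.org/"

# Each statement is a (subject, predicate, object) triple.
# Literals are written with surrounding quotes to tell them apart from URIs.
triples = [
    (EX + "alice", RDF_TYPE, FOAF + "Person"),
    (EX + "alice", FOAF + "name", '"Alice"'),
    (EX + "alice", FOAF + "knows", EX + "bob"),
    (EX + "bob", RDF_TYPE, FOAF + "Person"),
    (EX + "bob", FOAF + "name", '"Bob"'),
]


def is_http_uri(term):
    """Rules (a) and (b): entities are identified by HTTP URIs."""
    return term.startswith("http://") or term.startswith("https://")


def describe(entity, triples):
    """Rule (c): the information stated about an entity."""
    return [(p, o) for s, p, o in triples if s == entity]


def linked_entities(entity, triples):
    """Rule (d): other entities reachable by following one link.

    Type statements and literal values are not treated as links.
    """
    return sorted(
        {o for s, p, o in triples
         if s == entity and p != RDF_TYPE and is_http_uri(o)}
    )


print(describe(EX + "bob", triples))
print(linked_entities(EX + "alice", triples))
```

In a real setting the description of an entity would be obtained by dereferencing its URI over HTTP and parsing the returned RDF, rather than by filtering a local list.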
By making use of semantic web formats, LOD implements the vision of a web of data. On the one hand, the underlying technologies allow for a globally unique identification of entities via URIs and give clear semantics to the relations modelled by the links between entities. As a consequence, LOD should be easier to integrate and reuse in multiple contexts and applications. On the other hand, the underlying technologies are well established in a web context. This means that many mature tools, programs and libraries are available for working with LOD.
LOD has seen tremendous growth in recent years. The resulting distributed graph of interlinked entities on the web is commonly referred to as the LOD cloud and spans hundreds of data sources providing billions of RDF triples.
Topic for the Lab
The aim of the research lab is to detect, track and analyse patterns in Linked Open Data. The general idea is to assume a process which continuously explores and retrieves Linked Data from the web. Novel data discovered in such a process will be analysed continuously. A first analytical process will identify patterns in the observed data on the schema level. Such patterns can be information about entities of a certain type or about certain relations between entities. This will lead to clusters of homogeneous data which, in a next step, can be analysed for patterns on the entity level. Such patterns could be a correlation of particular values or the prediction of features based on other observations. The choice of methods applicable in the second phase, which operates on the entity level, depends on the schema-level patterns identified in the first phase. A suitable description of the methods and their dependencies on specific types of input data will formalise such constraints. Finally, a reporting component will summarise the findings as well as the overall process that led to the detected patterns.
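The two phases can be sketched in Python as follows. This is only an illustration under stated assumptions, not the design used in the lab: triples are plain tuples with abbreviated names, the `ex:` entities and their values are toy data, a schema-level pattern is taken to be the set of types plus the set of properties of an entity, and the `METHODS` registry with its single Pearson correlation entry is a hypothetical stand-in for the description of methods and their input requirements.

```python
from collections import defaultdict
from math import sqrt

RDF_TYPE = "rdf:type"  # abbreviated for readability

# Toy observations (made-up values, not real data).
TRIPLES = [
    ("ex:c1", RDF_TYPE, "ex:City"), ("ex:c1", "ex:population", 10), ("ex:c1", "ex:area", 1),
    ("ex:c2", RDF_TYPE, "ex:City"), ("ex:c2", "ex:population", 20), ("ex:c2", "ex:area", 2),
    ("ex:c3", RDF_TYPE, "ex:City"), ("ex:c3", "ex:population", 30), ("ex:c3", "ex:area", 3),
    ("ex:p1", RDF_TYPE, "ex:Person"), ("ex:p1", "ex:name", "Alice"),
]

# Hypothetical method descriptions: each method names the properties it needs.
METHODS = {"pearson_correlation": {"ex:population", "ex:area"}}


def cluster_by_schema(triples):
    """Phase 1: group entities that share the same types and properties."""
    by_entity = defaultdict(list)
    for s, p, o in triples:
        by_entity[s].append((p, o))
    clusters = defaultdict(list)
    for entity, statements in by_entity.items():
        types = frozenset(o for p, o in statements if p == RDF_TYPE)
        props = frozenset(p for p, o in statements if p != RDF_TYPE)
        clusters[(types, props)].append(entity)
    return dict(clusters)


def applicable_methods(signature):
    """Select the methods whose required properties the cluster provides."""
    _types, props = signature
    return [name for name, required in METHODS.items() if required <= props]


def property_values(entities, prop, triples):
    """Collect one value per entity (assumes the property is single-valued)."""
    lookup = {(s, p): o for s, p, o in triples}
    return [lookup[(entity, prop)] for entity in entities]


def pearson(xs, ys):
    """Phase 2 example: Pearson correlation of two value lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)


for signature, entities in cluster_by_schema(TRIPLES).items():
    for method in applicable_methods(signature):
        xs = property_values(entities, "ex:population", TRIPLES)
        ys = property_values(entities, "ex:area", TRIPLES)
        print(sorted(signature[0]), method, pearson(xs, ys))
```

Only the cluster of city entities offers both numeric properties, so the correlation is computed for that cluster alone, while the person cluster is skipped. A reporting component could record exactly this chain of decisions alongside the result.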
More details will be given in the introduction presentation (see below).
We will have a first informative meeting about the research lab on
Tuesday, 10 February 2015 at 14.00 (s.t.) in room B-110
Students interested in joining the lab are invited to come to the introduction presentation. As the total number of participants is limited, places will be allocated on a first-come, first-served basis.
Research lab - Linked Data Analytics
|Lecturer||Dr. Thomas Gottron|
|Time and place||Thu. 12:00 - 14:00, E 524|