Institute for Web Science and Technologies · Universität Koblenz



Koldfish is a system that supports application developers in implementing many use cases of the Web of Data. We aim to provide a suite of APIs for accessing, querying and processing linked data, while handling problems such as sparse schema information, data quality, provenance and availability on behalf of the application developer.

When designing an application that accesses and processes the Linked Open Data (LOD) cloud, a developer faces several problems: the application may need to rely on schema information that is not explicitly present in all the relevant data sources; there can be issues with data quality; the application may need to make use of provenance information; or data sources may be unavailable.

Imagine Alice, who wants to develop an application that allows users to search for interesting places to visit based on topical keywords. What issues would she need to deal with? First, she would have to hand-pick relevant data sources. Then she would need to devise a schema-based scheme for mapping keywords to queries, and prepare and map the actual queries. After creating a suitable user interface, she could align it with the corresponding queries. The result may look great, but this is non-trivial work.

Given the nature of the LOD cloud, what will happen as it evolves? What if schemata evolve? What if the availability of data sources varies? How could Alice be helped in maintaining her application?

The purpose of Koldfish is to make life easier for developers like Alice and let them abstract away from recurring LOD management issues.

Key features of Koldfish are:

  • service-oriented middleware for Linked Data applications
  • RESTful APIs for accessing service functionality
  • automatic schema extraction for query support
  • design time support through schema-based data space exploration
  • built-in data quality management

Koldfish offers a number of services that can be used through REST APIs. A crawler feeds live data from the LOD cloud to its subscribers. This data is stored by the data service for later retrieval. For remote dereferencing of IRIs, the data service forwards the respective requests to a data access module, which in turn accesses data sources of the LOD cloud directly. A provenance service allows for retrieving the provenance information the system has acquired. For schema-based data access and querying, a schema service automatically extracts and maintains type hierarchy and relationship information and creates an index pointing to relevant data statements and sources. Lastly, a quality service assesses and manages the quality of the data hosted by the data service.
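
To make the role of the schema service more concrete, the following is a minimal Python sketch of the general idea: collecting type hierarchy information from crawled statements and indexing which sources hold instances of which types. It is not the Koldfish API. The class SchemaIndex, its methods and the example.org IRIs are hypothetical names chosen for illustration, and the sketch assumes that each statement arrives together with the IRI of the source it was crawled from.

```python
from collections import defaultdict

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_SUBCLASS_OF = "http://www.w3.org/2000/01/rdf-schema#subClassOf"


class SchemaIndex:
    """Hypothetical in-memory stand-in for a schema service (not the Koldfish API)."""

    def __init__(self):
        # class IRI -> set of direct superclass IRIs
        self.superclasses = defaultdict(set)
        # class IRI -> set of source IRIs that contain instances of the class
        self.sources_by_type = defaultdict(set)

    def add_statement(self, subject, predicate, obj, source):
        """Consume one statement plus the source it came from (assumed input shape)."""
        if predicate == RDFS_SUBCLASS_OF:
            self.superclasses[subject].add(obj)
        elif predicate == RDF_TYPE:
            self.sources_by_type[obj].add(source)

    def subclasses_of(self, cls):
        """Return cls together with all of its transitive subclasses."""
        found = {cls}
        changed = True
        while changed:
            changed = False
            for sub, supers in self.superclasses.items():
                if sub not in found and supers & found:
                    found.add(sub)
                    changed = True
        return found

    def sources_for(self, cls):
        """Return the sources holding instances of cls or any of its subclasses."""
        result = set()
        for c in self.subclasses_of(cls):
            result |= self.sources_by_type.get(c, set())
        return result


# Illustrative data only; all example.org IRIs are made up.
EX = "http://example.org/"
index = SchemaIndex()
index.add_statement(EX + "Museum", RDFS_SUBCLASS_OF, EX + "Place", EX + "schema")
index.add_statement(EX + "louvre", RDF_TYPE, EX + "Museum", EX + "a")
index.add_statement(EX + "eiffel", RDF_TYPE, EX + "Place", EX + "b")

print(sorted(index.sources_for(EX + "Place")))
# prints ['http://example.org/a', 'http://example.org/b']
```

In a scenario like Alice's, such an index would let an application ask which sources are worth querying for a given type without hand-picking them, and the answer would follow the data as the hierarchy and the sources change.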


The Koldfish project has been renamed from SEPAL, but retains the ideas presented in the SEPAL project description.


  • Operating Time: 05/2015 - 12/2017