RDF graphs are sets of triples, each consisting of a subject, a predicate (property), and an object resource. Nowadays, RDF graphs can consist of trillions of triples. In order to cope with these huge graphs, distributed RDF stores combine the computational power of several compute nodes. To query the RDF graph stored in such distributed RDF stores efficiently, statistical information about the occurrences of resources on the different compute nodes is required. A naive way to store this statistical information is a table stored in a single random access file. This naive implementation leads to files that are several tens of gigabytes in size and thus to slow read and write operations.
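The statistical information in question can be pictured as one occurrence counter per resource and compute node. The following is a minimal sketch of that idea; all names and the in-memory representation are illustrative, since the abstract does not fix a concrete format.

```python
# Minimal sketch of the statistics a distributed RDF store might keep:
# per compute node, how often each resource occurs. A naive on-disk
# table would materialise one row per resource with one counter column
# per node, which is what grows to tens of gigabytes.
from collections import defaultdict

NUM_NODES = 4  # hypothetical number of compute nodes

# one counter per node for every resource seen so far
occurrences = defaultdict(lambda: [0] * NUM_NODES)

def record_triple(subj, pred, obj, node):
    """Count one occurrence of each resource of a triple on a node."""
    for resource in (subj, pred, obj):
        occurrences[resource][node] += 1

record_triple("ex:alice", "foaf:knows", "ex:bob", node=0)
record_triple("ex:bob", "foaf:knows", "ex:carol", node=2)

print(occurrences["foaf:knows"])  # [1, 0, 1, 0]
```

Note that frequent predicates such as `foaf:knows` occur on many nodes while most subjects occur on few, which is why a dense table wastes so much space compared to a sparse encoding.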
While RDF in particular and graph-based data models in general have gained traction in the last few years, programming with them is still error-prone. In the case of RDF, part of the problem is the lack of integrity constraints that can make guarantees about the data. The recently introduced W3C standard SHACL can now provide such integrity constraints in the form of so-called SHACL shapes. The talk presents an approach for integrating them into a programming language in order to avoid run-time errors. In particular, we use SHACL shapes as types for programming language constructs and for queries that access the RDF data.
In this seminar, I shall introduce a (relatively) novel way to characterize the macroscopic states of a dynamic model - the XY spin model - on networks. The method is based on the spectral decomposition of time series by using topological information about the underlying networks.
Governmental geospatial data are usually considered a de jure gold standard by official authorities and by many companies working with geospatial data. Yet, official geospatial data are far from perfect. Such datasets are updated only at long intervals (e.g. yearly), only selectively according to the judgements and regulations of the local governmental organizations, and are propagated to the state and federal level in a migratory process. However, volunteered geographic information projects such as OpenStreetMap can provide an alternative, both in freshness of data and possibly in broader coverage of semantic attributes in the respective areas.
First-order theorem proving with large knowledge bases makes it necessary to select those parts of the knowledge base that are necessary to prove the theorem at hand. We propose to extend syntactic axiom selection procedures like SInE to make use of the semantics of symbol names. To this end, not only occurrences of symbol names but also similar names are taken into account. We propose to use a similarity measure based on word embeddings like ConceptNet Numberbatch. An evaluation of this similarity-based SInE is given using problem sets from TPTP's CSR problem class and from Adimen-SUMO. This evaluation is done with two very different systems, namely the HYPER tableau prover and the saturation-based system E.
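The selection idea can be sketched as a SInE-style trigger loop in which an axiom is selected when it contains a symbol *similar* to an already relevant one, not only an identical one. The talk uses word-embedding similarity (ConceptNet Numberbatch); as a self-contained stand-in, this sketch uses `difflib` string similarity, and axioms are modelled simply as sets of symbol names.

```python
# Hedged sketch of similarity-based axiom selection in the spirit of
# SInE. Embedding similarity is replaced by difflib string similarity
# so the example stays self-contained; the data is illustrative.
from difflib import SequenceMatcher

def similar(a, b, threshold=0.7):
    """Stand-in for embedding-based similarity of two symbol names."""
    return SequenceMatcher(None, a, b).ratio() >= threshold

def select_axioms(goal_symbols, axioms, depth=2):
    """Iteratively select axioms that contain a symbol similar to an
    already relevant symbol, up to the given trigger depth."""
    relevant = set(goal_symbols)
    selected = set()
    for _ in range(depth):
        for i, syms in enumerate(axioms):
            if i in selected:
                continue
            if any(similar(s, r) for s in syms for r in relevant):
                selected.add(i)
                relevant |= syms  # newly selected symbols trigger further axioms
    return sorted(selected)

axioms = [
    {"human", "mortal"},        # 0: triggered via "humans" ~ "human"
    {"mortal", "finite_life"},  # 1: triggered once "mortal" is relevant
    {"planet", "orbit"},        # 2: never triggered
]
print(select_axioms({"humans"}, axioms))  # [0, 1]
```

A purely syntactic SInE would miss axiom 0 here, because "humans" and "human" are distinct symbols; the similarity measure bridges exactly this gap.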
In this talk I will present the computation of inconsistency for planning problems. Analogous to an inconsistency measure, a function maps a planning problem onto a number ranging from 0 to 1. If the planning problem is solvable, i.e. consistent, it is mapped to 0. An unsolvable planning problem, in turn, can be regarded as inconsistent. When comparing two inconsistent planning problems, one can turn out to be more inconsistent than the other; thus planning problems can be measured as with an inconsistency measure and ordered by their values. The measurement can be done by stepwise removal of the inconsistency or by calculating directly on the outcome of the search on the planning problem.
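To make the 0-to-1 mapping concrete, here is a deliberately simplified toy measure: it grades a (delete-free) planning problem by the fraction of goal facts that no reachable state can satisfy, so a solvable problem maps to 0. This is only an illustration of the idea of an inconsistency degree; the talk's actual measure is not specified here.

```python
# Toy inconsistency degree for planning problems (illustrative only):
# fraction of goal facts unreachable under delete-free semantics.
# 0.0 means every goal fact is reachable, i.e. consistent/solvable here.

def reachable_facts(init, actions):
    """Fixpoint of applying actions (precondition set, add set)."""
    facts = set(init)
    changed = True
    while changed:
        changed = False
        for pre, add in actions:
            if pre <= facts and not add <= facts:
                facts |= add
                changed = True
    return facts

def inconsistency_degree(init, actions, goal):
    facts = reachable_facts(init, actions)
    unreachable = [g for g in goal if g not in facts]
    return len(unreachable) / len(goal)

actions = [({"at_home"}, {"at_shop"}), ({"at_shop"}, {"has_milk"})]
print(inconsistency_degree({"at_home"}, actions, {"has_milk"}))          # 0.0
print(inconsistency_degree({"at_home"}, actions, {"has_milk", "rich"}))  # 0.5
```

The second call shows the ordering mentioned above: a problem whose goal contains the unreachable fact `rich` is strictly more inconsistent than the solvable one.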
This talk will introduce WeST's new DFG project "Open Argument Mining". The intent is to look for potential overlaps with other projects and for data that could be shared. The project's goal is to advance the state of the art in the research fields of argument mining and knowledge graph construction by (1) implementing a knowledge-aware lifelong learning approach, (2) aligning incomplete arguments with known ones and enriching them with background knowledge, and (3) automatically acquiring background knowledge by combining contemporary semantic knowledge bases with focused knowledge expansion.
Generative Adversarial Networks (GANs) are part of the family of generative models, which focus on producing samples that follow the probability distribution of the real dataset. The architecture is adversarial because it involves training two neural networks against each other. To make the training process more intuitive, the author of GANs, Ian Goodfellow, compared one network with a counterfeiter trying to create fake money and the other with the police, trying to distinguish counterfeit money from real money. This architecture has gained a lot of interest in the computer vision field, in areas such as image-to-image translation, image super-resolution, and text-to-image generation.
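The counterfeiter/police analogy corresponds to two opposed loss functions. The numeric sketch below shows just these objectives (not a full training loop): the discriminator minimises a binary cross-entropy that pushes real samples towards score 1 and fakes towards 0, while the generator here uses Goodfellow's non-saturating variant, maximising the score of its fakes. All concrete values are illustrative.

```python
# Sketch of the two adversarial objectives in a GAN. d_real / d_fake
# are the discriminator's scores in (0, 1) for a real and a generated
# sample; no networks are trained here, only the losses are evaluated.
import math

def discriminator_loss(d_real, d_fake):
    """Binary cross-entropy the discriminator minimises:
    real samples should score 1, generated samples 0."""
    return -(math.log(d_real) + math.log(1.0 - d_fake))

def generator_loss(d_fake):
    """Non-saturating generator loss: the generator wants its
    samples to be classified as real."""
    return -math.log(d_fake)

# A confident discriminator (real -> 0.9, fake -> 0.1) has a low loss,
# and the generator's loss is high because its fakes are caught.
print(round(discriminator_loss(0.9, 0.1), 3))  # 0.211
print(round(generator_loss(0.1), 3))           # 2.303

# If the generator improves so that fakes score 0.5, its loss drops.
print(round(generator_loss(0.5), 3))           # 0.693
```

Training alternates gradient steps on these two losses; at the theoretical equilibrium the discriminator outputs 0.5 everywhere, i.e. it can no longer tell counterfeit from real.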
Rumor detection and analysis is a complex and multi-dimensional problem in which the content, the users' communication strategies, or the diffusion of rumors can be taken into account. Several studies considering one of these dimensions have been conducted. However, the semantic dimension remains an important gap in existing studies.
Wikipedia is the biggest free online encyclopaedia and can be expanded by anyone. The users who create content on a specific Wikipedia language edition form a social network. In this social network, users are categorised into different roles: normal users, administrators, and functional bots. Within the network, a user can post reviews, suggestions, or simple messages on the "talk page" of another user. Each language edition in the Wikipedia domain has this type of social network. In this thesis, characteristics of the three roles are analysed in order to learn how they function in one Wikipedia language network, and these characteristics are then applied to another Wikipedia network to identify bots.