In anticipation of RDF graphs exceeding one trillion triples, the W3C tested RDF stores whether they can deal with such huge graphs. This amount of data can be stored in a cloud at a reasonable price. But storing a graph in a cloud consisting of several individual computers raises several issues like the triple placement or the efficient processing of interactive queries.
Distributed RDF stores try to cope these difficulties. They distribute the graph by assigning triples to the different computers. If a query is processed in such a system, it can be executed on the local graph chunks of each computer in parallel. But the smaller chunks have the disadvantage that some queries require the combination of triples from different chunks to produce their results. These inter-chunk queries lead to additional network traffic since the partial results have to be sent to a specific computer that produces the complete result. This network traffic must be kept to a minimum to render this approach feasible technically and economically.
In this talk, a query execution mechanism is presented, executes the query in parallel and reduces the network traffic. Furthermore, it is independent of the graph distribution strategy enabling an impact analysis of the graph distribution strategy on the query execution effort.