Wikidata, the free knowledge base of Wikipedia, is one of the largest collections of human-authored structured information freely available on the Web. It is curated by a unique community of tens of thousands of editors who contribute in up to 400 different languages. Data is stored in a language-independent way, so that most users can access information in their native language. To support plurality, Wikidata uses a rich content model that gives up on the idea that the world can be described as a set of “true” facts. Instead, statements in Wikidata carry additional context information, such as temporal validity and provenance (in particular, most statements in Wikidata already provide one or more references).
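To make the content model more concrete, the following is a minimal sketch of how an item with multilingual labels and context-carrying statements might be represented. It is an illustration only, not Wikidata's actual data format or API: the class names, field names, and all identifiers (`Q_CITY`, `P_MAYOR`, `Q_PERSON_A`, `Q_PERSON_B`) are hypothetical placeholders, and temporal validity is simplified to whole years.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class Statement:
    """One claim about an item, plus its context (all names are hypothetical)."""
    property_id: str
    value: Any
    start_year: Optional[int] = None   # temporal validity; None means open-ended
    end_year: Optional[int] = None
    references: List[str] = field(default_factory=list)  # provenance

    def valid_in(self, year: int) -> bool:
        # Bounds are inclusive; a missing bound never excludes the year.
        after_start = self.start_year is None or self.start_year <= year
        before_end = self.end_year is None or year <= self.end_year
        return after_start and before_end


@dataclass
class Item:
    """A language-independent item: the ID is stable, labels vary by language."""
    item_id: str
    labels: Dict[str, str]
    statements: List[Statement] = field(default_factory=list)

    def label(self, lang: str, fallback: str = "en") -> str:
        # Fall back to another language, then to the bare ID.
        return self.labels.get(lang, self.labels.get(fallback, self.item_id))

    def values_in(self, property_id: str, year: int) -> List[Any]:
        # Several statements may hold at once; none is singled out as "the" fact.
        return [
            s.value
            for s in self.statements
            if s.property_id == property_id and s.valid_in(year)
        ]


# Placeholder data, not real Wikidata content.
city = Item(
    item_id="Q_CITY",
    labels={"en": "Example City", "de": "Beispielstadt"},
    statements=[
        Statement("P_MAYOR", "Q_PERSON_A", start_year=2001, end_year=2009,
                  references=["hypothetical source A"]),
        Statement("P_MAYOR", "Q_PERSON_B", start_year=2009,
                  references=["hypothetical source B"]),
    ],
)
```

In this sketch, `city.values_in("P_MAYOR", 2005)` yields only the first value, while a query for 2009 yields both, since the two statements coexist rather than one overwriting the other. Likewise, `city.label("de")` and `city.label("fr")` read the same item through different languages, with the latter falling back to English.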
One could easily imagine this leading to a rather chaotic pile of disjointed facts that are hard to use or even navigate. However, large parts of the data are interlinked with international authority files, catalogues, databases, and, of course, Wikipedia. Moreover, the community strives to reach “global” agreement on how to organise knowledge: over 1,000 properties and tens of thousands of classes currently serve as the ontology of the system, and many aspects of this knowledge model are discussed extensively in the community. Together, this leads to a multilingual knowledge base of increasing quality that has many practical uses.
This talk gives an overview of the project, explains design choices, and discusses emerging developments and opportunities related to Wikidata.