Semantic Relevance Distance - A Stable Metric for Computing Semantic Relatedness over Reference Corpora
We propose Semantic Relevance Distance (SRD), a novel metric for computing semantic relatedness between terms. SRD uses a controlled reference corpus to analyze the relatedness of terms statistically. It combines the relevance weights of terms in documents with the joint occurrence of terms to identify a correlation of importance between two terms. Our hypothesis is that a higher correlation indicates a higher semantic relatedness. We demonstrate that SRD outperforms state-of-the-art approaches for computing semantic relatedness on established reference datasets. Furthermore, we show that the quality of SRD depends less on the choice of reference corpus than that of comparable approaches.
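The abstract does not give the SRD formula, so the following is only a rough sketch of the general idea it describes, not the authors' definition. It assumes plain TF-IDF as the relevance weight, restricts the comparison to documents in which both terms occur, and uses Pearson correlation as the "correlation of importance". The function names (`tf_idf_weights`, `pearson`, `relatedness_distance`) and the mapping from correlation to a distance in [0, 1] are hypothetical choices made for illustration.

```python
import math
from collections import Counter


def tf_idf_weights(docs):
    """Return one {term: weight} dict per tokenized document (plain TF-IDF).

    TF-IDF is an assumed stand-in for the relevance weight in the abstract.
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        term_freq = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in term_freq.items()
        })
    return weights


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences; 0.0 if undefined."""
    n = len(xs)
    if n < 2:
        return 0.0
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def relatedness_distance(term_a, term_b, weights):
    """Hypothetical distance: 1 minus the non-negative weight correlation.

    Only documents containing both terms (joint occurrence) are compared.
    Terms that never co-occur get the maximum distance of 1.0.
    """
    joint = [(w[term_a], w[term_b]) for w in weights
             if term_a in w and term_b in w]
    if not joint:
        return 1.0
    r = pearson([a for a, _ in joint], [b for _, b in joint])
    return 1.0 - max(r, 0.0)


if __name__ == "__main__":
    corpus = [
        ["cat", "dog", "dog", "pet"],
        ["cat", "cat", "dog", "pet", "pet"],
        ["car", "road", "road"],
        ["cat", "dog", "vet"],
    ]
    w = tf_idf_weights(corpus)
    print(relatedness_distance("cat", "dog", w))
    print(relatedness_distance("cat", "road", w))
```

Under these assumptions, a smaller value means the two terms tend to be important in the same documents. How well such a sketch tracks human relatedness judgments, and how sensitive it is to the reference corpus, are exactly the questions the work above evaluates for SRD itself.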
09.01.14 - 09:15