Author Extraction from Social Science Research Papers

Martin Körner

To help overcome the shortage of citation information for the German social sciences, we contribute an approach for extracting author names from reference sections. Instead of relying on small amounts of manually labeled data, we use a distantly supervised approach in combination with the widely used probabilistic framework of conditional random fields. The resulting model not only decides whether a word is part of an author name, but also separates the authors listed in a reference string and labels first and last names as such.
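
To illustrate what such token-level output makes possible, the following minimal sketch groups labeled tokens of a reference string into separate authors with first and last names. The tag set (B-LAST, I-LAST, B-FIRST, I-FIRST, O), the function name group_authors, and the example reference are assumptions made for illustration; the labels are written by hand here rather than predicted by a trained model, and the tag set actually used in the work may differ.

```python
def group_authors(tokens, labels):
    """Group labeled tokens into authors.

    Assumed (hypothetical) tag set:
      B-FIRST / B-LAST : first token of a new author (first or last name part)
      I-FIRST / I-LAST : further token of the current author
      O                : token is not part of an author name
    """
    authors = []
    current = None
    for token, label in zip(tokens, labels):
        if label == "O":
            continue
        prefix, part = label.split("-", 1)
        # A "B-" tag opens a new author; so does a stray "I-" tag with no open author.
        if prefix == "B" or current is None:
            current = {"first": [], "last": []}
            authors.append(current)
        current[part.lower()].append(token)
    # Join the collected tokens of each name part into a single string.
    return [
        {"first": " ".join(a["first"]), "last": " ".join(a["last"])}
        for a in authors
    ]


# Hand-labeled example; in practice the labels would come from the sequence model.
tokens = ["Müller", ",", "A.", ",", "Schmidt", ",", "B.", "(2010)"]
labels = ["B-LAST", "O", "I-FIRST", "O", "B-LAST", "O", "I-FIRST", "O"]

for author in group_authors(tokens, labels):
    print(author["last"], author["first"])
```

Encoding the author boundary in the tag prefix is one possible design: it lets a single sequence labeling pass both separate the authors and distinguish first from last names.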

04.08.16 - 10:15
B 016