EXCITE Workshop 2017: “Challenges in Extracting and Managing References”

About

When: 30.03.2017 - 31.03.2017
Where: GESIS-Leibniz-Institut für Sozialwissenschaften, Unter Sachsenhausen 6-8, 50667 Cologne, Germany

EXCITE is a collaborative project of GESIS – Leibniz Institute for the Social Sciences and the Institute for Web Science and Technologies (WeST) that started in September 2016. The project develops a tool chain implementing the following steps: extraction of text from the source documents, identification of individual references in the text, segmentation of those references, matching of reference strings against bibliographic databases, and export of the matched references in usable formats and services. Special attention will be paid to the overall optimization of the individual components of the citation extraction.
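To make the five steps concrete, the following is a minimal sketch of how such a tool chain could be wired together. It is not the EXCITE implementation: every function name, the toy "Authors (Year). Title." pattern, the sample text, and the dictionary standing in for a bibliographic database are assumptions made purely for illustration.

```python
import json
import re
from dataclasses import asdict, dataclass
from typing import Dict, List, Optional


@dataclass
class Reference:
    raw: str
    authors: str = ""
    year: str = ""
    title: str = ""
    match_id: Optional[str] = None


def extract_text(document: str) -> str:
    """Step 1 (placeholder): a real system would read PDFs; here the input is already text."""
    return document


def identify_references(text: str) -> List[str]:
    """Step 2: treat each non-empty line after a 'References' heading as one reference string."""
    _, found, tail = text.partition("References\n")
    if not found:
        return []
    return [line.strip() for line in tail.splitlines() if line.strip()]


# Hypothetical toy pattern; it only covers "Authors (Year). Title." style strings.
_PATTERN = re.compile(r"^(?P<authors>.+?)\s+\((?P<year>\d{4})\)\.\s+(?P<title>.+?)\.?$")


def segment(raw: str) -> Reference:
    """Step 3: split a reference string into fields; fields stay empty if the pattern fails."""
    m = _PATTERN.match(raw)
    if m is None:
        return Reference(raw=raw)
    return Reference(raw=raw, authors=m["authors"], year=m["year"], title=m["title"])


def match(ref: Reference, database: Dict[str, str]) -> Reference:
    """Step 4: exact, case-insensitive title lookup in a toy 'database' (title -> record id)."""
    ref.match_id = database.get(ref.title.lower())
    return ref


def export(refs: List[Reference]) -> str:
    """Step 5: serialize the results as JSON."""
    return json.dumps([asdict(r) for r in refs], indent=2)


def run_pipeline(document: str, database: Dict[str, str]) -> List[Reference]:
    text = extract_text(document)
    return [match(segment(raw), database) for raw in identify_references(text)]


# Invented sample data, for illustration only.
SAMPLE = """Some body text.
References
Doe, J. (2015). A made-up title.
Unparseable reference string
"""
DATABASE = {"a made-up title": "rec-001"}

refs = run_pipeline(SAMPLE, DATABASE)
exported = export(refs)
```

Each stage here is deliberately naive. Keeping the stages as separate functions with a shared record type is one way to let individual components be swapped out and evaluated independently, which is in the spirit of the component-wise optimization mentioned above.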

Our first community meeting is planned as a “noon to noon” event and aims to bring together experts in reference extraction, text mining, and machine learning to explore the possibilities of the project. We plan scientific presentations by invited speakers on the first day and hands-on sessions on the second day. For the second day, we will release a test corpus (PDF files of scientific papers and manually annotated data) for developers.

Agenda

Day One (Thursday, 30.03.2017)




Time Title Speaker
11:00 Arrival (Room: West II)  
12:00 Welcome and Introduction Steffen Staab, WeST; Philipp Mayr, GESIS
12:20 Information Extraction out of Born-Digital Scientific Articles Roman Kern, TU Graz
12:40 Advanced citation matching and large-scale full-text analysis Nees Jan van Eck, Leiden U
13:00 Lunch Break (Cafeteria)  
14:20 APIs for third parties to extract and deposit output executions of automated extraction pipelines (via videoconferencing) Min-Yen Kan, NU Singapore
14:40 Extracting references from scientific articles in CERMINE system Dominika Tkaczyk, U Warsaw
15:00 Coffee Break (Cafeteria)  
15:30 CitEc to CitEcCyr. A stab at distributed citation systems. (via videoconferencing) Thomas Krichel, Open Library Society, NYC
15:50 EXCITE project: Status report Behnam Ghavimi, GESIS; Martin Körner, WeST; Heinrich Hartmann, Circonus
16:10 Processing of in-text References: Towards a Semantic Analysis Marc Bertin, U Toulouse
16:30 Citations in Utopia Documents David Thorne, U Manchester
16:50 Coffee Break (Cafeteria)  
17:20 Research around the Tagging System BibSonomy Andreas Hotho, U Würzburg
17:50 LOC-DB: A Linked Open Citation Database provided by Libraries. Motivation and Challenges. Kai Eckert, HDM Stuttgart; Anne Lauscher, HDM Stuttgart; Akansha Bhardwaj, DFKI
18:20 tbd (via videoconferencing) Lee Giles, Penn State U
18:50 Break  
20:00 Dinner at Gaffel am Dom (paid by participants)  
22:00 Socializing  
23:00 End  

Day Two (Friday, 31.03.2017)





Time Title    
9:00 Second Day Kickoff (Room: West II)
9:15 Extraction Result Discussion Group; Gold Standard Discussion Group; Collaboration Discussion Group
11:15 Coffee Break (Cafeteria)
11:30 Extraction Result Discussion Group; Gold Standard Discussion Group; Collaboration Discussion Group
12:30 Closing Talks (Room: West II)    
13:00 End    

Resources

Gold Standard

One part of the discussions on the second workshop day will revolve around a gold standard that we are currently building. The current version can be found on GitHub. Note that it is a work in progress. The corresponding PDFs can be found (for now) here.

Arrival and Accommodation

GESIS Cologne is located near the Cologne central train station. Further information on traveling to GESIS by air, rail, intercity bus, or car can be found on the GESIS website.
There are also special GESIS rates available for accommodations. More information can be found on this list.