Since deep learning is a very flexible framework, it works well for various tasks without expert knowledge, but it has difficulty leveraging explicit knowledge. It also requires massive datasets and is applicable only to a limited range of tasks. I introduce deep generative models, which are Bayesian networks implemented on deep neural networks. By expressing our knowledge as the network structure, deep generative models work on small datasets and provide interpretable results.
We present a family of stochastic local search algorithms for finding a single stable extension in an abstract argumentation framework. These incomplete algorithms start from random labellings of the arguments and iteratively select a random mislabelled argument and flip its label. We present a general version of this approach and an optimisation that allows for greedy selection of arguments. We conduct an empirical evaluation with benchmark graphs from the previous two ICCMA competitions and further random instances. Our results show that our approach is competitive in general and significantly outperforms previous direct and reduction-based approaches on the Barabási-Albert graph model.
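The core loop described above can be sketched as follows. This is a minimal, unoptimised sketch: the function name, the encoding of attacks as pairs, and the flip budget are illustrative assumptions, and the greedy selection optimisation mentioned in the abstract is omitted.

```python
import random

def stable_labelling(args, attacks, max_flips=10000, seed=0):
    """Stochastic local search for a stable extension.

    args:    iterable of argument names.
    attacks: set of (attacker, target) pairs.
    A labelling is stable iff every IN argument has all attackers OUT
    and every OUT argument has at least one IN attacker.
    """
    rng = random.Random(seed)
    attackers = {a: [b for (b, t) in attacks if t == a] for a in args}
    lab = {a: rng.choice(("in", "out")) for a in args}

    def mislabelled(a):
        has_in_attacker = any(lab[b] == "in" for b in attackers[a])
        return (lab[a] == "in" and has_in_attacker) or \
               (lab[a] == "out" and not has_in_attacker)

    for _ in range(max_flips):
        bad = [a for a in args if mislabelled(a)]
        if not bad:
            # every argument is correctly labelled: the IN set is stable
            return {a for a in args if lab[a] == "in"}
        a = rng.choice(bad)                       # random mislabelled argument
        lab[a] = "out" if lab[a] == "in" else "in"  # flip its label
    return None  # incomplete: may give up even if a stable extension exists
```

On the framework with arguments {a, b} and the single attack (a, b), the search converges to the stable extension {a}.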
In both conversation and writing, grammar gives us the opportunity to leave unarticulated parts of a sentence that are overtly expressed in the preceding linguistic context. For instance, in the sentence /I wanted to play football but I couldn’t/, /play football/ can be dropped after /couldn’t/ because it can be understood from the context. In linguistics, this phenomenon is known as verb phrase (VP) ellipsis. Detecting and resolving ellipsis leads to a proper understanding of the text, which could help improve language understanding systems. Since this phenomenon is optional, the challenge is to find a way to systematically distinguish auxiliaries and modals that indicate VP ellipsis from those that do not.
The internet has often been hailed as an opportunity for democracy. Political parties in particular have high hopes for the new possibilities for communication and participation. But the digital divide still mediates access to, use of, and the benefits derived from the internet. These online inequalities become a problem when combined with democratic processes: if equal opportunity to participate is the goal, how can an unequal tool be beneficial?
Most information extraction (IE) systems are designed to construct Knowledge Bases (KBs) consisting of high-precision facts. The focus is primarily on confidence in the correctness of the data. As KBs are increasingly considered to be representations of the real world, it is imperative that the data in KBs is not only correct but also complete. Yet most widely used KBs today do not store completeness information for many common predicates such as names of children or winners of an award. This incompleteness of KB facts is the result of oversight on the part of most information extraction processes, which emphasize optimizing precision but largely ignore recall. A recall-oriented IE system is highly desirable for many use cases.
This talk will present the paper "Deep Contextualized Word Representations" by Peters et al., which won the best paper award at NAACL-HLT 2018. Its approach, ELMo (Embeddings from Language Models), constitutes a fundamentally new way of representing words by considering linguistic context and achieved state-of-the-art results on six important NLP problems.
This talk will introduce the required background material (recurrent neural networks, word embeddings, and language models) and summarize the key contributions of the paper.
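The way ELMo builds a task representation from the biLM's layers, a softmax-weighted sum of layer activations scaled by a task-specific factor, can be sketched as follows. This is a minimal NumPy sketch under stated assumptions: the function name and array layout are illustrative, and in the paper the layer weights and the scale gamma are learned per downstream task.

```python
import numpy as np

def elmo_combine(layer_reps, weights, gamma=1.0):
    """ELMo-style task representation.

    layer_reps: (L, seq_len, dim) array of biLM layer activations.
    weights:    (L,) unnormalised, task-specific layer weights.
    Returns a (seq_len, dim) array: gamma * sum_j softmax(weights)_j * h_j.
    """
    s = np.exp(weights - weights.max())
    s /= s.sum()                                   # softmax over layers
    return gamma * np.tensordot(s, layer_reps, axes=1)
```

With zero weights the softmax is uniform, so the result is simply the average of the layer activations.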
What is the challenge of interdisciplinary research on democracy? Drawing on the implications of text data, I present the study of democracy in the digital era as a dilemma between large data and deep validity.
This focus reflects the fact that the philosophies behind the choice of methods differ across disciplines, which is the main difficulty but also the desired advantage. Computer science prefers data at a very large scale, while the social and political sciences move flexibly between medium and small scales, the smallest being time-intensive immersion in language, culture, and society. Against popular assumption, I argue
Extracting and parsing cited references from publications in PDF format is important to ensure the acknowledgement of the sources of information. However, the way these sources are mentioned differs from one community to another and from one publication to another. This citation diversity lies mainly in the indexation style (e.g., one or several reference sections), the presence of components (e.g., editor, source, URL), and the type of references (e.g., grey literature vs. academic literature). In order to accurately extract and segment different kinds of references, EXCITE proposes a generic approach that combines Random Forests and Conditional Random Fields (CRFs) in a coherent mechanism.
Anomalous diffusion processes, both in the superdiffusive and subdiffusive regimes, have spurred a lot of theoretical research effort, along with experimental validation, for decades now. Their description, however, relies strongly on the existence of a metric in continuous space. Complex networks lack an intrinsic metric definition, and in this talk I will present some theoretical "recipes" to work around this issue and recover such regimes on networks as well. On the applied side, some machine learning algorithms, like the celebrated PageRank, exploit diffusion for classification and ranking tasks. I will thus show how, through enhanced diffusion regimes, it is possible to address and correct some shortcomings of those algorithms and improve classification performance.
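As an illustration of diffusion-based ranking, the standard power-iteration PageRank can be sketched as follows. The function name and parameter values are illustrative assumptions; this is the classic algorithm, not the enhanced-diffusion variants discussed in the talk.

```python
import numpy as np

def pagerank(adj, damping=0.85, tol=1e-10):
    """Power-iteration PageRank: rank mass diffuses along edges.

    adj: (n, n) adjacency matrix, adj[i, j] = 1 if there is an edge i -> j.
    Returns a probability vector over the n nodes.
    """
    n = adj.shape[0]
    out_deg = adj.sum(axis=1, keepdims=True)
    # Row-normalise; dangling nodes (out-degree 0) jump uniformly.
    # Transpose to get a column-stochastic transition matrix.
    P = np.where(out_deg > 0, adj / np.maximum(out_deg, 1), 1.0 / n).T
    r = np.full(n, 1.0 / n)
    while True:
        # one diffusion step plus uniform teleportation
        r_new = damping * (P @ r) + (1 - damping) / n
        if np.abs(r_new - r).sum() < tol:
            return r_new
        r = r_new
```

On a toy directed graph where two nodes both point at a third, that third node receives the most diffused mass and ranks highest.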
A compendium of tailor-made network-theoretic tools has been devised and implemented in a data-driven fashion. In the first part, a (formerly) novel centrality metric, aptly named “bridgeness” and based on a decomposition of the standard betweenness centrality, will be introduced. A prominent feature is its agnosticism with regard to any prior on community structure. A second application is aimed at describing dynamic features of temporal graphs that are apparent at the mesoscopic level. A dataset comprising 40 years' worth of selected scientific publications is used to highlight the appearance and evolution over time of a specific field of study: “wavelets”.