Welcome & Morning Coffee
Welcome: Menno van Zaanen
Presentations are each
assigned 30 minutes, but we encourage you to leave 10 minutes for
questions and discussion; Lightning Talks have 2 minutes.
Please note that both the presentations and demos are BYOL (bring your own laptop).
We decompose multi-modal translation into two sub-tasks: learning to translate and learning visually grounded representations. In a multi-task learning framework, translations are learned in an attention-based encoder-decoder, and grounded representations are learned through image representation prediction. Our approach improves translation performance compared to the state of the art on the Multi30K dataset. Furthermore, it is equally effective if we train the image prediction task on the external MS COCO dataset, and we find improvements if we train the translation model on the external News Commentary parallel text.
We present results from speech recognition experiments designed to mimic conditions in human listeners with cochlear implants (CIs) -- devices which deliver sound to the brain in individuals with hearing loss. Postlingually deaf (PD) individuals, who receive CIs after a period of normal hearing, typically perform better on hearing-related tasks than congenitally deaf (CD) individuals, who are born deaf and are implanted at an early age. We consider two possible reasons for this: CD individuals might perform worse than PD individuals because they develop hearing after a period of auditory deprivation. Then, after implantation with a CI, the brain may have lost the plasticity necessary for full recovery. Alternatively, CD individuals might not have reduced listening capabilities. Rather, the signals delivered by CIs might not be rich enough for CD individuals to differentiate the fine-grained speech structure that PD listeners learn to discriminate during the period of normal hearing prior to implantation. To evaluate these possibilities, we train neural networks on (a) normal speech and (b) vocoded speech that is modified to simulate the input received by people with CIs. We then compare the performance of two networks: a CD network, which is only trained on vocoded speech; and a PD network, which is pre-trained on normal speech before the training data are vocoded. Crucially, and in contrast to CD individuals, the CD network has the same learning capacity as the PD network. Thus, if pre-training the PD network improves performance on vocoded speech relative to the CD network, this would constitute evidence that exposure to intact speech can boost performance on subsequently degraded input. This would suggest that the difference in performance between PD and CD individuals is not due to reduced learning capabilities in CD individuals, but instead has to do with the impoverished signals delivered by CIs.
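As a toy illustration of the vocoding manipulation (not the authors' actual CI simulation), a channel vocoder can be approximated by collapsing a fine-grained spectrum into a few band averages, discarding the within-band detail a CI cannot deliver:

```python
def vocode(spectrum, n_channels):
    """Crude stand-in for CI-style vocoding: collapse a fine-grained
    spectrum into n_channels band averages, discarding within-band detail.
    Assumes len(spectrum) is divisible by n_channels for simplicity."""
    band = len(spectrum) // n_channels
    return [sum(spectrum[i * band:(i + 1) * band]) / band
            for i in range(n_channels)]
```

In this framing, the PD network would first be trained on the full `spectrum` features and then continue training on `vocode(spectrum, n)`, while the CD network only ever sees the vocoded version.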
Life expectancy is a leading indicator when making decisions about end-of-life care, but good prognostication is notoriously challenging. Being overly optimistic about life expectancy, as doctors tend to be, greatly impedes the early identification of palliative patients and thereby delays appropriate care in the final phase of life. This research aimed to explore the feasibility of automatically predicting life expectancy based on electronic medical records, with the aid of machine learning and natural language processing techniques.
We trained a neural network (long short-term memory) with 1107 medical records, and validated the model with 127 medical records. Using the same evaluation criteria as were used to evaluate doctors' performance, our baseline model reached a level of accuracy similar to human accuracy. The inclusion of clinical narrative was enabled and optimized with the use of natural language processing techniques such as domain-specific spelling correction. The inclusion of keyword features improved the prediction accuracy by 9%, compared to both our baseline model and to the gold standard of human evaluation. Overall, we have shown that our approach for automatic prognostication is feasible and delivers promising results.
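As an illustration of what domain-specific spelling correction might involve (the abstract does not specify the method, and the lexicon below is hypothetical), a minimal edit-distance corrector could look like this:

```python
def edit_distance(a, b):
    """Levenshtein distance via the standard rolling-row DP."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

def correct(word, lexicon, max_dist=2):
    """Replace `word` with its closest lexicon entry, if close enough."""
    best = min(lexicon, key=lambda w: edit_distance(word, w))
    return best if edit_distance(word, best) <= max_dist else word
```

What would make such a corrector domain-specific is the lexicon itself, e.g. a list of clinical terms drawn from the medical records.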
Machine reading comprehension is a form of question answering in which the correct answer to a query can be found by reading a document. In this talk, we present our work in machine comprehension by focusing on the creation of a large resource that allows us to study reading comprehension in the clinical domain. We analyze the performance of both human and machine readers on a gap-filling task, and examine what skills are required to understand the supporting text and to provide the right answers. We also position our dataset and task formulation with respect to other popular reading comprehension works.
Organic soup, worstenbrood, pistolets with organic cheese and/or ham with salad and dressing, coffee, tea, orange juice and organic milk.
In this study, we analyze correlations between Flemish adolescents' language use on social media and their social class. The participants' social class has been operationalized as a cluster of three subvariables (representing different aspects of class, such as cultural, financial, etc.). These three variables are the adolescents' educational track, their home language and the profession of their parents.
All three variables significantly influence the teenagers' 'non-standard' language use on social media. (In the present case study, the category of non-standard features covers a wide range of deviations from formal Dutch standard writing, such as dialect words, emoticons, chatspeak abbreviations, etc.). A gradual increase in non-standardness can be found for educational track (more non-standard features in more practical types of secondary education), for profession of the parents (more non-standard features when parents have 'lower' class professions) and for home language (more non-standard features when a language other than Dutch is spoken at home). After clustering the subvariables to create two groups of teenagers holding extreme positions on the social continuum, even more distinct linguistic profiles emerge.
Additional analyses reveal the complexity of adolescents' social class: interactions with both age and gender can be found (i.e. social class does not have the same impact on girls versus boys, or on younger versus older teenagers). When we expand the research to non-prototypical groups of teenagers (i.e. teenagers who do not clearly belong to the higher or lower class, but have a more 'hybrid' profile and are somewhere in-between on the social continuum), deviant linguistic patterns emerge, which suggests that additional factors such as social mobility, ambition, and the structure of peer networks need to be taken into account.
Finally, we will discuss some ideas and challenges for future research.
The majority of research on extracting missing user attributes from social media profiles uses costly hand-annotated labels for supervised learning. Distantly supervised methods exist, although these generally rely on knowledge gathered using external sources. This paper demonstrates the effectiveness of gathering distant labels for self-reported gender on Twitter using simple queries. We confirm the reliability of this query heuristic by comparing with manual annotation. Moreover, using these labels for distant supervision, we demonstrate model performance competitive with models trained on manual annotations of the same data. As such, we offer a cheap, extensible, and fast alternative that can be employed beyond the task of gender classification. Data and code are available open-source.
We present an update on FoLiA, the Format for Linguistic Annotation, and the rich infrastructure surrounding it. We will give a brief shout-out of new features in FoLiA v1.4 (August 2016) and FoLiA v1.5 (released October 2017) and stress the need for well-developed data formats in our field. We'll highlight ties to linked open data, richer support for metadata, and newly updated tools such as a ReStructuredText to FoLiA converter. We will have a practical hands-on focus and point users to the Python library, the C++ library, and the various FoLiA tools and utilities.
Featured as a demo at last year's ATILA, we again present FLAT, a web-based linguistic annotation environment based on FoLiA, and show what has been improved in the meantime.
We present what's new with Frog and ucto. Frog is a rich NLP suite integrating memory-based natural language processing (NLP) modules developed for Dutch. It features native FoLiA support, comes with a Python binding, and is currently funded by CLARIAH. Ucto is a multilingual rule-based tokeniser, also embedded into Frog.
CLAM has been in use for quite a few years to make command-line tools available over the web as RESTful webservices with an end-user web-application interface. We briefly show what it does and point users to the collection of webservices hosted at Radboud University Nijmegen.
We present LaMachine, a software distribution we use to release and distribute all of our NLP software. It comes in three flavours for flexibility: a Virtual Machine, a Docker container and a local installation script. We invite third parties to join in if interested.
We present the CLARIAH / eScience Center TICCLAT project, due to start in January 2018. We will extend TICCL's correction capabilities with classification facilities based on specific data collected from the full diachronic Dutch Nederlab corpus: word statistics, document and time references, and linguistic annotations, i.e. Part-of-Speech and Named-Entity labels. These data will complement a solid, renewed basis composed of the available validated lexicons and name lists for Dutch. In this way, TICCL as a post-correction tool will be transformed into TICCLAT, a lexical assessment tool capable of delivering not only correction candidates, but also, e.g., more accurately dated diachronic Dutch word forms and more securely classified person and place names. To achieve this at scale, the TICCLAT project will seek a successful merger of TICCL's anagram hashing with bit-vectorization techniques. TICCLAT's capabilities will also be evaluated in comparison to human performance by an expert psycholinguist.
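The anagram hashing that TICCL builds on can be sketched as follows: if each character contributes an order-independent value (here, in one common formulation, its code point raised to a fixed power), all anagrams of a word share one key, and spelling variants can be retrieved through arithmetic on keys. This is a simplified illustration; the exact character values and power used by TICCL itself may differ.

```python
def anagram_key(word, power=5):
    """Order-independent hash: anagrams map to the same key, so the key
    space can be searched arithmetically for character-level variants."""
    return sum(ord(c) ** power for c in word)
```

Because a variant's key differs from the original's key only by the values of the characters involved, candidate corrections can be looked up by adding or subtracting character values instead of comparing strings pairwise.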
We learn unsupervised dense patient representations from unstructured clinical data using a stacked denoising autoencoder and a paragraph vector model. We evaluate the representations by using them as features for multiple independent tasks, and compare the performance with that of sparse representations. We explore the best encoded features within the representations, and extract the most significant features when the pretrained representations are used as the input to the classifiers.
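The 'denoising' in a stacked denoising autoencoder comes from corrupting the input before asking the network to reconstruct the clean version. One standard corruption scheme is masking noise (an assumption here, since the abstract does not specify the noise type):

```python
import random

def mask_noise(x, p, seed=None):
    """Corrupt an input vector for a denoising autoencoder by zeroing
    each feature independently with probability p."""
    rng = random.Random(seed)
    return [0.0 if rng.random() < p else v for v in x]
```

The autoencoder is then trained to map `mask_noise(x, p)` back to `x`, which forces the learned representation to capture dependencies between features rather than simply copying the input.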
We present the latest outcomes of our experiments on implicit evaluation of machine translation.
During this lightning talk, I'll present the challenges faced by a company focused on medical NER in Dutch. Then I'll present the strategy we adopted and the leads we are investigating to improve the performance of our system.
We will present how patterns can be identified in keystroke data obtained during different writing tasks, and how this information can be used to predict the writing task.
Language provides an important source of information to predict human personality. However, most studies that have predicted personality traits using computational linguistic methods have focused on lexicon-based information. We investigate to what extent the performance of lexicon-based and grammar-based methods compare when predicting personality traits. We analyzed a corpus of student essays and their personality traits using two lexicon-based approaches, one top-down (Linguistic Inquiry and Word Count (LIWC)), one bottom-up (topic models) and one grammar-driven approach (Biber model), as well as combinations of these models. Results showed that the models and their combinations demonstrated similar performance, indicating that lexicon-based top-down and bottom-up models do not differ, and neither do lexicon-based and grammar-based models. Moreover, combining models did not improve performance. These findings suggest that predicting personality traits from text remains difficult, but that the performance of lexicon-based and grammar-based models is on par.
The development of the social web has stimulated creative and figurative language use like irony. This frequent use of irony on social media has important implications for natural language processing tasks, which struggle to maintain high performance when applied to ironic text (Liu, 2012; Maynard and Greenwood, 2014; Ghosh and Veale, 2016). The goal of SemEval-2018 Task 3 is to provide the research community with a manually annotated irony dataset and to encourage this community to develop state-of-the-art irony classification systems. The dataset provides binary and fine-grained class labels, allowing participants to develop binary (subtask A) and multi-class (subtask B) irony classifiers. Participants of the task are invited to submit their systems until 22 (subtask A) and 29 (subtask B) January 2018.
Flow.ai is a complete suite for conversational AI design. It provides an intuitive drag and drop interface with integrated NLP. Flow.ai can be used to create, train and host chatbots and smart assistants and deploy them on a variety of channels, for example web chat, Facebook Messenger, Amazon Alexa or Rocket.Chat. Sign up at https://flow.ai!
Our dinner and social activity will be held at Het Wapen Van Tilburg, just down the road from the main location (see the map). On the menu will be the following:
Several small starters creatively presented.
Choice of: Steak with pepper sauce, Vegetable tempura with rice, or Cod tail with beurre blanc.
Choice of: Cheese platter, or a Sweet dessert.
Everyone has up to six drinks (beer / wine / non-alcoholic) at their disposal for the entire evening (including social activity). If you're planning on drinking less, treat someone thirsty. :)
Collections of tweets are overly rich in the sense that not all tweets are relevant for the task at hand. Tweets can be irrelevant for a particular task, for instance because they are posted by non-human accounts, contain spam, refer to irrelevant events, or use an irrelevant sense of an ambiguous keyword used in data collection. This richness is also dynamic: it can arise in both static and continuously updated collections, and there is no guarantee that a tweet collection will have similar characteristics across different periods of time.
We introduce the term 'information thread' and our tool, Relevancer, to specify and handle collections of tweets, respectively. Related groups of tweets (information threads), as defined by an expert, are detected using unsupervised machine learning, confirmed by the expert, and then used to classify remaining or new tweets using supervised machine learning. An expert can be anybody who is able to make knowledgeable decisions about how to annotate tweet clusters in order to understand a tweet collection in a certain context.
Relevancer enables an expert to analyze a tweet collection, i.e. any set of tweets collected or being collected using keywords. The tool requires expert feedback in the form of cluster annotations in order to complete the analysis. Experts can repeat the analysis process if they collect new data with the same keywords, or decide to do another type of annotation once they understand the collection better after evaluating the automatically selected first set of coherent clusters. Our method advances the state of the art in terms of efficient and complete understanding and management of a non-standard, rich, and dynamic data type.
The strength of our approach is its ability to scale to a large collection without sacrificing precision or recall, by understanding the intrinsic characteristics of the features that can be extracted from tweets, the key terms used, and the temporal content distribution on social media. Finally, sharing the responsibility for completeness and precision with the users of the tool ensures they will achieve and preserve the target performance they require.
In this session, we present how to work with Relevancer on four use cases based on collections gathered with the keywords 'flood', 'earthquake', 'genocide', and 'griep' (Dutch for 'flu'). The demo website and the source code are both available.
Online platforms are increasingly being used for expressing suicidal thoughts, but moderators are faced with information overload when monitoring for such signals of distress. In this talk, we present our work on online suicidality detection. It is intended to help moderators by automatically identifying content that requires their attention, and triaging it to ensure that urgent content can be responded to more quickly and consistently. First, we discuss the problem of data collection, labeling and the classification tasks that can be derived. We describe our approaches for performing classification, and evaluate them on Dutch and English datasets. Next, we describe how moderators perform with and without such technology at their disposal: do systems that do well in theory make a difference in practice? We finish the talk with a discussion on potential other ways in which NLP can support suicide prevention, and how we could make such systems language and platform independent.
Learning from past incidents is of great importance for disaster managers. Estimating outcomes beforehand can improve preparations for the next incidents. To make this a less labour-intensive task, we aim to automate the extraction of information from past events. As our use case, we focus on extracting critical information about flooding events from newspaper articles. We treat this information extraction task as a sequential labelling task and train a supervised machine learning algorithm, namely Conditional Random Fields, to achieve our goal. However, supervised learning requires manually annotated training data, which is very expensive and time-consuming to obtain. To reduce the need for manual annotation, Active Learning, a human-in-the-loop method, is explored. We obtain improvements in F1-score of up to 25% and observe that Active Learning drastically reduces the annotation effort required.
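A common query strategy in this kind of human-in-the-loop setup is pool-based uncertainty sampling: repeatedly ask the annotator to label the instance the current model is least confident about. The abstract does not specify the strategy used, so the following binary-case selection rule is purely illustrative:

```python
def most_uncertain(pool, predict_proba):
    """Uncertainty sampling for a binary model: pick the unlabeled item
    whose predicted positive-class probability is closest to 0.5."""
    return min(pool, key=lambda x: abs(predict_proba(x) - 0.5))
```

In a full loop, the selected item is labeled by the annotator and added to the training set, the model (here, a CRF) is retrained, and the selection repeats until the annotation budget is spent.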
As publishing has become more and more accessible and basically cost-free, virtually anyone can get their words printed, whether online or on paper. Such ease of disseminating content doesn't necessarily go together with author identifiability. In other words: it's very simple for anyone to publicly write any text, but it isn't equally simple to always tell who the author of a text is. Identifying the author of a text can be approached at various levels of detail. For example, in some contexts, and possibly in the interest of companies who want to advertise, or legal institutions, it can correspond to profiling, namely defining certain characteristics of the author, such as sex and age. In other contexts, and in the interest also of ancient and contemporary literary or historical studies, identifying authors can mean being able to tell whether two texts are likely to have been written by the same person. The latter problem can take more than one form in practice, as one could be faced with one unknown text to compare to another one written by a known author, or could be given a large number of unknown texts to be clustered according to authorship. To what extent is all this feasible? And is it meaningful?
In this talk, I will discuss the specifics of such tasks, and describe a couple of systems that perform author profiling and author verification on different kinds of texts from different languages, experimenting with various linguistic and structural features, as well as various approaches. I will also discuss such systems and their performance not only in terms of how they fare, but also in terms of what it means to profile and identify authors, and what challenges lie ahead for people working in this field.
Organic soup, worstenbrood, pistolets with organic cheese and/or ham with salad and dressing, coffee, tea, orange juice and organic milk.
It is widely held that smells and flavors are impossible to put into words. In this paper we test this claim by seeking predictive patterns in wine reviews, which ostensibly aim to provide guides to perceptual content. Wine reviews have previously been critiqued as random and meaningless. We collected an English corpus of wine reviews with their structured metadata, and applied machine learning techniques to automatically predict the wine's color, grape variety, and country of origin. To train the three supervised classifiers, three different information sources were incorporated: lexical bag-of-words features, domain-specific terminology features, and semantic word embedding features. In addition, using regression analysis we investigated basic review properties, i.e., review length, average word length, and their relationship to the scalar values of price and review score. Our results show that wine experts do share a common vocabulary to describe wines and they use this in a consistent way, which makes it possible to automatically predict wine characteristics based on the review text alone. This means that odors and flavors may be more expressible in language than typically acknowledged.
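The lexical bag-of-words route can be illustrated with a toy nearest-centroid classifier: represent each review as word counts and assign the label whose centroid is most cosine-similar. This sketches the general technique only; it is not the authors' classifier, and the example words are invented:

```python
import math
from collections import Counter

def bow(text):
    """Bag-of-words representation: lowercase word counts."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b.get(t, 0) for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def predict_label(review, centroids):
    """Assign the label whose centroid vector is closest to the review."""
    return max(centroids, key=lambda lbl: cosine(bow(review), centroids[lbl]))
```

With per-class centroids built from training reviews, a review mentioning 'crisp' and 'apple' would, for instance, land closer to a white-wine centroid than a red-wine one.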
In the past year we have been working closely together with the Ghent-based company Hello Customer. The objective was to develop a proof-of-concept aspect-based sentiment analysis (ABSA) pipeline. During this presentation I will discuss the pipeline we created for Dutch, which was trained and tested on qualitative user feedback from three domains: retail, banking and human resources. The latter two domains provide service-oriented data, which had not been investigated before in the domain of fine-grained sentiment analysis. By performing in-domain and cross-domain experiments, the validity of our approach was investigated. We show promising results for the three ABSA subtasks: aspect term extraction, aspect category classification and aspect polarity classification. Currently, we are also working with comparable English and French data.
Some limitations can be found when looking at the current state of data-to-text systems that focus on the journalistic domain. On the one hand there are modern systems which are often closed systems, inaccessible for the general public and interested researchers. On the other hand there are less recent systems which are also often inaccessible because the code has become obsolete or abandoned. I present PASS, a modular and open source data-to-text system that generates Dutch soccer reports from match statistics. One of the novel elements of PASS is the fact that the system produces corpus-based texts tailored towards fans of one club or the other, which can most prominently be observed in the tone of voice used in the reports.
We hope to see you all next year!
Dr. Malvina Nissim has a background in linguistics and computational linguistics. She is interested in understanding how (natural) language works, and does so also with the aid of digital resources and tools. At the same time, she's contributing to creating tools which help process texts. In 2016 she was awarded the title "University of Groningen Lecturer of the Year".
Located directly in front of the Central Station, the museum has existed for more than 80 years and still attracts many visitors every year.
Spoorlaan 434, 5038 CH Tilburg
The museum offers a 24h parking (exit) ticket for €4,50 if you park at Parkeergarage Knegtel (Gasthuisring 60, 5041 DT, Tilburg).
After exiting the station on the left side after the gates, turn right after crossing the street and walk until you see the building on your left-hand side. It's in front of the station's bus stops.
Chris van der Lee
Jan de Wit
Thiago Castro Ferreira
Maira Brandao Carvalho
Menno van Zaanen
Laura Van Brussel
Claudia Matos Veliz
Orphee De Clercq
Ayla Rigouts Terryn
Luna De Bruyne
Ko van der Sloot
Maarten van Gompel
Mustafa Erkan Basar
Antal van den Bosch