BNAIC 2017

The 29th Benelux Conference on Artificial Intelligence

November 8–9, 2017 in Groningen, The Netherlands


The program is also available as a PDF file. All accepted papers can be found in the preproceedings (PDF). The BNAIC postproceedings are published in the Springer CCIS series (Communications in Computer and Information Science) and will be accessible online (expected by March 2018; volume number: CCIS 823).

Springer CCIS


Ground floor



Opening day 1

Congreszaal Martijn van Otterlo

Machine Learning

Room 1.04 Marieke van Vugt

Agent Systems

Room 1.12 Marco Wiering

Natural Language Processing

Grand Café



Keynote lecture

Grand Café


Room 1.04

KION Meeting

Congreszaal Diederik Roijers

Reinforcement Learning

Room 1.04 Harmen de Weerd

Agent Systems

Room 1.12 Henry Prakken

Uncertainty in AI

Grand Café


Room 1.12

IPN SIG AI Meeting


Room 1.08 Central hall 1st floor Room 1.09
Central hall 1st floor


Grand Café



Keynote lecture


Ground floor


Congreszaal Nico Roos


Room 1.04 Tibor Bosse

Knowledge & Reasoning

Room 1.12 Siamak Mehrkanoon

Machine Learning

Grand Café



Keynote lecture

Grand Café


Congreszaal Tom Lenaerts

FACt Talks

Grand Café



BNVKI General Assembly

Room 1.04 Marco Wiering

Research & Business Session

  • Guillaume Barat, NVIDIA
  • Jean-Paul van Oosten, Target Holding
  • Arjen van Wijngaarden, Anchormen
  • Tijn van der Zant, Sim-CI
  • Panel discussion
Central hall 1st floor


Grand Café



Keynote lecture


Closing: Awards, BNAIC 2018

Keynote speakers

Marco Dorigo, Université Libre de Bruxelles

Swarm robotics: Current research directions at IRIDIA

Swarm robotics studies how to design and implement groups of robots that operate without relying on any external infrastructure or on any form of centralized control. Such robot swarms exploit self-organization to perform tasks that require cooperation between the robots. In the last ten years there has been a lot of progress in swarm robotics research. Quite complex robot swarms have been demonstrated in a number of different case studies. However, this progress has also led to the identification of some problems that might hinder further development. In the talk, I will discuss what I consider to be the main current issues in swarm robotics research. I will then propose a few novel research directions that could allow us to successfully address these issues and therefore take robot swarms one step closer to real-world deployment.

Laurens van der Maaten, Facebook AI Research

From Visual Recognition to Visual Understanding

This talk gives an overview of some of our recent work on models for visual recognition and visual understanding.

The first part of the talk presents a new visual recognition model, called DenseNet. DenseNets change the common connectivity pattern of convolutional networks by using the activations from all prior layers as input into a layer. It also presents an extension, called multi-scale DenseNet, that has the ability to adapt dynamically to computational resource limits at inference time. Specifically, our architecture spends less computation on “easy” images, and uses the surplus computation to obtain higher accuracy on “hard” images.
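
The connectivity pattern described above can be illustrated with a minimal sketch. It is not the speaker's implementation: it tracks only channel counts rather than real tensors, and the names `dense_block`, `input_channels`, `num_layers` and `growth_rate` are assumptions introduced for this example.

```python
# Illustrative sketch only: counts channels instead of computing convolutions.
# All names here are hypothetical and do not come from the talk abstract.

def dense_block(input_channels, num_layers, growth_rate):
    """Return the input width of each layer and the block's output width.

    Every layer receives the concatenation of the block input and the
    outputs of all earlier layers, and contributes `growth_rate` new channels.
    """
    features = [input_channels]   # channel counts of everything produced so far
    layer_input_widths = []
    for _ in range(num_layers):
        # Dense connectivity: the layer sees all prior feature maps.
        layer_input_widths.append(sum(features))
        # The layer's own output is added to the pool for later layers.
        features.append(growth_rate)
    return layer_input_widths, sum(features)


widths, out_channels = dense_block(input_channels=16, num_layers=4, growth_rate=12)
print(widths, out_channels)
```

In a conventional chain of layers each layer would see only the previous layer's output; here the input width grows with depth because earlier activations are reused.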

The second half of the talk presents our work on moving from visual recognition to visual understanding, focusing on a problem setting known as visual question answering. It discusses the problems of biases in current visual question answering benchmarks and presents a new benchmark, called CLEVR, that aims to overcome these problems. We use CLEVR to study a new type of deep model that aims to model the reasoning processes required to answer questions more explicitly than standard models.

Luc Steels, Institute for Advanced Studies (ICREA), Barcelona

Digital Replicants and Mind-Uploading

Strange things are beginning to happen with AI. One of them is the creation of virtual agents that mimic real people. Replika is a good example. It is a company that creates AI chatbots based on the digital traces of a person (email, Twitter, Facebook, etc.) and was originally set up to preserve the memory of deceased relatives. More recently it launched an app so that a living person can make a digital copy of him or herself, which can then autonomously send messages or perform actions in lieu of the person, and would ultimately survive its owner in cyberspace. These developments, and there are several more intriguing things happening, such as brain-computer interfaces, neural implants, mind uploading, and augmented reality, are beginning to have a remarkable impact on how people start thinking about age-old questions that used to be the domain of religion: Do we have a soul that can survive our body? Is (digital) immortality possible? What if we begin to have artificial agents that mingle with our own digital replicants? And what if these agents have bad intentions? Can a cyborg emerge which is based on a human body augmented with a broad range of digital devices and artificial intelligence?

Together with neuroscientist Oscar Vilarroya, I have been exploring this future impact of AI on our society and on the image that humanity has of itself. Rather than writing scientific papers, we decided to write an opera, Fausto, that imagines a future where digital replicants and mind uploading are common. Vilarroya, in the role of librettist, used the age-old saga of Faust as the narrative backbone, and I, as the composer, used the language of the Western musical tradition, exploring melody, harmony and rhythm to express the emotional turbulences of the main characters. My talk will present both the philosophical ideas behind this opera and show some clips of a recent performance.

Rineke Verbrugge, University of Groningen

Recursive theory of mind: between logic and cognition

Computational agents often reason about other agents' beliefs, knowledge, goals and plans, based on extensions of modal logics of knowledge and belief. Usually, such agents are capable of an arbitrary amount of recursion when reasoning about the mental states of other computational agents. However, people lose track of such ‘recursive theory of mind’ reasoning after a few levels.

This lecture is about several strands of research related to recursive theory of mind, starting with the question of how children develop from first-order theory of mind ("Dad doesn't know that I took the chocolate") to second-order theory of mind ("Alice believes that I believe that she wrote a novel under a pseudonym"). Another question is why adults are not very good at higher-order strategic reasoning in games and how software agents can support them to improve their social sophistication. Finally, there is the question why higher-order theory of mind has evolved in the first place, if it’s so difficult. To investigate these questions, we take logic into the lab and combine epistemic logic, computational cognitive models, agent-based models, and empirical research.

FACt talks – FACulty focusing on the FACts of Artificial Intelligence

Bert Bredeweg, Universiteit van Amsterdam

Humanly AI: Creating smart people with AI

Modern stochastic-based AI is without doubt responsible for many of the recent successes involving machine learning and pattern recognition. This swing back to Goofy control theory and cybernetics is good for industry and owners of big data. But what does it entail for people? End of privacy, autonomous weapons, fake news.

We prefer to take a very different approach, and are interested in developing AI that supports people, not destroys them. Humanly AI that is articulate, reflective and communicable. Articulate refers to being explicit, either in knowledge (knowing things) or in formats (knowing how to represent things). Reflective refers to being able to assess and process what is articulated (e.g. compare, contrast, order, etc.). Communicable refers to being able to interactively share and co-construct, typically with the goal to obtain more and better knowledge, possibly for the AI but surely also for the human in the loop.

This presentation will illustrate the added value of Humanly AI for learners in higher education (e.g. Schlatter et al., EC-TEL 2017) and for scientists doing cutting-edge research (e.g. Kansou et al., Scientific Reports 2017), and argue that this is more fun, more valuable, and much less dangerous than Goofy control theory and cybernetics.

Eric Postma, Tilburg University

Towards artificial human-like intelligence

The recent upsurge in "AI" (more appropriately: deep learning) gives rise to widespread comparisons between human and artificial intelligence. In the presentation, I will argue that there still exists a huge gap between human and artificial intelligence. Moreover, concepts associated with human intelligence do not transfer to artificial intelligence. As a case in point: AlphaGo (Zero) is often stated to have "intuition" about the game of Go, although intuition is an ill-understood introspective concept. I will outline my ideas about "artificial human-like intelligence" and relate these ideas to attempts to bridge sub-symbolic and symbolic AI and to Artificial General Intelligence.

Geraint Wiggins, Queen Mary University of London/Vrije Universiteit Brussel

Introducing Computational Creativity

How does a mathematician invent a new mathematical structure or a new proof? How does a composer make a new song, or a painter conceive a new painting or a sculptor imagine a sculpture? How does a programmer imagine her program? And when these people have ideas for these things, what separates the great ones from the everyday ones, and the everyday ones from the bad ones? Some people believe that computers can never be creative. In computational creativity, we beg to disagree. So far, we’ve been quite successful, building software that can learn to create in ways similar to humans, making artefacts that are at least as good as basic human work, and in some cases, reaching the pinnacles of human creativity. In this talk, I’ll outline the main issues in computational creativity, and give some examples of challenges and successes.

Accepted papers and demonstrations

  1. Type A: regular papers
  2. Type B: compressed contributions
  3. Type C: demonstrations
  4. Type D: thesis abstracts

Type A: regular papers

Oral presentation

  • Nico Roos: Learning-based diagnosis and repair

    This paper proposes a new form of diagnosis and repair based on reinforcement learning. Self-interested agents learn locally which agents may provide a low quality of service for a task. The correctness of learned assessments of other agents is proved under conditions on exploration versus exploitation of the learned assessments.

    Compared to collaborative multi-agent diagnosis, the proposed learning-based approach is not very efficient. However, it does not depend on collaboration with other agents. The proposed learning-based diagnosis approach may therefore provide an incentive to collaborate in the execution of tasks, and in diagnosis if tasks are executed in a suboptimal way.

  • Gleb Polevoy and Mathijs de Weerdt: Competition between Cooperative Projects

    A paper needs to be good enough to be published; a grant proposal needs to be sufficiently convincing compared to the other proposals, in order to get funded. Papers and proposals are examples of cooperative projects that compete with each other and require effort from the involved agents, while often these agents need to divide their efforts across several such projects. We aim to provide advice on how an agent can act optimally and how the designer of such a competition (e.g., the program chairs) can create the conditions under which a socially optimal outcome can be obtained. We therefore extend a model for dividing effort across projects with two types of competition: a quota or a success threshold. In the quota competition type, only a given number of the best projects survive, while in the second competition type, only the projects that are better than a predefined success threshold survive. For these two types of games we prove conditions for equilibrium existence and efficiency. Additionally, we find that competitions using a success threshold can more often have an efficient equilibrium than those using a quota. We also show that often a socially optimal Nash equilibrium exists, but there exist inefficient equilibria as well, requiring regulation.

  • Remi Wieten, Floris Bex, Linda van der Gaag, Henry Prakken and Silja Renooij: Refining a Heuristic for Constructing Bayesian Networks from Structured Arguments

    Recently, a heuristic was proposed for constructing Bayesian networks (BNs) from structured arguments. This heuristic helps domain experts who are accustomed to argumentation to transform their reasoning into a BN and subsequently weigh their case evidence in a probabilistic manner. While the underlying undirected graph of the BN is automatically constructed by following the heuristic, the arc directions are to be set manually by a BN engineer in consultation with the domain expert. As the knowledge elicitation involved is known to be time-consuming, it is of value to (partly) automate this step. We propose a refinement of the heuristic to this end, which specifies the directions in which arcs are to be set given specific conditions on structured arguments.

  • Gleb Polevoy and Mathijs de Weerdt: Reciprocation Effort Games

    Consider people dividing their time and effort between friends, interest clubs, and reading seminars. These are all reciprocal interactions, and the reciprocal processes determine the utilities of the agents from these interactions. To advise on efficient effort division, we determine the existence and efficiency of the Nash equilibria of the game of allocating effort to such projects. When no minimum effort is required to receive reciprocation, an equilibrium always exists, and if acting is either easy for everyone or hard for everyone, then every equilibrium is socially optimal. If a minimal effort is needed to participate, we prove that not contributing at all is an equilibrium, and for two agents, also a socially optimal equilibrium can be found. Next, we extend the model, assuming that the need to react requires more than the agents can contribute to acting, rendering the reciprocation imperfect. We prove that even then, each interaction converges and the corresponding game has an equilibrium.

  • Linford Goedschalk, Tibor Bosse and Marco Otte: Get Your Virtual Hands Off Me! - Developing Threatening Agents Using Haptic Feedback

    Intelligent Virtual Agents (IVAs) are becoming widely used for numerous applications, varying from healthcare decision support to communication training. In several such applications, it is useful if IVAs have the ability to take a negative stance towards the user, for instance for anti-bullying or conflict management training. However, the believability of such 'virtual bad guys' is often limited, since they are non-consequential, i.e., are unable to apply serious sanctions to users. To improve this situation, this research explores the potential of endowing IVAs with the ability to provide haptic feedback. This was realized by conducting an experiment in which users interact with a virtual agent that is able to physically 'touch' the user via a haptic gaming vest. The effect on the loudness of the speech and the subjective experience of the participants was measured. Results of the experiment suggest there might be an effect on the subjective experience of the participants and the loudness of their speech. Statistical analysis, however, shows no significant effect; given the relatively small sample size, it is advisable to look further into these aspects.

  • Marieke van Vugt, Armin Brandt and Andreas Schulze-Bonhage: Tracking Perceptual and Memory Decisions by Decoding Brain Activity

    Decision making is thought to involve a process of evidence accumulation, modelled as a drifting diffusion process. This modeling framework suggests that all single-stage decisions involve a similar evidence accumulation process. In this paper we use decoding by machine learning classifiers on intracranially recorded EEG (iEEG) to examine whether different kinds of decisions (perceptual vs. memory) exhibit dynamics consistent with such drift diffusion models. We observed that decisions are indeed decodable from brain activity for both perceptual and memory decisions, and that the time courses for these types of decisions appear to be quite similar. Moreover, the high spatial resolution of iEEG reveals that perceptual and memory decisions rely on slightly different brain areas. While the accuracy of decision decoding can still be improved, these initial studies demonstrate the power of decoding analyses to examine computational models of cognition.

  • Bram Wiggers and Harmen de Weerd: The origin of mimicry: Deception or merely coincidence?

    One of the most remarkable phenomena in nature is mimicry, in which one species (the mimic) evolves to imitate the phenotype of another species (the model). Several reasons for the origin of mimicry have been proposed, but no definitive conclusion has been found yet. In this paper, we test several of these hypotheses through an agent-based co-evolutionary model. In particular, we consider two possible alternatives: (1) Deception, in which mimics evolve to imitate the phenotype of models that predators avoid eating, and (2) Coincidence, in which models evolve a warning color to avoid predation, which coincidentally benefits the mimics.

    We find that both these hypotheses are plausible origins for mimicry, but also that once a mimicry situation has been established through coincidence, mimics will take advantage of the possibility for deception as well.

  • Claudio Reggiani, Yann-Aël Le Borgne and Gianluca Bontempi: Feature selection in high-dimensional dataset using MapReduce

    This paper describes a distributed MapReduce implementation of the minimum Redundancy Maximum Relevance algorithm, a popular feature selection method in bioinformatics and network inference problems. The proposed approach handles both tall/narrow and wide/short datasets. We further provide an open source implementation based on Hadoop/Spark, and illustrate its scalability on datasets involving millions of observations or features.

  • David Roschewitz, Kurt Driessens and Pieter Collins: Simultaneous Ensemble Generation and Hyperparameter Optimization for Regression

    The development of advanced hyperparameter optimization algorithms, using e.g. Bayesian optimization, has encouraged a departure from hand-tuning. Primarily, this trend is observed for classification tasks while regression has received less attention. In this paper, we devise a method for simultaneously tuning hyperparameters and generating an ensemble, by explicitly optimizing parameters in an ensemble context. Techniques traditionally used for classification are adapted to suit regression problems and we investigate the use of more robust loss functions. Furthermore, we propose methods for dynamically establishing the size of an ensemble and for weighting the individual models. The performance is evaluated using three base-learners and 16 datasets. We show that our algorithms consistently outperform single optimized models and can outperform or match the performance of state-of-the-art ensemble generation techniques.

  • Mathijs Pieters and Marco Wiering: Comparison of Machine Learning Techniques for Multi-label Genre Classification

    We compare classic text classification techniques with more recent machine learning techniques and introduce a novel architecture that outperforms many state-of-the-art approaches. These techniques are evaluated on a new multi-label classification task, where the task is to predict the genre of a movie based on its subtitle. We show that pre-trained word embeddings contain ’universal’ features by using the Semantic-Syntactic Word Relationship test. Furthermore, we explore the effectiveness of a convolutional neural network (CNN) that can extract local features, and a long short term memory network (LSTM) that can find time-dependent relationships. By combining a CNN with an LSTM we observe a strong performance improvement. The technique that performs best is a multi-layer perceptron, with as input the bag-of-words model.

  • Paul Ozkohen, Jelle Visser, Martijn van Otterlo and Marco Wiering: Learning to Play Donkey Kong Using Neural Networks and Reinforcement Learning

    Neural networks and reinforcement learning have successfully been applied to various games, such as Ms. Pac-Man and Go. We combine multilayer perceptrons and a class of reinforcement learning algorithms called actor-critic to learn to play the arcade classic Donkey Kong game. Two neural networks are used in this study, the actor and the critic. The actor neural network learns to select the best action given the game state; the critic tries to learn the value of being in a certain state. First, a base game-playing performance is obtained by learning from demonstration data, which is obtained from humans playing the game. After this off-line training phase we try to further improve this base performance using feedback from the critic. The critic gives feedback by comparing the value of the state before and after taking the action. Results show that an agent pre-trained on demonstration data is able to achieve a good baseline performance. Applying actor-critic methods, however, usually does not improve performance, in many cases even decreasing it. Possible reasons include the game not fully being Markovian and other issues.
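
The critic's feedback described in the last abstract above, comparing the value of the state before and after an action, is commonly formalized as a temporal-difference (TD) error. The sketch below shows that standard formulation as an assumption; the abstract does not state the exact update rule used in the paper. The function names and the `gamma` and `alpha` parameters are hypothetical.

```python
# Minimal sketch of a standard TD-error signal; not taken from the paper.
# Function names, gamma (discount) and alpha (learning rate) are assumptions.

def td_error(reward, value_before, value_after, gamma=0.99, terminal=False):
    """Compare the critic's value estimate before and after an action."""
    # At a terminal state there is no future value to bootstrap from.
    target = reward if terminal else reward + gamma * value_after
    return target - value_before


def critic_update(value_before, delta, alpha=0.1):
    """Move the value estimate of the earlier state toward the TD target."""
    return value_before + alpha * delta


delta = td_error(reward=1.0, value_before=0.5, value_after=1.0, gamma=0.9)
new_value = critic_update(0.5, delta)
print(delta, new_value)
```

A positive error means the action turned out better than the critic expected, so an actor would typically be nudged toward choosing it again; a negative error has the opposite effect.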

Poster presentation

  • Diego Alvarez-Estevez and Vicente Moret-Bonillo: A Proposal to Solve Rule Conflicts in the Wang-Mendel Algorithm for Fuzzy Classification Using Evidential Theory

    This paper addresses the problem of solving rule conflicts in a modified version of the Wang-Mendel algorithm for the induction of fuzzy classification rules. In this respect we propose a solution based on the reinterpretation of the conflict resolution mechanism as an evidence reassignment problem. The Evidential-theory framework developed by Dempster and Shafer is used for this purpose, and different alternative conflict handling strategies are explored. Experiments are carried out using a benchmark of well-known classification problems. Our preliminary results are encouraging toward supporting the usefulness of the proposed approach.

  • William Kos, Marijn Schraagen, Matthieu Brinkhuis and Floris Bex: Classification in a Skewed Online Trade Fraud Complaint Corpus

    This paper explores how machine learning techniques can be used to support handling of skewed online trade fraud complaints, by predicting whether a complaint will be withdrawn or not. To optimize the performance of each classifier, the influence of resampling, word weighting, and word normalization on the classification performance is assessed. It is found that machine learning can indeed be used for this purpose, by improving the baseline performance in comparison to the skewness ratio up to 13 pp using Logistic Regression. Furthermore, the results show that data alteration techniques can improve classifier performance on a skewed dataset up to 13.5 pp.

  • Laura van der Lubbe and Tibor Bosse: Studying Gender Bias and Social Backlash via Simulated Negotiations with Virtual Agents

    This research investigates whether (female and male) virtual negotiators experience a social backlash during negotiations with an economic outcome when they are using a negotiation style that is congruent with the opposite gender. To this end, first some background research has been done on gender differences in negotiations and the social backlash that is experienced by men and women. Based on this literature study, an application has been implemented (using the tools Poser Pro, Renpy and IVONA) that enables users to engage in a salary negotiation with a virtual agent that plays the role of employee. Next, an experiment has been conducted in which 93 participants had an interactive negotiation with the virtual employee. Results show that the effect of gender on negotiation outcome and social backlash was less pronounced in this experiment than expected based on the literature. However, several factors, such as the experience of the participants and the provided context, could explain these findings.

  • Stefan Huijser, Niels Taatgen and Marieke van Vugt: Distracted in a Demanding Task: A Classification Study with Artificial Neural Networks

    An important issue in cognitive science research is to know what your subjects are thinking about. In this paper, we trained multiple Artificial Neural Network (ANN) classifiers to predict whether subjects’ thoughts were focused on the task (i.e., on-task) or if they were distracted (i.e., distracted thought), based on recorded eye-tracking features and task performance. Novel in this study is that we used data from a demanding spatial complex working memory task. The results of this study showed that we could classify on-task vs. distracted thought with an average of 60% accuracy. Task performance was found to be the strongest predictor of distracted thought. Eye-tracking features (e.g., pupil size, blink duration, fixation duration) were found to be much less predictive. Recent literature showed potential for eye-tracking features, but this study suggests that the nature of the task can greatly affect this potential. Rehearsal effort based on eye-movement behavior was found to be the most promising eye-tracking feature. Although speculative, we argue that eye-movement features are independent of the content of distracted thought and may therefore provide a more generic feature for classifying distracted thought.

  • Marco Stam, Charlotte Gerritsen, Ward van Breda and Elias Krainsk: Assessing the Spatiotemporal Relation between Twitter Data and Violent Crime

    Social media has grown over the past decade to become an important factor in the way people share information. This paper explores the feasibility of using Twitter data for predictive policing models within the city of Amsterdam, the Netherlands. A novel, powerful approach to Bayesian inference is used to assess the correlations between the spatiotemporal distributions of tweets containing predefined keywords and that of violent crime incidents. Spatiotemporal log-Gaussian Cox processes are considered using the stochastic partial differential equations (SPDE) approach for space correlated over time with autoregressive dynamics and fitted using the integrated nested Laplace approximations (INLA) method. The findings show that the occurrence of such tweets raises the probability of incidents occurring nearby in space-time. This novel insight has unveiled the promising potential of Twitter-based predictive policing models within the city of Amsterdam.

  • Sébastien Hoorens, Katrien Beuls and Paul Van Eecke: Constructions at Work! Visualising Linguistic Pathways for Computational Construction Grammar

    Computational construction grammar combines well-known concepts from artificial intelligence, linguistics and computer science into fully operational language processing models. These models make it possible to map an utterance to its meaning representation (comprehension), as well as to map a meaning representation to an utterance (formulation). The processing machinery is based on the unification of usage-patterns that combine morpho-syntactic and semantic information (constructions) with intermediate structures that contain all information that is known at a certain point in processing (transient structures). Language processing is then implemented as a search process, which searches for a sequence of constructions (a linguistic pathway) that successfully transforms an initial transient structure containing the input into a transient structure that qualifies as a goal. For larger grammars, these linguistic pathways become increasingly complex, which makes them difficult to interpret and debug for the human researcher. To address this problem, we present a novel approach to visualising the outcome of constructional language processing. The linguistic pathways are visualised as graphs featuring the applied constructions, why they could apply, with which bindings, and what information they have added. The visualisation tool is concretely implemented for Fluid Construction Grammar, but is also of interest to other flavours of computational construction grammar, as well as more generally to other unification-based search problems of high complexity.

  • Marjolein Troost, Katja Seeliger and Marcel van Gerven: Generalization of an Upper Bound on the Number of Nodes Needed to Achieve Linear Separability

    An important issue in neural network research is how to choose the number of nodes and layers so as to solve a classification problem. We provide new intuitions based on earlier results by An et al. (2015) by deriving an upper bound on the number of nodes in two-layer networks such that linear separability can be achieved. Concretely, we show that if the data can be described in terms of N finite sets and the used activation function f is non-constant, increasing and has a left asymptote, we can derive how many nodes are needed to linearly separate these sets. For the leaky rectified linear activation function, we prove separately that under some conditions on the slope, the same number of layers and nodes as for the aforementioned activation functions is sufficient. We empirically validate our claims.

  • Kim Veltman, Harmen de Weerd and Rineke Verbrugge: Socially smart software agents entice people to use higher-order theory of mind in the Mod game

    In social settings, people often need to reason about unobservable mental content of other people, such as their beliefs, goals, or intentions. This ability helps them to understand, to predict, and even to influence the behavior of others. People can take this ability further by applying it recursively. For example, they use second-order theory of mind to reason about the way others use theory of mind, as in 'Alice believes that Bob does not know about the surprise party'. However, empirical evidence so far suggests that people do not spontaneously use higher-order theory of mind in strategic games. Previous agent-based modeling simulations also suggest that the ability to recursively apply theory of mind may be especially effective in competitive settings. In this paper, we use a combination of computational agents and Bayesian model selection to determine to what extent people make use of higher-order theory of mind reasoning in a particular competitive game, the Mod game, which can be seen as a much larger variant of the well-known rock-paper-scissors game.

    We let participants play the competitive Mod game against computational theory of mind agents. We find that people adapt their level of theory of mind to that of their software opponent. Surprisingly, knowingly playing against second- and third-order theory of mind agents entices human participants to apply up to fourth-order theory of mind themselves, thereby improving their results in the Mod game. This phenomenon contrasts with earlier experiments about other strategic one-shot and sequential games, in which human players only displayed lower orders of theory of mind.

  • Elie Merhej, Steven Schockaert, T. Greg McKelvey and Martine De Cock: Recommending Treatments for Comorbid Patients Using Word-Based and Phrase-Based Alignment Methods

    The problem of finding treatments for patients diagnosed with multiple diseases (i.e. a comorbidity) is an important research topic in the medical literature. In this paper, we propose a new data driven approach to recommend treatments for these comorbidities using word-based and phrase-based alignment methods. The most popular methods currently rely on combining specific information from individual diseases (e.g. procedures, tests, etc.), then aim to detect and repair the conflicts that arise in the combined treatments. This proves to be a challenge especially in the cases where the studied comorbidities contain large numbers of diseases. In contrast, our methods rely on training a translation model using previous medical records to find treatments for newly diagnosed comorbidities. We also explore the use of additional criteria in the form of a drug interactions penalty and a treatment popularity score to select the best treatment in the case where multiple valid translations for a single comorbidity are available.

  • Fabiano Dalpiaz, Mehdi Dastani and Davide Dell'Anna Reasoning about Norms Revision

    Norms with sanctions have been widely employed as a mechanism for controlling and coordinating the behavior of agents without limiting their autonomy. The norms enforced in a multi-agent system can be revised in order to increase the likelihood that desirable system properties are fulfilled or that system performance is sufficiently high. In this paper, we provide a preliminary analysis of some types of norm revision: relaxation and strengthening. Furthermore, with the help of some illustrative scenarios, we show the usefulness of norm revision for better satisfying the overall system objectives.

  • Florian Wimmenauer, Evgueni Smirnov and Matúš Mihalák Distribution-driven Regression Ensemble Construction for Time Series Forecasting

    This paper introduces a two-stage approach for selecting the members of an ensemble that generates accurate forecasts of a time series using a small number of forecasts. In the first stage, models trained on similarly distributed data are selected based on time series stationarity, more specifically, the local stationarity of sub-series. The second stage identifies the most diverse models among those selected in the first stage to compose the final ensemble. Diversity is measured either on a pair-wise basis or over the complete ensemble. In both cases, multiple novel diversity metrics are introduced. Additionally, the presented approach is highly modular and does not presuppose the type of the models that make up the ensemble. The experiments show that the proposed approach outperforms the baseline when predicting both synthetic and real-world time series.

  • Jérôme Renaux, Jan Ramon and Andrea Argentin A Hierarchical Bayesian Network for the Optimization of SRM Assays

    Many experimental processes in biomedical sciences consist of several sequential steps. Predictions regarding the final output of these processes can be made based on the initial input, by learning a mapping between the two and considering the corresponding process as a black box. This simple approach can be improved upon by opening the black box and performing inference about all the steps of the process as well as the relationships between them. This level of reasoning makes it possible to answer a broader range of more refined queries, to potentially achieve better predictions and to gain insights into the workings of the process of interest. We present such an approach applied to mass spectrometry proteomics in the form of a sequential architecture of probabilistic models trained to solve an important problem in that field.

  • Jonathan Hogervorst, Emmanuel Okafor and Marco Wiering Deep Colorization for Facial Gender Recognition

    Recent research suggests that colorization models have the capability of generating plausible color versions from grayscale images. In this paper, we investigate whether colorization prior to gender classification improves classification performance on the FERET grayscale face dataset. For this, we colorize the images using an existing Lab colorization model, both with and without class rebalancing, and our novel HSV colorization model without class rebalancing. Then we construct gender classification models on the grayscale and colorized datasets using a reduced GoogLeNet convolutional neural network. Several models are trained using different loss functions (cross entropy loss, hinge loss) and gradient optimization solvers (Nesterov’s Accelerated Gradient Descent, Stochastic Gradient Descent), initialized using both random and pre-trained weights. Finally, we compare the gender classification accuracies of the models when applied to the face image color variants. The best performances are obtained by models initialized using pre-trained weights, and models using colorization without class rebalancing.

  • Henry Maathuis, Luuk Boulogne, Marco Wiering and Alef Sterk Predicting Chaotic Time Series using Machine Learning Techniques

    Predicting chaotic time series can be applied in many fields, e.g. in the form of weather forecasting or predicting stocks. This paper discusses several neural network approaches to perform a regression prediction task on chaotic time series. Each approach is evaluated on its sequence prediction ability on three different data sets: the intermittency map, the logistic map and a six-dimensional model. In order to investigate how well each regressor generalizes, the regressors are compared to a 1-Nearest-Neighbor baseline. In previous work, the Hierarchical Mixture of Experts architecture (HME) has been developed. For a given input, this architecture chooses between specialized neural networks. In this work, these experts are Multilayer Perceptrons (MLPs), Residual MLPs, and Long Short-Term Memory neural networks (LSTMs). The results indicate that a Residual MLP outperforms a standard MLP and an LSTM in sequence prediction tasks on the logistic map and the six-dimensional model. The standard MLP performs best in a sequence prediction task on the intermittency map. With the use of HMEs, we successfully reduced the error in all the above-mentioned time series prediction tasks.
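
As an illustration of the kind of baseline mentioned in the last abstract above, the following minimal sketch generates a logistic-map series and predicts each next value with a 1-nearest-neighbour lookup. It is not the authors' setup: the parameter `r = 4.0`, the start value, the train/test split, the single-value input window, and the function names `logistic_map` and `one_nn_predict` are all assumptions made for this example.

```python
def logistic_map(x0: float, r: float = 4.0, n: int = 500) -> list[float]:
    """Iterate x_{t+1} = r * x_t * (1 - x_t); r = 4.0 (assumed) gives chaotic behaviour."""
    xs = [x0]
    for _ in range(n - 1):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs


def one_nn_predict(train: list[float], query: float) -> float:
    """Predict the successor of `query` as the successor of its nearest training value."""
    # The last training point is excluded because its successor is unknown.
    best = min(range(len(train) - 1), key=lambda i: abs(train[i] - query))
    return train[best + 1]


# Hypothetical split: first 500 points for lookup, the rest for evaluation.
series = logistic_map(0.2, n=600)
train, test = series[:500], series[500:]

# One-step-ahead errors on the held-out part of the series.
errors = [abs(one_nn_predict(train, x) - y) for x, y in zip(test[:-1], test[1:])]
mean_abs_error = sum(errors) / len(errors)
print(f"1-NN one-step mean absolute error: {mean_abs_error:.4f}")
```

A learned regressor would be expected to beat this lookup baseline only if it generalizes beyond memorizing the training points, which is presumably why such a baseline is a useful point of comparison.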

Type B: compressed contributions

Oral presentation

  • Elias Fernández Domingos, Juan Carlos Burguillo and Tom Lenaerts Reactive Versus Anticipative Decision Making in a Novel Gift-Giving Game

    Gift-giving games provide an excellent framework for the study of the emergence of trust, fairness and generosity. Here, we study a novel gift-giving game, i.e., the Anticipation Game, which was originally designed to model the effect of group formation on gift-giving. To find out how the cognitive mechanism of anticipation affects individual decision-making in this scenario, we implemented anticipative agents and compared their behavior to reactive (backwards-looking) ones. Our results show that only anticipatory agents were able to reproduce some of the characteristics of human decision-making observed in experiments, making them more adequate to model human behavior.

  • Neil Yorke-Smith Evaluating Intelligent Knowledge Systems

    The article published in Knowledge and Information Systems examines the evaluation of a user-adaptive personal assistant agent designed to assist a busy knowledge worker in time management. The article examines the managerial and technical challenges of designing adequate evaluation and the tension of collecting adequate data without a fully functional, deployed system. The PTIME agent was part of the CALO project, a seminal multi-institution effort to develop a personalized cognitive assistant. The project included a significant attempt to rigorously quantify learning capability, which the article discusses for the first time, and ultimately the project led to multiple spin-outs including Siri. Retrospection on negative and positive experiences over the six years of the project underscores best practice in evaluating user-adaptive systems. Through the lessons illustrated from the case study of intelligent knowledge system evaluation, the article highlights how development and infusion of innovative technology must be supported by adequate evaluation of its efficacy.

  • Johan Kwisthout The Parameterized Complexity of Approximate Inference in Bayesian Networks

    Computing posterior and marginal probabilities constitutes the backbone of almost all inferences in Bayesian networks. These computations are known to be intractable in general; moreover, it is known that approximating these computations is also NP-hard. In the original paper we use fixed-error randomized tractability analysis, a recent randomized analogue of parameterized complexity analysis, to systematically address the complexity of (randomized) approximate inference in Bayesian networks. In this extended abstract we will give a brief introduction of the key concepts and results in this paper.

  • Henry Prakken On the Problem of Making Autonomous Vehicles Conform to Traffic Law

    Autonomous vehicles are one of the most spectacular recent developments of Artificial Intelligence. Among the problems that still need to be solved before they can fully autonomously participate in traffic is the one of making their behaviour conform to the traffic laws. This paper discusses this problem by way of a case study of Dutch traffic law. First it is discussed to what extent Dutch traffic law exhibits features that are traditionally said to pose challenges for AI & Law models, such as exceptions, rule conflicts, open texture and vagueness, rule change, and the need for commonsense knowledge. Then three approaches to the design of law-conforming autonomous vehicles are evaluated in light of the challenges posed by Dutch traffic law, which includes an assessment of the usefulness of AI & Law models of nonmonotonic reasoning, argumentation and case-based reasoning.

  • Sander Beckers and Joost Vennekens The Transitivity and Asymmetry of Actual Causation

    The counterfactual tradition of defining actual causation has come a long way since Lewis started it off. However, there are still important open problems that need to be solved. One of them is the (in)transitivity of causation. Endorsing transitivity was a major source of trouble for the approach taken by Lewis, which is why currently most approaches reject it. But transitivity has never lost its appeal, and there is a large literature devoted to understanding why this is so. Starting from a survey of this work, we will develop a formal analysis of transitivity and the problems it poses for causation. This analysis provides us with a sufficient condition for causation to be transitive, a sufficient condition for dependence to be necessary for causation, and several characterisations of the transitivity of dependence. Finally, we show how this analysis leads naturally to several conditions a definition of causation should satisfy, and use those to suggest a new definition of causation.

  • Lynn Houthuys, Zahra Karevan and Johan A.K. Suykens Multi-View LS-SVM for Temperature Prediction

    In multi-view regression, the input data can be represented in multiple ways or views. The aim is to improve on the performance obtained with only one view by taking into account the information available from all views. We introduce a novel multi-view regression model called Multi-View Least Squares Support Vector Machines (MV LS-SVM). This work was motivated by the challenge of predicting temperature in weather forecasting. Black-box weather forecasting deals with a large number of observations and features and is one of the most challenging learning tasks around. In order to predict the temperature in a city, the historical data from that city as well as from the neighboring cities are taken into account. We use MV LS-SVM to do temperature prediction by regarding each city as a different view. Experimental results on the minimum and maximum temperature prediction in Brussels show the improvement of the multi-view method with regard to previous work and that it is competitive with the existing state-of-the-art methods in weather prediction.

  • Siamak Mehrkanoon and Johan A.K. Suykens Regularized Semi-Paired Kernel CCA for Domain Adaptation

    A Regularized Semi-Paired Kernel Canonical Correlation Analysis (RSP-KCCA) formulation is introduced for learning a latent space for the domain adaptation problem. The optimization problem is formulated in the primal-dual LS-SVM setting, where side information can be readily incorporated through regularization terms. A joint representation of the data set across different domains is learned by solving a generalized eigenvalue problem or a linear system of equations in the dual. The proposed model is naturally equipped with an out-of-sample extension property, which plays an important role in model selection. Experimental results are given to illustrate the effectiveness of the proposed approaches on synthetic and real-life datasets.

  • Rijk Mercuur, Virginia Dignum and Catholijn Jonker Using Values and Norms to Model Realistic Social Agents

    This paper looks at the role and limits of values and norms for modeling realistic social agents. Based on literature we synthesize a theory on norms and a theory that combines both values and norms. In contrast to previous work, these theories are checked against data on human behavior obtained from a psychological experiment on dividing money: the ultimatum game. We found that agents that act according to a theory that combines both values and norms, produce behavior quite similar to that of humans. Furthermore, we found that this theory is more realistic than theories solely concerned with norms or theories solely concerned with values. However, to explain the amount of money people accept in this ultimatum game we will eventually need an even more realistic theory. We propose that a theory that explains when people exactly choose to use norms instead of values could provide this realism.

  • Zhisheng Huang, Jie Yang, Frank Van Harmelen and Qing Hu Constructing Knowledge Graphs of Depression

    Knowledge Graphs have been shown to be useful tools for integrating multiple medical knowledge sources, and to support such tasks as medical decision making, literature retrieval, determining healthcare quality indicators, co-morbidity analysis and many others. A large number of medical knowledge sources have by now been converted to knowledge graphs, covering everything from drugs to trials and from vocabularies to gene-disease associations. Such knowledge graphs have typically been generic, covering very large areas of medicine (e.g. all of internal medicine, or arbitrary drugs, arbitrary trials, etc.). This has had the effect that such knowledge graphs become prohibitively large, hampering both efficiency for machines and usability for people. In this paper we show how we use multiple large knowledge sources to construct a much smaller knowledge graph that is focused on a single disease (in our case major depression disorder). Such a disease-centric knowledge graph makes it more convenient for doctors (in our case psychiatric doctors) to explore the relationship among various knowledge resources and to answer realistic clinical queries.

  • Thomas M. Moerland, Joost Broekens and Catholijn M. Jonker Learning Multimodal Transition Dynamics for Model-Based Reinforcement Learning

    In this work we study how to learn stochastic, multimodal transition dynamics in reinforcement learning tasks. Model-based RL is an important class of RL algorithms that learns and utilizes transition dynamics to enhance data efficiency and target exploration. However, many task environments inherently have stochastic transition dynamics. We therefore require methods that approximate such complex distributions and also scale to higher dimensions. In this paper we study conditional variational inference in (deep) neural networks as a principled method to solve this challenge. Our results show that conditional variational inference in deep neural networks successfully predicts multimodal distributions, but also robustly ignores these for deterministic parts of the transition dynamics. Due to the flexibility of neural networks as black-box function approximators, these results are applicable to a variety of RL tasks, and are a key preliminary for model-based RL in stochastic domains.

  • Daniel Formolo and C. Natalie Van Der Wal Simulating Collective Evacuations with Social Elements

    This work proposes an agent-based evacuation model that incorporates social aspects in the behaviour of the agents and validates it on a benchmark. It aims to fill a gap in this research field, in which most evacuation models lack psychological and social factors such as group decision making and other social interactions. The model was compared with the previous model, its new social features were analysed and the model was validated. With the inclusion of social aspects, new patterns emerge organically from the behaviour of each agent, as shown in the experiments. Notably, people travelling in groups instead of alone seem to reduce evacuation time and helping behaviour is not too costly for the evacuation time as expected. The model was validated with data from a real scenario and demonstrates acceptable results and the potential to be used in predicting real emergency scenarios. This model will be used by emergency management professionals in emergency prevention.

  • Dimitrios Bountouridis, Dan Brown, Hendrik Vincent Koops, Frans Wiering and Remco C. Veltkamp Melody Retrieval and Classification Using Biologically-Inspired Techniques

    Retrieval and classification are at the center of Music Information Retrieval research. Both tasks rely on a method to assess the similarity between two music documents. In the context of symbolically encoded melodies, pairwise alignment via dynamic programming has been the most widely used method. However, this approach fails to scale up well in terms of time complexity and insufficiently models the variance between melodies of the same class. Compact representations and indexing techniques that capture the salient and robust properties of music content are increasingly important. We adapt two existing bioinformatics tools to improve the melody retrieval and classification tasks. On two datasets of folk tunes and cover song melodies, we apply the extremely fast indexing method of the Basic Local Alignment Search Tool (BLAST) and achieve comparable classification performance to exhaustive approaches. We increase retrieval performance and efficiency by using multiple sequence alignment algorithms for locating variation patterns and profile hidden Markov models for incorporating those patterns into a similarity model.
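
For readers unfamiliar with the pairwise alignment that this abstract takes as its starting point, here is a minimal sketch of a global alignment score computed by dynamic programming (the Needleman-Wunsch recurrence). It is not the authors' method: the scoring values, the encoding of melodies as MIDI pitch numbers, and the name `alignment_score` are assumptions chosen for illustration.

```python
def alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment score of sequences a and b (Needleman-Wunsch).

    Runs in O(len(a) * len(b)) time, which is the quadratic cost that
    makes exhaustive pairwise comparison expensive on large collections.
    """
    # prev[j] holds the best score for aligning a[:i-1] with b[:j].
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            curr.append(max(prev[j - 1] + sub,   # align a[i-1] with b[j-1]
                            prev[j] + gap,       # gap in b
                            curr[j - 1] + gap))  # gap in a
        prev = curr
    return prev[-1]


# Hypothetical melodies encoded as MIDI pitch numbers.
melody = [60, 62, 64, 65, 67]
variant = [60, 62, 65, 67]         # same tune with one note dropped
unrelated = [72, 71, 69, 67, 65]

print(alignment_score(melody, variant), alignment_score(melody, unrelated))
```

Ranking a query against every melody in a collection this way requires one quadratic-time comparison per candidate, which is the scaling problem that indexing methods aim to avoid.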

  • Vincent Jaco Koeman, Koen V. Hindriks and Catholijn M. Jonker Omniscient Debugging for Cognitive Agent Programs

    For real-time programs, reproducing a bug by rerunning the system is likely to fail, making fault localization a time-consuming process. Omniscient debugging is a technique that stores each run in such a way that it supports going backwards in time. However, the overhead of existing omniscient debugging implementations for languages like Java is so large that it cannot be effectively used in practice.

    In this paper, we show that for agent-oriented programming practical omniscient debugging is possible. We design a tracing mechanism for efficiently storing and exploring agent program runs. We are the first to demonstrate that this mechanism does not affect program runs by empirically establishing that the same tests succeed or fail. Usability is supported by a trace visualization method aimed at more effectively locating faults in agent programs.

  • Mostafa Dehghani, Hamed Zamani, Aliaksei Severyn, Jaap Kamps and W. Bruce Croft Neural Ranking Models with Weak Supervision

    Despite the impressive improvements achieved by unsupervised deep neural networks in computer vision and NLP tasks, such improvements have not yet been observed in ranking for information retrieval. The reason may be the complexity of the ranking problem, as it is not obvious how to learn from queries and documents when no supervised signal is available. Hence, in this paper, we propose to train a neural ranking model using weak supervision, where labels are obtained automatically without human annotators or any external resources. To this aim, we use the output of an unsupervised ranking model, such as BM25, as a pseudo-labeler. We further train a set of simple yet effective ranking models based on feed-forward neural networks. We study their effectiveness under various learning scenarios (point-wise and pair-wise models) and using different input representations (i.e., from encoding query-document pairs into dense/sparse vectors to using word embedding representation).

    We found that by employing proper objective functions and letting the networks learn the input representation based on weakly supervised data, we can improve the performance over the pseudo-labeler just by learning from it. Our findings also suggest that supervised neural ranking models can greatly benefit from pre-training on large amounts of weakly labeled data that can be easily obtained from unsupervised IR models.
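
The pseudo-labeling idea in this abstract can be sketched as follows: an unsupervised BM25 scorer assigns a score to each query-document pair, and those scores stand in for human relevance labels. This is only an illustrative sketch, not the authors' pipeline: the toy corpus, whitespace tokenization, the parameters `k1 = 1.2` and `b = 0.75`, the smoothed IDF variant, and the name `bm25_scores` are all assumptions.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score every document against the query with a standard BM25 formula."""
    tokenized = [d.lower().split() for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(d) for d in tokenized) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in tokenized for term in set(d))
    scores = []
    for d in tokenized:
        tf = Counter(d)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            # Smoothed, non-negative IDF variant (assumed for this sketch).
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            norm = tf[term] + k1 * (1.0 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1.0) / norm
        scores.append(score)
    return scores


# Hypothetical toy corpus and query.
docs = [
    "neural ranking models for information retrieval",
    "weather forecasting with support vector machines",
    "ranking documents with neural networks",
]
query = "neural ranking"
pseudo_labels = bm25_scores(query, docs)

# (query, document, pseudo_label) triples that a ranking model could be trained on.
training_triples = list(zip([query] * len(docs), docs, pseudo_labels))
for _, doc, label in training_triples:
    print(f"{label:.3f}  {doc}")
```

A point-wise model would regress on these scores directly, whereas a pair-wise model would only use the ordering that the scores induce between two documents for the same query.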

  • Qing Chuan Ye and Yingqian Zhang Participation Behavior and Social Welfare in Repeated Task Allocations

    Task allocation problems have focused on achieving one-shot optimality. In practice, many task allocation problems are of repeated nature, where the allocation outcome of previous rounds may influence the participation of agents in subsequent rounds, and consequently, the quality of the allocations in the long term. We investigate how allocation influences agents' decision to participate using prospect theory, and simulate how agents' participation affects the system's long term social welfare. We compare two task allocation algorithms in this study, one only considering optimality in terms of costs and the other considering optimality in terms of primarily fairness and secondarily costs. The simulation results demonstrate that fairness incentivizes agents to keep participating and consequently leads to a higher social welfare.

  • Lenin Medeiros and Tibor Bosse An Empathic Agent that Alleviates Stress by Providing Support via Social Media

    This paper describes the development of an ‘artificial friend’, i.e., an intelligent agent that provides support via text messages in social media in order to alleviate the stress that users experience as a result of everyday problems. The agent consists of three main components: 1) a module that processes text messages based on text mining and classifies them into categories of problems, 2) a module that selects appropriate support strategies based on a validated psychological model of emotion regulation, and 3) a module that generates appropriate responses based on the output of the first two modules. The application is able to interact with users via the social network Telegram.

  • Mike Ligthart, Olivier Blanson Henkemans, Koen V. Hindriks and Mark Neerincx Expectation Management in Child-Robot Interaction

    Children are eager to anthropomorphize (ascribe human attributes to) social robots. As a consequence, they expect a more unconstrained, substantive and useful interaction with the robot than is possible with the current state of the art. In this paper we reflect on several of our user studies and investigate the form and role of expectations in child-robot interaction. We have found that the effectiveness of the social assistance of the robot is negatively influenced by misaligned expectations. We propose three strategies that have to be worked out for the management of expectations in child-robot interaction: 1) be aware of and analyze children's expectations, 2) educate children, and 3) acknowledge robots are (perceived as) a new kind of ‘living’ entity besides humans and animals that we need to make responsible for managing expectations.

Poster presentation

  • Hendrik Vincent Koops, W. Bas de Haas, Jeroen Bransen and Anja Volk Chord Label Personalization through Deep Learning of Integrated Harmonic Interval-based Representations

    Harmony annotations are at the core of a wide range of studies in music information retrieval and Automatic Chord Estimation (ACE) in particular. Nevertheless, annotator subjectivity makes it hard to derive one-size-fits-all chord labels. Annotators transcribing chords from a recording by ear can disagree because of personal preference, bias towards a particular instrument, and because harmony can be ambiguous perceptually as well as theoretically by definition. These reasons contributed to annotators creating large amounts of heterogeneous chord label reference annotations. For example, on-line repositories for popular songs (e.g. Ultimate Guitar, Chordify) often contain multiple, conflicting versions. Approaches that aim to integrate conflicting versions can be used to create a unified view that can outperform individual sources. Nevertheless, this approach is built on the intuition that one single correct annotation exists that is best for everybody, on which ACE systems are almost exclusively trained and evaluated.

    In this paper, we propose a first solution to the problem of finding appropriate chord labels in multiple, subjective heterogeneous reference annotations for the same song. We propose an automatic audio chord label estimation and personalization technique using the harmonic content shared between annotators. We create a harmonic bird’s-eye view from different reference annotations, by integrating their chord labels at the level of harmonic intervals. More specifically, we introduce a new feature that captures the shared harmonic interval profile of multiple chord labels, which we deep learn from audio. First, we extract Constant Q (CQT) features from audio, then we calculate Shared Harmonic Interval Profile (SHIP) features from multiple chord label reference annotations corresponding to the CQT frames. Finally, we train a deep neural network (DNN) to associate a context window of CQT features with SHIP features. From the deep learned shared harmonic interval profiles, we can create chord labels that match a particular annotator vocabulary, thereby providing an annotator with familiar and personal chord labels.

    We test our approach on a 20-song dataset with multiple reference annotations, created by annotators who use different chord label vocabularies. In an experiment we compare training of our chord label personalization system on multiple reference annotations with training on a commonly used single reference annotation. In the first case we train a DNN on SHIPs derived from a dataset containing 20 popular songs annotated by five annotators with varying degrees of musical proficiency. In the second case, we train a DNN on the Isophonics single reference annotation. Isophonics is a peer-reviewed and de facto standard training reference annotation used in numerous ACE systems.

    We show that by taking into account annotator subjectivity, our system is able to personalize chord labels from multiple reference annotations. Comparably high accuracy scores for each annotator show that the model is able to learn a SHIP representation that is meaningful for all annotators, and from which chord labels can be accurately personalized for each annotator. Furthermore, our results show that personalization using a commonly used single reference annotation yields significantly worse results. From the results presented in this paper, we believe chord label personalization is the next step in the evolution of ACE systems.

  • John Bruntse Larsen and Jørgen Villadsen An Approach for Hospital Planning with Multi-Agent Organizations

    The background for this paper is a development that the Danish hospitals are undertaking which requires the establishment of a common emergency department. It is uncertain exactly what and how many resources the department needs and so resources are assigned dynamically as seen necessary by the staff. Such dynamic adjustments pose a challenge in predicting what consequences these adjustments may lead to. We propose an approach to deal with this challenge that applies simulation with intelligent agents and logics for organizational reasoning. We present some of the expected obstacles with this approach and potential ways to overcome them.

Type C: demonstrations

  • Denis Steckelmacher, Hélène Plisnier, Diederik M. Roijers and Ann Nowé Hierarchical Reinforcement Learning for a Robotic Partially Observable Task

    Most real-world reinforcement learning problems have a hierarchical nature, and often exhibit some degree of partial observability. While hierarchy and partial observability are usually tackled separately (for instance by combining recurrent neural networks and options), we illustrate on a complex robotic task that addressing both problems simultaneously is simpler and more efficient. We decompose our complex partially observable task into a set of sub-tasks, in a way that allows each sub-task to be solved by a memoryless option. Then, we implement Option-Observation Initiation Sets (OOIs), which make the selection of any option conditional on the previously executed option. Contrary to classical options, OOIs allow agents to solve partially observable tasks. Our agent successfully learns the task, achieves better results than a carefully crafted human policy, and does so much faster than a recurrent neural network over options.

  • Peter Vamplew, Dean Webb, Luisa M Zintgraf, Diederik M. Roijers, Richard Dazeley, Rustam Issabekov and Evan Dekker MORL-Glue: A Benchmark Suite for Multi-Objective Reinforcement Learning

    This is a demonstration of the MORL-Glue benchmark suite, which is designed to support empirical research in multi-objective reinforcement learning. The demo consists of training an agent on one of our generalised benchmarks, the generalised Deep Sea Treasure (G-DST) problem. Specifically, we designed an experiment in which the size of the state-space is increased: at every instance, we determine the size of the state-space, and generate a random solution set (i.e., coverage set). We then show how many samples it takes before the agent can accurately estimate this coverage set, as a function of the size of this state-space. The demo contains a visualisation component, intended to show the behaviour of the agent at different times during learning. A video accompanying this demo can be found at

  • Tijn van der Zant and Lars Zwanepol Klinkmeijer RoboCup HQ: A new benchmark focusing on AI, HMI and Autonomous Agents

    The focus of the RoboCup Federation has expanded in the past two decades to include more AI and HRI. This paper describes a new league, starting in December 2017, that also adds HMI. Typically a league is used as an international benchmark for scientific progress. This benchmark is called RoboCup HQ and runs in the cloud.

    The benchmark consists of a simulation environment, accessible in the cloud, where a complete city, including infrastructure, is modelled in 3D. The goal is to minimize the effects of a disaster, such as flooding, cyberattacks, or an earthquake. Autonomous robots can be sent into areas or buildings to gather information. The response of emergency response teams, such as police and ambulances, has to be optimized.

    The humans interacting with the simulation get an AI-driven cockpit where the AI has to learn which information is the most relevant for decision making. The persons controlling the simulation get little or no training in how to use the cockpit. This is to simulate a real-world disaster scenario where operators should focus on rescuing people and assets and not on how to accomplish this complex task. The cockpit is multi-user, adding the complexity of distributed decision making. Algorithms can also be developed to control parts of the city, such as electronic road signs and traffic lights, in order to avert disaster.

  • Manon Legrand, Roxana Rădulescu, Diederik M. Roijers and Ann Nowé The SimuLane Highway Traffic Simulator for Multi-Agent Reinforcement Learning

    SimuLane is a highway traffic simulator modeling a multi-agent learning environment. Human drivers are simulated through a behavior-based model translating basic characteristics and desires of real-life drivers. SimuLane is designed in a fully parameterizable way and serves as a learning environment for self-driving cars; as such, a ratio of autonomous drivers—all using the same learning model—can be defined for either training or a simple simulation.

  • Benjamin Timmermans, Zoltán Szlávik, Manfred Overmeen and Alessandro Bozzon ECrowd: Enterprise Crowdsourcing for Training Cognitive Systems using the Workforce

    Although the benefits of using crowdsourcing in a corporate environment have been explored, its application in practice has been limited. The reason for this is twofold: employees have no incentive to perform tasks as they cannot be paid as a reward, and crowdsourcing tasks cannot be sent to external annotators either, as companies often deal with confidential data. However, as artificial intelligence and, specifically, supervised machine learning algorithms become more common, so does the need for annotated data. In this paper we introduce a platform for enterprise crowdsourcing called ECrowd, which we a) aim to employ to get a better understanding of how enterprise mobile crowdsourcing (EMC) could be sustainably adopted in a traditional work environment, b) use for gathering training data for Cognitive Systems, and c) build on to create business applications.

  • Jonathan Gerbscheid, Thomas Groot and Arnoud Visser Intelligent News Conversation with the Pepper Robot

    The UvA@Home Team develops social behaviours for home-assistance robots. Besides the use of classical techniques for human-robot interaction, such as speech recognition, image recognition and language processing, the team strives to explore new means of improving social interactions. To showcase how the inclusion of new fields can improve social behaviour, a news-conversation agent will be demonstrated.

Type D: thesis abstracts

Oral presentation

  • Verna Dankers, Aysenur Bilgin and Raquel Fernández Modelling the Generation and Retrieval of Word Associations with Word Embeddings

    Word associations capture important aspects of the semantic representation of words, by telling us about the contexts in which words appear in the world. Artificially mimicking word associations involves emulating the generation of word associations and the retrieval mechanisms underlying associative responses. Tasks in which this plays a primary role are word-guessing games, such as the Location Taboo Game. In this game, artificial guesser agents should guess the names of cities from simple textual hints and are evaluated with games played by humans. Thus, playing the games successfully requires mimicking associations that humans have with geographical locations. In this thesis, a method for modelling word associations is presented and applied to the construction of an artificial guesser agent for the word-guessing Location Taboo Game.

    The acquisition of word associations is modelled through the construction of a semantic vector space from a tailored corpus about travel destinations, using context-predicting distributional semantic models. A targeted corpus annotation method is introduced to make the word associations more explicit. The guesser agent architecture retrieves associations during the game by calculating the associative similarity between a city and a hint from the semantic vector space. The annotation method significantly improves performance. The results on a dataset of example games indicate that the proposed architecture can guess the target city with up to 27.50% accuracy—a substantial improvement over the 5% accuracy achieved by the baseline architecture.
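
    As a rough illustration of the retrieval step described above, the sketch below ranks candidate cities by similarity to the hints in a vector space. The toy three-dimensional vectors, the names `cosine` and `guess_city`, and the use of cosine similarity over an averaged hint vector are all assumptions made for illustration; the abstract does not specify the thesis's actual similarity measure or embeddings.

```python
import math

# Hypothetical toy embeddings; real vectors would come from a trained
# distributional semantic model built on a travel corpus.
VECTORS = {
    "paris": [0.9, 0.1, 0.0],
    "cairo": [0.1, 0.9, 0.1],
    "sydney": [0.0, 0.2, 0.9],
    "eiffel": [0.8, 0.2, 0.1],
    "pyramid": [0.1, 0.95, 0.0],
    "opera": [0.3, 0.0, 0.8],
}
CITIES = ["paris", "cairo", "sydney"]


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def guess_city(hints):
    """Return the city whose vector is closest to the mean hint vector.

    Hints without a known vector are ignored; returns None if no hint
    is known.
    """
    known = [VECTORS[h] for h in hints if h in VECTORS]
    if not known:
        return None
    dim = len(known[0])
    mean = [sum(v[i] for v in known) / len(known) for i in range(dim)]
    return max(CITIES, key=lambda city: cosine(VECTORS[city], mean))
```

    Calling `guess_city(["eiffel"])` on these toy vectors selects `"paris"`. Averaging the hints is only one simple way to combine several clues.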

  • Jens Nevens and Katrien Beuls The Effect of Tutor Feedback in Language Acquisition Models

    This work investigates the role of tutor feedback in agent-based models of lexicon acquisition where one agent (the tutor) is teaching another agent (the learner) a particular language. An important aspect of lexicon learning is referent resolution, i.e. understanding what object in the environment is referred to by a given word. We compare two dominant paradigms in this respect, interactive learning and cross-situational learning, which differ primarily in the role of social feedback, such as pointing or gaze, in solving the referent uncertainty problem. In the former, the learner not only observes objects and words, but also receives social, pragmatic feedback. This clearly restricts the number of possible referents of the words. In the latter, the learner only receives objects and words. Here, the referent uncertainty problem is solved either by storing a single hypothesised referent, until evidence to the contrary is presented, or by statistical analysis of the co-occurrences of objects and words. We opt for the latter approach as it seems to align well with empirical findings.

    Almost all models in lexicon learning fall into either of these two categories. However, little work has been done to compare these two paradigms systematically, except for Belpaeme and Morse (2012), who show that cross-situational learning is slower in the continuous-meaning domain due to the absence of social feedback. Importantly, real life interactions between caregiver and child probably do not fall into either of the two categories but are really combinations of full, little or no social feedback. Consequently, learning algorithms must be able to deal with situations where there is feedback interleaved with interactions where there is no feedback. In this work, we propose a new mixed paradigm that combines the two paradigms. This new paradigm allows algorithms to be tested in experiments that combine no feedback and social feedback. We control the presence or absence of social feedback in successive interactions.

    To deal with this mixed feedback setting, we extend existing learning algorithms in a new way and show how they perform with respect to the KNN approach of Belpaeme and Morse (2012) and other prototype-based approaches. Our learning algorithms use prototypes to estimate the referent shown by the tutor during subsequent training interactions. The algorithms differ in the way their internal prototypes are updated in the presence and absence of social feedback. Our prototypes are continuous-valued and multi-dimensional and can represent various sensorimotor spaces, in our case colours. After training, we measure the communicative abilities of the learner, both in language production and language understanding, by a number of testing interactions. The algorithms are evaluated first in a purely simulated environment where the objects are randomly generated. Afterwards, a grounded environment is used. This was created in other work by humanoid robots observing scenes with physical objects. These scenes emulate a more structured and realistic learning environment. We also modify the distribution of the objects' features (colours) in the grounded environment and investigate how the agents respond. In both environments, the agents are situated in contexts of multiple objects and use single words to refer to an entire object. Finally, we perform a study on a large number of parameters that all influence some aspect of language acquisition, e.g. context size, size of the tutor lexicon, total number of objects in the world and word production strategy of the tutor. We investigate how the effects of these parameters change with respect to social feedback and the proposed learning algorithms.

    Our results suggest that the effect of social feedback is twofold. Not only does it allow the agent to learn faster, i.e. to reach the same level of success with fewer interactions, but the communicative abilities of the agent also improve with increasing presence of social feedback. Furthermore, we observe how the communicative success of the agents scales with respect to the social feedback in the novel mixed feedback setting. This is, however, dependent on the learning algorithm as some algorithms benefit from the presence of only some feedback more than others. Concerning the grounded world, our results paint a mixed picture. Performance in grounded worlds can be better than in simulated worlds, but this depends on the combination of algorithm and statistics of the environment. The agents are able to benefit from the additional structure present in the more realistic, grounded environments, but require at least some social feedback to do so. Finally, we deem it important not to underestimate the role of the tutor in language acquisition models and in language learning in general. In our experiments, we found that the way the tutor structures the world for the learner (i.e. the tutor's word production strategy) can have a profound impact on the learner's communicative success. This is especially true in the absence of social feedback.
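
    The contrast between learning with and without social feedback can be pictured with a small co-occurrence counter. This is only a sketch under simplifying assumptions: objects are discrete labels rather than the continuous-valued prototypes used in the thesis, and the class name `CrossSituationalLearner` and its methods are hypothetical, not the authors' algorithms.

```python
from collections import defaultdict


class CrossSituationalLearner:
    """Counts word/object co-occurrences across interactions."""

    def __init__(self):
        # counts[word][obj] = number of times obj was a candidate referent
        self.counts = defaultdict(lambda: defaultdict(int))

    def observe(self, word, context, pointed=None):
        """Record one interaction.

        context: objects present in the scene.
        pointed: the object indicated by social feedback, or None when
                 the interaction carries no feedback.
        """
        # With feedback the candidate set shrinks to a single object;
        # without it, every object in the context remains a candidate.
        candidates = [pointed] if pointed is not None else context
        for obj in candidates:
            self.counts[word][obj] += 1

    def interpret(self, word):
        """Return the object most often seen with the word, or None."""
        seen = self.counts.get(word)
        if not seen:
            return None
        return max(seen, key=seen.get)
```

    Mixing calls with and without `pointed` gives a toy version of the mixed feedback setting: a single pointed interaction can settle a referent that would otherwise need several ambiguous contexts.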

  • Dirk van der Hoeven and Tim van Erven Is Mirror Descent a Special Case of Exponential Weights?

    Online Convex Optimization is a setting in which a forecaster is to sequentially predict outcomes. In this thesis we focus on two algorithms in the Online Convex Optimization setting, namely Mirror Descent and Exponential Weights. Exponential Weights is usually seen as a special case of Mirror Descent. However, we developed an interpretation of Exponential Weights that sees Mirror Descent as the mean of Exponential Weights, and thus a special case of Exponential Weights. Specifically, different priors for Exponential Weights lead to different Mirror Descent algorithms. The link between Exponential Weights and Mirror Descent hinges on the link between cumulant generating functions, related to the prior in Exponential Weights, and Legendre functions, related to the update step in Mirror Descent.
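
    One simple instance of this correspondence can be worked out by hand. The derivation below assumes linearized losses and a standard Gaussian prior centred at the starting point $w_1$; these choices, and the symbols $\eta$ (learning rate) and $g_s$ (loss gradient in round $s$), are assumptions made for illustration and are not necessarily the formulation used in the thesis.

```latex
% Exponential Weights with prior P_1 and learning rate \eta on the
% linearized losses \ell_s(w) = \langle w, g_s \rangle:
\[
  dP_{t+1}(w) \;\propto\; \exp\Bigl(-\eta \sum_{s=1}^{t} \langle w, g_s \rangle\Bigr)\, dP_1(w).
\]
% Take the Gaussian prior P_1 = \mathcal{N}(w_1, I) and complete the square:
\[
  -\tfrac{1}{2}\lVert w - w_1 \rVert^2 - \eta \Bigl\langle w, \sum_{s=1}^{t} g_s \Bigr\rangle
  \;=\; -\tfrac{1}{2}\Bigl\lVert w - \Bigl(w_1 - \eta \sum_{s=1}^{t} g_s\Bigr) \Bigr\rVert^2 + \text{const},
\]
% so P_{t+1} = \mathcal{N}\bigl(w_1 - \eta \sum_{s \le t} g_s,\; I\bigr), and its mean is
\[
  \mathbb{E}_{w \sim P_{t+1}}[w] \;=\; w_1 - \eta \sum_{s=1}^{t} g_s .
\]
```

    The mean is the unconstrained online gradient descent iterate, which is Mirror Descent with the Legendre function $F(w) = \tfrac{1}{2}\lVert w \rVert^2$.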

  • Verna Dankers, Aysenur Bilgin and Raquel Fernández Modelling Word Associations and Interactiveness for Describer Agents in Word-Guessing Games

    In word-guessing games, one player describes a stimulus and his partner should try to guess what the stimulus is. Producing a response to such a stimulus requires an associative mechanism. Additionally, the production of associations that allow the partner to guess what the stimulus is requires modelling a shared context. The Location Taboo Game is such a word-guessing game, in which a describer gives simple textual hints about a target city, and the other player should try to guess this city. The hints given should not contain words from a list of taboo words. In this thesis, an architecture for an artificial describer agent is presented and evaluated through simulation and an empirical study. To be able to elicit a correct guess from a human guesser, the artificial describer agent should mimic associations that humans have with geographical locations. The artificial describer agent extracts word associations from a semantic vector space that has been created with a context-predicting distributional semantic model. Firstly, two methods for extracting word associations are detailed: a nearest neighbours approach that uses the list of taboo words, and an approach that applies clustering and analogical reasoning on a dataset of games played by humans. Secondly, interactiveness is modelled through a rule-based approach for the generation of clues that depend upon the guesser's response. Different variants of the methods for clue generation are evaluated through simulation and an empirical study. These describer agents could elicit a correct guess for 37.86% of the games and for 50.00% of the games for which the human player knew the target city.

  • Nikita Galinkin, Zoltán Szlávik, Lora Aroyo and Benjamin Timmermans Catch Them If You Can: Malicious Behavior Simulation in Deep Question Answering

    Recent advances in artificial intelligence and machine learning have allowed question answering systems to become much more prominent for retrieving information to handle day-to-day tasks. In this study, we investigate the impact of malicious user behavior in question answering systems specifically deployed in the cultural heritage domain. We need to be prepared for some users being ‘malicious’, trying to render a learning system useless by misusing it. To prepare, we first need to estimate the impact of malicious actions and study ways to deal with this issue.

Poster presentation

  • Daphne Lenders and Willem F.G. Haselager ‘Well, at least it tried’ The Role of Intentions and Outcomes in Ethically Evaluating Robot Actions

    In order to make robots more trustworthy, it is important to find out which factors make a human consider a robot as ethical. This study indicates that perceived good intentions and good outcomes in robot actions affect ethical evaluations positively, though the influence of intentions is larger than that of outcomes. Furthermore, we found that the influence of outcomes on ethical evaluations is larger when the action was perceived to be based on a good rather than bad intention.

  • David Stap, Bert Bredeweg and Natasa Brouwer Gamification for Learning by Modelling in Interactive Learning Environments

    Learning analytics aims to optimise learning, typically by providing students meaningful insight into their own learning behaviour. Gamification deploys game mechanics to increase motivation and thereby boost the learning process. In our work, we use learning analytics to implement game mechanics that create a motivating learning experience. The educational context concerns students who engage in model-building to develop systems thinking expertise. Three mechanics have been implemented: badges, leaderboard and life. The gamification add-on was evaluated during high school physics classes. Data mining showed that gamification resulted in significantly higher self-reported scores on enjoyment but inferior student-created models. A strong correlation between delete-behaviour and correctness of the created models was also found.

  • Manon Legrand, Roxana Rădulescu, Diederik M. Roijers and Ann Nowé Neural Network Reuse in Deep RL for Autonomous Vehicles among Human Drivers

    In this thesis, we consider the problem of reinforcement learning for autonomous cars on a highway. This is an inherently multi-agent problem, in which agents have to learn to handle the dynamics of a system, other autonomous cars, and be able to account for some irrationality on the part of other drivers (typically human). We created a simple traffic simulator composed of straight lanes and simulated human drivers (following a rule-based behavioral model with some irrationality built into it), and designed different neural networks as learning models for our self-driving cars. Using this simulator, we try to answer the question of how we can efficiently train agents in such a complex multi-agent setting. Our key insight is that we can reuse neural networks trained on a single-agent version of a multi-agent problem to speed up learning, leading to good performance.

  • Max van de Westelaken and Yingqian Zhang An Agent-Based Model for Feasibility and Diffusion of Crowd Shipping

    In crowd shipping, regular people deliver packages. An agent-based model was created to study the spread (based on the Bass model) and the feasibility of crowd shipping. The model showed that crowd shipping can be feasible if there are enough users (which develops over time) and that the flexibility of crowd shippers was the most important factor in the feasibility of crowd shipping.
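
    For readers unfamiliar with the Bass model mentioned above, the sketch below simulates its standard aggregate form in discrete time. It is not the agent-based model from the thesis: the function name `bass_adoption` and the parameter values in the usage note are assumptions chosen only for illustration.

```python
def bass_adoption(market_size, p, q, steps):
    """Cumulative adopters per step under a discrete-time Bass model.

    p: coefficient of innovation (adoption driven by external influence)
    q: coefficient of imitation (adoption driven by existing adopters)
    """
    adopters = 0.0
    history = []
    for _ in range(steps):
        remaining = market_size - adopters
        # Adoption rate grows with the fraction that has already adopted.
        rate = p + q * adopters / market_size
        adopters += rate * remaining
        history.append(adopters)
    return history
```

    For example, `bass_adoption(1000, 0.03, 0.38, 30)` produces the familiar S-shaped curve: slow initial uptake, a fast middle phase driven by imitation, and saturation as the pool of potential users empties.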

  • Robin Manhaeve, Luc De Raedt, Kurt De Grave and Laura Antanas Realtime Road User Detection and Classification with Single Pass Deep Learning

    In this thesis, research is performed on the trade-off between speed and accuracy in road user detection. We apply the YOLO deep neural network object detector to detect road users from the perspective of an autonomous vehicle. We investigate the effect of replacing YOLO's convolutional layers by a residual network. These networks are evaluated on the Caltech Pedestrian Benchmark, the KITTI Vision Benchmark Suite and a new dataset developed by Flanders Make. Finally, we also investigate the use of thermal images for self-driving cars.

  • Marten Schutten, Marco Wiering and Pamela MacDougall Balancing Imbalances

    Focus on sustainable energy is increasing in a lot of different layers of society. By an increase in the use of Renewable Energy Resources (RES) such as photovoltaic (solar) and wind energy, the supply of electricity becomes more variable and harder to control and predict. On the other hand, devices are being developed that consume electricity in a more flexible way. Finally, a transition can be observed in which electricity is supplied by large numbers of smaller suppliers (e.g. households with solar panels), rather than a small group of large suppliers. This thesis explores the feasibility of using Reinforcement Learning to enable prosumers (parties that both consume and produce electricity) to participate in the reserve electricity market, in such a way that the flexibility of electricity consumption that is inherently available within a household's appliances can be utilized to increase the stability of the grid, while enabling households to buy or sell electricity at profitable rates. Furthermore, the Neural Fitted CACLA (NFCACLA) algorithm is introduced, which is a neural fitted variant of the existing CACLA algorithm.

  • Jesse Bakker, Wouter Beek and Erwin Folmer Should you link(ed) data?

    Quality of Linked Open Data is often approached with scepticism. This impairs the transition to a global data space. As quality can be assessed from different points of view, it is difficult to establish a distinction between high quality data and low quality data in the LOD cloud. There is a need for justification of interlinking. This research proposes an extensible methodology by which such a justification can be formulated, based on a semi-automatic quality assessment. The methodology is tailored such that a dataset, owned by a third party, can be assessed from the scope of another dataset. The resulting justification is in the form of a new quality measurement dataset, as Linked Data, containing individual measurements and thorough documentation on related concepts. This dataset is intended to be used by both the assessor and the users of the, potentially, resulting linkset to gain insights.

  • Michiel Van Lancker, Annemie Vorstermans and Mathias Verbeke Customer Profiling based on Electronic Payment Transaction Data

    Customer profiling allows companies to profile a customer or group of customers based on their transaction behavior. In this study, transactions of electronic payment cards are considered. Contrary to classic market basket analysis, this transactional data does not contain information about the products bought, but only the amount of the payment, as well as a number of details on the respective shop. In order to create meaningful customer profiles based on this limited amount of information, four different techniques were compared. While the construction of customer profiles for groups of customers was possible, it proved difficult to thoroughly validate the profiles and methods used without extra reference data.

NVIDIA deep learning workshop

For participants of BNAIC'2017, there is the possibility to follow a free workshop organized by NVIDIA about deep learning. If you are not experienced with using deep learning algorithms and frameworks, but you would like to learn about this, please register for this workshop by sending an email to before 2 November 2017, 12.00h. Note that as there is a maximum of 30 participants for this workshop, registrations will be accepted on a first-mail, first-served basis. More information about this workshop can be found below.

Image Classification with DIGITS

Deep learning enables entirely new solutions by replacing hand-coded instructions with models learned from examples. Train a deep neural network to recognize handwritten digits by:

  • Loading image data to a training environment
  • Choosing and training a network
  • Testing with new data and iterating to improve performance

On completion of this module, you will be able to assess what data you should be training from.

Prerequisites: None
Framework: Caffe with DIGITS Interface
Remarks: You need to bring your own laptop to the workshop.