IR Journal Special Issue on ECIR 2021

by Djoerd Hiemstra and Marie-Francine Moens

The 43rd European Conference on Information Retrieval, ECIR 2021, was supposed to take place as an in-person conference in Lucca, Italy. Due to the COVID-19 pandemic, ECIR 2021 was held entirely online from March 28 to April 1, 2021. The conference programme contained full paper presentations, poster presentations, system demonstrations, eight tutorials, five workshops, an industry event, a doctoral consortium, a reproducibility track, a panel on open access publishing and several online social events.

For this special issue, we asked the authors of eight of the ECIR 2021 full papers that had the best reviewing scores to submit an extended version of their paper. This led to five papers that are published in this special issue of the Information Retrieval Journal. The extended papers contain at least 30% new content. Examples of extensions are enhancements that improve the techniques described in the ECIR 2021 paper, and tests on additional datasets that reveal behaviors that differ from the originally published claims and provide further insights into the methods being described. Among the papers in this special issue are extensions of two papers that received an award at ECIR 2021.

Published in Information Retrieval Journal.

[download pdf]

Felipe Moraes Gomes defends PhD thesis on Collaborative Search

Examining the Effectiveness of Collaborative Search Engines

by Felipe Moraes Gomes

Although searching is often seen as a solitary activity, searching in collaboration with others is deemed useful or necessary in many complex situations such as: travel planning; online shopping; looking for health related information; planning birthday parties; working on a group project; or finding a house to buy. Researchers have found that complex search tasks can be executed more effectively and efficiently, achieve higher material coverage, and enable higher knowledge gains in an explicit collaborative setting than if conducted in isolation. However, even though researchers have carefully designed several Collaborative Search (CSE) user studies, there is still conflicting evidence or a lack of evidence on the effectiveness of CSE systems. Thus, in this thesis, we focus on examining the effectiveness of CSE systems in two parts.

In the first part, we shed light on the effectiveness of CSE to support two group configurations, namely group sizes and users’ roles. Past collaborative search studies have had a strong focus on groups of two or three collaborators, which naturally limits the number of experimental conditions, a number that could otherwise increase quickly. Therefore, there is a lack of evidence on the extent to which a CSE system can support groups larger than these commonly investigated sizes. Thus, in Chapter 3, we study CSE system effectiveness with group size as the primary independent variable. Here, we vary group sizes from two to six collaborators, with six as our upper bound due to limitations on our available resources.

In Chapter 4, we focus on roles in CSE. Roles can determine how a group splits up the search task, and determine each group member’s function (e.g., one group member is responsible for finding documents and reading and evaluating them, with a further member responsible for in-depth reading and evaluating of the aforementioned documents). In particular, when the CSE system assigns a role to each group member, researchers have hypothesised that a group may reduce the time spent communicating and coordinating the task, and make the search process more efficient and successful than groups without role assignment. However, past user studies have provided contradicting evidence as to the utility of assigned roles in CSE. Thus, in Chapter 4, we provide more evidence to settle the question of the effectiveness of CSE systems when used by groups with pre-assigned roles versus groups without pre-assigned roles.

In the second part of this thesis, we hold our group configurations constant: group sizes are limited to at most three people, and all group members receive the same role. We then turn to a different perspective and focus on examining the effectiveness in two contexts: Search as Learning (SAL) and collaborative online shopping. Search activities for human learning involve multiple iterations that require cognitive processing and interpretation, often requiring the searcher to spend time scanning/viewing, comparing, and evaluating information. However, web search engines are not built to support users in the search tasks often required in learning situations. When people use search as a learning activity, it can be an individual activity or a collaborative activity (e.g., group projects). Hence, in Chapter 5, we tackle the challenge of identifying the impact of web search engines on the (single-search or collaborative search) users’ ability to learn compared to learning acquired via high-quality learning materials as a baseline.

In Chapter 6, we look at a further context: collaborative online shopping. In collaborative online shopping, a group of people come together to make a decision to purchase a product that meets the various group members’ requirements and opinions. While shopping together, search is an important part of the task, as the group needs to find products in the catalogue of an e-commerce website. One important aspect of collaborative shopping is supporting awareness and sharing of knowledge, as it can enable a sense of co-presence, which helps groups make a decision that satisfies each group member’s requirements and wishes. As search is a significant part of a collaborative online shopping experience, CSE systems are suitable for executing such tasks. However, there is insufficient evidence of how well CSE systems can support a group of users in searching for online products together and making a group decision. Hence, in Chapter 6, we explore the effects of increased awareness and sharing of knowledge (co-presence) using a CSE system in collaborative shopping on the group decision making process.

[more info]

PhD vacancy for software correctness

We’re hiring a PhD Candidate for Software Correctness. The work will target the correctness of high-level programming languages that are “only” strings in your host language, such as SQL and regular expressions.
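To illustrate what “only strings in your host language” means in practice, here is a minimal Python sketch (the helper names `check_regex` and `check_sql` and the `student` table are hypothetical, chosen for illustration). A malformed regular expression or SQL statement is an ordinary string to the host language, so the mistake only surfaces when the string is handed to the regex engine or the database at run time.

```python
import re
import sqlite3


def check_regex(pattern):
    """Return None if the pattern compiles, else the engine's error message."""
    try:
        re.compile(pattern)
        return None
    except re.error as exc:
        return str(exc)


def check_sql(statement):
    """Return None if SQLite accepts the statement, else its error message."""
    conn = sqlite3.connect(":memory:")
    try:
        # Hypothetical schema, only here so that the valid query can run.
        conn.execute("CREATE TABLE student (name TEXT, grade INTEGER)")
        conn.execute(statement)
        return None
    except sqlite3.Error as exc:
        return str(exc)
    finally:
        conn.close()


# All three arguments are perfectly valid Python string literals.
regex_error = check_regex("(a|b")                   # unbalanced parenthesis
sql_error = check_sql("SELEC name FROM student")    # misspelled keyword
sql_ok = check_sql("SELECT name FROM student WHERE grade > 5")
```

Both faulty strings are rejected only at run time, which is the gap that correctness checking for such embedded languages would need to close.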

Software has shaped almost every aspect of our modern lives. Ensuring that software is correct is both a major scientific challenge and an enterprise with enormous social relevance. Would you like to examine possibilities to introduce a theory for correctness levels for software? Then you have a part to play as a PhD Candidate.

The correctness of software is of major importance in computer science. Unfortunately, the significance of software correctness is not always clear. Furthermore, the automatic checking of software correctness is difficult. This leads to problems during system development projects and during the grading of software exercises.

This PhD candidate position is intended for four years. You understand the importance of correct software and know how to work with several meanings of software correctness. Your goal is to introduce a generic and formal theory of software correctness levels, in which partially correct/incorrect software can be handled in a flexible way.

You will put the generic theory into practice, by experimenting with automatic grading of software exercises in the context of our courses. One application will be dealing with automatic grading of SQL statements. You will be supervised by Patrick van Bommel and Djoerd Hiemstra. Profile:

  • You are an enthusiastic and motivated researcher.
  • You should have a Master’s degree in computer science, or a Master’s degree in mathematics and a demonstrable interest in computer science.

[Apply On-line]

(Deadline: 6 March 2022)

Dutch-Belgian Information Retrieval Workshop 2021½

The program for DIR 2021½ is out. DIR 2021½ will run on four consecutive Fridays as online Search Engine Amsterdam meetups. Register now!

Session 1, 4 February 2022

  • Keynote 1 by Maria Maistro (Uni. of Copenhagen): How can we measure reproducibility of IR experiments?

Session 2, 11 February 2022

  • Ali Vardasbi (University of Amsterdam): Mixture-Based Correction for Position and Trust Bias in Counterfactual Learning to Rank
  • Sepideh Mesbah (Randstad Groep): Using RobBERT and eXtreme Multi-Label Classification to Extract Implicit and Explicit Skills From Dutch Job Descriptions
  • Hideaki Joko (Radboud University): Conversational Entity Linking: Problem Definition and Datasets
  • Liesbeth Allein (KU Leuven): Time-aware evidence ranking for fact-checking
  • Mozhdeh Ariannezhad (University of Amsterdam): Understanding Multi-channel Customer Behavior in Retail

Session 3, 18 February 2022

  • Garett Allen (TU Delft): Supercalifragilisticexpialidocious: Why Using the “Right” Readability Formula in Children’s Web Search Matters
  • Carsten Schnober (WizeNoze): Neural Information Retrieval for Educational Resources
  • Olivier Jeunen (Amazon): Embarrassingly shallow auto-encoders for dynamic collaborative filtering
  • Zhe Roger (TU Delft): Leave No User Behind: Towards Improving the Utility of Recommender Systems for Non-mainstream Users
  • Harrie Oosterhuis (Radboud University): Computationally Efficient Optimization of Plackett-Luce Ranking Models for Relevance and Fairness

Session 4, 25 February 2022

  • Keynote 2 by Gabriella Kazai (Microsoft Research): IR Evaluation – An Industry Perspective

Web Analytics & Privacy workshop

On Thursday 23 December, the NoGA team organizes the first Web Analytics and Privacy workshop. The morning features a demonstration of the open source analytics system Matomo, and the afternoon features two excellent guest speakers: Frederik Zuiderveen Borgesius and Güneş Acar.

Frederik Zuiderveen Borgesius will talk about behavioural targeting, privacy, and the law, discussing the troubled relationship between contemporary advertising technology (adtech) systems, in particular systems of real-time bidding (RTB, also known as programmatic advertising) underpinning much behavioural targeting on the web and through mobile applications.

Güneş Acar will talk about browser fingerprinting and personal data exfiltration on the web, discussing the results of a study into data exfiltration by third-party scripts directly embedded on web pages. Specifically, Güneş will discuss three attacks: misuse of browsers’ internal login managers, social data exfiltration, and whole-DOM exfiltration.


Maurice Verbrugge graduates on the BERT Ranking Paradigm

The BERT Ranking Paradigm: Training Strategies Evaluated

by Maurice Verbrugge

This thesis researches the most recent paradigm in information retrieval, which applies the neural language representation model BERT to rank relevant passages out of a corpus. The research focuses on a re-ranker scheme that uses BM25 to pre-rank the corpus followed by BERT-based ranking, exploring better fine-tuning methodology for a pre-trained BERT. This goal is pursued in two parts: in the first, all methods rely on binary relevance labels, while the second applies methods that rely on multiple relevance labels instead. Part one researches training data enhancement and inductive transfer learning methods. Part two researches the application of single class multi label methods, multi class multi label methods and label-based regression. In all parts, the methods were evaluated on the fully annotated Cranfield dataset.

This thesis demonstrates that applying inductive transfer learning with the Next Sentence Prediction task improves the baseline by presenting various methods to enrich the fine-tuning data for different levels of the BM25-BERT ranking pipeline. Also, this thesis demonstrates that application of a regression method results in above-baseline performance. This indicates the superiority of this method over rule-based filtering of classifier results.
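The two-stage scheme described above (BM25 pre-ranking, then re-ranking of the top candidates) can be sketched in a few lines of Python. This is a minimal illustration, not the thesis's implementation: the BM25 variant and its parameters `k1` and `b` are common defaults chosen here, the three documents are made up, and `overlap_scorer` is a hypothetical stand-in for a fine-tuned BERT relevance model.

```python
import math
from collections import Counter


def tokenize(text):
    return text.lower().split()


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score every document in `docs` against `query` with Okapi BM25."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n_docs
    # Document frequency: in how many documents does each term occur?
    df = Counter(term for t in tokenized for term in set(t))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def rerank(query, docs, rerank_fn, top_k=2):
    """Pre-rank with BM25, then re-order the top_k candidates with rerank_fn."""
    scores = bm25_scores(query, docs)
    order = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
    candidates = order[:top_k]
    return sorted(candidates, key=lambda i: rerank_fn(query, docs[i]), reverse=True)


def overlap_scorer(query, passage):
    """Hypothetical stand-in for a fine-tuned BERT relevance model."""
    q = set(tokenize(query))
    return len(q & set(tokenize(passage))) / len(q)


# Made-up example documents.
docs = [
    "boundary layer flow over a flat plate",
    "heat transfer in supersonic flow",
    "library catalogue indexing methods",
]
ranking = rerank("supersonic heat transfer", docs, overlap_scorer)
```

Here `ranking` holds the indices of the `top_k` BM25 candidates in the order assigned by the second-stage scorer. In a real pipeline, the second stage would be the expensive neural model, which is why it only sees the BM25 candidates.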

[download pdf]

Casper van Aarle graduates on Federated Regression Analysis

Federated Regression Analysis on Personal Data Stores: Improving the Personal Health Train

by Casper van Aarle

Due to regulations and increased privacy awareness, patients may be reticent about sharing data with any institution. The Personal Health Train is an initiative to connect different data institutions for data analysis while maintaining full authority over their data. The Personal Health Train may not only connect larger institutions but also connect smaller, possibly on-device personal data stores, where data is safely and separately stored.

This thesis explores possible solutions in the literature that guarantee data-privacy and model-privacy, and it shows the practical feasibility when learning over a large number of personal data stores. We specifically regard the generation of linear regression and logistic regression models over personal data stores. We experiment with different design choices to optimise the convergence of our training architecture.

We discuss the PrivFL protocol, which takes into account both data-privacy and model-privacy when learning a regression model and is applicable to personal data stores. We further propose a standardisation protocol, Secure Scaling Operation, that guarantees data-privacy for patients; our experiments show that it improves convergence more than an adaptive gradient does.

We implement an architecture that can learn over personal data stores and which preserves user privacy in FedLinReg-v2 and FedLogReg-v2. While, in theory, no convergence is guaranteed, training over various datasets shows a difference in loss of 0 to 0.33% on both training and test sets compared to models that are centrally optimised. No parameter optimisation was necessary. The coefficients, however, may deviate from those of centrally trained models. We were able to train regression models while preserving data-privacy over 150 personal data stores in minutes. An even higher level of data-privacy will cause a strong linear increase in computation time in relation to the number of personal data stores included.
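The general idea of learning a linear regression model over separate data stores, with feature standardisation computed from per-store aggregates, can be sketched as follows. This is a plain, unencrypted illustration under our own assumptions: it is not PrivFL, not the Secure Scaling Operation, and not FedLinReg-v2, and it offers none of their privacy guarantees. All function names and the toy data are hypothetical.

```python
import math


def local_stats(xs):
    """Runs inside a store: share only count, sum and sum of squares."""
    return len(xs), sum(xs), sum(x * x for x in xs)


def global_mean_std(stats):
    """Runs on the coordinator: combine the per-store aggregates."""
    n = sum(s[0] for s in stats)
    total = sum(s[1] for s in stats)
    total_sq = sum(s[2] for s in stats)
    mean = total / n
    var = total_sq / n - mean * mean
    return mean, math.sqrt(max(var, 0.0))


def local_gradient(store, w, b, mean, std):
    """Runs inside a store: summed squared-error gradient on standardised x."""
    gw = gb = 0.0
    for x, y in store:
        z = (x - mean) / std
        err = w * z + b - y
        gw += err * z
        gb += err
    return gw, gb, len(store)


def train(stores, steps=200, lr=0.1):
    """Coordinator loop: average the store gradients and update the model."""
    stats = [local_stats([x for x, _ in s]) for s in stores]
    mean, std = global_mean_std(stats)
    w = b = 0.0
    for _ in range(steps):
        grads = [local_gradient(s, w, b, mean, std) for s in stores]
        n = sum(g[2] for g in grads)
        w -= lr * sum(g[0] for g in grads) / n
        b -= lr * sum(g[1] for g in grads) / n
    return w, b, mean, std


# Toy data following y = 2x + 1, split over three hypothetical stores.
stores = [[(1.0, 3.0), (2.0, 5.0)], [(3.0, 7.0)], [(4.0, 9.0), (5.0, 11.0)]]
w, b, mean, std = train(stores)


def predict(x):
    return w * (x - mean) / std + b
```

In this sketch the raw `(x, y)` pairs never leave a store; only aggregates and gradients do. Those shared values can still leak information, which is the kind of exposure that protocols such as the ones discussed in the thesis are designed to address.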

[download pdf]

Vacancy: PhD Candidate for Fairness and Non-discrimination in Machine Learning for Retrieval and Recommendation

Information retrieval and recommender systems based on machine learning can be used to make decisions about people. Government agencies can use such systems to detect welfare fraud, insurers can use them to predict risks and to set insurance premiums, and companies can use them to select the best people from a list of job applicants. Such systems can lead to more efficiency, and could improve our society in many ways. However, such AI-driven decision-making also brings risks. This project focuses on the risk that such AI systems lead to illegal discrimination, for instance harming people of a certain ethnicity, or other types of unfairness. A different type of unfairness could concern, for instance, a system that reinforces financial inequality in society.

Recent machine learning work on measures of fairness has resulted in several competing approaches for measuring fairness. There is no consensus on what is the best way to measure fairness and the measures often depend on the type of machine learning that is applied. Based on the application of existing measures on real-world data, we suspect that many proposed measures are not that helpful in practice.

In this project, you will study measures of fairness, answering questions such as the following. To what extent can legal non-discrimination norms be translated into fairness measures for machine learning? Can we measure fairness independently of the machine learning approach? Can we show which machine learning methods are the most appropriate to achieve non-discrimination and fairness? The project concerns primarily machine learning for information retrieval and recommendation, but is interdisciplinary, as it is also informed by legal norms. The project will be supervised by Professor Hiemstra, professor of data science and federated search, and Professor Zuiderveen Borgesius, professor of ICT and law.


  • You hold a completed Master’s Degree or Research Master’s degree in computer science, data science, machine learning, artificial intelligence, or a related discipline.
  • You have good programming skills.
  • You have good command of spoken and written English.
  • We encourage you to apply even if you think you do not meet all the requirements.


2nd Dutch meeting on Clinical NLP

Now that electronic health records are commonly used, the availability of clinical texts is growing. This workshop discusses the automatic analysis of textual clinical health data to advance medical research and improve healthcare related services. We especially encourage presentations discussing possibilities to share clinical texts, models and tools for clinical natural language processing (NLP). In practice, privacy and legal regulations prevent the free sharing and combination of electronic health records themselves, but de-identified texts, NLP tools and intermediate results may be shared. We hope that sharing will promote cooperation within the Dutch-speaking countries, as well as advance the research in Clinical NLP in those countries. Relevant topics include, but are not limited to:

  • Data sets with clinical texts
  • Open source tools for Clinical NLP
  • Information extraction from clinical text
  • Information retrieval for clinical text
  • Adapting standard NLP tools for clinical text
  • De-identification and ways to preserve privacy in clinical data
  • Using medical terminologies and ontologies
  • Annotation schemes and annotation methodology for clinical data
  • Evaluation methods for the clinical domain
  • Text-based clinical prediction models
  • Speech recognition for clinical text

We solicit short presentations (15 to 20 minutes) from researchers covering recent work, including work in progress and work that was recently published in journals and/or at conferences in the field or made available via data and software sharing platforms like Zenodo or GitHub. Please email the title and abstract of your presentation before 12 October 2021.
