Gebre Gebremeskel defends PhD thesis on recommender systems

Spotlight on Recommender Systems: Contributions to Selected Components in the Recommendation Pipeline

by Gebrekirstos Gebremeskel

This thesis sheds light on the different components of the recommendation pipeline under three themes, which are divided into 10 chapters.

The first theme is Cumulative Citation Recommendation (CCR), a task aimed at automating the maintenance of knowledge bases such as Wikipedia. Given a set of knowledge base entities, CCR is the task of filtering and ranking documents according to their citation worthiness to the entities. We specifically focused on the filtering stage of the recommendation process and the interplay between feature sets and machine learning algorithms. There are four chapters under the first theme: Chapters 3 to 6. Chapter 3 presents experiments with string-matching and machine learning approaches to the task of CCR. Chapter 4 investigates the interplay between the choice of feature sets and their impact on the performance of machine learning algorithms. Chapter 5 investigates the impact of the initial filtering task on overall CCR performance, and what makes some documents unfilterable. Chapter 6 reviews new advances in the area of the theme and the specific chapters. Under this theme, we show that simple string-matching approaches can have advantages over complex machine learning approaches for the task of CCR, that comparisons of machine learning algorithms should take into account the sets of features used, and that the filtering stage of a CCR task can impact recommender system performance in different ways.

The second theme is News Recommendation. In this theme, we investigate news recommendation with a particular focus on evaluation. We study the role of geography in news consumption to understand the geographical focus of news items and the geographical location of readers, followed by the incorporation of geographic information into online deployments of algorithms. We also attempt to quantify random fluctuations in the performance difference of a live recommender system. After that, we focus on the evaluation of news recommendation, investigating it from several angles. We conducted A/A tests (running two instances of the same algorithm), offline evaluations, online evaluations, and comparisons of algorithm performances across years. There are three chapters under the theme of News Recommendation. Chapter 7 investigates the role of geographic information in news consumption, and examines, in a real-world setting, the performance patterns of news recommender systems, one of which incorporates geographic information into its algorithm. Chapter 8 examines the challenges, validity, and consistency of news recommender system evaluations from multiple perspectives, involving A/A tests, offline evaluations, online evaluations, and comparisons of algorithm performances across years. Chapter 9 reviews advances in News Recommendation with a focus on developments that have relevance to the approaches and findings presented in Chapters 7 and 8. Under this theme, we show that user and item geography play a role in the consumption of news, that there are significant differences and discrepancies between offline and online evaluation of recommender system algorithms, and that random effects on online performance can result in statistically significant performance differences.

The third and final theme is Measuring Personalization and consists of Chapter 10. We view personalization as introducing or imposing differentiation between users in terms of the items recommended to them. In the differentiation, some items will be shared between users, and some will not. We then propose and apply a user-centric metric of personalization. Using the recommendation lists and the reaction lists that result from users choosing which items to click or react to, the metric measures the degree to which users tend to agree with the differentiation introduced or imposed between them by the recommender system, to converge (by, for example, clicking more on shared items), or to diverge from the differentiation (by, for example, clicking more on the items that are not in the shared recommendations).
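The A/A tests mentioned under the News Recommendation theme can be illustrated with a small simulation. The sketch below is not the thesis's experimental setup: the click-through rate, sample sizes, and the choice of a pooled two-proportion z-test are all assumptions made for illustration. It runs two instances of the "same" recommender many times and counts how often chance alone produces a difference that the test flags as significant.

```python
import math
import random


def simulate_clicks(n_impressions, ctr, rng):
    """Count clicks for one recommender instance with a fixed true click-through rate."""
    return sum(1 for _ in range(n_impressions) if rng.random() < ctr)


def two_proportion_p_value(clicks_a, clicks_b, n):
    """Two-sided p-value of a pooled two-proportion z-test with equal sample sizes."""
    pooled = (clicks_a + clicks_b) / (2 * n)
    std_err = math.sqrt(2 * pooled * (1 - pooled) / n)
    if std_err == 0:
        return 1.0  # no variation at all, so no evidence of a difference
    z = (clicks_a - clicks_b) / n / std_err
    return math.erfc(abs(z) / math.sqrt(2))


def false_alarm_rate(n_tests=400, n_impressions=2000, ctr=0.05, alpha=0.05, seed=7):
    """Fraction of A/A tests (identical instances) that look significant at level alpha."""
    rng = random.Random(seed)
    significant = 0
    for _ in range(n_tests):
        clicks_a = simulate_clicks(n_impressions, ctr, rng)
        clicks_b = simulate_clicks(n_impressions, ctr, rng)
        if two_proportion_p_value(clicks_a, clicks_b, n_impressions) < alpha:
            significant += 1
    return significant / n_tests


if __name__ == "__main__":
    print(f"A/A tests flagged as significant: {false_alarm_rate():.1%}")
```

Because both instances share the same true click-through rate, every flagged difference in this simulation is a random effect, which is the kind of fluctuation an A/A test is meant to expose.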

[Read more]

Selective Search as a First-Stage Retriever

by Gijs Hendriksen, Djoerd Hiemstra, and Arjen de Vries

Selective search assumes a document collection can be partitioned into topical index shards in such a way that individual search requests would be satisfied with a few shards only. Previous work has considered primarily the retrieval effectiveness of selective search architectures in an early precision setting. In this work, we instead consider selective search as the first stage in a multi-stage pipeline, and therefore focus on obtaining high recall. We reproduce the most important algorithms from the selective search literature, and show that they can match the recall level of exhaustive search while reducing the required resources by 50%. We compare the different types of resource selection algorithms, and conclude that the more straightforward strategies that can select shards at a low cost actually outperform the more involved algorithms, in terms of reliably obtaining high recall with fewer shards.
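As a rough illustration of the idea, the following sketch ranks shards with a very cheap statistic and measures recall against exhaustive search. The toy collection, the function names, and the scoring rule (summed document frequency of the query terms per shard) are assumptions for this example and are not taken from the paper.

```python
from collections import Counter

# Hypothetical toy collection: shard name -> {doc id -> text}.
SHARDS = {
    "sports": {"d1": "football match score", "d2": "tennis match final"},
    "cooking": {"d3": "pasta recipe sauce", "d4": "bread recipe oven"},
    "travel": {"d5": "train travel match schedule", "d6": "cheap flight travel"},
}


def shard_term_counts(shard):
    """Cheap shard summary: how many documents in the shard contain each term."""
    counts = Counter()
    for text in shard.values():
        counts.update(set(text.split()))
    return counts


SUMMARIES = {name: shard_term_counts(docs) for name, docs in SHARDS.items()}


def select_shards(query, k):
    """Rank shards by summed document frequency of the query terms; keep the top k."""
    terms = query.split()
    ranked = sorted(
        SUMMARIES,
        key=lambda name: sum(SUMMARIES[name][t] for t in terms),
        reverse=True,
    )
    return ranked[:k]


def search(query, shard_names):
    """Return ids of documents that contain at least one query term."""
    terms = set(query.split())
    return {
        doc_id
        for name in shard_names
        for doc_id, text in SHARDS[name].items()
        if terms & set(text.split())
    }


def recall_vs_exhaustive(query, k):
    """Share of the exhaustive result set that is still found in the top-k shards."""
    exhaustive = search(query, SHARDS)
    selective = search(query, select_shards(query, k))
    return len(selective) / len(exhaustive) if exhaustive else 1.0
```

In this toy setup, `recall_vs_exhaustive("match", 2)` searches two of the three shards and still finds every matching document, while `recall_vs_exhaustive("match", 1)` loses the match stored in the travel shard.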

To be presented at the 16th Conference and Lab of the Evaluation Forum (CLEF), in September 2025 in Madrid

[download pdf]

IRRJ Volume 1, Number 1

We are proud to introduce the first issue of the Information Retrieval Research Journal (IRRJ). IRRJ is the only peer-reviewed diamond open access journal that focuses exclusively on the information retrieval research community. The journal provides free and unrestricted online open access to papers in information retrieval, and runs fully on volunteer work by editors, reviewers, a production editor, a webmaster, and an advisory board. IRRJ requires neither subscription fees nor article processing fees: at IRRJ, the readers do not pay and the authors do not pay either. Instead, IRRJ plans to be completely self-funded, running on micro-donations, using resources and infrastructure provided by friend organizations and universities. We are grateful to Radboud University and the Royal Netherlands Academy of Arts and Sciences (KNAW) for providing the initial funding and infrastructure.

[read more]

Score-Fitted Indexes and Constant Length Indexes for Information Retrieval

by Djoerd Hiemstra

We present two novel inverted index approaches and corresponding query processing strategies for information retrieval: 1) score-fitted indexes and 2) constant length indexes. These indexes do not store document lengths and document priors, but nevertheless support popular rankers like BM25 and language models by approximating their results. We answer the question: What is the effect of score-fitted indexes and constant length indexes, and the combination of both approaches, on the search quality? We show on three diverse datasets that the two indexes perform on par with approaches that use a standard inverted index in almost all cases. Our work suggests that it is possible to develop search engines that are more efficient than engines that store document lengths and/or document priors, such as Lucene and Terrier.

To be presented at the 48th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2025)

[download pdf] [poster]

Efficient Session Retrieval Using Topical Index Shards

by Gijs Hendriksen, Djoerd Hiemstra, and Arjen de Vries

Retrieval is often considered one query at a time. However, in practice, queries regularly come in the context of sessions with coherent topics. By dividing a collection into topical index shards and matching the topical context of a session with the right shards, we may reduce the amount of resources required for answering each query. We consider two alternatives: (1) starting with exhaustive search and pruning unnecessary shards after each session turn, and (2) applying a resource selection algorithm to pre-select shards at the start of the session. We empirically evaluate our approaches on a conversational search dataset (CAsT), and compare effectiveness and resource usage against exhaustive retrieval. Our experiments show that both approaches reduce the number of postings necessary to fulfill a search request (by 50-80%), and in terms of effectiveness our systems are statistically indistinguishable from a system performing exhaustive retrieval.
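The first alternative, starting exhaustively and pruning shards after each session turn, can be sketched as follows. The toy index and the pruning rule (keep only shards that contributed a document to the top-k of the previous turn) are assumptions for illustration; the paper's actual pruning criteria may differ.

```python
# Hypothetical toy index: shard name -> {doc id -> set of terms}.
SHARDS = {
    "health": {"h1": {"flu", "symptoms"}, "h2": {"flu", "vaccine", "risks"}},
    "finance": {"f1": {"stock", "market"}, "f2": {"bank", "loan"}},
    "music": {"m1": {"guitar", "chords"}},
}


def search_shard(docs, query_terms):
    """Score documents by the number of matching query terms; drop non-matches."""
    scores = {doc_id: len(terms & query_terms) for doc_id, terms in docs.items()}
    return {doc_id: score for doc_id, score in scores.items() if score > 0}


def run_session(queries, top_k=2):
    """Answer each turn on the active shards, then prune shards outside the top k."""
    active = set(SHARDS)  # the first turn is exhaustive
    log = []
    for query in queries:
        query_terms = set(query.split())
        hits = {}  # doc id -> (score, shard name)
        for name in sorted(active):
            for doc_id, score in search_shard(SHARDS[name], query_terms).items():
                hits[doc_id] = (score, name)
        ranked = sorted(hits, key=lambda d: (-hits[d][0], d))[:top_k]
        log.append((query, sorted(active), ranked))
        contributing = {hits[d][1] for d in ranked}
        if contributing:  # keep the current shards if nothing matched
            active = contributing
    return log


if __name__ == "__main__":
    for query, shards_used, results in run_session(["flu symptoms", "vaccine risks"]):
        print(query, shards_used, results)
```

In this toy session, the first turn searches all three shards and the second turn only the health shard, which is where the saving in processed postings would come from.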

To be presented at the European Conference on Information Retrieval (ECIR 2025) in Lucca, Italy on 6-10 April 2025.

[download pdf]

2nd Workshop on Open Web Search

The Second International Workshop on Open Web Search (WOWS) aims to promote and discuss ideas and approaches to open up the web search ecosystem so that small research groups and young startups can leverage the web to foster an open and diverse search market. Therefore, the workshop, which takes place at ECIR2025, has two calls that support collaborative and open web search engines: (1) for scientific contributions, and (2) for participation in the WOWS-Eval shared task for collaborative evaluations of the Open Web Index.

The first call aims for scientific contributions to building collaborative search engines, including collaborative crawling, collaborative search engine deployment, collaborative search engine evaluation, and collaborative use of the web as a resource for researchers and innovators.

The second call, on the WOWS-Eval shared task, aims at gaining practical experience with joint, cooperative evaluation of search engines by focusing on enriching the Open Web Index (OWI) with relevance judgments cooperatively transferred from existing TREC-style test collections.

More information:
https://opensearchfoundation.org/en/events-osf/wows2025/

Unreliable algorithm labels youth as future criminals

Police and judicial authorities in the Netherlands use an algorithm on tens of thousands of young people to predict whether they will end up in crime. In practice, this prediction often turns out to be wrong, while the consequences can be huge. It can make the difference between having a criminal record or not.

I was interviewed for this article in Follow the Money with colleagues Tim de Jonge and Frederik Zuiderveen-Borgesius, amongst others.

Read more (in Dutch) on: Follow the Money.
(also in Metro Nieuws and on Reddit)

Edit: The article was discussed in the Dutch parliament on 6 March.

DB guest lecture by Hannes Mühleisen

We are proud to announce that Hannes Mühleisen will give a guest lecture on Tuesday 10 December at 15:30h. in EOS N 01.630 for the course Information Modelling and Databases. Hannes Mühleisen is professor of Data Engineering at Radboud University, the creator of DuckDB and co-founder and CEO of DuckDB Labs. Students of the course use DuckDB to practice their SQL skills.

Analytical Query Processing and the DuckDB System

by Hannes Mühleisen

DBMSs have historically been created to support transactional (OLTP) workloads. However, a second use case, analytical data analysis (OLAP), quickly appeared. These workloads are characterised by complex, relatively long-running queries that process significant portions of the stored dataset, for example aggregations over entire tables or joins between several large tables. It's rather impossible for an OLTP-focused DBMS to perform well in OLAP scenarios, which is why specialised systems have been developed. In this lecture, I will introduce analytical query processing, give an overview of the state of the art in research and industry, and describe our own analytical DBMS, DuckDB.

Introducing Zoekeend

We made a little tool for running information retrieval experiments using DuckDB, and appropriately called it Zoekeend (Dutch for “search duck”). Zoekeend will be presented at DuckCon #6 in Amsterdam on 31 January 2025.

I will present several reproduced experiments, such as ranking using (small) language models, imports of indexes in the common index file format (CIFF), and the CIFF tokenizer based on tokenizers of large language models, all elegantly defined as SQL queries. I will further present ongoing work on new types of indexes for search engines, such as the score-fitted index, the constant length index and the term-grouped index, all of which would be extremely cumbersome to implement in existing search engines like Lucene, but can be easily defined as SQL queries in DuckDB. Zoekeend will greatly simplify information retrieval experimentation. Zoekeend is open source and available from: https://gitlab.science.ru.nl/informagus/zoekeend/
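To give a feel for what ranking defined as a SQL query can look like, here is a minimal sketch of BM25 over a tiny inverted index. It is not Zoekeend's code: it uses Python's built-in `sqlite3` module as a stand-in for DuckDB so that it runs without extra packages, and the table layout (`docs`, `postings`), the parameters k1 = 1.2 and b = 0.75, and the Lucene-style idf are assumptions made for this example.

```python
import math
import sqlite3
from collections import Counter

# Hypothetical toy collection: doc id -> text.
DOCS = {1: "search duck search", 2: "duck pond", 3: "web search engine"}

# BM25 with k1 = 1.2 and b = 0.75, written as a single SQL query.
BM25_SQL = """
WITH stats AS (SELECT COUNT(*) AS n, AVG(doc_len) AS avgdl FROM docs),
     term_stats AS (SELECT term, COUNT(*) AS df FROM postings GROUP BY term)
SELECT p.docid,
       SUM(ln((s.n - t.df + 0.5) / (t.df + 0.5) + 1.0)
           * p.tf * (1.2 + 1.0)
           / (p.tf + 1.2 * (1.0 - 0.75 + 0.75 * d.doc_len / s.avgdl))) AS score
FROM postings AS p
JOIN docs AS d ON d.docid = p.docid
JOIN term_stats AS t ON t.term = p.term
CROSS JOIN stats AS s
WHERE p.term IN ({placeholders})
GROUP BY p.docid
ORDER BY score DESC, p.docid
"""


def build_index(docs):
    """Create an in-memory database with a docs table and a postings table."""
    conn = sqlite3.connect(":memory:")
    # Register ln() explicitly, since not every SQLite build ships math functions.
    conn.create_function("ln", 1, math.log)
    conn.execute("CREATE TABLE docs (docid INTEGER PRIMARY KEY, doc_len INTEGER)")
    conn.execute("CREATE TABLE postings (term TEXT, docid INTEGER, tf INTEGER)")
    for docid, text in docs.items():
        tokens = text.split()
        conn.execute("INSERT INTO docs VALUES (?, ?)", (docid, len(tokens)))
        conn.executemany(
            "INSERT INTO postings VALUES (?, ?, ?)",
            [(term, docid, tf) for term, tf in Counter(tokens).items()],
        )
    return conn


def bm25_search(conn, query):
    """Return (docid, score) pairs ranked by BM25 for a whitespace-tokenized query."""
    terms = query.split()
    sql = BM25_SQL.format(placeholders=", ".join("?" for _ in terms))
    return conn.execute(sql, terms).fetchall()


if __name__ == "__main__":
    for docid, score in bm25_search(build_index(DOCS), "search duck"):
        print(docid, round(score, 3))
```

Keeping the ranker in one query means that trying a different scoring function or index layout only requires changing the SQL, not the surrounding code.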

Alisa Rieger defends PhD thesis on responsible opinion formation

Striving for responsible opinion formation in web search on debated topics

by Alisa Rieger

Web search plays an important role in the contemporary information landscape, shaping individual and collective knowledge by providing fast and effortless access to vast amounts of resources. We rely on web search engines for various information needs, some of which can carry serious consequences. This is particularly evident when searching for information on debated topics, which can shape opinions and practical decisions. Debated topics are characterized by diverse and often opposing perspectives linked to different values and interests. Ideally, individuals would diligently engage with different perspectives to become well-informed and form opinions responsibly. However, engaging with information on debated topics can be cognitively demanding and subject to emotionally charged and biased behavior. When resorting to web search to find information on debated topics, searchers may be confronted with further obstacles. For instance, search engines are known to apply opaque ranking criteria, may not provide sufficient viewpoint diversity, and might foster over-reliance.

In this dissertation, we present different user studies aimed at better understanding the challenges of web search on debated topics and identifying measures to help searchers overcome these challenges. We first explored whether and how factors inherent to the searcher and search interface affect search behavior. Then, we investigated the risks and benefits of interventions to guide search behavior as well as empower searchers, aiming at supporting unbiased and diligent search interactions without restricting searcher autonomy. Our findings underscore the unique characteristics of web search on debated topics and provide a foundation for designing, tailoring, and evaluating interventions to support searchers. Considering the overall insights gained through our user studies, it becomes clear that the most pivotal challenges of web search on debated topics arise from the complex searcher-system interplay. Rather than turning to simple fixes, there is a need to acknowledge the complexity of the issue and commit to comprehensive investigations and solutions to avoid inadvertently exacerbating risks. Laying the groundwork for future investigations, we provide an extensive review of interdisciplinary literature with a detailed account of challenges and research opportunities.

With this dissertation, we raise awareness for the pressing socio-technical issues related to digital media and opinion formation and aspire to encourage interdisciplinary research teams, practitioners, and policymakers to join forces in establishing web search environments that foster individual and societal well-being.

[more information]