Score-Fitted Indexes and Constant Length Indexes for Information Retrieval

by Djoerd Hiemstra

We present two novel inverted index approaches, with corresponding query processing strategies, for information retrieval: 1) score-fitted indexes and 2) constant length indexes. These indexes store neither document lengths nor document priors, yet they support popular rankers such as BM25 and language models by approximating their results. We answer the question: what is the effect of score-fitted indexes, constant length indexes, and the combination of both approaches on search quality? On three diverse datasets, we show that the two indexes perform on par with approaches that use a standard inverted index in almost all cases. Our work suggests that it is possible to develop search engines that are more efficient than engines that store document lengths and/or document priors, such as Lucene and Terrier.

To be presented at the 48th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2025)

[download pdf]
