I present three well-known probabilistic models of information retrieval in tutorial style: The binary independence probabilistic model, the language modeling approach, and Googleâ€™s page rank. Although all three models are based on probability theory, they are very different in nature. Each model seems well-suited for solving certain information retrieval problems, but not so useful for solving others. So, essentially each model solves part of a bigger puzzle, and a unified view on these models might be a first step towards an Information Retrieval Theory of Everything.
The paper is published in the news letter of the NVTI, the “Nederlandse Vereniging voor Theoretische Informatica”. A more extensive overview of information retrieval theory, covering eight models is given in: Djoerd Hiemstra, Information Retrieval Models. In: Ayse Goker and John Davies (eds.), Information Retrieval: Searching in the 21st Century, Wiley, 2009.