Information Retrieval Models Tutorial

Many applications that handle information on the internet would be completely inadequate without the support of information retrieval technology. How would we find information on the world wide web if there were no web search engines? How would we manage our email without spam filtering? Much of the development of information retrieval technology, such as web search engines and spam filters, requires a combination of experimentation and theory. Experimentation and rigorous empirical testing are needed to keep up with increasing volumes of web pages and emails. Furthermore, experimentation and constant adaptation of technology are needed in practice to counteract the effects of people who deliberately try to manipulate the technology, such as email spammers. However, if experimentation is not guided by theory, engineering becomes trial and error. New problems and challenges for information retrieval come up constantly. They cannot possibly be solved by trial and error alone. So, what is the theory of information retrieval? There is no single convincing answer to this question. There are many theories, here called formal models. Each model is helpful for the development of some information retrieval tools, but not so helpful for the development of others. In order to understand information retrieval, it is essential to learn about these retrieval models. In this chapter, some of the most important retrieval models are gathered and explained in a tutorial style.

The tutorial will be published in Ayse Goker and John Davies (eds.), Information Retrieval: Searching in the 21st Century, Wiley, 2009.

[download draft]

[download exercise solutions]

DIR industry talk by Rene van Erk

Rene van Erk is a member of the European management team of Wolters Kluwer, where he is responsible for all Product and Business Development. In this role, his key responsibility is to optimize the WK portfolio for maximum growth, meaning: a) M&A focus: responsible for identifying acquisition opportunities with a good strategic portfolio; b) Leading innovation: overall responsible for Product Development & Product Management, moving WK from content provider to information solutions provider; c) Leading our online and software businesses: Wolters Kluwer currently owns around 30 software development companies and leading online properties across Europe. Rene will talk about "Communities and Workflow: Driving Information Consumption".

More info at: DIR 2009

Supervisors for groups

All groups have now been assigned a supervisor; see under “Messages/groups” on TeleTOP. The deadline for the project plan is 30 October. Make sure you have talked with your supervisor at least twice before you submit your plan. Send your supervisor an email to make an appointment.

Example reports for the course Information Retrieval

We have collected three examples of reports submitted for the course Information Retrieval in earlier years. These are reports we found particularly good. To whet your appetite, we have collected here versions that have been further developed into papers for workshops and conferences. Yes, that's possible in this course! All links point to the faculty's ePrints service and should be clickable, certainly if you access them from a machine in the UT domain (please report it if they don't work).

  1. Bockting, S. and Ooms, M.J. and Hiemstra, D. and van der Vet, P.E. and Huibers, T.W.C. (2008) Evaluating Relevance Feedback: An Image Retrieval Interface for Children. In: Proceedings of the Dutch-Belgian Information Retrieval Workshop, 14-15 Apr 2008, Maastricht. pp. 15-20. University of Maastricht. ISBN 978-90-5681-282-9
  2. Ben Moussa, M. and Pasch, M. and Hiemstra, D. and van der Vet, P.E. and Huibers, T.W.C. (2007) The Potential of User Feedback Through the Iterative Refining of Queries in an Image Retrieval System. In: Adaptive Multimedia Retrieval: User, Context and Feedback, 27-28 July 2006, Geneva, Switzerland. pp. 258-268. Lecture Notes in Computer Science 4398. Springer Verlag. ISSN 0302-9743 ISBN 978-3-540-71544-3
  3. Hoekstra, A.H. and Hiemstra, D. and van der Vet, P.E. and Huibers, T.W.C. (2006) Question Answering for Dutch: Simple does it. In: Proceedings of the 18th BeNeLux Conference on Artificial Intelligence (BNAIC), 5-6 Oct 2006, Namur, Belgium. BNVKI. ISSN 1568-7805

More info on TeleTOP.

DB Colloquium of Tuesday 29 January

The DB Colloquium of Tuesday 29 January, 14:00 h.-15:00 h. in ZI-3126, consists of two short presentations.

Comprehending historical election programs using XML and XRPC

by Douwe van der Meij

Party programs for elections can be incomprehensible, and comparing current party programs with those of a decade ago is harder still. This paper focuses on a way to comprehend the latter. It shows how to use XML to store election programs and to query them. The paper also comes with a proof of concept (PoC). In retrospect, we look at this PoC and discuss the design choices made.


Finding books without an age category faster

by Wout Maaskant

In this study, a system was developed that lets users more quickly find books that have not been assigned an age category in a bol.com corpus, by using properties of similar books that have been assigned an age category. The similarity between books is determined using the vector space model.
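
As an illustration of the vector space model mentioned in this abstract, here is a minimal sketch, not the system built in the study. It assumes that each book is represented by a bag-of-words vector of its description, that similarity is the cosine between two such vectors, and that an unlabelled book simply borrows the age category of its most similar labelled book. The function names and the toy data are hypothetical.

```python
import math
from collections import Counter


def term_vector(text):
    """Represent a text as a sparse bag-of-words vector (term -> raw count)."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is similar to nothing
    return dot / (norm_a * norm_b)


def guess_age_category(description, labelled_books):
    """Return the age category of the most similar labelled book.

    labelled_books is a list of (description, age_category) pairs.
    """
    query = term_vector(description)
    best = max(
        labelled_books,
        key=lambda book: cosine_similarity(query, term_vector(book[0])),
    )
    return best[1]


# Invented toy data, for illustration only.
labelled_books = [
    ("a picture book about a little bear and his friends", "4-6"),
    ("a detective story about a stolen diamond in the city", "10-12"),
]

print(guess_age_category("the little bear goes to school", labelled_books))
```

A real system would probably weight terms (for example with tf-idf) and combine several similar books rather than take only the nearest one. Those choices are left out here to keep the sketch short.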

Intake Meeting for Projects Information Retrieval

The IR project groups have been assigned to the following supervisors. We
scheduled an intake meeting for every group; see below.

Group A: Lucassen, T. and Nijkamp, B.
Supervisor: Djoerd Hiemstra
Intake: Wed 14 Nov. 13.45 h. – 14.15 h., room 3031

Group B: Alofs, T. and Niesink, L.D.J.
Supervisor: Theo Huibers
Intake: Wed 14 Nov. 13.45 h. – 14.15 h., room 2122

Group C: Logtenberg, J.D. and Vliet, W.M. van
Supervisor: Dolf Trieschnigg
Intake: Wed 14 Nov. 13.45 h. – 14.15 h., room 2063

Group D: Dehling, E.E. and Wal, T. van der
Supervisor: Paul van der Vet
Intake: Wed 14 Nov. 13.45 h. – 14.45 h., room 2122

Group E: Klaas, M.H.J. and Maaskant, W.J.
Supervisor: Djoerd Hiemstra
Intake: Wed 14 Nov. 14.15 h. – 14.45 h., room 3031

Group F: and Diephuis, M. and Kanis, Z.
Supervisor: Theo Huibers
Intake: Wed 14 Nov. 14.15 h. – 14.45 h., room 2122

Group G: Pothoven, T. and Sikkema, N.
Supervisor: Dolf Trieschnigg
Intake: Wed 14 Nov. 14.15 h. – 14.45 h., room 2063

Group H: Hofwegen, M.F. van and Sanderman, R.
Supervisor: Paul van der Vet
Intake: Wed 14 Nov. 14.15 h. – 14.45 h., room 2122

Group I: Toure, Y.C. and Weele, J.H.D. ter
Supervisor: Dolf Trieschnigg
Intake: to be scheduled

[Information Retrieval]: Welcome to the course Information Retrieval

Next Wednesday, 5 September 2007, we start our full-semester course Information Retrieval. I think we have put together an interesting course with a number of guest speakers with international reputations.

In November, we will switch from lectures to small research projects in which students participate actively in research done at the University of Twente. In this second part, the meetings will be used to set up a discussion forum in which all participants actively exchange their ideas, progress and problems encountered.

If you did not already do so, please buy the reader Information Retrieval.
We wish you a fruitful course!