University of Twente at TREC 2010

MapReduce for Experimental Search

by Djoerd Hiemstra and Claudia Hauff

This draft report presents preliminary results for the TREC 2010 ad-hoc web search task. We ran our MIREX system on 0.5 billion web documents from the ClueWeb09 crawl. On average, the system retrieves at least 3 relevant documents among the 10 results on the first result page, using a simple index of anchor texts and page titles, combined with spam removal.
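
Retrieving at least 3 relevant documents among the first 10 results corresponds to a precision at 10 (P@10) of at least 0.3. A minimal sketch of that measure follows; the function name, document identifiers, and relevance judgments are hypothetical and are not taken from the TREC runs or from MIREX.

```python
from typing import Sequence, Set


def precision_at_k(ranked_ids: Sequence[str], relevant_ids: Set[str], k: int = 10) -> float:
    """Return the fraction of the top-k ranked documents that are judged relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = ranked_ids[:k]
    hits = sum(1 for doc_id in top_k if doc_id in relevant_ids)
    # Divide by k (not len(top_k)), so a short result list is penalized.
    return hits / k


# Hypothetical first result page of 10 documents for one query.
ranking = [f"doc{i}" for i in range(1, 11)]
# Hypothetical relevance judgments: three of the ten results are relevant.
relevant = {"doc2", "doc5", "doc9"}

p10 = precision_at_k(ranking, relevant)
```

Averaging this value over all queries of a task gives the mean P@10 that the "on average" statement refers to.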
