Learning to Merge Search Results

Learning to Merge Search Results for Efficient Distributed Information Retrieval

Kien Tjin-Kam-Jet and Djoerd Hiemstra

Merging search results from different servers is a major problem in Distributed Information Retrieval. We used Regression-SVM and Ranking-SVM which learn a function that merges results based on information that is readily available, i.e. the ranks, titles, summaries and URLs contained in the results pages. By not downloading additional information, such as the full document, we decrease bandwidth usage. CORI and Round Robin merging were used as our baselines; surprisingly, our results show that the SVM methods do not improve over those baselines

[download pdf]