Learning to Merge Search Results for Efficient Distributed Information Retrieval
Kien Tjin-Kam-Jet and Djoerd Hiemstra
Merging search results from different servers is a major problem in Distributed Information Retrieval. We used Regression-SVM and Ranking-SVM, which learn a function that merges results based on information that is readily available, i.e., the ranks, titles, summaries, and URLs contained in the results pages. By not downloading additional information, such as the full document, we decrease bandwidth usage. CORI and Round Robin merging were used as our baselines; surprisingly, our results show that the SVM methods do not improve over those baselines.
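
To make the Round Robin baseline concrete, the following is a minimal sketch of the usual formulation: take the top-ranked result from each server in turn, then the second-ranked, and so on. It uses only the ranks in each results page, so nothing extra is downloaded. The function name `round_robin_merge` and the sample document identifiers are illustrative assumptions and are not taken from the paper; the sketch also assumes a fixed server order and does no duplicate removal.

```python
from itertools import zip_longest


def round_robin_merge(result_lists):
    """Interleave ranked result lists: rank 1 from every server, then rank 2, etc.

    result_lists: a list of ranked lists, one per server (hypothetical input shape).
    """
    merged = []
    for tier in zip_longest(*result_lists):
        # zip_longest pads exhausted lists with None; skip that padding.
        merged.extend(doc for doc in tier if doc is not None)
    return merged


# Illustrative ranked lists from three servers (made-up identifiers).
server_a = ["a1", "a2", "a3"]
server_b = ["b1", "b2"]
server_c = ["c1"]

merged = round_robin_merge([server_a, server_b, server_c])
print(merged)
```

A learned merger of the kind the abstract describes would replace this fixed interleaving with a score computed from each result's rank, title, summary, and URL; that scoring function is not sketched here because its features and training setup are not given in this section.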