Evaluate FedWeb runs online

The TREC Federated Web track provides a new online tool that checks the syntax of your runs and gives preliminary evaluation results on 10 of the 75 provided topics. Now you can easily see how your runs compare to other runs submitted to the system. The official TREC evaluation results will be based on at least 50 of the remaining topics in your run. Check your run at:
http://circus.ewi.utwente.nl/fedweb/.

Please note that the site does NOT submit runs to TREC. Submit your runs to TREC via the TREC active participants site: before August 18, 2014 (Resource & Vertical Selection), and before September 15, 2014 (Results Merging).

Follow @TRECFedWeb on Twitter.

Yoran Heling graduates on peer selection in Direct Connect

by Yoran Heling

In a distributed peer-to-peer (P2P) system such as Direct Connect, files are often distributed over multiple source peers. It is up to the downloading peer to decide from how many and from which source peers to download the particular file of interest. Biased Random Period Switching (BRPS) is an algorithm, implemented at the downloading peer, that determines at what point to download from which source peer. The number of source peers that a downloading peer downloads from at a certain point is called the Degree of Parallelism (DoP). This research focused on implementing BRPS in an existing Direct Connect client and comparing its downloading performance against an unmodified client. Two implementations of BRPS in Direct Connect have been made. The first is a simple implementation that follows the original BRPS algorithm as closely as possible, with minor modifications required to ensure that the downloading process does not get stuck on an unavailable source peer. The second is an improved implementation with slight modifications to the original BRPS algorithm: it incorporates two improvements to ensure that the DoP does not drop below its desired value in the face of unavailable source peers.
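
To make the idea concrete, here is a minimal Python sketch of periodic source-peer switching in the spirit of BRPS. The throughput-biased selection, the period length, and the helper objects (peers, job.fetch_from) are illustrative assumptions, not the thesis' exact algorithm.

    import random

    def pick_sources(peers, throughput, dop):
        """Pick up to `dop` available source peers, biased towards peers
        that served us faster in earlier periods."""
        available = [p for p in peers if p.available]  # skip unavailable peers
        weights = [1.0 + throughput.get(p.id, 0.0) for p in available]
        chosen = []
        while available and len(chosen) < dop:
            p = random.choices(available, weights=weights)[0]
            i = available.index(p)
            available.pop(i)
            weights.pop(i)
            chosen.append(p)
        return chosen

    def download(job, peers, dop, period=10.0):
        throughput = {}                                     # observed speed per peer
        while not job.complete():
            active = pick_sources(peers, throughput, dop)   # keep the DoP on target
            stats = job.fetch_from(active, seconds=period)  # download for one period
            throughput.update(stats)                        # bias future switches

Re-selecting the full set of sources at every period boundary is what keeps the effective DoP from silently dropping when source peers become unavailable, which is exactly the issue the improved implementation addresses.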

The original client and the two BRPS implementations have been evaluated in a controlled Direct Connect network with 50 downloading peers and a variable number of source peers. The source peers were configured to throttle their available bandwidth to an average of 500 KB/s, following a realistic bandwidth distribution based on measurements from the Tor network. In each experiment, all downloading peers downloaded the same file at the same time, and measurements were taken at the downloading peers. Four experiments have been performed, each varying one parameter: the first varied the size of the downloaded file between 100 MB and 1024 MB; the second varied the DoP between 1 and 15; the third varied the number of source peers between 10 and 100; and the last added between 0% and 80% unavailable source peers to the network.

In all experiments, both BRPS implementations performed close to the optimal average download time and were consistently faster than the original client by a factor of 2 to 5. In the last experiment, the improved BRPS implementation kept the measured DoP closer to its desired value than the simple implementation did, but this did not result in a significant difference in the measured download times.

[download pdf]

The Lowlands at TREC

by Robin Aly, Djoerd Hiemstra, Dolf Trieschnigg, and Thomas Demeester

We describe the participation of the Lowlands at the Web Track and the FedWeb track of TREC 2013. For the Web Track we used the MIREX MapReduce library with out-of-the-box approaches. For the FedWeb Track we adapted our shard selection method Taily for resource selection. Our results are above the median performance of TREC participants.

Presented at the 22nd Text REtrieval Conference (TREC 2013) at the US National Institute of Standards and Technology (NIST) in Gaithersburg, USA

[download pdf]

Exploiting User Disagreement for Web Search Evaluation

Exploiting User Disagreement for Web Search Evaluation: An experimental approach

by Thomas Demeester, Robin Aly, Djoerd Hiemstra, Dong Nguyen, Dolf Trieschnigg, and Chris Develder

To express a more nuanced notion of relevance than binary judgments allow, graded relevance levels can be used for the evaluation of search results. Especially in Web search, users strongly prefer top results over less relevant results, and yet they often disagree on which are the top results for a given information need. Whereas previous work has generally considered disagreement a negative effect, this paper proposes a method to exploit user disagreement by integrating it into the evaluation procedure. First, we present experiments that investigate user disagreement. We argue that, under high disagreement, lower relevance levels might need to be promoted more than in the case of global consensus on the top results. This is formalized by introducing the User Disagreement Model, resulting in a weighting of the relevance levels with a probabilistic interpretation. A validity analysis is given, and we explain how to integrate the model with well-established evaluation metrics. Finally, we discuss a specific application of the model: estimating suitable weights for the combined relevance of Web search snippets and pages.
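
As a rough illustration of the idea (the actual User Disagreement Model is defined in the paper; the estimator below is a simplifying assumption), one could weight each relevance level by the estimated probability that a random user rates a document at that level as top-relevant, computed from pairs of judgments on the same documents:

    from collections import Counter

    def level_weights(judgment_pairs, levels=(0, 1, 2, 3)):
        """judgment_pairs: (label_a, label_b) from two assessors judging the
        same document. Weight of level l = estimated probability that a
        document labelled l by one user gets the top label from another."""
        counts = {l: Counter() for l in levels}
        for a, b in judgment_pairs:
            counts[a][b] += 1
            counts[b][a] += 1
        top = max(levels)
        weights = {top: 1.0}
        for l in levels[:-1]:
            total = sum(counts[l].values())
            weights[l] = counts[l][top] / total if total else 0.0
        return weights  # usable as gain values in nDCG-style metrics

Under high disagreement, lower levels receive a non-zero probability of being rated top and are thus promoted, matching the effect described above; under global consensus, the weights collapse towards binary gains.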

To be presented at the 7th ACM Conference on Web Search and Data Mining (WSDM) in New York City, USA on 24-28 February 2014.

[Read more]

STW Valorization Grant for Q-Able

The University of Twente spin-off Q-Able receives a Valorization Grant Phase 1 from the Dutch Technology Foundation STW to further develop and market its OneBox search technology.

When it comes to web applications, users love the “single text box” interface because it is extremely easy to use. However, much information on the web is stored in structured databases and can only be accessed by filling out a web form with multiple input fields. Examples include planning a trip, booking a hotel room, looking for a second-hand car, etc.

The approach of web search engines – to crawl sites and make a central index of the pages – does not suffice in many cases. First, some sites are hard to crawl because the pages can only be accessed via the web form. Second, some sites provide information that changes quickly, like available hotel rooms, and crawled pages would be almost immediately outdated. Third, some sites provide information that is generated dynamically, like planning a trip from one address to another on a certain date, and it is impossible to crawl all combinations of addresses and dates. Finally, a simple text index that search engines provide does not easily allow structured queries on arbitrary fields. In all these cases, the sites that provide such information can be found using a search engine like Google, but the information itself can only be retrieved after filling in a web form. Filling in one or more web forms with many fields can be a tedious job.

Q-Able replaces a site's web forms with OneBox, a simple text field, giving complex sites the look and feel of Google: a single field for asking questions and performing simple transactions. OneBox allows users to plan a trip by typing for instance "Next Wednesday from Enschede to Amsterdam arriving at 9am", or to search for second-hand cars by typing "Ford C-max 4-doors less than 200,000 kilometres from before 2008". OneBox can be configured to operate on any web site that provides complex web forms. Furthermore, OneBox can be configured to operate on multiple web sites using a single simple text field. This way, to search for a second-hand car, for instance, users enter a single query and search multiple second-hand car sites with a single click. OneBox only replaces the user interface of a web database: it does not copy, crawl, or otherwise index the data itself.
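
A purely rule-based version of such a mapping is easy to sketch in Python (the field names and patterns below are hypothetical, not Q-Able's implementation):

    import re

    # Map a free-text trip query onto hypothetical travel-planner form fields.
    PATTERN = re.compile(
        r"from (?P<origin>\w+) to (?P<destination>\w+)"
        r"(?: (?P<kind>arriving|departing) at (?P<time>\d{1,2}(:\d{2})?(am|pm)?))?",
        re.IGNORECASE)

    def query_to_form(query):
        m = PATTERN.search(query)
        return {k: v for k, v in m.groupdict().items() if v} if m else {}

    print(query_to_form("Next Wednesday from Enschede to Amsterdam arriving at 9am"))
    # {'origin': 'Enschede', 'destination': 'Amsterdam', 'kind': 'arriving', 'time': '9am'}

Hand-written rules like these break down quickly as forms and phrasings multiply (note the date "Next Wednesday" is not even captured here), which is where the probabilistic ranking of query interpretations from the underlying research comes in.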

OneBox is the result of the Ph.D. research project of Kien Tjin-Kam-Jet at the University of Twente. His research identified several successful novel approaches to query understanding that combine rule-based approaches with probabilistic approaches that rank query interpretations. Furthermore, the research resulted in an efficient implementation of OneBox that needs only a fraction of a second to interpret queries, even in complex configurations for accessing multiple web databases. Treinplanner.info, Q-Able's first public demonstration of OneBox, offers natural-language search for the Dutch Railways (Nederlandse Spoorwegen) travel planner, and was well received in user questionnaires, on Twitter, and on Dutch national public radio and television. Q-Able will use the STW Valorization Grant to investigate the technical and commercial feasibility of OneBox.

TREC Federated Web Search track

http://sites.google.com/site/trecfedweb/

First submission due on August 11, 2013

The Federated Web Search track is part of NIST's Text REtrieval Conference (TREC) 2013. The track investigates techniques for the selection and combination of search results from a large number of real online web search services. The data set, consisting of search results from 157 search engines, is now available. The search engines cover a broad range of categories, including news, books, academic, and travel. We have included one big general web search engine, which is based on a combination of existing web search engines.

Federated search is the approach of querying multiple search engines simultaneously and combining their results into one coherent search engine result page. The goal of the Federated Web Search (FedWeb) track is to evaluate approaches to federated search at very large scale in a realistic setting, by combining the search results of existing web search engines. This year the track focuses on resource selection (selecting the search engines that should be queried) and results merging (combining the results into a single ranked list). You may submit up to 3 runs for each task; all runs will be judged. Upon submission, you will be asked to indicate for each run whether you used result snippets and/or pages, and whether any external data was used. Precise guidelines can be found at:
http://sites.google.com/site/trecfedweb/
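
For reference, TREC runs typically use the standard six-column run format (topic id, the literal Q0, a document or resource id, rank, score, run tag). Assuming FedWeb follows this convention for resource selection, a run file would look like the lines below, with made-up topic numbers, engine ids, and scores; the guidelines above remain authoritative.

    7001 Q0 FW13-e012 1 27.73 myRunName
    7001 Q0 FW13-e057 2 25.10 myRunName
    7002 Q0 FW13-e003 1 31.45 myRunName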


Track coordinators

  • Djoerd Hiemstra – University of Twente, The Netherlands
  • Thomas Demeester – Ghent University, Belgium
  • Dolf Trieschnigg – University of Twente, The Netherlands
  • Dong Nguyen – University of Twente, The Netherlands

Snippet-based Relevance Predictions

Snippet-based Relevance Predictions for Federated Web Search

by Thomas Demeester, Dong Nguyen, Dolf Trieschnigg, Chris Develder, and Djoerd Hiemstra

How well can the relevance of a page be predicted, purely based on snippets? This would be highly useful in a Federated Web Search setting, where caching large amounts of result snippets is more feasible than caching entire pages. The experiments reported in this paper make use of result snippets and pages from a diverse set of actual Web search engines. A linear classifier is trained to predict the snippet-based user estimate of page relevance, but also to predict the actual page relevance, again based on snippets alone. The presented results confirm the validity of the proposed approach and provide promising insights into future result merging strategies for a Federated Web Search setting.
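
A minimal sketch of this setup, assuming simple TF-IDF features and scikit-learn (the paper's actual features, labels, and training details may differ):

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    snippets = ["cheap hotels in the city centre", "photo gallery of cats"]
    page_relevant = [1, 0]   # toy labels; real labels come from page judgments

    model = make_pipeline(TfidfVectorizer(), LogisticRegression())
    model.fit(snippets, page_relevant)   # train on snippet text alone
    print(model.predict_proba(["cheap hotels downtown"])[0, 1])  # P(page relevant)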

The paper will be presented at the 35th European Conference on Information Retrieval (ECIR) on 25 March 2013 in Moscow, Russia

[download pdf]

A probabilistic approach for mapping free-text queries to complex web forms

by Kien Tjin-Kam-Jet, Dolf Trieschnigg, and Djoerd Hiemstra

Web applications with complex interfaces consisting of multiple input fields should be able to understand free-text queries. We propose a probabilistic approach for mapping parts of a free-text query to the fields of a complex web form. Our method uses token models rather than only static dictionaries to create this mapping, offering greater flexibility and requiring less domain knowledge than existing systems. We evaluate different implementations of our mapping model and show that our system effectively maps free-text queries without using a dictionary. If a dictionary is available, performance increases and is significantly better than that of a rule-based baseline.
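
A toy version of the token-model idea (the shape-based models and the brute-force ranking below are illustrative assumptions; the paper's models are far more refined):

    import re
    from itertools import permutations

    # Each form field scores a token by how plausible it looks for that field.
    def p_time(tok):
        return 0.9 if re.fullmatch(r"\d{1,2}(:\d{2})?", tok) else 0.01

    def p_place(tok):
        return 0.6 if tok[:1].isupper() else 0.05

    FIELDS = {"time": p_time, "origin": p_place, "destination": p_place}

    def rank_interpretations(tokens):
        """Score each assignment of tokens to distinct fields and rank them
        (assumes at most one token per field)."""
        scored = []
        for fields in permutations(FIELDS, len(tokens)):
            p = 1.0
            for tok, field in zip(tokens, fields):
                p *= FIELDS[field](tok)
            scored.append((p, dict(zip(fields, tokens))))
        return sorted(scored, key=lambda s: s[0], reverse=True)

    print(rank_interpretations(["Enschede", "Amsterdam", "9:00"])[0])
    # (0.324, {'origin': 'Enschede', 'destination': 'Amsterdam', 'time': '9:00'})

Because every token is scored by a model of what such values look like, no dictionary of station names is needed, which is the key difference from purely dictionary-based systems.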

[download pdf]

What snippets say about web pages

What Snippets Say About Pages in Federated Web Search

by Thomas Demeester, Dong Nguyen, Dolf Trieschnigg, Chris Develder, and Djoerd Hiemstra

What is the likelihood that a Web page is considered relevant to a query, given the relevance assessment of the corresponding snippet? Using a new federated IR test collection that contains search results from over a hundred search engines on the internet, we are able to investigate such research questions from a global perspective. Our test collection covers the main Web search engines like Google, Yahoo!, and Bing, as well as a number of smaller search engines dedicated to multimedia, shopping, etc., and as such reflects a realistic Web environment. Using a large set of relevance assessments, we are able to investigate the connection between snippet quality and page relevance. The dataset is strongly inhomogeneous, and although the assessors' consistency is shown to be satisfactory, care is required when comparing resources. To this end, a number of probabilistic quantities, based on snippet and page relevance, are introduced and evaluated.
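
The flavour of these quantities can be illustrated with a small example (toy data and notation are ours, not the paper's): estimate the probability that a page is relevant given that its snippet was judged relevant.

    pairs = [("rel", "rel"), ("rel", "nonrel"), ("nonrel", "nonrel"), ("rel", "rel")]
    #        (snippet judgment, page judgment) for each assessed result

    pages_for_rel_snippets = [page for snip, page in pairs if snip == "rel"]
    p = pages_for_rel_snippets.count("rel") / len(pages_for_rel_snippets)
    print(p)  # 2 of the 3 snippet-relevant results have a relevant page: ~0.67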

The paper will be presented at the Asia Information Retrieval Societies Conference (AIRS 2012) in Tianjin, China

[download pdf]