OpenSearch: share your search results

OpenSearch is a collection of simple XML formats for sharing search results, originally developed by A9, a company founded by Amazon.com. A9 acts as a search mediator: you pick your favorite search engines, and A9 sends your queries to these engines and aggregates the results, giving you your own personal view of the web.

Many search engines provide some kind of OpenSearch or RSS-like search these days; for instance, here's an ego search on Yahoo. But OpenSearch is just as useful on a much smaller scale, for instance for searching these pages for information on SIKS (the Dutch School for Information and Knowledge Systems).
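
To give an idea of what consuming such a feed looks like, here is a minimal Python sketch that reads an OpenSearch-flavored RSS response. The sample feed, its titles and the example.org URLs are made up for illustration and are not the output of any real engine; the helper name parse_opensearch_rss is likewise hypothetical. The only thing taken from the OpenSearch 1.1 specification is the namespace and the totalResults, startIndex and itemsPerPage elements.

```python
import xml.etree.ElementTree as ET

# Namespace of the OpenSearch 1.1 response elements.
OPENSEARCH_NS = "http://a9.com/-/spec/opensearch/1.1/"

# Hypothetical feed, invented for this example.
SAMPLE_FEED = """<rss version="2.0"
     xmlns:opensearch="http://a9.com/-/spec/opensearch/1.1/">
  <channel>
    <title>Example search: SIKS</title>
    <link>http://example.org/search?q=SIKS</link>
    <description>Search results for "SIKS"</description>
    <opensearch:totalResults>2</opensearch:totalResults>
    <opensearch:startIndex>1</opensearch:startIndex>
    <opensearch:itemsPerPage>10</opensearch:itemsPerPage>
    <item>
      <title>SIKS course announcement</title>
      <link>http://example.org/posts/1</link>
    </item>
    <item>
      <title>SIKS day programme</title>
      <link>http://example.org/posts/2</link>
    </item>
  </channel>
</rss>
"""


def parse_opensearch_rss(xml_text):
    """Return (total_results, [(title, link), ...]) for an OpenSearch RSS feed."""
    channel = ET.fromstring(xml_text).find("channel")
    # The OpenSearch elements live in their own namespace, next to plain RSS.
    total = int(channel.findtext(f"{{{OPENSEARCH_NS}}}totalResults", default="0"))
    items = [(item.findtext("title"), item.findtext("link"))
             for item in channel.findall("item")]
    return total, items


if __name__ == "__main__":
    total, items = parse_opensearch_rss(SAMPLE_FEED)
    print(total, "results")
    for title, link in items:
        print("-", title, link)
```

Because the result list is ordinary RSS with a few extra elements, a mediator in the style of A9 could in principle run the same parser over the feeds of several engines and merge the item lists.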

Distributed Search and Keyword Auctions

After the burst of the dot-com bubble in the autumn of 2001, the World Wide Web has gone through some remarkable changes in its organizational structure. Consumers of data and content are increasingly taking the role of producers of data and content, thereby threatening traditional publishers. A well-known example is the Wikipedia encyclopedia, which is written entirely by its (non-professional) users on a voluntary basis, while still rivaling a traditional publisher like Britannica Online in both size and quality. Similarly, in SourceForge, communities of open source software developers collaboratively create new software, thereby rivaling software vendors like Microsoft; blogging turned internet consumers of news into news providers; Kazaa and related peer-to-peer platforms like BitTorrent and eMule automatically turned anyone who downloads a file into a contributor of files; Flickr turned users into contributors of visual content, and also into indexers of that content through social tagging. Communities of users operate by trusting each other as co-developers and contributors, without the need for strict rules. There is, however, one major internet application in which communities play only a minor role: search, one of the web's most important applications, if not the most important. Internet search is almost exclusively run by three companies that dominate the search market: Google, Yahoo, and Microsoft. In contrast to traditional centralized search, where a central body like Google or Yahoo is in full control, a community-run search engine would consist of many small search engines that collaboratively provide the search service. This report motivates the need for large-scale distributed approaches to information retrieval, and proposes solutions based on keyword auctions.

[download pdf]

Distributed search: who bids on “britney spears”?

We will start a new NWO research project on the use of keyword auctions for distributed information retrieval. The project's aim is to distribute internet search functionality in such a way that communities of users and/or federations of small search systems provide search services in a collaborative way. Instead of collecting all data at a central point and processing queries centrally, as is done by today's search systems, the project will distribute queries over many small autonomous search systems and process them locally. Distributed information retrieval is a well-researched subarea of information retrieval, but it has not resulted in practical solutions for large-scale search problems, because of the high administration costs of setting up large numbers of installations and because it turns out to be hard in practice to direct queries to the appropriate local search systems. In this project we will research a radically new approach to distributed search: distributed information retrieval by means of keyword auctions.

Keyword auctions like Google's AdWords give advertisers the opportunity to provide targeted advertisements by bidding on specific keywords, for instance by bidding on today's hottest query, “britney spears”. Analogous to these keyword auctions, local search systems will bid for keywords at a central broker. They “pay” by serving queries for the broker. The broker will send queries to those local search systems that optimize the overall effectiveness of the system, i.e., local search systems that are willing to serve many queries, but are also able to provide high-quality results. The project will approach the problem from three different angles: 1) modeling the local search system, including models for automatic bidding and multi-word keywords; 2) modeling the search broker's optimization using the bids, the quality of the answers, and click-through rates; 3) integration of structured data typically available behind web forms of local search systems with text search. The approaches will be evaluated using prototype systems and simulations on benchmark test collections.
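
The broker idea can be illustrated with a toy Python sketch. Everything in it is an assumption made for illustration, not the project's actual model: a bid is taken to be the number of queries a local system offers to serve for a keyword, the broker is assumed to hold a single quality estimate per system, and systems are ranked by bid times quality, loosely mirroring how ad auctions combine bids with click-through rates. The class names, engine names and numbers are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class LocalEngine:
    name: str
    bids: dict       # keyword -> number of queries the engine offers to serve
    quality: float   # broker's estimate of result quality, in [0, 1] (assumed)
    served: dict = field(default_factory=dict)  # keyword -> queries served so far

    def remaining(self, keyword):
        """Queries the engine still owes the broker for this keyword."""
        return self.bids.get(keyword, 0) - self.served.get(keyword, 0)


class Broker:
    def __init__(self, engines):
        self.engines = engines

    def route(self, keyword):
        """Send the query to the engine with the highest bid * quality.

        Only engines that still have unserved bid for the keyword compete.
        Returns the winner's name, or None if no engine is available.
        """
        candidates = [e for e in self.engines if e.remaining(keyword) > 0]
        if not candidates:
            return None
        winner = max(candidates, key=lambda e: e.bids[keyword] * e.quality)
        # The winner "pays" by serving the query.
        winner.served[keyword] = winner.served.get(keyword, 0) + 1
        return winner.name


if __name__ == "__main__":
    broker = Broker([
        LocalEngine("music-fans", {"britney spears": 2}, quality=0.9),
        LocalEngine("general-web", {"britney spears": 5}, quality=0.5),
    ])
    for _ in range(8):
        print(broker.route("britney spears"))
```

In this toy setting the lower-quality system wins first because it offers to serve more queries (5 x 0.5 beats 2 x 0.9); once its bid is used up, the higher-quality system takes over. How bids and quality should really be traded off is exactly the kind of question the broker model would have to answer.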

See: NWO news (in Dutch)