by Dolf Trieschnigg, Kien Tjin-Kam-Jet, and Djoerd Hiemstra
Building a federated search engine based on a large number of existing web search engines is a challenge: implementing the programming interface (API) for each search engine is an exacting and time-consuming job. In this demonstration we present SearchResultFinder, a browser plugin that speeds up the task of determining reusable XPaths for extracting search result items from HTML search result pages. Based on a single search result page, the tool presents a ranked list of candidate extraction XPaths and lets the user highlight the matched items to inspect the extraction result. An evaluation with 148 web search engines shows that in 90% of the cases a correct XPath is suggested.
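To illustrate what a reusable extraction XPath does once it has been chosen, the following minimal Python sketch applies one XPath to a result page and collects a title and URL for each matched item. It is not the plugin's implementation: the page markup, the XPath, and the function name `extract_items` are hypothetical, and the sketch assumes well-formed markup so that the standard library's limited XPath support is sufficient. Real result pages would need a tolerant HTML parser.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed result page used only for illustration.
PAGE = """
<html><body>
  <div id="nav"><a href="/home">Home</a><a href="/about">About</a></div>
  <ol id="results">
    <li class="result"><a href="http://example.org/1">First hit</a><p>Snippet one</p></li>
    <li class="result"><a href="http://example.org/2">Second hit</a><p>Snippet two</p></li>
    <li class="result"><a href="http://example.org/3">Third hit</a><p>Snippet three</p></li>
  </ol>
</body></html>
"""


def extract_items(page, xpath):
    """Return one {title, url} dict per node matched by the extraction XPath."""
    root = ET.fromstring(page)
    items = []
    for node in root.findall(xpath):
        # Assumption: the first link inside a result item carries title and URL.
        link = node.find(".//a")
        if link is None:
            continue
        items.append({
            "title": "".join(link.itertext()).strip(),
            "url": link.get("href"),
        })
    return items


if __name__ == "__main__":
    # Hypothetical XPath of the kind a user might pick from a candidate list.
    for item in extract_items(PAGE, ".//ol[@id='results']/li[@class='result']"):
        print(item["title"], item["url"])
```

Because the XPath selects the repeated result container rather than individual links, the navigation links in the page header are not extracted, and the same expression can be reused on other result pages from the same engine as long as the layout is unchanged.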
The software can be downloaded as a Firefox plugin.
The tool was demonstrated at the ACM SIGIR Conference in Dublin.