This demonstrator showcases the PuppyIR framework by incorporating numerous child-specific components developed as part of the PuppyIR project. The demonstrator is for Emma’s Children's Hospital in Amsterdam and provides children with a novel and exciting interface that supports their information needs while they are staying in or visiting the hospital.
EmSe will be demonstrated at the 34th European Conference on Information Retrieval (ECIR) in Barcelona on 1-5 April 2012.
Ranking XPaths for extracting search result records
by Dolf Trieschnigg, Kien Tjin-Kam-Jet and Djoerd Hiemstra
Extracting search result records (SRRs) from webpages is useful for building an aggregated search engine which combines search results from a variety of search engines. Most automatic approaches to search result extraction are not portable: the complete process has to be rerun on each new search result page. In this paper we describe an algorithm to automatically determine XPath expressions to extract SRRs from webpages. Based on a single search result page, an XPath expression is determined which can be reused to extract SRRs from pages based on the same template. The algorithm is evaluated on six datasets, including two new datasets containing a variety of web, image, video, shopping and news search results. The evaluation shows that for 85% of the tested search result pages, a useful XPath is determined. The algorithm is implemented as a browser plugin and as a standalone application, both of which are available as open source software.
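To illustrate what reusing an XPath expression across pages with the same template might look like, here is a minimal Python sketch. It does not implement the ranking algorithm from the paper: the XPath expression, the two sample pages and the function name `extract_srrs` are all assumptions made up for this example, and the expression is simply hard-coded as if it had already been determined from the first page. The sketch uses the standard library's ElementTree, which only supports a subset of XPath and needs well-formed markup, so real search result pages would likely require a proper HTML parser.

```python
import xml.etree.ElementTree as ET

# Hypothetical XPath, standing in for an expression determined
# from a single search result page (not produced by the paper's algorithm).
SRR_XPATH = ".//div[@class='result']"

# Two made-up pages that share the same template.
PAGE_ONE = """<html><body>
<div class="nav"><a href="/home">Home</a></div>
<div class="result"><a href="http://example.org/a">First hit</a><p>Snippet A</p></div>
<div class="result"><a href="http://example.org/b">Second hit</a><p>Snippet B</p></div>
</body></html>"""

PAGE_TWO = """<html><body>
<div class="nav"><a href="/home">Home</a></div>
<div class="result"><a href="http://example.org/c">Third hit</a><p>Snippet C</p></div>
</body></html>"""


def extract_srrs(page, xpath=SRR_XPATH):
    """Return one dict per search result record matched by the XPath."""
    root = ET.fromstring(page)
    records = []
    for node in root.findall(xpath):
        link = node.find("a")      # first link inside the record
        snippet = node.find("p")   # first paragraph inside the record
        records.append({
            "title": link.text if link is not None else None,
            "url": link.get("href") if link is not None else None,
            "snippet": snippet.text if snippet is not None else None,
        })
    return records


if __name__ == "__main__":
    # The same expression is applied to both pages without re-deriving it.
    for page in (PAGE_ONE, PAGE_TWO):
        for record in extract_srrs(page):
            print(record["title"], record["url"])
```

The navigation link is skipped on both pages because it sits outside the elements matched by the expression, which is the kind of portability the abstract describes.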
The MapReduce, Pig Latin and Cloud Computing assignments have been graded. The final grades can be found in Blackboard's grade center. Please join the course evaluation session on 21 February in hal B 2C from 12.30 to 13.30 (including a free lunch).
by Almer Tigelaar, Djoerd Hiemstra, Dolf Trieschnigg
Peer-to-peer technology is widely used for file sharing. In the past decade a number of prototype peer-to-peer information retrieval systems have been developed. Unfortunately, none of these have seen widespread real-world adoption and thus, in contrast with file sharing, information retrieval is still dominated by centralised solutions. In this paper we provide an overview of the key challenges for peer-to-peer information retrieval and the work done so far. We want to stimulate and inspire further research to overcome these challenges. This will open the door to the development and large-scale deployment of real-world peer-to-peer information retrieval systems that rival existing centralised client-server solutions in terms of scalability, performance, user satisfaction and freedom.
Dutch broadcaster BNN tests the intuitive train planner developed at the Database Group. Their verdict: “ingenious”, and “approved for elderly”. The item (in Dutch) includes a picture with Kien Tjin-Kam-Jet proudly in the back. See the treinplanner in action at: http://treinplanner.info
CLEF 2012: Conference and Labs of the Evaluation Forum: First Call for Participation
CLEF 2012 is next year's edition of the popular CLEF campaign and workshop series, which has run since 2000 and contributes to the systematic evaluation of information access systems, primarily through experimentation on shared tasks. In 2010 CLEF was launched in a new format, as a conference with research presentations, panels, poster and demo sessions and laboratory evaluation workshops. Labs fall under two types: laboratories to conduct evaluation of information access systems, and workshops to discuss and pilot innovative evaluation activities. In 2012, CLEF will take place on September 17-20 in Rome, and researchers and practitioners from all segments of the information access and related communities are invited to participate in the following Evaluation Labs:
Yesterday, the Wikipedia community announced its decision to black out the English-language Wikipedia for 24 hours, worldwide, on Wednesday, January 18. The blackout is a protest against proposed legislation in the United States – the Stop Online Piracy Act (SOPA) and the PROTECT IP Act (PIPA). See: http://wikimediafoundation.org/wiki/English_Wikipedia_anti-SOPA_blackout
If I understand things right, the Stop Online Piracy Act will allow a U.S. court to legally demand to take utwente.nl off-line, just because a student or professor published a link to circumvent internet censorship on his/her University of Twente web page, for instance a link like: https://addons.mozilla.org/en-US/firefox/addon/desopa/…