On Thursday 30 January, the final of the Young Technology Award was held at Atak, with an excellent performance by Kien Tjin-Kam-Jet of Q-Able.
See the photo impression.
Merry Christmas
Federated Search Made Easy
by Dolf Trieschnigg, Kien Tjin-Kam-Jet, and Djoerd Hiemstra
Building a federated search engine based on a large number of existing web search engines is a challenge: implementing the programming interface (API) for each search engine is an exacting and time-consuming job. In this demonstration we present SearchResultFinder, a browser plugin which speeds up determining reusable XPaths for extracting search result items from HTML search result pages. Based on a single search result page, the tool presents a ranked list of candidate extraction XPaths and allows highlighting to view the extraction result. An evaluation with 148 web search engines shows that in 90% of the cases a correct XPath is suggested.
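To illustrate what a reusable extraction XPath does once it has been found, here is a minimal Python sketch. It is not SearchResultFinder's candidate-ranking algorithm. The page snippet, the `extract_results` helper and the XPath `.//li[@class='result']` are all hypothetical, and the snippet assumes well-formed markup so that the standard library parser can read it; real result pages would usually need a lenient HTML parser.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed search result page (an assumption for this sketch).
PAGE = """<html><body>
<div id="nav"><a href="/about">About</a></div>
<ol>
  <li class="result"><a href="http://example.org/1">First hit</a></li>
  <li class="result"><a href="http://example.org/2">Second hit</a></li>
</ol>
</body></html>"""


def extract_results(page, item_xpath):
    """Return (title, url) pairs for every element matched by item_xpath."""
    root = ET.fromstring(page)
    results = []
    for item in root.findall(item_xpath):
        link = item.find(".//a")  # first link inside the result item
        if link is not None:
            results.append((link.text, link.get("href")))
    return results


# The same XPath can be reused on any result page of this (hypothetical) engine.
for title, url in extract_results(PAGE, ".//li[@class='result']"):
    print(title, url)
```

The XPath matches only the two result items and skips the navigation link, which is the property that makes such an expression reusable across result pages of the same engine.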
The software can be downloaded as a Firefox plugin.
The tool was demonstrated at the ACM SIGIR Conference in Dublin.
Taily: Shard Selection Using the Tail of Score Distributions
by Robin Aly, Djoerd Hiemstra, and Thomas Demeester
Search engines can improve their efficiency by selecting only a few promising shards for each query. State-of-the-art shard selection algorithms first query a central index of sampled documents, and their effectiveness is similar to searching all shards. However, the search in the central index also hurts efficiency. Additionally, we show that the effectiveness of these approaches varies substantially with the sampled documents. This paper proposes Taily, a novel shard selection algorithm that models a query's score distribution in each shard as a Gamma distribution and selects shards with highly scored documents in the tail of the distribution. Taily estimates the parameters of score distributions based on the mean and variance of the score function’s features in the collections and shards. Because Taily operates on term statistics instead of document samples, it is efficient and has deterministic effectiveness. Experiments on large web collections (Gov2, CluewebA and CluewebB) show that Taily achieves similar effectiveness to sample-based approaches, and improves upon their efficiency by roughly 20% in terms of used resources and response time.
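The core idea can be sketched in a few lines of Python. This is a simplified illustration, not the authors' implementation: it assumes the per-shard mean and variance of the query's scores are already known, fits the Gamma distribution by the method of moments, and keeps shards whose expected number of documents above a score cutoff is large enough. The shard statistics, the `cutoff` and `min_docs` parameters and all function names are hypothetical.

```python
import math


def fit_gamma(mean, variance):
    """Method-of-moments Gamma fit: returns (shape, scale)."""
    if mean <= 0 or variance <= 0:
        raise ValueError("mean and variance must be positive")
    return mean * mean / variance, variance / mean


def gamma_tail(x, shape, scale):
    """P(score > x) for a Gamma(shape, scale) distribution.

    Uses the power series of the regularized lower incomplete gamma
    function, which is adequate for the moderate values in this sketch.
    """
    if x <= 0:
        return 1.0
    z = x / scale
    term = 1.0 / shape
    total = term
    n = 1
    while term > 1e-15 * total:
        term *= z / (shape + n)
        total += term
        n += 1
    lower = math.exp(shape * math.log(z) - z - math.lgamma(shape)) * total
    return min(1.0, max(0.0, 1.0 - lower))


def select_shards(shard_stats, cutoff, min_docs):
    """Return shards expected to hold at least min_docs documents above cutoff."""
    selected = []
    for name, (mean, variance, n_docs) in shard_stats.items():
        shape, scale = fit_gamma(mean, variance)
        expected = n_docs * gamma_tail(cutoff, shape, scale)
        if expected >= min_docs:
            selected.append(name)
    return selected


# Hypothetical per-shard statistics: (mean score, score variance, matching documents).
SHARDS = {"A": (10.0, 4.0, 1000), "B": (5.0, 4.0, 1000)}
print(select_shards(SHARDS, cutoff=12.0, min_docs=50))
```

With these made-up numbers only shard A is selected, because the cutoff lies about one standard deviation above its mean score, whereas for shard B it lies far out in the tail. No sampled documents are consulted, so the same statistics always produce the same selection.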
Presented at the 36th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval in Dublin, Ireland, 28 July – 1 August.
SIGIR Doctoral Consortium
I will be participating in the SIGIR Doctoral Consortium this year in Dublin, Ireland, organized by Jaime Arguello, Mounia Lalmas, and Grace Hui Yang.
Update (August 5)
Everyone concentrated at the DC meeting on 28 July in Dublin
Study tour completed
We are back from the two-week China tour organized by Inter-Actief: 28 students, 4 cities (Shanghai, Hangzhou, Beijing and Hong Kong), 14 company and university visits in 14 days!
Top 3 university visits: 1) Tsinghua University in Beijing with a very warm welcome by prof. Ling Feng, excellent talks and an impressive campus tour; 2) Jiao Tong University in Shanghai with interesting talks and students demoing their design challenge results; 3) Tongji University, Shanghai with interesting presentations and a campus tour.
Top 3 company visits: 1) Microsoft Research Asia in Beijing with an excellent welcome by Tetsuya Sakai, some really awesome tech talks and a cool tour through the lab (see team photo); 2) Philips Design, Hong Kong with interesting talks and some of us participating in an experiment; 3) MotionGlobal, Shanghai, with very inspiring talks and an 'international' office tour. Two runners-up worth mentioning: 4) Nedap in Shanghai, and 5) Alibaba in Hangzhou.
More information at: noodle2012.nl
Kien Tjin-Kam-Jet wins CTIT PhD Carousel
Kien Tjin-Kam-Jet was awarded the first prize in the PhD Carousel of the Centre for Telematics and Information Technology Symposium: ICT The Innovation Highway. The prize was handed over by Stefano Stramigioli, professor of Advanced Robotics and chair holder of the Control Engineering group at the University of Twente.
Bessensap 2012 and the deep web
More than 99 percent of the world wide web cannot currently be searched by search engines. As a result, much information remains inaccessible. Relatively simple questions such as 'What is the best train journey from Enschede to Amsterdam on 4 June 2012?' and 'What is the phone number of Djoerd Hiemstra from Enschede?' cannot be answered by search engines such as Google and Bing. Yet the answers are available on the web, namely in the deep web, which search engines cannot reach because they have not downloaded the pages in advance. The reasons for this are diverse, and the University of Twente is investigating methods to find this information anyway: interpreting questions correctly, sending questions to the right source, and interpreting search results and integrating them with results from other sources. The first demonstration of results from this research (http://treinplanner.info) has already attracted tens of thousands of visitors since the beginning of 2012.
Photo: Jan Taco te Gussinklo. A nice report can be found at: Dutch Button Works.
Study tour to South Korea and China
Noodle is the name of the 2012 study tour organized by study association Inter-Actief from the University of Twente. In September and October 2012 we will visit companies and universities in South Korea and China. Before the students depart they research the countries they will be visiting. All participants conduct research in one of the six research tracks defined within the tour's theme IT Integrated Lifestyle: how IT affects and enriches our daily lives.
The Study Tour Committee: David Huistra, Lex Utama, Marijn Mensinga, Mark Oude Veldhuis, Nils van Kleef, and Yme Joustra
Follow the Noodle study tour preparations at http://noodle2012.nl.
ImagePile: an Alternative for Vertical Results Lists
by Saskia Akkersdijk, Merel Brandon, Hanna Jochmann-Mannak, Djoerd Hiemstra, and Theo Huibers
Recent work shows that children are quite capable of searching with Google, due to their familiarity with the interface. However, children do have difficulties with the vertical list representation of the results. In this paper, we present an alternative result representation for a touch interface, the ImagePile. The ImagePile displays the results as a pile of images through which the user navigates by swiping horizontally. This representation was tested on a search engine for the Emma child hospital's library. In a within-subject experiment, both representations were tested with children to compare their usability. The vertical representation was perceived as easier to use, but the ImagePile system was considered more fun to use. Also, with the ImagePile system the children chose more relevant results, and they were more aware of the number of results.