IR for children – Page 2 – Djoerd Hiemstra

What and How Children Search on the Web

by Sergio Duarte Torres and Ingmar Weber (Yahoo! Research)

The Internet has become an important part of the daily life of children as a source of information and leisure activities. Nonetheless, given that most of the content available on the web is aimed at the general public, children are constantly exposed to inappropriate content, either because the language goes beyond their reading skills, because their attention span differs from that of grown-ups, or simply because the content is not targeted at children, as is the case with ads and adult content. In this work we employed a large query log sample from a commercial web search engine to identify the struggles and search behavior of users ranging from children aged 6 to young adults aged 18. Concretely, we hypothesized that the large and complex volume of information to which children are exposed leads to ill-defined searches and to disorientation during the search process. For this purpose, we quantified their search difficulties based on query metrics (e.g. fraction of queries posed in natural language), session metrics (e.g. fraction of abandoned sessions) and click activity (e.g. fraction of ad clicks). We also used the search logs to retrace stages of child development. Concretely, we looked for changes in user interests (e.g. distribution of topics searched), language development (e.g. readability of the content accessed) and cognitive development (e.g. sentiment expressed in the queries) among children and adults. We observed that these metrics clearly demonstrate an increased level of confusion and unsuccessful search sessions among children. We also found a clear relation between the reading level of the clicked pages and the demographic characteristics of the users, such as age and the average educational attainment of the area in which the user is located.
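To make the three kinds of metrics concrete, here is a minimal sketch of how they could be computed from a toy query log. It is not the authors' method: the `Query` record, the question-word heuristic for "natural language" queries, and the definition of an abandoned session as one without any click are all assumptions made for illustration.

```python
from dataclasses import dataclass, field

# Assumption: a query counts as "natural language" if it starts with one of
# these words or ends with a question mark. The paper may define it differently.
QUESTION_WORDS = {"what", "how", "why", "who", "where", "when", "which"}


@dataclass
class Query:
    """Hypothetical log record: the query text and the clicks it received."""
    text: str
    clicks: list = field(default_factory=list)  # each click is "ad" or "organic"


def is_natural_language(text):
    words = text.lower().split()
    return bool(words) and (words[0] in QUESTION_WORDS or text.strip().endswith("?"))


def log_metrics(sessions):
    """Compute one query metric, one session metric and one click metric.

    `sessions` is a list of sessions; each session is a list of Query objects.
    """
    queries = [q for session in sessions for q in session]
    if not queries:
        raise ValueError("the log contains no queries")
    clicks = [c for q in queries for c in q.clicks]
    # Assumption: a session is abandoned when none of its queries got a click.
    abandoned = sum(1 for session in sessions if not any(q.clicks for q in session))
    return {
        "natural_language_queries": sum(is_natural_language(q.text) for q in queries) / len(queries),
        "abandoned_sessions": abandoned / len(sessions),
        "ad_clicks": sum(c == "ad" for c in clicks) / len(clicks) if clicks else 0.0,
    }


sessions = [
    [Query("how do volcanoes erupt", ["organic"])],
    [Query("dinosaur games"), Query("free dinosaur games")],
    [Query("why is the sky blue?", ["ad", "organic"])],
    [Query("horses", ["organic"])],
]
metrics = log_metrics(sessions)
print(metrics)
```

Computing the same fractions per age group would allow the kind of comparison between children and adults that the abstract describes.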

Visual Exploration of Health Information for Children

by Frans van der Sluis, Sergio Duarte, Djoerd Hiemstra, Betsy van Dijk and Frea Kruisinga

Children experience several difficulties retrieving information using current Information Retrieval (IR) systems. Particularly, children struggle to find the right keywords to construct queries given their lack of domain knowledge. This problem is even more critical in the case of the specialized health domain. In this work we present a novel method to address this problem using a cross-media search interface in which the textual data is searched through visual images. This solution aims to solve the recall and recognition problem which is salient for health information, by replacing the need for a vocabulary with the easy task of recognising the different body parts.

Automatic Reformulation of Children’s Search Queries

by Maarten van Kalsbeek, Joost de Wit, Dolf Trieschnigg, Paul van der Vet, Theo Huibers and Djoerd Hiemstra

The number of children who have access to an Internet connection (at home or at school) is large and growing fast. Many of these children search the web by using a search engine. However, these search engines do not take children's skills and preferences into account, which makes searching difficult. This paper tries to uncover methods and techniques that can be used to automatically improve search results for queries formulated by children. To achieve this, a prototype query expander is built that implements several of these techniques. The paper concludes with an evaluation of the prototype and a discussion of the promising results.

SIGIR Workshop on Accessible Search Systems

We are organizing a workshop on an exciting new theme at SIGIR on 23 July 2010 in Geneva, Switzerland.

Current search systems are not adequate for individuals with specific needs: children, older adults, people with visual or motor impairments, and people with intellectual disabilities or low literacy. Search services are typically created for average users (young or middle-aged adults without physical or mental disabilities) and information retrieval methods are based on their perception of relevance as well. The workshop will be the first ever to raise the discussion on how to make search engines accessible for different types of users, including those with problems in reading, writing or comprehension of complex content. Search accessibility means that people whose abilities are considerably different from those that average users have will be able to use search systems with the same success.

The objective of the workshop is to provide a forum and initiate collaborations between academics and industrial practitioners interested in making search more usable for users in general and for users with specific needs in particular. We encourage presentation and participation from researchers working at the intersection of information retrieval, natural language processing, human-computer interaction, ambient intelligence and related areas. The workshop will be a mix of oral presentations for long papers (maximum of 8 pages), a session for posters (maximum of 2 pages) and a panel discussion. All submissions will be reviewed by at least two PC members. Workshop proceedings will be available at the workshop. The workshop welcomes, but is not limited to, contributions on a range of the following key issues:

Understanding of search behavior of users with specific needs
Understanding of relevance criteria of users with specific needs
Understanding the effects of domain expertise, age, user experience and cognitive abilities on search goals and results evaluation
Non-topical aspects of relevance: text style, readability, appropriateness of language (harassment and explicit content detection)
Development of test collections for evaluation of accessible search systems
Collaborative search techniques for assisting users with specific needs (e.g. parents helping children)
Potential of search personalization techniques to satisfy users with specific needs
Search interfaces and result representation for people with specific needs
Using assistive technologies for interaction with search systems, e.g. speech recognition or eye tracking software for querying and browsing.

See the Workshop website.

New DB group member: Sergio Duarte Torres

Today, Sergio Duarte Torres joined our group to work on PuppyIR, a European project that will develop an open source environment to construct information services for children. Welcome Sergio!

First PuppyIR search architecture

PuppyIR: Designing an Open Source Framework for Interactive Information Services for Children

by Leif Azzopardi, Richard Glassey, Mounia Lalmas, Tamara Polajnar, and Ian Ruthven

One of the main aims of the PuppyIR project is to provide an open source framework for the development of Interactive Information Retrieval Services. The main focus of the project is directed towards developing such services for children, which introduces a number of novel and challenging issues to address (such as language development, security, moderation, etc).

In this poster paper, we outline the preliminary high-level design of the open source framework. The framework uses a layered architecture to minimize dependencies between the user-side concerns of interaction and presentation, and the system-side concerns of aggregating content from multiple sources and processing information appropriately. Each layer will consist of a series of interchangeable components, which can be interconnected to form a complete service. To facilitate the construction of diverse information services, a dataflow language is proposed to enable the assembly of the components in an intuitive and visual manner. One of the design goals of the architecture, and ultimate measures of success, is to provide a “lego” style building block environment in which researchers and developers of any age can build their own information service. The poster provides the starting point for the design of the framework and aims to seek comments, feedback and suggestions from the community in order to improve and refine the architecture.
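The idea of interchangeable components that are connected into a complete service can be illustrated with a small sketch. This is not the PuppyIR framework or its dataflow language: plain function composition stands in for the dataflow assembly, and every name here (`compose`, `aggregate`, `moderate`, `present`, and the two toy sources) is hypothetical.

```python
from functools import reduce


def compose(*components):
    """Connect components so the output of each one feeds the next."""
    def service(data):
        return reduce(lambda value, component: component(value), components, data)
    return service


def aggregate(sources):
    """System-side component: merge results from several content sources."""
    def component(query):
        return [doc for source in sources for doc in source(query)]
    return component


def moderate(blocked_words):
    """System-side component: drop documents containing a blocked word."""
    def component(docs):
        return [d for d in docs if not any(w in d.lower() for w in blocked_words)]
    return component


def present(max_results):
    """User-side component: turn documents into a short numbered list."""
    def component(docs):
        return [f"{i}. {d}" for i, d in enumerate(docs[:max_results], start=1)]
    return component


# Hypothetical content sources; a real service would query external systems.
def library_source(query):
    return [f"Library book about {query}"]


def web_source(query):
    return [f"Web page about {query}", f"Scary story about {query}"]


service = compose(
    aggregate([library_source, web_source]),
    moderate({"scary"}),
    present(2),
)
result = service("volcanoes")
print(result)
```

Because each component only depends on the shape of the data it receives, a different moderation or presentation component can be swapped in without touching the others, which is the property the layered design aims for.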

PuppyIR: IR for Children

As adults we are keen to help children reach their full potential. Developing children’s abilities to find and understand information is key to their development as young adults. The Internet offers children exciting new ways to meet people, learn about different cultures and develop their creative potential. In a world where the Internet and technology play such an important role as they do today, it is essential that children can assess the meaning of the information they gather and can engage with content in child-friendly ways.

However, children’s ability to use the Internet is severely hampered by the lack of appropriate search tools. Most Information Retrieval (IR) systems are designed for adults: they return information that is unsuitable for children, present information in lists that children find difficult to manage and make it difficult for children to identify the relevant parts. Worse, almost all Internet search engines confront children with inappropriate material.

PuppyIR is an FP7 project that will help children search the Internet safely and successfully through the design of an open source platform of child-friendly information services. These information services will be able to summarise content for children, moderate information for children, help children safely build social networks and intelligently aggregate information for presentation to children. PuppyIR aims to facilitate the creation of child-centric information access, based on an understanding of the behaviour and needs of children. PuppyIR will provide a suite of components that system designers can use to construct usable and tailored IR systems for children, giving children the opportunity to fully exploit the Internet. PuppyIR will develop new interaction paradigms that allow children to express their information needs simply and have results presented in an intuitive way. PuppyIR will contribute to the evaluation of children’s IR systems through the development of child-centred evaluation methods.

More info at: PuppyIR project page at NIRICT.