Search for the Future

Information Retrieval is the discipline that studies computer-based search tools. Many applications that handle information on the Internet would be completely inadequate without the support of information retrieval technology. How would we manage our email without spam filtering? How would we find information on the World Wide Web if there were no web search engines? The rise of web search engines has been one of the major success stories in computer science of the last decade: Internet and search companies like Google and Yahoo are now among the world's most influential information technology companies.

Today, search technology is provided and developed by major search providers like Google and Yahoo, and by small, specialized companies with expert staff. But as search technology matures, it will have to be available to non-expert application developers as well. A major obstacle to achieving this is the lack of theories and high-level abstractions of search systems, and of declarative query languages. Another obstacle is the lack of methods to handle non-textual data, such as images, audio and video. Several projects of the Database Group of the University of Twente try to solve these problems for application areas such as Entity Search, Expert Search, Video Search, and Distributed Search. The models and approaches developed in these projects are evaluated on large-scale, realistic test beds, and implemented in the group's open-source search system PF/Tijah, which combines keyword queries with structured queries on XML databases. The research contributes to several courses in the university's graduate programs, for instance Information Retrieval, XML & Databases 1, and XML & Databases 2.