A Probabilistic Ranking Framework using Unobservable Binary Events for Video Search

by Robin Aly, Djoerd Hiemstra, Arjen de Vries, and Franciska de Jong

This paper concerns the problem of searching video using the output of concept detectors (also known as high-level features). Unlike term occurrence in text documents, the occurrence of an audiovisual concept is only indirectly observable. We develop a probabilistic ranking framework for unobservable binary events, called PR-FUBE, to search in videos. The framework explicitly models the probability of relevance of a video shot through the presence and absence of concepts. From our framework we derive a ranking formula and show its relationship to previously proposed formulas. We evaluate our framework against two other retrieval approaches using the TRECVID 2005 and 2007 datasets. Retrieval with large numbers of concepts in particular results in good performance. We attribute the observed robustness against the noise introduced by less related concepts to the effective combination of concept presence and absence in our method. The experiments show that an accurate estimate of the probability of occurrence of a particular concept in relevant shots is crucial for effective retrieval results.
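The combination of presence and absence evidence described in the abstract can be illustrated with a binary-independence-style ranking that weights per-concept log-odds by the detector's posterior. This is a hypothetical sketch in the spirit of the approach, not the actual PR-FUBE formula; the function name and the probability estimates `p_rel` and `p_nonrel` are assumptions for illustration.

```python
import math

def score_shot(detector_posteriors, p_rel, p_nonrel):
    """Rank a shot by an expected log-likelihood ratio over concepts.

    detector_posteriors[i]: detector's estimate of P(concept i present | shot)
    p_rel[i]:    estimated P(concept i present | relevant shot)
    p_nonrel[i]: estimated P(concept i present | non-relevant shot)
    (Illustrative sketch only; not the paper's exact formula.)
    """
    score = 0.0
    for o, p, q in zip(detector_posteriors, p_rel, p_nonrel):
        # presence evidence, weighted by how likely the concept is present ...
        score += o * math.log(p / q)
        # ... plus absence evidence, weighted by how likely it is absent
        score += (1.0 - o) * math.log((1.0 - p) / (1.0 - q))
    return score
```

Note how a shot with a high posterior for a concept that is common in relevant shots scores above a shot where the detector believes the concept is absent, since both presence and absence contribute evidence.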

The paper will be presented at the ACM International Conference on Image and Video Retrieval CIVR 2008 in Niagara Falls, Canada

[download pdf]

The Effectiveness of Concept Based Search for Video Retrieval

by Claudia Hauff, Robin Aly, and Djoerd Hiemstra

In this paper we investigate how a small number of high-level concepts derived for video shots, such as Sports, Face, and Indoor, can be used effectively for ad hoc search in video material. We answer the following questions: 1) Can we automatically construct concept queries from ordinary text queries? 2) What is the best way to combine evidence from single concept detectors into final search results? We evaluated algorithms for automatic concept query formulation using WordNet-based concept extraction, and we evaluated algorithms for fast, online combination of concepts. Experimental results on data from the TREC Video 2005 workshop and 25 test users show the following: 1) automatic query formulation through WordNet-based concept extraction can achieve results comparable to user-created query concepts, and 2) combination methods that take neighboring shots into account improve retrieval results.
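One simple way to take neighboring shots into account when combining detector evidence is to smooth each concept's per-shot scores over a temporal window before averaging across concepts. The functions below are an illustrative assumption, not the paper's exact combination methods; the window size and the plain averaging scheme are placeholders.

```python
def smooth_with_neighbors(scores, window=1):
    """Average each shot's detector score with its temporal neighbors.

    scores: per-shot scores for one concept, in shot order.
    window: number of neighboring shots on each side to include
            (hypothetical parameter for this sketch).
    """
    n = len(scores)
    out = []
    for i in range(n):
        lo, hi = max(0, i - window), min(n, i + window + 1)
        out.append(sum(scores[lo:hi]) / (hi - lo))
    return out

def combine_concepts(per_concept_scores):
    """Combine evidence from several concept detectors by averaging
    the (optionally smoothed) scores per shot."""
    return [sum(col) / len(col) for col in zip(*per_concept_scores)]
```

With this sketch, a shot surrounded by high-scoring neighbors is boosted even if its own detector score is low, which is the intuition behind neighbor-aware combination.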

[download pdf]

Building Detectors to Support Searches on Combined Semantic Concepts

by Robin Aly, Djoerd Hiemstra, and Roeland Ordelman

Bridging the semantic gap, the gap between the low-level features extracted from a video and its conceptual contents, is one of the big challenges in multimedia information retrieval. A common approach to capturing the conceptual content of a video is to build concept detectors. A problem with this approach is that the number of detectors needed is impossible to determine in advance. This paper presents a set of eight methods for combining two existing concepts into a new one using a logical AND operator. The scores of each video shot for the combined concept are computed from the output of the underlying detectors. The findings are evaluated on the basis of the output of 101 detectors, including a comparison to the theoretical possibility of training a classifier for each combined concept. The precision gains are significant, especially for methods that also consider the chronological surroundings of a shot.
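Two plausible instances of such AND combinations are the product and the minimum of the underlying detector outputs. The sketch below assumes the detectors output per-shot probabilities in [0, 1]; it is illustrative only and not claimed to reproduce any of the paper's eight methods verbatim.

```python
def and_product(p_a, p_b):
    """Score shots for the combined concept 'A AND B' as the product of
    the two detectors' posteriors, assuming their outputs are independent
    probability estimates (an assumption of this sketch)."""
    return [a * b for a, b in zip(p_a, p_b)]

def and_minimum(p_a, p_b):
    """Fuzzy-logic alternative: take the weaker of the two detector
    outputs as the score for the combined concept."""
    return [min(a, b) for a, b in zip(p_a, p_b)]
```

The product rewards shots where both detectors are confident, while the minimum only requires that neither detector strongly disagrees; either could serve as a baseline against a classifier trained directly on the combined concept.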

[download pdf]

New project member: Robin Aly

Robin Aly has joined our group to work on audio and video retrieval. His project is sponsored by the CTIT SRO NICE (Strategic Research Orientation: Natural Interaction in Computer-mediated Environments). In this project we will closely cooperate with people from the Human Media Interaction group.