In our recent TRECVID experiments we evaluated a concept-based approach to video retrieval: the user searches a collection of video shots using automatically detected concepts such as face, people, indoor, sky, and building. The performance of such systems is still far from sufficient for real-world use, but is this because automatic detectors are bad? Because users cannot write concept queries? Because systems cannot rank concept queries? Or possibly all of the above?
To help researchers answer this question, we have made available the data from a user study involving 24 users. In the experiment, users had to select, from a set of 101 concepts, those concepts they expected to be helpful for finding certain information. For instance, suppose one needs to find shots of “one or more palm trees”. Most people, 18 out of 24, chose the concept tree, but users also chose outdoor (15), vegetation (9), sky (8), beach (8), or desert (4). The summarized results are now available from Robin Aly's page.
Download the user study data.
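As a rough illustration of how such selection counts might be analysed, the sketch below turns the palm-tree counts quoted above into per-concept selection rates. It uses only the numbers given in this post; the variable and function names are hypothetical, and the layout of the downloadable data is not assumed here.

```python
# Selection counts for the topic "one or more palm trees", copied from the post.
# NUM_USERS, selections and selection_rates are illustrative names, not part of
# the released data set.
NUM_USERS = 24
selections = {
    "tree": 18,
    "outdoor": 15,
    "vegetation": 9,
    "sky": 8,
    "beach": 8,
    "desert": 4,
}


def selection_rates(counts, num_users):
    """Return (concept, fraction of users) pairs, most-selected first.

    Ties are broken alphabetically so the ordering is deterministic.
    """
    pairs = ((concept, n / num_users) for concept, n in counts.items())
    return sorted(pairs, key=lambda pair: (-pair[1], pair[0]))


rates = selection_rates(selections, NUM_USERS)
for concept, rate in rates:
    print(f"{concept:<12}{rate:.1%}")
```

The counts sum to more than 24 because a user could select several concepts for the same topic, so the rates describe agreement per concept rather than a distribution over users.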