Johannes Wassenaar graduates on automatic video hyperlinking

by Johannes Wassenaar

Linking segments of video using text-based methods and a flexible form of segmentation

To let users explore and use large archives, video hyperlinking aims to link segments of video to other video segments, much as hyperlinks are used on the web, instead of relying on a regular search tool. Indexing, querying and re-ranking multimodal data, in this case videos, are common subjects in the video hyperlinking community. A video hyperlinking system contains an index of multimodal (video) data, and the currently watched segment, the anchor, is translated into a query in the query generation phase. Finally, the system responds to the user with a ranked list of targets that are about the anchor segment. In this study, term payloads in Elasticsearch, in the form of position and offset, are used to obtain time-based information along the speech transcripts, so that users are linked directly to spoken text. The queries are generated by a statistical method using TF-IDF, a grammar-based part-of-speech tagger, or a combination of both. Finally, results are ranked by weighting specific components and by cosine similarity. The system is evaluated with the Precision at 5 and MAiSP measures, which are used in the TRECVid benchmark on this topic. The results show that TF-IDF and cosine similarity work best for the proposed system.
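The TF-IDF query generation and cosine ranking steps described above can be illustrated with a minimal sketch. This is not the thesis implementation: it leaves out Elasticsearch, the part-of-speech tagger and the component weighting, and the function names, the whitespace tokenizer and the smoothed IDF formula are all assumptions made for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a stand-in for real transcript tokenization)."""
    return text.lower().split()


def idf_table(docs):
    """Smoothed inverse document frequency for every term in the collection (assumed formula)."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {term: math.log((1 + n) / (1 + count)) + 1.0 for term, count in df.items()}


def tfidf_vector(tokens, idf):
    """Sparse TF-IDF vector as a dict; terms unseen in the collection are dropped."""
    tf = Counter(tokens)
    return {term: freq * idf[term] for term, freq in tf.items() if term in idf}


def generate_query(anchor_tokens, idf, k=3):
    """Query generation: keep the k highest-weighted terms of the anchor segment."""
    vec = tfidf_vector(anchor_tokens, idf)
    top = sorted(vec.items(), key=lambda item: (-item[1], item[0]))[:k]
    return dict(top)


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_targets(query, targets, idf):
    """Return target names ordered by decreasing cosine similarity to the query."""
    scored = [(cosine(query, tfidf_vector(tokens, idf)), name)
              for name, tokens in targets.items()]
    return [name for _, name in sorted(scored, key=lambda pair: (-pair[0], pair[1]))]


def precision_at_k(ranked, relevant, k=5):
    """Fraction of the top k ranked targets that are relevant."""
    return sum(1 for name in ranked[:k] if name in relevant) / k


if __name__ == "__main__":
    # Hypothetical transcripts of three target segments.
    targets = {
        "seg_a": tokenize("the windmill turns in the dutch wind"),
        "seg_b": tokenize("the football match ended in a draw"),
        "seg_c": tokenize("dutch windmill restoration project"),
    }
    idf = idf_table(list(targets.values()))
    query = generate_query(tokenize("the old dutch windmill"), idf)
    print(rank_targets(query, targets, idf))
```

In a real system the anchor and target vectors would come from the index rather than from in-memory dictionaries, and a stop word list or the part-of-speech filter would keep terms such as "the" out of the query.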

[download pdf]

Emiel Mols graduates on sharding Spotify search

Today, Emiel Mols graduated after presenting the master thesis project he did at Spotify in Stockholm, Sweden. Emiel got quite some attention last year when he launched SpotifyOnTheWeb, leaving Spotify “no choice but to hire him”.

In the master thesis, Emiel describes a prototype implementation of a term-sharded full-text search architecture. The system's requirements are based on the use case of searching for music in the Spotify catalogue. He benchmarked the system using non-synthetic data gathered from Spotify’s infrastructure.
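As a rough illustration of what term sharding means, the sketch below routes each term's posting list to exactly one shard, so a query only has to contact the shards that own its terms. It is a toy in-memory model and not the prototype from the thesis: the class name, the CRC32 routing and the AND-only query semantics are assumptions made for this example.

```python
import zlib
from collections import defaultdict


class TermShardedIndex:
    """Toy inverted index in which each term's posting list lives on exactly one shard."""

    def __init__(self, num_shards):
        # Each shard maps a term to the set of document ids that contain it.
        self.shards = [defaultdict(set) for _ in range(num_shards)]

    def shard_for(self, term):
        # crc32 is stable across runs, unlike Python's built-in hash() for strings.
        return zlib.crc32(term.encode("utf-8")) % len(self.shards)

    def add(self, doc_id, text):
        """Index a document by sending every distinct term to the shard that owns it."""
        for term in set(text.lower().split()):
            self.shards[self.shard_for(term)][term].add(doc_id)

    def search(self, query):
        """AND query: fetch each term's postings from its shard, then intersect them."""
        terms = query.lower().split()
        if not terms:
            return set()
        postings = [self.shards[self.shard_for(t)].get(t, set()) for t in terms]
        return set.intersection(*postings)


if __name__ == "__main__":
    index = TermShardedIndex(num_shards=4)
    index.add("track1", "Dancing Queen")
    index.add("track2", "Killer Queen")
    print(index.search("dancing queen"))
```

The contrast is with document sharding, where every shard holds a complete index for a subset of the documents and each query is sent to all shards.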

The thesis will be available from ePrints.

AXES at TRECVid 2011

by Kevin McGuinness, Robin Aly, et al.

The AXES project participated in the interactive known-item search task (KIS) and the interactive instance search task (INS) for TRECVid 2011. We used the same system architecture and a nearly identical user interface for both the KIS and INS tasks. Both systems made use of text search on ASR, visual concept detectors, and visual similarity search. The user experiments were carried out with media professionals and media students at the Netherlands Institute for Sound and Vision, with media professionals performing the KIS task and media students participating in the INS task. This paper describes the results and findings of our experiments.

[download pdf]

AXES: Access to Audiovisual Archives


AXES is a large-scale integrating project (IP) funded by the European Union's 7th Framework Programme that starts in January 2011. The goal of AXES is to develop tools that provide various types of users with new engaging ways to interact with audiovisual libraries, helping them discover, browse, navigate, search and enrich archives. In particular, apart from a search-oriented scheme, we will explore how suggestions for audiovisual content exploration can be generated via a myriad of information trails crossing the archive. This will be approached from three perspectives (or axes): users, content, and technology.

Within AXES, innovative indexing techniques are developed in close cooperation with a number of user communities through tailored use cases and validation stages. Rather than just starting new investments in technical solutions, we propose the co-development of innovative paradigms of use and novel navigation and search facilities. We will target media professionals, educators, students, amateur researchers, and home users.

Based on an existing Open Source service platform for digital libraries, novel navigation and search functionalities will be offered via interfaces tuned to user profiles and workflow. To this end, AXES will develop tools for content analysis deploying weakly supervised classification methods. Information in scripts, audio tracks, wikis or blogs will be used for the cross-modal detection of people, places, events, etc., and for link generation between audiovisual content. Users will be engaged in the annotation process: with the support of selection and feedback tools, they will enable the gradual improvement of tagging performance. AXES technology will open up audiovisual digital libraries, increasing their cultural value and their exposure to the European public and academia at large.

The consortium is a perfect match to the multi-disciplinary nature of the project, with professional content owners, academic and industrial experts in audiovisual analysis, retrieval, and user studies, and partners experienced in system integration and project management. Our partners in AXES are: GEIE ERCIM, Katholieke Universiteit Leuven, University of Oxford, Institut National de Recherche en Informatique et en Automatique (INRIA), Dublin City University, Fraunhofer Gesellschaft, BBC, Netherlands Institute for Sound and Vision, Deutsche Welle, Technicolor, EADS, and Erasmus University Rotterdam.

Guest lecture by Alexander Hauptmann at SSR-4

The 4th SIKS/Twente Seminar on Searching and Ranking will take place on the 2nd of July at the University of Twente. The goal of the one-day seminar is to bring together researchers from companies and academia working on the effectiveness of search engines. Invited speakers are:

  • Alexander Hauptmann (Carnegie Mellon University, USA)
  • Arjen de Vries (CWI and University of Delft, Netherlands)
  • Wessel Kraaij (TNO and Radboud University Nijmegen, Netherlands)

The workshop will take place at the campus of the University of Twente at the Citadel (building 9), lecture hall T300. SSR is sponsored by SIKS and CTIT.

More information at SSR-4.

Erwin de Moel graduates on managing recorded lectures for Collegerama

Expanding the usability of recorded lectures: A new age in teaching and classroom instruction

by Erwin de Moel

The status of recorded lectures at Delft University of Technology has been studied in order to expand their usability in the university's present and future educational environment. Possibilities for the production of single-file vodcasts have been tested. These videos make the recorded lectures more accessible through other distribution platforms. Furthermore, the production of subtitles has been studied, using an ASR system called SHoUT, developed at the University of Twente, and machine translation of subtitles into other languages. SHoUT-generated transcripts always require post-processing for subtitling. Machine translation could produce translated subtitles of sufficient quality. Navigation of recorded lectures needs to be improved, which requires input from the lecturer. Metadata collected from lecture chapter titles, slide data (titles, content and notes) and ASR results has been used to create a lecture search engine, which also produces interactive tables of contents and tag clouds for each lecture. Recorded lectures could be further enhanced with time-based discussion boards for asking and answering questions. Further improvements have been proposed to allow recorded lectures to be re-used in recurring online courses.

DetectSim software released

DetectSim contains software for simulating concept detectors for video retrieval. Researchers can use the software to test their concept-based video retrieval approaches without the need to build real detectors.

Concept-based video retrieval is a promising search paradigm because it is fully automated and it investigates the fine-grained content of a video, which is normally not captured by human annotations. Concepts are captured by so-called concept detectors. However, since these detectors do not yet perform sufficiently well, evaluating retrieval systems that are built on top of the detector output is difficult. In this report we describe a software package which generates simulated detector output for a specified performance level. Afterwards, this output can be used to execute a search run and ultimately to evaluate the performance of the proposed retrieval method, which is normally done through comparison to a baseline. The probabilistic model of a detector consists of two Gaussians, one for the positive and one for the negative class. Thus, the parameters for the simulation are the two means and deviations plus the prior probability of the concept in the dataset.
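The two-Gaussian model can be sketched in a few lines. The code below is not the DetectSim package itself: the function names, the use of average precision as the performance measure, and the seeded random generator are assumptions made for illustration. It draws a class label for each shot from the prior and then samples a detector score from the Gaussian of that class.

```python
import random


def simulate_detector(num_shots, prior, mu_pos, sigma_pos, mu_neg, sigma_neg, seed=0):
    """Return a list of (is_positive, score) pairs for one simulated concept detector."""
    rng = random.Random(seed)
    shots = []
    for _ in range(num_shots):
        # The concept occurs in a shot with probability `prior`.
        is_positive = rng.random() < prior
        if is_positive:
            score = rng.gauss(mu_pos, sigma_pos)
        else:
            score = rng.gauss(mu_neg, sigma_neg)
        shots.append((is_positive, score))
    return shots


def average_precision(shots):
    """Average precision of the ranking obtained by sorting shots by decreasing score."""
    ranked = sorted(shots, key=lambda shot: -shot[1])
    hits = 0
    precision_sum = 0.0
    for rank, (is_positive, _) in enumerate(ranked, start=1):
        if is_positive:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


if __name__ == "__main__":
    # Moving the class means apart yields a better simulated detector.
    for mu_pos in (0.5, 1.5, 3.0):
        shots = simulate_detector(2000, prior=0.1, mu_pos=mu_pos, sigma_pos=1.0,
                                  mu_neg=0.0, sigma_neg=1.0)
        print(mu_pos, round(average_precision(shots), 3))
```

Varying the distance between the two means, or the deviations, changes how well the simulated detector separates the classes, which is how a target performance level could be approximated in this sketch.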

Download Now!

Download Technical Report.

User study for concept retrieval available

In our recent TRECVID experiments we evaluated a concept retrieval approach to video retrieval, i.e. the user searches a collection of video shots by using automatically detected concepts such as face, people, indoor, sky, building, etc. The performance of such systems is still far from sufficient to be usable in reality, but is this because automatic detectors are bad? Because users cannot write concept queries? Because systems cannot rank concept queries? Or possibly all of the above?

To help researchers answer this question, we made available the data from a user study involving 24 users. In the experiment, users had to select from a set of 101 concepts those concepts they expected to be helpful for finding certain information. For instance, suppose one needs to find shots of “one or more palm trees”. Most people, 18 out of 24, chose the concept tree, but others chose outdoor (15), vegetation (9), sky (8), beach (8), or desert (4). The summarized results can be accessed now from Robin Aly's page.

Download the user study data.

TREC Video Workshop 2008

by Robin Aly, Djoerd Hiemstra, Arjen de Vries, and Henning Rode

In this report we describe our experiments performed for TRECVID 2008. We participated in the High Level Feature extraction and the Search task. For the High Level Feature extraction task we mainly installed our detection environment. In the Search task we applied our new PRFUBE ranking model together with a method that estimates a vital parameter of the model, the probability of a concept occurring in relevant shots. The PRFUBE model has similarities to the well-known Probabilistic Text Information Retrieval methodology and follows the Probability Ranking Principle.

[download pdf]

Guest lecture: Thijs Westerveld of TeezIR

Opinion Mining and Multimedia Information Retrieval

Who: Thijs Westerveld (TeezIR)
When: Wednesday, October 1, 2008, 13.45-15.30 h., room HO-3136
What: Opinion Mining and Multimedia Information Retrieval

Thijs Westerveld has over 10 years of experience in various areas of information retrieval, mostly in an academic setting. He received a PhD in Computer Science from the Human Media Interaction group at the University of Twente in 2004, on the use of generative probabilistic models for multimedia retrieval. Thijs has worked on numerous national and European projects in the areas of information retrieval and multimedia retrieval, and published in international journals and conferences in the field. Thijs currently works at TeezIR Search Solutions in Utrecht.