A Cross-lingual Framework for Monolingual Biomedical Information Retrieval

by Dolf Trieschnigg, Djoerd Hiemstra, Franciska de Jong, and Wessel Kraaij

An important challenge for biomedical information retrieval (IR) is dealing with the complex, inconsistent and ambiguous biomedical terminology. Frequently, a concept-based representation defined in terms of a domain-specific terminological resource is employed to deal with this challenge. In this paper, we approach the incorporation of a concept-based representation in monolingual biomedical IR from a cross-lingual perspective. In the proposed framework, this is realized by translating and matching between text and concept-based representations. The approach allows for deployment of a rich set of techniques proposed and evaluated in traditional cross-lingual IR. We compare six translation models and measure their effectiveness in the biomedical domain. We demonstrate that the approach can result in significant improvements in retrieval effectiveness over word-based retrieval. Moreover, we demonstrate increased effectiveness of a cross-lingual IR framework for monolingual biomedical IR if basic translations models are combined.

The paper will be presented at the 19th ACM International Conference on Information and Knowledge Management on October 26-30 in Toronto, Canada.