The BERT Ranking Paradigm: Training Strategies Evaluated
by Maurice Verbrugge
This thesis investigates a recent paradigm in information retrieval that applies the neural language representation model BERT to rank relevant passages from a corpus. The research focuses on a re-ranking scheme in which BM25 pre-ranks the corpus and BERT then re-ranks the candidates, and it explores improved fine-tuning methodology for a pre-trained BERT model. This goal is pursued in two parts: in the first, all methods rely on binary relevance labels, while the second applies methods that rely on multiple relevance labels instead. Part one investigates training-data enhancement and inductive transfer learning methods. Part two investigates single-class multi-label methods, multi-class multi-label methods, and label-based regression. In both parts, the methods are evaluated on the fully annotated Cranfield dataset.
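A minimal sketch of this re-ranking pipeline is given below, assuming the rank_bm25 and Hugging Face transformers libraries; the toy corpus, the bert-base-uncased checkpoint, and all variable names are illustrative assumptions rather than the thesis implementation. The BERT classification head is untrained here, whereas in the thesis setting it would be fine-tuned on relevance-labelled query-passage pairs.

    # Minimal sketch of a BM25 -> BERT re-ranking pipeline (illustrative only).
    import torch
    from rank_bm25 import BM25Okapi
    from transformers import AutoTokenizer, AutoModelForSequenceClassification

    corpus = [
        "experimental investigation of the aerodynamics of a wing in a slipstream",
        "simple shear flow past a flat plate in an incompressible fluid",
        "the boundary layer in simple shear flow past a flat plate",
    ]
    query = "aerodynamics of a wing in a slipstream"

    # Stage 1: BM25 pre-ranking over whitespace-tokenised passages.
    bm25 = BM25Okapi([doc.split() for doc in corpus])
    scores = bm25.get_scores(query.split())
    top_k = sorted(range(len(corpus)), key=lambda i: scores[i], reverse=True)[:2]

    # Stage 2: BERT re-ranking of the BM25 candidates. The classification head
    # is untrained here; in practice it is fine-tuned on relevance labels.
    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)
    model.eval()

    pairs = [(query, corpus[i]) for i in top_k]
    inputs = tokenizer([q for q, _ in pairs], [p for _, p in pairs],
                       padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    relevance = logits.softmax(dim=-1)[:, 1]  # probability of the "relevant" class
    reranked = [idx for _, idx in sorted(zip(relevance.tolist(), top_k), reverse=True)]
    print(reranked)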
This thesis demonstrates that applying inductive transfer learning with the Next Sentence Prediction task improves on the baseline, and it presents several methods for enriching the fine-tuning data at different stages of the BM25-BERT ranking pipeline. It also demonstrates that applying a regression method yields above-baseline performance, indicating that this approach outperforms rule-based filtering of classifier results.
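As an illustration of the label-based regression approach, the following sketch fine-tunes BERT with a single-output head so that float-valued relevance labels are fitted with a mean-squared-error loss; the texts, label values, and hyperparameters are placeholders, not the Cranfield annotations or the thesis configuration.

    # Sketch of fine-tuning BERT as a regressor on graded relevance labels
    # (illustrative; texts and labels are placeholders, not Cranfield data).
    import torch
    from transformers import AutoTokenizer, AutoModelForSequenceClassification

    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    # num_labels=1 turns the model into a single-output regressor; with float
    # labels the forward pass computes a mean-squared-error loss.
    model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=1)

    queries = ["slipstream wing aerodynamics", "slipstream wing aerodynamics"]
    passages = ["aerodynamics of a wing in a slipstream",
                "heat transfer in laminar boundary layers"]
    labels = torch.tensor([3.0, 0.0])  # graded relevance, e.g. 0..3

    inputs = tokenizer(queries, passages, padding=True, truncation=True, return_tensors="pt")
    optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

    model.train()
    outputs = model(**inputs, labels=labels)  # MSE loss between predictions and labels
    outputs.loss.backward()
    optimizer.step()

    # At ranking time, the single regression output is used directly as the score.
    model.eval()
    with torch.no_grad():
        scores = model(**inputs).logits.squeeze(-1)
    print(scores.tolist())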