Weighted AUReC

Handling Skew in Shard Map
Quality Estimation for Selective Search

by Gijs Hendriksen, Djoerd Hiemstra, and Arjen de Vries

In selective search, a document collection is partitioned into a collection of topical index shards. To efficiently estimate the topical coherence (or quality) of a shard map, the AUReC (Area Under Recall Curve) measure was introduced. AUReC makes the assumption that shards are of similar sizes, one that is violated in practice, even for unsupervised approaches. The problem might be amplified if supervised labelling approaches with skewed class distributions are used. To estimate the quality of such unbalanced shard maps, we introduce a weighted adaptation of the AUReC measure, and empirically evaluate its effectiveness using the ClueWeb09B and Gov2 datasets. We show that it closely matches the evaluations of the original AUReC when shards are similar in size, but captures better the differences in performance when shard sizes are skewed.

To be presented at the European Conference on Information Retrieval (ECIR) in Glasgow on 24-28 March.

[download pdf]