Abstract:
A general re-weighting method, called contextualization, is introduced for more efficient element ranking in XML retrieval. Re-weighting uses the ancestors of an element as its context: if the element appears in a good context, where good is interpreted as a high probability of relevance, its weight is increased in relevance scoring; if it appears in a bad context, its weight is decreased. Contextualization is presented formally within a general framework for XML representation and manipulation that is based on structural indices. This makes the approach independent of weighting schemes and query languages. Contextualization is evaluated with the INEX test collection. We tested four runs: no contextualization, and parent, root, and tower contextualization. The contextualization runs were significantly better than the run without contextualization, and root contextualization was the best among the re-weighted runs.
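
The re-weighting idea can be illustrated with a minimal sketch. It is not the formal definition used in the work; it rests on several assumptions made only for illustration: elements are addressed by Dewey-style structural indices written as tuples, "tower" is read as all ancestors of the element, and the element's own weight is combined with the mean weight of its context by a plain arithmetic average. The names `ancestors`, `context_indices`, and `contextualized_weight` are hypothetical.

```python
from statistics import mean


def ancestors(index):
    """Return the ancestor indices of a Dewey-style index, nearest first.

    Example: (1, 2, 3) has ancestors (1, 2) and (1,).
    """
    return [index[:i] for i in range(len(index) - 1, 0, -1)]


def context_indices(index, mode):
    """Select which ancestors form the context for the given mode."""
    anc = ancestors(index)
    if mode == "none" or not anc:
        return []
    if mode == "parent":
        return [anc[0]]      # nearest ancestor only
    if mode == "root":
        return [anc[-1]]     # outermost ancestor only
    if mode == "tower":
        return anc           # assumption: every ancestor up to the root
    raise ValueError(f"unknown contextualization mode: {mode}")


def contextualized_weight(index, weights, mode):
    """Combine an element's own weight with the weight of its context.

    Assumption: the combination is an unweighted average of the element's
    weight and the mean weight of the selected context elements.
    """
    own = weights[index]
    context = [weights[i] for i in context_indices(index, mode)]
    if not context:
        return own           # no context available: weight is unchanged
    return mean([own, mean(context)])


# Made-up weights for a root, its child, and its grandchild.
weights = {(1,): 0.8, (1, 2): 0.2, (1, 2, 3): 0.4}

for mode in ("none", "parent", "root", "tower"):
    print(mode, contextualized_weight((1, 2, 3), weights, mode))
```

With these made-up weights, the grandchild element is pulled down by its weakly scored parent and pulled up by its strongly scored root, which is the behaviour described above: a good context increases the weight and a bad context decreases it.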