| dc.description.abstract |
XML stream applications pose the challenge of efficiently processing queries over sequentially accessible, token-based data. While the automata model is naturally suited to pattern matching on tokenized XML streams, the algebraic model is a well-established technique for set-oriented processing of self-contained tuples. However, neither the automata model nor the algebraic model alone is well equipped to handle both computation paradigms. The goal of the Raindrop project is to accommodate these two paradigms within one algebraic framework to take advantage of both. In our query model, both tokenized data and self-contained tuples are supported in a uniform manner. Query plans can be flexibly rewritten using equivalence rules to change which computation is done on tokenized data and which on tuples. This paper highlights the four abstraction levels in Raindrop, namely the semantics-focused plan, the stream logical plan, the stream physical plan, and the execution plan. Various optimization techniques are provided at each level. The necessity of such a uniform and layered plan is shown by an experimental study. |
|