DSpace Repository

Strategies for Processing ad hoc Queries on Large Data Warehouses

dc.contributor.author Stockinger Kurt (CERN, Geneva, Switzerland; Kurt.Stockinger@cern.ch)
dc.contributor.author Wu Kesheng
dc.contributor.author Shoshani Arie
dc.date.accessioned 2018-01-22T17:25:32Z
dc.date.available 2018-01-22T17:25:32Z
dc.date.issued 2002
dc.identifier.uri http://hdl.handle.net/123456789/6997
dc.description.abstract As data warehousing applications grow in size, existing data organizations and access strategies, such as relational tables and B-tree indexes, are becoming increasingly ineffective. The two primary reasons are that these datasets involve many attributes and that queries on the data usually involve conditions on small subsets of the attributes. Two strategies are known to address these difficulties well: vertical partitioning and bitmap indexes. In this paper, we summarize our experience of implementing a number of bitmap index schemes on vertically partitioned data tables. One important observation is that simply scanning the vertically partitioned data tables is often more efficient than using B-tree based indexes to answer ad hoc range queries on static datasets. For these range queries, compressed bitmap indexes are in most cases more efficient than scanning vertically partitioned tables. We evaluate the performance of two different compression schemes for bitmap indexes stored in various ways. Using the compression scheme called Word-Aligned Hybrid Code (WAH) to store the bitmaps in plain files shows the best overall performance for bitmap indexes. Tests indicate that our bitmap index strategy based on WAH is efficient not only for attributes of low cardinality (say, fewer than 100 distinct values) but also for high-cardinality attributes with 200,000 or more distinct values.
dc.format application/pdf
dc.title Strategies for Processing ad hoc Queries on Large Data Warehouses
dc.subject General Terms: Algorithms, Performance
dc.type generic

