Abstract:
Data warehouse view maintenance is an important issue due to the growing use of warehouse technology for information integration and data analysis. Given the dynamic nature of modern distributed environments, both data updates and schema changes are likely to occur in different data sources. In applications for which real-time refresh of the data warehouse extent under source changes is not critical, source updates are usually maintained in batches to reduce the maintenance overhead. However, most prior work can only batch source data updates. In this paper, we provide a solution strategy that is capable of batching both source data updates and schema changes. We propose techniques that first preprocess the initial source updates to construct summarized delta changes for each source. We then design a view adaptation algorithm to adapt the warehouse view under these delta changes. We have implemented our proposed batching solution and incorporated it into an existing data warehouse prototype system. The experimental studies demonstrate the excellent performance achievable by our batch technique.