dc.contributor.author Bleiholder, Jens
dc.contributor.author Naumann, Felix
dc.date.accessioned 2018-02-05T14:34:18Z
dc.date.available 2018-02-05T14:34:18Z
dc.date.issued 2008
dc.identifier.uri http://hdl.handle.net/123456789/7140
dc.description.abstract The development of the Internet in recent years has made it possible and useful to access many different information systems anywhere in the world to obtain information. While there is much research on the integration of heterogeneous information systems, most commercial systems stop short of the actual integration of available data. Data fusion is the process of fusing multiple records representing the same real-world object into a single, consistent, and clean representation. This article places data fusion into the greater context of data integration, precisely defines the goals of data fusion, namely, complete, concise, and consistent data, and highlights the challenges of data fusion, namely, uncertain and conflicting data values. We give an overview and classification of different ways of fusing data and present several techniques based on standard and advanced operators of the relational algebra and SQL. Finally, the article features a comprehensive survey of data integration systems from academia and industry, showing if and how data fusion is performed in each.
dc.format application/pdf
dc.language.iso English
dc.publisher Association for Computing Machinery (ACM)
dc.title Data fusion
dc.type journal-article
dc.identifer.doi 10.1145/1456650.1456651
dc.source.volume 41
dc.source.issue 1
dc.source.journal ACM Comput. Surv.