Data Integration In Data Warehousing

Abstract

Data integration is one of the most important components of data warehousing. When data is passed from the application-oriented operational environment to the data warehouse, possible redundancies and inconsistencies must be resolved (Calvanese et al., 2001, p.237), so that the warehouse provides a reconciled and integrated view of the organization's information. This paper describes an approach to data integration in data warehousing that is founded on a conceptual representation of the Data Warehouse application domain (Calvanese et al., 2001, p.237). It proposes a technique for declaratively specifying suitable reconciliation correspondences, which are used to resolve conflicts among data from different sources. The main goal of the technique is to support the design of mediators that materialize the data in the Data Warehouse relations (Calvanese et al., 2001, p.237).

Abstract

Introduction

Data warehousing approaches

Industrial perspective

Architecture of data warehousing system

Problems regarding data warehousing

Wrapper/monitors

Translation

Change detection

Issues with wrapper/monitor

Miscellaneous issues

Warehouse management

Warehouse and source evolution

Inconsistent and duplicate information

Outdated information

Solution and improvement in the architecture

Update filtering

Self maintainability

Optimization of multiple views

Conclusion

References

Introduction

Information integration is the problem of acquiring data from several heterogeneous sources that are available for the application of interest. The conventional architecture of a data integration system is described in terms of two types of modules: wrappers and mediators (Widom, 1995, p.n.d.). The purpose of a wrapper is to access a source, extract the relevant data, and present that data in a specified format. The purpose of a mediator is to combine the data produced by the different wrappers in order to meet a particular information need of the data integration system (Widom, 1995, p.n.d.). The specification and the realization of mediators are the central problems in the design of a data integration system. This issue has become a core concern in several settings, including Data Warehousing, multi-database systems and information gathering (Widom, 1995, p.n.d.). The constraints typical of Data Warehouse applications restrict the wide range of approaches that have been proposed.
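The division of labour between wrappers and mediators can be illustrated with a minimal Python sketch. It assumes two hypothetical sources, a CSV export and a list of records that use different field names; the class names, field names and sample data are illustrative assumptions and are not taken from Widom (1995).

```python
import csv
import io

# Hypothetical source 1: a CSV export from an operational system.
CSV_SOURCE = "cust_id,full_name\n1,Ada Lovelace\n2,Alan Turing\n"

# Hypothetical source 2: records that already use the common field names.
RECORD_SOURCE = [
    {"id": 2, "name": "Alan Turing", "city": "London"},
    {"id": 3, "name": "Grace Hopper", "city": "New York"},
]


class CsvWrapper:
    """Wrapper: reads the CSV source and emits rows in the common format."""

    def __init__(self, text):
        self.text = text

    def extract(self):
        reader = csv.DictReader(io.StringIO(self.text))
        # Rename source-specific columns to the common names "id" and "name".
        return [{"id": int(r["cust_id"]), "name": r["full_name"]} for r in reader]


class RecordWrapper:
    """Wrapper: exposes in-memory records through the same interface."""

    def __init__(self, records):
        self.records = records

    def extract(self):
        return [dict(r) for r in self.records]


class Mediator:
    """Mediator: combines the output of all wrappers into one view per id."""

    def __init__(self, wrappers):
        self.wrappers = wrappers

    def integrate(self):
        merged = {}
        for wrapper in self.wrappers:
            for row in wrapper.extract():
                # Rows with the same id are folded into a single record.
                merged.setdefault(row["id"], {}).update(row)
        return [merged[key] for key in sorted(merged)]


customers = Mediator([CsvWrapper(CSV_SOURCE), RecordWrapper(RECORD_SOURCE)]).integrate()
```

In this sketch the customer with id 2 appears in both sources and is returned once, which is a very simple form of the reconciliation that a real mediator would have to specify in far more detail.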

Providing integrated access to multiple, distributed and heterogeneous databases and other sources of information has become one of the leading problems in database research and industry (Widom, 1995, p.n.d.). In the research community, most approaches to the data integration problem are based on the following very general two-step process:

1. Receive a query, determine the appropriate set of information sources to answer the query, and generate the appropriate sub-queries or commands for each information source (Widom, 1995, p.n.d.).

2. Obtain the results from the information sources, perform the appropriate translation, filtering and merging of the information, and then return the final answer to the user or application.
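The two steps above can be sketched in Python as follows. This is only an illustration under stated assumptions: the source names, the catalog of attributes and the dictionary-based query format are hypothetical and do not come from Widom (1995).

```python
# Hypothetical sources: each source name maps to the rows it holds.
SOURCES = {
    "sales_eu": [{"product": "laptop", "units": 5}, {"product": "phone", "units": 7}],
    "sales_us": [{"product": "laptop", "units": 3}],
    "hr": [{"employee": "Ada"}],
}

# Hypothetical catalog: the attributes each source can answer queries about.
CATALOG = {
    "sales_eu": {"product", "units"},
    "sales_us": {"product", "units"},
    "hr": {"employee"},
}


def plan(query):
    """Step 1: choose the sources that hold every attribute the query needs
    and build one sub-query (here a simple filter function) per source."""
    needed = set(query["select"]) | set(query["where"])
    chosen = [name for name, attrs in CATALOG.items() if needed <= attrs]

    def sub_query(row):
        return all(row.get(key) == value for key, value in query["where"].items())

    return {name: sub_query for name in chosen}


def execute(query):
    """Step 2: read from the chosen sources, filter and merge the rows,
    and return the final answer."""
    results = []
    for name, sub_query in plan(query).items():
        for row in SOURCES[name]:  # the source is read only when a query arrives
            if sub_query(row):
                projected = {key: row[key] for key in query["select"]}
                projected["source"] = name
                results.append(projected)
    return results


answer = execute({"select": ["product", "units"], "where": {"product": "laptop"}})
```

Nothing is copied from the sources ahead of time in this sketch; every call to `execute` goes back to them, so the answer always reflects their current contents at the cost of doing the work again for each query.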

This approach is referred to as the on-demand or lazy approach to data integration (Widom, 1995, p.n.d.). It is called lazy because data is extracted from the sources only when queries are posed. This approach is also referred to as mediated ...