Data Warehousing And Data Mining

Data Warehouse

Data warehousing can be explained simply as the process that facilitates the creation and operation of a data warehouse. A data warehouse is a large collection of data gathered from multiple operational systems and scattered sources, organized to support the decision-making process (Thusoo et al., 2010). Once assembled, the data from these source systems is stored for a long time, which gives access to historical data and provides the user with a single consolidated interface to the data. This makes it easier for the user to write queries for decision making.

Benefits of Data Warehousing

It can work together with other databases.

It can run predefined complex queries.

It allows the integration of heterogeneous databases.

It can analyze a particular problem in terms of dimensions.

It supports the operation of decision support system applications.
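
The idea of analyzing a problem in terms of dimensions can be illustrated with a small sketch. It uses Python's built-in sqlite3 module and an in-memory database; the table names (fact_sales, dim_region, dim_date), the columns and the sample figures are all hypothetical and chosen only for illustration, not taken from any particular warehouse product.

```python
import sqlite3


def build_demo_warehouse() -> sqlite3.Connection:
    """Create a tiny in-memory star schema (all names and data are hypothetical)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE dim_region (region_id INTEGER PRIMARY KEY, region_name TEXT);
        CREATE TABLE dim_date   (date_id   INTEGER PRIMARY KEY, year INTEGER);
        CREATE TABLE fact_sales (region_id INTEGER, date_id INTEGER, amount REAL);

        INSERT INTO dim_region VALUES (1, 'North'), (2, 'South');
        INSERT INTO dim_date   VALUES (1, 2010), (2, 2011);
        INSERT INTO fact_sales VALUES
            (1, 1, 100.0), (1, 2, 150.0), (1, 2, 50.0),
            (2, 1, 80.0),  (2, 2, 120.0);
        """
    )
    return conn


def sales_by_region_and_year(conn: sqlite3.Connection) -> list:
    """Aggregate the fact table along two dimensions: region and year."""
    query = """
        SELECT r.region_name, d.year, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_region AS r ON r.region_id = f.region_id
        JOIN dim_date   AS d ON d.date_id   = f.date_id
        GROUP BY r.region_name, d.year
        ORDER BY r.region_name, d.year
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    warehouse = build_demo_warehouse()
    for region, year, total in sales_by_region_and_year(warehouse):
        print(region, year, total)
```

Each row of the result is one cell of the region-by-year view. Grouping by a different dimension column changes the angle of analysis without touching the stored facts.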

Current Trends in Data Warehousing

Making Room for OPEX and CAPEX

During the past 10 years, the total cost of storage has increased by approximately 7% per year, while storage capacity has grown by 30 to 40%. The increase was due primarily to operating costs (OPEX), while hardware costs (CAPEX) have remained relatively unchanged. Disk technologies have had a great run, with recording density doubling every 18 to 24 months, which has eroded drive prices by about 30% per year. Hardware prices were also contained by the introduction of storage area networks (SANs), which eliminated silos of attached storage and helped make better use of hardware through consolidation, but added another level of complexity to management and to operational costs. As more and more applications share larger and larger storage frames, it becomes more difficult to schedule and run backups, maintenance and migration windows (Ponniah, 2011).
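
To put the annual rates above in perspective, the following minimal sketch compounds them over a decade. It assumes that each rate applies uniformly and compounds once per year for the full 10 years, which the text does not state explicitly; the function name compound is hypothetical.

```python
def compound(rate_per_year: float, years: int) -> float:
    """Return the cumulative factor after applying a fixed annual rate.

    A positive rate models growth; a negative rate models erosion.
    Assumes the rate is constant and compounds once per year.
    """
    return (1.0 + rate_per_year) ** years


if __name__ == "__main__":
    # Total cost of storage rising about 7% per year for 10 years.
    cost_factor = compound(0.07, 10)
    # Drive prices eroding about 30% per year for 10 years.
    price_factor = compound(-0.30, 10)

    print(f"Total cost after 10 years: {cost_factor:.2f}x the starting cost")
    print(f"Unit price after 10 years: {price_factor:.1%} of the starting price")
```

Under these assumptions, total cost roughly doubles over the decade while the unit price of a drive falls to about 3% of its starting value, which is consistent with the claim that operating costs rather than hardware drove the increase.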

These trends in CAPEX and OPEX are about to change drastically. The good news is that the introduction of server and storage virtualization is having a big impact on operating costs, reversing the upward trend in OPEX. The storage virtualization technologies of Hitachi Data Systems are reported to reduce total cost of ownership (TCO) by 40% or more, with payback in less than one year. On the other hand, CAPEX, or hardware costs, have begun to trend upwards as more features are added to the hardware and demand for storage capacity soars, driven by large volumes of data and the need to preserve them indefinitely. At the same time, price erosion of storage capacity is forecast to be only about 20% annually through 2020.

Consolidation of Convergence

In order to obtain greater savings, attention will focus on the convergence of servers, storage platforms, networks and applications. Application programming interfaces (APIs) that can make servers more efficient will also come to the fore. Orchestration software will help to converge management and automate provisioning across local, remote and cloud-based servers and network infrastructures.

Big Data

One of the great concepts for the year 2012 will be the "Big ...