Data Miners

Read Complete Research Material

DATA MINERS

How to keep the data miners from overwhelming the organization

How to keep the data miners from overwhelming the organization

Introduction

The data mining (data mining) is the set of techniques and technologies to explore large databases, or semi-automatically, with the aim of finding repeating patterns, trends or rules that explain the behavior of a given data context.

Basically, the datamining appears to try to help understand the content of a data repository. To this end, it uses statistical practices and in some cases, next to search algorithms and artificial intelligence neural networks.

In general, the data is the raw material. At the time that the user attaches a special meaning to them go on to become information. When specialists develop or find a model, making the interpretation that arises between information and the model represents an added value, then we refer to knowledge.

Discussion

Data mining focuses on the development of efficient and effective methods to extract useful information and knowledge from massive databases. Existing methods for data mining span three main groups: computational, statistical, and visual approaches. Computational approaches resort to computer algorithms to search large volumes of data automatically for specific types of patterns, such as clusters, association rules, homogeneous regions, colocation patterns, and outliers. Statistical approaches include scan statistics, geographically weighted regression, multivariate lattice models, and association tests. There are also numerous visualization-based methods for multivariate analysis, such as multivariate mapping, spatiotemporal visualization, and other interactive geovisualization techniques (Berry, 1997).

Although computational and statistical methods can search for a specific type of pattern very quickly, they have very limited capabilities to support pattern interpretation. In contrast, visualization methods can help humans visually discern complex patterns, propose explanations of the observed patterns, and generate hypotheses for further analysis. To combine the advantages of different approaches, integrated methods have been developed to bring together the computational and visual approaches in order to address complex problems more effectively.

Trends

Data mining of consumer-related information has emerged as a critical application as the volume of e-commerce continues to grow, the amount of data generated by dynamic systems (such as online bookstores and auction sites) increases, and the value of such information to marketers becomes established. However, the use of consumer data for purposes unrelated to the original purchase, often by companies that have no pre-existing business relationship to the consumer, can raise privacy issues. (Data gets rendered anonymous by removing personal identification information before it is mined, but regulations or other ways to assure privacy remain incomplete and uncertain.)

The most controversial applications of data mining are in the area of intelligence and homeland security. Because such applications gets shrouded in secrecy, the public and even lawmakers have difficulty in assessing their value and devising privacy safeguards. According to the Government Accountability Office, by 2007 some 199 different data-mining programs were in use by at least 52 federal agencies. One of the most controversial is ADVISE (Analysis Dissemination, Visualization, Insight and Semantic Enhancement), developed by the Department of Homeland Security since 2003 (Chapelle, Zien, ...
Related Ads