Data Mining



Data Mining- SupaSave

[Name of the Writer]


Problem Areas

Nowadays, retaining existing customers is often preferred over attracting new ones. Business organizations adopt different strategies to serve their customers in a variety of ways, so that these customers keep buying from them (Agrawal, 1993). Association Rule Mining (ARM) and data clustering are particular kinds of data mining problems for large sets of multidimensional data points. In ARM we search for relationships among different items in the dataset. In clustering, the data space is usually not uniformly occupied, so the data points fall into different clusters. Data clustering identifies the sparse and the crowded places, and hence discovers the overall distribution patterns of the dataset. Association Rule Mining (ARM) [1] is one of the strategies that offers a twofold advantage to a business organization once basket analysis has been applied: 1) It helps customers find all related items in one place, which saves them the time of visiting different parts of the store. 2) It helps the organization sell more by placing items that are sold together closer to each other (Ian, 2005).
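To make the idea of basket analysis concrete, the following is a minimal sketch of how support and confidence might be computed for rules between pairs of items. The baskets, the thresholds and the function name pair_rules are hypothetical and chosen only for illustration; a full ARM algorithm such as Apriori also handles larger itemsets, which this sketch does not.

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping baskets, invented for illustration only.
baskets = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "butter"},
    {"bread", "butter", "jam"},
]


def pair_rules(baskets, min_support=0.4, min_confidence=0.7):
    """Return (antecedent, consequent, support, confidence) for item pairs.

    support    = share of baskets that contain both items
    confidence = share of baskets with the antecedent that also
                 contain the consequent
    The thresholds are assumed values, not taken from the text.
    """
    n = len(baskets)
    item_counts = Counter(item for basket in baskets for item in basket)
    pair_counts = Counter(
        pair for basket in baskets for pair in combinations(sorted(basket), 2)
    )
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue
        # Each frequent pair can yield a rule in either direction.
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return sorted(rules)


for lhs, rhs, support, confidence in pair_rules(baskets):
    print(f"{lhs} -> {rhs}: support={support:.2f}, confidence={confidence:.2f}")
```

A rule such as "bread -> butter" with high confidence would suggest placing the two items close together, which is the second advantage described above.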

Learning Models and Techniques- Supervised and Unsupervised Techniques

Data and Knowledge Mining is learning from data. In this context, data are allowed to speak for themselves and no prior assumptions are made. This learning from data comes in two flavors: supervised learning and unsupervised learning. In supervised learning (often also called directed data mining), the variables under investigation can be split into two groups: explanatory variables and one (or more) dependent variables. The target of the analysis is to specify a relationship between the explanatory variables and the dependent variable, as is done in regression analysis. To apply directed data mining techniques, the values of the dependent variable must be known for a sufficiently large part of the data set.
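As an illustration of the supervised case, the sketch below fits a straight line to one explanatory variable and one dependent variable by ordinary least squares. The data values and the function name fit_line are hypothetical; the point is only that the known values of the dependent variable drive the fit.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Co-variation of x and y, and variation of x, around their means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: xs is the explanatory variable, ys the dependent one.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(xs, ys)
print(f"y = {slope:.2f} * x + {intercept:.2f}")
```

Without the ys values there would be nothing to fit against, which is why supervised techniques need the dependent variable to be recorded for enough cases.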

Unsupervised learning is closer to the exploratory spirit of Data Mining as stressed in the definitions given above. In unsupervised learning situations all variables are treated in the same way; there is no distinction between explanatory and dependent variables. However, despite the name undirected data mining, there is still some target to achieve. This target might be as general as data reduction or as specific as clustering. The dividing line between supervised learning and unsupervised learning is the same one that distinguishes discriminant analysis from cluster analysis. Supervised learning requires that the target variable is well defined and that a sufficient number of its values are given. For unsupervised learning, the target variable is typically either unknown or has been recorded for too small a number of cases.
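Clustering, the more specific unsupervised target mentioned above, can be sketched with a small k-means routine. This is one common clustering technique and is not named in the text; the points, the choice of k = 2 and the deterministic initialization from the first k points are assumptions made to keep the example short and reproducible.

```python
def kmeans(points, k, iterations=20):
    """Minimal k-means: returns (centroids, clusters).

    Centroids are initialized from the first k points, which is an
    assumption made for reproducibility, not a recommended default.
    """
    centroids = [points[i] for i in range(k)]
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            distances = [
                sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids
            ]
            clusters[distances.index(min(distances))].append(p)
        # Update step: each centroid moves to the mean of its cluster.
        new_centroids = []
        for cluster, old in zip(clusters, centroids):
            if cluster:
                new_centroids.append(
                    tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
                )
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # assignments have stabilized
        centroids = new_centroids
    return centroids, clusters


# Hypothetical two-dimensional data with two visibly separate groups.
points = [(1.0, 1.0), (8.0, 8.0), (1.0, 2.0), (9.0, 8.0), (2.0, 1.0), (8.0, 9.0)]
centroids, clusters = kmeans(points, k=2)
for centroid, cluster in zip(centroids, clusters):
    print(centroid, cluster)
```

Note that no variable plays the role of a dependent variable here: all coordinates are treated in the same way, and the groups emerge from the data alone.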

The large amount of data that is usually present in Data Mining tasks allows the data file to be split into three groups: training cases, validation cases and test cases. Training cases are used to build a model and estimate the necessary parameters. The validation data helps to show whether the model obtained with one chosen sample can be generalized to other data. In particular, it helps to avoid the phenomenon of ...
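The three-way split described here can be sketched as follows. The 60/20/20 proportions, the fixed random seed and the function name three_way_split are assumptions for illustration; the text does not prescribe any particular proportions.

```python
import random


def three_way_split(cases, train_frac=0.6, validation_frac=0.2, seed=0):
    """Shuffle the cases and split them into training, validation and test sets.

    The fractions are assumed values; whatever remains after the
    training and validation cases becomes the test set.
    """
    shuffled = list(cases)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_train = round(len(shuffled) * train_frac)
    n_validation = round(len(shuffled) * validation_frac)
    train = shuffled[:n_train]
    validation = shuffled[n_train:n_train + n_validation]
    test = shuffled[n_train + n_validation:]
    return train, validation, test


# Hypothetical data file of 100 case identifiers.
train, validation, test = three_way_split(range(100))
print(len(train), len(validation), len(test))
```

Shuffling before splitting is a design choice that guards against any ordering in the original data file leaking into one of the three groups.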