Cluster Analysis

Introduction

Cluster analysis is a significant technique in market research. The accepted purpose of this technique is to group 'individuals or objects into clusters so that objects in the same cluster are more similar to one another than they are to objects in other clusters' (Hair, Anderson, Tatham & Black 1998, p. 470). Comprehensive reviews of the technique can be found in Punj and Stewart (1983) and Arabie and Hubert (1994). A common method of cluster analysis is the k-means approach, where data points are randomly selected as initial seeds or centroids, and the remaining data points are assigned to the closest centroid on the basis of the distance between them (MacQueen 1967). The aim is to obtain maximal homogeneity within subgroups or clusters, and maximal heterogeneity between clusters. K-means cluster analysis is less affected by data idiosyncrasies than hierarchical clustering techniques (Punj & Stewart 1983), but the approach still suffers from many of the problems associated with traditional statistical analysis methods. These methods were developed for use with variables that are normally distributed and that have an equal variance-covariance matrix in all groups. In most realistic marketing data sets, neither of these conditions necessarily holds. Over the last few years, a number of new techniques have been developed that make few assumptions about the statistical characteristics of the data being analysed (Hruschka 1986, 1993; Kowalczyk & Piasta 1998; Matsatsinis, Hatzis & Samaras 1998; Voges 1997; Voges & Pope 2000). This paper describes the application of one of these techniques, rough clustering, to a segmentation analysis of on-line shopping orientations.
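The k-means procedure described above can be illustrated with a minimal sketch. It assumes squared Euclidean distance and the common batch variant, in which centroids are recomputed after all points have been assigned; MacQueen's original procedure updates a centroid as each point is assigned. The function names, the `max_iter` cap and the `seed` parameter are illustrative choices, not taken from the cited sources.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points, k, max_iter=100, seed=0):
    """Batch k-means sketch: random seeds, nearest-centroid assignment."""
    rng = random.Random(seed)
    # Randomly select k data points as the initial centroids (seeds).
    centroids = [tuple(p) for p in rng.sample(points, k)]
    assignments = None
    for _ in range(max_iter):
        # Assign every point to its closest centroid.
        new_assignments = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # assignments are stable, so the centroids will not move
        assignments = new_assignments
        # Recompute each centroid as the mean of its current members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(d) / len(members) for d in zip(*members))
    return centroids, assignments
```

Calling `k_means(points, k=2)` on a list of coordinate tuples returns the final centroids and one cluster index per point. Because the seeds are random, different `seed` values can give different solutions on less clearly separated data.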

Rough Clustering

Rough clustering (do Prado, Engel & Filho 2002; Voges, Pope & Brown 2002) is an extension of the theory of rough or approximation sets, introduced by Pawlak (1982, 1991). Rough sets theory is based on the assumption that information is associated with every record in the data matrix (in rough sets terminology, every object of the information system). This information is expressed by means of variables (in rough sets terminology, attributes) that serve as descriptions of the objects. None of the traditional assumptions of multivariate analysis are relevant, as the data are treated from the perspective of set theory (that is, as object descriptors rather than variables with statistical properties). For introductions to the theory of rough sets, see Pawlak (1991), Lin and Cercone (1997), or Munakata (1998).
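To make the set-theoretic view concrete, the sketch below treats an information system as a mapping from objects to attribute values and computes the lower and upper approximations of a target set, the standard construction in rough sets theory. The text above does not spell these out, so the function names and the data layout are assumptions made for illustration only, and this is not the rough clustering procedure itself.

```python
from collections import defaultdict


def indiscernibility_classes(objects, attributes):
    """Group objects whose descriptions agree on every chosen attribute."""
    classes = defaultdict(set)
    for name, description in objects.items():
        key = tuple(description[a] for a in attributes)
        classes[key].add(name)
    return list(classes.values())


def approximations(objects, attributes, target):
    """Return (lower, upper) approximations of the set `target`."""
    lower, upper = set(), set()
    for cls in indiscernibility_classes(objects, attributes):
        if cls <= target:  # class lies entirely inside the target
            lower |= cls
        if cls & target:   # class overlaps the target at all
            upper |= cls
    return lower, upper
```

Objects in the lower approximation certainly belong to the target set given the available attributes; objects in the upper approximation possibly belong to it. The difference between the two is the boundary region that makes the set "rough".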

Application

A rough cluster analysis was conducted on a sample of 437 responses from a larger study of the relationship between shopping orientation, perceived risk and intention to purchase products via the Internet (Brown 1999; Brown, Pope & Voges 2003). The rough cluster analysis was based on five measures of shopping orientation: enjoyment, personalization, convenience, loyalty, and price. All measures were constructed as multi-item Likert-type scales with responses ranging from strongly disagree to strongly agree. As rough clustering requires ordered discrete data, the multi-item scores were mapped onto an ordered attribute with a range of seven, with each value for the attribute representing 14 to 15 percent of the data ...
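The mapping of multi-item scores onto a seven-valued ordered attribute can be sketched as equal-frequency binning, since one seventh of the data is roughly 14.3 percent. The source does not state how the mapping was actually performed, so the cut-point rule and the function names below are assumptions.

```python
from bisect import bisect_right


def equal_frequency_cutpoints(scores, n_levels=7):
    """Cut points that split the sorted scores into n_levels groups
    of approximately equal size."""
    ordered = sorted(scores)
    n = len(ordered)
    return [ordered[(j * n) // n_levels] for j in range(1, n_levels)]


def to_ordered_attribute(score, cutpoints):
    """Map a raw score to an ordered level from 1 to len(cutpoints) + 1."""
    # Equal scores always receive the same level, so heavily tied data
    # can produce groups that deviate from exactly equal shares.
    return bisect_right(cutpoints, score) + 1
```

For example, `cut = equal_frequency_cutpoints(scores)` followed by `[to_ordered_attribute(s, cut) for s in scores]` yields one ordered value per respondent for a given shopping orientation measure.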