Principal Component Analysis and Artificial Intelligence
[Name of the institute]
INTRODUCTION
Principal Component Analysis (PCA) is a helpful statistical tool; its applications include image compression, face recognition, and finding patterns in data of high dimension. Working with it requires a clear understanding of the underlying concepts of mathematics and statistics. Artificial intelligence is one of the most important blessings of science and technology and has revolutionized almost every field of our society; its applications include remote sensing, robotics, and medical diagnosis.
DESCRIPTION
Principal Component Analysis
PCA is a data analysis tool; at its core, it is a method for identifying patterns in data and for expressing the data in a way that highlights their similarities and differences. It is especially useful for data of high dimension, where finding patterns graphically is difficult, which makes PCA a helpful tool for non-graphical data. The patterns found by this method can be used to compress the data with little loss of information. Face recognition and image compression are important applications of this method [4].
Canonical Correlation Analysis (CCA) is related to Principal Component Analysis (PCA): PCA describes the variance within a single data set by constructing a new orthogonal coordinate system, while CCA explains the cross-covariance between two sets of data [6]. PCA is also linked with factor analysis; factor analysis is similar to PCA apart from minor differences in the matrices whose eigenvectors are computed.
Basic Assumptions
Certain assumptions underlie Principal Component Analysis and are important when finding the principal components. These assumptions include the following.
Orthogonal Principal Components
The principal components are assumed to be orthogonal; this assumption links PCA to the decomposition techniques of linear algebra, as the sketch below illustrates.
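As a minimal illustration of that link (the matrix values below are assumed for this sketch, not taken from the text), the eigenvectors of a symmetric covariance matrix computed with NumPy come out mutually orthogonal:

import numpy as np

# Illustrative symmetric covariance matrix (values assumed for this sketch)
cov = np.array([[2.0, 0.8],
                [0.8, 1.0]])

# The eigendecomposition of a symmetric matrix yields orthogonal eigenvectors,
# which is the link between PCA and linear-algebra decompositions.
eigenvalues, eigenvectors = np.linalg.eigh(cov)

# The dot product of the two principal directions is (numerically) zero.
print(np.dot(eigenvectors[:, 0], eigenvectors[:, 1]))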
Large Variances are considered to have significant structure
The data are assumed to have a high signal-to-noise ratio (SNR): directions with low variance are taken to represent noise, while directions with high variance are taken to represent the interesting structure. In some cases, however, this assumption turns out to be incorrect. A small sketch of the idea follows.
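A minimal sketch of this assumption, using synthetic data with one high-variance ("signal") and one low-variance ("noise") direction; all values are assumed purely for illustration:

import numpy as np

# Illustrative data: one high-variance and one low-variance direction.
rng = np.random.default_rng(1)
signal = rng.normal(scale=3.0, size=(500, 1))
noise = rng.normal(scale=0.3, size=(500, 1))
X = np.hstack([signal, noise])

cov = np.cov(X, rowvar=False)
eigenvalues = np.linalg.eigvalsh(cov)

# Fraction of total variance along each direction, largest first.
ratios = eigenvalues[::-1] / eigenvalues.sum()
print(ratios)  # under the high-SNR assumption, the large share is "structure"
               # and the small share is treated as noise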
Linearity
Linearity is one of the most important assumptions in Principal Component Analysis (PCA); it frames the problem as a change of basis, as the sketch below illustrates.
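A minimal sketch of a change of basis, with an assumed two-dimensional data matrix and an arbitrary orthonormal basis chosen only for illustration:

import numpy as np

# Assumed example: 200 two-dimensional observations (rows are samples).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2)) @ np.array([[2.0, 0.0],
                                          [1.0, 0.5]])

# An orthonormal basis P (a 45-degree rotation here, purely illustrative).
theta = np.pi / 4
P = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

# The change of basis is a single linear map: every observation is
# re-expressed in the coordinates of the new basis.
Y = X @ P

print(Y.shape)  # (200, 2): same data, new coordinates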
Steps involved in Principal Component Analysis
The steps of Principal Component Analysis must be followed in sequence; otherwise the method will fail. The major steps involved are
Step 1
Firstly, have some data. To start Principal Component Analysis we first need some data to work on.
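A minimal sketch of this step in NumPy, using a small two-dimensional sample whose values are assumed purely for illustration:

import numpy as np

# Step 1: start with some data -- a small two-dimensional sample
# (illustrative values, not taken from the text).
x = np.array([2.5, 0.5, 2.2, 1.9, 3.1, 2.3])
y = np.array([2.4, 0.7, 2.9, 2.2, 3.0, 2.7])
data = np.column_stack((x, y))  # shape (6, 2): one row per observation
print(data)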
Step 2
The second step involves subtracting the mean from each data dimension: the mean of the x values is subtracted from every x value, and the mean of the y values is subtracted from every y value. The data obtained this way has a mean equal to zero.
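Continuing the sketch with the same assumed data array, mean subtraction looks like this:

import numpy as np

# Assumed data array from the Step 1 sketch (illustrative values).
data = np.array([[2.5, 2.4], [0.5, 0.7], [2.2, 2.9],
                 [1.9, 2.2], [3.1, 3.0], [2.3, 2.7]])

# Step 2: subtract the mean of each dimension from that dimension.
centred = data - data.mean(axis=0)

print(centred.mean(axis=0))  # ~[0, 0]: the centred data has zero mean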
Step 3
Find the covariance matrix. Since the data assumed here are two-dimensional, the covariance matrix is a 2×2 matrix. If the off-diagonal elements of the covariance matrix are positive, we can conclude that the x and y variables increase together.
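A sketch of this step with the assumed data from the previous sketches:

import numpy as np

# Assumed data, mean-centred as in Step 2 (illustrative values).
data = np.array([[2.5, 2.4], [0.5, 0.7], [2.2, 2.9],
                 [1.9, 2.2], [3.1, 3.0], [2.3, 2.7]])
centred = data - data.mean(axis=0)

# Step 3: the covariance matrix of two-dimensional data is a 2x2 matrix.
cov = np.cov(centred, rowvar=False)
print(cov)
# A positive off-diagonal entry indicates that x and y increase together.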
Step 4
The eigenvalues and eigenvectors of the covariance matrix are obtained. Data pattern information is obtained from ...