Integrating PCA (Principal Component Analysis) and Multiple Regressions by Mohamad Hisham Hamdan MBB

Principal Component Analysis (PCA) and Multiple Regression are both multivariate analysis tools. Whilst Multiple Regression is mainly used to build a prediction model with one dependent variable and multiple predictors, PCA is used to establish the structure of relationships among the predictors. In other words, PCA can be used prior to multiple regression to study multicollinearity among the Xs (predictors). When running multiple regression in Minitab, the VIF (Variance Inflation Factor) is the usual indicator of multicollinearity among predictors, with VIF < 5 commonly taken as acceptable (some practitioners also use a Matrix Plot). Personally, I find PCA a useful alternative way of judging multicollinearity. Consider the example below:

A Black Belt would like to come up with a predictive model containing all the significant factors (predictors) for the process output (Y). Part of the data is shown below:

Data PCA

Given the nature of the data collected and the objective, the Black Belt decided to use Multiple Regression with VIF activated. The initial result from the multiple regression is shown below:

bt-14-pca-2

Clearly there is a multicollinearity issue among the predictors (those with VIF > 5).
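For readers working outside Minitab, the VIF check can be sketched in Python. The sketch relies on the fact that, for a regression with an intercept, the VIF of predictor j equals 1 / (1 - R_j²), where R_j² comes from regressing that predictor on all the others, and that these values are the diagonal of the inverse of the predictors' correlation matrix. The data and the names x1, x2 and x3 are made up for illustration and are not the Black Belt's data; only the standard library is used, so the matrix inversion is written out by hand.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    # Augment the matrix with the identity: [M | I] becomes [I | M^-1]
    aug = [list(row) + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def vif(data):
    """data maps predictor name -> list of values; returns name -> VIF."""
    names = list(data)
    corr = [[pearson(data[a], data[b]) for b in names] for a in names]
    inverse = invert(corr)
    # The VIFs are the diagonal of the inverse correlation matrix
    return {name: inverse[i][i] for i, name in enumerate(names)}


# Hypothetical data: x2 is nearly a copy of x1, x3 is unrelated to both
data = {
    "x1": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
    "x2": [1.1, 1.9, 3.2, 3.9, 5.1, 6.0, 6.8, 8.1],
    "x3": [2.0, 5.0, 7.0, 8.0, 8.0, 7.0, 5.0, 2.0],
}

vifs = vif(data)
for name, value in vifs.items():
    flag = "multicollinearity suspected" if value > 5 else "ok"
    print(f"{name}: VIF = {value:.1f} ({flag})")
```

With this made-up data, x1 and x2 come out far above the VIF > 5 threshold, while x3 stays close to 1.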

Let’s compare the result above with PCA:

bt-14-pca-3

From the VIF indicator in Multiple Regression we know that a multicollinearity issue exists, but not which predictors are involved with which. Looking at the PCA Loading Plot above, however, P, O and L are highly correlated with each other, and the same goes for A, B, D, N, J, C and KG. The matrix plot below agrees with this result.

bt-14-pca-4
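To show what the Loading Plot is drawing, here is a minimal Python sketch that computes the first two principal component loadings from the correlation matrix of the predictors. It assumes PCA on the correlation matrix (standardized variables) and uses simple power iteration with deflation rather than a library eigen-solver. The data are invented, borrowing only the labels P, O, L, A and B from the example. Predictors whose (PC1, PC2) points lie close together are the ones that are highly correlated.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def top_eigenpair(matrix, iterations=500):
    """Dominant eigenvalue and unit eigenvector of a symmetric matrix
    found by power iteration."""
    n = len(matrix)
    v = [1.0 / math.sqrt(n)] * n
    for _ in range(iterations):
        w = [sum(row[j] * v[j] for j in range(n)) for row in matrix]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient gives the eigenvalue for the converged vector
    mv = [sum(row[j] * v[j] for j in range(n)) for row in matrix]
    eigenvalue = sum(a * b for a, b in zip(v, mv))
    return eigenvalue, v


def first_two_loadings(data):
    """Return name -> (PC1 loading, PC2 loading) from the correlation matrix."""
    names = list(data)
    n = len(names)
    corr = [[pearson(data[a], data[b]) for b in names] for a in names]
    lam1, v1 = top_eigenpair(corr)
    # Deflation: subtract the PC1 part so power iteration converges to PC2
    deflated = [[corr[i][j] - lam1 * v1[i] * v1[j] for j in range(n)]
                for i in range(n)]
    _, v2 = top_eigenpair(deflated)
    return {name: (v1[i], v2[i]) for i, name in enumerate(names)}


# Hypothetical data: P, O, L move together; A, B move together
data = {
    "P": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
    "O": [1.1, 1.9, 3.2, 3.9, 5.1, 6.0, 6.8, 8.1],
    "L": [2.2, 3.9, 6.1, 8.0, 9.8, 12.1, 14.0, 16.1],
    "A": [2.0, 5.0, 7.0, 8.0, 8.0, 7.0, 5.0, 2.0],
    "B": [2.1, 4.8, 7.2, 8.1, 7.9, 7.1, 4.9, 2.2],
}

loadings = first_two_loadings(data)
for name, (pc1, pc2) in loadings.items():
    print(f"{name}: PC1 = {pc1:+.2f}, PC2 = {pc2:+.2f}")
```

In this made-up data, P, O and L share nearly the same loadings, and A and B form a second cluster, which is the kind of grouping read off the Loading Plot. The sign of a component's loadings is arbitrary, so only the relative positions matter.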

Thus PCA can be used to help avoid multicollinearity issues, as in the example above. Since we know the grouping of factors (for example, P, O and L form one group, and so on), eliminating factors to reduce multicollinearity becomes easier.
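One way to turn the grouping into an elimination step is sketched below in Python. It is an illustration under stated assumptions, not a Minitab feature: predictors are grouped greedily by pairwise correlation with an arbitrary cutoff of |r| >= 0.9, and the first member of each group is kept as its representative. The data are invented, and only the labels P, O, L, A and B come from the example.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def group_by_correlation(data, threshold=0.9):
    """Greedy grouping: a predictor joins the first group whose first
    member it correlates with at |r| >= threshold, else starts a new group.
    The threshold of 0.9 is an assumed cutoff, not a standard."""
    groups = []
    for name, values in data.items():
        for group in groups:
            if abs(pearson(values, data[group[0]])) >= threshold:
                group.append(name)
                break
        else:
            groups.append([name])
    return groups


# Hypothetical data: P, O, L move together; A, B move together
data = {
    "P": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
    "O": [1.1, 1.9, 3.2, 3.9, 5.1, 6.0, 6.8, 8.1],
    "L": [2.2, 3.9, 6.1, 8.0, 9.8, 12.1, 14.0, 16.1],
    "A": [2.0, 5.0, 7.0, 8.0, 8.0, 7.0, 5.0, 2.0],
    "B": [2.1, 4.8, 7.2, 8.1, 7.9, 7.1, 4.9, 2.2],
}

groups = group_by_correlation(data)
keep = [group[0] for group in groups]  # one representative per group
print("Groups:", groups)
print("Predictors kept for the regression:", keep)
```

Which member of a group to keep is better decided by process knowledge (ease of measurement or control, for instance) than by list order. After elimination, the regression can be rerun and the VIFs rechecked.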

Using the example above, PCA can be run in Minitab as follows:

Stat → Multivariate → Principal Components

 bt-14-pca-5

