Principal components regression problem

Some methods that partly cope with the above-mentioned problem have been proposed in the literature. The subject has been treated in areas such as chemometrics and econometrics, giving rise, for example, to partial least squares (PLS), ridge regression (RR), and principal component regression (PCR) [2]. In this work we have chosen to illustrate the multivariable approach using PCR as our regression tool, mainly because it is relatively easy to interpret. The basic idea of PCR is described below. [Pg.888]

Other chemometric methods to improve calibration have been advanced. The method of partial least squares (PLS) has been useful in multicomponent calibration (48-51). In this approach the concentrations are related to latent variables in the block of observed instrument responses. Thus PLS regression can solve the collinearity problem and provide all of the advantages discussed earlier. Principal components analysis coupled with multiple regression, often called principal component regression (PCR), is another calibration approach that has been compared and contrasted to PLS (52-54). Calibration problems can also be approached using the Kalman filter as discussed (43). [Pg.429]

A difficulty with Hansch analysis is to decide which parameters and functions of parameters to include in the regression equation. This problem of selection of predictor variables has been discussed in Section 10.3.3. Another problem is due to the high correlations between groups of physicochemical parameters. This is the multicollinearity problem which leads to large variances in the coefficients of the regression equations and, hence, to unreliable predictions (see Section 10.5). It can be remedied by means of multivariate techniques such as principal components regression and partial least squares regression, applications of which are discussed below. [Pg.393]
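The variance inflation described above can be illustrated with a small numerical sketch. It is not taken from the cited text: the data are synthetic, the helper name `ols_coefficient_variances` is hypothetical, and the noise variance is assumed to be 1. It compares the ordinary least squares coefficient variances, the diagonal of sigma^2 (X'X)^-1 for centered X, when two predictors are independent and when they are nearly collinear.

```python
import numpy as np


def ols_coefficient_variances(X, noise_variance=1.0):
    """Variances of the OLS slope estimates for a model with an intercept.

    Computed as the diagonal of noise_variance * (Xc^T Xc)^-1, where Xc is
    the column-centered predictor matrix. The function name and the unit
    noise variance are assumptions made for this illustration.
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)
    return noise_variance * np.diag(np.linalg.inv(Xc.T @ Xc))


# Synthetic predictors (assumed data, for illustration only).
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2_independent = rng.normal(size=n)            # unrelated to x1
x2_collinear = x1 + 0.01 * rng.normal(size=n)  # almost a copy of x1

var_independent = ols_coefficient_variances(np.column_stack([x1, x2_independent]))
var_collinear = ols_coefficient_variances(np.column_stack([x1, x2_collinear]))

print("coefficient variances, independent predictors:", var_independent)
print("coefficient variances, collinear predictors:  ", var_collinear)
```

With nearly collinear predictors the variances grow by several orders of magnitude, which is the instability that principal components regression and partial least squares are meant to avoid.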

The method of PCA can be used in QSAR as a preliminary step to Hansch analysis in order to determine the relevant parameters that must be entered into the equation. Principal components are by definition uncorrelated and, hence, do not pose the problem of multicollinearity. Instead of defining a Hansch model in terms of the original physicochemical parameters, it is often more appropriate to use principal components regression (PCR) which has been discussed in Section 35.6. An alternative approach is by means of partial least squares (PLS) regression, which will be more amply covered below (Section 37.4). [Pg.398]

Haaland and coworkers (5) discussed other problems with classical least-squares (CLS) and its performance relative to partial least-squares (PLS) and factor analysis (in the form of principal component regression). One of the disadvantages of CLS is that interferences from overlapping spectra are not handled well, and all the components in a sample must be included for a good analysis. For a material such as coal LTA, this is a significant limitation. [Pg.50]

If the variables are correlated, the resulting problem of multicollinearity may be circumvented by performing a principal components calculation on the variables x. This creates independent (orthogonal) variables, and the regression analysis can then be continued using the scores (see Section 5.4) instead of the original x values. This method is known as principal components regression. [Pg.195]
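The two steps just described (a principal components calculation, then regression on the scores) can be sketched as follows. This is a minimal illustration under stated assumptions, not the cited book's implementation: the function names `pcr_fit` and `pcr_predict`, the synthetic data, and the use of a singular value decomposition to obtain the components are all choices made here.

```python
import numpy as np


def pcr_fit(X, y, n_components):
    """Fit a principal components regression (hypothetical helper).

    Step 1: center X and obtain loadings from its SVD.
    Step 2: regress the centered y on the scores.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean = X.mean(axis=0)
    y_mean = y.mean()
    Xc = X - x_mean
    # Rows of Vt are the principal directions (loadings).
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    loadings = Vt[:n_components].T   # shape (variables, components)
    scores = Xc @ loadings           # shape (samples, components)
    # Score columns are mutually orthogonal, so each regression
    # coefficient can be computed independently of the others.
    coef = (scores.T @ (y - y_mean)) / np.sum(scores ** 2, axis=0)
    return {"x_mean": x_mean, "y_mean": y_mean,
            "loadings": loadings, "coef": coef, "scores": scores}


def pcr_predict(model, X_new):
    """Project new samples onto the loadings and apply the score regression."""
    X_new = np.asarray(X_new, dtype=float)
    scores_new = (X_new - model["x_mean"]) @ model["loadings"]
    return model["y_mean"] + scores_new @ model["coef"]


# Synthetic, strongly correlated x variables (assumed data).
rng = np.random.default_rng(1)
n = 50
t = rng.normal(size=n)
X = np.column_stack([t,
                     t + 0.01 * rng.normal(size=n),
                     -t + 0.01 * rng.normal(size=n)])
y = 3.0 * t + 0.05 * rng.normal(size=n)

model = pcr_fit(X, y, n_components=1)
y_hat = pcr_predict(model, X)
r_squared = 1.0 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)

# Orthogonality check on the scores of a two-component model.
scores2 = pcr_fit(X, y, n_components=2)["scores"]
gram = scores2.T @ scores2
print("R^2 with one component:", r_squared)
print("off-diagonal of T^T T:", gram[0, 1])
```

Because the three x variables are nearly copies of one another, a single component carries almost all of the information, and the off-diagonal element of the score cross-product matrix is zero to rounding error.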

The methods of data analysis depend on the nature of the final output. If the problem is one of classification, a number of multivariate classifiers are available such as those based on principal components analysis (SIMCA), cluster analysis and discriminant analysis, or non-linear artificial neural networks. If the required output is a continuous variable, such as a concentration, then partial least squares regression or principal component regression are often used [20]. [Pg.136]

The prediction of Y-data of unknown samples is based on a regression method where the X-data are correlated to the Y-data. The multivariate methods, usually used for such a calibration, are principal component regression (PCR) and partial least squares regression (PLS). Both methods are based on the assumption of linearity and can deal with co-linear data. The problem of co-linearity is solved in the same way as the formation of a PCA plot. The X-variables are added together into latent variables, score vectors. These vectors are independent since they are orthogonal to each other and they can therefore be used to create a calibration model. [Pg.7]

For the estimation of component concentrations, a second step is required, based on a multiple linear regression (MLR, see Section 3.1.3) between the absorbance values and the PCA scores. This can be carried out automatically after the PCA step with the principal component regression (PCR) procedure (which includes PCA). This methodology was first applied to analytical chemical problems by Lawton and Sylvestre [25], and has more recently been used in different models by other researchers [26-28]. Finally, the PCA procedure can also be coupled with cluster analysis (CA), as described in a very recent study on the characterisation of industrial wastewater samples [29]... [Pg.42]

The problem dealt with by principal component regression is regressing y (I × 1) on a possibly ill-conditioned X (I × J). Hence, principal component regression tries to solve Equation (3.27) and Equation (3.29) for ill-conditioned X. Principal component regression approximates X by a few, say R, components (its principal components) and regresses y on these R components. Principal component regression can be written as... [Pg.49]
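The effect of keeping only R components of an ill-conditioned X can be sketched numerically. The formulation below is an assumption of this illustration and not the equation elided from the excerpt: it uses the truncated singular value decomposition of X, takes X and y as already centered, and works on synthetic data; the function name `pcr_coefficients` is hypothetical.

```python
import numpy as np


def pcr_coefficients(X, y, R):
    """Regression vector from the first R principal components of X.

    Hypothetical helper: X (I x J) and y (length I) are assumed to be
    centered already. With X = U diag(s) V^T, the vector returned is
    V_R diag(1 / s_R) U_R^T y.
    """
    U, s, Vt = np.linalg.svd(np.asarray(X, dtype=float), full_matrices=False)
    return Vt[:R].T @ ((U[:, :R].T @ np.asarray(y, dtype=float)) / s[:R])


# Synthetic ill-conditioned X: rank 2 plus a very small perturbation.
rng = np.random.default_rng(2)
I, J, R = 30, 10, 2
T_true = rng.normal(size=(I, R))
P_true = rng.normal(size=(J, R))
X = T_true @ P_true.T + 1e-6 * rng.normal(size=(I, J))
y = T_true @ np.array([1.0, -2.0]) + 0.01 * rng.normal(size=I)

condition_number = np.linalg.cond(X)
b_pcr = pcr_coefficients(X, y, R)    # R components retained
b_full = pcr_coefficients(X, y, J)   # all J components, i.e. least squares

relative_residual = np.linalg.norm(X @ b_pcr - y) / np.linalg.norm(y)
print("condition number of X:", condition_number)
print("norm of PCR coefficients:  ", np.linalg.norm(b_pcr))
print("norm of full coefficients: ", np.linalg.norm(b_full))
print("relative residual of PCR fit:", relative_residual)
```

Retaining all J components reproduces the least squares solution, whose coefficients are inflated by division by the very small singular values. Truncating at R components discards those directions, so the coefficient vector stays small while the fit to y remains close.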

With multiple regression analysis involving large numbers of independent variables, there often exists extensive collinearity or correlation between these variables. Collinearity adds redundancy to the regression model, since more variables may be included in the model than are necessary for adequate predictive performance. Of the regression methods available to the analyst that offer protection against the problems induced by correlation between variables, principal components regression (PCR) is the most commonly employed. [Pg.194]

In this section we shall consider the rather general case where, for a series of chemical compounds, measurements are made in a number of parallel biological tests and where a set of descriptor variables is believed to be related to the biological potencies observed. In order to understand the data in their entirety and to deal adequately with the mathematical properties of such data, methods of multivariate statistics are required. A variety of such methods is available, for example multivariate regression, canonical correlation, principal component analysis, principal component regression, partial least squares analysis, and factor analysis, all of which have been applied to biological or chemical problems (for reviews, see [1-11]). Which method to choose depends on the ultimate objective of the analysis and the properties of the data. We have found principal component and factor analysis particularly useful. For this reason, and also since many multivariate methods make use of components or factors, we will start with these methods in some detail, while the discussion of other approaches will be less extensive. [Pg.44]

In the case of multivariate modeling, several independent as well as several dependent variables may operate. Out of the many regression methods, we will learn about the conventional method of ordinary least squares (OLS) as well as methods that are based on biased parameter estimations reducing simultaneously the dimensionality of the regression problem, that is, principal component regression (PCR) and the partial least squares (PLS) method. [Pg.231]

In the previous chapter, it was noted that the ordinary least-squares approach applied to multivariate data (multivariate linear regression, MLR) suffers from serious uncertainty problems when the independent variables are collinear. Principal components regression (PCR) can solve the collinearity problem and provide the additional benefits of factor-based regression methods, such as noise filtering. Recall that PCR compresses the original X-block (e.g. a matrix of absorbances) into a new block of scores T, containing fewer variables (the so-called factors, latent variables, or principal components), and then regression is performed between T and the property of... [Pg.300]

The previous section alludes to the most common problems in quantitative Raman spectroscopic calibrations. Most models require all components in a system to be known and modeled in the calibration data in order to predict any one component accurately. Inverse calibration techniques such as inverse multiple linear regression (inverse MLR), principal component regression (PCR) and partial least squares (PLS, also known as projection to latent structures) avoid this problem by forcing the calibration steps to use only the spectral features that are either changing (PCR) or directly correlated to the property of interest (PLS). Moreover, not all components in a sample need to be known to perform an inverse calibration. The basic form of an inverse calibration centers around an equation of the form... [Pg.314]

Big Chemical Encyclopedia