Principal component regression definition

PCA can be used in QSAR as a preliminary step to Hansch analysis in order to determine the relevant parameters that must be entered into the equation. Principal components are by definition uncorrelated and hence do not pose the problem of multicollinearity. Instead of defining a Hansch model in terms of the original physicochemical parameters, it is often more appropriate to use principal components regression (PCR), which has been discussed in Section 35.6. An alternative approach is partial least squares (PLS) regression, which is covered in more detail below (Section 37.4). [Pg.398]

The simplest definition of model complexity is based on the number of terms in the model. In other words, the model complexity is given by the number of model variables in Ordinary Least Squares regression (cpx = p), the number M of significant principal components in Principal Component Regression (cpx = M), or the number of significant latent variables in Partial Least Squares regression (cpx = M)... [Pg.296]

Problems with the inversion of the covariance matrix can be overcome by a preceding principal component analysis. Instead of the (correlated) features, a set of principal component scores is used as the independent variables in the regression equation (principal component regression, PCR). Remember that PCA scores are uncorrelated by definition. [Pg.353]
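The idea in the excerpt above can be illustrated with a minimal NumPy sketch. It is not taken from the quoted book: the function name `pcr_fit`, the synthetic data, and the choice of two components are assumptions made only for illustration. The features are centered, the scores are obtained from a singular value decomposition, and y is regressed on the first M scores. Because the scores are uncorrelated, no ill-conditioned matrix has to be inverted.

```python
import numpy as np

def pcr_fit(X, y, n_components):
    """Principal component regression via SVD (illustrative sketch)."""
    x_mean = X.mean(axis=0)
    y_mean = y.mean()
    Xc = X - x_mean                          # center the features
    # SVD of the centered data: rows of Vt are the principal axes (loadings)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    V = Vt[:n_components].T                  # loadings, shape (p, M)
    T = Xc @ V                               # scores, shape (n, M)
    # T'T is diagonal with entries s**2, so each score is regressed separately
    gamma = (T.T @ (y - y_mean)) / (s[:n_components] ** 2)
    beta = V @ gamma                         # coefficients for the original features
    intercept = y_mean - x_mean @ beta
    return beta, intercept, T

# Synthetic data (assumed): x2 is almost a copy of x1, so X'X is near-singular
rng = np.random.default_rng(0)
n = 50
x1 = rng.normal(size=n)
x2 = x1 + 1e-6 * rng.normal(size=n)
x3 = rng.normal(size=n)
X = np.column_stack([x1, x2, x3])
y = 2.0 * x1 + 0.5 * x3 + 0.1 * rng.normal(size=n)

beta, intercept, T = pcr_fit(X, y, n_components=2)
y_hat = X @ beta + intercept
score_cov = np.cov(T, rowvar=False)          # off-diagonal entries are ~0
rmse = np.sqrt(np.mean((y - y_hat) ** 2))
```

In this sketch the third component, which carries only the tiny difference between x1 and x2, is dropped. The model complexity in the sense used above is then cpx = M = 2.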

Note that the lipophilicity parameter log P is defined as a decimal logarithm. The parabolic equation is only non-linear in the variable log P, but is linear in the coefficients. Hence, it can be solved by multiple linear regression (see Section 10.8). The bilinear equation, however, is non-linear in both the variable P and the coefficients, and can only be solved by means of non-linear regression techniques (see Chapter 11). It is approximately linear with a positive slope (b₁) for small values of log P, while it is also approximately linear with a negative slope (b₁ + b₂) for large values of log P. The term bilinear is used in this context to indicate that the QSAR model can be resolved into two linear relations for small and for large values of P, respectively. This definition differs from the one which has been introduced in the context of principal components analysis in Chapter 17. [Pg.390]
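To illustrate why a model that is linear in its coefficients can be handled by multiple linear regression, the sketch below fits a parabolic model by ordinary least squares. The specific form, activity = c0 + c1 log P + c2 (log P)^2, the coefficient values, and the log P grid are assumptions chosen for the example and are not taken from the quoted text.

```python
import numpy as np

# Hypothetical log P values and hypothetical "true" coefficients (c0, c1, c2)
log_p = np.linspace(-1.0, 5.0, 13)
true_coef = np.array([1.0, 1.2, -0.25])

# Noise-free synthetic activities from the assumed parabolic model
activity = true_coef[0] + true_coef[1] * log_p + true_coef[2] * log_p**2

# Design matrix: the columns 1, log P, (log P)^2 are fixed known regressors,
# so the model is linear in the unknown coefficients
A = np.column_stack([np.ones_like(log_p), log_p, log_p**2])
coef_hat, *_ = np.linalg.lstsq(A, activity, rcond=None)

# Vertex of the fitted parabola: the log P value with maximal activity
log_p_opt = -coef_hat[1] / (2.0 * coef_hat[2])
```

The squared term is simply treated as a second regressor. A bilinear model does not allow this, because one of its coefficients sits inside the logarithm and has to be estimated iteratively.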
