
Autoprediction error

The simplest approach to determining the number of significant components is to measure the autoprediction error, also called the root mean square error of calibration. Usually (but not exclusively) the error is calculated on the concentration data matrix (c), and we will restrict the discussion below to errors in concentration; importantly, similar equations can be obtained for the x data. [Pg.19]
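A minimal sketch of this quantity, for a model with a components fitted to I samples: the degrees-of-freedom correction in the denominator is an assumption here (it matches the contrast drawn with cross-validation below, but some texts simply divide by I):

```latex
% Root mean square error of calibration (autoprediction error).
% Assumes mean-centred data, so a components plus the mean consume
% a + 1 degrees of freedom; \hat{c}_i is the prediction for sample i
% from a model fitted to the entire dataset.
E_{cal} = \sqrt{\frac{\sum_{i=1}^{I} \left( \hat{c}_i - c_i \right)^2}{I - a - 1}}
```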

The autopredictive error can be used to determine how many PLS components to use in the model, in a number of ways. [Pg.20]
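One way this might look in practice (a minimal sketch using scikit-learn's PLSRegression; the data and variable names are hypothetical, and the denominator follows the convention sketched above):

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression

# Hypothetical calibration data: I = 25 samples, 100 wavelengths, one analyte.
rng = np.random.default_rng(0)
X = rng.normal(size=(25, 100))          # spectral data matrix
c = rng.normal(size=(25, 1))            # concentration vector

for a in range(1, 11):
    pls = PLSRegression(n_components=a)
    pls.fit(X, c)
    c_hat = pls.predict(X)              # autoprediction: predict the training data itself
    # Autopredictive (calibration) error, divided by I - a - 1 as above.
    e_cal = np.sqrt(np.sum((c_hat - c) ** 2) / (len(c) - a - 1))
    print(f"{a} components: E_cal = {e_cal:.4f}")
```

Because the same data are used for fitting and prediction, this error keeps shrinking as components are added, which is why it can only be one of several guides to model size.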

Notice that, unlike the autoprediction error, this term is always divided by I, because each sample in the original dataset represents a degree of freedom, however many PLS or PCA components have been calculated and however the data have been preprocessed. [Pg.21]
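In the same notation as above, the cross-validated error the text describes would then be (again a sketch of the stated convention, not a quotation from the source):

```latex
% Cross-validated root mean square error: each prediction \hat{c}_{cv,i}
% is made with sample i left out of the model, so every sample retains
% its degree of freedom and the denominator is always I.
E_{cv} = \sqrt{\frac{\sum_{i=1}^{I} \left( \hat{c}_{cv,i} - c_i \right)^2}{I}}
```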

It can be employed as a fairly realistic error estimate for predictive ability. The minimum cross-validated prediction error for acenaphthylene of 0.040 mg L-1 equals 33.69%. This compares with an autopredictive error of 0.014 mg L-1, or 11.64%, using ten components and PLS1, which is a very over-optimistic estimate. [Pg.21]

The PRESS errors can then be compared with the RSS (residual sum of squares) errors for each object for straight PCA (sometimes called the autoprediction error), given by... [Pg.200]
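The snippet is cut off before the equation itself; a hedged reconstruction, using standard chemometrics definitions for a data matrix X with elements x_ij modelled by a PCs, would be:

```latex
% Autopredictive residual sum of squares after a PCs: \hat{x}_{ij} is
% reconstructed from a model fitted to the full dataset.
RSS_a = \sum_{i=1}^{I} \sum_{j=1}^{J} \left( x_{ij} - \hat{x}_{ij} \right)^2

% PRESS after a PCs: {}^{cv}\hat{x}_{ij} is predicted with sample i
% left out of the model, as in cross-validation.
PRESS_a = \sum_{i=1}^{I} \sum_{j=1}^{J} \left( x_{ij} - {}^{cv}\hat{x}_{ij} \right)^2
```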

The simplest approach to determining the number of significant components is by measuring the autoprediction error. This is also called the root mean square error of calibration. [Pg.313]

For acenaphthylene using PLS1, the cross-validated error is presented in Fig. 18. An immediate difference between autoprediction and cross-validation is evident. In the former case the data will always be better modelled as more components are employed in the calculation, so the error will always decrease (with occasional rare exceptions in the case of ¹E_cal). However, cross-validated errors normally reach a minimum as the correct number of components is found, and then increase afterwards. This is because the later components really represent noise, not systematic information in the data. [Pg.21]
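A minimal sketch of this comparison, with hypothetical data; leave-one-out cross-validation via scikit-learn stands in for whatever scheme the original text uses:

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.model_selection import LeaveOneOut, cross_val_predict

rng = np.random.default_rng(0)
X = rng.normal(size=(25, 100))   # hypothetical spectra
c = rng.normal(size=25)          # hypothetical concentrations

for a in range(1, 11):
    pls = PLSRegression(n_components=a)
    # Each sample is predicted by a model from which it was excluded.
    c_cv = cross_val_predict(pls, X, c, cv=LeaveOneOut()).ravel()
    # Cross-validated error: always divided by I (see above).
    e_cv = np.sqrt(np.mean((c_cv - c) ** 2))
    print(f"{a} components: E_cv = {e_cv:.4f}")
# Unlike the autopredictive error, E_cv typically passes through a
# minimum at the correct number of components and rises afterwards,
# because the later components are fitting noise.
```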

If, however, we use dataset B as the training set and dataset A as the test set, a very different story emerges, as shown in Fig. 20 for acenaphthylene. The autopredictive and cross-validation errors are very similar to those obtained for dataset A; the value... [Pg.22]

Fig. 20 Autopredictive, cross-validation and test errors for dataset B (acenaphthylene) and PLS1.
A complementary series of methods for determining the number of significant factors is based on cross-validation. It is assumed that significant components model "data", whilst later (and redundant) components model "noise". Autopredictive models involve fitting PCs to the entire dataset, and always provide a closer fit to the data the more components are calculated. Hence the residual error will be smaller if ten rather than nine PCs are calculated. This does not necessarily indicate that it is correct to retain all ten PCs; the later PCs may model noise, which we do not want. [Pg.199]
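To see this monotonic decrease concretely, here is a sketch with hypothetical data; the SVD-based reconstruction is one standard way to compute the autopredictive PCA fit, not necessarily the one used in the source:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 8))             # hypothetical data matrix
Xc = X - X.mean(axis=0)                  # column mean-centre

U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
for a in range(1, 6):
    # Reconstruct Xc from the first a principal components only.
    X_hat = (U[:, :a] * s[:a]) @ Vt[:a, :]
    rss = np.sum((Xc - X_hat) ** 2)
    print(f"{a} PCs: RSS = {rss:.3f}")   # RSS always shrinks as a grows
```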

Using the eigenvalues obtained in question 1, calculate the residual sum of squares error for 1-5 PCs and autoprediction. [Pg.270]
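The relationship this exercise relies on (a hedged sketch, assuming the common convention that each eigenvalue g_k equals the sum of squares explained by the k-th PC) is:

```latex
% Residual sum of squares after a PCs: the total sum of squares of the
% data minus the sum of the first a eigenvalues, so each further PC
% removes exactly its eigenvalue from the residual.
RSS_a = \sum_{i=1}^{I} \sum_{j=1}^{J} x_{ij}^2 \; - \; \sum_{k=1}^{a} g_k
```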

Perform autopredictive PLS1 on the standardised, reduced, unfolded data of question 7 and calculate the errors as one, two and three components are computed. [Pg.338]

