Statistical methods residual plot

In the maximum-likelihood method used here, the "true" value of each measured variable is also found in the course of parameter estimation. The differences between these "true" values and the corresponding experimentally measured values are the residuals (also called deviations). When there are many data points, the residuals can be analyzed by standard statistical methods (Draper and Smith, 1966). If, however, there are only a few data points, examination of the residuals for trends, when plotted versus other system variables, may provide valuable information. Often these plots can indicate at a glance excessive experimental error, systematic error, or "lack of fit." Data points which are obviously bad can also be readily detected. If the model is suitable and if there are no systematic errors, such a plot shows the residuals randomly distributed with zero means. This behavior is shown in Figure 3 for the ethyl-acetate-n-propanol data of Murti and Van Winkle (1958), fitted with the van Laar equation. [Pg.105]
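The check described above can be sketched numerically. This is a minimal illustration, not the procedure of the cited authors: the function name `residual_summary`, the sign convention (measured minus estimated), and the use of a Wald-Wolfowitz runs test on the residual signs are all assumptions made here. The inputs are assumed to be ordered by the system variable the residuals would be plotted against.

```python
import math

def residual_summary(measured, estimated):
    """Return (residuals, mean, runs, z) for values ordered by the plotted variable.

    z is the runs-test z-score for the sequence of residual signs, or None
    when it cannot be computed (e.g. all residuals share one sign).
    """
    # Assumed convention: residual = measured value - estimated "true" value.
    residuals = [m - e for m, e in zip(measured, estimated)]
    mean = sum(residuals) / len(residuals)

    # Sign sequence, ignoring exact zeros.
    signs = [r > 0 for r in residuals if r != 0]
    n_pos = sum(signs)
    n_neg = len(signs) - n_pos
    # A run is a maximal stretch of residuals with the same sign.
    runs = 1 + sum(1 for a, b in zip(signs, signs[1:]) if a != b)

    if n_pos == 0 or n_neg == 0:
        return residuals, mean, runs, None
    n = n_pos + n_neg
    expected = 2 * n_pos * n_neg / n + 1
    variance = 2 * n_pos * n_neg * (2 * n_pos * n_neg - n) / (n * n * (n - 1))
    if variance == 0:
        return residuals, mean, runs, None
    z = (runs - expected) / math.sqrt(variance)
    return residuals, mean, runs, z
```

A mean far from zero, or a strongly negative z (too few sign changes), would hint at systematic error or lack of fit. With only a handful of points the z-score is a rough guide at best, which is why the text recommends inspecting the plot itself.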

Linearity is evaluated by appropriate statistical methods such as the calculation of a regression line by the method of least squares. The linearity results should include the correlation coefficient, y-intercept, slope of the regression line, and residual sum of squares as well as a plot of the data. Also, it is helpful to include an analysis of the deviation of the actual data points from the regression line to evaluate the degree of linearity. [Pg.366]
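The quantities listed above can be computed in a few lines. The following is a minimal sketch using ordinary (unweighted) least squares. The function name `linearity_report` and the dictionary keys are illustrative choices, not part of any cited guideline.

```python
import math

def linearity_report(x, y):
    """Unweighted least-squares line through (x, y) with the usual linearity figures."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)          # correlation coefficient
    # Deviations of the actual data points from the regression line.
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    rss = sum(e * e for e in residuals)     # residual sum of squares
    return {"slope": slope, "intercept": intercept, "r": r,
            "rss": rss, "residuals": residuals}
```

The returned `residuals` list is what one would plot against `x` to judge the degree of linearity by eye.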

After outliers have been purged from the data and a model has been evaluated visually and/or by, e.g., residual plots, the model fit should also be tested by appropriate statistical methods [2, 6, 9, 10, 14]. The fit of unweighted regression models (homoscedastic data) can be tested by the ANOVA lack-of-fit test [6, 9]. A detailed discussion of alternative statistical tests for both unweighted and weighted calibration models can be found in Ref. [16]. The widespread practice of evaluating a calibration model via its coefficients of correlation or determination is not acceptable from a statistical point of view [9]. [Pg.3]
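For orientation, the lack-of-fit F statistic for an unweighted straight-line calibration can be sketched as below. The sketch assumes replicate measurements at each concentration level and a two-parameter linear model. The function name `lack_of_fit_f` is hypothetical, and the cited references may formulate the test differently.

```python
from collections import defaultdict

def lack_of_fit_f(x, y):
    """Return (F, df_lack_of_fit, df_pure_error) for a straight-line fit.

    Requires replicates: more data points than distinct x levels,
    and more than two distinct x levels.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))

    # Pure error: scatter of replicates around their own level mean.
    levels = defaultdict(list)
    for xi, yi in zip(x, y):
        levels[xi].append(yi)
    ss_pure = sum(sum((v - sum(g) / len(g)) ** 2 for v in g)
                  for g in levels.values())

    m = len(levels)
    df_lof = m - 2            # levels minus fitted parameters
    df_pure = n - m
    ss_lof = rss - ss_pure    # what the line fails to explain beyond noise
    f_stat = (ss_lof / df_lof) / (ss_pure / df_pure)
    return f_stat, df_lof, df_pure
```

The statistic is then compared with the critical F value for (df_lof, df_pure) degrees of freedom, taken from a table or from a statistics library. A value well above the critical value suggests the straight-line model does not fit.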

Figure 8.9 Plots of values (including the sign) of residuals (Y - Yp) vs x, where Yp is the value of Y predicted from the simple linear model (Y = A + B·x) with A and B determined by simple linear regression, for three important causes of significant values of the residual variance V. Note that in the case of nonlinearity of the data (i.e., inapplicability of Equation [8.19a] to this data set) the plot could just as well be concave upwards. A sufficiently high degree of irreproducibility can mask the other two if present. It is possible to observe nonlinearity of heteroscedastic data if the latter effect is not too extreme. Reproduced from Meier and Zünd, Statistical Methods in Analytical Chemistry, 2nd Edition (2000), with permission of John Wiley & Sons Inc.

But where did the calibration plane come from? By letting the computer use the same statistical tool that it used before: the least-squares best fit. Only this time it operates with one more dimension. Initially (in order to calibrate the system) one obtains a so-called "learning set" of standards (samples which have been analyzed by some acceptable reference method). Each sample is then measured at each wavelength and the absorbances are plotted in three-dimensional space rather than in the plane of a sheet of graph paper. The residuals (distances from point to plane) are minimized by the least squares algorithm, and the plane which fits best through these points is, by definition, the calibration plane (see Figure 10). [Pg.99]
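A two-wavelength calibration plane of this kind can be sketched as a small multiple linear regression. The sketch assumes the plane has the form c = b0 + b1·A1 + b2·A2, where c is the reference value and A1, A2 are the absorbances at the two wavelengths. It also assumes the residuals are measured along the c axis, the usual regression convention, rather than perpendicular to the plane. The function names are illustrative.

```python
def solve_linear_system(a, b):
    """Solve a·x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_calibration_plane(abs_1, abs_2, reference):
    """Least-squares coefficients [b0, b1, b2] from a learning set of standards."""
    rows = [[1.0, u, v] for u, v in zip(abs_1, abs_2)]
    # Normal equations: (X^T X) b = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, reference)) for i in range(3)]
    return solve_linear_system(xtx, xty)
```

Once the coefficients are known, an unknown sample is predicted as `b0 + b1 * a1 + b2 * a2` from its two measured absorbances.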

If the functional relationship between one variable and another is linear, a straight-line plot would be obtained on arithmetic-coordinate graph paper. If the relationship approaches a linear one, the best method of fitting the data to a linear model would be through the method of least squares. The resulting linear equation (or line) would have the property of lying as close as possible to the data. For statistical purposes, close and/or best fit is defined as that linear equation or line for which the sum of the squared vertical distances between the data (values of Y, the dependent variable) and the line is minimized. These distances are called residuals. This approach is employed in the solution below. [Pg.178]
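Written out, the criterion reads as follows. The symbols A (intercept), B (slope), and the index i over the n data points are notation assumed here for illustration, since the excerpt itself names no symbols.

```latex
% Sum of squared vertical distances (residuals) between data and line
S(A, B) = \sum_{i=1}^{n} \bigl( Y_i - A - B X_i \bigr)^2

% Setting \partial S / \partial A = 0 and \partial S / \partial B = 0 gives
B = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}
         {\sum_{i=1}^{n} (X_i - \bar{X})^2},
\qquad
A = \bar{Y} - B \bar{X}
```

Here \bar{X} and \bar{Y} are the arithmetic means of the X and Y values. The residual of point i is Y_i - A - B X_i.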

Criteria for evaluating the degree of fit between measured fluorescence decay curves and trial decay functions have been discussed. In some instances plots of weighted residuals were found to be sufficient, but a generalized statistical test was proposed for all other cases. An analysis of the statistical distribution of noise in fluorescence decay measurements by SPC has shown, as expected, that Poisson statistics dominate. A method for obtaining decay information from pulse fluorimetry without the need for consideration of the excitation pulse has been described. ... [Pg.36]

In our context, model selection is the selection of the number of components, the choice between a PARAFAC and a T2 model (e.g., is there energy transfer?), or the choice between parametric models for specific ways and components (e.g., is the time dependence a single exponential?). Most often, one will be interested in deciding on the number of components to use. There are a variety of statistical tests based on the decline of the sum of squared residuals with additional components. Other methods look at the relationship between residuals for evidence that they show no patterns and hence the model is adequate. However, the simplest and perhaps the most effective approach is simply to plot the logarithm of the sum of squared residuals versus the number of components and use the number of components at the elbow where the curve flattens out. ... [Pg.693]
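The elbow is normally judged by eye, but a rough numerical stand-in can be sketched. The heuristic below picks the point where the log-SSR curve bends most sharply, using the largest second difference. That rule and the function name `elbow_component_count` are assumptions for illustration, not a method from the cited text.

```python
import math

def elbow_component_count(ssr):
    """Estimate the elbow of log10(SSR) versus number of components.

    ssr[k - 1] is the sum of squared residuals of the k-component model;
    at least three models are needed.
    """
    log_ssr = [math.log10(s) for s in ssr]
    # Second difference: large positive values mark a steep drop followed
    # by a flat stretch, i.e. the elbow.
    bend = [log_ssr[i - 1] - 2 * log_ssr[i] + log_ssr[i + 1]
            for i in range(1, len(log_ssr) - 1)]
    best = max(range(len(bend)), key=bend.__getitem__)
    # bend[0] belongs to the 2-component model.
    return best + 2
```

For example, SSR values of 1000, 100, 10, 8, 7, 6.5 for one to six components drop steeply up to three components and flatten afterwards, so the heuristic returns 3. On noisy curves the result should still be checked against the plot.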

Linearity can be established by visual evaluation of a plot of the area as a function of the analyte concentration (Figure 6). The correlation coefficient, y intercept, and slope should be calculated. The y intercept should statistically not differ from 0. The residuals should be calculated and plotted; they exhibit a random arrangement if the response is linear. The slope of the active compound curve (Figure 4) is divided by the slopes of these curves (Figure 6) to determine the RRFs; these are recorded in the method procedure if the method does not prescribe the use of external standards for related compounds. [Pg.445]
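Two of these steps can be illustrated briefly: a t statistic for the hypothesis that the y intercept is zero, and the RRF as a ratio of slopes. The function names are hypothetical, and the use of a t-test for the intercept is an assumption about what "statistically not differ from 0" means in practice.

```python
import math

def intercept_t_statistic(conc, area):
    """Return (slope, intercept, t) for an unweighted straight-line fit.

    t = intercept / standard error of the intercept, with n - 2
    degrees of freedom.
    """
    n = len(conc)
    mean_x = sum(conc) / n
    mean_y = sum(area) / n
    sxx = sum((x - mean_x) ** 2 for x in conc)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(conc, area)) / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((y - intercept - slope * x) ** 2 for x, y in zip(conc, area))
    s2 = rss / (n - 2)                                  # residual variance
    se_intercept = math.sqrt(s2 * (1 / n + mean_x ** 2 / sxx))
    return slope, intercept, intercept / se_intercept

def relative_response_factor(slope_active, slope_related):
    """RRF as described above: active-compound slope over related-compound slope."""
    return slope_active / slope_related
```

If |t| is below the two-sided critical t value for n - 2 degrees of freedom at the chosen confidence level, the intercept is not significantly different from zero.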

Big Chemical Encyclopedia
