Big Chemical Encyclopedia


Regression distribution, squared residuals

Although we cannot clearly determine the reaction order from Figure 3.9, we can gain some insight from a residual plot, which depicts the difference between the predicted and experimental values of cA using the rate constants calculated from the regression analysis. Figure 3.10 shows a random distribution of residuals for a second-order reaction, but a nonrandom distribution of residuals for a first-order reaction (consistent overprediction of concentration for the first five datapoints). Consequently, based upon this analysis, it is apparent that the reaction is second-order rather than first-order, and the reaction rate constant is 0.050. Furthermore, the sum of squared residuals is much smaller for second-order kinetics than for first-order kinetics (1.28 × 10⁻⁴ versus 5.39 × 10⁻⁴). [Pg.59]
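The diagnostic described above can be sketched in a few lines. The data below are synthetic, not the values behind Figure 3.9: a hypothetical c0 = 1.0 and the quoted k = 0.050 are used to generate concentrations from the integrated second-order law, so the second-order fit recovers them essentially exactly while the first-order fit leaves patterned residuals and a larger sum of squares.

```python
import math

# Synthetic concentration data generated from the integrated second-order
# rate law, cA = c0 / (1 + k*c0*t). c0 and k are hypothetical illustrative
# values, not the experimental data of Figure 3.9.
c0, k_true = 1.0, 0.050
t = list(range(0, 100, 10))
cA = [c0 / (1.0 + k_true * c0 * ti) for ti in t]

def linfit(x, y):
    """Ordinary least squares fit y = a + b*x; returns (a, b)."""
    n = len(x)
    xb, yb = sum(x) / n, sum(y) / n
    b = sum((xi - xb) * (yi - yb) for xi, yi in zip(x, y)) \
        / sum((xi - xb) ** 2 for xi in x)
    return yb - b * xb, b

# First-order hypothesis: ln(cA) vs t should be linear.
a1, b1 = linfit(t, [math.log(c) for c in cA])
pred1 = [math.exp(a1 + b1 * ti) for ti in t]

# Second-order hypothesis: 1/cA vs t should be linear, with slope k.
a2, b2 = linfit(t, [1.0 / c for c in cA])
pred2 = [1.0 / (a2 + b2 * ti) for ti in t]

res1 = [p - c for p, c in zip(pred1, cA)]   # systematic, patterned
res2 = [p - c for p, c in zip(pred2, cA)]   # essentially zero
ssr1 = sum(r * r for r in res1)
ssr2 = sum(r * r for r in res2)
```

Plotting res1 and res2 against t reproduces the qualitative contrast the text describes: the wrong order gives a trended residual plot and a much larger sum of squared residuals.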

For the model in Exercise 1, suppose ε is normally distributed with mean zero and variance σ²(1 + (γx)²). Show that σ² and γ² can be consistently estimated by a regression of the squared least squares residuals on a constant and x². Is this estimator efficient? ... [Pg.45]
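A minimal simulation of the estimator this exercise asks about. All numbers here (β = 2, σ² = 0.5, γ² = 0.25, the uniform design for x) are assumed for illustration: errors are drawn with variance σ²(1 + (γx)²), OLS is run, and the squared residuals are regressed on a constant and x². Since E[ε²] = σ² + σ²γ²x², the intercept estimates σ² and the slope-to-intercept ratio estimates γ².

```python
import math
import random

random.seed(0)
n = 20000
beta, sigma2, gamma2 = 2.0, 0.5, 0.25   # hypothetical true values

x = [random.uniform(-3, 3) for _ in range(n)]
# Heteroscedastic errors: Var(eps | x) = sigma2 * (1 + gamma2 * x^2)
y = [beta * xi + random.gauss(0.0, math.sqrt(sigma2 * (1 + gamma2 * xi * xi)))
     for xi in x]

def linfit(u, v):
    """Ordinary least squares fit v = a + b*u; returns (a, b)."""
    m = len(u)
    ub, vb = sum(u) / m, sum(v) / m
    b = sum((a - ub) * (c - vb) for a, c in zip(u, v)) \
        / sum((a - ub) ** 2 for a in u)
    return vb - b * ub, b

# Step 1: OLS of y on x, then form the squared residuals.
a_hat, b_hat = linfit(x, y)
e2 = [(yi - a_hat - b_hat * xi) ** 2 for xi, yi in zip(x, y)]

# Step 2: regress squared residuals on a constant and x^2.
c_hat, d_hat = linfit([xi * xi for xi in x], e2)
sigma2_hat = c_hat               # intercept -> sigma^2
gamma2_hat = d_hat / c_hat       # slope / intercept -> gamma^2
```

With n large, both estimates land close to the assumed truth, illustrating the consistency the exercise asks to prove (efficiency is a separate matter, since this auxiliary regression ignores the non-constant variance of e²).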

From the standpoint of statistics, the transformation of Eq. 2.2-19 into Eq. 2.3.b-1 and the determination of the parameters from this equation may be criticized. What is minimized by linear regression are the residuals between experimental and calculated y-values. The theory requires the error to be normally distributed. This may be true for r, but not necessarily for the rearranged group of partial pressures, and this may, in principle, affect the values of the rate coefficient k and the adsorption equilibrium constants. However, when the rate equation is not rearranged, the regression is, in general, no longer linear, and the minimization of the sum of squares of residuals becomes iterative. Search procedures are recommended for this (see Marquardt [41]). It is even possible to consider the data at all temperatures simultaneously. The Arrhenius law for the temperature dependence then enters into the equations and increases their nonlinear character. [Pg.115]
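A toy version of the iterative approach recommended above. The rate law, the parameter values, and the coarse-to-fine grid search are all hypothetical stand-ins (in practice one would use Marquardt's method); the point is only that the sum of squared rate residuals is minimized on the raw, non-linearized equation.

```python
# Hypothetical Hougen-Watson-type rate law r = k*p / (1 + K*p)**2, with
# assumed true values k = 4.0, K = 1.5 and noise-free rate data.
k_true, K_true = 4.0, 1.5
p = [0.2 * i for i in range(1, 11)]
r = [k_true * pi / (1 + K_true * pi) ** 2 for pi in p]

def ssr(k, K):
    """Sum of squared rate residuals for the raw (non-linearized) model."""
    return sum((k * pi / (1 + K * pi) ** 2 - ri) ** 2
               for pi, ri in zip(p, r))

# Crude coarse-to-fine search: evaluate a 21x21 grid of (k, K), re-centre
# a smaller box on the best point, and repeat.
k_lo, k_hi, K_lo, K_hi = 0.1, 10.0, 0.1, 10.0
for _ in range(10):
    best = min((ssr(k_lo + i * (k_hi - k_lo) / 20,
                    K_lo + j * (K_hi - K_lo) / 20),
                k_lo + i * (k_hi - k_lo) / 20,
                K_lo + j * (K_hi - K_lo) / 20)
               for i in range(21) for j in range(21))
    _, k_hat, K_hat = best
    dk, dK = (k_hi - k_lo) / 10, (K_hi - K_lo) / 10
    k_lo, k_hi, K_lo, K_hi = k_hat - dk, k_hat + dk, K_hat - dK, K_hat + dK
```

Because the rate is fitted directly, the error structure assumed by the least squares criterion applies to r itself rather than to a rearranged pressure group.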

Macro pharmacokinetic parameters, such as total body clearance or the apparent volume of distribution, can be readily calculated from polyexponential equations such as Eq. (9) without assignment of a specific model structure. The parameters (i.e., Ai, λi) associated with such an equation are initially estimated by the method of residuals, followed by nonlinear least squares regression analyses [30]. ... [Pg.90]
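The method of residuals ("curve stripping") mentioned above can be sketched for a biexponential C(t) = A1·exp(−λ1·t) + A2·exp(−λ2·t). All values below are hypothetical and the data are noise-free: the slow terminal phase is fitted log-linearly first, its back-extrapolated contribution is subtracted, and the early-time residuals are fitted for the fast phase.

```python
import math

# Noise-free biexponential data with hypothetical A_i, lambda_i.
A1, l1, A2, l2 = 8.0, 1.0, 2.0, 0.1
t = [0.25 * i for i in range(0, 41)]              # 0 ... 10
C = [A1 * math.exp(-l1 * ti) + A2 * math.exp(-l2 * ti) for ti in t]

def loglinfit(ts, ys):
    """Fit ln(y) = ln(a) - b*t by OLS; returns (a, b)."""
    n = len(ts)
    tb = sum(ts) / n
    ly = [math.log(y) for y in ys]
    lb = sum(ly) / n
    b = -sum((x - tb) * (y - lb) for x, y in zip(ts, ly)) \
        / sum((x - tb) ** 2 for x in ts)
    return math.exp(lb + b * tb), b

# Terminal phase: the fast exponential is essentially gone for t >= 6.
tail = [(ti, ci) for ti, ci in zip(t, C) if ti >= 6.0]
A2_hat, l2_hat = loglinfit([q[0] for q in tail], [q[1] for q in tail])

# Strip the fitted slow phase off and fit the early-time residuals.
early = [(ti, ci - A2_hat * math.exp(-l2_hat * ti))
         for ti, ci in zip(t, C) if ti <= 2.0]
early = [(ti, ri) for ti, ri in early if ri > 0]   # keep logs defined
A1_hat, l1_hat = loglinfit([q[0] for q in early], [q[1] for q in early])
```

These stripped estimates would then serve as starting values for a nonlinear least squares refinement, as the excerpt describes.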

In a well-behaved calibration model, residuals will have a Normal (i.e., Gaussian) distribution. In fact, as we have previously discussed, least-squares regression analysis is also a Maximum Likelihood method, but only when the errors are Normally distributed. If the data does not follow the straight line model, then there will be an excessive number of residuals with too-large values, and the residuals will then not follow the Normal distribution. It follows, then, that a test for Normality of residuals will also detect nonlinearity. [Pg.437]
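The point that non-Normal residuals betray nonlinearity can be demonstrated with a straight-line fit to genuinely quadratic data. This sketch uses a simple sign-runs count as a stand-in for a formal Normality test; the data and fit are hypothetical.

```python
# A straight line fitted to a truly quadratic response: the residuals are
# not Normal noise but follow a systematic pattern.
x = [i / 10 for i in range(21)]        # 0.0 ... 2.0
y = [xi ** 2 for xi in x]              # quadratic, no noise added

n = len(x)
xb, yb = sum(x) / n, sum(y) / n
b = sum((xi - xb) * (yi - yb) for xi, yi in zip(x, y)) \
    / sum((xi - xb) ** 2 for xi in x)
a = yb - b * xb
res = [yi - (a + b * xi) for xi, yi in zip(x, y)]

# Convex data under a straight-line fit: residuals are positive at both
# ends and negative in the middle -- far too few sign changes for random
# Normal noise, which a runs (or Normality) test will flag.
runs = 1 + sum(1 for r0, r1 in zip(res, res[1:]) if (r0 >= 0) != (r1 >= 0))
```

With 21 residuals of random sign one would expect on the order of ten runs; three runs is the signature of lack of fit rather than Normal error.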

To model the relationship between PLA and PLR, we used each of these in ordinary least squares (OLS) multiple regression to explore the relationship between the dependent variables Mean PLR or Mean PLA and the independent variables (Berry and Feldman, 1985). OLS regression was used because the data satisfied the OLS assumptions for the model as the best linear unbiased estimator (BLUE): the distribution of errors (residuals) is normal, the errors are uncorrelated with each other, and they are homoscedastic (constant variance among residuals), with a mean of 0. We also analyzed predicted values plotted against residuals, as they are a better indicator of non-normality in aggregated data, and found them also to be homoscedastic and independent of one another. [Pg.152]

Table 3 shows the recorded fluorescence emission intensity as a function of the concentration of quinine sulphate in acidic solution. These data are plotted in Figure 3, with least squares regression lines for a linear model, a quadratic model, and a cubic model. The correlation of each fitted model with the experimental data is also given. It is obvious by visual inspection that the straight line represents a poor estimate of the association between the data, despite the apparently high value of the correlation coefficient. The observed lack of fit may be due to random errors in the measured dependent variable or to the incorrect use of a linear model. The latter is the more likely cause of error in the present case. This is confirmed by examining the differences between the model values and the actual results (Figure 4). With the linear model, the residuals exhibit a distinct pattern as a function of concentration. They are not randomly distributed, as would be the case if a more appropriate model were employed, e.g. the quadratic function. [Pg.164]
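The "high correlation coefficient, poor fit" phenomenon is easy to reproduce. The saturating emission-vs-concentration values below are hypothetical stand-ins for the quinine sulphate table: the correlation coefficient of the straight-line fit is high, yet the residuals run negative-positive-negative along the concentration axis.

```python
import math

# Hypothetical fluorescence data that levels off at high concentration
# (standing in for Table 3; not the published values).
conc = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
emission = [0.0, 45.0, 80.0, 105.0, 120.0, 128.0, 130.0]

n = len(conc)
xb, yb = sum(conc) / n, sum(emission) / n
sxx = sum((x - xb) ** 2 for x in conc)
sxy = sum((x - xb) * (y - yb) for x, y in zip(conc, emission))
syy = sum((y - yb) ** 2 for y in emission)
b = sxy / sxx
a = yb - b * xb
r = sxy / math.sqrt(sxx * syy)        # correlation coefficient

res = [y - (a + b * x) for x, y in zip(conc, emission)]
# r is high, yet the residuals are negative at both ends and positive in
# the middle: a patterned lack of fit, not random measurement error.
```

This is exactly why the text recommends inspecting residuals rather than trusting the correlation coefficient alone.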

The concept of squared distances has important functional consequences for how the value of the correlation coefficient reacts to various specific arrangements of data. The significance of correlation is based on the assumption that the distribution of the residual values (i.e., the deviations from the regression line) for the dependent variable y follows the normal distribution, and that the variability of the residual values is the same for all values of the independent variable. However, Monte Carlo studies have shown that meeting these assumptions closely is not crucial if the sample size is very large. Serious biases are unlikely if the sample size is 50 or more, and normality can be assumed if the sample size exceeds 100. [Pg.86]

Examination of the univariate distribution of 5-FU clearance revealed it to be skewed and not normally distributed, suggesting that any regression analysis based on least squares would be plagued by non-normally distributed residuals. Hence, Ln-transformed 5-FU clearance was used as the dependent variable in the analyses. Prior to analysis, age was standardized to 60 years, BSA was standardized to 1.83 m², and dose was standardized to 1000 mg. A p-value less than 0.05 was considered statistically significant. The results of the simple linear regressions of the data (Table 2.4) revealed that sex, 5-FU dose, and the presence or absence of MTX were statistically significant. [Pg.75]
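The rationale for the Ln transform can be illustrated with hypothetical clearance-like values: a log-normally distributed variable is strongly right-skewed on the raw scale but approximately Normal after taking logs. The distribution parameters below are assumed purely for illustration.

```python
import math
import random

# Hypothetical right-skewed "clearance" values: log-normal draws, so
# approximately Normal after the log transform. Parameters are invented.
random.seed(7)
cl = [math.exp(random.gauss(2.0, 0.6)) for _ in range(5000)]

def skew(v):
    """Sample skewness (third standardized moment)."""
    n = len(v)
    m = sum(v) / n
    s2 = sum((x - m) ** 2 for x in v) / n
    return (sum((x - m) ** 3 for x in v) / n) / s2 ** 1.5

skew_raw = skew(cl)                           # clearly positive: skewed
skew_log = skew([math.log(x) for x in cl])    # near zero after Ln
```

Regressing the transformed variable therefore gives residuals far closer to the Normality that least squares assumes.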

Model Discrimination. One can also determine which model or equation best fits the experimental data by comparing the sums of squares for each model and then choosing the equation with the smaller sum of squares and/or carrying out an F-test. Alternatively, we can compare the residual plots for each model. These plots show the error associated with each data point, and one looks to see whether the error is randomly distributed or whether there is a trend in the error. When the error is randomly distributed, this is an additional indication that the correct rate law has been chosen. An example of model discrimination using nonlinear regression is given on the CD-ROM in Chapter 10 of the Summary Notes. [Pg.277]
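A minimal sketch of discrimination by sums of squares, on hypothetical rate-vs-concentration data generated (noise-free, for clarity) from a second-order rate law with k = 0.05. Each candidate rate law is fitted through the origin and the residual sums of squares are compared via a simple F-style variance ratio.

```python
# Hypothetical rate data from a second-order law, r = k*c^2, k = 0.05.
k = 0.05
c = [0.2 * i for i in range(1, 11)]
r = [k * ci ** 2 for ci in c]

def slope(x, y):
    """Through-origin least squares slope for y = b*x."""
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)

k1 = slope(c, r)                        # model 1: r = k1 * c
k2 = slope([ci ** 2 for ci in c], r)    # model 2: r = k2 * c^2

ssr1 = sum((k1 * ci - ri) ** 2 for ci, ri in zip(c, r))
ssr2 = sum((k2 * ci ** 2 - ri) ** 2 for ci, ri in zip(c, r))

# F-style variance ratio (both models spend one parameter, so the
# residual degrees of freedom are equal); a large value favours model 2.
F = (ssr1 / (len(c) - 1)) / (ssr2 / (len(c) - 1)) if ssr2 > 0 else float("inf")
```

The second-order model recovers k essentially exactly and its residuals show no trend, while the first-order model's residuals change sign systematically along c, matching the residual-plot criterion in the text.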

Since no model can reproduce the pure error sum of squares, the maximum explainable variation is the total sum of squares minus SSPE. In our case, SST − SSPE = 8930.00 − 45.00 = 8885.00, which corresponds to 8885.00/8930.00 = 99.50% of the total sum of squares. This percentage is close to 100% because the pure error contribution is relatively small, but it is with this new value that we should compare the variation explained by the regression, 77.79%. The model inadequacy appears clearly in the first two graphs of Fig. 5.8. Once again the residuals are distributed in a curved pattern. [Pg.228]
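The pure error decomposition above can be sketched with a tiny replicated data set (the level and response values below are invented, not the SST = 8930.00 example): SSPE is the scatter of replicates about their own level means, and SST − SSPE is the most any model could ever explain.

```python
# Hypothetical replicated observations at three factor levels. The pure
# error sum of squares is the variation of replicates about their own
# level means; no model, however good, can explain it.
levels = {1.0: [10.2, 9.8], 2.0: [19.7, 20.3], 3.0: [30.5, 29.5]}

all_y = [y for ys in levels.values() for y in ys]
ybar = sum(all_y) / len(all_y)
ss_total = sum((y - ybar) ** 2 for y in all_y)          # SST

ss_pe = 0.0                                             # SSPE
for ys in levels.values():
    m = sum(ys) / len(ys)
    ss_pe += sum((y - m) ** 2 for y in ys)

max_explainable = ss_total - ss_pe
pct = 100.0 * max_explainable / ss_total   # ceiling for "% explained"
```

As in the excerpt, the explained-variation percentage of any fitted model should be judged against `pct`, not against 100%.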

However, regression theory requires that the errors be normally distributed around the rate (−rA), and not around the transformed variable of the linearized version just described. Hence, use the values so determined as initial estimates to obtain more accurate values of the constants by minimizing the sum of squares of the residuals of the rates directly from the raw rate equation by nonlinear least squares analysis. [Pg.178]
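The two-step procedure above can be sketched on synthetic first-order decay data (c0, k, and the noise level are all hypothetical, and c0 is treated as known for simplicity): a linearized log fit supplies the initial estimate, then the sum of squared residuals of the raw model is minimized directly. A one-dimensional ternary search stands in for a general nonlinear least squares routine.

```python
import math
import random

# Synthetic data: c = c0*exp(-k*t) plus additive, constant-variance noise.
random.seed(1)
c0, k_true = 2.0, 0.12
t = [float(i) for i in range(0, 30, 2)]
c = [c0 * math.exp(-k_true * ti) + random.gauss(0.0, 0.01) for ti in t]

def ssr(k):
    """Sum of squared residuals of the raw (non-linearized) model."""
    return sum((c0 * math.exp(-k * ti) - ci) ** 2 for ti, ci in zip(t, c))

# Step 1: linearized fit ln(c) = ln(c0) - k*t gives an initial estimate.
# (The errors are Normal around c, not around ln(c), so this is biased.)
n = len(t)
tb = sum(t) / n
lnc = [math.log(ci) for ci in c]
lb = sum(lnc) / n
slope = sum((ti - tb) * (li - lb) for ti, li in zip(t, lnc)) \
        / sum((ti - tb) ** 2 for ti in t)
k_lin = -slope

# Step 2: refine k by minimizing the SSR of the raw equation directly.
lo, hi = 0.5 * k_lin, 1.5 * k_lin
for _ in range(200):                     # ternary search on the 1-D SSR
    m1, m2 = lo + (hi - lo) / 3, hi - (hi - lo) / 3
    if ssr(m1) < ssr(m2):
        hi = m2
    else:
        lo = m1
k_nls = 0.5 * (lo + hi)
```

The refined estimate can never fit worse than the linearized one on the raw-scale criterion, which is the practical content of the recommendation.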

In order to conduct ordinary least squares regression, some assumptions have to be met, which address linearity, normality of the data distribution, constant variance of the error terms, independence of the error terms, and normality of the error term distribution (Cohen 2003; Hair et al. 1998). Whereas the former two can be assessed before performing the actual regression analysis, the latter three can only be evaluated ex post. I will thus anticipate some of the regression results to check whether the assumptions with respect to the regression residuals are met. [Pg.137]

