Big Chemical Encyclopedia


Statistical significance of the regression model

Now that we have added the assumption that the errors follow a normal distribution to our hypotheses, we can return to the ANOVA and use the mean square values to test whether the regression equation is statistically significant. When β1 = 0, that is, when there is no relation between X and y, it can be demonstrated that the ratio of the MSR and MSr mean squares follows an F distribution. [Pg.218]

For our example we need the value of F1,3, which can be found in Table A.4 at the intersection of the column for ν1 = 1 with the row for ν2 = 3. At the 95% confidence level, this value is 10.13. Our regression will be statistically significant if we find that MSR/MSr > 10.13. Otherwise, we will have no reason to doubt that the value of β1 really is zero and, consequently, that there is no relation between the variables. [Pg.219]
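The decision rule described above can be sketched numerically. The two mean squares below are hypothetical; the critical value 10.13 is the tabulated F(1, 3) point at the 95% level quoted in the text.

```python
# Hypothetical ANOVA mean squares for a one-predictor fit with 5 observations,
# giving nu1 = 1 and nu2 = 3 degrees of freedom.
MS_regression = 46.2   # mean square due to regression (hypothetical)
MS_residual = 1.8      # residual mean square (hypothetical)

F = MS_regression / MS_residual
F_crit = 10.13         # tabulated F(1, 3) at the 95% level (Table A.4 in the text)

significant = F > F_crit
print(f"F = {F:.2f}, critical value = {F_crit}, significant: {significant}")
```

If the computed ratio falls below the critical value, the null hypothesis that the slope is zero cannot be rejected.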


The correlation coefficient r is a measure of the quality of fit of the model. Its square represents the fraction of the variance in the data accounted for by the model. In an ideal situation one would want the correlation coefficient to equal or approach 1, but in reality, because of the complexity of biological data, any value above 0.90 is adequate. The standard deviation s is an absolute measure of the quality of fit. Ideally s should approach zero, but in experimental situations this is not so. It should be small, but it cannot have a value lower than the standard deviation of the experimental data. The magnitude of s may be attributed to experimental error in the data as well as to imperfections in the biological model. A larger data set and a smaller number of variables generally lead to lower values of s. The F value is often used as a measure of the level of statistical significance of the regression model. It is defined as shown in Equation 1.27. [Pg.10]

Brüggemann et al. presented a more elaborate test, developed by Mark and Workman, and exemplified it with an ICP (inductively coupled plasma spectrometry) procedure. In essence, it develops a polynomial model and then tests the statistical significance of the regression coefficients. Its main advantage was claimed to be the avoidance of correlations between the various powers of the concentrations. [Pg.94]

ANOVA of the data confirms that there is a statistically significant relationship between the variables at the 99% confidence level. The R-squared statistic indicates that the model as fitted explains 96.2% of the variability. The adjusted R-squared statistic, which is more suitable for comparing models with different numbers of independent variables, is 94.2%. The prediction error of the model is less than 10%. Results of this protocol are displayed in Table 18.1. Validation results of the regression model are displayed in Table 18.2. [Pg.1082]
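Adjusted R-squared can be computed from R-squared once the number of runs n and the number of independent variables p are known. The n and p below are hypothetical, chosen only to illustrate the penalty for extra variables; R2 is the 96.2% reported in the text.

```python
# Adjusted R-squared penalizes additional independent variables so that models
# of different sizes can be compared on an equal footing.
R2 = 0.962
n, p = 16, 5   # hypothetical: 16 experiments, 5 independent variables

adj_R2 = 1 - (1 - R2) * (n - 1) / (n - p - 1)
print(f"R2 = {R2:.3f}, adjusted R2 = {adj_R2:.3f}")
```

Adding a variable always raises R2, but it raises adjusted R2 only if the gain outweighs the lost residual degree of freedom.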

Of all the regression coefficients, only b0 and b1 are statistically significant at the 99% confidence level. Hence, the final form of the regression model is ... [Pg.435]

An alternative method is backward elimination. This technique starts with a full equation containing every measured variate and successively deletes one variable at each step. Variables are dropped from the equation on the basis of testing the significance of the regression coefficients, i.e., for each variable: is the coefficient zero? The F-statistic used is referred to as the computed F-to-remove. The procedure terminates when all variables remaining in the model are considered significant. [Pg.186]
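The loop can be sketched as below. Everything here is hypothetical: the data, the variable names, and the F-to-remove cutoff of 4.0 (a real analysis would take the cutoff from an F table at the chosen confidence level).

```python
def ols_sse(X_cols, y):
    """Residual sum of squares of an OLS fit with intercept (pure-Python
    normal equations; adequate only for tiny, well-conditioned problems)."""
    n = len(y)
    X = [[1.0] + [col[i] for col in X_cols] for i in range(n)]
    p = len(X[0])
    # Augmented normal-equations matrix [X'X | X'y]
    M = [[sum(X[i][a] * X[i][b] for i in range(n)) for b in range(p)]
         + [sum(X[i][a] * y[i] for i in range(n))] for a in range(p)]
    for c in range(p):                       # Gauss-Jordan with partial pivoting
        piv = max(range(c, p), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(p):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [mr - f * mc for mr, mc in zip(M[r], M[c])]
    b = [M[a][p] / M[a][a] for a in range(p)]
    fitted = [sum(bj * xj for bj, xj in zip(b, row)) for row in X]
    return sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))

def backward_eliminate(cols, y, f_to_remove=4.0):
    """Drop the least significant variable while its F-to-remove < cutoff."""
    names = list(cols)
    n = len(y)
    while names:
        sse_full = ols_sse([cols[v] for v in names], y)
        mse_full = sse_full / (n - len(names) - 1)
        f_vals = {}
        for v in names:
            reduced = [cols[u] for u in names if u != v]
            if reduced:
                sse_red = ols_sse(reduced, y)
            else:                            # intercept-only model
                my = sum(y) / n
                sse_red = sum((yi - my) ** 2 for yi in y)
            # Extra sum of squares from dropping v, scaled by full-model MSE
            f_vals[v] = (sse_red - sse_full) / mse_full
        worst = min(f_vals, key=f_vals.get)
        if f_vals[worst] < f_to_remove:
            names.remove(worst)
        else:
            break                            # all remaining variables significant
    return names

# y depends on x1 only; x2 is an irrelevant variate that should be eliminated
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
x2 = [2.0, 1.0, 2.0, 1.0, 2.0, 1.0, 2.0, 1.0]
e = [0.1, -0.2, 0.15, 0.05, -0.1, 0.2, -0.05, -0.15]
y = [2 * a + d for a, d in zip(x1, e)]
kept = backward_eliminate({"x1": x1, "x2": x2}, y)
```

The noise variate x2 is removed first (its F-to-remove is small), after which x1 survives the test and the procedure stops.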

In particular, we shall see that the regression sum of squares can also be subdivided, each part being associated with certain terms in the model - a sum of squares for the linear model, another for the quadratic and interaction terms. The statistical significance of the different terms in the model can then be assessed. [Pg.182]

From the residual sum of squares SSresid = 2.15, with 14 degrees of freedom, a residual mean square can be calculated (MSresid = 2.15/14 = 0.154). It is this value which is used to calculate F-ratios for assessing the statistical significance of the first-order and second-order terms. (Note that the F-ratio for the first-order model is obtained by dividing the mean square for the regression by the residual mean square calculated for the first-order model, not by the value in the table, 0.154, which refers to the quadratic model.)... [Pg.210]
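The arithmetic of that residual mean square, and the way an F-ratio would then be formed for the quadratic terms, can be sketched as follows (the mean square for the quadratic terms is hypothetical):

```python
# Residual mean square using the numbers quoted in the text:
SS_resid = 2.15
df_resid = 14
MS_resid = SS_resid / df_resid   # = 0.154 (rounded), as quoted in the text

# An F-ratio for the second-order terms would divide their mean square by
# MS_resid; the numerator below is hypothetical.
MS_quadratic = 1.86
F_quadratic = MS_quadratic / MS_resid
print(f"MS_resid = {MS_resid:.3f}, F (quadratic terms) = {F_quadratic:.2f}")
```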

The experiments were performed and the properties of the resulting granulate and tablets, as well as the compression characteristics were measured for each experiment. Results are listed in table 6.3. The model coefficients for each response were estimated by linear regression and are shown in table 6.4. Coefficients which seem to be important are printed in bold type. For reasons which will be explained in chapter 8, it was not possible to determine the statistical significance of the coefficients by analysis of variance. [Pg.270]

What about an assessment of the significance of the fit of a multiple regression equation (or a simple regression) to a set of data? A guide to the overall significance of a regression model can be obtained by calculating a quantity called the F statistic. This is simply the ratio of the explained mean square (MSE) to the residual mean square (MSR)... [Pg.122]
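A minimal worked example of this overall F statistic for a simple (one-predictor) regression, on a small hypothetical data set; the explained mean square has 1 degree of freedom and the residual mean square has n - 2:

```python
# Small hypothetical data set, approximately y = 2x
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(x)
mx, my = sum(x) / n, sum(y) / n
Sxx = sum((xi - mx) ** 2 for xi in x)
Sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
b1 = Sxy / Sxx              # slope
b0 = my - b1 * mx           # intercept

SS_total = sum((yi - my) ** 2 for yi in y)
SS_resid = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
SS_expl = SS_total - SS_resid

MS_expl = SS_expl / 1       # one regression term: 1 degree of freedom
MS_resid = SS_resid / (n - 2)
F = MS_expl / MS_resid      # overall F statistic of the fit
```

A large F relative to the tabulated critical value indicates that the regression explains far more variation than the residual scatter can account for.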

F statistic (F): A measure of the overall significance of a regression model

A last means for model discrimination, when all methods described earlier are exhausted, is an assessment of the statistical performance of the model. Since the F test for model adequacy is a rather severe test that is only exceptionally satisfied, the F value for the global significance of the regression is a good measure for comparison. In addition, the widths of the individual confidence intervals of the parameters, as well as their correlation coefficients, may contribute to the selection of the best model. [Pg.1362]

In the regression model (Table 2), the effects of Leadership and Collaboration (independent variable) and Quality of Legislation (independent variable) on Continuous Improvements (dependent variable) confirmed the goodness of fit (R-squared = 0.38). The model was highly significant as assessed by the F-test (F(1, 496) = 53.81, p < 0.0001). [Pg.193]

The characteristic quantities obtained for the straight lines must be tested for their statistical significance and should be quoted with a confidence interval. Their standard deviation is a function of the residual standard deviation, sres, of the regression model. This is defined as ... [Pg.116]
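As a sketch of how the residual standard deviation propagates into a confidence interval, here is a straight-line fit on hypothetical data; the two-sided t value for 3 degrees of freedom at the 95% level (3.182) is a standard tabulated value.

```python
import math

# Hypothetical calibration-style data, approximately y = 2x
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [0.1, 2.2, 3.9, 6.1, 8.0]

n = len(x)
mx, my = sum(x) / n, sum(y) / n
Sxx = sum((xi - mx) ** 2 for xi in x)
b1 = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / Sxx
b0 = my - b1 * mx

SS_resid = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
s_res = math.sqrt(SS_resid / (n - 2))   # residual standard deviation
sd_b1 = s_res / math.sqrt(Sxx)          # standard deviation of the slope
t = 3.182                               # t(97.5%, 3 d.f.), tabulated
ci = (b1 - t * sd_b1, b1 + t * sd_b1)   # 95% confidence interval for the slope
```

Quoting the slope as b1 ± t·sd(b1) is the "confidence interval" form the text calls for.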

Principal component analysis (PCA) and principal component regression (PCR) were used to analyze the data [39,40]. PCR was used to construct calibration models to predict Ang II dose from spectra of the aortas. A cross-validation routine was used with NIR spectra to assess the statistical significance of the prediction of Ang II dose and collagen/elastin in mouse aortas. The accuracy of the PCR method in predicting Ang II dose from NIR spectra was determined by the F test and the standard error of performance (SEP) calculated from the validation samples. [Pg.659]

Several procedures are available to determine the reliability and statistical significance of the model. The performance of regression models is commonly measured by the explained variance for the response variable y, denoted R², and the residual standard deviation (s), calculated using the following equations ... [Pg.117]
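The excerpt's equations are cut off, but the standard forms of these two statistics for n observations and p variables are R² = 1 − Σ(y − ŷ)²/Σ(y − ȳ)² and s = √(Σ(y − ŷ)²/(n − p − 1)). A sketch (the helper name is hypothetical):

```python
import math

def explained_variance_and_s(y, y_hat, p):
    """R-squared and residual standard deviation for n observations and
    p independent variables (standard textbook forms)."""
    n = len(y)
    y_mean = sum(y) / n
    ss_resid = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
    ss_total = sum((yi - y_mean) ** 2 for yi in y)
    r2 = 1 - ss_resid / ss_total
    s = math.sqrt(ss_resid / (n - p - 1))
    return r2, s

# Tiny hypothetical example: four observations, one variable
r2, s = explained_variance_and_s(
    y=[1.0, 2.0, 3.0, 4.0], y_hat=[1.1, 1.9, 3.1, 3.9], p=1)
```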

From a theoretical point of view, the proper application of regression analysis requires the formulation of a working hypothesis, the design of experiments (i.e., compounds to be tested), the selection of a mathematical model, and a test of the statistical significance of the obtained result. In QSAR studies, this is pure theory. Reality is different: QSAR studies are most often retrospective, and in several cases many different variables are tested to find out whether some of them, alone or in combination, are able to describe the data. In principle, there are no objections to this approach, because QSAR equations should be used to derive new hypotheses and to design new experiments based on these hypotheses. Then the requirements for the application of statistical methods are fulfilled. [Pg.2317]

We now consider a type of analysis in which the data (which may consist of solvent properties or of solvent effects on rates, equilibria, and spectra) again are expressed as a linear combination of products as in Eq. (8-81), but now the statistical treatment yields estimates of both the ai and the xi. This method is called principal component analysis or factor analysis. A key difference between multiple linear regression analysis and principal component analysis (in the chemical setting) is that regression analysis adopts chemical models a priori, whereas in factor analysis the chemical significance of the factors emerges (if desired) as a result of the analysis. We will not explore the statistical procedure, but will cite some results. We have already encountered examples in Section 8.2 on the classification of solvents, and in the present section in the form of the Swain et al. treatment leading to Eq. (8-74). [Pg.445]


© 2024 chempedia.info