Big Chemical Encyclopedia


Jackknife analysis

Two non-parametric methods for hypothesis testing with PCA and PLS are cross-validation and the jackknife estimate of variance. Both methods are described in some detail in the sections describing the PCA and PLS algorithms. Cross-validation is used to assess the predictive property of a PCA or a PLS model. The distribution function of the cross-validation test statistic cvd-sd under the null hypothesis is not well known. However, for PLS, the distribution of cvd-sd has been determined empirically by computer simulation [24] for some particular types of experimental designs. In particular, the discriminant-analysis (or ANOVA-like) PLS analysis has been investigated in some detail, as has the situation with a one-dimensional Y. The reader is referred to this simulation study for detailed information; however, some tables of the critical values of cvd-sd at the 5% level are given in Appendix C. [Pg.312]
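As a generic illustration of the jackknife estimate of variance (a minimal sketch with made-up data, not the PCA/PLS-specific procedure described in the algorithm sections; the function and variable names are our own):

```python
import numpy as np

def jackknife_variance(x, stat):
    """Jackknife estimate of the variance of stat(x).

    Leaves out one observation at a time, recomputes the statistic,
    and uses the spread of the leave-one-out values:
        var_hat = (n - 1)/n * sum((theta_i - theta_bar)^2)
    """
    x = np.asarray(x)
    n = len(x)
    loo = np.array([stat(np.delete(x, i)) for i in range(n)])
    return (n - 1) / n * np.sum((loo - loo.mean()) ** 2)

rng = np.random.default_rng(0)
sample = rng.normal(loc=5.0, scale=2.0, size=50)   # hypothetical measurements
jk = jackknife_variance(sample, np.mean)
classic = sample.var(ddof=1) / len(sample)          # usual s^2/n for the mean
```

For the sample mean the jackknife variance reproduces the classical s²/n exactly, which makes a convenient sanity check before applying it to statistics with no closed-form variance.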

Figure 14-30 An example of application of weighted Deming regression analysis. The solid line is the estimated regression line, and the dotted line is the line of identity. The estimated 95% confidence bands obtained by the jackknife approach are the curved dashed lines.
Wu CFJ. Jackknife, bootstrap and other resampling methods in regression analysis (with discussion). Ann Stat 1986;14:1261-95. [Pg.407]

When a model is used for descriptive purposes, the components of model evaluation (goodness-of-fit, reliability, and stability) must be assessed. Model evaluation should be done in a manner consistent with the intended application of the PM model. The reliability of the analysis results can be checked by carefully examining diagnostic plots, key parameter estimates, standard errors, case deletion diagnostics (7-9), and/or sensitivity analysis as may seem appropriate. Confidence intervals (standard errors) for parameters may be checked using nonparametric techniques, such as the jackknife and bootstrapping, or the profile likelihood method. Model stability, that is, whether the covariates in the PM model are those that should be retained in the model, can be checked using the bootstrap (9). [Pg.226]
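The nonparametric interval estimation mentioned here can be sketched with a percentile bootstrap; this is a minimal generic example (all names and data are hypothetical), not the specific procedure of Ref. (9):

```python
import numpy as np

def bootstrap_percentile_ci(x, stat, level=0.95, n_boot=2000, seed=0):
    """Percentile bootstrap confidence interval for stat(x).

    Resamples the data with replacement, recomputes the statistic
    for each replicate, and reads the interval off the empirical
    percentiles of the replicate distribution.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x)
    reps = np.sort([stat(rng.choice(x, size=len(x), replace=True))
                    for _ in range(n_boot)])
    lo_i = int(np.floor((1 - level) / 2 * (n_boot - 1)))
    hi_i = int(np.ceil((1 + level) / 2 * (n_boot - 1)))
    return reps[lo_i], reps[hi_i]

rng = np.random.default_rng(7)
conc = rng.normal(loc=5.0, scale=1.0, size=100)   # hypothetical parameter sample
lo, hi = bootstrap_percentile_ci(conc, np.mean)
```

No distributional form is assumed, which is the point of checking a standard error obtained from asymptotic theory against a resampling-based interval.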

The preciseness of the primary parameters can be estimated from the final fit of the multiexponential function to the data, but these estimates are of doubtful validity if the model is severely nonlinear (35). The preciseness of the secondary parameters (in this case variability) is likely to be even less reliable. Consequently, the results of statistical tests carried out with preciseness estimated from the final fit could easily be misleading, hence the need to assess the reliability of model estimates. A possible way of reducing bias in parameter estimates and of calculating realistic variances for them is to subject the data to the jackknife technique (36, 37). The technique requires little by way of assumption or analysis. A naive Student t approximation for the standardized jackknife estimator (34) or the bootstrap (31, 38, 39) (see Chapter 15 of this text) can be used. [Pg.393]
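A minimal sketch of the jackknife technique via pseudo-values, as used for bias reduction and variance estimation (generic code, not the procedure of Refs. 36 and 37; the names and example data are ours):

```python
import numpy as np

def jackknife_bias_correct(x, stat):
    """Jackknife bias-corrected estimate via pseudo-values.

    Pseudo-values: p_i = n*theta_hat - (n-1)*theta_(i),
    where theta_(i) is the statistic with observation i deleted.
    Returns (bias-corrected estimate, its standard error).
    """
    x = np.asarray(x)
    n = len(x)
    theta_hat = stat(x)
    loo = np.array([stat(np.delete(x, i)) for i in range(n)])
    pseudo = n * theta_hat - (n - 1) * loo
    est = pseudo.mean()                        # bias-corrected estimate
    se = pseudo.std(ddof=1) / np.sqrt(n)       # SE for a naive Student t interval
    return est, se

rng = np.random.default_rng(1)
x = rng.exponential(scale=3.0, size=40)        # hypothetical data
plug_in = lambda a: a.var(ddof=0)              # deliberately biased statistic
est, se = jackknife_bias_correct(x, plug_in)
```

The mean of the pseudo-values is the bias-corrected estimate and their standard error feeds the naive Student t approximation mentioned above; for the plug-in variance the O(1/n) bias is removed exactly, recovering the usual unbiased estimator.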

An area related to model validation is influence analysis, which deals with how stable the model parameters are to influential observations (either individual concentration values or individual subjects), and model robustness, which deals with how stable the model parameters are to perturbations in the input data. Influence analysis has been dealt with in previous chapters. The basic idea is to generate a series of new data sets, where each new data set consists of the original data set with one unique subject removed or with a different block of data removed, just as jackknife data sets are generated. The model is refit to each of the new data sets and the change in the parameter estimates with each new data set is determined. Ideally, no subject should show... [Pg.256]
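A case-deletion influence analysis of the kind described, leave one subject out, refit, and record the parameter shift, might look like this for a simple linear model (an illustrative stand-in for refitting a PM model; the data and names are invented):

```python
import numpy as np

def influence_by_subject(subject_ids, X, y):
    """Jackknife-style case-deletion influence analysis.

    Refits an OLS model with each subject's block of rows removed
    and returns the shift in the coefficient vector relative to the
    full-data fit, so unusually influential subjects stand out.
    """
    full, *_ = np.linalg.lstsq(X, y, rcond=None)
    shifts = {}
    for sid in np.unique(subject_ids):
        keep = subject_ids != sid
        beta, *_ = np.linalg.lstsq(X[keep], y[keep], rcond=None)
        shifts[sid] = beta - full
    return full, shifts

# Five synthetic subjects, ten observations each; subject 4 is offset.
t = np.tile(np.arange(10.0), 5)
sid = np.repeat(np.arange(5), 10)
y = 2.0 + 3.0 * t
y[sid == 4] += 5.0                       # planted influential subject
X = np.column_stack([np.ones_like(t), t])
full, shifts = influence_by_subject(sid, X, y)
worst = max(shifts, key=lambda s: np.linalg.norm(shifts[s]))
```

Deleting the offset subject produces by far the largest coefficient shift, which is exactly the signature an influence analysis looks for.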

Sometimes, the distribution of the statistic must be derived under asymptotic or best-case conditions, which assume an infinite number of observations, such as the sampling distribution for a regression parameter, which is assumed normal. However, the asymptotic assumption of normality is not always valid. Further, sometimes the distribution of the statistic may not be known at all. For example, what is the sampling distribution for the ratio of the largest to smallest value in some distribution? Parametric theory is not entirely forthcoming with an answer. The bootstrap and jackknife, which are two types of computer-intensive analysis methods, could be used to assess the precision of a sample-derived statistic when its sampling distribution is unknown or when asymptotic theory may not be appropriate. [Pg.354]
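A sketch of using the bootstrap to get the precision of a statistic with no known sampling distribution, such as the max/min ratio mentioned above (names and data are hypothetical):

```python
import numpy as np

def bootstrap_se(x, stat, n_boot=2000, seed=0):
    """Bootstrap standard error of stat(x).

    Resamples the data with replacement, recomputes the statistic
    for each replicate, and returns the standard deviation of the
    replicates; no parametric sampling distribution is assumed.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x)
    reps = np.array([stat(rng.choice(x, size=len(x), replace=True))
                     for _ in range(n_boot)])
    return reps.std(ddof=1)

rng = np.random.default_rng(42)
x = rng.lognormal(size=30)                 # hypothetical skewed sample
ratio = lambda a: a.max() / a.min()        # no textbook sampling distribution
se = bootstrap_se(x, ratio)
```

The same resampling loop works for any statistic that can be recomputed on a resample, which is why these methods are attractive when asymptotic theory is silent.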

The most important statistical parameters r, s, and F and the 95% confidence intervals of the regression coefficients are calculated by Eqs. (20) to (23) (for details on Eqs. (20) to (23), see Refs. 39 to 42). For more details on linear (multiple) regression analysis and the calculation of different statistical parameters, as well as other validation techniques (e.g., the jackknife method and bootstrapping), see Refs. 33 and 39 to 42. [Pg.546]

The method for standardizing the residuals is reasonably straightforward and will not be demonstrated. However, the overall process of residual analysis by studentizing and jackknifing procedures that use hat matrices will be explored in detail in Chapter 8. [Pg.152]
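The jackknife residuals referred to here (externally studentized residuals computed from the hat matrix) can be sketched as follows; this is a generic OLS illustration with invented data, not the Chapter 8 worked example:

```python
import numpy as np

def jackknife_residuals(X, y):
    """Externally studentized (jackknife) residuals for OLS.

    Each raw residual is scaled by an error variance estimated with
    that observation deleted, using the hat matrix H = X(X'X)^-1 X',
    so a gross outlier does not inflate its own denominator.
    """
    n, p = X.shape
    H = X @ np.linalg.inv(X.T @ X) @ X.T
    h = np.diag(H)
    e = y - H @ y
    sse = e @ e
    s2_del = (sse - e**2 / (1.0 - h)) / (n - p - 1)  # leave-one-out variance
    return e / np.sqrt(s2_del * (1.0 - h))

rng = np.random.default_rng(3)
x = np.arange(20.0)
y = 1.0 + 2.0 * x + rng.normal(0.0, 0.5, size=20)
y[10] += 8.0                                # planted gross outlier
X = np.column_stack([np.ones_like(x), x])
t = jackknife_residuals(X, y)
```

The planted outlier produces the largest jackknife residual by a wide margin, which is why these residuals are the usual screening tool for suspect observations.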

Because this author prefers the jackknife procedure, we will use it for an example of a complete analysis. The same procedure would be done for calculating standardized and Studentized residuals. First, a stem-and-leaf display of the T(i) values was computed (Table 8.19). [Pg.317]

The -2.39 jackknife value at 6 h is extreme, but is not found to be suspect after reviewing the data records. Hence, in the process of our analysis and validation, two values were eliminated: 6.57 at 6 h, and 3.52 at the immediate sample time. All other suspicious values were checked out and not removed. A new regression conducted on the amended data set increased R, as well as reducing the b0 and b1 values. The new regression is considered... [Pg.322]

Chapter 8 aids the researcher in determining outlier values of the variables y and x. It also includes residual analysis schemas, such as standardized, Studentized, and jackknife residual analyses. Another important feature of... [Pg.511]

ID Analysis Pooled data set No. of variables Classification function (% correct) Jackknifed validation (% correct)... [Pg.237]

Dorfman DD, Berbaum KS, Metz CE (1992) Receiver operating characteristic rating analysis: generalization to the population of readers and patients with the jackknife method. Invest Radiol 27:723-731 [Pg.103]

FIGURE 6.4 Fifty percent majority rule consensus tree of 2348 most parsimonious trees (L = 808, CI = 54, RI = 79) from the analysis of 86 terminals. Jackknife (in italics) and Bremer support values are shown below branches; clades that also appear in the strict consensus topology are shown with 100 above the branch. [Pg.136]

Stepwise discriminant analysis (SDA). SDA was run with the BMDP7M program (7). The F-value for selecting a peak to enter into the classification functions, or to remove from them, was set at 4.0. The maximum number of steps was 10, and the precision of classification at every step was tested by the jackknifed classification method. [Pg.120]

The stepwise discriminant analysis. The stepwise discriminant analysis was run on the basis of eighteen compounds that had a rotated factor loading of more than 0.5 or less than -0.5 on the second factor axis. Furfuryl acetate, 5-methylfurfural, and an unknown compound (Peak No. 73) were selected by the stepwise discriminant analysis. At step 1, furfuryl acetate, with the largest F-value, was entered into the equation as the most discriminative compound. The results of the jackknifed classification at step 3 showed that the samples could be discriminated into each group with 100% correctness on the basis of the three components. Furfuryl acetate had a high F-value and was extracted as an important index for... [Pg.124]
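The jackknifed classification idea, scoring each sample with a rule trained on all the other samples, can be sketched generically; here a nearest-centroid rule stands in for BMDP7M's discriminant functions, and the data are synthetic:

```python
import numpy as np

def jackknifed_accuracy(X, labels):
    """Leave-one-out ("jackknifed") classification check.

    Each sample is classified by a nearest-centroid rule built from
    all other samples, so the sample being scored never influences
    its own classification. Returns the fraction classified correctly.
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    correct = 0
    for i in range(len(X)):
        mask = np.ones(len(X), dtype=bool)
        mask[i] = False
        centroids = {c: X[mask & (labels == c)].mean(axis=0)
                     for c in np.unique(labels[mask])}
        pred = min(centroids,
                   key=lambda c: np.linalg.norm(X[i] - centroids[c]))
        if pred == labels[i]:
            correct += 1
    return correct / len(X)

rng = np.random.default_rng(5)
A = rng.normal(0.0, 0.1, size=(10, 2))    # group 1 clustered near (0, 0)
B = rng.normal(5.0, 0.1, size=(10, 2))    # group 2 clustered near (5, 5)
X = np.vstack([A, B])
labels = np.array([1] * 10 + [2] * 10)
acc = jackknifed_accuracy(X, labels)
```

Because the held-out sample never trains its own classifier, the jackknifed percentage is a less optimistic figure than the resubstitution classification-function percentage reported alongside it.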

A final mention of data on real systems is worthwhile, although this paper is not concerned with comparisons with experimental data. It is nevertheless interesting that, by comparing the experimental data on S. cerevisiae with the theoretical distribution of Eq. 1 using the jackknife method [16,17], it is possible to locate the k parameter in the 95% confidence interval [0.84, 0.93]. Moreover, an analysis of the same data using Bayes factors [18,19] leads to rejection of the hypothesis that the network is precisely critical, since the probability that k = 1 given the data is smaller than 10 under a broad range of prior distributions. The interested reader is referred to [15] for further details. [Pg.37]

