Computer-intensive statistical methods

As might be surmised, computer intensive statistical analysis methods have become more popular and useful with the advent of modern personal computers having... [Pg.354]

Urban Hjorth, J.S. (1994). Computer Intensive Statistical Methods, Chapman and Hall, London,... [Pg.451]

Microarray experiments generate large and complex data sets consisting of, for example, lists of spot intensities and intensity ratios. Basically, the data obtained from microarray experiments provide information on the relative expression of genes corresponding to the mRNA sample of interest. Computational and statistical tools are required to analyze the large amount of data and to address the biological questions. To this end, a variety of analytical platforms are available, either free on the Web or via purchase of a commercially available product. [Pg.527]

Hpp describes the primary system by a quantum-chemical method. The choice is dictated by the system size and the purpose of the calculation. Two approaches to using a finite computer budget are found: if an expensive ab initio or density functional method is used, the number of configurations that can be afforded is limited. Hence, the computationally intensive Hamiltonians are mostly used in geometry optimization (molecular mechanics) problems (see, e.g., [66]). The second approach is to use cheaper and less accurate semi-empirical methods. This is the only choice when many conformations are to be evaluated, i.e., when molecular dynamics or Monte Carlo calculations with meaningful statistical sampling are to be performed. The drawback of semi-empirical methods is that they may be inaccurate to the extent that they produce qualitatively incorrect results, so their applicability to a given problem has to be established first [67]. [Pg.55]
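
The partitioning implied above can be written schematically as follows; only H_PP is defined in the excerpt, so the subscripts SS (secondary/environment part) and PS (coupling) are notation assumed here for illustration:

```latex
H \;=\; H_{PP} \;+\; H_{SS} \;+\; H_{PS}
```

where H_PP is treated quantum-chemically, H_SS describes the secondary region, and H_PS couples the two regions.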

Stochastic kinetics requires details of individual particle reactions. It is computer-intensive and produces a huge volume of output. In this sense, it is overparameterized. However, stochastic kinetics can be made consistent with the statistics of energy deposition and reaction. [Pg.229]

The most sophisticated and computationally demanding of the variational models is microcanonical VTST. In this approach one allows the optimum location of the transition state to be energy dependent. So for each k(E) one finds the position of the transition state that makes dk(E)/dq = 0. Then one Boltzmann weights each of these microcanonical rate constants and sums the result to find k_uni. There is general agreement that this is the most reliable of the statistical kinetic models, but it is also the one that is most computationally intensive. It is most frequently necessary for calculations on reactions with small barriers occurring at very high temperatures, for example, in combustion reactions. [Pg.943]
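
A sketch of the Boltzmann weighting described above; the formula is not given in the excerpt, and the notation ρ(E) for the reactant density of states is an assumption introduced here:

```latex
k_{\mathrm{uni}}(T) \;=\;
  \frac{\displaystyle\int_0^{\infty} k(E)\,\rho(E)\,e^{-E/k_{\mathrm B}T}\,\mathrm{d}E}
       {\displaystyle\int_0^{\infty} \rho(E)\,e^{-E/k_{\mathrm B}T}\,\mathrm{d}E}
```

where each microcanonical k(E) is evaluated at the dividing-surface position q that satisfies dk(E)/dq = 0.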

Although this approach is still used, it is undesirable for statistical reasons error calculations underestimate the true uncertainty associated with the equations (17, 21). A better approach is to use the equations developed for one set of lakes to infer chemistry values from counts of taxa from a second set of lakes (i.e., cross-validation). The extra time and effort required to develop the additional data for the test set is a major limitation to this approach. Computer-intensive techniques, such as jackknifing or bootstrapping, can produce error estimates from the original training set (53), without having to collect data for additional lakes. [Pg.30]
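
A minimal sketch of one way to obtain such error estimates from the training set alone, here an out-of-bag bootstrap around a linear calibration; the taxon counts, pH values, and model are invented placeholders, not data from the cited study:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical training set: taxon counts (X) and measured lake pH (y).
X = rng.poisson(20, size=(30, 3)).astype(float)
y = 5.0 + X @ np.array([0.02, -0.01, 0.03]) + rng.normal(0, 0.1, 30)

def fit_predict(X_train, y_train, X_new):
    """Ordinary least-squares calibration with an intercept term."""
    A = np.column_stack([np.ones(len(X_train)), X_train])
    coef, *_ = np.linalg.lstsq(A, y_train, rcond=None)
    return np.column_stack([np.ones(len(X_new)), X_new]) @ coef

# Bootstrap the prediction error without collecting a second set of lakes:
# refit on resampled lakes, evaluate on the lakes left out of each resample.
errors = []
for _ in range(1000):
    idx = rng.integers(0, len(y), len(y))          # sample lakes with replacement
    out = np.setdiff1d(np.arange(len(y)), idx)     # "out-of-bag" lakes
    if out.size == 0:
        continue
    pred = fit_predict(X[idx], y[idx], X[out])
    errors.append(np.sqrt(np.mean((pred - y[out]) ** 2)))

print(f"bootstrap (out-of-bag) RMSE of inferred pH: {np.mean(errors):.3f}")
```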

Figure 55. Comparison of relative intensity distributions of P-branch rotational lines of the (3,9) and (4,10) bands of N2+ spontaneous radiative transitions. Points are experimental data; solid lines are distributions computed using a statistical phase-space model to determine the N2+ vibrational-rotational distribution resulting from the reaction He+ + N2 → N2+ + He.

Mendes, B. and Tyler, D.E., Constrained M estimates for regression, in Robust Statistics, Data Analysis, and Computer Intensive Methods, Lecture Notes in Statistics No. 109, Rieder, H., Ed., Springer-Verlag, New York, 1996, pp. 299-320. [Pg.213]

Diaconis, P. and Efron, B. (1983). Computer Intensive Methods in Statistics. Sci. Am., 96-108. [Pg.558]

Uncertainties inherent to the risk assessment process can be quantitatively described using, for example, statistical distributions, fuzzy numbers, or intervals. Corresponding methods are available for propagating these kinds of uncertainties through the process of risk estimation, including Monte Carlo simulation, fuzzy arithmetic, and interval analysis. Computationally intensive methods (e.g., the bootstrap) that work directly from the data to characterize and propagate uncertainties can also be applied in ERA. Implementation of these methods for incorporating uncertainty can lead to risk estimates that are consistent with a probabilistic definition of risk. [Pg.2310]
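
A minimal sketch of Monte Carlo propagation of input uncertainty through a simple risk estimate; the dose model, the distributions, and the reference dose below are hypothetical placeholders, not values from the source:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 100_000

# Hypothetical input uncertainties described by statistical distributions.
concentration  = rng.lognormal(mean=np.log(2.0), sigma=0.5, size=n)  # mg/L
intake_rate    = rng.normal(loc=1.5, scale=0.2, size=n)              # L/day
body_weight    = rng.normal(loc=70.0, scale=10.0, size=n)            # kg
reference_dose = 0.05                                                # mg/kg/day

# Propagate the sampled inputs through the (illustrative) risk quotient calculation.
dose = concentration * intake_rate / body_weight
risk_quotient = dose / reference_dose

print(f"median risk quotient: {np.median(risk_quotient):.2f}")
print(f"95th percentile:      {np.percentile(risk_quotient, 95):.2f}")
print(f"P(risk quotient > 1): {np.mean(risk_quotient > 1):.3f}")
```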

It has been advocated that the area under the ROC curve is a relative measure of a test's performance. A Wilcoxon statistic (or, equivalently, the Mann-Whitney U-test) statistically determines which ROC curve has more area under it. Less computationally intensive alternatives, which are no longer necessary, have been described. These methods are particularly helpful when the curves do not intersect. When the ROC curves of two laboratory tests for the same disease intersect, they may offer quite different performances even though the areas under their curves are identical. The performance depends on the region of the curve (i.e., high sensitivity versus high specificity) chosen. Details on how to compare statistically individual points on two curves have been developed elsewhere. ... [Pg.413]
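
A short sketch of the Mann-Whitney U / ROC-area equivalence noted above, using invented test values for the diseased and non-diseased groups:

```python
import numpy as np
from scipy.stats import mannwhitneyu

rng = np.random.default_rng(1)
diseased     = rng.normal(6.0, 1.0, 80)   # hypothetical test results, disease present
non_diseased = rng.normal(5.0, 1.0, 120)  # hypothetical test results, disease absent

# The Mann-Whitney U statistic divided by the number of pairs equals the area
# under the ROC curve (the probability that a diseased case scores higher).
u, p = mannwhitneyu(diseased, non_diseased, alternative="greater")
auc = u / (len(diseased) * len(non_diseased))
print(f"AUC = {auc:.3f}, p-value = {p:.2g}")
```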

Bootstrapping involves the repetitive drawing of random samples with replacement from the observed population and computing statistics. A complete bootstrap of an observed population with eight variables would require the calculation of bootstrap statistics for 8^8 = 16,777,216 samples, quite a computer-intensive process. Therefore bootstrap samples are usually limited to hundreds or thousands of drawings. [Pg.420]
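
A small sketch of the counting argument above and of the usual compromise of drawing a limited number of random resamples; the eight data values are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(7)
sample = np.array([4.1, 5.3, 2.8, 6.0, 3.9, 5.7, 4.4, 3.2])  # eight observed values
n = len(sample)

print(f"complete bootstrap would need n**n = {n**n:,} resamples")  # 16,777,216

# In practice, a few hundred to a few thousand random resamples suffice.
B = 2000
boot_means = np.array([rng.choice(sample, size=n, replace=True).mean()
                       for _ in range(B)])
print(f"bootstrap standard error of the mean: {boot_means.std(ddof=1):.3f}")
```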

Therefore, various approaches for computing adjusted p values have been applied. These include permutation-adjusted p values, such as MaxT [55], which uses a two-sample Welch t-statistic (unequal variances) with step-down resampling procedures. Typically these adjusted p values are computed with on the order of 10,000 permutations, which is computationally intensive. Although these methods are effective with large numbers of replicates, this approach is unfortunately not effective when datasets have small numbers of samples per group [56]. ...
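
A simplified single-step sketch of the permutation adjustment described above (the full MaxT procedure of [55] is step-down); the expression matrix, group sizes, and permutation count are invented for illustration:

```python
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(3)

# Hypothetical expression matrix: 200 genes x 10 samples (5 per group).
genes, n_a, n_b = 200, 5, 5
data = rng.normal(size=(genes, n_a + n_b))
data[:5, :n_a] += 2.0                      # a few genes with a real group difference
labels = np.array([0] * n_a + [1] * n_b)

def welch_t(mat, lab):
    """Per-gene two-sample Welch t-statistic (unequal variances)."""
    t, _ = ttest_ind(mat[:, lab == 0], mat[:, lab == 1], axis=1, equal_var=False)
    return np.abs(t)

observed = welch_t(data, labels)

# Single-step maxT-style adjustment: compare each observed |t| with the permutation
# distribution of the maximum |t| over all genes. Note that with only 5 samples per
# group there are just C(10,5) = 252 distinct label splits, which is why such
# adjustments lose power for small group sizes. ~10,000 permutations is typical;
# a smaller number is used here to keep the sketch quick.
B = 2_000
max_t = np.empty(B)
for b in range(B):
    perm = rng.permutation(labels)
    max_t[b] = welch_t(data, perm).max()

adjusted_p = (1 + np.sum(max_t[None, :] >= observed[:, None], axis=1)) / (B + 1)
print("smallest adjusted p-values:", np.sort(adjusted_p)[:5])
```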

Sometimes the distribution of a statistic must be derived under asymptotic or best-case conditions that assume an infinite number of observations, like the sampling distribution of a regression parameter, which is assumed to be normal. However, the asymptotic assumption of normality is not always valid. Further, sometimes the distribution of the statistic may not be known at all. For example, what is the sampling distribution of the ratio of the largest to the smallest value in some distribution? Parametric theory is not entirely forthcoming with an answer. The bootstrap and jackknife, which are two types of computer-intensive analysis methods, can be used to assess the precision of a sample-derived statistic when its sampling distribution is unknown or when asymptotic theory may not be appropriate. [Pg.354]
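
A brief sketch of using the bootstrap to approximate the sampling distribution of a statistic with no convenient parametric form, here the ratio of the largest to the smallest value mentioned above; the data are arbitrary positive values:

```python
import numpy as np

rng = np.random.default_rng(5)
x = rng.lognormal(mean=1.0, sigma=0.4, size=25)   # arbitrary positive data

def max_min_ratio(v):
    return v.max() / v.min()

# Bootstrap the statistic whose sampling distribution is not known analytically.
B = 5000
boot = np.array([max_min_ratio(rng.choice(x, size=x.size, replace=True))
                 for _ in range(B)])

print(f"observed ratio: {max_min_ratio(x):.2f}")
print(f"bootstrap standard error: {boot.std(ddof=1):.2f}")
print("approximate 95% percentile interval:",
      np.round(np.percentile(boot, [2.5, 97.5]), 2))
```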

The jackknife has a number of advantages. First, the jackknife is a nonparametric approach to parameter inference that does not rely on asymptotic methods to be accurate. A major disadvantage is that a batch or script file will need to be written to delete the ith observation, recompute the test statistic, compute the pseudovalues, and then calculate the jackknife statistics; of course, this disadvantage applies to all other computer-intensive methods as well, so it might not be a disadvantage after all. Also, if θ is a nonsmooth parameter whose sampling distribution may be discontinuous, e.g., the median, the jackknife estimate of the variance may be quite poor (Pigeot, 2001). For example, data were simulated from a normal distribution with mean 100 and... [Pg.354]
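
A minimal sketch of the delete-one/pseudovalue recipe described above; the data and the choice of statistics are illustrative only:

```python
import numpy as np

def jackknife(x, stat):
    """Leave-one-out jackknife estimate, standard error, and pseudovalues."""
    n = len(x)
    theta_hat = stat(x)
    # Recompute the statistic with the i-th observation deleted.
    theta_i = np.array([stat(np.delete(x, i)) for i in range(n)])
    # Pseudovalues and the jackknife statistics built from them.
    pseudo = n * theta_hat - (n - 1) * theta_i
    est = pseudo.mean()
    se = np.sqrt(pseudo.var(ddof=1) / n)
    return est, se, pseudo

rng = np.random.default_rng(11)
data = rng.normal(loc=100.0, scale=10.0, size=30)   # illustrative data

est, se, _ = jackknife(data, np.mean)     # smooth statistic: works well
print(f"jackknife mean   = {est:.2f} +/- {se:.2f}")

est, se, _ = jackknife(data, np.median)   # nonsmooth statistic: variance can be poor
print(f"jackknife median = {est:.2f} +/- {se:.2f}")
```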

The most computationally intensive step in statistical or dynamical studies based on reaction path potentials is the determination of the MEP by numerical integration of Eq. (2) and the evaluation of potential energy derivatives along the path, so considerable attention should be directed toward doing this most efficiently. Kraka and Dunning [1] have presented a lucid description of many of the available methods for determining the MEP. Simple Euler integration of Eq. [Pg.58]
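
Eq. (2) is not reproduced in this excerpt; the sketch below assumes the usual steepest-descent definition of the MEP, dx/ds = -∇V/|∇V|, and follows it with simple Euler steps on a made-up two-dimensional potential (everything here is illustrative, not the procedure of Ref. [1]):

```python
import numpy as np

# Made-up 2D model potential with a saddle point near the origin and minima at (+/-1, 0).
def potential(p):
    x, y = p
    return (x**2 - 1.0)**2 + 2.0 * y**2

def gradient(p):
    x, y = p
    return np.array([4.0 * x * (x**2 - 1.0), 4.0 * y])

# Euler integration of the steepest-descent equation dx/ds = -grad V / |grad V|,
# started just off the saddle point along the downhill direction.
step = 0.01
path = [np.array([0.05, 0.02])]
while len(path) < 1000:
    g = gradient(path[-1])
    norm = np.linalg.norm(g)
    if norm < 1e-8:
        break                          # stationary point reached
    new = path[-1] - step * g / norm   # one Euler step along the path
    if potential(new) > potential(path[-1]):
        break                          # overstepped the minimum; stop
    path.append(new)

path = np.array(path)
print(f"path terminus: {np.round(path[-1], 3)}, V = {potential(path[-1]):.4f}")
```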

