Big Chemical Encyclopedia


Multiple testing, error controlling

The problem with this so-called multiplicity or multiple testing arises when we make a claim on the basis of a positive result that has been generated simply because we have undertaken lots of comparisons. Inflation of the type I error rate in this way is of great concern to the regulatory authorities; they do not want to be registering treatments that do not work. It is therefore necessary to control this inflation. The majority of this chapter is concerned with ways in which this potential problem can be controlled, but first we will explore the ways in which it can arise. [Pg.147]

In Chapter 10 we spoke extensively about the dangers of multiple testing and the associated inflation of the type I error. Methods were developed to control that inflation and account for multiplicity in an appropriate way. [Pg.213]

Suppose that we wish to make inferences on the parameters θi, i = 1, ..., g, where θi represents the logarithm of the ratio of the expression levels of gene i under normal and disease conditions. If the ith gene has no differential expression, then the ratio is 1 and hence θi = 0. In testing the g hypotheses H0i: θi = 0, i = 1, ..., g, suppose we set Ri = 1 if H0i is rejected and Ri = 0 otherwise. Then, for any multiple testing procedure, one could in theory provide a complete description of the joint distribution of the indicator variables R1, ..., Rg as a function of θ1, ..., θg in the entire parameter space. This is impractical if g > 2. Different controls of the error rate control different aspects of this joint distribution, with the most popular being weak control of the familywise error rate (FWER), strong control of the familywise error rate, and control of the false discovery rate (FDR). [Pg.144]
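The rejection indicators Ri described above can be sketched in a small simulation. Everything in the setup (g = 10 genes, unit-variance z statistics, a global null where every θi = 0) is an illustrative assumption, not from the text:

```python
# Hypothetical sketch: rejection indicators R_i for g per-gene hypotheses
# H0i: theta_i = 0, using two-sided z-tests at an unadjusted level alpha.
import math
import random

def z_test_p_value(z):
    """Two-sided p-value for a standard-normal test statistic."""
    # Phi(|z|) via the error function (math.erf is in the stdlib).
    phi = 0.5 * (1.0 + math.erf(abs(z) / math.sqrt(2.0)))
    return 2.0 * (1.0 - phi)

random.seed(0)
g = 10
theta = [0.0] * g            # global null: no gene is differentially expressed
alpha = 0.05

# One simulated experiment: observed z statistic = theta_i + unit noise.
z_stats = [t + random.gauss(0.0, 1.0) for t in theta]
p_values = [z_test_p_value(z) for z in z_stats]

# R_i = 1 if H0i is rejected, 0 otherwise.
R = [1 if p < alpha else 0 for p in p_values]
print("rejections:", R, "total:", sum(R))
```

Under the global null, any Ri = 1 is a type I error; the error-rate criteria named above (weak/strong FWER, FDR) differ in which functionals of the joint distribution of R1, ..., Rg they constrain.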

Statistics based on distributions of test results from large numbers of patients are useful for detecting systematic errors (shifts and drifts) but are of no value for detecting random errors (increased variability or scatter). They are useful adjuncts to the fundamental control procedures, which use stable control materials, but should not be substituted for them. Patient values include numerous sources of variation (demographic, biological, pathological, and preanalytical; see Chapter 17) in addition to the analytical variation caused by the analytical method. As a result, individual test values have too much variability to have any utility for QC; however, the mean of multiple test values or groups of patients is more stable and therefore may be useful for control purposes. [Pg.512]
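The stability of the mean follows from the fact that the variance of a mean of n values is σ²/n. A minimal sketch, with invented numbers (mean 100, SD 10, batches of 25) standing in for real patient results:

```python
# Illustrative sketch: batch means of patient results are far more stable
# than individual results, which is why means can serve QC purposes even
# when single values cannot. All numbers here are made up.
import random
import statistics

random.seed(1)
sigma = 10.0                      # assumed total (biological + analytical) SD
individuals = [random.gauss(100.0, sigma) for _ in range(5000)]

n = 25                            # assumed batch size
batch_means = [statistics.mean(individuals[i:i + n])
               for i in range(0, len(individuals), n)]

print("SD of individual values:", round(statistics.stdev(individuals), 2))
print("SD of batch means:      ", round(statistics.stdev(batch_means), 2))
# The second SD should be close to sigma / sqrt(n) = 2.0.
```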

The issue of type I error inflation caused by multiple testing appears in many guises in the realm of new drug development. This issue is of great importance to decision-makers, and we discuss this topic again later in the chapter. For now, we have not yet provided a full answer to our research question: our description of analysis of variance is incomplete without a discussion of at least one analysis method that controls the overall type I error rate when evaluating pairwise comparisons from an ANOVA. [Pg.160]

In practical terms, this means that if we perform multiple tests and make multiple inferences, each one at a reasonably low error probability, the likelihood that some of these inferences will be erroneous could be appreciable. To correct for this, one must conduct each individual test at a decreased significance level, with the result that either the power of the tests will be reduced as well, or the sample size must be increased to accommodate the desired power. This could make the trial prohibitively expensive. Statisticians sometimes refer to the need to adjust the significance level so that the experimentwise error rate is controlled as the statistical penalty for multiplicity. [Pg.251]
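The size of this penalty is easy to tabulate. A short sketch of two common per-test level adjustments (the choice of family size m and family-wise level 0.05 is illustrative):

```python
# Sketch of the "statistical penalty": to hold the experimentwise
# (familywise) error rate at alpha_family across m independent tests,
# each individual test must be run at a much smaller per-test level.
alpha_family = 0.05
for m in (1, 5, 10, 50):
    bonferroni = alpha_family / m                     # conservative, always valid
    sidak = 1.0 - (1.0 - alpha_family) ** (1.0 / m)   # exact for independent tests
    print(f"m={m:3d}  Bonferroni alpha={bonferroni:.5f}  Sidak alpha={sidak:.5f}")
```

At m = 10 each test already runs at roughly the 0.005 level, which is why power drops or sample sizes must grow.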

To save the time and effort of experiments confirming the selected biomarker candidates, it is essential to control error rates strictly through careful significance evaluation. Error rates can be controlled more carefully and strictly by using multiple-testing-adjusted p-values rather than raw p-values. [Pg.78]

A number of procedures for controlling error rates have been developed to solve the multiple-testing problem (Dudoit et al., 2003). The Bonferroni procedure for controlling the FWER at level α rejects any hypothesis Hj with unadjusted p-value... [Pg.80]
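A minimal sketch of the Bonferroni rule: reject Hj when its unadjusted p-value falls below α/g, which is equivalent to the adjusted p-value min(1, g·pj) falling below α. The p-values below are invented for illustration:

```python
# Bonferroni procedure for a family of g hypotheses.
def bonferroni(p_values, alpha=0.05):
    g = len(p_values)
    adjusted = [min(1.0, g * p) for p in p_values]   # adjusted p-values
    rejected = [p_adj < alpha for p_adj in adjusted]  # same as p < alpha / g
    return adjusted, rejected

raw = [0.001, 0.012, 0.030, 0.200]   # illustrative unadjusted p-values
adjusted, rejected = bonferroni(raw)
print(adjusted)
print(rejected)   # only the first two hypotheses are rejected
```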

Pollard, K. S., and van der Laan, M. J. (2005). Resampling-based multiple testing with asymptotic strong control of type I error. Preprint, Division of Biostatistics, University of California, Berkeley, CA. [Pg.88]

Statistical Analysis. Analysis of variance (ANOVA) of toxicity data was conducted using SAS/STAT software (version 8.2; SAS Institute, Cary, NC). All toxicity data were transformed (square root, log, or rank) before ANOVA. Comparisons among multiple treatment means were made by Fisher's LSD procedure, and differences between individual treatments and controls were determined by one-tailed Dunnett's or Wilcoxon tests. Statements of statistical significance refer to a probability of type I error of 5% or less (p ≤ 0.05). Median lethal concentrations (LC50) were determined by the trimmed Spearman-Karber method using TOXSTAT software (version 3.5; Lincoln Software Associates, Bisbee, AZ). [Pg.96]

If a carrier solvent has been used, it is critical to compare the solvent control to the control treatment to ensure comparability. The common Student's t-test can be used to compare the two groups. If any differences exist, then the solvent control must be used as the basis of comparison. Unfortunately, a t-test is not particularly powerful with typical data sets. In addition, multiple endpoints are usually assessed in a chronic toxicity test. The chance of a type I error, stating that a difference exists when it does not, is a real possibility with multiple endpoints under consideration. [Pg.54]

We touched on this problem in Chapter 9, where we drew attention to Cournot's criticism of multiple comparisons. To use the language of hypothesis testing, the problem is that as we carry out more and more tests, the probability of making at least one type I error increases. This probability of at least one type I error is sometimes referred to as the family-wise error rate (FWER) (Benjamini and Hochberg, 1995). Thus, controlling the type I error rates of individual tests does not guarantee control of the FWER. To put... [Pg.149]
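For m independent tests each run at level α, the FWER is 1 − (1 − α)^m, which grows quickly with m. A short numeric sketch (the family sizes chosen are illustrative):

```python
# FWER for m independent tests, each at level alpha:
# P(at least one type I error) = 1 - (1 - alpha)^m.
alpha = 0.05
for m in (1, 2, 5, 10, 20, 100):
    fwer = 1.0 - (1.0 - alpha) ** m
    print(f"{m:4d} tests -> FWER = {fwer:.3f}")
```

With only 10 tests the chance of at least one false positive is already about 40%, which is why per-test control does not guarantee family-wise control.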

Closed test procedure. A structured approach to multiple comparisons and controlling the type I error rate, whereby all higher-level hypotheses in which lower-level hypotheses are implicated must be rejected before rejection of the lower-level hypotheses themselves. [Pg.459]
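The closure principle in the definition above can be sketched for a small family of hypotheses. This is a hedged illustration, not a complete implementation: each intersection hypothesis is tested here with a Bonferroni local test (one of several valid choices), and an elementary hypothesis Hi is rejected only if every intersection hypothesis containing it is rejected:

```python
# Closed test procedure over g elementary hypotheses, with a Bonferroni
# local test for each intersection hypothesis. Illustrative p-values.
from itertools import combinations

def closed_test(p_values, alpha=0.05):
    g = len(p_values)
    indices = range(g)

    def local_test(subset):
        # Bonferroni test of the intersection hypothesis over `subset`.
        return min(p_values[i] for i in subset) < alpha / len(subset)

    rejected = []
    for i in indices:
        # H_i is rejected only if every subset containing i passes its
        # local test (the "higher-level" hypotheses in the definition).
        ok = all(local_test(s)
                 for r in range(1, g + 1)
                 for s in combinations(indices, r)
                 if i in s)
        rejected.append(ok)
    return rejected

print(closed_test([0.004, 0.020, 0.300]))   # [True, True, False]
```

Note that the closed procedure rejects the second hypothesis (p = 0.020) even though a plain Bonferroni cutoff of 0.05/3 ≈ 0.0167 would not, while still controlling the FWER strongly; this extra power is the main appeal of closed testing.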

