Outliers in measurements

The detection of outliers, particularly when working with a small number of samples, is discussed in the following papers. Efstathiou, C. Stochastic Calculation of Critical Q-Test Values for the Detection of Outliers in Measurements, J. Chem. Educ. 1992, 69, 733-736. [Pg.102]

The field of outlier detection and treatment is considerable, and a rigorous mathematical discussion is well beyond any treatment that is possible here. Moreover, the practice in the treatment of analytical results is usually simplified, since the number of observations is often not very large. The two most common methods used by analysts to detect outliers in measured data are versions of the Q-test (Refs. 1-3, 6) and Chauvenet's criterion (Refs. 4-6), both of which assume that the data are sampled from a population that is normally distributed. [Pg.1426]
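As a rough illustration of the two tests named above, the following Python sketch applies Dixon's Q-test to the most extreme value of a small sample and Chauvenet's criterion to every value. The function names, the single-pass form of Chauvenet's criterion, and the table of 95% critical Q values are assumptions made for this example and are not taken from the cited references; the critical values should be checked against whichever table is actually in use.

```python
import math
import statistics

# Commonly tabulated two-sided critical Q values at 95% confidence for
# n = 3..10 (an assumption of this sketch; verify against your own reference).
Q_CRIT_95 = {3: 0.970, 4: 0.829, 5: 0.710, 6: 0.625,
             7: 0.568, 8: 0.526, 9: 0.493, 10: 0.466}


def q_test(values, q_crit=None):
    """Return (suspect, Q, is_outlier) for the most extreme value."""
    data = sorted(values)
    n = len(data)
    spread = data[-1] - data[0]
    if n < 3 or spread == 0:
        raise ValueError("need at least 3 values that are not all equal")
    gap_low = data[1] - data[0]      # gap between the two smallest values
    gap_high = data[-1] - data[-2]   # gap between the two largest values
    if gap_high >= gap_low:
        suspect, gap = data[-1], gap_high
    else:
        suspect, gap = data[0], gap_low
    q = gap / spread                 # Q = gap / range
    if q_crit is None:
        q_crit = Q_CRIT_95[n]        # KeyError if n is outside the table
    return suspect, q, q > q_crit


def chauvenet(values):
    """Return the values rejected by one pass of Chauvenet's criterion."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    if sd == 0:
        return []
    rejected = []
    for x in values:
        z = abs(x - mean) / sd
        # Two-sided tail probability of a standard normal deviate
        p = math.erfc(z / math.sqrt(2))
        # Reject when fewer than half an observation this extreme is expected
        if n * p < 0.5:
            rejected.append(x)
    return rejected
```

For the ten values 0.189, 0.167, 0.187, 0.183, 0.186, 0.182, 0.181, 0.184, 0.181, 0.177, the suspect value is 0.167 with Q of about 0.455. That is below the 95% critical value of 0.466 assumed here, so the value is retained, although it would be rejected against the commonly tabulated 90% value of 0.412. Both tests rest on the normality assumption stated above.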

Most techniques for process data reconciliation start with the assumption that the measurement errors are random variables obeying a known statistical distribution, and that the covariance matrix of measurement errors is given. In Chapter 10 direct and indirect approaches for estimating the variances of measurement errors are discussed, as well as a robust strategy for dealing with the presence of outliers in the data set. [Pg.26]

Only a few publications in the literature have dealt with this problem. Almasy and Mah (1984) presented a method for estimating the covariance matrix of measurement errors by using the constraint residuals calculated from available process data. Darouach et al. (1989) and Keller et al. (1992) have extended this approach to deal with correlated measurements. Chen et al. (1997) extended the procedure further, developing a robust strategy for covariance estimation, which is insensitive to the presence of outliers in the data set. [Pg.203]

Figures 1 to 4 illustrate the results of the reconciliation for the four variables involved. As can be seen, this approach does not completely eliminate the influence of the outliers. For some of the variables, the prediction after reconciliation actually deteriorates because of the presence of outliers in some of the other measurements. This is in agreement with the findings of Albuquerque and Biegler (1996), in the sense that the results of this approach can be very misleading if the gross error distribution is not well characterized.

In the literature, the above diagnostic measures are known under different names. Instead of the score distance from Equation 3.27, which measures the deviation of each observation within the PCA space, the Hotelling T² test is often considered. Using this test, a confidence boundary can be constructed, and objects falling outside this boundary can be considered as outliers in the PCA space. It can be shown that this concept is analogous to the concept of the score distance. Moreover, the score distances are in fact Mahalanobis distances within the PCA space. This is easily... [Pg.94]
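The relationship between the two measures can be written out as follows. Equation 3.27 is not reproduced in this excerpt, so the notation here is assumed: t_ik is the score of object i on principal component k, v_k is the variance of the scores on component k, and a is the number of retained components.

```latex
\mathrm{SD}_i = \sqrt{\sum_{k=1}^{a} \frac{t_{ik}^{2}}{v_k}},
\qquad
T_i^{2} = \sum_{k=1}^{a} \frac{t_{ik}^{2}}{v_k} = \mathrm{SD}_i^{2}
```

Because PCA scores are uncorrelated, their covariance matrix is diagonal with entries v_k, so the Mahalanobis distance of a centered score vector from the origin reduces to the expression for SD_i. Under these assumptions a boundary on T² is equivalent to a boundary on the score distance.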

The methods of robust statistics have recently been used for the quantitative description of series of measurements that comprise few data together with some outliers [DAVIES, 1988; RUTAN and CARR, 1988]. Advantages over classical outlier tests, such as those according to DIXON [SACHS, 1992] or GRUBBS [SCHEFFLER, 1986], occur primarily when outliers towards both the maximum and the minimum are found simultaneously. Such cases almost always occur in environmental analysis without being outliers in the classical sense which should be eliminated from the set of data. The foundations of robust statistics, particularly those of median statistics, are described in detail by TUKEY [1972], HUBER [1981], and HAMPEL et al. [1986] and in an overview also by DANZER [1989]; only a brief presentation of the various computation steps shall be given here. [Pg.342]
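The computation steps themselves are not included in this excerpt. As a generic illustration of median statistics, and not necessarily the procedure the source goes on to describe, the sketch below computes the median as a robust location estimate and the scaled median absolute deviation (MAD) as a robust spread estimate. The function name is hypothetical; the factor 1.4826 is the usual constant that makes the MAD comparable to a standard deviation for normally distributed data.

```python
import statistics


def median_and_mad(values, scale=1.4826):
    """Return (median, scaled MAD) as robust location and spread estimates."""
    med = statistics.median(values)
    # Median of the absolute deviations from the median
    abs_dev = [abs(x - med) for x in values]
    mad = statistics.median(abs_dev)
    return med, scale * mad
```

With extreme values at both ends, for example 2.1, 2.3, 2.2, 2.4, 9.0, 0.1, 2.2, the median stays at 2.2 and the scaled MAD near 0.15, whereas the mean and standard deviation are pulled strongly by 9.0 and 0.1. No value has to be discarded to obtain these estimates.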

The very wide variation of the lead content of the investigated soil samples is, moreover, conspicuous. RLS regression yields qualitatively comparable results, but with lower correlation coefficients (see Tab. 9-6). This is primarily because many outliers were eliminated from the data obtained (Tab. 9-7). Since the measuring points weighted with w_i = 0 are not outliers in the classical sense, but environment-related variability, preference should be given to LMS regression when environmental data have to be analyzed. The RLS regression should primarily be reserved for cases with real outliers. [Pg.346]

However, the correlation coefficient of 0.7854 for the final curve fitting effort indicates the presence of many unexplained outlier points. One of the possible concerns was an inherent error in measuring the height of the powder bed from the wet mass density. [Pg.4090]

Avoid the line frequency and harmonics. Modern impedance instruments provide very effective filters for stochastic noise, but these filters are generally inadequate for measurements conducted at the line frequency. The resulting measurements generally appear as outliers in an impedance spectrum, and such outliers have a profound impact on nonlinear regression used to extract parameters from the data. Measurement of impedance should be avoided at line frequency and its first harmonic, i.e., 60 ± 5 Hz and 120 ± 5 Hz in the United States and 50 ± 5 Hz and 100 ± 5 Hz in Europe. [Pg.149]
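One way to apply this advice when planning a frequency sweep is to remove candidate frequencies that fall inside the excluded windows before measuring. The sketch below reads the windows as ±5 Hz around the line frequency and around twice the line frequency; the function and parameter names are hypothetical and do not belong to any instrument's software.

```python
def near_line_frequency(freq_hz, line_hz=60.0, half_width_hz=5.0):
    """True if freq_hz is within half_width_hz of line_hz or 2 * line_hz."""
    excluded_centers = (line_hz, 2 * line_hz)
    return any(abs(freq_hz - c) <= half_width_hz for c in excluded_centers)


def drop_line_frequencies(freqs_hz, line_hz=60.0, half_width_hz=5.0):
    """Return the planned frequencies that lie outside the excluded windows."""
    return [f for f in freqs_hz
            if not near_line_frequency(f, line_hz, half_width_hz)]
```

For a 60 Hz supply, a planned list of 10, 50, 58, 60, 66, 100, 118, 126, and 1000 Hz would lose 58, 60, and 118 Hz. Passing line_hz=50.0 instead excludes the windows around 50 Hz and 100 Hz.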

Figure 42.6 shows the distribution of mean TPAR obtained from simulating 50 replicates (i.e., M2 = 50) of the base data set with one tissue concentration value inflated to create an outlier in each replicate. The effect of one outlier can be measured by how big the distance is from the original mean TPAR value of... [Pg.1046]
