Outlier samples

System installation in a permanent location may require a sample conditioning system featuring some degree of automation, such as automatic cleaning (the system illustrated above features such a system) and outlier sample collection, as well as an interface to an existing control system or process computer. The latter may require that the system operate with a standardized communications protocol, such as Modbus for the chemical industry. Certain specialized industries use different protocols; the semiconductor industry, for example, uses the SECS and SECS-II protocols. A standardized approach designated the Universal Fieldbus is another method/protocol for process analyzers that is being supported by certain hardware manufacturers. [Pg.181]

Figure 12.27 (A) Scatter plot of the Hotelling T² and Q residual statistics associated with the samples in the process spectroscopy calibration data set, obtained from a PCA model built on the data after obvious outliers were removed. The dashed lines represent the 95% confidence limit of the respective statistic. (B) The spectra used to generate the plot in (A), denoting one of the outlier samples.

This plot can be used to identify outlier samples. [Pg.137]

A plot of the measurement vectors of all of the samples in the training set is shown in Figure 4.61 (an offset for each class was added for clarity). The measurement vectors within a class have similar features and there is significant overlap of features between classes. Examine this plot for outlier samples and/or measurement variables as well as an indication of the need for preprocessing. In this case, all measurements appear to be reasonable and preprocessing does not appear to be warranted. [Pg.253]

Statistical Prediction Error vs. Sample Number Plot (Sample Diagnostic) The statistical prediction errors for the validation data are shown in Figure 5-84. There are no samples which have an error that is unusual relative to the rest of the validation data. This further confirms the earlier conclusion that there are no outlier samples. The maximum of 0.029 will be used for assessing the reliability of prediction in Habit 6. [Pg.321]

Summary of Prediction Diagnostic Tools for PLS/PCR, Example 1 The predicted concentrations of component A in 17 of the 20 samples were deemed reliable by the prediction diagnostics. Therefore, these predictions are expected to be within 0.25 of the true value. Four samples were identified as unusual. The predicted concentration of component A in sample 13 was accepted despite being outside the range of the calibration because of the acceptable value. If predictions are consistently outside of the calibration range, it is prudent to consider expanding the range of the model. The predictions of the other three outlier samples were not considered to be reliable. [Pg.340]

Figure 4.24 Studentised residuals versus leverage plot to detect outlier samples on the calibration set: (a) general rules displayed on a hypothetical case and (b) diagnostics for the data from the worked example (Figure 4.9, mean centred, four factors in the PLS model).

The two outlier samples were not used during later data analysis. All other points except the soil surface samples are consistent with normal statistical scatter in measured data. [Pg.121]

When the laboratory value is plotted against the NIR predicted value for the calibration sample set, some points may lie well away from the computed regression line. This will, of course, reduce the correlation between laboratory and NIR data and increase the SEC or SEP. These samples may be outliers. The statistic h_i describes the leverage, or effect, of an individual sample upon a regression. If h_i exceeds a particular value, the sample may be flagged as an outlier. Evaluation criteria for selecting outliers, however, are somewhat subjective, so expertise in multivariate methods is required to make outlier selection effective. [Pg.2249]
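The leverage statistic described above can be sketched as the diagonal of the hat matrix of the mean-centred calibration data. This is a minimal illustration, not the excerpt's own procedure: the function name `leverages`, the synthetic data, and the cut-off of three times the mean leverage are all assumptions, since the excerpt does not state which threshold is used.

```python
import numpy as np

def leverages(X):
    """Leverage h_i of each sample: diagonal of H = Xc (Xc^T Xc)^-1 Xc^T."""
    Xc = X - X.mean(axis=0)                      # mean-centre the predictors
    # Thin SVD: H = U U^T, so h_i is the squared norm of row i of U.
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    U = U[:, s > 1e-12 * s.max()]                # drop numerically null directions
    return np.sum(U**2, axis=1)

# Hypothetical calibration set: 30 samples, 3 variables, one extreme sample.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 3))
X[7] += 8.0                                      # assumed outlier, for illustration only

h = leverages(X)
threshold = 3 * X.shape[1] / X.shape[0]          # assumed rule of thumb: 3 x mean leverage
flagged = np.flatnonzero(h > threshold)
print("flagged sample indices:", flagged)
```

Because the cut-off is a convention rather than a fixed rule, a flagged sample is a candidate for inspection, not an automatic rejection.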

Another powerful tool in seeking out outlier samples is the spectral residual. This was discussed briefly in an earlier section. Similar to looking for concentration outliers, spectral outliers are detected by using a model for which the optimum number of factors has been determined by cross-validation. [Pg.135]

The spectral residual plot in Fig. 15 was chosen to illustrate an obvious outlier (sample No. 45). It is necessary to use the F test method to determine that the samples are indeed outliers, in the same manner that is used for concentration outliers. For spectral residuals, the F ratio is calculated as... [Pg.135]

Once again, it is desirable to have a more statistical measure of a sample's potential to be an outlier than simple visual inspection. For score clusters, it is possible to use a measure of the Mahalanobis distance (Figs. 16 and 17). This is calculated as the distance of the potential outlier sample point from the mean of all the remaining points in the cluster. The distance is scaled for the range of variation in the cluster in all dimensions, and a probability weight is then assigned to the sample in terms of standard deviations. Any sample that lies more than 3 standard deviations from the mean can be considered suspicious. [Pg.137]
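The distance described above can be sketched as a leave-one-out Mahalanobis distance: each sample is compared with the mean and covariance of the remaining points. The function name `loo_mahalanobis` and the synthetic two-dimensional scores are assumptions made for illustration; only the 3 standard deviation cut-off comes from the text.

```python
import numpy as np

def loo_mahalanobis(scores, i):
    """Mahalanobis distance of sample i from the mean of the remaining samples."""
    rest = np.delete(scores, i, axis=0)
    mean = rest.mean(axis=0)
    cov = np.atleast_2d(np.cov(rest, rowvar=False))   # scales for spread in every dimension
    diff = scores[i] - mean
    return float(np.sqrt(diff @ np.linalg.solve(cov, diff)))

# Hypothetical PCA scores for one cluster, with one planted outlier.
rng = np.random.default_rng(1)
scores = rng.normal(size=(40, 2))
scores[12] = [6.0, -6.0]                              # assumed outlier, for illustration only

d = np.array([loo_mahalanobis(scores, i) for i in range(len(scores))])
suspicious = np.flatnonzero(d > 3.0)                  # 3 standard deviation rule from the text
print("suspicious sample indices:", suspicious)
```

Excluding the candidate from the mean and covariance keeps a gross outlier from inflating the very spread it is being judged against.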

By calculating the sum of the squares of the spectral residuals across all the wavelengths, an additional representative value can be generated for each spectrum. The spectral residual is effectively a measure of the amount of each spectrum left over in the secondary or noise vectors. This value is the basis of another type of discrimination method known as SIMCA (Refs. 13, 36). This is similar to performing an F test on the spectral residual to determine outliers in a training set (see Outlier Sample Detection in Chapter 4). In fact, one group combined the PCA-Mahalanobis distance method with SIMCA to provide a biparametric method of discriminant analysis (Ref. 41). In this method, both the Mahalanobis distance and the SIMCA test on the spectral residual had to pass in order for a sample to be classified as a match. [Pg.177]
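The sum of squared spectral residuals mentioned above can be sketched with a plain PCA model. This is only the residual calculation, not a SIMCA implementation or the F test; the function name `q_residuals`, the synthetic two-band spectra, and the choice of two factors are assumptions for illustration.

```python
import numpy as np

def q_residuals(X, n_factors):
    """Sum of squared spectral residuals per spectrum after an n_factors PCA model."""
    Xc = X - X.mean(axis=0)                       # mean-centre the spectra
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    P = Vt[:n_factors].T                          # loadings, shape (wavelengths, factors)
    E = Xc - Xc @ P @ P.T                         # part of each spectrum left unmodelled
    return np.sum(E**2, axis=1)

# Hypothetical spectra: two overlapping bands with varying concentrations plus noise.
rng = np.random.default_rng(2)
x = np.arange(50)
band = lambda c, w: np.exp(-0.5 * ((x - c) / w) ** 2)
conc = rng.uniform(0.0, 1.0, size=(30, 2))
spectra = conc @ np.vstack([band(15, 5.0), band(35, 5.0)])
spectra += rng.normal(scale=0.01, size=spectra.shape)
spectra[5] += 0.3 * band(25, 1.5)                 # assumed extra peak in one sample

q = q_residuals(spectra, n_factors=2)
print("largest residual at sample index:", int(np.argmax(q)))
```

A spectrum with a feature that the retained factors cannot describe leaves most of that feature in the residual, which is why its sum of squares stands out.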

As with quantitative models, outlier samples in the training set can have an unwanted influence on the discrimination ability of the model. Many of the same techniques (spectral residual plots and cluster plots) used in quantitative models can also be used to check for outliers in discriminant models. However, keep in mind the purpose of the discriminant analysis experiment: to build a model that can accurately match a spectrum to the training group but allow enough variation in the model to compensate for the natural variations seen in real samples. [Pg.187]

Samples for recalibration should be selected on the basis of spectral characteristics. As was stated in Section 17.4.2, the same program used to select samples for the calibration can now be used to choose samples for recalibration. The current calibration file is used as a library and the new population of samples is divided into similar samples, samples to be added to the calibration, and outlier samples. [Pg.382]

Calculate the average values of the Cp and Cpk capability indices for the photolithography thickness data in Example 21.2. Omit the two outliers (samples 5 and 15), and assume that the upper and lower specification limits for the photoresist thickness are USL = 235 Å and LSL ... [Pg.421]
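For orientation, the usual textbook definitions are Cp = (USL − LSL)/(6σ) and Cpk = min(USL − μ, μ − LSL)/(3σ). The sketch below applies them after dropping flagged samples. The readings, the outlier positions, and both specification limits are hypothetical placeholders, not the data or limits of Example 21.2.

```python
import numpy as np

def capability_indices(x, lsl, usl):
    """Return (Cp, Cpk) from the sample mean and sample standard deviation."""
    x = np.asarray(x, dtype=float)
    mu = x.mean()
    sigma = x.std(ddof=1)                          # sample standard deviation
    cp = (usl - lsl) / (6.0 * sigma)               # potential capability (ignores centring)
    cpk = min(usl - mu, mu - lsl) / (3.0 * sigma)  # penalises an off-centre process
    return cp, cpk

# Hypothetical thickness readings; NOT the data of Example 21.2.
thickness = np.array([210.0, 212.0, 208.0, 240.0, 211.0, 209.0, 213.0, 180.0])
outlier_idx = [3, 7]                               # zero-based positions of the assumed outliers
kept = np.delete(thickness, outlier_idx)

cp, cpk = capability_indices(kept, lsl=200.0, usl=220.0)   # hypothetical limits
print(f"Cp = {cp:.2f}, Cpk = {cpk:.2f}")
```

Leaving the outliers in would inflate σ and depress both indices, which is why the exercise asks for them to be omitted first.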
