PCA analysis

To construct the reference model, the interpretation system required routine process data collected over a period of several months. Cross-validation was applied to detect and remove outliers. Only data corresponding to normal process operations (that is, when top-grade product is made) were used in the model development. As stated earlier, the system ultimately involved two analysis approaches, both reduced-order models that capture dominant directions of variability in the data. A PLS analysis using two loadings explained about 60% of the variance in the measurements. A subsequent PCA analysis on the residuals showed that five principal components explain 90% of the residual variability. [Pg.85]
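The step of counting how many principal components are needed to reach a given share of the residual variability can be sketched as follows. This is a minimal illustration, not the system's actual implementation: the PLS stage is not reproduced, the residual matrix is synthetic, and the function names `explained_variance_ratio` and `components_needed` are hypothetical.

```python
import numpy as np

def explained_variance_ratio(X):
    """Fraction of the total variance captured by each principal component."""
    Xc = X - X.mean(axis=0)                      # mean-centre each variable
    s = np.linalg.svd(Xc, compute_uv=False)      # singular values, descending
    var = s ** 2
    return var / var.sum()

def components_needed(X, target=0.90):
    """Smallest number of PCs whose cumulative variance reaches `target`."""
    cum = np.cumsum(explained_variance_ratio(X))
    k = int(np.searchsorted(cum, target)) + 1
    return min(k, len(cum))

# Synthetic stand-in for a residual matrix (assumption: 3 latent factors
# plus a small amount of noise; real process residuals would go here).
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 3))
mixing = rng.normal(size=(3, 10))
residuals = latent @ mixing + 0.01 * rng.normal(size=(200, 10))

ratio = explained_variance_ratio(residuals)
k = components_needed(residuals, target=0.90)
print("PCs needed for 90% of residual variability:", k)
```

With real residuals, the same call would report the number of components behind a statement such as "five principal components explain 90% of the residual variability".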

Figure 3.15 Significant mass fragments in PCA analysis of DE MS data of the examined triterpenoid substances and the value of their loadings in calculating PC1 and PC2...

Among all the compounds analyzed, only eight (OP, NP, TBP, BPA, diazinon, propanil, alachlor and molinate) appeared in both matrices. Most notably, NP contributes more than any other compound in both water and sediments: it has a medium log Kow value (4.48), and the concentrations found were high enough to allow its distribution between the two matrices, ranging between 69 and 5,999 pg kg⁻¹ in the solid matrix. These high concentrations in the sediments are in the same range as in other rivers of the world. This compound shows a lower importance in the PCA analysis because it appears as a point-source pollutant, mainly around the industrial area of Zaragoza. The rest of the profile shows some differences between water and sediments. Water moves constantly while the sediments are more bound to one location, and in consequence, the... [Pg.157]

PCA analysis helps to identify spectral differences between rock types. Each rock type appears in a distinct colour in the RGB composite image of the PCA bands (Krishnamurthy 1997). False colour composite image of PCA bands 5, 4, and... [Pg.487]

The main classification methods for drug development are discriminant analysis (DA), possibly based on principal components (PLS-DA), and soft independent models for class analogy (SIMCA). SIMCA is based only on PCA analysis: one PCA model is created for each class, and distances between objects and the projection space of the PCA models are evaluated. PLS-DA is, for example, applied for the prediction of adverse effects by nonsteroidal anti-... [Pg.63]
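The SIMCA idea of one PCA model per class, with object-to-model distances, can be sketched as below. This is a simplified illustration under stated assumptions: the data are synthetic, the helper names `fit_class_model` and `residual_distance` are hypothetical, and the object is simply assigned to the nearest model. A full SIMCA implementation would also compare each distance against a class-specific statistical limit, which is omitted here.

```python
import numpy as np

def fit_class_model(X, rank):
    """PCA model of one class: its mean and the first `rank` loadings."""
    mean = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:rank].T                     # loadings: (n_vars, rank)

def residual_distance(x, model):
    """Euclidean distance from object x to the class model's PC subspace."""
    mean, loadings = model
    d = x - mean
    residual = d - loadings @ (loadings.T @ d)   # part not explained by the PCs
    return float(np.linalg.norm(residual))

# Synthetic classes (assumption): each varies mainly along one direction.
rng = np.random.default_rng(0)
dir_a = np.array([[1.0, 0.0, 0.0]])
dir_b = np.array([[0.0, 1.0, 0.0]])
class_a = (3 * rng.normal(size=(30, 1))) @ dir_a + 0.05 * rng.normal(size=(30, 3))
class_b = (3 * rng.normal(size=(30, 1))) @ dir_b + 0.05 * rng.normal(size=(30, 3))
class_b = class_b + np.array([0.0, 0.0, 5.0])    # shift class B away from A

models = {"A": fit_class_model(class_a, rank=1),
          "B": fit_class_model(class_b, rank=1)}

unknown = np.array([2.0, 0.0, 0.05])
distances = {name: residual_distance(unknown, m) for name, m in models.items()}
assigned = min(distances, key=distances.get)
print(distances, "->", assigned)
```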

Figure 12.2 Scatter plot of the first two PC scores obtained from PCA analysis of the Fisher iris data set.

Figure 12.17 Scatter plot of the first two PC scores obtained from PCA analysis of the polyurethane foam spectra shown in Figure 12.16. Different symbols are used to denote samples belonging to the four known classes. The designated prediction samples are denoted as a solid circle (sample A) and a solid triangle (sample B).

LDA uses a space that is defined by a basis set of vectors, called linear discriminants (LDs), that are similar to the PCs obtained from PCA analysis. Like PCs, LDs are linear combinations of the original M variables in the X data that are also orthogonal to one another. However, they are determined using a quite different criterion, where the ratio of between-class variability to within-class variability in the calibration data is maximized. [Pg.396]
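The criterion described above, maximizing the ratio of between-class to within-class variability, can be illustrated with a short sketch. The data are synthetic and the function names (`scatter_matrices`, `first_discriminant`, `fisher_ratio`) are hypothetical. The direction is obtained here from the eigenvectors of the within-class scatter inverse times the between-class scatter, which is one common way to solve this problem and is not necessarily the procedure used in the source.

```python
import numpy as np

def scatter_matrices(X, y):
    """Within-class (Sw) and between-class (Sb) scatter matrices."""
    overall_mean = X.mean(axis=0)
    n_vars = X.shape[1]
    Sw = np.zeros((n_vars, n_vars))
    Sb = np.zeros((n_vars, n_vars))
    for label in np.unique(y):
        Xk = X[y == label]
        mk = Xk.mean(axis=0)
        Sw += (Xk - mk).T @ (Xk - mk)
        diff = (mk - overall_mean)[:, None]
        Sb += len(Xk) * (diff @ diff.T)
    return Sw, Sb

def first_discriminant(Sw, Sb):
    """Direction w that maximizes (w' Sb w) / (w' Sw w)."""
    vals, vecs = np.linalg.eig(np.linalg.solve(Sw, Sb))
    w = np.real(vecs[:, np.argmax(np.real(vals))])
    return w / np.linalg.norm(w)

def fisher_ratio(w, Sw, Sb):
    """Between-class over within-class variability along direction w."""
    return float(w @ Sb @ w) / float(w @ Sw @ w)

# Synthetic two-class calibration data (assumption).
rng = np.random.default_rng(0)
class0 = rng.normal(size=(50, 2)) * np.array([3.0, 0.5])
class1 = rng.normal(size=(50, 2)) * np.array([3.0, 0.5]) + np.array([1.0, 2.0])
X = np.vstack([class0, class1])
y = np.repeat([0, 1], 50)

Sw, Sb = scatter_matrices(X, y)
w = first_discriminant(Sw, Sb)
print("LD direction:", w, "ratio:", fisher_ratio(w, Sw, Sb))
```

Comparing `fisher_ratio(w, Sw, Sb)` with the ratio along each original variable axis shows how the discriminant direction differs from simply picking the variable with the most variance, which is what a PC would favour.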

Figure 12.19 Scores and loadings obtained from PCA analysis of the NIR spectra of high-density/low-density polyethylene blend films shown in Figure 12.18: (A) scatter plot of the first two PC scores, (B) overlaid spectral plot of the first two PC loadings. In the scores plot, different symbols are used to denote different blend compositions.

The K-Means method can be rather efficient for a large number of objects, but requires a priori selection of the number of clusters K, and thus is less exploratory in nature than the HCA methods discussed earlier. In addition, its outcome depends on the initial selection of targets. As a result, one can do a smart selection of the initial targets based on their uniqueness in the data set, for example those objects with the highest Hotelling T² value (Section 12.2.5) obtained from a PCA analysis of the data. [Pg.407]
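Selecting initial K-means targets from the objects with the highest Hotelling T² values can be sketched as follows. This is an illustration under stated assumptions: the data are synthetic, the function names `hotelling_t2` and `initial_targets` are hypothetical, and T² is computed as the sum of squared scores divided by the score variances over a chosen number of PCs.

```python
import numpy as np

def hotelling_t2(X, n_components):
    """Hotelling T^2 of each object from the first `n_components` PC scores."""
    Xc = X - X.mean(axis=0)
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]
    score_var = s[:n_components] ** 2 / (X.shape[0] - 1)
    return np.sum(scores ** 2 / score_var, axis=1)

def initial_targets(X, k, n_components=2):
    """Pick the k objects with the largest T^2 as starting cluster targets."""
    t2 = hotelling_t2(X, n_components)
    idx = np.argsort(t2)[::-1][:k]
    return X[idx], idx

# Synthetic data set (assumption): 60 objects, 5 variables.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 5))
n_components = 2

t2 = hotelling_t2(X, n_components)
targets, idx = initial_targets(X, k=3, n_components=n_components)
print("initial target indices:", idx)
```

Objects with extreme T² values can also be outliers, or can lie close to one another, so in practice the chosen targets would probably be inspected before the clustering is run.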

Regarding relevance, the spectral miscibility of the data obtained from these two different sources can be readily observed by doing a PCA analysis of the combined spectral data. The scatter plot of the first two PC scores obtained from PCA of such a data set for one of the process analytes is shown in Figure 12.31a. Note that there is considerable common space for the two data sources in the PC1/PC2 space, and there are some regions of this space where only samples from the old calibration strategy lie. A similar pattern is observed in the later PCs of this model. This result indicates that the on-line spectra contain some unique information, but that the on-line and injected-standard spectra are generally quite similar. [Pg.419]

In this formulation, Np is the total number of the compounds in pure classes, Nm the number of compounds in mixed classes, and Ntotal the total number of active compounds. C is the total number of classes obtained by PCA analysis and Ca the number of different activity classes in the database. Thus, according to this scoring scheme, high scores were obtained if many compounds occurred in a small number of pure classes. A scale factor of 100 was arbitrarily applied to obtain top scores greater than 1. [Pg.287]

For this example, an examination of the complete three-dimensional row space (Figure 4.33) can be used to verify these conclusions. This plot shows the mean-centered data and the principal component axes. This plot is consistent with the conclusions from the PCA analysis and it is clear that the two PCs are effectively describing the object. [Pg.54]

Figure 4.67. Loadings from the PCA analysis of the complete library. The offset was added for clarity.

When the PCA analysis is completed and the calibration and validation sets are chosen, the next step is to create SIMCA models for the calibration set samples. The initial rank estimates from PCA and software default class volumes are used, the performance of the models on the test samples is examined, and the SIMCA settings are adjusted as necessary. [Pg.80]

The PCA results for the two individual classes are examined first. The PCA of the entire training set is performed in PCA Example 2 in Section 4.2.2.2. In the current PCA analysis, the ranks and boundaries are chosen and then the SIMCA models are constructed and validated. [Pg.88]

From the PCA analysis, it was concluded that appropriate ranks for the TEA and MEK SIMCA models are two and one, respectively. The next step is to construct SIMCA models and test their performance on validation samples. The ranks determined during the PCA analyses and the default settings for the class volume size for the models are used. [Pg.90]

In the following example, only the details of the PCA analysis for class B will be examined. Nothing unusual was found from the separate PCA analyses of classes A and C. The ranks for classes A and C were estimated to be one and three, respectively. [Pg.254]

The conclusion of the unsupervised PCA analysis was that there was enough selectivity to distinguish between compounds based on functional groups. However, it was unclear whether the sensor array could distinguish between compounds with the same chemical functionality. SIMCA models for 2 of the 10 compounds, triethylamine (TEA) and methylethylketone (MEK), are constructed and validated against the entire data set containing all 10 classes of compounds. [Pg.266]

Chemometric evaluation methods can be applied to the signal from a single sensor by feeding the whole data set into an evaluation program [133,135]. Both principal component analysis (PCA) and partial least squares (PLS) models were used to evaluate the data. These are chemometric methods that may be used for extracting information from a multivariate data set (e.g., from sensor arrays) [135]. The PCA analysis shows that the MISiC-FET sensor differentiates very well between different lambda values in both lean gas mixtures (excess air) and rich gas mixtures (excess fuel). The MISiC-FET sensor is seen to behave as a linear lambda sensor [133]. It... [Pg.59]

FIGURE 22 Excipient library with the result of a PCA analysis for identity testing. [Pg.407]

Figure 9 PCA analysis of the scans shown in Figure 5. The PCA model was calculated with two PCs using the preprocessing steps MSC (mean) and Normalize.

Fig. 3.13. Partial least-squares (PLS) calibration of the API data set (5 s accumulation time). Spectra were baseline corrected, normalised to unit length and mean centred. The data set was randomly split into a calibration set (two-thirds) and a prediction set (one-third); obvious outliers from the PCA analysis were excluded from the analysis. The graph shows predicted versus measured API concentration of the prediction set. The straight line represents the 45° diagonal (this figure was published in [65], Copyright Elsevier (2008))...

Lakes. Flux calculations based only on wet and dry particle deposition were close to measured sediment fluxes. PCA analysis confirmed that wet and dry particle deposition was much more important than dry vapor deposition, based on homolog patterns. [Pg.77]

There are many different ways in which the classification space can be defined. The simplest space definition involves the use of individual selected X-variables to define each dimension in the space. One could also define the dimensions of the space using the first A PCs obtained from a PCA analysis of the calibration data. [Pg.286]

There are several distinctions of the PLS-DA method versus other classification methods. First of all, the classification space is unique. It is not based on X-variables or PCs obtained from PCA analysis, but rather on the latent variables obtained from PLS or PLS-2 regression. Because these compressed variables are determined using the known class membership information in the calibration data, they should be more relevant for separating the samples by their classes than the PCs obtained from PCA. Secondly, the classification rule is based on results obtained from quantitative PLS prediction. When this method is applied to an unknown sample, one obtains a predicted number for each of the Y-variables. Statistical tests, such as the t-test discussed earlier (Section 8.2.2), can then be used to determine whether these predicted numbers are sufficiently close to 1 or 0. Another advantage of the PLS-DA method is that it can, in principle, handle cases where an unknown sample belongs to more than one class, or to no class at all. [Pg.293]
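The PLS-DA workflow described above (regress a 0/1 class-membership variable on X with PLS, then judge how close each prediction is to 1 or 0) can be sketched as follows. This is a minimal sketch under several assumptions: the data are synthetic, only two classes are used so a single Y column suffices, the function names `pls1_fit` and `pls1_predict` are hypothetical, and a plain 0.5 cutoff stands in for the statistical test mentioned in the text.

```python
import numpy as np

def pls1_fit(X, y, n_components):
    """NIPALS PLS1 for a single y column; returns means and regression vector."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    E, f = X - x_mean, y - y_mean              # centred working copies
    W, P, q = [], [], []
    for _ in range(n_components):
        w = E.T @ f
        w = w / np.linalg.norm(w)              # weight vector
        t = E @ w                              # latent-variable scores
        tt = t @ t
        p = E.T @ t / tt                       # X loadings
        c = f @ t / tt                         # y loading
        E = E - np.outer(t, p)                 # deflate X
        f = f - c * t                          # deflate y
        W.append(w); P.append(p); q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)     # regression vector
    return x_mean, y_mean, coef

def pls1_predict(X_new, model):
    x_mean, y_mean, coef = model
    return (X_new - x_mean) @ coef + y_mean

# Synthetic calibration data (assumption): class 1 is shifted in two variables.
rng = np.random.default_rng(1)
class0 = rng.normal(size=(40, 6))
class1 = rng.normal(size=(40, 6))
class1[:, :2] += 6.0
X = np.vstack([class0, class1])
y = np.repeat([0.0, 1.0], 40)                  # dummy class-membership variable

model = pls1_fit(X, y, n_components=2)
predicted = pls1_predict(X, model)
labels = (predicted > 0.5).astype(float)       # simplistic cutoff (assumption)
accuracy = float((labels == y).mean())
print("calibration accuracy:", accuracy)
```

With more than two classes, one dummy Y column per class would be used (the PLS-2 case), and a sample whose predictions are close to 1 for several columns, or for none, would correspond to the "more than one class" and "no class at all" situations mentioned above.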

Figure 8.28 is a scatter plot of the first two rotated PC scores obtained from a PCA analysis of the NIR spectra of polyethylene blend films, after rotation was done to improve interpretability. In this example, it was determined that three PCs were optimal for the PCA model, and the rotated PCs one, two, and three explain 46.67, 49.25, and 2.56% of the variation in the NIR spectra, respectively. Although the higher explained variance of the second rotated PC might seem anomalous based on the criterion that PCA uses explained variance to determine each PC, one must be reminded that these explained variances refer to rotated PCs, rather than the original PCs. There are two interesting things to note about this plot ... [Pg.299]
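The point that a rotated second PC can explain more variance than the rotated first PC can be checked with a small numerical sketch. The data are synthetic, and an arbitrary fixed rotation angle stands in for whatever rotation criterion was actually used in the source, which is not stated. The names `pca_scores` and `explained_percent` are hypothetical.

```python
import numpy as np

def pca_scores(X, n_components):
    """Scores of the first `n_components` PCs of mean-centred X."""
    Xc = X - X.mean(axis=0)
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :n_components] * s[:n_components]

def explained_percent(scores, total_ss):
    """Percent of the total sum of squares carried by each score column."""
    return 100.0 * (scores ** 2).sum(axis=0) / total_ss

# Synthetic data (assumption): variables with clearly different spreads.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 6)) * np.array([5.0, 3.0, 1.0, 0.5, 0.5, 0.5])
total_ss = ((X - X.mean(axis=0)) ** 2).sum()

T = pca_scores(X, n_components=2)

# Orthogonal rotation of the two-PC subspace by an arbitrary 60 degrees.
theta = np.deg2rad(60.0)
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
T_rot = T @ R

original = explained_percent(T, total_ss)
rotated = explained_percent(T_rot, total_ss)
print("original PCs:", original)
print("rotated PCs: ", rotated)
```

The rotation keeps the total variance explained by the two components unchanged but redistributes it between them, so the ordering by explained variance no longer has to hold for rotated PCs.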
