Big Chemical Encyclopedia


Latent variable analysis

Latent-Variable Analysis of Multivariate Data in Infrared Spectrometry... [Pg.145]


Many books on the market explain the philosophy (1,2) and the mathematical framework of the methods (2-5) for latent-variable analysis. This theory section only... [Pg.146]

Another problem is determining the optimal number of descriptors for the objects (patterns), such as descriptors of molecular structure. A widespread rule of thumb is to keep the number of descriptors below 20% of the number of objects in the dataset. However, this holds only for ordinary multilinear regression analysis. More advanced methods, such as Projection to Latent Structures (or Partial Least Squares, PLS), use so-called latent variables to achieve both modeling and prediction. [Pg.205]

We have to apply projection techniques that allow us to project the hyperspace onto a two- or three-dimensional space. Principal Component Analysis (PCA) is a method fit for performing this task; it is described in Section 9.4.4. PCA operates with latent variables, which are linear combinations of the original variables. [Pg.213]
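The projection described above can be sketched in a few lines with NumPy. This is a minimal illustration only (random numbers stand in for a real measurement table): the columns are centered, the singular value decomposition is taken, and the first two latent variables are kept for a two-dimensional plot.

```python
import numpy as np

# Hypothetical measurement table: 10 objects x 6 original variables
rng = np.random.default_rng(0)
X = rng.normal(size=(10, 6))

# Center the columns, then take the SVD of the centered table
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

# Scores T = U*S are the latent variables: linear combinations of the
# original variables defined by the loadings in Vt. Keeping the first
# two columns projects the hyperspace onto a plane for plotting.
scores = U * s
T2 = scores[:, :2]

# Fraction of total variance carried by each latent variable
explained = s**2 / np.sum(s**2)
```

Often the first two or three entries of `explained` account for most of the variation, which is what justifies studying the pattern in the reduced space.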

Other chemometric methods to improve calibration have been advanced. The method of partial least squares has been useful in multicomponent calibration (48-51). In this approach the concentrations are related to latent variables in the block of observed instrument responses. Thus PLS regression can solve the collinearity problem and provide all of the advantages discussed earlier. Principal components analysis coupled with multiple regression, often called Principal Component Regression (PCR), is another calibration approach that has been compared and contrasted to PLS (52-54). Calibration problems can also be approached using the Kalman filter as discussed (43). [Pg.429]

We are about to enter what is, to many, a mysterious world—the world of factor spaces and the factor-based techniques, Principal Component Analysis (PCA, sometimes known as Factor Analysis) and Partial Least-Squares (PLS) in latent variables. Our goal here is to thoroughly explore these topics using a data-centric approach to dispel the mysteries. When you complete this chapter, neither factor spaces nor the rhyme at the top of this page will be mysterious any longer. As we will see, it's all in your point of view. [Pg.79]

Partial least squares regression (PLS). Partial least squares regression applies to the simultaneous analysis of two sets of variables on the same objects. It allows for the modeling of inter- and intra-block relationships from an X-block and Y-block of variables in terms of a lower-dimensional table of latent variables [4]. The main purpose of regression is to build a predictive model enabling the prediction of wanted characteristics (y) from measured spectra (X). In matrix notation we have the linear model with regression coefficients b ... [Pg.544]
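The linear model above can be illustrated with a minimal NIPALS-style PLS1 routine for a single response y. This is a sketch under stated assumptions (column-centered X and y; function and variable names are illustrative, not from the source):

```python
import numpy as np

def pls1(X, y, a):
    """Minimal NIPALS PLS1 sketch: returns regression coefficients b
    such that y ~ X @ b, built from `a` latent variables.
    Assumes X (columns) and y are already centered."""
    Xk, yk = X.copy(), y.copy()
    W, P, q = [], [], []
    for _ in range(a):
        w = Xk.T @ yk                 # weights: direction of max covariance
        w = w / np.linalg.norm(w)
        t = Xk @ w                    # scores (the latent variable)
        p = Xk.T @ t / (t @ t)        # X-loadings
        qa = yk @ t / (t @ t)         # y-loading
        Xk = Xk - np.outer(t, p)      # deflate X
        yk = yk - qa * t              # deflate y
        W.append(w); P.append(p); q.append(qa)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    # Collapse the latent-variable model into coefficients b for y ~ X b
    return W @ np.linalg.solve(P.T @ W, q)
```

When `a` equals the number of descriptors, these coefficients coincide with the ordinary least-squares solution; the interesting regime is a small `a`, where the latent variables absorb the collinearity in X.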

The eigenvectors extracted from the cross-product matrices or the singular vectors derived from the data matrix play an important role in multivariate data analysis. They account for a maximum of the variance in the data and they can be likened to the principal axes (of inertia) through the patterns of points that represent the rows and columns of the data matrix [10]. These have been called latent variables [9], i.e. variables that are hidden in the data and whose linear combinations account for the manifest variables that have been observed in order to construct the data matrix. The meaning of latent variables is explained in detail in Chapters 31 and 32 on the analysis of measurement tables and contingency tables. [Pg.50]

O. M. Kvalheim, Interpretation of direct latent-variable projection methods and their aims and use in the analysis of multicomponent spectroscopic and chromatographic data. Chemom. Intell. Lab. Syst., 4 (1988) 11-25. [Pg.56]

A first introduction to principal components analysis (PCA) has been given in Chapter 17. Here, we present the method from a more general point of view, which encompasses several variants of PCA. Basically, all these variants have in common that they produce linear combinations of the original columns in a measurement table. These linear combinations represent a kind of abstract measurements or factors that are better descriptors for structure or pattern in the data than the original measurements [1]. The former are also referred to as latent variables [2], while the latter are called manifest variables. Often one finds that a few of these abstract measurements account for a large proportion of the variation in the data. In that case one can study structure and pattern in a reduced space which is possibly two- or three-dimensional. [Pg.88]

The analysis of Table 31.2 by CFA is shown in Fig. 31.11. As can be seen, the result is very similar to that obtained by log double-centering in Figs. 31.9 and 31.10. The first latent variable expresses a contrast between NO2 substituted chalcones and the others. The second latent variable seems to be related to the electronic properties of the substituents. The contribution of the two latent variables to the total inertia is 96%. The double-closed biplot of Fig. 31.11 does not allow a direct interpretation of unipolar and bipolar axes in terms of the original data X. The other rules of interpretation are similar to those of the log double-centered biplot in the previous subsection. Compounds and methods that seem to have moved away from the center and in the same directions possess a positive interaction (attraction). Those that moved in opposite directions show a negative interaction (repulsion). [Pg.132]

More specifically, input data analysis methods are similar to input-output methods, but rely on different strategies for extracting the relevant information. With reference to the general expression in Eq. (4), the resulting analyzed or latent variable for all input methods can be represented as... [Pg.10]

Methods based on linear projection exploit the linear relationship among inputs by projecting them onto a linear hyperplane before applying the basis function (see Fig. 6a). Thus the inputs are combined as a linearly weighted sum to form the latent variables. Univariate input analysis is a special case of this category in which the single variable is projected onto itself. [Pg.11]

As introduced earlier, inputs can be transformed to reduce their dimensionality and extract more meaningful features by a variety of methods. These methods perform a numeric-numeric transformation of the measured input variables. Interpretation of the transformed inputs requires determination of their mapping to the symbolic outputs. The inputs can be transformed with or without taking the behavior of the outputs into account by univariate and multivariate methods. The transformed features or latent variables extracted by input or input-output analysis methods are given by Eq. (5) and can be used as input to the interpretation step. [Pg.45]

If the probability distribution of the data is, or is assumed to be, Gaussian, several statistical measures are available for interpreting the data. These measures can be used to interpret the latent variables determined by a selected data analysis method. Those described here are a combination of statistical measures and graphical analysis. Taken together they provide an assessment of the statistical significance of the analysis. [Pg.55]

On the other hand, when latent variables are used in inverse calibration instead of the original variables, powerful methods of multivariate calibration arise that are frequently used in multispecies analysis and in single-species analysis within multispecies systems. These so-called soft modeling methods are based, like the P-matrix, on the inverse calibration model, by which the analytical values are regressed on the spectral data ... [Pg.186]
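The inverse calibration step itself—regressing analytical values on spectral data—can be sketched directly. This is a hypothetical illustration with simulated noiseless spectra (all names and dimensions are invented for the example), using plain least squares in place of a latent-variable method:

```python
import numpy as np

# Simulated calibration set: 30 spectra measured at 8 wavelengths
rng = np.random.default_rng(1)
A = rng.normal(size=(30, 8))

# Concentrations generated from a known (hypothetical) coefficient vector
p_true = rng.normal(size=8)
c = A @ p_true

# Inverse calibration: regress the analytical values c on the spectra A
p_hat, *_ = np.linalg.lstsq(A, c, rcond=None)

# Predict the concentration for a "new" spectrum
c_pred = A[0] @ p_hat
```

With real, collinear spectra this direct regression becomes unstable, which is exactly where the latent-variable (soft modeling) methods of the text take over.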

Partial least squares (PLS) projections to latent structures [40] is a multivariate data analysis tool that has gained much attention during the past decade, especially after the introduction of the 3D-QSAR method CoMFA [41]. PLS is a projection technique that uses latent variables (linear combinations of the original variables) to construct multidimensional projections while focusing on explaining as much as possible of the information in the dependent variable (in this case intestinal absorption) rather than in the descriptors used to describe the compounds under investigation (the independent variables). PLS differs from MLR in a number of ways (apart from point 1 in Section 16.5.1) ... [Pg.399]

The number of original descriptors may vastly exceed the number of compounds in the analysis (as opposed to MLR), since PLS uses only a few (usually fewer than 5-10) latent variables for the actual statistical analysis. [Pg.399]

The number of latent variables (PLS components) must be determined by some sort of validation technique, e.g., cross-validation [42]. The PLS solution will coincide with the corresponding MLR solution when the number of latent variables equals the number of descriptors used in the analysis. At the same time, the validation technique also serves to avoid overfitting of the model. [Pg.399]
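A leave-one-out version of this validation step can be sketched as follows: for each candidate number of latent variables, each sample is predicted from a model fitted to the remaining samples, and the number of components minimizing the prediction error sum of squares (PRESS) is selected. Everything here is illustrative (a compact NIPALS PLS1 on simulated data); the source does not prescribe an implementation.

```python
import numpy as np

def pls1_predict(Xtr, ytr, Xte, a):
    """Compact NIPALS PLS1: fit with `a` latent variables, predict Xte."""
    mx, my = Xtr.mean(axis=0), ytr.mean()
    Xk, yk = Xtr - mx, ytr - my
    W, P, q = [], [], []
    for _ in range(a):
        w = Xk.T @ yk
        w = w / np.linalg.norm(w)
        t = Xk @ w
        p = Xk.T @ t / (t @ t)
        qa = yk @ t / (t @ t)
        Xk = Xk - np.outer(t, p)
        yk = yk - qa * t
        W.append(w); P.append(p); q.append(qa)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    b = W @ np.linalg.solve(P.T @ W, q)
    return (Xte - mx) @ b + my

def press(X, y, a):
    """Leave-one-out prediction error sum of squares for `a` components."""
    n = len(y)
    total = 0.0
    for i in range(n):
        mask = np.arange(n) != i
        yhat = pls1_predict(X[mask], y[mask], X[i:i + 1], a)
        total += (yhat[0] - y[i]) ** 2
    return total

# Simulated example: pick the component count with the lowest PRESS
rng = np.random.default_rng(2)
X = rng.normal(size=(25, 6))
y = X @ rng.normal(size=6) + 0.1 * rng.normal(size=25)
best_a = min(range(1, 7), key=lambda a: press(X, y, a))
```

Choosing `best_a` this way guards against overfitting: adding components beyond the optimum makes PRESS rise again even as the fit to the training data keeps improving.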

The VolSurf method was used to produce molecular descriptors, and PLS discriminant analysis (DA) was applied. The statistical model showed two significant latent variables after cross-validation. The 2D PLS score model offers a discrimination between the permeable and less permeable compounds. When the spectrum color is active (Fig. 17.2), red points refer to high permeability, whereas blue points indicate low permeability. There is a region in the central part of the plot with both red and blue compounds. In this region, and in between the two continuous lines, the permeability prediction is less reliable. The permeability model... [Pg.410]

As a result of this protocol, four indicators were dropped because in each case they did not pass the first consistency test, that is, they failed to discriminate adequately at all levels of the scale. Next, Tyrka et al. (1995) calculated the taxon base rate for each indicator using a hybrid of MAXCOV and Latent Class Analysis estimation procedures (for details see Golden, 1982) and adjusted the estimate for the true- and false-positive rates computed earlier. The average taxon base rate was .49. The authors did not report a variability statistic, but a simple computation shows that the SD of the base rate estimates was .04. [Pg.118]

Table 1. Confusion matrix for the PLS-DA analysis of the California obsidian localities using 600 broadband LIBS spectra and 15 latent variables to produce 5 model classes. Horizontal lines separate the groupings for the Coso Volcanic Field by Draucker (2007) and the other four California obsidian localities...
