Principal components analysis algorithms

Kohonen networks, conceptual clustering, Principal Component Analysis (PCA), decision trees, Partial Least Squares (PLS), Multiple Linear Regression (MLR), counter-propagation networks, back-propagation networks, genetic algorithms (GA)... [Pg.442]

The field points must then be fitted to predict the activity. There are generally far more field points than known compound activities to be fitted. The least-squares algorithms used in QSAR studies do not function for such an underdetermined system. A partial least squares (PLS) algorithm is used for this type of fitting. This method starts with matrices of field data and activity data. These matrices are then used to derive two new matrices containing a description of the system and the residual noise in the data. Earlier studies used a similar technique, called principal component analysis (PCA). PLS is generally considered to be superior. [Pg.248]
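As an illustration of this kind of underdetermined fit, the following is a minimal sketch using scikit-learn's PLSRegression on a synthetic field matrix with far more variables than compounds; the matrix sizes, noise level, and number of latent variables are assumptions for illustration, not values from the studies cited above.

```python
# Sketch: fitting activities from many field points with PLS,
# where variables (field points) far outnumber compounds.
import numpy as np
from sklearn.cross_decomposition import PLSRegression

rng = np.random.default_rng(0)

n_compounds, n_field_points = 30, 2000   # underdetermined: p >> n (illustrative sizes)
X = rng.normal(size=(n_compounds, n_field_points))        # field data matrix
true_w = rng.normal(size=n_field_points)
y = X @ true_w + rng.normal(scale=0.1, size=n_compounds)  # activities

# Ordinary least squares has no unique solution here; PLS projects X and y
# onto a few latent variables describing the systematic variation and
# leaves the rest as residual noise.
pls = PLSRegression(n_components=5)
pls.fit(X, y)
print("R^2 on training data:", pls.score(X, y))
```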

R. Henrion, N-way principal component analysis. Theory, algorithms and applications. Chemom. Intell. Lab. Syst., 25 (1994) 1-23. [Pg.160]

The extent of homogeneous mixing of pharmaceutical components such as active drug and excipients has been studied by near-IR spectroscopy. In an application note from NIRSystems, Inc. [47], principal component analysis and spectral matching techniques were used to develop a near-IR algorithm for determining an optimal mixture based upon spectral comparison with a standard mixture. One advantage of this technique is the use of second-derivative spectra to remove slight baseline differences caused by particle-size variations. [Pg.81]
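A minimal sketch of this pre-treatment, assuming simulated spectra and arbitrary Savitzky-Golay settings rather than the actual NIRSystems procedure, is shown below: the second derivative suppresses additive baseline offsets before PCA-based comparison with a standard mixture.

```python
# Sketch: second-derivative preprocessing of near-IR spectra to suppress
# baseline offsets, followed by PCA for comparison against a standard mixture.
import numpy as np
from scipy.signal import savgol_filter
from sklearn.decomposition import PCA

rng = np.random.default_rng(1)
wavelengths = np.linspace(1100, 2500, 700)   # nm, illustrative range

def simulate_spectrum(baseline):
    peak = np.exp(-((wavelengths - 1700) / 40.0) ** 2)
    return peak + baseline + 0.002 * rng.normal(size=wavelengths.size)

# Standard mixture and test blends differing only in baseline offset (invented)
spectra = np.array([simulate_spectrum(b) for b in (0.0, 0.05, 0.10, 0.15)])

# Second derivative removes additive/linear baseline differences
d2 = savgol_filter(spectra, window_length=15, polyorder=3, deriv=2, axis=1)

scores = PCA(n_components=2).fit_transform(d2)
print(scores)   # blends with the same composition cluster together
```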

The constrained least-squares method is developed in Section 5.3, and a numerical example is treated in detail. Efficient specific algorithms taking errors into account have been developed by Provost and Allegre (1979). The literature abounds with alternative methods. Wright and Doherty (1970) use linear programming methods that are fast and offer an easy implementation of linear constraints, but the structure of the data is not easily perceived and error assessment is handled inefficiently. Principal component analysis (Section 4.4) is more efficient when the end-members are unknown. [Pg.9]
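As a rough sketch of a constrained least-squares mixing calculation (not the Provost and Allegre algorithm), the example below uses scipy's lsq_linear with invented end-member compositions; non-negativity comes from the bounds, and the sum-to-one constraint is imposed by a heavily weighted extra equation.

```python
# Sketch: estimating mixing proportions of known end-members by
# least squares with non-negativity and sum-to-one constraints.
import numpy as np
from scipy.optimize import lsq_linear

# Columns = end-member compositions (rows = chemical species), invented values
E = np.array([[50.0, 45.0, 70.0],
              [10.0, 15.0,  5.0],
              [ 8.0,  2.0,  1.0],
              [12.0, 18.0,  9.0]])
sample = np.array([55.0, 10.5, 5.0, 12.5])   # invented sample composition

# Append a heavily weighted equation to enforce sum-to-one;
# non-negativity comes from the bounds.
w = 1e3
A = np.vstack([E, w * np.ones((1, E.shape[1]))])
b = np.concatenate([sample, [w]])

res = lsq_linear(A, b, bounds=(0.0, 1.0))
print("mixing proportions:", res.x, "sum:", res.x.sum())
```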

Key Words: biological activity; chemical features; chemical space; cluster analysis; compound databases; dimension reduction; molecular descriptors; molecule classification; partitioning algorithms; partitioning in low-dimensional spaces; principal component analysis; visualization. [Pg.279]

Xue, L. and Bajorath, J. (2000) Molecular descriptors for effective classification of biologically active compounds based on principal component analysis identified by a genetic algorithm. J. Chem. Inf. Comput. Sci. 40, 801-809. [Pg.288]

A generalised structure of an electronic nose is shown in Fig. 15.9. The sensor array may consist of QMB, conducting-polymer, MOS or MS-based sensors. The data generated by each sensor are processed by a pattern-recognition algorithm and the results are then analysed. The ability to characterise complex mixtures without the need to identify and quantify individual components is one of the main advantages of such an approach. The pattern-recognition methods may be divided into unsupervised (e.g. principal component analysis, PCA) and supervised (e.g. artificial neural networks, ANN) methods; a combination of both can also be used. [Pg.330]
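A minimal sketch of the unsupervised branch, assuming a simulated eight-sensor array and two odour classes (all values invented), is given below; PCA projects the multi-sensor responses onto two components in which the classes separate.

```python
# Sketch: unsupervised pattern recognition for an electronic nose,
# projecting multi-sensor responses onto two principal components.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(2)
n_sensors = 8   # illustrative array size

# Two classes of (simulated) odour samples with different sensor fingerprints
fingerprint_a = rng.uniform(0.2, 1.0, size=n_sensors)
fingerprint_b = rng.uniform(0.2, 1.0, size=n_sensors)
samples = np.vstack([
    fingerprint_a + 0.05 * rng.normal(size=(10, n_sensors)),
    fingerprint_b + 0.05 * rng.normal(size=(10, n_sensors)),
])

scores = PCA(n_components=2).fit_transform(samples)
print(scores[:10].mean(axis=0), scores[10:].mean(axis=0))  # separated clusters
```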

In general, there are two types of compression: (1) individual spectra can be compressed and filtered, and (2) the entire dataset can be compressed and filtered by representing each of the individual spectra as a linear combination of some smaller set of data, referred to as a basis set. In this section, we address the processing of individual spectra by applying the fast Fourier transform (FFT) algorithm, and follow this discussion with one on processing sets of spectra with principal component analysis (PCA). [Pg.87]
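The FFT part can be sketched as follows on a simulated spectrum (the peak shape, noise level, and number of retained coefficients are arbitrary assumptions); PCA compression of whole datasets follows the same pattern as the PCA examples elsewhere on this page.

```python
# Sketch: compressing/filtering an individual spectrum by keeping only
# the low-frequency Fourier coefficients.
import numpy as np

rng = np.random.default_rng(3)
x = np.linspace(0, 1, 512)
spectrum = np.exp(-((x - 0.5) / 0.05) ** 2) + 0.05 * rng.normal(size=x.size)

coeffs = np.fft.rfft(spectrum)
keep = 40                       # retain only the first 40 coefficients (illustrative)
coeffs[keep:] = 0.0             # discard high-frequency (noise-dominated) terms
smoothed = np.fft.irfft(coeffs, n=x.size)

compression_ratio = x.size / (2 * keep)   # real + imaginary parts of kept terms
print("compression ratio ~", compression_ratio)
print("max deviation after filtering:", np.abs(smoothed - spectrum).max())
```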

Reasonable noise in the spectral data does not affect the clustering process. In this respect, cluster analysis is much more stable than other methods of multivariate analysis, such as principal component analysis (PCA), in which an increasing amount of noise is accumulated in the less relevant clusters. The mean cluster spectra can be extracted and used for the interpretation of the chemical or biochemical differences between clusters. HCA, per se, is ill-suited for a diagnostic algorithm. We have used the spectra from clusters to train artificial neural networks (ANNs), which may serve as supervised methods for final analysis. This process, which requires hundreds or thousands of spectra from each spectral class, is presently ongoing, and validated and blinded analyses, based on these efforts, will be reported. [Pg.194]
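A minimal sketch of this workflow, assuming simulated spectra, Ward linkage, and two clusters rather than the authors' actual protocol, is given below; the mean cluster spectra extracted at the end are the quantities that would be interpreted chemically or passed on to an ANN.

```python
# Sketch: HCA of spectra, then mean cluster spectra for interpretation.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(4)
x = np.linspace(0, 1, 200)
band_a = np.exp(-((x - 0.3) / 0.05) ** 2)   # invented band positions
band_b = np.exp(-((x - 0.7) / 0.05) ** 2)

spectra = np.vstack([
    band_a + 0.05 * rng.normal(size=(15, x.size)),
    band_b + 0.05 * rng.normal(size=(15, x.size)),
])

Z = linkage(spectra, method="ward")               # hierarchical cluster analysis
labels = fcluster(Z, t=2, criterion="maxclust")   # cut the dendrogram at 2 clusters

# Mean cluster spectra, usable for chemical interpretation or as ANN input
mean_spectra = {k: spectra[labels == k].mean(axis=0) for k in np.unique(labels)}
print({k: v.shape for k, v in mean_spectra.items()})
```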

Thus, multilinear models were introduced, followed by a wide range of tools such as nonlinear models, including artificial neural networks, fuzzy logic, Bayesian models, and expert systems. A number of reviews deal with the different techniques [4-6]. Mathematical techniques have also been used to take into account the high number (up to several thousand) of chemical descriptors and fragments that can be used for modeling purposes, which raises problems of increased noise and lack of statistical robustness. Here too, linear and nonlinear methods have been used, such as principal component analysis (PCA) and genetic algorithms (GA) [6]. [Pg.186]
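As a sketch of descriptor reduction by PCA (on a simulated descriptor block; the counts and the autoscaling step are assumptions), projection onto a few components gives a compact, less noisy feature set for subsequent modelling.

```python
# Sketch: reducing thousands of (simulated) chemical descriptors to a few
# principal components to limit noise before QSAR modelling.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(5)
n_compounds, n_descriptors = 200, 3000        # illustrative sizes
descriptors = rng.normal(size=(n_compounds, n_descriptors))

X = StandardScaler().fit_transform(descriptors)   # autoscale descriptors
pca = PCA(n_components=10)
scores = pca.fit_transform(X)                     # compact feature set

print("explained variance:", pca.explained_variance_ratio_.sum())
```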

Principal components analysis. There are innumerable excellent descriptions of the mathematical basis of PCA [26-30], and this article will provide only a general overview. It is important, first, not to confuse the algorithms, which are a means to an end, with the end itself. There are several PCA algorithms, of which NIPALS (described in Appendix A2.1) and SVD are two of the most common. If correctly applied, they will both lead to the same answer (within computer precision), the best approach depending on factors such as computing power and the number of components to be calculated. [Pg.9]

A2.1 Principal components analysis. NIPALS is a common, iterative algorithm. [Pg.27]
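A minimal, generic implementation of the NIPALS iteration, checked against SVD, is sketched below; it is not the listing from Appendix A2.1, just an illustration of the same idea on random data.

```python
# Sketch: NIPALS extraction of principal components, compared with SVD.
import numpy as np

def nipals_pca(X, n_components, tol=1e-10, max_iter=500):
    X = X - X.mean(axis=0)            # column-centre the data
    scores, loadings = [], []
    for _ in range(n_components):
        t = X[:, 0].copy()            # initial score vector
        for _ in range(max_iter):
            p = X.T @ t / (t @ t)     # loadings for the current component
            p /= np.linalg.norm(p)
            t_new = X @ p             # updated scores
            if np.linalg.norm(t_new - t) < tol:
                t = t_new
                break
            t = t_new
        X = X - np.outer(t, p)        # deflate before the next component
        scores.append(t)
        loadings.append(p)
    return np.array(scores).T, np.array(loadings).T

rng = np.random.default_rng(6)
X = rng.normal(size=(20, 8))

T, P = nipals_pca(X, n_components=3)

# SVD of the centred matrix gives the same components (up to sign)
U, s, Vt = np.linalg.svd(X - X.mean(axis=0), full_matrices=False)
print(np.allclose(np.abs(T), np.abs(U[:, :3] * s[:3]), atol=1e-6))
```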

