Partial least squares efficiency

The purpose of Partial Least Squares (PLS) regression is to find a small number A of relevant factors that (i) are predictive for Y and (ii) utilize X efficiently. The method effectively achieves a canonical decomposition of X into a set of orthogonal factors which are used for fitting Y. In this respect PLS is comparable with CCA, RRR and PCR, the difference being that the factors are chosen according to yet another criterion. [Pg.331]
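A minimal sketch of this idea (simulated data, not from the cited source) using scikit-learn: a PLS model with a small number A of factors is fitted, and the extracted X-factors (scores) are checked to be mutually orthogonal while still fitting Y.

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression

# Simulated example data (invented for illustration): 50 samples, 20 predictors, 2 responses
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 20))
Y = X[:, :3] @ rng.normal(size=(3, 2)) + 0.1 * rng.normal(size=(50, 2))

A = 3                                   # small number of relevant factors
pls = PLSRegression(n_components=A).fit(X, Y)
T = pls.transform(X)                    # X-scores (the extracted factors) of the training data
print(np.round(T.T @ T, 6))             # off-diagonal elements are ~0: the factors are orthogonal
print(round(pls.score(X, Y), 3))        # R^2 of the fit of Y
```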

The nonlinear iterative partial least-squares (NIPALS) algorithm, also called the power method, has been popular especially in the early days of PCA applications in chemistry; an extended version is used in PLS regression. The algorithm is efficient if only a few PCA components are required, because the components are calculated step-by-step. [Pg.87]
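A minimal sketch (illustrative code with simplified start and stopping rules) of the NIPALS/power-method idea for PCA: each component is obtained by alternating regressions, and X is deflated before the next component is extracted, so only the few components actually needed are computed.

```python
import numpy as np

def nipals_pca(X, n_components, tol=1e-10, max_iter=500):
    """Sketch of NIPALS (power-method) PCA: components are computed step-by-step."""
    X = X - X.mean(axis=0)                        # column-center the data
    scores, loadings = [], []
    for _ in range(n_components):
        t = X[:, [np.argmax(X.var(axis=0))]]      # start from the column with largest variance
        for _ in range(max_iter):
            p = X.T @ t / (t.T @ t)               # loading: regress the columns of X on t
            p /= np.linalg.norm(p)
            t_new = X @ p                         # score: project X onto the loading
            if np.linalg.norm(t_new - t) < tol:
                t = t_new
                break
            t = t_new
        X = X - t @ p.T                           # deflation: remove the extracted component
        scores.append(t)
        loadings.append(p)
    return np.hstack(scores), np.hstack(loadings)
```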

Harding, Popelier, and co-workers [285,286] have employed a variety of quantum chemical approaches in their estimation of the pKa values of oxyacids. In a study of 228 carboxylic acids they used what they call quantum chemical topology to find pKa estimates. They tested several different methods, including partial least squares (PLS), support vector machines (SVMs), and radial basis function neural networks (RBFNNs) with Hartree-Fock and density functional calculations, concluding that the SVM models with HF/6-31G calculations were most efficient [285]. For a data set of 171 phenols they found that the C-O bond length provided an effective descriptor for pKa estimation [286]. [Pg.70]

For multivariate calibration in analytical chemistry, the partial least squares (PLS) method [19] is very efficient. Here, the relations between a set of predictors and a set (not just one) of response variables are modeled. In multicomponent calibration the known concentrations of l components in n calibration samples are collected to constitute the response matrix Y (n rows, l columns). Digitization of the spectra of the calibration samples using p wavelengths yields the predictor matrix X (n rows, p columns). The relations between X and Y are modeled by latent variables for both data sets. These latent variables (PLS components) are constructed to exhaust maximal variance (information) within both data sets on the one hand, and to be maximally correlated for the purpose of good prediction on the other hand. From the computational viewpoint, solutions are obtained by a simple iterative procedure. Having established the model for the calibration samples, component concentrations for future mixtures can be predicted from their spectra. A survey of multi-component regression is contained in [20]. [Pg.59]
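As an illustration with simulated data (the pure spectra S, concentration matrix Y and mixture spectra X below are invented for the example), a PLS calibration model can be built from X and Y and then used to predict the component concentrations of a future mixture from its spectrum.

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression

rng = np.random.default_rng(1)
n, p, l = 30, 200, 3                               # calibration samples, wavelengths, components
S = np.abs(rng.normal(size=(l, p)))                # pure-component "spectra" (simulated)
Y = rng.uniform(0, 1, size=(n, l))                 # known concentrations: response matrix Y (n x l)
X = Y @ S + 0.01 * rng.normal(size=(n, p))         # digitized calibration spectra: predictor matrix X (n x p)

pls = PLSRegression(n_components=l).fit(X, Y)      # latent variables (PLS components) for both blocks

# Predict the component concentrations of a future mixture from its spectrum
y_new = rng.uniform(0, 1, size=(1, l))
x_new = y_new @ S + 0.01 * rng.normal(size=(1, p))
print(np.round(pls.predict(x_new), 3), np.round(y_new, 3))
```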

The PLS approach was developed around 1975 by Herman Wold and co-workers for the modeling of complicated data sets in terms of chains of matrices (blocks), so-called "path models". Herman Wold developed a simple but efficient way to estimate the parameters in these models, called NIPALS (nonlinear iterative partial least squares). This led, in turn, to the acronym PLS for these models, where PLS stood for "partial least squares". This term describes the central part of the estimation, namely that each model parameter is iteratively estimated as the slope of a simple bivariate regression (least squares) between a matrix column or row as the y variable, and another parameter vector as the x variable. So, for instance, in each iteration the PLS weights w are re-estimated as w = u'X/(u'u), where u' denotes the transpose of the current u vector. The "partial" in PLS indicates that this is a partial regression, since the second parameter vector (u in the... [Pg.2007]
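A compact sketch (invented example, written in column-vector form rather than Wold's row-vector notation) of one PLS component estimated by such partial regressions: each parameter vector is the slope of a simple bivariate least-squares regression with the other vector held fixed, and the weight update corresponds to w = u'X/(u'u).

```python
import numpy as np

def pls_nipals_component(X, Y, tol=1e-10, max_iter=500):
    """Sketch of one PLS component via NIPALS-style partial regressions."""
    u = Y[:, [0]]                        # start with one column of Y
    for _ in range(max_iter):
        w = X.T @ u / (u.T @ u)          # weights: regress the columns of X on u (w = X'u/(u'u))
        w /= np.linalg.norm(w)
        t = X @ w                        # X-scores
        c = Y.T @ t / (t.T @ t)          # Y-weights: regress the columns of Y on t
        u_new = Y @ c / (c.T @ c)        # updated Y-scores
        if np.linalg.norm(u_new - u) < tol:
            u = u_new
            break
        u = u_new
    return w, t, c, u
```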

The main algorithms used for eigenvector/eigenvalue computation differ in two aspects. The first is the matrix to work on: either X'X (eigenvalue decomposition (EVD) and the POWER method) or X (singular value decomposition (SVD) and nonlinear iterative partial least squares (NIPALS)); however, SVD may work as well on X'X (giving the same results as eigenvalue decomposition). The second is whether PCs are obtained simultaneously (EVD and SVD) or sequentially (POWER and NIPALS); for details and a comparison of efficiency see Wu et al. [38]. In all cases for which the row dimension I is much smaller than the column dimension J, one can operate on XX' instead (EVD, POWER, SVD), and on X' (NIPALS). [Pg.86]
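A small numerical check (simulated data, not from the source) of the last point: when the number of rows I is much smaller than the number of columns J, the eigenvalues obtained from the SVD of X coincide with those of the small I x I matrix XX'.

```python
import numpy as np

rng = np.random.default_rng(2)
I, J = 20, 1000                                   # few objects (rows), many variables (columns)
X = rng.normal(size=(I, J))
X = X - X.mean(axis=0)                            # column-centering, as usual before PCA

# Simultaneous route: SVD of X; the squared singular values are the eigenvalues of X'X
U, s, Vt = np.linalg.svd(X, full_matrices=False)

# Because I << J, the same eigenvalues come from the small I x I matrix XX'
evals_small = np.linalg.eigvalsh(X @ X.T)[::-1]   # eigvalsh returns ascending order
print(np.allclose(s**2, evals_small))             # True
```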

In the MCR framework, there are few cases in which the quantitative analysis is based on the acquisition of a single spectrum per sample, as is the case for classical first-order multivariate calibration methods, such as partial least squares (PLS), seen in other chapters of this book. There are some instances in which quantitation of compounds in a sample by MCR can be based on a single spectrum, that is, a row of the D matrix and the related row of the C matrix. Sometimes, this is feasible when the compounds to be determined provide a very high signal compared with the rest of the substances in the food sample, for example colouring additives in drinks determined by ultraviolet-visible (UV-vis) spectroscopy [26,27]. Recently, these examples have increased due to the incorporation of a new constraint in MCR, the so-called correlation constraint [27,46,47], which introduces an internal calibration step in the calculation of the elements of the concentration profiles in the matrix C related to the analytes to be quantified. This calibration step helps to obtain real concentration values and to separate in a more efficient way the information of the analytes to be quantified from that of the interferences. [Pg.256]
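The correlation constraint itself is defined in [27,46,47]; purely as a hypothetical sketch of the internal-calibration idea (the function name and the exact update rule below are illustrative assumptions, not the published algorithm), one provisional column of C can be re-expressed in real concentration units through a local regression against the known reference concentrations of the calibration samples.

```python
import numpy as np

def correlation_constraint(c_col, calib_idx, c_ref):
    """Hypothetical sketch of a correlation-constraint update inside an MCR-ALS iteration.

    c_col     : provisional concentration profile of one analyte (one column of C)
    calib_idx : row indices of the calibration samples
    c_ref     : known reference concentrations of those samples
    """
    # Internal calibration step: local bivariate regression, reference ~ a*estimated + b
    a, b = np.polyfit(c_col[calib_idx], c_ref, deg=1)
    # Re-express the whole column (calibration and unknown samples) in concentration units
    return a * c_col + b
```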

Estimating the model parameter values is in general performed using one of two methods: the method of moments, leading to the Yule-Walker equations, or the maximum-likelihood method. Although the Yule-Walker equations are simpler, they only provide an efficient estimator for autoregressive models. Also, the Yule-Walker equations are useful for estimating the partial autocorrelation function. Least-squares estimates are also possible, but they are difficult to solve analytically due to the complex nature of the models. [Pg.241]
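A minimal sketch (simulated data, illustrative function name) of the method-of-moments route: the sample autocovariances are inserted into the Yule-Walker equations, which form a small Toeplitz system for the autoregressive coefficients.

```python
import numpy as np
from scipy.linalg import toeplitz

def yule_walker_ar(x, p):
    """Method-of-moments (Yule-Walker) estimate of AR(p) coefficients (sketch)."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    n = len(x)
    # Sample autocovariances r_0 ... r_p (biased estimator, divided by n)
    r = np.array([np.dot(x[:n - k], x[k:]) / n for k in range(p + 1)])
    phi = np.linalg.solve(toeplitz(r[:p]), r[1:])   # Yule-Walker equations
    sigma2 = r[0] - phi @ r[1:]                     # innovation variance
    return phi, sigma2

# Example: recover the coefficients of a simulated AR(2) process
rng = np.random.default_rng(3)
x = np.zeros(5000)
for t in range(2, len(x)):
    x[t] = 0.6 * x[t - 1] - 0.3 * x[t - 2] + rng.normal()
print(np.round(yule_walker_ar(x, 2)[0], 2))         # close to [0.6, -0.3]
```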

The facts to which the above two paragraphs refer suggest that, at least concerning the use of square-integrable functions for the calculation of resonance states, alternative theories are needed. Indeed, the CESE-SSA, whose basic elements and characteristics are reviewed here, is structured so as to allow the practical and efficient computation of the MEP in electronic structures, the multichannel continuum and partial widths, and, in general, the production of easily usable wavefunctions that contain the information that is relevant to the state and property of interest. [Pg.214]

