Multilinear modeling analysis

Another problem is to determine the optimal number of descriptors for the objects (patterns), such as for the structure of the molecule. A widespread rule of thumb is that the number of descriptors should be kept as low as 20 % of the number of objects in the dataset. However, this holds only for ordinary Multilinear Regression Analysis. Some more advanced methods, such as Projection to Latent Structures (or Partial Least Squares, PLS), use so-called latent variables for both modeling and prediction. [Pg.205]

Furthermore, QSPR models for the prediction of free-energy-based properties that rely on multilinear regression analysis are often referred to as LFER models, especially in the wide field of quantitative structure-activity relationships (QSAR). [Pg.489]

The models are applicable to large data sets and calculate rapidly, so a wide range of compounds can be processed. Neural networks provided better models than multilinear regression analysis. [Pg.504]

More than just a few parameters have to be considered when modelling chemical reactivity in a broader perspective than for the well-defined but restricted reaction sets of the preceding section. Here, however, not enough statistically well-balanced, quantitative, experimental data are available to allow multilinear regression analysis (MLRA). An additional complicating factor derives from comparison of various reactions, where data of quite different types are encountered. For example, how can product distributions for electrophilic aromatic substitutions be compared with acidity constants of aliphatic carboxylic acids? And on the side of the parameters, how can the influence on chemical reactivity of both bond dissociation energies and bond polarities be simultaneously handled when only limited data are available ... [Pg.60]

Algebraic expressions for terms M and C were derived using Dewar's PMO method (for C in a version similar to the ω-technique [57] in order to calculate carbocation stabilization energies). The size factor S is simply a cubic function of the number of carbon atoms [97]. The three independent variables of the model were assumed to be linearly related to the experimental Iball indices (vide supra). By multilinear regression analysis (sample size = 26) an equation was derived for calculating Iball indices from the three theoretical parameters. The correlation coefficient for the linear relation between calculated and experimental Iball indices is r = 0.961. [Pg.120]

Thus, multilinear models were introduced, followed by a wide series of tools, such as nonlinear models, including artificial neural networks, fuzzy logic, Bayesian models, and expert systems. A number of reviews deal with the different techniques [4-6]. Mathematical techniques have also been used to take into account the high number (up to several thousands) of chemical descriptors and fragments that can be used for modeling purposes, which brings the problem of increased noise and lack of statistical robustness. Also in this case, linear and nonlinear methods have been used, such as principal component analysis (PCA) and genetic algorithms (GA) [6]. [Pg.186]

Most conventional filters involve computing local multilinear models, but in certain areas, such as process analysis, there can be spikes (or outliers) in the data which are unlikely to be part of a continuous process. An alternative method involves using... [Pg.134]

The parallel factor analysis (PARAFAC) model [18-20] is based on a multilinear model, and is one of several decomposition methods for a multidimensional data set. A major advantage of this model is that data can be uniquely decomposed into individual contributions. Because of this, the PARAFAC model has been widely applied to 3D and also higher dimensional data in the field of chemometrics. It is known that fluorescence data is one example that corresponds well with the PARAFAC model [21]. [Pg.342]
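
As a minimal sketch of what "multilinear" means here, the three-way (trilinear) PARAFAC model writes each data element as a sum over components of a product of three factor entries, x[i, j, k] = sum_r A[i, r] * B[j, r] * C[k, r]. The NumPy snippet below only builds a noise-free array from given factor matrices; it does not fit a model. The function name `parafac_reconstruct`, the mode names (samples, emission, excitation) and all sizes are illustrative assumptions, not taken from the source.

```python
import numpy as np


def parafac_reconstruct(A, B, C):
    """Build a three-way array from PARAFAC factor matrices.

    x[i, j, k] = sum_r A[i, r] * B[j, r] * C[k, r]
    """
    return np.einsum("ir,jr,kr->ijk", A, B, C)


# Hypothetical sizes: 4 samples, 5 emission and 6 excitation channels,
# 2 components. Values are random and purely illustrative.
rng = np.random.default_rng(0)
n_components = 2
A = rng.random((4, n_components))  # sample mode (e.g. concentrations)
B = rng.random((5, n_components))  # second mode (e.g. emission profiles)
C = rng.random((6, n_components))  # third mode (e.g. excitation profiles)

X = parafac_reconstruct(A, B, C)
print(X.shape)  # (4, 5, 6)
```

Each component contributes one rank-one array (an outer product of three vectors), which is why the decomposition can be read as a sum of individual contributions.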

It is clear that for an unsymmetrical data matrix that contains more variables (the field descriptors at each point of the grid for each probe used for calculation) than observables (the biological activity values), classical correlation analysis such as multilinear regression analysis would fail. All 3D QSAR methods benefit from the development of PLS analysis, a statistical technique that aims to find the multidimensional direction in the X space that explains the maximum multidimensional variance direction in the Y space. PLS is related to principal component analysis (PCA). However, instead of finding the hyperplanes of maximum variance, it finds a linear model describing some predicted variables in terms of other observable variables and therefore can be used directly for prediction. Complexity reduction and data... [Pg.592]
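
One way to see why multilinear regression fails with more variables than observations is to look at the rank of the data matrix. The sketch below uses random numbers (not 3D QSAR field data), with assumed sizes of 10 objects and 50 descriptors. It shows that two different coefficient vectors reproduce the training responses exactly, so the regression coefficients are not uniquely determined.

```python
import numpy as np

# Illustrative random data: fewer objects than descriptors (assumed sizes).
rng = np.random.default_rng(0)
n_objects, n_descriptors = 10, 50
X = rng.normal(size=(n_objects, n_descriptors))
y = rng.normal(size=n_objects)

# The rank cannot exceed the number of objects.
rank = np.linalg.matrix_rank(X)

# lstsq returns the minimum-norm solution of the underdetermined system.
b_min, *_ = np.linalg.lstsq(X, y, rcond=None)

# The last right singular vector lies in the null space of X, so adding
# any multiple of it leaves the fitted values unchanged.
_, _, Vt = np.linalg.svd(X)
b_alt = b_min + 5.0 * Vt[-1]

fit_min = np.allclose(X @ b_min, y)
fit_alt = np.allclose(X @ b_alt, y)
print(rank, fit_min, fit_alt)
```

Both coefficient vectors fit the training data perfectly, yet they would give different predictions for a new object. Latent-variable methods such as PLS avoid this by regressing on a small number of score vectors instead of on all descriptors.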

When data can be assumed to be approximately multilinear there is little if any benefit in matricizing the data before analysis. Even though the two-way models describe more variation per definition, the increased modeling power does not necessarily provide more predictive models in terms of modeling either the independent or the dependent variables. Even when the data do not approximately follow a multilinear model (e.g. sensory data), the multilinear models can be preferred if the possible bias in having too simple an X-model is counteracted by the smaller amount of overfit. [Pg.288]
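
To make "matricizing" concrete, the sketch below unfolds a small three-way array (objects x J x K) into a two-way matrix (objects x J*K) and compares the number of loading parameters per component. The sizes are arbitrary assumptions for illustration, and the parameter counts assume one loading vector of length J*K per component for the unfolded two-way model versus one vector of length J plus one of length K for a trilinear model.

```python
import numpy as np

# Hypothetical three-way array: 2 objects x 3 variables x 4 variables.
n_objects, J, K = 2, 3, 4
X = np.arange(n_objects * J * K).reshape(n_objects, J, K)

# Matricizing (unfolding): keep the object mode, merge the other two.
X_unfolded = X.reshape(n_objects, J * K)
print(X_unfolded.shape)  # (2, 12)

# Loading parameters needed per component in each kind of model.
params_two_way = J * K    # one long loading vector
params_trilinear = J + K  # one loading vector per variable mode
print(params_two_way, params_trilinear)  # 12 7
```

The gap between J*K and J+K grows quickly with the size of the variable modes, which is the source of the extra flexibility, and the extra risk of overfit, in the unfolded model.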

From the perspective of multilinear modeling, we note that one can use a model in which some components for some ways are described by a parametric model, while other components or other ways continue to be described by a general multilinear model. Such a parametric submodel has fewer parameters and may thus be more parsimonious and more accurate than a general multilinear model. When a parametric model is used for all components of at least one way, use of the resulting multilinear submodel is equivalent to global analysis. [Pg.692]

When the dependence of the spectroscopic intensity from every chromophore on at least one experimental variable can be described by a highly specific mathematical function, then the approach known as global analysis is preferred. When this condition is not known to be met, but spectroscopic intensity is separately linear in functions of two or more experimental variables, then the multilinear models described in this chapter are valuable. [Pg.700]

The empirical correlations do not differentiate between configurational isomers or polymorphs, but they are useful for estimating the overall expected values of PE. As can be seen from Table 8-2, the values derived from the multilinear regression analysis (MLRA) are on average 28 % ( ) lower than the PEs calculated by the force field, which indicates a likely overestimation of polar forces by the chosen charge model. In contrast, the values for the least polar molecule listed in Table 8-2, P.B.16, are almost identical. [Pg.107]

Multiple linear regression analysis is a widely used method, in this case assuming that a linear relationship exists between solubility and the 18 input variables. The multilinear regression analysis was performed by the SPSS program [30]. The training set was used to build a model, and the test set was used for the prediction of solubility. The MLRA model provided, for the training set, a correlation coefficient r = 0.92 and a standard deviation of s = 0.78, and for the test set, r = 0.94 and s = 0.68. [Pg.500]
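
The train/test workflow described above can be sketched in a few lines of NumPy. This is not the SPSS procedure of the source and does not use its solubility data: the data are synthetic, the helper names (`fit_mlr`, `predict_mlr`, `r_and_s`) and the set sizes are hypothetical, and s is assumed here to be the root-mean-square residual, since the source does not give its exact definition. Only the count of 18 input variables is taken from the text.

```python
import numpy as np


def fit_mlr(X, y):
    """Least-squares fit of y = b0 + X @ b; returns [b0, b1, ..., bp]."""
    X1 = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return coef


def predict_mlr(coef, X):
    """Predict responses from intercept coef[0] and slopes coef[1:]."""
    return coef[0] + X @ coef[1:]


def r_and_s(y_true, y_pred):
    """Correlation coefficient and RMS residual (assumed definition of s)."""
    r = np.corrcoef(y_true, y_pred)[0, 1]
    s = np.sqrt(np.mean((y_true - y_pred) ** 2))
    return r, s


# Synthetic stand-in data: 18 input variables, linear response plus noise.
rng = np.random.default_rng(0)
n_train, n_test, n_vars = 200, 50, 18
true_coef = rng.normal(size=n_vars)
X_train = rng.normal(size=(n_train, n_vars))
X_test = rng.normal(size=(n_test, n_vars))
y_train = X_train @ true_coef + rng.normal(scale=0.5, size=n_train)
y_test = X_test @ true_coef + rng.normal(scale=0.5, size=n_test)

coef = fit_mlr(X_train, y_train)  # model built on the training set only
r_train, s_train = r_and_s(y_train, predict_mlr(coef, X_train))
r_test, s_test = r_and_s(y_test, predict_mlr(coef, X_test))
print(f"train: r = {r_train:.2f}, s = {s_train:.2f}")
print(f"test:  r = {r_test:.2f}, s = {s_test:.2f}")
```

Keeping the test set out of the fit is what makes the test-set r and s an estimate of predictive ability, not just of goodness of fit.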

Paatero P, The multilinear engine - a table-driven, least squares program for solving multilinear problems, including the n-way parallel factor analysis model, Journal of Computational and Graphical Statistics, 1999, 8, 854-888. [Pg.363]

The experimental results are also listed in table 9.14. The data were analysed according to the two models, equations 9.9 and 9.10. Since some measurements of solubility were in duplicate, we can estimate the reproducibility of the experimental technique. It is possible to estimate the model with the data for the duplicated points 1-6, and then validate it with the test points 7-12. Instead we estimate it by multilinear regression over all the data and then test by analysis of variance. [Pg.412]

In the regular analysis and reinforcement of the ring beam, the beam is simplified to a multilinear beam for modeling. The required assumptions are ... [Pg.335]

If gas selectivity cannot be achieved by improving the sensor setup itself, it is possible to use several nonselective sensors and predict the concentration by model-based algorithms, such as multilinear regression (MLR), principal component analysis (PCA), principal component regression (PCR), partial least squares (PLS), and multivariate adaptive regression splines (MARS), or data-based algorithms, such as cluster analysis (CA) and artificial neural networks (ANN) (for details see Reference 10) (Figure 22.5). For common applications of pattern recognition and multicomponent analysis of gas mixtures, arrays of sensors are usually chosen... [Pg.686]

Big Chemical Encyclopedia