Multiple linear regression, use

While simple linear regression uses only one independent variable for modeling, multiple linear regression uses more variables. [Pg.446]

The data were statistically analyzed using the SOLO Statistical System (BMDP Statistical Software, Inc., Los Angeles, CA) on a personal computer. Differences between groups were tested by the Mann-Whitney test or, where paired data sets were tested, by a paired t-test. Possible relationships were studied with (multiple) linear regression using least-squares estimates. [Pg.127]

MLR Multiple linear regression (used as general term for all linear regression methods like OLS, PLS, PCR)... [Pg.308]

MULTIPLE LINEAR REGRESSION USING A POWER SERIES... [Pg.214]

MULTIPLE LINEAR REGRESSION USING THE ANALYSIS TOOLPAK... [Pg.216]

The function f is determined by multiple linear regression using precalculated sums of the individual components of the functions. [Pg.786]

Alternatively, instead of using the EBE of the parameter of interest as the dependent variable, an estimate of the random effect (t ) can be used as the dependent variable, similar to how partial residuals are used in stepwise linear regression. Early population pharmacokinetic methodology advocated multiple linear regression using either forward, backward, or stepwise models. A modification of this is to use multiple simple linear models, one for each covariate. For categorical covariates, analysis of variance is used instead. If the p-value for the omnibus F-test or the p-value for the t-test is less than some cut-off value, usually 0.05, the covariate is moved forward for further examination. Many reports in the literature use this approach. [Pg.236]

Besides these LFER-based models, approaches have been developed using whole-molecule descriptors and learning algorithms other than multiple linear regression (see Section 10.1.2). [Pg.494]

Multiple linear regression analysis is a widely used method, in this case assuming that a linear relationship exists between solubility and the 18 input variables. The multilinear regression analysis was performed by the SPSS program [30]. The training set was used to build a model, and the test set was used for the prediction of solubility. The MLRA model provided, for the training set, a correlation coefficient r = 0.92 and a standard deviation of s = 0.78, and for the test set, r = 0.94 and s = 0.68. [Pg.500]
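The train-then-predict workflow described above can be sketched in a few lines. This is a minimal illustration, not the SPSS analysis from the excerpt: the data are synthetic, only 3 hypothetical descriptors are used instead of 18, and the definition of s (residual standard deviation with n - p degrees of freedom on the training set, root-mean-square error on the test set) is an assumption, since the excerpt does not state its formula.

```python
import numpy as np

# Synthetic stand-in data (assumption): 60 compounds, 3 hypothetical descriptors.
rng = np.random.default_rng(0)
n_samples, n_descriptors = 60, 3
descriptors = rng.normal(size=(n_samples, n_descriptors))
true_weights = np.array([1.0, -2.0, 0.5])
solubility = descriptors @ true_weights + 0.5 + rng.normal(scale=0.3, size=n_samples)

# Design matrix with a leading column of ones for the intercept.
X = np.column_stack([np.ones(n_samples), descriptors])
X_train, y_train = X[:40], solubility[:40]
X_test, y_test = X[40:], solubility[40:]

# Build the model on the training set only.
coef, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)


def r_and_s(X_part, y_part, n_fitted_params):
    """Correlation between observed and predicted values, and the
    standard deviation of the residuals with n - n_fitted_params
    degrees of freedom."""
    y_pred = X_part @ coef
    r = np.corrcoef(y_part, y_pred)[0, 1]
    dof = len(y_part) - n_fitted_params
    s = np.sqrt(np.sum((y_part - y_pred) ** 2) / dof)
    return r, s


# Training set: the fitted coefficients cost degrees of freedom.
r_train, s_train = r_and_s(X_train, y_train, X.shape[1])
# Test set: nothing was fitted to these points, so s reduces to the RMSE.
r_test, s_test = r_and_s(X_test, y_test, 0)

print(f"training: r = {r_train:.2f}, s = {s_train:.2f}")
print(f"test:     r = {r_test:.2f}, s = {s_test:.2f}")
```

Reporting r and s separately for the two sets, as the excerpt does, shows whether the fit carries over to compounds that were not used to build the model.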

Multiple linear regression is strictly a parametric supervised learning technique. A parametric technique is one which assumes that the variables conform to some distribution (often the Gaussian distribution); the properties of the distribution are assumed in the underlying statistical method. A non-parametric technique does not rely upon the assumption of any particular distribution. A supervised learning method is one which uses information about the dependent variable to derive the model. An unsupervised learning method does not. Thus cluster analysis, principal components analysis and factor analysis are all examples of unsupervised learning techniques. [Pg.719]

Using a multiple linear regression computer program, a set of substituent parameters was calculated for a number of the most commonly occurring groups. The calculated substituent effects allow a prediction of the chemical shifts of the exterior and central carbon atoms of the allene with standard deviations of 1.5 and 2.3 ppm, respectively. Although most compounds were measured as neat liquids, for a number of compounds duplicate measurements were obtained in various solvents. [Pg.253]

Most of the 2D QSAR methods are based on graph theoretic indices, which have been extensively studied by Randic [29] and Kier and Hall [30,31]. Although these structural indices represent different aspects of molecular structures, their physicochemical meaning is unclear. Successful applications of these topological indices combined with multiple linear regression (MLR) analysis are summarized in Ref. 31. On the other hand, parameters derived from various experiments through chemometric methods have also been used in the study of peptide QSAR, where partial least squares (PLS) [32] analysis has been employed [33]. [Pg.359]

We will explore the two major families of chemometric quantitative calibration techniques that are most commonly employed: the Multiple Linear Regression (MLR) techniques, and the Factor-Based Techniques. Within each family, we will review the various methods commonly employed, learn how to develop and test calibrations, and how to use the calibrations to estimate, or predict, the properties of unknown samples. We will consider the advantages and limitations of each method as well as some of the tricks and pitfalls associated with their use. While our emphasis will be on quantitative analysis, we will also touch on how these techniques are used for qualitative analysis, classification, and discriminative analysis. [Pg.2]

Experimental polymer rheology data obtained in a capillary rheometer at different temperatures are used to determine the unknown coefficients in Equations 11 - 12. Multiple linear regression is used for parameter estimation. The values of these coefficients for three different polymers are shown in Table I. The polymer rheology is shown in Figures 2 - 4. [Pg.137]

The study is based on four linear hydrocarbons (in Ci, Ce to Ca) and the model uses Antoine's and Clapeyron's equations. The flashpoints used by the author do not take into account all experimental values that are currently available; the correlation coefficients obtained during multiple linear regression adjustments between experimental and estimated values are very bad (0.90 to 0.98; see the huge errors obtained from a correlation study concerning flashpoints for which the present writer still has a coefficient of 0.9966). The model can be used if differences between pure compounds are still low regarding boiling and flashpoints. [Pg.69]

Aqueous solubility is selected to demonstrate the E-state application in QSPR studies. Huuskonen et al. modeled the aqueous solubility of 734 diverse organic compounds with multiple linear regression (MLR) and artificial neural network (ANN) approaches [27]. The set of structural descriptors comprised 31 E-state atomic indices, and three indicator variables for pyridine, aliphatic hydrocarbons and aromatic hydrocarbons, respectively. The dataset of 734 chemicals was divided into a training set (n=675), a validation set (n=38) and a test set (n=21). A comparison of the MLR results (training, r=0.94, s=0.58; validation, r=0.84, s=0.67; test, r=0.80, s=0.87) and the ANN results (training, r=0.96, s=0.51; validation, r=0.85, s=0.62; test, r=0.84, s=0.75) indicates a small improvement for the neural network model with five hidden neurons. These QSPR models may be used for a fast and reliable computation of the aqueous solubility for diverse organic compounds. [Pg.93]

There are many different methods for selecting those descriptors of a molecule that capture the information that somehow encodes the compound's solubility. Currently, the most often used are multiple linear regression (MLR), partial least squares (PLS) or neural networks (NN). The former two methods provide a simple linear relationship between several independent descriptors and the solubility, as given in Eq. (14). This equation yields the independent contribution, hi, of each descriptor, Di, to the solubility ... [Pg.302]

Two models of practical interest using quantum chemical parameters were developed by Clark et al. [26, 27]. Both studies were based on 1085 molecules and 36 descriptors calculated with the AM1 method following structure optimization and electron density calculation. An initial set of descriptors was selected with a multiple linear regression model and further optimized by trial-and-error variation. The second study calculated a standard error of 0.56 for 1085 compounds and it also estimated the reliability of neural network prediction by analysis of the standard deviation error for an ensemble of 11 networks trained on different randomly selected subsets of the initial training set [27]. [Pg.385]

In multiple linear regression (MLR) we are given an n×p matrix X and an n-vector y. The problem is to find an unknown p-vector b such that the product ŷ of X with b is as close as possible to the original y using a least squares criterion ... [Pg.53]
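The least squares problem stated above can be sketched with NumPy as follows. The numbers are hypothetical and chosen only for illustration (n = 5, p = 3, with the first column of X set to ones so that b includes an intercept); `np.linalg.lstsq` is one common way to obtain b, not necessarily the procedure the excerpt goes on to describe.

```python
import numpy as np

# Hypothetical data (assumption): n = 5 observations, p = 3 columns.
# The first column of ones makes b[0] an intercept.
X = np.array([
    [1.0, 0.0, 1.0],
    [1.0, 1.0, 0.0],
    [1.0, 2.0, 1.0],
    [1.0, 3.0, 0.0],
    [1.0, 4.0, 1.0],
])
y = np.array([1.1, 2.4, 2.0, 3.6, 2.9])

# Find b minimizing the sum of squared differences between y and X @ b.
b, _, rank, _ = np.linalg.lstsq(X, y, rcond=None)

# Fitted values y_hat and the residuals that remain.
y_hat = X @ b
residuals = y - y_hat

print("b =", b)
print("sum of squared residuals =", float(residuals @ residuals))
```

At the least squares solution the residual vector is orthogonal to every column of X, that is, X.T @ residuals is zero up to rounding error; this gives a quick check that a computed b is indeed the minimizer.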

Note that the lipophilicity parameter log P is defined as a decimal logarithm. The parabolic equation is only non-linear in the variable log P, but is linear in the coefficients. Hence, it can be solved by multiple linear regression (see Section 10.8). The bilinear equation, however, is non-linear in both the variable P and the coefficients, and can only be solved by means of non-linear regression techniques (see Chapter 11). It is approximately linear with a positive slope (/ ,) for small values of log P, while it is also approximately linear with a negative slope b + b for large values of log P. The term bilinear is used in this context to indicate that the QSAR model can be resolved into two linear relations for small and for large values of P, respectively. This definition differs from the one which has been introduced in the context of principal components analysis in Chapter 17. [Pg.390]
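The point that the parabolic equation is linear in its coefficients can be made concrete: treating log P and (log P)² as two separate predictor columns turns the fit into an ordinary multiple linear regression. The sketch below assumes a parabolic model of the form activity = b0 + b1·log P + b2·(log P)²; the coefficient names, the response name `activity` and the noise-free data are hypothetical and are not taken from the source.

```python
import numpy as np

# Hypothetical, noise-free data (assumption) generated from
# activity = 1.0 + 1.2 * logP - 0.3 * logP**2
log_p = np.array([-1.0, 0.0, 1.0, 2.0, 3.0, 4.0])
activity = 1.0 + 1.2 * log_p - 0.3 * log_p**2

# The model is non-linear in log P but linear in b0, b1, b2,
# so each term simply becomes one column of the design matrix.
X = np.column_stack([np.ones_like(log_p), log_p, log_p**2])

b, *_ = np.linalg.lstsq(X, activity, rcond=None)
b0, b1, b2 = b

# With b2 < 0 the parabola has its maximum where the derivative
# b1 + 2 * b2 * logP vanishes.
log_p_opt = -b1 / (2.0 * b2)

print("coefficients:", b)
print("optimum log P:", log_p_opt)
```

No such rearrangement exists for the bilinear equation, because there a coefficient appears inside the non-linear term, which is why the text assigns it to non-linear regression.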
