
Multiple linear regression MLR

An extension of linear regression, MLR involves the use of more than one independent variable. Such a technique can be very effective if it is suspected that the information contained in a single independent variable (X) is insufficient to explain the variation in the dependent variable (Y). For example, a single integrated absorbance of the NIR water band at 1920 nm may be insufficient to provide accurate water concentrations in process samples. Such a situation can occur for several reasons, such as the presence of other varying chemical components in the sample that interfere with the 1920-nm band. In such cases, it is necessary to use more than one band in the spectrum to build an effective calibration model, so that the effects of such interferences can be compensated for. [Pg.236]

The MLR model is simply an extension of the linear regression model (Equation 8.6), and is given below  [Pg.236]

The difference here is that X is a matrix that contains responses from M (>1) different X-variables, and b contains regression coefficients for each of the M X-variables. If X and b are augmented to include an offset term (as in Equation 8.7), the MLR regression coefficients are determined using the least squares method: [Pg.236]

At this point, it is important to note two limitations of the MLR method. First, note that the X-variable matrix (X) has the dimensionality of N by M. As a result, the number of X-variables (M) cannot exceed the number of samples (N); otherwise the matrix inversion operation (XᵀX)⁻¹ in Equation 8.13 cannot be done. Secondly, if any two of the X-variables are exactly correlated to one another, then the same matrix inversion cannot be done. In real applications, where there is noise in the data, it is rare to have two X-variables exactly correlated to one another. However, a high degree of correlation between any two X-variables leads to an unstable matrix inversion, which results in a large amount of noise being introduced to the regression coefficients. Therefore, one must be wary of intercorrelation between X-variables when using the MLR method. [Pg.236-237]
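These two limitations can be seen numerically. The sketch below (synthetic data; all variable names are illustrative, not from the text) solves the MLR normal equations and then shows how making two X-variables nearly identical turns (XᵀX)⁻¹ into an unstable inversion:

```python
# Sketch of the MLR least-squares solution and the collinearity problem,
# using hypothetical data (X, y, b_true are illustrative stand-ins).
import numpy as np

rng = np.random.default_rng(0)
N, M = 10, 3                          # N samples must exceed M X-variables
X = rng.normal(size=(N, M))
b_true = np.array([1.0, -2.0, 0.5])
y = X @ b_true + 0.01 * rng.normal(size=N)

# Least-squares coefficients: b = (X'X)^-1 X'y
b = np.linalg.solve(X.T @ X, X.T @ y)

# Now make two X-variables nearly identical: (X'X) becomes ill-conditioned,
# so small noise in y is amplified into large noise in the coefficients.
X_bad = X.copy()
X_bad[:, 1] = X_bad[:, 0] + 1e-8 * rng.normal(size=N)
print(np.linalg.cond(X_bad.T @ X_bad))   # enormous condition number
```

With well-separated columns the recovered b is close to b_true; with the near-duplicate column the condition number of XᵀX explodes, which is the instability the text warns about.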

In practice this simple additive model may not describe the situation completely. There are two reasons for this. The first is that the substances of interest may interfere with each other chemically in a way that affects their spectra. The second is that the specimens from real-life sources may well contain substances other than those of interest, which make a contribution to the absorbance. In these cases it is better to use inverse calibration and calibrate with real-life specimens. The term inverse calibration means that the analyte concentration is modelled as a function of the spectrum (i.e. the reverse of the classical method). For the data in Table 8.4 the regression equations take the form c1 = b01 + b11A1 + b21A2 + ... [Pg.229]

The following sections describe a number of methods for predicting one set of variables from another set of variables. In each case the inverse calibration method is illustrated using the data in Table 8.4. [Pg.229]

Multiple linear regression (MLR) involves finding regression equations of the form c1 = b01 + b11A1 + b21A2 + ... In order to carry out MLR the number of calibration specimens must be greater than the number of predictors. This is true for the data in Table 8.4, where there are 10 specimens and six predictors. [Pg.229]
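Since Table 8.4 itself is not reproduced here, the sketch below fits an inverse-calibration MLR model of the same shape (10 specimens, six absorbance predictors plus an offset) on synthetic stand-in data:

```python
# Minimal sketch of inverse calibration by MLR: concentration c is modelled
# as a function of six absorbances A1..A6 over 10 calibration specimens.
# All data are synthetic stand-ins for Table 8.4, which is not reproduced.
import numpy as np

rng = np.random.default_rng(1)
A = rng.uniform(0.1, 1.0, size=(10, 6))       # 10 specimens > 6 predictors
b_true = np.array([0.2, 1.5, -0.4, 0.8, 0.0, 0.3])
c = 0.05 + A @ b_true + 0.001 * rng.normal(size=10)

# Augment with a column of ones for the offset b0, then solve by least squares
A_aug = np.column_stack([np.ones(10), A])
coef, *_ = np.linalg.lstsq(A_aug, c, rcond=None)
print("b0 =", coef[0], " b1..b6 =", coef[1:])
```

The fitted coefficients play the role of b01, b11, ..., b61 in the regression equation above; predicting a new specimen is then a single dot product with its absorbances.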

Find the regression equations for predicting c1 and c3 from A1, A2, etc., for the data in Table 8.4. [Pg.230]

This printout gives the regression equation for predicting c1 from A1, A2, etc., as [Pg.230]


Kohonen networks, conceptual clustering, Principal Component Analysis (PCA), decision trees, Partial Least Squares (PLS), Multiple Linear Regression (MLR), counter-propagation networks, back-propagation networks, genetic algorithms (GA). [Pg.442]

Sections 9A.2-9A.6 introduce different multivariate data analysis methods, including Multiple Linear Regression (MLR), Principal Component Analysis (PCA), Principal Component Regression (PCR) and Partial Least Squares regression (PLS). [Pg.444]

Multiple linear regression (MLR) models a linear relationship between a dependent variable and one or more independent variables. [Pg.481]

Most of the 2D QSAR methods are based on graph theoretic indices, which have been extensively studied by Randic [29] and Kier and Hall [30,31]. Although these structural indices represent different aspects of molecular structures, their physicochemical meaning is unclear. Successful applications of these topological indices combined with multiple linear regression (MLR) analysis are summarized in Ref. 31. On the other hand, parameters derived from various experiments through chemometric methods have also been used in the study of peptide QSAR, where partial least square (PLS) [32] analysis has been employed [33]. [Pg.359]

We will explore the two major families of chemometric quantitative calibration techniques that are most commonly employed: the Multiple Linear Regression (MLR) techniques and the Factor-Based Techniques. Within each family, we will review the various methods commonly employed, learn how to develop and test calibrations, and learn how to use the calibrations to estimate, or predict, the properties of unknown samples. We will consider the advantages and limitations of each method, as well as some of the tricks and pitfalls associated with their use. While our emphasis will be on quantitative analysis, we will also touch on how these techniques are used for qualitative analysis, classification, and discriminative analysis. [Pg.2]

Classical least-squares (CLS), sometimes known as K-matrix calibration, is so called because, originally, it involved the application of multiple linear regression (MLR) to the classical expression of the Beer-Lambert Law of spectroscopy ... [Pg.51]
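As a sketch of the idea (not the book's worked example), the Beer-Lambert relationship can be written in matrix form as A = CK, where A holds the mixture spectra, C the known concentrations, and K the pure-component (K-matrix) spectra estimated by least squares. All data below are synthetic:

```python
# Hedged sketch of classical least-squares (K-matrix) calibration.
# Calibration: estimate K from known concentrations C and measured spectra A.
# Prediction: recover concentrations of a new spectrum from the fitted K.
import numpy as np

rng = np.random.default_rng(2)
n, k, w = 8, 2, 50                    # 8 mixtures, 2 analytes, 50 wavelengths
K_true = np.abs(rng.normal(size=(k, w)))
C = rng.uniform(0.1, 1.0, size=(n, k))
A = C @ K_true + 0.001 * rng.normal(size=(n, w))

# Calibration step: K = (C'C)^-1 C'A
K = np.linalg.solve(C.T @ C, C.T @ A)

# Prediction step for a new spectrum a: c = a K' (K K')^-1
c_new = np.array([0.4, 0.7])
a_new = c_new @ K_true
c_pred = a_new @ K.T @ np.linalg.inv(K @ K.T)
```

Note the contrast with the inverse methods discussed elsewhere in this section: CLS requires the concentrations of all spectrally active components in the calibration set, which is why inverse calibration is preferred for real-life specimens.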

Multiple Linear Regression (MLR), Classical Least-Squares (CLS, K-matrix), Inverse Least-Squares (ILS, P-matrix)... [Pg.191]

Aqueous solubility is selected to demonstrate the E-state application in QSPR studies. Huuskonen et al. modeled the aqueous solubility of 734 diverse organic compounds with multiple linear regression (MLR) and artificial neural network (ANN) approaches [27]. The set of structural descriptors comprised 31 E-state atomic indices, and three indicator variables for pyridine, aliphatic hydrocarbons and aromatic hydrocarbons, respectively. The dataset of 734 chemicals was divided into a training set (n=675), a validation set (n=38) and a test set (n=21). A comparison of the MLR results (training, r2=0.94, s=0.58; validation, r2=0.84, s=0.67; test, r2=0.80, s=0.87) and the ANN results (training, r2=0.96, s=0.51; validation, r2=0.85, s=0.62; test, r2=0.84, s=0.75) indicates a small improvement for the neural network model with five hidden neurons. These QSPR models may be used for a fast and reliable computation of the aqueous solubility for diverse organic compounds. [Pg.93]

There are many different methods for selecting those descriptors of a molecule that capture the information that somehow encodes the compound's solubility. Currently, the most often used are multiple linear regression (MLR), partial least squares (PLS) or neural networks (NN). The former two methods provide a simple linear relationship between several independent descriptors and the solubility, as given in Eq. (14). This equation yields the independent contribution, h_i, of each descriptor, D_i, to the solubility: [Pg.302]
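As an illustration of such an additive model (the descriptor names, coefficients, and intercept below are invented, not taken from Eq. (14) itself), each term h_i * D_i is one descriptor's independent contribution to the predicted solubility:

```python
# Toy additive solubility model: logS = intercept + sum_i h_i * D_i.
# Descriptor names and all numeric values are hypothetical.
h = {"logP": -1.05, "MW": -0.006, "HBD": 0.32}   # fitted coefficients h_i
D = {"logP": 2.1, "MW": 180.2, "HBD": 1.0}       # descriptor values D_i
intercept = 0.5

# Each descriptor's independent contribution, then the total prediction
contributions = {name: h[name] * D[name] for name in h}
logS = intercept + sum(contributions.values())
```

The additivity is what makes MLR and PLS models interpretable: inspecting `contributions` shows directly which descriptor drives the prediction.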

Fig. 29.10. Geometrical interpretation of multiple linear regression (MLR). The pattern of points in S representing a matrix X is projected upon a vector b, which is imaged in S' by the point ŷ. The orientation of the vector b is determined such that the distance between ŷ and the given y is minimal. [Pg.52]

In multiple linear regression (MLR) we are given an n×p matrix X and an n-vector y. The problem is to find an unknown p-vector b such that the product ŷ of X with b is as close as possible to the original y using a least squares criterion: [Pg.53]
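A minimal numerical sketch of this criterion (synthetic data): the least-squares ŷ = Xb is the orthogonal projection of y onto the column space of X, so the residual y - ŷ is orthogonal to every column of X:

```python
# Least-squares b via lstsq; the projection property is checked by
# verifying that X' (y - y_hat) is (numerically) zero.
import numpy as np

rng = np.random.default_rng(3)
n, p = 6, 2
X = rng.normal(size=(n, p))
y = rng.normal(size=n)

b, *_ = np.linalg.lstsq(X, y, rcond=None)
y_hat = X @ b                       # the point ŷ of Fig. 29.10

print(X.T @ (y - y_hat))            # ~ zeros: residual ⟂ columns of X
```

This orthogonality is exactly the geometry of Fig. 29.10: no other choice of b can bring ŷ closer to y.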

OLS is also called multiple linear regression (MLR) and is a commonly used method to obtain a linear input-output model for a given data set. The model obtained by OLS for a single output is given by... [Pg.33]

Multiple linear regression (MLR) is a classic mathematical multivariate regression analysis technique [39] that has been applied to quantitative structure-property relationship (QSPR) modeling. However, when using MLR there are some aspects, with respect to statistical issues, that the researcher must be aware of ... [Pg.398]

More on Multiple linear least squares regression (MLLSR), also known as Multiple linear regression (MLR) and P-matrix, and its sibling, K-matrix... [Pg.3]

In Chapters 2 and 3, we discussed the rules related to solving systems of linear equations using elementary algebraic manipulation, including simple matrix operations. The past chapters have described the inverse and transpose of a matrix in at least an introductory fashion. In this installment we would like to introduce the concepts of matrix algebra and their relationship to multiple linear regression (MLR). Let us start with the basic spectroscopic calibration relationship ... [Pg.28]

Therefore this chapter will continue the multiple linear regression (MLR) discussion introduced in the previous chapter, by solving a numerical example for MLR. Recalling... [Pg.34]

Since most quantitative applications are on mixtures of materials, complex mathematical treatments have been developed. The most common programs are Multiple Linear Regression (MLR), Partial Least Squares (PLS), and Principal Component Analysis (PCA). While these are described in detail in another chapter, they will be described briefly here. [Pg.173]

In a paper that addresses both these topics, Gordon et al. [11] explain how they followed a corn mixture fermented by Fusarium moniliforme spores. They followed the concentrations of starch, lipids, and protein throughout the reaction. The amounts of Fusarium and even corn were also measured. A multiple linear regression (MLR) method was satisfactory, with standard errors of prediction (SEP) for the constituents being 0.37% for starch, 4.57% for lipid, 4.62% for protein, 2.38% for Fusarium, and 0.16% for corn. It may be inferred from the data that PLS or PCA (principal components analysis) may have given more accurate results. [Pg.387]

If the system is not simple, an inverse calibration method can be employed, where it is not necessary to obtain the spectra of the pure analytes. The three inverse methods discussed later in this chapter include multiple linear regression (MLR), principal components regression (PCR), and partial least squares (PLS). When using MLR on data sets found in chemistry, variable selection is... [Pg.98]

