Linear Regression with Multivariate Data

In this chapter we expand the linear regression calculation into higher dimensions, i.e. instead of a vector y of measurements and a vector a of fitted linear parameters, we deal with matrices Y of data and A of parameters. [Pg.139]

We derive the new concept by using a chemical example based on absorption data. First, consider a consecutive reaction A → B → C with rate constants k1 and k2, where the absorption at one particular wavelength was recorded as a function of time. Let's say our task is to determine the molar absorptivities of species A, B and C at this wavelength, knowing all individual concentrations at all reaction times. [Pg.139]

Previously we used the notation F for the matrix of the known function. In many chemical applications involving spectroscopic absorption [Pg.139]

The above example of recording the kinetics of the reaction A → B → C at one wavelength is then best described by the matrix equation y = C × a + r. [Pg.140]

The (ns × 1) column vector y contains the absorption data at ns reaction times; the concentration profiles of the three species A, B and C form the columns of an (ns × 3) matrix C, and their molar absorptivities form a (3 × 1) column vector a. Vector r contains the residuals between y and C × a and has the same dimensions as y. [Pg.140]
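The single-wavelength fit described above can be sketched in Python with NumPy. This is only an illustration under stated assumptions: the reaction is taken to be first order in both steps with unequal rate constants, and the function name `concentration_profiles`, the rate constants, the absorptivities and the noise level are all made-up values, not taken from the text.

```python
import numpy as np

def concentration_profiles(t, k1, k2, a0=1.0):
    """Concentrations of A, B, C for A -> B -> C.

    Assumes first-order steps with k1 != k2 and only A present at t = 0.
    Returns an (ns, 3) matrix whose columns are [A], [B], [C].
    """
    ca = a0 * np.exp(-k1 * t)
    cb = a0 * k1 / (k2 - k1) * (np.exp(-k1 * t) - np.exp(-k2 * t))
    cc = a0 - ca - cb                      # mass balance
    return np.column_stack([ca, cb, cc])

# Hypothetical example values (assumptions for illustration only)
t = np.linspace(0.0, 50.0, 51)             # ns = 51 reaction times
C = concentration_profiles(t, k1=0.3, k2=0.05)
a_true = np.array([100.0, 400.0, 50.0])    # molar absorptivities of A, B, C

# Simulated absorption data: y = C a + noise
rng = np.random.default_rng(0)
y = C @ a_true + rng.normal(scale=0.5, size=t.size)

# Linear least-squares estimate of a, and the residual vector r
a_fit, *_ = np.linalg.lstsq(C, y, rcond=None)
r = y - C @ a_fit

print("fitted absorptivities:", a_fit)
print("sum of squared residuals:", float(r @ r))
```

`np.linalg.lstsq` is used here instead of forming the normal equations explicitly, since it is generally the more numerically stable choice when columns of C are partly correlated.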

In performing a QSAR study, the data are assembled into a table (matrix) of numbers in which each row represents the data for a compound and each column a physicochemical property, often referred to as a descriptor. The first column after the compound identifier is usually reserved for the measured target property. A statistical procedure [e.g., multiple linear regression (MLR), multivariate analysis, neural networks] is then used to find a relationship between the observed measurement and some combination of the properties represented in the subsequent columns. Considering the three properties discussed above, a classical QSAR equation has the following form ... [Pg.132]
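The table layout described above (target property in the first column, descriptors in the following columns) maps directly onto a multiple linear regression. The sketch below uses NumPy and entirely synthetic numbers; the three descriptor columns are unnamed placeholders and the coefficients are assumptions chosen only to generate example data, not values from any QSAR study.

```python
import numpy as np

# Synthetic data table (assumption): 20 compounds, 3 generic descriptors
rng = np.random.default_rng(1)
n_compounds = 20
descriptors = rng.normal(size=(n_compounds, 3))

# Made-up "true" relationship used only to simulate a target property
coef_true = np.array([1.5, -0.8, 0.3])
intercept_true = 2.0
activity = (intercept_true + descriptors @ coef_true
            + rng.normal(scale=0.1, size=n_compounds))

# Column 0: measured target property; columns 1..3: descriptors
table = np.column_stack([activity, descriptors])

# Multiple linear regression with an intercept term
y = table[:, 0]
X = np.column_stack([np.ones(len(table)), table[:, 1:]])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)   # [intercept, b1, b2, b3]

# Coefficient of determination of the fit
y_pred = X @ beta
r2 = 1.0 - np.sum((y - y_pred) ** 2) / np.sum((y - y.mean()) ** 2)

print("intercept and coefficients:", beta)
print("R^2:", r2)
```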

Chemometrics is the discipline concerned with the application of statistical and mathematical methods to chemical data [2.18]. Multiple linear regression, partial least squares regression and principal component analysis are among the methods that can be used to design or select optimal measurement procedures and experiments, or to extract the maximum relevant chemical information from chemical data. Common areas addressed by chemometrics include multivariate calibration, visualisation of data and pattern recognition. Biometrics is concerned with the application of statistical and mathematical methods to biological or biochemical data. [Pg.31]

Table 2.10 Results of multivariate linear regression of data in Table 2.31 using Ln-transformed 5-FU clearance as the dependent variable using only BSA and 5-FU dose with subject 3 removed from the analysis.

Two datasets are first simulated. The first contains only normal samples, whereas the second contains 3 outliers; the two datasets are shown in Plots A and B of Figure 2, respectively. For each dataset, a fixed percentage (70%) of the samples is randomly selected to build a linear regression model, and its slope and intercept are recorded. Repeating this procedure 1000 times, we obtain 1000 values for both the slope and the intercept. For both datasets, the intercept is plotted against the slope, as displayed in Plots C and D, respectively. It can be observed that the joint distribution of the intercept and slope for the normal dataset appears to be multivariate normal. In contrast, the distribution for the dataset with outliers looks quite different and is far from normal. The distributions of the slopes for both datasets are shown in Plots E and F. These results show that the existence of outliers can greatly influence a regression model, which is reflected in the odd distributions of both slopes and intercepts. Conversely, a distribution of a model parameter that is far from normal would, most likely, indicate some abnormality in the data. [Pg.5]
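The resampling procedure described above can be sketched as follows, using NumPy. The helper name `resample_fits`, the simulated line (slope 2, intercept 1), the noise level and the positions and sizes of the three outliers are assumptions for illustration; the original datasets are not reproduced here.

```python
import numpy as np

def resample_fits(x, y, frac=0.7, n_rep=1000, seed=0):
    """Fit a straight line to n_rep random subsets (without replacement).

    Returns an (n_rep, 2) array with columns [slope, intercept].
    """
    rng = np.random.default_rng(seed)
    n_sub = int(round(frac * len(x)))
    fits = np.empty((n_rep, 2))
    for i in range(n_rep):
        idx = rng.choice(len(x), size=n_sub, replace=False)
        fits[i] = np.polyfit(x[idx], y[idx], deg=1)  # [slope, intercept]
    return fits

# Simulated data (assumed values): one clean set, one with 3 outliers
rng = np.random.default_rng(42)
x = np.linspace(0.0, 10.0, 50)
y_clean = 2.0 * x + 1.0 + rng.normal(scale=0.5, size=x.size)
y_outl = y_clean.copy()
y_outl[[5, 25, 45]] += np.array([15.0, -12.0, 20.0])

fits_clean = resample_fits(x, y_clean)
fits_outl = resample_fits(x, y_outl)

# Spread of slope and intercept across the 1000 sub-models
print("clean    std [slope, intercept]:", fits_clean.std(axis=0))
print("outliers std [slope, intercept]:", fits_outl.std(axis=0))
```

Plotting `fits[:, 1]` against `fits[:, 0]`, or a histogram of `fits[:, 0]`, would give the kind of joint and marginal views that the passage describes for Plots C to F.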

Big Chemical Encyclopedia
