Correlation with Three Independent Variables

In the example we have been considering, the loss from a P.O.P. stack, there is a third variable of considerable interest, namely the strength of acid in the absorption tower, which we will denote by x3. [Pg.72]

The solutions of these equations can be obtained empirically, but a systematic method is less liable to accidental errors. [Pg.73]

Whittaker and Robinson give a general discussion, but the most satisfactory method is due to M. H. Doolittle. We set out the equations as (A ), (B ) and (C ) in a table as Table 10.2. Inserting the numerical values quoted above, we get (A ), (B ), (C ). It will be understood that all the figures occurring in the column headed Ca are effectively the coefficients of Ca in the successive equations, and similarly for Cb and Cc. The right-hand sides have been multiplied by 10 temporarily to avoid the occurrence of excessively large numbers of 0's after the decimal points. [Pg.73]

The next step is to write down again (A ) as (A ), and then put underneath it the result of dividing it by the coefficient of Ca with the sign changed (here by -2270.12950). This gives us (D). [Pg.73]
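The elimination step just described can be sketched in a few lines of Python. This is a minimal illustration, not the book's worked table: the three equations below use made-up coefficients, the function name `doolittle_solve` is hypothetical, and the continuation beyond the derived row (adding multiples of it to the later equations, then back-substituting) is assumed from the standard form of the method rather than taken from the excerpt. No pivoting is done, so a zero leading coefficient would fail.

```python
def doolittle_solve(rows):
    """Solve n linear equations given as augmented rows [c_1, ..., c_n, rhs]."""
    n = len(rows)
    work = [[float(v) for v in row] for row in rows]
    derived = []  # each pivot row divided by minus its leading coefficient
    for k in range(n):
        pivot = work[k]
        # Step described in the text: divide the equation by its leading
        # coefficient with the sign changed (this is how (D) is obtained).
        d = [v / -pivot[k] for v in pivot]
        derived.append(d)
        # Assumed continuation: add a multiple of the derived row to each
        # later equation so that unknown k drops out of it.
        for i in range(k + 1, n):
            m = work[i][k]
            work[i] = [a + m * b for a, b in zip(work[i], d)]
    # Back-substitution. Each derived row reads
    #   -x_k + sum_{c > k} d[c] * x_c = d[n]
    x = [0.0] * n
    for k in reversed(range(n)):
        d = derived[k]
        x[k] = sum(d[c] * x[c] for c in range(k + 1, n)) - d[n]
    return x


# Illustrative symmetric system (hypothetical numbers, not those of Table 10.2).
equations = [
    [4.0, 2.0, 1.0, 11.0],
    [2.0, 5.0, 3.0, 21.0],
    [1.0, 3.0, 6.0, 25.0],
]
solution = doolittle_solve(equations)
print(solution)
```

Keeping the derived rows is what makes the back-substitution a matter of reading values off the table, which is the practical appeal of a systematic layout over ad hoc elimination.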

E. T. Whittaker and G. Robinson, The Calculus of Observations (Blackie), 3rd edition, Chapter IX. [Pg.73]

PCA step of PCR with the regression step. Latent variables, like PCs, are calculated to explain most of the variance in the x set while remaining orthogonal to one another. Thus, the first latent variable (LV1) will explain most of the variance in the independent set, LV2 the next largest amount of variance, and so on. The important difference between PLS and PCR is that the latent variables are constructed so as to maximize their correlation with the dependent variable. Unlike PCR equations, where the PCs do not enter in any particular order (see eqns 7.6 to 7.8), the latent variables enter PLS equations in the order one, two, three, etc. The properties of latent variables are ... [Pg.154]

On the face of it, PLS appears to offer a much superior approach to the construction of linear regression models than MLR or PCR (since the dependent variable is used to construct the latent variables), and for some data sets this is certainly true. Application of PLS to the charge-transfer data set described in the last section resulted in a PLS model containing only two dimensions which explained over 90 per cent of the variance in the substituent constant data. This compares very favourably with the two- and three-dimensional PCR equations (eqns 7.7 and 7.8), which explain 73 and 81 per cent of the variance respectively. Another advantage that is claimed for the PLS approach is its ability to handle redundant information in the independent variables. Since the latent variables are constructed so as to correlate with the dependent variable, redundancy in the form of collinearity and multicollinearity in the descriptor set should not interfere. This is demonstrated by fitting PLS models to the 31-variable and 11-variable parameter sets for the charge-transfer data. As shown in Table 7.7, the resulting PLS models account for very similar amounts of variance in k. [Pg.155]
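The contrast between a principal component and a PLS latent variable can be sketched as follows. This is a minimal illustration on synthetic data, not the charge-transfer data set: all numbers and function names are hypothetical. It assumes the common single-response formulation in which the first PLS weight vector is taken proportional to X'y, which maximizes the covariance of the score with the dependent variable, while the first PC is found by power iteration on X'X without reference to y. The third descriptor is deliberately made almost collinear with the first.

```python
import math
import random


def center(col):
    m = sum(col) / len(col)
    return [v - m for v in col]


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def unit(v):
    norm = math.sqrt(dot(v, v))
    return [p / norm for p in v]


def scores(cols, w):
    """Project each sample onto the weight vector w (cols holds X by column)."""
    n = len(cols[0])
    return [sum(w[j] * cols[j][i] for j in range(len(cols))) for i in range(n)]


def first_pls_weight(cols, y):
    # Weight vector proportional to X'y: uses the dependent variable.
    return unit([dot(c, y) for c in cols])


def first_pc_weight(cols, iters=500):
    # Power iteration on X'X: ignores the dependent variable entirely.
    w = unit([1.0] * len(cols))
    for _ in range(iters):
        t = scores(cols, w)
        w = unit([dot(c, t) for c in cols])
    return w


# Synthetic data (hypothetical): x1 has large variance but no link to y,
# x2 has small variance and drives y, x3 is nearly collinear with x1.
random.seed(1)
n = 40
x1 = [random.gauss(0, 5) for _ in range(n)]
x2 = [random.gauss(0, 1) for _ in range(n)]
x3 = [a + random.gauss(0, 0.1) for a in x1]
y = [2 * b + random.gauss(0, 0.2) for b in x2]

cols = [center(x1), center(x2), center(x3)]
yc = center(y)


def cov_with_y(w):
    return dot(scores(cols, w), yc) / (n - 1)


w_pls = first_pls_weight(cols, yc)
w_pc = first_pc_weight(cols)
cov_pls = cov_with_y(w_pls)
cov_pc = cov_with_y(w_pc)
print("covariance of LV1 score with y:", cov_pls)
print("covariance of PC1 score with y:", cov_pc)
```

Among all unit-length weight vectors, none gives a score with larger covariance with y than the PLS one, so the first latent variable cannot do worse than the first PC on that measure.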

The oldest records (a-c) and Fig. 1-2 clearly show a strong degree of temporal correlation between three biologically involved atmospheric components and climate (as indicated by temperature). Because there is a sound physical basis for the involvement of all three in climatic processes, it is necessary to study, view, and understand these variables and climate as linked components of a system. They are all dependent variables and cannot be viewed as independent with climate being imposed as an exogenous factor. [Pg.507]

Most earlier attempts to correlate carcinogenic potency with structural features of benzenoid hydrocarbons are related to the mechanism outlined above. The quantitative models that have been developed can be distinguished by their number of independent variables. One-variable theories (e.g. the so-called bay-region theory [94]) are normally inferior to two- [95] and three-variable theories, which refer more explicitly to the different and partly competing metabolic reaction pathways. A particularly efficient model has been developed by v. Szentpaly [96]. In his so-called MCS model three important influences on carcinogenic potency are taken into account: M, the initial epoxidation... [Pg.119]

Hansch analysis and other classical QSAR approaches evaluate the QSAR model based on the correlation of rows of compounds with known activities (dependent variables) to columns of parameters (independent variables). For this reason, classical QSAR is sometimes called 2D QSAR. CoMFA is an example of 3D QSAR because lead analogues are modeled and analyzed in a virtual three-dimensional space. The value of both methods ultimately hinges on how well experimental and calculated activities correlate (Figure 12.2) and how well the model predicts the activity of compounds not included in the training set. [Pg.315]

Several attempts were made to determine independent parameters for oxygen and fluorine, which of necessity were assigned to the three molecular orientations with one-third probability. Refinement reduced R1 as low as 0.059, but the resulting bond distances and thermal parameters were unrealistic. We conclude that the data are insufficient to permit valid refinement of so many independent variables which are highly correlated with each other. [Pg.219]

Statistical methods. Certainly one of the most important considerations in QSAR is the statistical analysis of the correlation of the observed biological activity with structural parameters, either the extrathermodynamic (Hansch) or the indicator variables (Free-Wilson). The coefficients of the structural parameters that establish the correlation with the biological activity can be obtained by a regression analysis. Since the models are constructed in terms of multiple additive contributions, the method of solution is also called multiple linear regression analysis. This method is based on three requirements (223): (i) the independent variables (structural parameters) are fixed variates and the dependent variable (biological activity) is randomly produced; (ii) the dependent variable is normally and independently distributed for any set of independent variables; and (iii) the variance of the dependent variable must be the same for any set of independent variables. [Pg.71]
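The three requirements can be written compactly as a model statement. The notation below is an assumption introduced for illustration (it is not taken from the cited source), and three independent variables are used only as an example:

```latex
% Multiple linear regression with three independent variables (illustrative notation)
y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \beta_3 x_{i3} + \varepsilon_i,
\qquad i = 1, \dots, n
% (i)   the x_{ij} are fixed, known values; only y_i is random
% (ii)  the errors are independent and normally distributed
% (iii) the error variance is the same for every setting of the x_{ij}
\varepsilon_i \overset{\text{i.i.d.}}{\sim} \mathcal{N}(0, \sigma^2)
% Least-squares estimate of the coefficients, with X the n x 4 matrix
% whose rows are (1, x_{i1}, x_{i2}, x_{i3})
\hat{\boldsymbol{\beta}} = (X^{\mathsf T} X)^{-1} X^{\mathsf T} \mathbf{y}
```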

Acceptable population models resulted in successful minimization, with at least three significant digits for any parameter, a successful estimation of the covariance, and the absolute value of last iteration gradients greater than 0.001 but smaller than 100. Confidence intervals of structural parameters should not include zero, and the correlation between any two structural parameters should never be greater than 0.95. Acceptable models should not lead to trends in the distribution of weighted residuals versus model predictions and versus independent variable. They should not be oversensitive to initial estimates nor lead to differences between the population parameters and the corresponding medians of individual POSTHOC parameters. The predictions versus observations data should be evenly distributed around the unit line. If constraints were applied on parameters, no final estimate should be equal to one of the boundaries. [Pg.1114]
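The numerical criteria in this list lend themselves to an automated check. The sketch below is one possible encoding and makes several assumptions: the dictionary layout, the key names, and the parameter names `CL` and `V` are hypothetical and do not correspond to the output format of any particular fitting program, and the correlation limit is applied to the absolute value. The graphical criteria (trends in residuals, sensitivity to initial estimates) are not covered.

```python
def acceptance_failures(fit):
    """Return a list of the numerical acceptance criteria that a fit violates."""
    failures = []
    if not fit["minimization_successful"]:
        failures.append("minimization did not succeed")
    if fit["significant_digits"] < 3:
        failures.append("fewer than three significant digits")
    if not fit["covariance_successful"]:
        failures.append("covariance estimation did not succeed")
    for name, grad in fit["final_gradients"].items():
        if not (0.001 < abs(grad) < 100):
            failures.append(f"last-iteration gradient out of range for {name}")
    for name, (low, high) in fit["conf_intervals"].items():
        if low <= 0 <= high:
            failures.append(f"confidence interval of {name} includes zero")
    for (a, b), r in fit["correlations"].items():
        # Assumption: the 0.95 limit is applied to the absolute correlation.
        if abs(r) > 0.95:
            failures.append(f"correlation between {a} and {b} exceeds 0.95")
    for name, (lower, upper) in fit.get("bounds", {}).items():
        if fit["estimates"][name] in (lower, upper):
            failures.append(f"estimate of {name} sits on a boundary")
    return failures


# Hypothetical fit summary, for illustration only.
example_fit = {
    "minimization_successful": True,
    "significant_digits": 3.4,
    "covariance_successful": True,
    "final_gradients": {"CL": 0.52, "V": 3.1},
    "conf_intervals": {"CL": (1.2, 2.8), "V": (10.0, 25.0)},
    "correlations": {("CL", "V"): 0.41},
    "estimates": {"CL": 2.0, "V": 17.5},
    "bounds": {"CL": (0.0, 100.0)},
}
print(acceptance_failures(example_fit))
```

Returning the list of violated criteria, rather than a single pass/fail flag, makes it easier to report why a candidate model was rejected.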

The final model (coding 1) was compliant with the above model-acceptance criteria: the run finished successfully with more than three significant digits and a successful covariance estimation, the 95% confidence intervals of all the parameters did not include zero, none of the correlations between the structural parameters was above 0.95, and the weighted residuals versus model predictions and versus independent variable data were evenly distributed around the zero line. However, slight trends toward overprediction were seen in the plot of predictions versus observations; possible explanations are given in Section 44.2.3. [Pg.1114]

One difficulty with a method requiring three or more experimental variables is that the very large number of possible combinations of the values of these variables often makes it impractical to collect data at all of the combinations. Of the many possible combinations of wavelengths and other experimental variables, where should one actually make measurements? One logical choice is to collect data so as to minimize the product of the confidence intervals of all of the individual parameters, adjusting for correlation between parameters. Such designs are known as D-optimal. Abel examined D-optimal designs for PARAFAC models and has shown that often data collection can be omitted for many combinations of the independent variables with little increase in the product of the confidence intervals. [Pg.694]
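The D-optimality criterion can be illustrated on a much simpler model than PARAFAC. The sketch below swaps in a three-parameter linear model, y = b0 + b1*x1 + b2*x2, with a hypothetical 3 x 3 grid of candidate settings, and searches exhaustively for the four-run subset that maximizes det(X'X). Maximizing this determinant is the usual way of shrinking the joint confidence region of the parameters; the function names and the grid are assumptions made for illustration.

```python
from itertools import combinations, product


def information_matrix(points):
    """X'X for the model y = b0 + b1*x1 + b2*x2 at the given design points."""
    rows = [(1, x1, x2) for x1, x2 in points]
    return [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]


def det3(m):
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


# Candidate settings: every combination of three levels of two variables.
candidates = list(product((-1, 0, 1), repeat=2))

# Exhaustive search over all four-run designs drawn from the nine candidates.
best_design = max(
    combinations(candidates, 4),
    key=lambda pts: det3(information_matrix(pts)),
)
best_det = det3(information_matrix(best_design))
print(best_design, best_det)
```

For this small case the search selects the four corner points of the grid, so five of the nine candidate combinations need not be measured at all. Exhaustive search is only feasible for small candidate sets; larger problems normally rely on exchange-type heuristics.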

Since the 1960s positron annihilation lifetime spectroscopy (PALS) has been used to measure free-volume cell size and/or its content in liquids or solids. The three chapters of Part III discuss correlations between the PALS experimental values and those computed from the S-S theory. Chapter 10, by Consolati and Quasso, considers free volume in amorphous polymers; Chapter 11, by Dlubek, its distribution from PALS; and Chapter 12, by Jamieson et al., the free volume in heterogeneous polymer systems. These state-of-the-art texts offer intriguing observations on the structure of polymeric systems and its variation with independent variables. In all cases, good correlation has been found between the free-volume quantity measured by PALS and its variability computed from the S-S equation of state. [Pg.793]

In statistical terms, the primary goal of an experimental design is to introduce controls that eliminate or significantly reduce statistical dependence between the factor of interest and other factors that might influence the test outcome or dependent variable. To illustrate, consider three variables X, Y, and Z. X is an experimental factor, Y is a test outcome measure or dependent variable, and Z is an independent variable that is correlated with both Y and X. For example, in atmospheric corrosion, the corrosion rate (Y) is influenced by both temperature (X) and rainfall (Z). However, a wet season will also tend to be a cool one in the natural environment, so temperature and rainfall will be correlated, or confounded. [Pg.54]
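A small simulation makes the effect of confounding concrete. Everything in this sketch is hypothetical: the effect sizes, noise levels, and the strength of the link between X and Z are arbitrary illustrative numbers with no physical meaning for corrosion. In the "observed" data X tends to be low when Z is high, as in the wet and cool season of the example; in the "designed" data X is set independently of Z.

```python
import random


def slope(x, y):
    """Least-squares slope of y regressed on x alone."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


random.seed(0)
n = 5000
TRUE_X_EFFECT = 0.5  # assumed effect of X on Y
TRUE_Z_EFFECT = 1.0  # assumed effect of Z on Y


def outcome(x, z):
    return TRUE_X_EFFECT * x + TRUE_Z_EFFECT * z + random.gauss(0, 0.3)


z = [random.gauss(0, 1) for _ in range(n)]

# Natural environment: X is negatively correlated with Z (confounded).
x_observed = [-0.8 * zi + random.gauss(0, 0.6) for zi in z]
y_observed = [outcome(xi, zi) for xi, zi in zip(x_observed, z)]

# Designed experiment: X is assigned independently of Z.
x_designed = [random.gauss(0, 1) for _ in range(n)]
y_designed = [outcome(xi, zi) for xi, zi in zip(x_designed, z)]

slope_observed = slope(x_observed, y_observed)
slope_designed = slope(x_designed, y_designed)
print("slope from confounded data:", slope_observed)
print("slope from designed data:  ", slope_designed)
```

With X and Z confounded, the fitted slope absorbs part of the effect of Z and here even takes the wrong sign; breaking the dependence between X and Z brings the estimate back close to the assumed effect of X.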

Big Chemical Encyclopedia
