Big Chemical Encyclopedia

Chemical substances, components, reactions, process design ...


Intercorrelated descriptors

Although there is a strong negative correlation between partition coefficient and aqueous solubility (Hansch et al., 1968; Chiou et al., 1977), and a strong positive correlation between % and molecular volume (Dearden et al., 1988), the use of the partial least squares (PLS) method in this study allows the simultaneous use of intercorrelated descriptors. Nevertheless, the use of four descriptors to model the bioconcentration factor of only 11 compounds contravenes the Topliss and Costello (1972) rule, and renders the QSAR of dubious validity. [Pg.348]

PLS can be used to explain biological potency when a relatively large number of intercorrelated descriptors are used in the analysis. ... [Pg.189]

Intercorrelation coefficients are then computed. These tell when one descriptor is redundant with another. Using redundant descriptors increases the amount of fitting work to be done, does not improve the results, and results in unstable fitting calculations that can fail completely (due to dividing by zero or some other mathematical error). Usually, the descriptor with the lowest correlation coefficient is discarded from a pair of redundant descriptors. [Pg.244]
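
The pruning step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the excerpt's own procedure: the function names, the 0.9 redundancy threshold, and the toy data are hypothetical, and "lowest correlation coefficient" is read here as the descriptor that correlates less strongly with the measured activity.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def prune_redundant(descriptors, activity, threshold=0.9):
    """Drop one descriptor from every pair with |r| >= threshold.

    Of the two, the one less correlated with `activity` is discarded
    (an assumed reading of "lowest correlation coefficient").
    """
    kept = dict(descriptors)
    names = list(descriptors)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if a not in kept or b not in kept:
                continue  # one of the pair was already discarded
            if abs(pearson_r(kept[a], kept[b])) >= threshold:
                ra = abs(pearson_r(kept[a], activity))
                rb = abs(pearson_r(kept[b], activity))
                del kept[a if ra < rb else b]
    return kept

# Hypothetical toy data: "volume" is almost a multiple of "logP".
descriptors = {
    "logP":   [1.0, 2.0, 3.0, 4.0, 5.0],
    "volume": [10.1, 19.8, 30.2, 39.9, 50.3],
    "dipole": [2.0, 0.5, 1.8, 0.3, 1.1],
}
activity = [1.1, 2.0, 2.9, 4.2, 5.0]
print(sorted(prune_redundant(descriptors, activity)))
```

With these toy values only one of the two nearly collinear descriptors survives, while the weakly correlated third descriptor is kept. A constant descriptor would make the correlation undefined (division by zero), which is the kind of numerical failure the excerpt warns about.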

Statistical criteria of Eq. (24) are too good: the standard deviation, which was obtained on the basis of different measurements by various authors, is much less than even the experimental error of determination. This could be due to mutual intercorrelation of descriptors leading to over-optimistic statistics [18]. Another reason may be the lack of diversity in the training set. The application of the solvation equation to data extracted from the MEDchem97 database gave much more modest results: n = 8844, = 0.83, root mean square error = 0.674, F = 8416... [Pg.144]

Since Eqs. 48 and 49 were found to be identical within the standard errors on coefficients, it indicated that the coefficients in Eq. 48 were not greatly affected by the intercorrelation of descriptors. On this basis they conclude that Eq. 48 is as good as, or better than, the models based on subsets of the data used in the study. [Pg.528]

This procedure assessed whether some of the different descriptors used by different equations were intercorrelated and, therefore, interchangeable [59]. The remaining diverse QSAR equations were further classified by size (number of descriptors they include). The best equations of each encountered size were kept for final validation with the VS molecules and for further analysis. Consensus models featuring average predictions over these equations were also generated and validated. We focus here on the discussion of the minimalist overlay-independent and overlay-based QSAR models, each including only six descriptors, and refer to the optimal consensus model of the overlay-based QSAR approach families for comparative purposes. [Pg.125]

Multilinear regression can be used where the investigated endpoint is correlated to a linear combination of independent variables (the descriptors). This technique assumes linearity over the whole data set with respect to the descriptors. In addition, normality of the data must be fulfilled, and the descriptors cannot be intercorrelated. Multilinear regression is widely used in (Q)SAR modeling and has the advantage that all numerical information is retained and the predicted endpoint may be better estimated. However, the model may eventually overfit the data, after which the addition of further descriptors causes a decrease in accuracy of the model; however, this will typically be disclosed in the calibration step of development. [Pg.82]

The delocalizability descriptors characterizing molecular reactivities toward the attack of a nucleophile (DN; cf. Table 6.1 and Equation 6.58) and electrophile (DE; cf. Table 6.1 and Equation 6.59) have been calculated and analyzed comparatively only for the three semiempirical methods AM1, PM3, and PM5. The resultant statistics are summarized in Table 6.9 and show generally high intercorrelations, but also some cases with only moderate to low R2 values. [Pg.151]

The greatest R2 values are obtained for delocalizabilities confined to nitrogen atoms except for the maximum and average acceptor delocalizabilities, DN.maxN and DN.avN, with respect to AM1 vs. PM3. Moreover, DH.maxN provides relatively low intercorrelations as compared to most other delocalizability descriptors, and at the same time a still significantly greater similarity of PM5 with... [Pg.151]

The orthogonality (linear independence, cross-correlation, intercorrelation) of the parameters may be assessed in several ways. Nonorthogonal descriptors introduce redundancy into the equation and are therefore undesirable. For example, a descriptor could be expressed as a function of the other descriptors, thus implying that its term in the correlation equation could be replaced by an expression involving only the other parameters. [Pg.229]
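
The redundancy argument can be made concrete with a short worked substitution. The symbols are assumptions introduced only for this illustration: y is the modeled property, x_1 to x_3 are descriptors, c_i are regression coefficients, and a_i describe a hypothetical exact linear dependence of x_3 on the other two descriptors.

```latex
y = c_0 + c_1 x_1 + c_2 x_2 + c_3 x_3,
\qquad
x_3 = a_0 + a_1 x_1 + a_2 x_2
\;\Longrightarrow\;
y = (c_0 + c_3 a_0) + (c_1 + c_3 a_1)\, x_1 + (c_2 + c_3 a_2)\, x_2
```

Under this assumed dependence the three-descriptor equation collapses to a two-descriptor equation, so the data cannot fix c_3 separately from c_1 and c_2. In practice the dependence is usually only approximate, in which case the coefficients are determined but poorly constrained.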

The negative signs are physically reasonable because these additional parameters would be expected to be associated with increased intramolecular attractions. The authors provided no measure of intercorrelation; indeed, one might expect some correlation of α, the average molecular polarizability, with the presence of C=O, COOH, NO2, and CN groups. The counting descriptors are not QM quantities; their inclusion is not philosophically satisfying, but they do improve the fit. [Pg.240]

The simplest means to obtain such a quantitative relationship is to use multiple linear regression (MLR), available in any statistical software package. In order to avoid statistically insignificant relationships or chance correlations, one should always apply the following rules of thumb: (1) the ratio of compounds to descriptors should be >5; (2) the descriptors should not be intercorrelated (the squared inter-descriptor correlation coefficient should satisfy r2 < 0.5). [Pg.359]
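
Both rules of thumb are easy to check before fitting. The following is a minimal sketch, assuming descriptors are stored as equal-length lists of values (one value per compound); the function names and the example values are hypothetical and not taken from the cited text.

```python
def squared_r(x, y):
    """Squared Pearson correlation coefficient (r2) of two sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)

def check_mlr_rules(descriptors, min_ratio=5.0, max_r2=0.5):
    """Return a list of rule-of-thumb violations (empty if none)."""
    problems = []
    names = list(descriptors)
    n_compounds = len(descriptors[names[0]])
    # Rule 1: compounds per descriptor must exceed min_ratio.
    ratio = n_compounds / len(names)
    if ratio <= min_ratio:
        problems.append(f"compound/descriptor ratio {ratio:.2f} is not > {min_ratio}")
    # Rule 2: every descriptor pair must have r2 below max_r2.
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r2 = squared_r(descriptors[a], descriptors[b])
            if r2 >= max_r2:
                problems.append(f"{a} and {b} are intercorrelated (r2 = {r2:.2f})")
    return problems

# Hypothetical example: 6 compounds, 2 strongly related descriptors.
bad = {"d1": [1, 2, 3, 4, 5, 6], "d2": [1, 3, 2, 5, 4, 6]}
for line in check_mlr_rules(bad):
    print(line)
```

The example set fails both rules: three compounds per descriptor is too few, and the two descriptors share well over half of their variance. The thresholds are exposed as parameters because other texts quote different cut-offs.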

In general, the various shape descriptors in this study were not intercorrelated (R values 0.7 or less). Simple correlations of the SIMCA F-value descriptors with the various biological response variables yielded rather poor results. [Pg.76]

The structure descriptors should not be intercorrelated, so that the significant variables can be recognized. If, for example, activity data on a set of chlorophenols are investigated with respect to their log and values, a statistically significant QSAR may be obtained with either one of the descriptors, but an assignment of the relevant descriptor is not feasible, because the two are linearly intercorrelated. [Pg.9]

The dual nature of these polarizability descriptors poses severe problems with regard to their use in QSAR analyses. Only when the test set is designed so that intercorrelations with, for example, steric or lipophilic parameters can... [Pg.34]

Table 1.5 Intercorrelations between descriptors of chemical structures for a set of diverse organic compounds, characterized by the correlation coefficient; the second figure (in parentheses) gives the number of compounds available for each pair of parameters (modified from Nendza and Russom, 1991). [Pg.42]

Multiple intercorrelations between descriptors of chemical structures are illustrated best using multivariate statistics (section 3.2.2). A principal component analysis of the data set of 18 descriptors (Table 1.6, Figure 1.11) revealed that > 80% of the information content of these descriptors is expressed by four factors that explain 54.7%, 15.8%, 8.1% and 5.6% of the total variance, respectively. [Pg.44]
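
The "> 80%" figure follows directly from the four quoted percentages. A small arithmetic check in Python, using only the values given in the excerpt (the variable names are illustrative):

```python
from itertools import accumulate

# Percent of total variance explained by each of the four factors,
# as quoted in the excerpt above.
explained = [54.7, 15.8, 8.1, 5.6]

# Running total, rounded to one decimal to hide floating-point noise.
cumulative = [round(c, 1) for c in accumulate(explained)]
print(cumulative)  # [54.7, 70.5, 78.6, 84.2]

# Number of factors needed before the running total exceeds 80%.
n_factors = next(i + 1 for i, c in enumerate(cumulative) if c > 80)
print(n_factors)  # 4
```

Three factors account for 78.6% of the variance, so the fourth is needed to pass 80%, and together the four cover 84.2%.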

Table 1.6 Principal component (PC) analysis of descriptors of chemical structures for a set of diverse organic compounds: (a) > 80% of the explained variance is expressed in the first four PCs; (b) the loadings of the original descriptor variables in the VARIMAX rotated factor matrix reflect the grouping of the parameters (i.e. high loadings in the same PC indicate high intercorrelations between the descriptors).

Figure 3.1 Selection of compounds for a training set for a QSAR derivation based on their properties D1 and D2. The compounds selected for the QSAR analysis are indicated by the frames. A: inappropriate selection of the test compounds due to the high intercorrelation between their descriptors D1 and D2; B: selection of the test compounds with regard to maximum variations and minimum intercorrelation between their descriptors D1 and D2.

The selection of the training set becomes more sophisticated if more than three descriptors are assumed to be relevant. A manual selection is not feasible with a parameter space that is more than three-dimensional, and computational techniques have to be applied. These mostly use multivariate statistics to account for underlying multiple intercorrelations among variables, such as PCA and PLS (Hellberg, 1986; Tosato et al., 1991; Lindgren et al., 1995), to find an acceptable compromise with respect to collinearity, variance and synthetic/commercial accessibility of the chemicals. [Pg.66]

There is no excuse for not making an effort to choose the appropriate compounds to form the basis of the intended QSAR. Even if little information is available on the relevant descriptors, there is at least always the possibility of using a π/σ substituent constants diagram of the candidate chemicals to assess the intercorrelations between lipophilic and electronic properties. [Pg.66]



© 2024 chempedia.info