Big Chemical Encyclopedia

Chemical substances, components, reactions, process design ...


Validation of descriptors

When applied to QSAR studies, the activity of molecule u is calculated simply as the average activity of its K nearest neighbors. An optimal value of K is selected either by classifying a test set of samples or by leave-one-out cross-validation. Many variations of the kNN method have been proposed, and new, faster algorithms have continued to appear in recent years. The automated variable selection kNN QSAR technique optimizes the selection of descriptors to obtain the best models [20]. [Pg.315]
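
The kNN prediction and the leave-one-out choice of K described above can be sketched in a few lines of Python. This is a minimal illustration only, not the automated variable selection method of [20]: the Euclidean distance, the mean-squared-error criterion, the function names, and the toy data are all assumptions made for the example.

```python
import math

def knn_predict(train_X, train_y, query, k):
    """Predict activity as the mean activity of the k nearest training molecules."""
    # Rank training molecules by Euclidean distance in descriptor space (assumed metric).
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], query))
    return sum(train_y[i] for i in order[:k]) / k

def loo_error(X, y, k):
    """Leave-one-out mean squared error for a given k."""
    total = 0.0
    for i in range(len(X)):
        # Hold molecule i out and predict it from all the others.
        rest_X = X[:i] + X[i + 1:]
        rest_y = y[:i] + y[i + 1:]
        total += (knn_predict(rest_X, rest_y, X[i], k) - y[i]) ** 2
    return total / len(X)

def select_k(X, y, candidates):
    """Return the candidate k with the lowest leave-one-out error."""
    return min(candidates, key=lambda k: loo_error(X, y, k))

# Hypothetical toy data: one descriptor per molecule, activity equal to the descriptor.
X = [[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]]
y = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]

best_k = select_k(X, y, [1, 2, 3])
prediction = knn_predict(X, y, [2.4], best_k)
```

On this toy set the leave-one-out error is lowest for K = 2, because averaging the two flanking neighbors reproduces the linear trend for every interior molecule.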

Figure 18.2 Representative receiver operator curves demonstrating the leave-n-out validation of K-PLS classification models (metabolite formed or not formed) derived with approximately 300 molecules and over 60 descriptors. The diagonal line represents random classification. The horizontal axis represents the percentage of false positives and the vertical axis the percentage of false negatives in each case. a. N-dealkylation. b. O-dealkylation. c. Aromatic hydroxylation. d. Aliphatic hydroxylation. e. O-glucuronidation. f. O-sulfation. Data generated in collaboration with Dr. Mark Embrechts (Rensselaer Polytechnic Institute).
The number of latent variables (PLS components) must be determined by some sort of validation technique, e.g., cross-validation [42]. The PLS solution coincides with the corresponding MLR solution when the number of latent variables equals the number of descriptors used in the analysis. The validation technique also serves to avoid overfitting of the model. [Pg.399]
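
The coincidence of PLS and MLR at the full number of latent variables can be checked numerically. The sketch below is an illustration under stated assumptions: it uses a plain NIPALS-style PLS1 on mean-centred data, solves the MLR normal equations by Gaussian elimination, and runs on invented toy data with three descriptors. It does not perform the cross-validation that would be used to choose the number of components in practice.

```python
import math

def centre(X, y):
    """Mean-centre the descriptor matrix and the activity vector."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    return E, [v - y_mean for v in y], y_mean

def pls1_fitted(X, y, n_components):
    """Fitted activities from PLS1 (NIPALS) with n_components latent variables."""
    E, f, y_mean = centre(X, y)
    n, p = len(E), len(E[0])
    fitted = [y_mean] * n
    for _ in range(n_components):
        # Weight vector: descriptor direction most covariant with the residual activity.
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(p)]
        norm = math.sqrt(sum(v * v for v in w))
        w = [v / norm for v in w]
        # Scores (the latent variable), loadings, and inner regression coefficient.
        t = [sum(E[i][j] * w[j] for j in range(p)) for i in range(n)]
        tt = sum(v * v for v in t)
        load = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q = sum(f[i] * t[i] for i in range(n)) / tt
        # Deflate descriptors and activity, and accumulate the fit.
        E = [[E[i][j] - t[i] * load[j] for j in range(p)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        fitted = [fitted[i] + q * t[i] for i in range(n)]
    return fitted

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def mlr_fitted(X, y):
    """Fitted activities from ordinary least squares with an intercept."""
    E, f, y_mean = centre(X, y)
    n, p = len(E), len(E[0])
    XtX = [[sum(E[i][a] * E[i][b] for i in range(n)) for b in range(p)] for a in range(p)]
    Xty = [sum(E[i][a] * f[i] for i in range(n)) for a in range(p)]
    coef = solve(XtX, Xty)
    return [y_mean + sum(E[i][j] * coef[j] for j in range(p)) for i in range(n)]

# Hypothetical toy data: six molecules, three descriptors.
X = [[1, 2, 0.5], [2, 1, 1.5], [3, 4, 1.0], [4, 3, 3.0], [5, 6, 2.0], [6, 5, 4.5]]
y = [1.2, 1.9, 3.4, 3.8, 5.6, 5.9]

pls_full = pls1_fitted(X, y, 3)   # as many latent variables as descriptors
mlr = mlr_fitted(X, y)
max_gap = max(abs(a - b) for a, b in zip(pls_full, mlr))
```

With three latent variables for three descriptors, `max_gap` should be at the level of floating-point rounding. With fewer components the PLS fit to the training data is no better than the MLR fit, which is the regularizing effect that validation is meant to tune.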

However, symmetry considerations cannot be the sole determinants in such cases. If we return to the substituted adamantane and replace all its methylene groups by identical (CH2)n bridges of sufficient lengths to allow a permutation at one quaternary carbon without disturbing the others, a single descriptor can no longer be used. Thus one must further question the validity of the idea of the unoccupied center if its existence depends on the value of n. [Pg.229]

The first QSPR models for skin tried to establish linear relationships between the descriptors and the permeability coefficient. In many cases validation of these models using, for example, external data sets was not performed. Authors of more recent models took advantage of the progress in statistical methods and used nonlinear relationships between descriptors and predicted permeability and often tried to assess their predictive quality using some validation method. [Pg.464]

This procedure assessed whether some of the different descriptors used by different equations were intercorrelated and, therefore, interchangeable [59]. The remaining diverse QSAR equations were further classified by size (number of descriptors they include). The best equations of each encountered size were kept for final validation with the VS molecules and for further analysis. Consensus models featuring average predictions over these equations were also generated and validated. We focus here on the discussion of the minimalist overlay-independent and overlay-based QSAR models, each including only six descriptors, and refer to the optimal consensus model of the overlay-based QSAR approach families for comparative purposes. [Pg.125]

Ferguson, A.M., Clark, R.D., and Weinberger, L.E. Neighborhood behavior: a useful concept for validation of molecular diversity descriptors. [Pg.139]

J.A., and Wilson, T.M. Selection, application, and validation of a set of molecular descriptors for nuclear receptor ligands. Book of Abstracts,... [Pg.196]

Martin YC, Kofron JL, Traphagen LM. (2002) Do Structurally Similar Molecules Have Similar Biological Activities? J. Med. Chem. 45, 4350-4358. Patterson DE, Cramer RD, Ferguson AM, Clark RD, Weinberger LE. (1996) Neighbourhood Behaviour: A Useful Concept for Validation of Molecular Diversity Descriptors. J. Med. Chem. 39, 3049-3059. [Pg.154]

Descriptors can be selected in a forward or backward manner. The forward method starts with one descriptor and adds descriptors until the model meets the user's specifications. The backward method starts with all possible descriptors; descriptors are removed as they are deemed unnecessary. It is safe to assume that most QSAR models are created using the forward method, owing to the sheer number of descriptors and the desire for only a few in the model. Models can also be pruned using the backward method: once the model is created, the user may want to reduce the number of descriptors while keeping the same level of validity. [Pg.159]
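
A minimal sketch of the forward method follows. The scoring criterion (leave-one-out error of a 1-nearest-neighbor model), the stopping rule (a user-set maximum number of descriptors, or no further improvement), the function names, and the toy data are assumptions chosen for illustration; the source does not prescribe any of them.

```python
import math

def loo_knn_error(X, y, columns, k=1):
    """Leave-one-out mean squared error of a kNN model restricted to `columns`."""
    n = len(X)
    total = 0.0
    for i in range(n):
        query = [X[i][c] for c in columns]
        # Rank all other molecules by distance in the selected descriptor subspace.
        others = [j for j in range(n) if j != i]
        others.sort(key=lambda j: math.dist([X[j][c] for c in columns], query))
        pred = sum(y[j] for j in others[:k]) / k
        total += (pred - y[i]) ** 2
    return total / n

def forward_select(X, y, max_descriptors, score=loo_knn_error):
    """Greedy forward selection: add the descriptor that most lowers the score."""
    selected = []
    remaining = list(range(len(X[0])))
    best_score = float("inf")
    while remaining and len(selected) < max_descriptors:
        # Score every model obtained by adding one remaining descriptor.
        trial = {c: score(X, y, selected + [c]) for c in remaining}
        candidate = min(trial, key=trial.get)
        if trial[candidate] >= best_score:
            break  # no candidate improves the model; stop early
        selected.append(candidate)
        remaining.remove(candidate)
        best_score = trial[candidate]
    return selected, best_score

# Hypothetical toy data: descriptor 0 tracks the activity, descriptor 1 is
# scrambled, and descriptor 2 is constant (uninformative).
X = [[0, 3, 0], [1, 0, 0], [2, 5, 0], [3, 1, 0], [4, 4, 0], [5, 2, 0]]
y = [0, 1, 2, 3, 4, 5]

selected, best_score = forward_select(X, y, max_descriptors=2)
```

Here only descriptor 0 is retained, since adding either of the other two does not lower the leave-one-out error. A backward variant would start from all descriptors and drop, at each step, the one whose removal degrades the score least.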










© 2024 chempedia.info