Substructure keys

If the binary descriptors for the objects s and t are substructure keys, the Hamming distance (Eq. (6)) gives the number of substructures that differ between s and t (components that are 1 in either s or t but not in both). On the other hand, the Tanimoto coefficient (Eq. (7)) is a measure of the number of substructures that s and t have in common (i.e., the frequency a) relative to the total number of substructures they could share (given by the number of components that are 1 in either s or t). [Pg.407]
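The two measures described above can be sketched in a few lines of Python. This is a minimal illustration, not the cited text's implementation: the function names, the example vectors, and the convention of returning 0.0 when neither vector has any bit set are all assumptions.

```python
def hamming(s, t):
    """Count components that are 1 in either s or t but not in both."""
    return sum(a != b for a, b in zip(s, t))


def tanimoto(s, t):
    """Common substructures relative to substructures present in either."""
    common = sum(a & b for a, b in zip(s, t))  # the frequency a: 1 in both
    either = sum(a | b for a, b in zip(s, t))  # 1 in s or t
    # Assumption: two all-zero vectors are given a similarity of 0.0.
    return common / either if either else 0.0


# Hypothetical six-key descriptors for two objects s and t.
s = [1, 1, 0, 1, 0, 0]
t = [1, 0, 0, 1, 1, 0]

print(hamming(s, t))   # 2
print(tanimoto(s, t))  # 0.5
```

Here s and t share two keys out of four that are set in at least one of them, so the Tanimoto coefficient is 2/4, while the two keys set in only one vector give a Hamming distance of 2.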

Substructural keys - SAR scenario one active chemical family... [Pg.229]

Products Index (CPI) database. To aid in reducing the size of a hit list, a Reagent Selector user can filter reagents and sort on properties, availability, presence or absence of functional groups, etc. (Fig. 9.18). Further list reduction can be achieved by clustering the structures by means of a cluster analysis using substructure keys as descriptors. [Pg.392]

Inverted Keys. When substructure search keys are generated for a structure, they may be stored in normal order (where each record represents a structure, and the bits or fields for that structure represent the keys). Alternatively, they may be stored in inverted or pivoted order, where each record represents a given substructure key, and the bits represent structures that have that particular key set. This type of storage benefits key searching, where a user wants all the structures that have a particular key set. [Pg.405]
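As a rough sketch of the two storage orders, the snippet below pivots a normal-order table (one record per structure) into inverted order (one record per key). The structure names, key numbers, and helper functions are hypothetical, and Python sets stand in for the bit fields a real system would use.

```python
from collections import defaultdict

# Normal order: each record is a structure, holding the keys set for it.
normal = {
    "mol_A": {0, 2, 5},
    "mol_B": {2, 3},
    "mol_C": {0, 2},
}


def invert(records):
    """Pivot to inverted order: each record is a key, holding its structures."""
    inverted = defaultdict(set)
    for structure, keys in records.items():
        for key in keys:
            inverted[key].add(structure)
    return dict(inverted)


def structures_with_all(inverted, keys):
    """Return the structures that have every one of the given keys set."""
    sets = [inverted.get(k, set()) for k in keys]
    return set.intersection(*sets) if sets else set()


inverted = invert(normal)
print(sorted(inverted[2]))                            # ['mol_A', 'mol_B', 'mol_C']
print(sorted(structures_with_all(inverted, [0, 2])))  # ['mol_A', 'mol_C']
```

With the inverted layout, a key search reads only the records for the requested keys instead of scanning every structure.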

Similarity Search. A type of "fuzzy" structure searching in which molecules are compared with respect to the degree of overlap they share in terms of topological and/or physicochemical properties. Topological descriptors usually consist of substructure keys or fingerprints, in which case a similarity coefficient like the Tanimoto coefficient is computed. In the case of calculated properties, a simple correlation coefficient may be used. The similarity coefficient used in a similarity search can also be used in various types of cluster analysis to group similar structures. [Pg.410]
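A similarity search of this kind might look like the following sketch, which ranks a small database against a query by Tanimoto coefficient. The representation of each molecule as a set of key numbers, the 0.5 cutoff, and all names are assumptions made for illustration.

```python
def tanimoto(a, b):
    """Tanimoto coefficient for two sets of substructure key numbers."""
    union = len(a | b)
    # Assumption: two empty key sets are given a similarity of 0.0.
    return len(a & b) / union if union else 0.0


def similarity_search(query, database, threshold=0.5):
    """Return (name, score) pairs at or above threshold, best match first."""
    scored = ((name, tanimoto(query, keys)) for name, keys in database.items())
    hits = [(name, score) for name, score in scored if score >= threshold]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Hypothetical database: structure name -> set of keys that are set.
database = {
    "mol_A": {1, 2, 3, 4},
    "mol_B": {1, 2},
    "mol_C": {7, 8},
}

hits = similarity_search({1, 2, 3}, database)
print([name for name, _ in hits])  # ['mol_A', 'mol_B']
```

The same pairwise scores could feed a cluster analysis, as the entry notes, by treating 1 minus the coefficient as a distance.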

The first groups of models are generally constructed using molecular connectivity indices, kappa environmental descriptors, electronic charges, and substructural keys. In many instances Log P has been used; however, our... [Pg.135]

Daphnia EC50 Equation for Model Incorporating Molecular Connectivity Indices and Substructural Keys... [Pg.136]

A variety of parameters are included in the QSAR equation. Log P is a commonly used parameter and is obtained from Medchem or estimated using the CLOGP3 computer program. Molecular weight is calculated. In interspecies models the LD50 or LC50 value is incorporated as a typical parameter. Molecular connectivity indexes, electronic charge distributions, and kappa environmental descriptors have proven to be powerful predictors of toxicity. The efficacy of these values lies in the fact that each of these parameters describes a molecule in a fashion similar to that actually seen by the molecular receptors that initiate a toxic response. Substructural keys are identified with the help of the MOLSTAC substructural key system. MOLSTAC consists of five classes of descriptors ... [Pg.139]

Identification of electron-donating and electron-withdrawing substructural keys... [Pg.139]

Another crucial aspect of the validation process is the test of how well described and represented the molecule is in the map of the chemical toxicity space that the regression equation represents. If the substructural key does not exist in the database used to build the model, then it is unlikely that the compound can be accurately estimated. In addition, if compounds similar to the test compound do not exist, then a comparison as was done above cannot be conducted and a measure of the performance of the model with compounds similar to the test material cannot be made. This type of validation requires a large database and a substructural search algorithm, and should be included in a QSAR estimate. [Pg.142]

MACCS substructure keys, on the other hand, encode the presence of a predefined set of relevant 2D fragments, originally designed for speeding up database substructure searching [48,49] by eliminating those compounds from detailed consideration that can-... [Pg.413]

As it was previously shown that MACCS substructure keys outperform UNITY and Daylight 2D fingerprints [46], the IC93 database was investigated using an implementa-... [Pg.423]

Figure 13.6. Percent biological classes covered from the IC93 database versus subset sizes for maximum dissimilarity selections using selected MACCS substructure keys counting up to 1, 3, 5 or 9 occurrences of a particular fragment key, UNITY 2D fingerprints (Unity2D), and theoretical random selections (Random Theo).

Thus, when a substructure key that appears in the equation is present in the compound under consideration, the coefficient for that key is added; in the absence of the key, no use is made of that coefficient for the compound under study. [Pg.406]
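The additive rule can be illustrated with a short sketch. The key names, coefficient values, and the intercept term are hypothetical and are not taken from the cited equation; the point is only that a coefficient contributes when its key is present and is ignored otherwise.

```python
def score(intercept, coefficients, keys_present):
    """Sum the intercept and the coefficients of the keys found in a compound."""
    return intercept + sum(
        coef for key, coef in coefficients.items() if key in keys_present
    )


# Hypothetical equation: an intercept plus one coefficient per key.
intercept = 2.0
coefficients = {"key_a": 1.5, "key_b": -0.25, "key_c": 0.75}

# Compound containing key_a and key_b; key_c is absent, so 0.75 is not used.
print(score(intercept, coefficients, {"key_a", "key_b"}))  # 3.25
```

A key present in the compound but missing from the equation contributes nothing either, since only keys with a fitted coefficient are summed.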

Using the WLN-derived CROSSBOW substructure keys, modified and extended by us to a total of 336 keys, the discriminant analysis equation presented in Table VII was developed. The results of this study are shown in the classification matrix given in Table VIII. [Pg.410]

MOLSTAC is a substructural key system designed at Health Designs, Inc. (HDI), to replace the previously used CROSSBOW (Eakin et al. 1974) system. MOLSTAC generates descriptors more relevant to biological activity. MOLSTAC keys are also more specific than those generated by CROSSBOW. Details on the keys actually used in the model will be shown further down. [Pg.97]

The computer system is based on HDI's proprietary substructural key generation system MOLSTAC and the necessary MCI generation program, together with elaborate structural data entry and verification facilities, based on SMILES, a method developed by the Medicinal Chemistry Project at Pomona College, Claremont, CA. [Pg.103]

Substructure keys encode molecular information in the form of binary arrays or bitmaps (see Substructure Searching). Each element (or bit) in the array can take the values true or false, and indicates the presence or absence of a specific structural feature or pattern in the target molecule. Substructure keys were originally designed for large-scale database searching, but have also proven effective in similarity applications. [Pg.743]
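A minimal sketch of the bitmap idea follows, using a Python integer as the bit array. The four-entry key dictionary and the molecule records are hypothetical, and the features are supplied by hand: detecting them in a real structure is outside the scope of this example. The screening step shows the database-search use: a molecule can contain the query substructure only if every bit set in the query is also set in the molecule.

```python
# Hypothetical key dictionary: structural feature -> bit position.
KEY_BITS = {"carbonyl": 0, "hydroxyl": 1, "aromatic_ring": 2, "nitro": 3}


def encode(features):
    """Set one bit for each structural feature present in the molecule."""
    bitmap = 0
    for feature in features:
        bitmap |= 1 << KEY_BITS[feature]
    return bitmap


def passes_screen(molecule_bits, query_bits):
    """True if the molecule has every key the query has (it may still fail
    the detailed atom-by-atom match that would follow in a real search)."""
    return (molecule_bits & query_bits) == query_bits


database = {
    "mol_A": encode(["carbonyl", "aromatic_ring"]),
    "mol_B": encode(["hydroxyl"]),
    "mol_C": encode(["carbonyl", "hydroxyl", "aromatic_ring"]),
}

query = encode(["carbonyl", "aromatic_ring"])
candidates = [name for name, bits in database.items() if passes_screen(bits, query)]
print(candidates)  # ['mol_A', 'mol_C']
```

The screen only eliminates structures that cannot match; the survivors are candidates, not confirmed hits.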

Big Chemical Encyclopedia
