Descriptor domain

Additional consideration of other components of the QSAR AD, such as the physico-chemical domain, descriptor domain, mechanistic domain and metabolic domain (when possible), would further improve confidence levels in predictive model applications. [Pg.467]

The item/descriptor distinction is exactly the same as instance/class. (We use different words to avoid any additional confusion between the modeling domain and the software.) Indeed, in some programming languages (such as Smalltalk), classes are represented by objects, and each instance has an implicit link to its class. The class objects are themselves instances of a Class class, and new ones with new attributes can be created at runtime. This property of the language is known as reflection. [Pg.583]
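The reflective behavior described above can be sketched in Python, used here only as a stand-in for Smalltalk. In Python the role of the "Class class" is played by the built-in `type`; the names `PumpDescriptor`, `ValveDescriptor` and their attributes are hypothetical and chosen purely for illustration.

```python
# A descriptor (class) defined in the usual static way.
# "PumpDescriptor" is a hypothetical name used only for illustration.
class PumpDescriptor:
    max_pressure = 10.0

# An item (instance) carries an implicit link to its descriptor (class).
pump = PumpDescriptor()
pump_descriptor = type(pump)

# The class is itself an object: an instance of the built-in metaclass `type`.
class_of_class = type(PumpDescriptor)

# A new descriptor with new attributes can be created at runtime
# by calling the metaclass: type(name, bases, namespace).
ValveDescriptor = type("ValveDescriptor", (object,), {"diameter": 25})
valve = ValveDescriptor()
```

Here `pump_descriptor` is the same object as `PumpDescriptor`, and `ValveDescriptor` behaves like any statically declared class even though it did not exist when the program started.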

When a novel homology domain has been discovered, it is possible to store the corresponding domain descriptor (profile or HMM) in a number of dedicated domain databases, which can be used to analyze newly identified sequences for their domain content [9, 10]. Several competing domain- and motif-databases exist, including PROSITE, PFAM, SMART, and Superfam, which contain descriptors for most, if not all, of the known domains involved in the ubiquitin system [11-14]. Recently, a new meta-database named INTERPRO has been established, which tries to combine the descriptors of several domain databases under a single user interface [15]. Pointers to the very useful search engines of the domain databases are provided in Table 12.1. [Pg.321]

Effective Prediction Domain. Similarly, for regression-like models, especially when the model descriptors are significantly correlated, Mandel [39] proposed the formulation of effective prediction domain (EPD). It has been demonstrated, with examples, that a regression model is justified inside and on the periphery of the EPD. Clearly, if a compound is determined to be too far from the EPD, its prediction from the model should not be considered reliable. [Pg.442]

In 2008, Weaver [64] used PPB as an example to demonstrate the concept of "domain of applicability" in QSAR research. The PLS model was constructed using 17 1D and 2D molecular descriptors. The performance of the model was reasonable for such a large data set for PPB modeling (n = 685, q2 = 0.56, RMSE = 0.55, AUE = 0.42; ntest = 210, q2 = 0.58, RMSEtest = 0.54, AUEtest = 0.41). How the domain selection protocol affects the prediction performance will be discussed in Section 3. [Pg.117]
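For readers unfamiliar with the statistics quoted above, the following is a minimal sketch of how they are commonly computed. It assumes that AUE denotes the average unsigned (mean absolute) error and that q2 takes the usual form 1 - PRESS/SS; the excerpt does not state which exact definitions were used, and the function names are illustrative.

```python
import math

def rmse(observed, predicted):
    """Root-mean-square error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

def aue(observed, predicted):
    """Average unsigned error (assumed here to mean the mean absolute error)."""
    n = len(observed)
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / n

def q2(observed, predicted):
    """One common form of q2: 1 - PRESS / total sum of squares.

    For a cross-validated q2, `predicted` should hold the predictions
    made for each compound while it was left out of the fit.
    """
    mean_obs = sum(observed) / len(observed)
    press = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_total = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - press / ss_total

# Toy values, not taken from the cited study.
observed = [1.0, 2.0, 3.0]
predicted = [1.5, 2.0, 2.5]
stats = (q2(observed, predicted), rmse(observed, predicted), aue(observed, predicted))
```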

Once a QSAR model is constructed, it must be validated using an external test set. The data points in the test set should not appear in the training set. There are two approaches to improve the prediction accuracy for a given QSAR model. The first approach uses the concept of "the domain of applicability," which estimates the uncertainty in the prediction for a particular molecule based on how similar it is to the compounds used to build the model. To make a more accurate prediction for a given molecule in the test set, the structurally similar compounds in the training set are used to construct a model, and that model is used to make the prediction. In some cases, the domain similarity is measured using molecular descriptor similarity, rather than the structural similarity. The... [Pg.120]
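A descriptor-similarity check of the kind described above can be sketched as follows. This is one simple possibility, not the method of the cited work: it assumes Euclidean distance in an already scaled descriptor space, and the value of `k`, the `threshold`, and all function names are illustrative assumptions.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two descriptor vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest_training_compounds(query, training, k=3):
    """Return the k training vectors closest to the query (for a local model)."""
    return sorted(training, key=lambda t: euclidean(query, t))[:k]

def mean_knn_distance(query, training, k=3):
    """Mean distance from the query to its k nearest training compounds."""
    neighbors = nearest_training_compounds(query, training, k)
    return sum(euclidean(query, t) for t in neighbors) / len(neighbors)

def in_domain(query, training, k=3, threshold=1.0):
    """Treat a query as inside the domain if it lies close to the training set.

    The threshold is an assumption; in practice it would be calibrated,
    for example from the distribution of distances within the training set.
    """
    return mean_knn_distance(query, training, k) <= threshold

# Toy two-descriptor training set, not taken from any cited study.
training_set = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
inside = in_domain((0.5, 0.5), training_set)
outside = in_domain((5.0, 5.0), training_set)
```

The same `nearest_training_compounds` helper could supply the subset of similar training compounds from which a local model is built for a given test molecule.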

Jaworska, J., Nikolova-Jeliazkova, N. and Aldenberg, T. (2005) QSAR applicability domain estimation by projection of the training set descriptor space: a review. Altern. Lab. Anim., 33 (5), 445-459. [Pg.41]

The descriptor was a product of the correlation weights, CW(Ik), calculated by the Monte Carlo method for each kth element of a special SMILES-like notation introduced by the authors. The notation codes the following characteristics: the atom composition, the type of substance (bulk or not, ceramic or not), and the temperature of synthesis. The QSAR model constructed in this way was validated with the use of many different splits into training (n = 21) and validation (n = 8) sets. Individual sub-models are characterized by high goodness-of-fit (0.972 applicability domain of the model, it is not known if all the compounds (metal oxides, nitrides, mullite, and silicon carbide) can be truly modeled together. [Pg.211]

Martin et al. [68] have jointly modeled the solubility of 14 polycyclic aromatic hydrocarbons (PAHs) and fullerene in octanol and heptane, utilizing descriptors calculated with the CODESSA package [69]. In this case, too, the applicability domain has not been validated. The structural difference between planar PAHs and spherical fullerene is probably too large for making reliable predictions. Moreover, the experimental solubilities of fullerene in both solvents (log S = -4.09 in heptane and log S>4.18 in octanol) are significantly lower than the solubilities of PAHs (-3.80 [Pg.211]

The packages listed in Table 3.7 (and others not listed) offer the user the capability to calculate a large variety of descriptors rapidly and efficiently. This is an excellent facility in terms of QSAR development, but the user must always remember that these are calculated values. More specifically, while a complete data sheet may be produced, calculated properties may not be valid if the compound is outside the domain of the original model. The user must resist the temptation to take any calculated value as a correct value. [Pg.52]

In this step, it is necessary to identify or compile an independent test set of data. In other words, it is necessary to obtain a data set containing values of the descriptors and response variables for chemical structures that were not used in the training set. The selection of chemicals for the test set should take into account the predefined domain of applicability for the QSAR. [Pg.434]

The domain of applicability of the QSAR was well defined by the model developer. The QSAR was stated to be applicable to chemicals having log Kow values in the range from -1.24 to 5.13, and operating by a non-polar narcosis mechanism of action. Such chemicals can be identified on a structural basis (Verhaar et al., 1992), or from physicochemical descriptors (Boxall et al., 1997). [Pg.437]
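A domain defined this way reduces to a descriptor range check combined with a mechanism flag. The sketch below assumes the partition-coefficient descriptor is log Kow and that the mechanism has already been assigned by some other means (for example a structural classification scheme); the function and constant names are hypothetical.

```python
# Range limits as stated for the QSAR in the text above.
LOG_KOW_MIN = -1.24
LOG_KOW_MAX = 5.13

def in_applicability_domain(log_kow, is_nonpolar_narcotic):
    """Return True if a chemical falls inside the stated domain.

    log_kow: octanol-water partition coefficient (log units); assumed descriptor.
    is_nonpolar_narcotic: mechanism flag, assumed to be assigned beforehand.
    """
    return is_nonpolar_narcotic and LOG_KOW_MIN <= log_kow <= LOG_KOW_MAX

# Hypothetical candidate chemicals: (name, log Kow, non-polar narcotic?)
candidates = [
    ("chemical A", 2.0, True),
    ("chemical B", 6.0, True),    # log Kow above the stated range
    ("chemical C", 2.0, False),   # different mechanism of action
]
eligible = [name for name, log_kow, flag in candidates
            if in_applicability_domain(log_kow, flag)]
```

A filter of this kind could also be applied when compiling an independent test set, so that only chemicals inside the predefined domain are used to assess the model.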

Big Chemical Encyclopedia