OntoCAPE is the core of the integrated model which describes the concepts of chemical engineering, the specific application domain of interest in the IMPROVE project. The product data in OntoCAPE are well integrated with the Document Model by direct references from the document content description to product data elements in OntoCAPE. Likewise, the elements of the document content description link the product model in OntoCAPE to the decision and work process documentation in the Process and Decision Ontologies. [Pg.748]

Applicability Domain for DT-Based Models. We describe the applicability domain of QSAR models as being determined by two parameters: (1) prediction confidence, or the certainty of a prediction for an unknown chemical, and (2) domain extrapolation, or the prediction accuracy for an unknown chemical that lies beyond the chemical space of the training set [60]. Both parameters can be quantitatively estimated in the consensus tree approaches, where individual models are constructed as decision trees (DTs). Taken together, prediction confidence and domain extrapolation assess the applicability domain of a model for each prediction. [Pg.164]

Extent of Extrapolation. For a regression-like QSAR, a simple measure of a chemical being too far from the applicability domain of the model is its leverage, h_i [36], which is defined as [Pg.441]
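The definition is cut off in the excerpt above. As a minimal sketch, assuming the source means the standard hat-matrix leverage from regression diagnostics, h_i = x_i^T (X^T X)^(-1) x_i, where X is the training descriptor matrix and x_i the descriptor vector of chemical i, it could be computed as follows. The function names and the warning cutoff h* = 3p/n (p model parameters, n training chemicals) are assumptions introduced here for illustration; the cutoff is a commonly quoted convention, not something stated in the excerpt.

```python
import numpy as np

def leverages(X_train, X_query):
    """Leverage of each query row relative to the training matrix.

    Assumes the standard definition h_i = x_i^T (X^T X)^-1 x_i.
    Include a column of ones in both matrices if the model has an intercept.
    """
    X_train = np.asarray(X_train, dtype=float)
    X_query = np.asarray(X_query, dtype=float)
    # Pseudo-inverse keeps the sketch usable when descriptors are collinear.
    xtx_inv = np.linalg.pinv(X_train.T @ X_train)
    # Row-wise quadratic form x_i^T (X^T X)^-1 x_i.
    return np.einsum("ij,jk,ik->i", X_query, xtx_inv, X_query)

def warning_leverage(X_train):
    """Hypothetical cutoff h* = 3p/n (a common convention, assumed here)."""
    n, p = np.asarray(X_train).shape
    return 3.0 * p / n

# Toy data: intercept column plus one descriptor (illustrative values only).
X = [[1, 0], [1, 1], [1, 2], [1, 3]]
h_train = leverages(X, X)
h_far = leverages(X, [[1, 10]])
```

In this toy case the training leverages sum to the number of model parameters, and the distant query chemical has a leverage well above the assumed cutoff, which would flag its prediction as an extrapolation.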

To establish whether a query chemical compound fits within the applicability domain of the QSAR model (i.e., whether the training set for the model contained molecules chemically similar to the query molecule), the similarity of the uploaded structure (and of its predicted metabolites, if the metabolite QSAR prediction option is selected) to the structures used in the training set for the QSAR model can be calculated (maximal Tanimoto coefficient (15)). The higher the similarity value, the greater the applicability of the model. Results of QSAR modeling for [Pg.233]
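The maximal Tanimoto coefficient mentioned above can be sketched as follows. This is an illustration only: it assumes each structure has already been encoded as a set of "on" fingerprint bit indices, which the excerpt does not specify, and the function names and fingerprints are hypothetical.

```python
def tanimoto(a, b):
    """Tanimoto coefficient |A & B| / |A | B| for two sets of on-bit indices."""
    a, b = set(a), set(b)
    union = len(a | b)
    # Two empty fingerprints: conventions differ; 0.0 is an assumption here.
    return len(a & b) / union if union else 0.0

def max_tanimoto(query, training):
    """Highest similarity of the query to any training-set fingerprint."""
    return max(tanimoto(query, fp) for fp in training)

# Hypothetical fingerprints, for illustration only.
training_set = [{1, 2, 3, 4}, {7, 8}]
query = {1, 2, 3}
similarity = max_tanimoto(query, training_set)
```

A value near 1 indicates that at least one training molecule closely resembles the query, while a value near 0 suggests the query lies outside the chemical space covered by the training set.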

Sahigara F, Mansouri K, Ballabio D et al (2012) Comparison of different approaches to define the applicability domain of QSAR models. Molecules (Basel, Switzerland) 17:4791-4810 [Pg.192]

Another important area of QSAR research is the determination of the reliability of QSAR models. Current research in this area includes the development of methods to define the applicability domain of a QSAR model [91] and to calculate the expected prediction accuracies for individual compounds [70]. [Pg.232]

The tools for in silico toxicology are broadly applied in the drug development process. The particular use of the tools is clearly context-dependent, which includes the quality of the prediction and the applicability domain of the model. [Pg.475]

The availability of data will dramatically transform the field and boost development of new, reliable methods for physicochemical property predictions. The development of methods to estimate the accuracy of a prediction and the applicability domain of models will make it possible to obtain more confident results on their wider use in environmental and pharmaceutical studies. [Pg.267]

Several attempts have been made to determine the accuracy of in silico prediction tools developed for lipophilicity (for a recent review, see [34]). The main factor limiting the accuracy of all predictive methods is the training sets used to generate the models, in terms of the population and quality of the experimental data they contain. Since most of the methods offered in commercial software were built with data available in the public domain, their accuracy can be expected to be comparable. Thus, in order to select the most suitable prediction tool, criteria other than accuracy have to be used, such as the speed of the calculation for large databases, the price of commercial software or the application domain of the model. [Pg.96]

However, even when the predictive ability of the models is high, the estimated property should be treated with caution because a molecule might be far from the model chemical space; the response would then be the result of a strong extrapolation, resulting in an unreliable prediction. To cope with this problem, the concept of the applicability domain of a model emerged as a relevant aspect for the evaluation of the prediction reliability. [Pg.1253]

Important steps of this process are (a) selection of the set of molecules the modeling procedure is applied to, and the set of molecular descriptors that will define the model chemical space; (b) selection of the training set for the model estimation and the test set for model validation; (c) application of the validated model(s) to design new molecules with desirable properties and/or predict the response of interest for future molecules, paying attention to the applicability domain of the model. [Pg.749]

Every model has limitations. Even the most robust and best-validated regression model will not predict the outcome for all catalysts. Therefore, you must define the application domain of the model. Usually, interpolation within the model space will yield acceptable results. Extrapolation is more dangerous, and should be done only in cases where the new catalysts or reaction conditions are sufficiently close to the model. There are several statistical parameters for measuring this closeness, such as the distance to the nearest neighbor within the model space (see the discussion on catalyst diversity in Section 6.3.5). Another approach uses the effective prediction domain (EPD), which defines the prediction boundaries of regression models with correlated variables [105]. [Pg.266]
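The nearest-neighbor closeness measure mentioned above can be sketched as follows. This is a minimal illustration, assuming Euclidean distance on descriptors that have already been scaled to comparable ranges; the function name, the toy data and the choice of metric are assumptions, not details taken from the source.

```python
import numpy as np

def nearest_neighbor_distance(X_train, x_query):
    """Euclidean distance from a query point to its closest training point."""
    X_train = np.asarray(X_train, dtype=float)
    x_query = np.asarray(x_query, dtype=float)
    # One distance per training row, then keep the smallest.
    return float(np.min(np.linalg.norm(X_train - x_query, axis=1)))

# Hypothetical descriptor vectors for two training catalysts.
X = [[0.0, 0.0], [3.0, 4.0]]
d_near = nearest_neighbor_distance(X, [0.0, 1.0])
d_far = nearest_neighbor_distance(X, [6.0, 8.0])
```

A new catalyst would be treated as sufficiently close to the model only if this distance falls below a cutoff chosen for the data set at hand; the excerpt does not give such a cutoff.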

In the following, we discuss a few important machine learning techniques. A wealth of methods exists [69], and we focus here on those primarily applied to virtual screening. The list, however, is not exhaustive, and the interested reader is referred to the excellent machine learning literature for the full picture, where details of the applicability domain of models and error estimations are also discussed [69-72]. [Pg.77]

All predictions must be taken for what they are, namely, generalizations based on current knowledge and understanding. There is a temptation for a user to assume that a computer-generated answer must be correct. To determine whether this is in fact the case, a number of factors concerning the model must be addressed. The statistical evaluation of a model was addressed above. Another very important criterion is to ensure that a prediction is an interpolation within the model space, and not an extrapolation outside of it. To determine this, the concept of the applicability domain of a model has been introduced [106]. [Pg.487]
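One simple way to test whether a prediction is an interpolation is a descriptor-range (bounding box) check. The sketch below is only an illustration of that idea and is not necessarily the approach of [106]; the function name and data are hypothetical. Note that a bounding box ignores correlations between descriptors and empty regions inside the box, so it is a necessary rather than a sufficient test.

```python
import numpy as np

def inside_bounding_box(X_train, x_query):
    """True if every descriptor of the query lies within the training range."""
    X_train = np.asarray(X_train, dtype=float)
    x_query = np.asarray(x_query, dtype=float)
    lo = X_train.min(axis=0)  # per-descriptor minimum over the training set
    hi = X_train.max(axis=0)  # per-descriptor maximum over the training set
    return bool(np.all((x_query >= lo) & (x_query <= hi)))

# Hypothetical two-descriptor training set.
X = [[0.0, 0.0], [2.0, 5.0]]
interpolated = inside_bounding_box(X, [1.0, 4.0])
extrapolated = inside_bounding_box(X, [3.0, 1.0])
```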

© 2019 chempedia.info