Defining Model Applicability Domain

The use of the following statistical characteristics of the test set was also recommended [33]: (1) the correlation coefficient between the predicted and observed activities, (2) the coefficients of determination (predicted versus observed activities, $R_0^2$, and observed versus predicted activities, $R_0'^2$), and (3) the slopes k and k' of the regression lines through the origin. Thus, we consider a QSAR model predictive if the following conditions are satisfied [33] [Pg.441]

We have demonstrated [29, 33] that all of the above criteria are indeed necessary to adequately assess the predictive ability of a QSAR model. [Pg.441]

It needs to be emphasized that no matter how robust, significant, and validated a QSAR may be, it cannot be expected to reliably predict the modeled property for the entire universe of chemicals. Therefore, before a QSAR model is put into use for screening chemicals, its domain of application must be defined and predictions for only those chemicals that fall in this domain should be considered reliable. Some approaches that aid in defining the applicability domain are described below. [Pg.441]

Extent of Extrapolation. For a regression-like QSAR, a simple measure of a chemical being too far from the applicability domain of the model is its leverage, $h_i$ [36], which is defined as [Pg.441]
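A minimal sketch of a leverage calculation, assuming the standard regression (hat-matrix) definition $h_i = x_i^T (X^T X)^{-1} x_i$, where X is the training descriptor matrix and x_i is the descriptor vector of the chemical of interest. This may differ in detail from the formula given in [36]. The function names and the warning threshold of 3p/n (a convention often quoted for QSAR, with p the number of model parameters and n the number of training compounds) are assumptions made for illustration.

```python
import numpy as np

def leverages(X_train, X_query):
    """Leverage h_i = x_i^T (X^T X)^{-1} x_i of each query row.

    X_train: (n, p) training descriptor matrix (append a column of ones
             beforehand if the model has an intercept).
    X_query: (m, p) descriptor matrix of the chemicals to be checked.
    """
    X_train = np.asarray(X_train, dtype=float)
    X_query = np.asarray(X_query, dtype=float)
    # Pseudo-inverse keeps the calculation defined for collinear descriptors.
    xtx_inv = np.linalg.pinv(X_train.T @ X_train)
    # Row-wise quadratic form x_i^T (X^T X)^{-1} x_i.
    return np.einsum("ij,jk,ik->i", X_query, xtx_inv, X_query)

def warning_leverage(X_train):
    """Assumed cutoff 3p/n; chemicals above it are treated as extrapolations."""
    n, p = np.asarray(X_train).shape
    return 3.0 * p / n

# Hypothetical toy data: three training compounds, two descriptors.
X_train = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
X_query = np.array([[0.5, 0.5], [10.0, 10.0]])

h = leverages(X_train, X_query)
outside = h > warning_leverage(X_train)  # True marks a chemical far from the training data
```

A chemical flagged as `outside` is not necessarily mispredicted; the flag only indicates that its prediction is an extrapolation with respect to the training descriptors.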

Effective Prediction Domain. Similarly, for regression-like models, especially when the model descriptors are significantly correlated, Mandel [39] proposed the formulation of an effective prediction domain (EPD). It has been demonstrated, with examples, that a regression model is justified inside and on the periphery of the EPD. Clearly, if a compound is determined to be too far from the EPD, its prediction from the model should not be considered reliable. [Pg.442]

In the area of predictive toxicology the applicability domain is taken to express the scope and limitations of a model, that is, the range of chemical structures for which the model is considered to be applicable [106]. Although this issue has been fundamental to the use of QSAR (and indeed any predictive technique) since its conception, there remain few reliable methods to define and apply an applicability domain in predictive toxicology. The current status of methods to define the applicability domain for use in (Q)SAR has been assessed recently by Netzeva et al. [106]. [Pg.487]

There is currently debate on the best methods to define the applicability domain for a model in predictive toxicology. The ultimate solution is likely to be lacking for a number of years. However, there are some initiatives that are beginning to address the issue of applicability domain, which include the use of statistical measures and also mechanistic appreciation. [Pg.487]

Every model has limitations. Even the most robust and best-validated regression model will not predict the outcome for all catalysts. Therefore, you must define the application domain of the model. Usually, interpolation within the model space will yield acceptable results. Extrapolation is more dangerous, and should be done only in cases where the new catalysts or reaction conditions are sufficiently close to the model. There are several statistical parameters for measuring this closeness, such as the distance to the nearest neighbor within the model space (see the discussion on catalyst diversity in Section 6.3.5). Another approach uses the effective prediction domain (EPD), which defines the prediction boundaries of regression models with correlated variables [105]. [Pg.266]
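The nearest-neighbor measure of closeness mentioned above can be sketched as follows. This is an illustration only: the function name, the autoscaling of descriptors with the training mean and standard deviation, and the Euclidean metric are assumptions, not choices stated in the text.

```python
import numpy as np

def nearest_neighbor_distance(X_train, X_query):
    """Distance from each query point to its nearest training point.

    Descriptors are autoscaled with the training mean and standard
    deviation so that no single descriptor dominates the distance.
    """
    X_train = np.asarray(X_train, dtype=float)
    X_query = np.asarray(X_query, dtype=float)
    mean = X_train.mean(axis=0)
    std = X_train.std(axis=0)
    std[std == 0] = 1.0  # constant descriptors carry no distance information
    z_train = (X_train - mean) / std
    z_query = (X_query - mean) / std
    # Pairwise Euclidean distances, shape (n_query, n_train).
    diffs = z_query[:, None, :] - z_train[None, :, :]
    distances = np.sqrt((diffs ** 2).sum(axis=2))
    return distances.min(axis=1)

# Hypothetical toy data: two training catalysts described by two variables.
X_train = np.array([[0.0, 0.0], [2.0, 2.0]])
X_query = np.array([[0.0, 0.0], [2.0, 5.0]])
d = nearest_neighbor_distance(X_train, X_query)
```

A small distance suggests interpolation within the model space; how large a distance is still "sufficiently close" has to be decided for each model, for example by comparison with the nearest-neighbor distances observed within the training set itself.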

Develop training set models through the use of available QSAR methods or commercial software. Characterize these models with internal validation parameters, as discussed in this chapter, and define the applicability domain for each model. [Pg.70]

Another important area of QSAR research is the determination of the reliability of QSAR models. Current research in this area includes the development of methods to define the applicability domain of a QSAR model [91] and to calculate the expected prediction accuracies for individual compounds [70]. [Pg.232]

Principle 3 defines an applicability domain that refers to the response and chemical structure space in which the model makes predictions with a given reliability. Ideally the applicability domain should express the structural, physicochemical, and response space of the model. The chemical structure space can be expressed by information on physicochemical properties and/or structural fragments. The response can be any physicochemical, biological, or environmental effect that is being predicted. [Pg.757]

Sahigara F, Mansouri K, Ballabio D et al (2012) Comparison of different approaches to define the applicability domain of QSAR models. Molecules (Basel, Switzerland) 17:4791-4810... [Pg.192]

Dimitrov S, Dimitrova G, Pavlov T et al (2005) A stepwise approach for defining the applicability domain of SAR and QSAR models. J Chem Inf Model 45:839-849. doi:10.1021/ci0500381... [Pg.368]

The applicability domain study should define the conditions under which the model is unlikely to work well. [Pg.191]

Because the only criterion of membership is particle size, nanomaterials form a very diverse group. Particular members of the group differ from each other in molecular geometry (e.g., nanotubes, fullerenes, crystal structures, clusters) and in physicochemical characteristics (e.g., organic, inorganic, semiconductors, insulators, metals, nonmetals). Thus, it may and should be assumed that they also differ in mechanism of action and that, in consequence, defining one common applicability domain and QSAR model for all of them is impossible. [Pg.208]

A defined domain of applicability. It is recognized that (Q)SARs are reductionist models that inevitably have limitations in terms of the types of chemical structures that can be predicted; in other words, the applicability of the model is defined by its domain. [Pg.98]

A very important aspect of statistical modeling is to determine the domain in which the model can be applied with high reliability, called the applicability domain. It is important to do this for several reasons ... [Pg.396]

The current state of design processes essentially cannot be improved by small steps alone. Instead, a new approach is necessary. In pursuing it, we face fundamental questions and nontrivial problems. We find new questions and corresponding problems by coherently and uniformly modeling the application domain and by defining new and substantial tool functionality. The layered process/product model is a scientific question which, even in a long-term project like IMPROVE, can only be answered partially. [Pg.65]

The Data Model is extended by the Reference Data Library, RDL (cf. part 4 of ISO 15926). The RDL establishes the different terminologies (i.e., the Reference Data) required for the individual application domains. Particularly for the chemical process industries, taxonomies for the description of materials, plant equipment, physical properties, and units are introduced by refining the classes of the Data Model. The RDL is to be harmonized with the STEPlib library of the AP 221 (cf. Annex M of ISO 10303-221). Currently (as of 2007), merging of these originally independent libraries is still in progress. So far, about 15,000 classes have been defined in the RDL. It is expected that the RDL will contain up to 100,000 standard classes in the end. [Pg.177]

Formal and refined document contents models are needed to supply all information which is later used for the definition of integration rules. Currently, only type hierarchies are used, defining all entities and relationships that are to be considered during integration. Future work will deal with adding more structural information. These models are similar to the document content models of the application layer which, at the moment, are not elaborated. However, they are much more detailed and have to be formal. Also, further information is needed here that is of no interest on the application domain layer. [Pg.614]

Next, for each integrator, a set of link types must be defined. These are based on the document contents models and the additional relationships defined in the document contents and the inter-document relationship model. Inter-document relationships that are already defined in the application domain models (within ontologies as well as document contents and relationship models) can be transformed into link types. This can be done automatically if the application models are formal. Otherwise, a manual translation has to be performed. [Pg.616]

Even if there is a formal document contents model as part of the application domain model, it needs to be extended to provide advanced tool support for defining related patterns. [Pg.620]

Sections 6.2, 6.3, and 6.4 described some promising results about how to get tool functionality by well-defined tool construction processes starting from elaborate application domain models, as described in Sects. 2.6 and 6.1. Furthermore, these sections also discussed how the information accumulated during these processes should be organized in a layered PPM. [Pg.629]

The ambitious solution would use more advanced reuse techniques for transforming our PPM. Assuming that the above problems were solved, then we could use generic models across different application domains with well-defined instantiations and parameterization mechanisms in order to get a domain-specific PPM. In this case, there would not be a new development by just using modeling knowledge. Instead, there would be a well-defined process by which we get a specific PPM by making use of advanced reuse techniques. [Pg.637]

While product data modeling in an application domain has focused traditionally on a detailed and complete representation of all domain concepts together with their relations and defining attributes, the IMPROVE approach has been radically different. The major focus has been on the design of an extensible architectural framework for product data modeling rather than on a detailed and comprehensive model itself. [Pg.744]

What is new is that application domain models are introduced within the tool construction process. The standard practice is that the tool developer realizes tools he believes to be useful. Usually, there is only a vague notion in the form of an implicit and informal application model. In this book we started by explicitly defining models on the application side, consisting of a document model, product data model, work process model, and decision model (see Chap. 2). [Pg.758]

The concept of the applicability domain concerns the predictive use of QSAR/QSPR models and is therefore closely related to the concept of model validation (see validation techniques). In other words, the applicability domain is a concept related to the quality of the QSAR/QSPR model predictions and to the prevention of potential misuse of the model's results. A key component of prediction quality is indeed to define when a QSAR/QSPR model is suitable to predict a property/activity of a new compound [Tropsha, Gramatica et al., 2003; Jaworska, Nikolova-Jeliazkova et al., 2004; Dimitrov, Dimitrova et al., 2005; Jaworska, Nikolova-Jeliazkova et al., 2005; Netzeva, Worth et al., 2005; Nikolova-Jeliazkova and Jaworska, 2005],... [Pg.18]

The first approach to applicability domain evaluation is the statistical analysis of the training set, which tries to define the best conditions for interpolated prediction, since interpolation is usually more reliable than extrapolation. Extrapolation is not a problem in principle, because extrapolated results from theoretically well-founded models can often be reliable. However, QSAR/QSPR models are usually based on empirical and limited experimental evidence and/or are only locally valid; therefore, extrapolation usually results in high uncertainty and unreliable predictions. [Pg.18]
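One simple way to separate interpolation from extrapolation on the basis of the training set is a descriptor range (bounding box) check. The sketch below is an illustration under that assumption and is not a method prescribed by the text; the function name is hypothetical.

```python
import numpy as np

def inside_descriptor_ranges(X_train, X_query):
    """True for each query compound whose descriptors all lie within
    the minimum and maximum values observed in the training set."""
    X_train = np.asarray(X_train, dtype=float)
    X_query = np.asarray(X_query, dtype=float)
    lower = X_train.min(axis=0)
    upper = X_train.max(axis=0)
    return np.all((X_query >= lower) & (X_query <= upper), axis=1)

# Hypothetical toy data: three training compounds, two descriptors.
X_train = np.array([[0.0, 10.0], [1.0, 20.0], [2.0, 15.0]])
X_query = np.array([[1.0, 12.0], [3.0, 12.0], [1.0, 25.0]])
in_domain = inside_descriptor_ranges(X_train, X_query)
```

A range check ignores correlations between descriptors and empty regions inside the box, so a compound can pass it and still lie far from every training compound.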

Important steps of this process are: (a) selection of the set of molecules the modeling procedure is applied to, and of the set of molecular descriptors that will define the model chemical space; (b) selection of the training set for model estimation and of the test set for model validation; (c) application of the validated model(s) to design new molecules with desirable properties and/or predict the response of interest for future molecules, paying attention to the applicability domain of the model. [Pg.749]

Big Chemical Encyclopedia

Chemical substances, components, reactions, process design ...