Molecular descriptors interpretability

We should note that in chemometrics outliers traditionally have a different definition, and even a different interpretation. Suppose that we have a k-dimensional characteristic vector, i.e., k different molecular descriptors are used. If we imagine a k-dimensional hyperspace, then the dataset objects will occupy different positions in it. Some of them will tend to group together, while others will lie in more remote regions. One can by convention define a margin beyond which the realm of strong outliers begins. Moderate outliers stay near this margin. [Pg.213]
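The margin convention above can be sketched in a few lines. This is a minimal illustration, not the method of the cited text: the distance measure (autoscaled Euclidean distance from the centroid, normalised by the square root of k), the function name `classify_outliers`, and the `margin` and `band` values are all assumptions chosen for the example.

```python
import math

def classify_outliers(X, margin=2.0, band=0.5):
    """Label each object by its scaled distance from the dataset centroid.

    X is a list of k-dimensional descriptor vectors. The margin and band
    values are hypothetical conventions, not taken from the source text.
    """
    n, k = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(k)]
    # Population standard deviation per descriptor; fall back to 1.0
    # when a descriptor has zero spread to avoid division by zero.
    stds = [math.sqrt(sum((row[j] - means[j]) ** 2 for row in X) / n) or 1.0
            for j in range(k)]
    labels = []
    for row in X:
        # Root-mean-square of the autoscaled coordinates.
        d = math.sqrt(sum(((row[j] - means[j]) / stds[j]) ** 2
                          for j in range(k)) / k)
        if d > margin:
            labels.append("strong")      # beyond the margin
        elif d > margin - band:
            labels.append("moderate")    # close to the margin
        else:
            labels.append("inlier")
    return labels

# Eight clustered objects and one remote object (made-up data, k = 2).
X = [(1, 1), (1, -1), (-1, 1), (-1, -1),
     (1, 1), (1, -1), (-1, 1), (-1, -1), (6, 6)]
labels = classify_outliers(X)
```

With a wider margin, for example `classify_outliers(X, margin=3.0)`, the same remote object falls just inside the margin and is labelled as a moderate outlier, which shows how much the classification depends on the chosen convention.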

Our approach is to examine small, closely related series of nitrosamines and to develop structure-activity models based on molecular descriptors that are explicitly meaningful with respect to the organic chemistry and biochemistry of the compounds. The forms of these models can then often be interpreted in terms of the mechanisms through which these compounds exert their carcinogenic effects. [Pg.77]

3D molecular descriptors called GRIND, which are alignment-independent and based on molecular interaction, have been developed. These are autocorrelation transforms that are independent of the orientation of the molecules in 3D space. The original descriptors can be extracted from the autocorrelation transform with the ALMOND program. The basic idea is to compress the information present in 3D maps into a few 2D numerical descriptors that are very simple to understand and interpret. [Pg.197]
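Why an autocorrelation transform is independent of orientation can be illustrated with a toy example. The sketch below is not the GRIND or ALMOND implementation; it assumes a simplified encoding in which each distance bin keeps the largest product of values over all node pairs, and the function name, bin settings, and node values are hypothetical. Because only pairwise distances enter the calculation, rotating the nodes leaves the result unchanged.

```python
import math

def max_autocorrelogram(nodes, bin_width=1.0, n_bins=5):
    """Toy autocorrelation transform over 3D nodes.

    nodes is a list of ((x, y, z), value) pairs. For every pair of nodes,
    the product of their values is assigned to a distance bin, and each
    bin keeps the largest product seen. Bins start at 0.0, so negative
    products are ignored in this simplified version.
    """
    bins = [0.0] * n_bins
    for i in range(len(nodes)):
        for j in range(i + 1, len(nodes)):
            (p, a), (q, b) = nodes[i], nodes[j]
            idx = int(math.dist(p, q) // bin_width)
            if idx < n_bins:
                bins[idx] = max(bins[idx], a * b)
    return bins

# Three hypothetical nodes, then the same nodes rotated 90 degrees about z.
nodes = [((0.0, 0.0, 0.0), 2.0), ((1.5, 0.0, 0.0), 3.0), ((0.0, 3.2, 0.0), 1.0)]
rotated = [((-y, x, z), v) for (x, y, z), v in nodes]

original_profile = max_autocorrelogram(nodes)
rotated_profile = max_autocorrelogram(rotated)
```

Both profiles are identical, so no molecular alignment step is needed before comparing such descriptors between molecules.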

Reducing the dimensionality of the descriptor space not only facilitates model building with molecular descriptors but also makes data visualization and identification of key variables in various models possible. Note that although a low dimension mathematically simplifies problems such as model development or data visualization, it is usually more difficult to correlate trends directly with physical descriptors after the dimension transformation, and hence the data become less interpretable. Trends directly linked with physical descriptors provide simple guidance for molecular modifications during potency/property optimizations. [Pg.38]
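The loss of interpretability can be seen in a small sketch of one common dimension-reduction technique, principal component analysis, which the passage does not name and is used here only as an assumed example. The function name and the data are hypothetical. The first component is found by power iteration on the covariance matrix; its loadings mix both original descriptors, so a trend along the component no longer maps onto a single physical quantity.

```python
import math

def first_principal_component(X, iterations=200):
    """Return (loadings, scores) of the first principal component.

    Uses power iteration on the sample covariance matrix. The uniform
    start vector is an assumption; it would fail only if it were exactly
    orthogonal to the dominant direction.
    """
    n, k = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(k)]
    centered = [[row[j] - means[j] for j in range(k)] for row in X]
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(k)]
           for a in range(k)]
    v = [1.0 / math.sqrt(k)] * k
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(k)) for a in range(k)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Project each centered object onto the component.
    scores = [sum(c[j] * v[j] for j in range(k)) for c in centered]
    return v, scores

# Two perfectly correlated, made-up descriptors for four compounds.
X = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
loadings, scores = first_principal_component(X)
```

Here the single component carries all the variance, but its loadings are a weighted blend of the two descriptors, which is the interpretability cost the passage describes.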

Physico-chemical properties constitute the most important class of experimental measurements, and they also play a fundamental role as molecular descriptors because of both their availability and their interpretability. Examples of measurable physico-chemical quantities are refractive indices, molar refractivities, parachors, densities, solubilities, partition coefficients, dipole moments, chemical shifts, retention times, spectroscopic signals, rate constants, equilibrium constants, vapor pressures, boiling and melting points, acid dissociation constants, etc. [Lyman et al., 1982; Reid et al., 1988; Horvath, 1992; Baum, 1998]. [Pg.172]

Orthogonalized descriptors are used in similarity/diversity analysis and quantitative structure/response correlations with the aim of eliminating the bias provided by the interdependence of common molecular descriptors. Moreover, the interpretation of regression models should be facilitated, as the information encoded in each descriptor is unique. [Pg.342]
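One simple way to obtain mutually orthogonal descriptors is sequential residualisation, a Gram-Schmidt procedure on mean-centered descriptor columns. The sketch below assumes this scheme for illustration; the cited text does not specify the procedure, and the function name and data are hypothetical. Each descriptor keeps only the part not already explained by the descriptors before it, so the result depends on the order in which the descriptors are supplied.

```python
def orthogonalize(columns):
    """Return mean-centered, mutually orthogonal descriptor columns.

    columns is a list of descriptor columns, each a list of n values.
    Every column is replaced by its residual after projecting out all
    previously processed columns.
    """
    n = len(columns[0])
    ortho = []
    for col in columns:
        mean = sum(col) / n
        residual = [x - mean for x in col]
        for q in ortho:
            qq = sum(a * a for a in q)
            if qq > 1e-12:  # skip columns that carry no remaining information
                coef = sum(a * b for a, b in zip(residual, q)) / qq
                residual = [a - coef * b for a, b in zip(residual, q)]
        ortho.append(residual)
    return ortho

# Two strongly correlated, made-up descriptors for four compounds.
d1 = [1.0, 2.0, 3.0, 4.0]
d2 = [2.0, 4.0, 6.0, 9.0]
o1, o2 = orthogonalize([d1, d2])
dot = sum(a * b for a, b in zip(o1, o2))
```

The dot product of the two output columns is zero up to rounding, so each orthogonalized descriptor contributes unique information to a regression model.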

Reversible decoding is of great importance, since once an SRC model is established, optimal values of the response can be chosen and the values of the model molecular descriptors calculated by using the estimated SRC model. Then the possible molecular structures corresponding to the optimized descriptor values can be designed (and synthesized). This last operation is a troublesome task when the model molecular descriptors are not simple and easily interpretable. [Pg.423]

MLR: Simple to use. Models are easy to interpret. Molecular descriptors should be orthogonal to one another. Number of compounds in the training set should exceed the number of molecular descriptors by at least a factor of 5. Assumes a linear relationship between target property and molecular descriptors... [Pg.231]

FFBPNN: Does not make any assumption of the type of relationship between target property and molecular descriptors. Models are difficult to interpret. Difficult to design an optimal architecture. Risk of overfitting... [Pg.231]
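The MLR requirements listed above can be made concrete with a short sketch. It assumes an ordinary least-squares fit through the normal equations, enforces the factor-of-5 rule as a hard check, and treats a near-zero pivot as a sign of collinear descriptors. The function name `fit_mlr`, the `min_ratio` parameter, and the tolerance are hypothetical choices, not part of the cited table.

```python
def fit_mlr(X, y, min_ratio=5):
    """Fit y = b0 + b1*x1 + ... + bk*xk by ordinary least squares.

    X is a list of descriptor rows and y the target property values.
    Returns the coefficients [b0, b1, ..., bk].
    """
    n, k = len(X), len(X[0])
    if n < min_ratio * k:
        raise ValueError("too few compounds for the number of descriptors")
    # Design matrix with a leading column of ones for the intercept.
    A = [[1.0] + [float(v) for v in row] for row in X]
    p = k + 1
    # Augmented normal equations: (A^T A | A^T y).
    M = [[sum(A[i][a] * A[i][b] for i in range(n)) for b in range(p)]
         + [sum(A[i][a] * y[i] for i in range(n))]
         for a in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(p):
        pivot = max(range(c, p), key=lambda r: abs(M[r][c]))
        if abs(M[pivot][c]) < 1e-12:
            raise ValueError("descriptors are collinear or constant")
        M[c], M[pivot] = M[pivot], M[c]
        for r in range(p):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [mr - f * mc for mr, mc in zip(M[r], M[c])]
    return [M[i][p] / M[i][i] for i in range(p)]

# One made-up descriptor and five compounds that follow y = 1 + 2x exactly.
coefs = fit_mlr([[0], [1], [2], [3], [4]], [1, 3, 5, 7, 9])
```

The fitted coefficients can be read directly as the change in the target property per unit change in each descriptor, which is the ease of interpretation attributed to MLR; a neural network offers no comparable one-line reading of its weights.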

It is important to stress that all the different types of molecular descriptors present advantages and drawbacks [32-37]. However, availability and ease of interpretation are the main criteria to be used in their selection for deriving a QSAR model. [Pg.657]
