Two similarity measures

Fields can be used in virtual screening applications to assess the similarity (alignment) or complementarity (docking) of molecules. Two similarity measures have received the most attention: the so-called Carbó [195] and Hodgkin [196] indexes. Others are Pearson's product-moment correlation coefficient [169] and Spearman's rank correlation coefficient [169]. [Pg.84]
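As an illustration of the two indexes named above, the following minimal sketch evaluates them on fields sampled at a common set of grid points, so that the overlap integrals become plain sums. It assumes the commonly quoted forms (Carbó: overlap divided by the geometric mean of the self-overlaps; Hodgkin: twice the overlap divided by the sum of the self-overlaps). The function names and the sample field values are hypothetical and are not taken from the cited references.

```python
import math


def _overlaps(field_a, field_b):
    """Return (overlap AB, self-overlap AA, self-overlap BB) for two sampled fields."""
    if len(field_a) != len(field_b) or not field_a:
        raise ValueError("fields must be non-empty and sampled on the same grid")
    s_ab = sum(pa * pb for pa, pb in zip(field_a, field_b))
    s_aa = sum(pa * pa for pa in field_a)
    s_bb = sum(pb * pb for pb in field_b)
    return s_ab, s_aa, s_bb


def carbo_index(field_a, field_b):
    """Overlap normalised by the geometric mean of the self-overlaps."""
    s_ab, s_aa, s_bb = _overlaps(field_a, field_b)
    return s_ab / math.sqrt(s_aa * s_bb)


def hodgkin_index(field_a, field_b):
    """Overlap normalised by the arithmetic mean of the self-overlaps."""
    s_ab, s_aa, s_bb = _overlaps(field_a, field_b)
    return 2.0 * s_ab / (s_aa + s_bb)


# Hypothetical field values at four grid points (illustrative only).
field_a = [1.0, 2.0, 0.5, -1.0]
field_b = [2.0, 4.0, 1.0, -2.0]  # same shape as field_a, twice the magnitude

print(carbo_index(field_a, field_b))
print(hodgkin_index(field_a, field_b))
```

With these inputs the Carbó form depends only on the shape of the fields, whereas the Hodgkin form is also lowered by the difference in magnitude.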

The molecules and infrared spectra selected for training have a profound influence on the radial distribution function derived from the CPG network and on the quality of 3D structure derivation. Training data are typically selected dynamically; that is, each query spectrum selects its own set of training data by searching for the most similar infrared spectra, or the most similar input vectors. Two similarity measures for infrared spectra are useful ... [Pg.181]

FIGURE 53 A plot comparing two similarity measures, MACCS keys and path-based, of methotrexate to 32,000 variants. Note that both approaches recognize that some variants are very different to methotrexate (lower z-scores), but only MACCS keys get close to 0, that is, indistinguishable from a random structure. Path-based approaches still claim significant similarity for structures MACCS keys claim have no similarity to methotrexate. [Pg.105]

However, the question remains as to how the independence of two similarity measures can be established in analogy to the judging example described earlier. One possible approach is to determine the relationship of the representations to each other, but it is not immediately obvious how to accomplish this since the mathematical forms of different representations can be quite varied (from molecular fingerprints to property vectors to 3D electron density or related distribution functions). While it may not be possible to solve this problem in general, it may be possible to solve a more limited subproblem by, for example, assessing the relationship of different representations within the same general class such as molecular fingerprints. [Pg.374]

The streaming potential, streaming current, and Helmholtz-Smoluchowski equation approach is correct and valid when an electrolyte solution is forced through a narrow slit formed by two similar measured surfaces. This ensures that the thickness of the electrochemical double... [Pg.205]

The pair 2,6-dimethylheptane and 2,5-dimethylheptane, found to be the most similar using ordered row sums, is not even near the top in Table 9.3, but is close to the bottom of the list of most similar nonane pairs using the x(HD) index. Clearly, the ordered row sums and the connectivity x(HD) index, not surprisingly, have captured different structural features of graphs and do not carry the same information. But which of the two similarity measures would be better and more suitable when considering similarity of molecular properties ... [Pg.258]

Usually, the denominator, if present in a similarity measure, is just a normalizer; it is the numerator that is indicative of whether similarity or dissimilarity is being estimated, or both. The characteristics chosen for the description of the objects being compared are interchangeably called descriptors, properties, features, attributes, qualities, observations, measurements, calculations, etc. In the formulations above, the terms "matches" and "mismatches" refer to qualitative characteristics, e.g., binary ones (those which take one of two values: 1 (present) or 0 (absent)), while the terms "overlap" and "difference" refer to quantitative characteristics, e.g., those whose values can be arranged in order of magnitude along a one-dimensional axis. [Pg.303]

In order to compare two chemical (or any other) objects, e.g., two molecules, we need a measure. Plenty of similarity measures have been proposed; they are listed in Table 6-2. Generally speaking, these measures can be divided into two cases: one of qualitative characteristics, and the other of quantitative characteristics. Here we consider these two cases. [Pg.304]

Following Bradshaw [17], we can give the definition of a similarity measure as follows. Consider two objects A and B: a is the number of features (characteristics) present in A and absent in B, b is the number of features absent in A and present in B, c is the number of features common to both objects, and d is the number of features absent from both objects. Thus, c and d measure the present and the absent matches, respectively, i.e., similarity, while a and b measure the corresponding mismatches, i.e., dissimilarity. The total number of features is n = a + b + c + d. [Pg.304]
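The four counts defined above can be obtained directly from set operations. The sketch below is only an illustration under stated assumptions: the function name, the feature labels, and the two example objects are hypothetical, and the features of each object are assumed to be drawn from a known, finite list of all features considered.

```python
def feature_counts(features_a, features_b, all_features):
    """Return (a, b, c, d) for two objects described by sets of present features."""
    if not (features_a <= all_features and features_b <= all_features):
        raise ValueError("every feature must belong to all_features")
    a = len(features_a - features_b)                   # present in A, absent in B
    b = len(features_b - features_a)                   # absent in A, present in B
    c = len(features_a & features_b)                   # present in both (present matches)
    d = len(all_features - (features_a | features_b))  # absent from both (absent matches)
    return a, b, c, d


# Hypothetical feature list and objects (illustrative only).
all_features = {"OH", "NH2", "COOH", "C=O", "ring", "Cl"}
object_a = {"OH", "COOH", "ring"}
object_b = {"OH", "NH2", "C=O", "ring"}

a, b, c, d = feature_counts(object_a, object_b, all_features)
n = a + b + c + d  # total number of features considered
print(a, b, c, d, n)
```

Here c + d counts the matches and a + b counts the mismatches, so the four values always sum to the size of the feature list.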

3D similarity search methods are quite well developed. Thus, methods which attempt to find overlapping parts (atoms and functional groups) of the molecular moieties studied were reported first [31]. As discussed above for the case of 2D searching, these methods are of combinatorial complexity. To reduce this complexity some field-based methods have been introduced. In this case, the overlap of the fields of two structures is considered as a similarity measure. [Pg.314]

As the scalar product of two vectors is related to the cosine of the angle included by these vectors by Eq. (4), a frequently used similarity measure is the cosine coefficient (Eq. (5)). [Pg.406]
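Eq. (4) and Eq. (5) are not reproduced in this excerpt. For orientation, the standard relation between the scalar product and the included angle, and the cosine coefficient that follows from it, can be written as below. The symbols x_A, x_B (the two vectors), n (their length), and θ (the included angle) are notation chosen here and may differ from that of the source.

```latex
\mathbf{x}_A \cdot \mathbf{x}_B
  = \lVert \mathbf{x}_A \rVert \, \lVert \mathbf{x}_B \rVert \cos\theta
\quad\Longrightarrow\quad
\cos\theta
  = \frac{\sum_{k=1}^{n} x_{Ak}\, x_{Bk}}
         {\sqrt{\sum_{k=1}^{n} x_{Ak}^{2}}\;\sqrt{\sum_{k=1}^{n} x_{Bk}^{2}}}
```

The coefficient equals 1 for vectors pointing in the same direction and 0 for orthogonal vectors, independent of their lengths.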

A similarity measure is required for quantitative comparison of one structure with another, and as such it must be defined before the analysis can commence. Structural similarity is often measured by a root-mean-square distance (RMSD) between two conformations. In Cartesian coordinates the RMS distance d_ij between conformation i and conformation j of a given molecule is defined as the minimum of the functional... [Pg.84]

This relationship depends on the assumption that two similar stationary phases, irrespective of their polarity, can be considered to differ by measuring the ratio of the activity coefficients of two noncomplexing solutes (this basically implies the solute is nonpolar and will only interact with the stationary phase by dispersion forces). If this were true then. [Pg.79]

The effect can be observed and measured by using two similar thermometers (Figure 23.3), one of which has its bulb enclosed in a wet wick. The drier the air passing over them, the greater will be the rate of evaporation from the wick and the greater the difference between the two readings. In the case of air at 25°C, 50% saturation, the difference will be about 6.5 K. The measurements are termed the dry bulb and wet bulb temperatures, and the difference the wet bulb depression. [Pg.231]

The PL spectrum and onset of the absorption spectrum of poly(2,5-dioctyloxy-para-phenylene vinylene) (DOO-PPV) are shown in Figure 7-8b. The PL spectrum exhibits several phonon replicas at 1.8, 1.98, and 2.15 eV. The PL spectrum is not corrected for the system spectral response or self-absorption. These corrections would affect the relative intensities of the peaks, but not their positions. The highest energy peak is taken as the zero-phonon (0-0) transition and the two lower peaks correspond to one- and two-phonon transitions (1-0 and 2-0, respectively). The 2-0 transition is significantly broader than the 0-0 transition. This could be explained by the existence of several unresolved phonon modes which couple to electronic transitions. In this section we concentrate on films and dilute solutions of DOO-PPV, though similar measurements have been carried out on MEH-PPV [23]. Fresh DOO-PPV thin films were cast from chloroform solutions of 5% molar concentration onto quartz substrates; the films were kept under constant vacuum. [Pg.115]

On the other hand, the analyst might not be interested in global retention indices. Indeed, by increasing the temperature for SF3, he would obtain similar retention indices as for the other two. He will then observe that the relative retention times, i.e. the retention times of the substances compared with each other, are the same for SF1 and SF3 and different from SF2. Chemically, this means that SF3 has different polarity from SF1, but the same specific interactions. This is best expressed by using the correlation coefficient as the similarity measure. Indeed, r13 = 1, indicating complete similarity, while r12 and r23 are much lower. Since both... [Pg.63]
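The behaviour described above, where two stationary phases with the same relative retention pattern correlate perfectly even though their absolute values differ, can be checked numerically. The sketch below implements Pearson's product-moment correlation coefficient in plain Python. The retention values for the three phases are hypothetical numbers chosen for illustration; they are not data from the source.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    s_yy = sum((yi - mean_y) ** 2 for yi in y)
    if s_xx == 0 or s_yy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return s_xy / math.sqrt(s_xx * s_yy)


# Hypothetical retention values of four substances on three stationary phases.
sf1 = [1.0, 2.0, 4.0, 7.0]
sf2 = [4.0, 1.0, 7.0, 2.0]   # different elution pattern
sf3 = [2.0, 4.0, 8.0, 14.0]  # same pattern as sf1, all values doubled

r13 = pearson_r(sf1, sf3)
r12 = pearson_r(sf1, sf2)
r23 = pearson_r(sf2, sf3)
print(r13, r12, r23)
```

For these values r13 is 1 up to rounding, while r12 and r23 are close to zero, because the correlation coefficient ignores a uniform scaling of all retention values.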

Binary variables usually have values of 0 (for attribute absent) or 1 (for attribute present). The simplest type of similarity measure is the matching coefficient. For two objects i and i' and attribute j ... [Pg.65]

The Jaccard similarity coefficient is then computed with eq. (30.13), where m is now the number of attributes for which one of the two objects has a value of 1. This similarity measure is sometimes called the Tanimoto similarity. The Tanimoto similarity has been used in combinatorial chemistry to describe the similarity of compounds, e.g. based on the functional groups they have in common [9]. Unfortunately, the names of similarity coefficients are not standard, so that it can happen that the same name is given to different similarity measures or more than one name is given to a certain similarity measure. This is the case for the Tanimoto coefficient (see further). [Pg.65]
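Eq. (30.13) is not reproduced in this excerpt. The sketch below assumes the usual form of the Jaccard (Tanimoto) coefficient for binary data, namely the number of attributes present in both objects divided by m, the number of attributes present in at least one of them. The simple matching coefficient is shown next to it for contrast. The function names and the two example vectors are hypothetical.

```python
def jaccard_similarity(x, y):
    """Shared 1s divided by the number of attributes that are 1 in at least one object."""
    if len(x) != len(y) or not x:
        raise ValueError("need two non-empty binary vectors of equal length")
    both = sum(1 for xi, yi in zip(x, y) if xi == 1 and yi == 1)
    m = sum(1 for xi, yi in zip(x, y) if xi == 1 or yi == 1)
    if m == 0:
        raise ValueError("undefined when neither object has any attribute")
    return both / m


def simple_matching(x, y):
    """Fraction of attributes on which the two objects agree (1-1 or 0-0)."""
    if len(x) != len(y) or not x:
        raise ValueError("need two non-empty binary vectors of equal length")
    return sum(1 for xi, yi in zip(x, y) if xi == yi) / len(x)


# Hypothetical binary attribute vectors (1 = attribute present, 0 = absent).
obj_i = [1, 1, 0, 0, 0, 0, 0, 0]
obj_k = [1, 0, 1, 0, 0, 0, 0, 0]

print(jaccard_similarity(obj_i, obj_k))
print(simple_matching(obj_i, obj_k))
```

Because the two objects share many absent attributes, simple matching rates them as far more similar than the Jaccard coefficient does, which is one reason the measure in use should be stated explicitly.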

The first of these two is also called the Tanimoto coefficient by some authors. It can be verified that, since distance = 1 - similarity, this is equal to the simple matching coefficient. Clearly, confusion is possible and authors using a certain distance or similarity measure should always define it unambiguously. [Pg.66]

Because x, as well as w are normalized, represents the cosine or correlation coefficient between the two vectors. In a variant of ART, Fuzzy ART, a fuzzy similarity measure is used instead of the cosine similarity measure [14]. [Pg.693]

Two objects are similar and have similar properties to the extent that they have similar distributions of charge in real space. Thus chemical similarity should be defined and determined using the atoms of QTAIM whose properties are directly determined by their spatial charge distributions [32]. Current measures of molecular similarity are couched in terms of Carbó's molecular quantum similarity measure (MQSM) [33-35], a procedure that requires maximization of the spatial integration of the overlap of the density distributions of two molecules the similarity of which is to be determined, and where the product of the density distributions can be weighted by some operator [36]. The MQSM method has several difficulties associated with its implementation [31] ... [Pg.215]

Big Chemical Encyclopedia
