Metrics, similarity

Schuffenhauer A, Floersheim P, Acklin P, Jacoby E. Similarity metrics for ligands reflecting the similarity of the target proteins. J Chem Inf Comp Sci 2003 43 391-405. [Pg.372]

Clearly, within the conceptual framework described above, there is extensive room for exploration in creating fingerprints and similarity measures to retrieve molecules based on varying conceptions of similarity [42—441. The simplest types of fingerprint consist simply of features indices that map the presence or absence of a small library of functional groups. The most well known and effective are the MACCS keys. These were initially chemical feature indices, that we later used successfully as a similarity metric. [Pg.93]

The pragmatic beauty of the chemical fingerprint is that the more common features of two molecules that there are, the more common bits are set. The mathematic approach used to translate the fingerprint comparison data into a measure of similarity tunes the molecular comparison [5]. The Tanimoto similarity index works well when a relatively sparse fingerprint is used and when the molecules to be compared are broadly comparable in size and complexity [5]. If the nature of the molecules or the comparison desired is not adequately met by the Tanimoto index, multiple other indices are available to the researcher. For example, the Daylight software offers the user over ten similarity metrics, and the Pipeline Pilot as distributed offers at least three. Some of these metrics (e.g., Tversky, Cosine) offer better behavior if the query molecule is significantly smaller than the molecule compared to it. [Pg.94]

The aim of case-based reasoning is to provide advice based on a set of known examples that are judged to be relevant to the user s query. Files within the library contain data about past cases relevant to the area of expertise, how they were tackled, what the results of this approach were, and whether the action taken was appropriate and successful. Each case is tagged with a set of attributes that describe the case, so that when the library is searched for relevant material, it can quickly be identified through some form of similarity metric. [Pg.225]

Application of a field up, however, leads to a dramatic switching of the extinction brushes counterclockwise, to give the configuration shown on the left in Figure 8.35. This dramatic, chiral EO response is mirrored by other domains in the sample of opposite handedness but similar metrics. If the field is then removed, domains of the type shown on the left in Figure 8.35 are... [Pg.511]

Horvath, D. and Jeandenans, C. (2003) Neighborhood behavior of in silico structural spaces with respect to in vitro activity spaces - a benchmark for neighborhood behavior assessment of different in silico similarity metrics. Journal of Chemical Information and Computer Sciences, 43, 691-698. [Pg.52]

Sahm N, Holliday J, Willett P. (2003) Combination of Fingerprint-Based Similarity Coefficients Using Data Fusion. /. Chem. Inf. Comp. Set. 43 435-442. Schuffenhauer A, Floersheim P, Acklin P, Jacoby E. (2003) Similarity Metrics for Ligands Reflecting the Similarity of the Target Proteins. J. Chem. Inf. Comp. Set. 43 391-405. [Pg.155]

At the level of individual hits, the database can be queried to retrieve either marketed BioPrint drugs that have that same activity, or the ADR associations discussed in the previous section can be queried to identify potential ADRs and their relative risks. At the profile level, compounds with similar profiles can be identified using standard statistical methods such as similarity metrics and hierarchical clustering. This similarity can be assessed using the whole panel of assays or by using selected subsets of those assays as determined by the user. Once compounds with similar profiles have been identified, in vivo data for the similar compoimds can be accessed and examined for information that may permit the user to anticipate in vivo effects. [Pg.198]

One or more lead molecules may be used as a focusing target. Similarity metrics include Daylight fingerprint Tanimoto similarity. The penalty score for each compound in the library is defined as the distance between it and the most similar lead molecule. The penalty score for the library is the average of the individual compound penalty scores. QSAR predictions and docking scores can also be used in this term. [Pg.385]

As stated previously for the topological CATS descriptor [31], the influence of different similarity metrics on the overall enrichment is marginal. For the full... [Pg.65]

The first method used a similarity metric to select the top percentage of hits and the second method does the selection based only on number of common pharmacophores between the receptor active site fingerprint and the 3D fingerprint for compounds in the virtual library. Both analysis techniques are extremely fast. One of the major advantages of FLIP technology is its throughput. [Pg.199]

All pattern classification methods listed group items by similarity. Measurement of similarity differs depending upon the method, and therefore different methods yield different results. Table 12 describes several similarity metrics and why they produce different results for the same data set. [Pg.542]

If descriptor combinations are expressed as bit strings (often called fingerprints, as described in more detail later on), each test molecule is assigned a characteristic bit pattern, and pair-wise molecular similarity can be assessed by quantifying the overlap of bit strings using various similarity metrics (coefficients). Examples are shown in Table 1.4. [Pg.8]

Search algorithms have advanced over the years to the point that most of the spectral data are used in the search. The methods are referred to as full-spectra searches because the entire spectral pattern is used in the matching procedure. Again, a number of similarity metrics are used, but most produce similar results. Typically, the spectral range for the search is selectable, and the library and target spectra are all normalized so that the total spectral area is 1.0. Either the Euclidean distance or the dot product between the target and library entries is calculated. The Euclidean distance is defined as... [Pg.286]

Upon examining next that aggregation of atoms that would be known as calix[3]arene, nearly similar metric problems are in evidence. Namely, without the hydroxyl groups the molecule would approach coplanarity however, the hydroxyl groups introduce Coulomb repulsion resulting in extreme lability. As above, despite the non-viability of such a molecule,... [Pg.233]

The objective of a spread design is to identify a subset of molecules in which the molecules are as dissimilar as possible under a given similarity metric. For a given metric to measure the similarity of a subset, all subsets of size k (plus any molecules previously selected) could be evaluated and the subset that produces the lowest similarity measure chosen. In practice, simple non-optimal sequential algorithms are often used to approximate the maximally dissimilar subset two such algorithms are described below. [Pg.84]

Big Chemical Encyclopedia

Chemical substances, components, reactions, process design ...