Big Chemical Encyclopedia

Chemical substances, components, reactions, process design ...


Similarity Tanimoto

For examples of different types of similarity measures, see Table 6-2. The Tanimoto similarity measure is monotonic with that of Dice (alias Sorensen, Czekanowski), which uses an arithmetic-mean normalizer and gives double weight to the present matches. Russell/Rao (Table 6-2) adds the matching absences to the normalizer used in Tanimoto; the cosine similarity measure [19] (alias Ochiai) uses a geometric-mean normalizer. [Pg.304]
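The relationships among these coefficients can be made concrete with a minimal sketch. It assumes the usual 2x2 contingency counts for two binary fingerprints (a = bits set in both, b and c = bits set in only one of the two, d = bits set in neither). The function names, the example bit vectors, and the choice to return 0.0 for empty denominators are illustrative assumptions and are not taken from Table 6-2.

```python
from math import sqrt


def counts(x, y):
    """Return (a, b, c, d) for two equal-length sequences of 0/1 values."""
    if len(x) != len(y):
        raise ValueError("fingerprints must have the same length")
    a = sum(1 for p, q in zip(x, y) if p and q)          # set in both
    b = sum(1 for p, q in zip(x, y) if p and not q)      # set only in x
    c = sum(1 for p, q in zip(x, y) if q and not p)      # set only in y
    d = sum(1 for p, q in zip(x, y) if not p and not q)  # set in neither
    return a, b, c, d


def tanimoto(x, y):
    a, b, c, _ = counts(x, y)
    denom = a + b + c
    return a / denom if denom else 0.0  # 0.0 for two empty vectors: an assumed convention


def dice(x, y):
    # Arithmetic-mean normalizer: a / (((a + b) + (a + c)) / 2)
    a, b, c, _ = counts(x, y)
    denom = 2 * a + b + c
    return 2 * a / denom if denom else 0.0


def cosine(x, y):
    # Geometric-mean normalizer: a / sqrt((a + b) * (a + c))
    a, b, c, _ = counts(x, y)
    denom = sqrt((a + b) * (a + c))
    return a / denom if denom else 0.0


def russell_rao(x, y):
    # Matching absences (d) are added to the Tanimoto normalizer.
    a, b, c, d = counts(x, y)
    denom = a + b + c + d
    return a / denom if denom else 0.0


x = [1, 1, 0, 1, 0, 0, 1, 0]
y = [1, 0, 0, 1, 1, 0, 1, 0]
print(tanimoto(x, y), dice(x, y), cosine(x, y), russell_rao(x, y))
```

For these vectors the Tanimoto value T is 0.6 and the Dice value is 0.75, which agrees with the identity Dice = 2T / (1 + T). Because that mapping is strictly increasing, the two coefficients rank pairs of fingerprints identically.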

Tanimoto similarity coefficient: Also known as the Jaccard coefficient. E =1 W/B ... [Pg.693]

The Jaccard similarity coefficient is then computed with eq. (30.13), where m is now the number of attributes for which at least one of the two objects has a value of 1. This similarity measure is sometimes called the Tanimoto similarity. The Tanimoto similarity has been used in combinatorial chemistry to describe the similarity of compounds, e.g. based on the functional groups they have in common [9]. Unfortunately, the names of similarity coefficients are not standard, so the same name may be given to different similarity measures, or more than one name to the same similarity measure. This is the case for the Tanimoto coefficient (see further). [Pg.65]

The pragmatic beauty of the chemical fingerprint is that the more features two molecules have in common, the more set bits their fingerprints share. The mathematical approach used to translate the fingerprint comparison data into a measure of similarity tunes the molecular comparison [5]. The Tanimoto similarity index works well when a relatively sparse fingerprint is used and when the molecules to be compared are broadly comparable in size and complexity [5]. If the Tanimoto index does not suit the nature of the molecules or the comparison desired, multiple other indices are available to the researcher. For example, the Daylight software offers the user over ten similarity metrics, and Pipeline Pilot as distributed offers at least three. Some of these metrics (e.g., Tversky, Cosine) behave better if the query molecule is significantly smaller than the molecule compared to it. [Pg.94]
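The size effect described above can be illustrated with a short sketch of the Tversky index, written here in its commonly used set form, common / (common + alpha * only_query + beta * only_target). The fingerprints are represented as sets of on-bit positions. The function name, the bit sets, and the alpha/beta values are illustrative assumptions and do not reflect the Daylight or Pipeline Pilot implementations.

```python
def tversky(query, target, alpha, beta):
    """Tversky index for two sets of on-bit positions.

    alpha weights bits found only in the query,
    beta weights bits found only in the target.
    """
    common = len(query & target)
    only_query = len(query - target)
    only_target = len(target - query)
    denom = common + alpha * only_query + beta * only_target
    return common / denom if denom else 0.0  # assumed convention for empty inputs


# Hypothetical fingerprints: a small query fully contained in a larger target.
small_query = {1, 2, 3}
large_target = {1, 2, 3, 4, 5, 6, 7, 8, 9}

# alpha = beta = 1 reproduces the Tanimoto index.
tanimoto_score = tversky(small_query, large_target, 1.0, 1.0)

# alpha = 1, beta = 0 ignores bits that occur only in the target.
asymmetric_score = tversky(small_query, large_target, 1.0, 0.0)

print(tanimoto_score, asymmetric_score)
```

Here the Tanimoto setting gives 3/9, penalizing the target for its extra bits, while the asymmetric setting gives 1.0 because every query bit is present in the target. Setting alpha = beta = 0.5 would reproduce the Dice coefficient.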

Fig. 18.8 Similarity profile for filtered set of commercially available compounds. 5000 randomly selected compounds from the Available Chemicals Directory that pass the REOS filter were ranked according to their Tanimoto similarity scores (vertical axis) using Daylight fingerprints. 2886 compounds (58%) had similarity scores below 0.85.
Fig. 1. An example of two hydrogen-suppressed graphs, G1 and G2, a common substructure CS(G1, G2), and the maximum common substructure MCS(G1, G2) are shown above. The Tanimoto similarity index and the distance between the two chemical graphs are computed below.
These two expressions form the basis for several measures such as Tanimoto similarity (see Subheading 2.2. for an extensive discussion)... [Pg.9]

The most widely used similarity measure by far is the Tanimoto similarity coefficient STan, which is given in set-theoretic language as (cf. Eq. 2.13 for the graph-theoretical case)... [Pg.11]

Fig. 4. (A) The other asymmetric Tversky similarity index, S VC, has a value of 0.69. Exchanging the roles of the query and target molecules (Q<=>T) gives (B), which shows that smaller target molecules are more likely to be retrieved from a large query structure using the asymmetric Tversky similarity index than the Tanimoto similarity index.
As is the case for asymmetric similarity indices, both SPetmax (A,B) and SPet. (A,B) are bounded by zero and unity, but are ordered with respect to each other and with respect to Tanimoto similarity, that is,... [Pg.16]

Distances in these spaces should be based upon an L1 or city-block metric (see Eq. 2.18) and not the L2 or Euclidean metric typically used in many applications. The reasons for this are the same as those discussed in Subheading 2.2.1. for binary vectors. Set-based similarity measures can be adapted from those based on bit vectors using an ansatz borrowed from fuzzy set theory (41,42). For example, the Tanimoto similarity coefficient becomes... [Pg.17]
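The source equation is elided here, so the following is only a sketch of one widely used fuzzy-set form of the Tanimoto coefficient for non-negative real-valued vectors, in which intersection becomes an element-wise minimum and union an element-wise maximum. It may differ from the expression in the cited text. The function names and example vectors are illustrative assumptions.

```python
def fuzzy_tanimoto(x, y):
    """Min/max (fuzzy-set) Tanimoto for equal-length non-negative vectors."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    if any(v < 0 for v in x) or any(v < 0 for v in y):
        raise ValueError("this form assumes non-negative components")
    intersection = sum(min(p, q) for p, q in zip(x, y))  # fuzzy intersection size
    union = sum(max(p, q) for p, q in zip(x, y))         # fuzzy union size
    return intersection / union if union else 0.0        # assumed convention for all-zero input


def city_block(x, y):
    """L1 (city-block) distance between two equal-length vectors."""
    return sum(abs(p - q) for p, q in zip(x, y))


x = [0.2, 0.5, 0.0, 1.0]
y = [0.4, 0.5, 0.3, 0.5]
print(fuzzy_tanimoto(x, y), city_block(x, y))
```

For 0/1 vectors this form reduces to the ordinary bit-vector Tanimoto coefficient. Because max(p, q) - min(p, q) = |p - q|, the complement 1 - S equals the city-block distance divided by the sum of element-wise maxima, which connects this similarity to the L1 metric mentioned above.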

The inner-product terms (, is the labeled graph corresponding to Zth basis fragment, vA is the labeled graph corresponding to molecule A, and STan(G ,GA) is the chemical graph-theoretical Tanimoto similarity coefficient. [Pg.26]

Fig. 3. Coverage of chemistry space by four overlapping sublibraries. (A) Different diversity libraries cover similar chemistry space but show little overlap. This shows three libraries chosen using different dissimilarity measures to act as different representations of the available chemistry space. The compounds from these libraries are presented in this representation by first calculating the intermolecular similarity of each of the compounds to all of the other compounds using fingerprint descriptors and the Tanimoto similarity index. Principal component analysis was then conducted on the similarity matrix to reduce it to a series of principal components that allow the chemistry space to be presented in three dimensions.
Compounds were chosen for further testing using Cousin fingerprint descriptors (73) and the Tanimoto similarity coefficient with a 67% similarity cutoff. [Pg.99]

Fig. 7. (see facing page) Comparison of the intramolecular similarity distribution for four compound collections versus the NCI collection. This figure shows the intermolecular similarity (calculated using the Tanimoto similarity coefficient and ISIS fingerprint descriptors) between each compound in each library. The first panel shows how the NCI dataset contains many identical compounds (or salts of the same compound) that have been submitted for testing. [Pg.102]

S-score Similarity score based on Tanimoto similarity to selected desirable compounds. [Pg.115]

One or more lead molecules may be used as a focusing target. Similarity metrics include Daylight fingerprint Tanimoto similarity. The penalty score for each compound in the library is defined as the distance between it and the most similar lead molecule. The penalty score for the library is the average of the individual compound penalty scores. QSAR predictions and docking scores can also be used in this term. [Pg.385]
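The penalty term described above can be sketched as follows. The sketch assumes that fingerprints are sets of on-bit positions and that "distance" means 1 minus the Tanimoto similarity; the source does not state which distance is used, so that choice, the function names, and the example data are all assumptions.

```python
def tanimoto(a, b):
    """Tanimoto similarity of two sets of on-bit positions."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def compound_penalty(compound, leads):
    """Distance (assumed: 1 - Tanimoto) to the most similar lead molecule."""
    return 1.0 - max(tanimoto(compound, lead) for lead in leads)


def library_penalty(library, leads):
    """Average of the individual compound penalty scores."""
    return sum(compound_penalty(c, leads) for c in library) / len(library)


# Hypothetical fingerprints for two leads and a three-compound library.
leads = [{1, 2, 3, 4}, {5, 6, 7}]
library = [{1, 2, 3}, {5, 6, 7, 8}, {9, 10}]

print([compound_penalty(c, leads) for c in library])
print(library_penalty(library, leads))
```

The first two compounds each resemble one lead (penalty 0.25), while the third shares no bits with either lead (penalty 1.0), so the library penalty is 0.5. A library focused more tightly on the leads would score lower.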

Retrieved active compound from the GVKBio database in the similarity search with a Tanimoto similarity cut-off of 0.85. [Pg.144]
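A similarity search with a cut-off of this kind can be sketched as a filter-and-rank step. The example assumes fingerprints stored as sets of on-bit positions in a plain dictionary; the compound names and bit sets are hypothetical, and nothing here reflects how the GVKBio database is actually queried.

```python
def tanimoto(a, b):
    """Tanimoto similarity of two sets of on-bit positions."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def similarity_search(query, database, cutoff=0.85):
    """Return (name, score) pairs with score >= cutoff, best match first."""
    scored = ((name, tanimoto(query, fp)) for name, fp in database.items())
    hits = [(name, score) for name, score in scored if score >= cutoff]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Hypothetical query and database fingerprints.
query = set(range(20))
database = {
    "cpd_a": set(range(18)),  # 18 shared bits out of 20 in the union
    "cpd_b": set(range(10)),  # 10 shared bits out of 20 in the union
    "cpd_c": set(range(22)),  # 20 shared bits out of 22 in the union
}

for name, score in similarity_search(query, database):
    print(name, round(score, 3))
```

With the 0.85 cut-off, cpd_c and cpd_a are retrieved and cpd_b (similarity 0.5) is rejected.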

Fig. 13.6. Results from the third validation study. The y-axis represents the Tanimoto similarity score of returned hits with respect to their corresponding query molecule, calculated based on the FCFP4 molecular fingerprints (31). The x-axis shows the drug molecules in Fig. 13.5. Search hits are color coded by the PGVL reactions (VRXN) from which they originate.
In the nearest-neighbor (NN) method, the property F of the target compound is calculated as an average (or weighted average) of the values of F for its nearest neighbors in the space of descriptors selected for the model. Different metrics (Euclidean distances, Tanimoto similarity coefficients, etc.) can be used to identify the neighbors. Their number k is optimized using a cross-validation procedure for the training set. [Pg.325]
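A minimal sketch of this nearest-neighbor prediction, using the Tanimoto coefficient to identify neighbors, is given below. It assumes fingerprints stored as sets of on-bit positions and uses the similarity itself as the weight in the weighted average; the weighting scheme, function names, and training data are illustrative assumptions, and the cross-validation used to choose k is not shown.

```python
def tanimoto(a, b):
    """Tanimoto similarity of two sets of on-bit positions."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def knn_predict(target, training, k, weighted=False):
    """Predict a property as the (weighted) mean over the k most similar compounds.

    training is a list of (fingerprint, property_value) pairs.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    scored = sorted(
        ((tanimoto(target, fp), value) for fp, value in training),
        key=lambda pair: pair[0],
        reverse=True,
    )
    neighbors = scored[:k]
    total_weight = sum(sim for sim, _ in neighbors)
    if weighted and total_weight > 0:
        # Similarity-weighted average (an assumed weighting scheme).
        return sum(sim * value for sim, value in neighbors) / total_weight
    return sum(value for _, value in neighbors) / len(neighbors)


# Hypothetical training set: (fingerprint, measured property).
training = [({1, 2, 3, 4}, 2.0), ({1, 2, 3}, 4.0), ({7, 8}, 10.0)]
target = {1, 2, 3, 5}

print(knn_predict(target, training, k=2))
print(knn_predict(target, training, k=2, weighted=True))
```

With k = 2 the dissimilar third compound is excluded: the plain average of the two neighbors is 3.0, and the weighted average shifts slightly toward the more similar neighbor.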







© 2024 chempedia.info