Database similarity searching

Sequence similarity database searching and protein sequence analysis constitute one of the most important computational approaches to understanding protein structure and function. Although most computational methods used for nucleic acid sequence analysis are also applicable to protein sequence studies, how to capture the enriched features of amino acid alphabets (Chapter 6) poses a special challenge for protein analysis. [Pg.129]

Software tools for database searching and available protein sequence databases for protein identification were reviewed [130-131]. Liska and Shevchenko [132] reviewed approaches to studying proteomes of organisms with nonsequenced genomes via sequence similarity database searching. [Pg.478]

Following the "similar structure - similar property principle", high-ranked structures in a similarity search are likely to have similar physicochemical and biological properties to those of the target structure. Accordingly, similarity searches play a pivotal role in database searches related to drug design. Some frequently used distance and similarity measures are illustrated in Section 8.2.1. [Pg.405]

A useful empirical method for the prediction of chemical shifts and coupling constants relies on the information contained in databases of structures with the corresponding NMR data. Large databases with hundreds of thousands of chemical shifts are commercially available and are linked to predictive systems, which basically rely on database searching [35]. Protons are internally represented by their structural environments, usually their HOSE codes [9]. When a query structure is submitted, a search is performed to find the protons belonging to similar (overlapping) substructures. These are the protons with the same HOSE codes as the protons in the query molecule. The predicted chemical shift is calculated as the average chemical shift of the retrieved protons. [Pg.522]
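The lookup-and-average step described above can be sketched in a few lines. This is only an illustration under stated assumptions: the environment codes (`env_A`, `env_B`) are placeholders rather than real HOSE codes, the shift values are invented, and the function names are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical reference data: (environment code, observed 1H shift in ppm).
# The code strings are placeholders, not real HOSE codes.
REFERENCE_PROTONS = [
    ("env_A", 7.26),
    ("env_A", 7.30),
    ("env_B", 2.10),
    ("env_B", 2.14),
    ("env_B", 2.06),
]


def build_index(records):
    """Group the observed shifts by the environment code of the proton."""
    index = defaultdict(list)
    for code, shift in records:
        index[code].append(shift)
    return index


def predict_shift(index, code):
    """Return the mean shift of all reference protons sharing `code`.

    Returns None when the database holds no proton with that code.
    """
    shifts = index.get(code)
    if not shifts:
        return None
    return mean(shifts)


index = build_index(REFERENCE_PROTONS)
prediction_a = predict_shift(index, "env_A")
prediction_b = predict_shift(index, "env_B")
prediction_missing = predict_shift(index, "env_C")
```

A real system would presumably fall back to a less specific environment description when no exact match exists; that fallback is not shown here.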

The generation of loops is necessary because disconnected regions are often separated by a section where a few amino acids have been inserted or omitted. These are often extra loops that can be determined by several methods. One method is to perform a database search to find a similar loop and then use its geometric structure. Often, other conformation search methods are used. Manual structure building may be necessary in order to find a conformation that connects the segments. Visual inspection of the result is recommended in any case. [Pg.188]

Another technique employs a database search. The calculation starts with a molecular structure and searches a database of known spectra to find those with the most similar molecular structure. The known spectra are then used to derive parameters for inclusion in a group additivity calculation. This can be a fairly sophisticated technique incorporating weight factors to account for how closely the known molecule conforms to typical values for the component functional groups. The use of a large database of compounds can make this a very accurate technique. It also ensures that liquid, rather than gas-phase, spectra are being predicted. [Pg.254]

NC Perry, VJ van Geerestein. Database searching on the basis of three-dimensional molecular similarity using the SPERM program. J Chem Inf Comput Sci 32(6) 607, 1992. [Pg.368]

Structure and substructure searching are very powerful ways of accessing a database, but they do assume that the searcher knows precisely the information that is needed, that is, a specific molecule or a specific class of molecules, respectively. The third approach to database searching, similarity searching, is less precise in nature because it searches the database for molecules that are similar to the user's query, without formally defining exactly how the molecules should be related (Fig. 8.3). [Pg.193]
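A similarity search of this kind amounts to scoring every database entry against the query and returning the best-ranked ones. The sketch below assumes molecules are represented as sets of structural features and uses the Tanimoto coefficient as the score. Both choices are assumptions for illustration; the excerpt does not prescribe a representation or a measure, and the feature labels and molecule names are invented.

```python
def tanimoto(a, b):
    """Tanimoto coefficient of two feature sets: |A & B| / |A | B|."""
    a, b = set(a), set(b)
    union = len(a | b)
    if union == 0:
        return 0.0
    return len(a & b) / union


def similarity_search(query, database, top_k=3):
    """Rank database entries by decreasing similarity to the query."""
    scored = [(name, tanimoto(query, feats)) for name, feats in database.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Hypothetical database: molecule name -> set of feature labels.
DATABASE = {
    "mol_1": {"ring", "OH", "C=O"},
    "mol_2": {"ring", "NH2"},
    "mol_3": {"C=O", "OH"},
    "mol_4": {"Cl"},
}

query = {"ring", "OH", "C=O", "CH3"}
hits = similarity_search(query, DATABASE, top_k=2)
```

Unlike a substructure search, no entry is rejected outright; the user decides how far down the ranked list to read.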

The E-state indices may define chemical spaces that are relevant in similarity/diversity searches in chemical databases. This similarity search is based on atom-type E-state indices computed for the query molecule [55]. Each E-state index is converted to a z score, $z_i = (x_i - \mu_i)/\sigma_i$, where $x_i$ is the ith E-state atomic index, $\mu_i$ is its mean, and $\sigma_i$ is its standard deviation in the entire database. The similarity was computed with the Euclidean distance and with the cosine index, and the database used was the Pomona MedChem database, which contains 21000 chemicals. Tests performed with the anti-inflammatory drug prednisone and the antimalarial drug mefloquine as query molecules demonstrated that the chemical space defined by E-state indices is efficient in identifying similar compounds from drug and drug-like databases. [Pg.103]
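The standardization and the two comparison measures named above can be sketched as follows. The descriptor values are invented stand-ins for atom-type E-state indices, the function names are hypothetical, and the use of the population standard deviation is an assumption based on the phrase "in the entire database".

```python
import math
from statistics import mean, pstdev


def zscore_columns(rows):
    """Standardize each descriptor column: z = (x - mean) / standard deviation.

    Uses the population standard deviation over all rows (an assumption).
    A constant column is mapped to zeros to avoid division by zero.
    """
    columns = list(zip(*rows))
    means = [mean(c) for c in columns]
    sds = [pstdev(c) for c in columns]
    return [
        [(x - m) / s if s > 0 else 0.0 for x, m, s in zip(row, means, sds)]
        for row in rows
    ]


def euclidean(u, v):
    """Euclidean distance between two descriptor vectors (smaller = closer)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def cosine(u, v):
    """Cosine index between two descriptor vectors (larger = closer)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


# Hypothetical descriptor table: one row per molecule, one column per index.
DESCRIPTORS = [
    [1.0, 10.0],
    [2.0, 20.0],
    [3.0, 30.0],
]

z = zscore_columns(DESCRIPTORS)
d01 = euclidean(z[0], z[1])  # distance from molecule 0 to molecule 1
d02 = euclidean(z[0], z[2])  # distance from molecule 0 to molecule 2
c02 = cosine(z[0], z[2])     # cosine index between molecules 0 and 2
```

Standardizing first keeps a descriptor with a large numerical range from dominating the distance.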

Xue, L., Godden, J., and Bajorath, J. Database searching for compounds with similar biological activity using short binary bit string representations of molecules. J. Chem. Inf. Comput. Sci. 1999, 39(5), 881-886. [Pg.196]

The standard screening approach when several active molecules have been identified is pharmacophore mapping followed by 3D database searching. This approach assumes that the active molecules have a common mode of action and that features that are common to all of the molecules describe the pharmacophoric pattern responsible for the observed bioactivity. This is a powerful technique but one that may not be applicable to the structurally heterogeneous hits that characterize typical HTS experiments or sets of competitor compounds drawn from the public literature. In such cases, it is appropriate to consider approaches based on 2D similarity searching and we present here a comparison of approaches for combining the structural information that can be gleaned from a small set of reference structures. [Pg.134]

Fig. 3. Asymmetric similarity searching might provide some benefits not afforded by symmetric similarity searching. (A) Database searching using ISIS keys and symmetric similarity searching, S_Tan, will not yield enalapril as a database hit because the similarity value is too low, 0.58. (B) Database searching using asymmetric similarity searching, S_Tve, could yield enalapril as a database hit because the asymmetric similarity value is 0.78.
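The difference between the two kinds of measure can be illustrated with a small sketch. It assumes the symmetric measure is the Tanimoto coefficient and the asymmetric one is the Tversky index, which is a common pairing but is not spelled out in the caption. The bit sets and the weights alpha and beta are invented, so the numbers below do not reproduce the 0.58 and 0.78 quoted for enalapril.

```python
def tanimoto(a, b):
    """Symmetric similarity: shared features over all features present."""
    common = len(a & b)
    union = len(a | b)
    return common / union if union else 0.0


def tversky(query, target, alpha=1.0, beta=0.0):
    """Asymmetric similarity.

    alpha weights features found only in the query, beta weights features
    found only in the target. alpha = beta = 1 recovers Tanimoto.
    """
    common = len(query & target)
    denom = common + alpha * len(query - target) + beta * len(target - query)
    return common / denom if denom else 0.0


# Hypothetical key sets: a small query and a larger database molecule
# that contains most of the query's features plus several others.
query = {1, 2, 3, 4}
target = {1, 2, 3, 5, 6, 7, 8}

s_sym = tanimoto(query, target)                       # 3 / 8 = 0.375
s_asym = tversky(query, target, alpha=1.0, beta=0.0)  # 3 / 4 = 0.75
```

With beta set to zero, the extra features of the larger molecule are not penalized, so a hit that embeds most of the query scores higher than it does under the symmetric measure.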

Big Chemical Encyclopedia
