Protein database printing

InterPro: Combination of the major protein family databases (PRINTS, PROSITE, Pfam, ProDom). http://www.ebi.ac.uk/interpro/ (EBI, Cambridge; R. Apweiler et al.) [76, 300]... [Pg.282]

Attwood TK et al. Progress with the PRINTS protein fingerprint database. Nucleic Acids Res 1996; 24: 182-188. [Pg.113]

It can be difficult, if not impossible, to find the domain structure of a protein of interest from the primary literature. The sequence may contain many common domains, but these are usually not apparent from literature searches. Articles defining new domains may include the protein, but only in an alignment figure, which is not searchable. Perhaps, with the advent of online access to articles, the full text including figures may become searchable. Fortunately, there have been several attempts to make this hidden information available in a way that can be easily searched. These resources, called domain family databases, are exemplified by Prosite, Pfam, Prints, and SMART. These databases gather information from the literature about common domains and make it searchable in a variety of ways. They usually allow a researcher to look at the precalculated domain organization of proteins in the sequence database, and they also provide a way to search new sequences... [Pg.143]

Most protein families are characterized by several conserved motifs. The PRINTS fingerprint database was developed to use multiple conserved motifs to build diagnostic signatures of family membership (Attwood et al., 1998). If a query sequence fails to match all the motifs in a given fingerprint, the pattern of matches formed by the remaining motifs still allows the user to make a reasonable diagnosis. PRINTS can be accessed by keyword and sequence searches at http://www.bioinf. [Pg.215]
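The idea of diagnosing family membership from a complete or partial set of motif matches can be sketched as follows. This is only a minimal illustration, not how PRINTS itself scores sequences: the motifs are written as regular expressions for simplicity, and the fingerprint, the motif names, and the functions `match_fingerprint` and `diagnose` are all hypothetical.

```python
import re

# Hypothetical fingerprint: an ordered list of motifs written as regular
# expressions. These patterns are invented for illustration and are not
# taken from PRINTS.
FINGERPRINT = [
    ("motif1", r"G[AS]GK[ST]"),
    ("motif2", r"DE[AV]H"),
    ("motif3", r"W..RY"),
]


def match_fingerprint(sequence, fingerprint):
    """Return (name, start) for each motif found, scanning left to right.

    Each motif is searched only downstream of the previous hit, so the
    reported matches respect the motif order of the fingerprint.
    """
    hits = []
    position = 0
    for name, pattern in fingerprint:
        found = re.compile(pattern).search(sequence, position)
        if found:
            hits.append((name, found.start()))
            position = found.end()
    return hits


def diagnose(sequence, fingerprint):
    """Summarise a match as 'complete', 'partial', or 'none'."""
    hits = match_fingerprint(sequence, fingerprint)
    matched = [name for name, _ in hits]
    if len(hits) == len(fingerprint):
        status = "complete"
    elif hits:
        status = "partial"
    else:
        status = "none"
    return status, matched


if __name__ == "__main__":
    # The second sequence lacks motif2 but still gives a partial diagnosis.
    print(diagnose("MKTGAGKSLLDEAHPPPWQQRYA", FINGERPRINT))
    print(diagnose("MKTGAGKSLLPPPWQQRYA", FINGERPRINT))
```

A partial result lists which motifs were found, which is the information a user would need to judge whether the sequence is a plausible, if divergent, family member.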

Protein and nucleic acid sequences are submitted electronically to the United States Patent and Trademark Office (USPTO) to avoid the introduction of errors in printed documents and to simplify the job of examining patent claims that include biosequences. Short sequence listings are printable in the USPTO's full text database, but for longer sequences the electronic sequence records are stored in the Publication Site for Issued and Published Sequences (PSIPS), located at http://seqdata.uspto.gov/. [Pg.226]

In addition to conventional sequence motifs (Prosite, BLOCKS, PRINTS, etc.), the compilation of structural motifs indicative of specific functions from known structures has been proposed [268]. This should improve even on the results obtained with the multiple (one-dimensional sequence) patterns exploited in the BLOCKS and PRINTS databases. Recently, the use of models to define approximate structural motifs (sometimes called fuzzy functional forms, FFFs [269]) has been put forward to construct a library of such motifs, enhancing the range of applicability of motif searches at the price of reduced sensitivity and specificity. Such approaches are supported by the fact that the active sites of proteins necessary for specific functions are often much more conserved than the overall protein structure (e.g. bacterial and eukaryotic serine proteases), so that an otherwise inexact model may still accurately capture the conserved part responsible for function. As the structural genomics projects produce a more and more comprehensive picture of the structure space, with representatives for all major protein folds, and as improved homology search methods link related sequences and structures to such representatives, comprehensive libraries of highly discriminative structural motifs can be envisioned. [Pg.301]

Protein motifs can represent, among other things, the active sites of enzymes. They can also identify protein regions involved in determining protein structure and stability. The PROSITE, BLOCKS, and PRINTS databases (3-5) contain hundreds of protein motifs corresponding to enzyme active sites, binding sites, and protein family signatures. Motifs can also be used to identify features that confer particular chemical characteristics (such as thermal stability) on proteins (6). Protein sequence motifs can also be used to classify proteins into families (5). [Pg.272]
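As a rough illustration of how a sequence motif is used to locate a candidate site, the sketch below translates a pattern written in the PROSITE pattern style (elements separated by `-`, `x` for any residue, `[...]` for allowed residues, `{...}` for forbidden residues, `(n)` or `(n,m)` for repeats) into a Python regular expression. The function name `prosite_to_regex` and the test sequence are assumptions made for this example, the pattern is used only to demonstrate the syntax, and the translation ignores rarer syntax such as anchors inside brackets.

```python
import re


def prosite_to_regex(pattern):
    """Translate a PROSITE-style pattern into a Python regular expression."""
    pattern = pattern.rstrip(".")  # patterns are conventionally ended by a period
    parts = []
    for element in pattern.split("-"):
        # {ABC} means "any residue except A, B, C"; do this before repeats,
        # because repeats are rewritten into curly braces below.
        element = element.replace("{", "[^").replace("}", "]")
        # (n) or (n,m) is a repeat count.
        element = element.replace("(", "{").replace(")", "}")
        # Lower-case x stands for any residue.
        element = element.replace("x", ".")
        # < and > anchor the pattern to the N- or C-terminus.
        element = element.replace("<", "^").replace(">", "$")
        parts.append(element)
    return "".join(parts)


if __name__ == "__main__":
    regex = prosite_to_regex("[AG]-x(4)-G-K-[ST].")
    print(regex)
    hit = re.search(regex, "MLGESGAGKST")  # made-up test sequence
    print(hit.start(), hit.group())
```

Simple patterns of this kind are easy to search with but can match by chance, so a hit is best read as a candidate site rather than a confirmed one.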

The importance of motif discovery is borne out by the growth in motif databases such as TRANSFAC, JASPAR, SCPD, DBTBS, and RegulonDB (7-10) for DNA motifs and PROSITE, BLOCKS, and PRINTS (3-5) for protein motifs. However, far more motifs remain to be discovered. For example, TFBS motifs are known for only about 500 vertebrate TFs, but it is estimated that there are about 2000 TFs in mammalian genomes alone (7,11). [Pg.272]

BLOCKS [22,23] is an automatically generated protein family database closely related to PRINTS; like the latter, it represents patterns characterizing family membership as sets of multiply aligned, ungapped sequence segments (blocks). [Pg.19]
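To make the notion of an ungapped block concrete, the sketch below builds per-column residue frequencies from a small block and slides that profile along a query sequence to find the best-scoring ungapped window. It is an assumption-laden simplification: the block and query are invented, the functions `column_frequencies` and `best_window` are hypothetical, and real block-based searches typically use weighted, log-odds position-specific scoring rather than raw frequencies.

```python
from collections import Counter


def column_frequencies(block):
    """Per-column residue frequencies for an ungapped block.

    `block` is a list of equal-length sequence segments (no gap characters).
    """
    width = len(block[0])
    if any(len(segment) != width for segment in block):
        raise ValueError("all segments in a block must have the same length")
    profile = []
    for column in zip(*block):
        counts = Counter(column)
        profile.append({res: n / len(block) for res, n in counts.items()})
    return profile


def best_window(sequence, profile):
    """Return (score, start) of the best-scoring ungapped window.

    The score of a window is the sum of the column frequencies of its
    residues; residues never seen in a column contribute zero.
    """
    width = len(profile)
    best = (float("-inf"), -1)
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(col.get(res, 0.0) for col, res in zip(profile, window))
        if score > best[0]:
            best = (score, start)
    return best


if __name__ == "__main__":
    # Invented block of four aligned, ungapped segments.
    block = ["GAGKT", "GSGKS", "GAGKT", "GTGKS"]
    profile = column_frequencies(block)
    print(best_window("MMGAGKTLL", profile))
```

Because the segments are ungapped, scoring reduces to a simple sliding-window comparison, which is part of what makes this representation convenient for automated database construction.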

Protein arrays with 10-500 antibodies printed onto an array use detection techniques, including fluorescence and multiple sandwich enzyme-linked immunosorbent assays, but the broad application of these assays is restricted by lack of suitable antibodies for laboratory animals and some potential cross-reactivities between antibodies with similar affinities. Calibration, reproducibility, and identification of proteins are common problems for all of these technologies. A number of databases are available to help investigators identify the numerous proteins found using these separation techniques, particularly for mass spectrometer data. [Pg.172]

An integrated database based on SwissProt, TrEMBL, Pfam, PRINTS, and PROSITE. It thus includes data on proteins, protein families, and domains/motifs, providing useful information for predicting protein structure and function for sequenced genomes. [Pg.150]
