Amino acid sequences databases

The basic structural unit of these two-sheet β helix structures contains 18 amino acids, three in each β strand and six in each loop. A specific amino acid sequence pattern identifies this unit, namely a double repeat of a nine-residue consensus sequence Gly-Gly-X-Gly-X-Asp-X-U-X, where X is any amino acid and U is large, hydrophobic and frequently leucine. The first six residues form the loop and the last three form a β strand, with the side chain of U involved in the hydrophobic packing of the two β sheets. The loops are stabilized by calcium ions, which bind to the Asp residue (Figure 5.28). This sequence pattern can be used to search for possible two-sheet β structures in databases of amino acid sequences of proteins of unknown structure. [Pg.84]
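A pattern search of this kind can be sketched with a regular expression over one-letter amino acid codes. This is a minimal illustration, not the method used in the source: the function name `find_units` is hypothetical, and the residue set standing in for U (L, I, V, M, F, Y, W) is an assumption, since the text says only that U is large and hydrophobic.

```python
import re

# Nine-residue consensus Gly-Gly-X-Gly-X-Asp-X-U-X in one-letter code.
# Assumption: U (large, hydrophobic) is approximated by L, I, V, M, F, Y or W.
NONAPEPTIDE = r"GG[A-Z]G[A-Z]D[A-Z][LIVMFYW][A-Z]"

# The structural unit is a double repeat (18 residues). Wrapping the pattern
# in a lookahead lets overlapping units be reported as separate hits.
UNIT = re.compile(rf"(?=((?:{NONAPEPTIDE}){{2}}))")


def find_units(sequence):
    """Return (start_index, matched_18mer) for every candidate unit."""
    return [(m.start(), m.group(1)) for m in UNIT.finditer(sequence.upper())]


if __name__ == "__main__":
    # Made-up test sequence containing one double repeat.
    print(find_units("MKGGAGNDTLYGGSGADVLRAA"))
```

A hit from such a search is only a candidate; whether the region actually folds into a two-sheet β helix would still have to be checked against structural or experimental evidence.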

Figure 17.2 An example of prediction of the conformations of three CDR regions of a monoclonal antibody (top row) compared with the unrefined x-ray structure (bottom row). L1 and L2 are CDR regions of the light chain, and H1 is from the heavy chain. The amino acid sequences of the loop regions were modeled by comparison with the sequences of loop regions selected from a database of known antibody structures. The three-dimensional structures of two of the loop regions, L1 and L2, were in good agreement with the preliminary x-ray structure, whereas that of H1 was not. However, during later refinement of the x-ray structure, errors were found in the conformation of H1, and in the refined x-ray structure this loop was found to agree with the predicted conformation. In fact, all six loop conformations were correctly predicted in this case. (From C. Chothia et al., Science 233: 755-758, 1986.)...

Homologous proteins have similar three-dimensional structures. They contain a core region, a scaffold of secondary structure elements, where the folds of the polypeptide chains are very similar. Loop regions that connect the building blocks of the scaffolds can vary considerably both in length and in structure. From a database of known immunoglobulin structures it has, nevertheless, been possible to predict successfully the conformation of hyper-variable loop regions of antibodies of known amino acid sequence. [Pg.370]

The World Wide Web has transformed the way in which we obtain and analyze published information on proteins. What only a few years ago would take days or weeks and require the use of expensive computer workstations can now be achieved in a few minutes or hours using personal computers, both PCs and Macintoshes, connected to the internet. The Web contains hundreds of sites of interest to molecular biologists, many of which are listed in Pedro's BioMolecular Research Tools (http://www.fmi.ch/biology/research_tools.html). Many sites provide free access to databases that make it very easy to obtain information on structurally related proteins, the amino acid sequences of homologous proteins, relevant literature references, medical information and metabolic pathways. This development has opened up new opportunities for even non-specialists to view and manipulate a structure of interest or to carry out amino acid sequence comparisons, and one can now rapidly obtain an overview of a particular area of molecular biology. We shall here describe some Web sites that are of interest from a structural point of view. Updated links to these sites can be found in the Introduction to Protein Structure Web site (http://www.ProteinStructure.com/). [Pg.393]

Eng, J. K.; McCormack, A. L.; Yates, J. R. An approach to correlate tandem mass spectral data of peptides with amino acid sequences in a protein database. J. Am. Soc. Mass Spectrom. 1994, 5, 976-989. [Pg.274]

Upon its generation, sequence information is normally submitted to various databases. The major databases in which protein primary sequence data are available are listed in Table 2.4. Also included in this table are the major nucleic acid sequence databases, as amino acid sequence information can potentially be derived from these. [Pg.21]
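The derivation of an amino acid sequence from a nucleic acid sequence can be illustrated with a short translation routine. This is a sketch under stated assumptions: it uses the standard genetic code only, reads a single frame starting at the first base, and stops at the first stop codon. The function name `translate` is illustrative and is not taken from the source.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA coding sequence in frame 0 until the first stop codon.

    Assumption: codons containing characters other than T, C, A or G are
    reported as 'X' (unknown residue).
    """
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE.get(dna[i:i + 3], "X")
        if residue == "*":  # stop codon ends the open reading frame
            break
        protein.append(residue)
    return "".join(protein)


if __name__ == "__main__":
    print(translate("ATGGGTGGCTAA"))
```

In practice the correct reading frame, splicing and alternative genetic codes all have to be taken into account, which is why the text says the protein sequence can only potentially be derived.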

Salutaridinol 7-O-acetyltransferase was purified to apparent electrophoretic homogeneity from P. somniferum cell suspension cultures, and the amino acid sequences of ten endoproteinase Lys-C-generated peptides were determined (ref. 28). A comparison of these amino acid sequences with those available in the GenBank/EMBL sequence databases indicated no relevant similarity to known proteins. The first attempt to isolate a cDNA encoding salutaridinol 7-O-... [Pg.173]

Another major source is the set of amino acid sequences directly derived from protein sequencing. Thousands of such sequences have been detected by the SWISS-PROT curators in publications (or have been directly submitted by researchers to SWISS-PROT) and entered into the database. Protein sequences detected by the NCBI journal scan have also been included. For some proteins the Brookhaven Protein Data Bank (PDB) (Abola et al., 1996) is the only source for the sequence information. The PDB entries are checked regularly, and new SWISS-PROT entries are created whenever necessary. [Pg.66]

How does one go about finding all of the relevant proteins in a database once it has been decided to carry out an analysis of an entire protein family? The simplest approach is to use similarity search software such as SSEARCH or FASTA (Smith and Waterman, 1981; Pearson and Lipman, 1988) or BLAST (Altschul et al., 1997) with the amino acid sequences of one or two well-known members of the family as queries. The problem is initially the same as that of identifying all proteins that are homologous to a family of proteins, although with some important practical differ-... [Pg.112]
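The idea behind such a query-against-database search can be sketched with a bare-bones Smith-Waterman local alignment score, the algorithm cited above for SSEARCH. This is a toy illustration, not how SSEARCH, FASTA or BLAST are implemented. The scoring values (match 2, mismatch -1, linear gap -2) are assumptions chosen for simplicity, whereas real tools use substitution matrices and affine gap penalties. The function names and the example database are hypothetical.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score between sequences a and b.

    Assumption: simple match/mismatch scores and a linear gap penalty.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, so scores never drop below 0.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def rank_by_similarity(query, database):
    """Score every database entry against the query, highest score first."""
    scored = [(smith_waterman_score(query, seq), name)
              for name, seq in database.items()]
    return sorted(scored, reverse=True)


if __name__ == "__main__":
    # Made-up sequences, for illustration only.
    toy_db = {"family_member": "MKTAYIAKQR", "unrelated": "GGGGGGG"}
    print(rank_by_similarity("TAYIAK", toy_db))
```

Raw scores like these are not comparable across sequences of different lengths. Production tools attach statistical significance estimates to each hit, which this sketch omits.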
