Sequence similarity database

The protein sequence database is also a text-numeric database with bibliographic links. It is the largest public domain protein sequence database. The current PIR-PSD release 75.04 (March, 2003) contains more than 280 000 entries of partial or complete protein sequences with information on functionalities of the protein, taxonomy (description of the biological source of the protein), sequence properties, experimental analyses, and bibliographic references. Queries can be started as a text-based search or a sequence similarity search. PIR-PSD contains annotated protein sequences with a superfamily/family classification. [Pg.261]

In the protein structure database PDB (http://www.rcsb.org/pdb), experimentally solved 3D protein structures, determined by X-ray crystallography and NMR spectroscopy, are available to the public. Homology model building for a query sequence uses protein portions of known 3D structures as structural templates for proteins with high sequence similarity. [Pg.778]

Peptidases have been classified by the MEROPS system since 1993 [2], which has been available via the MEROPS database since 1996 [3]. The classification is based on sequence and structural similarities. Because peptidases are often multidomain proteins, only the domain directly involved in catalysis, and which bears the active site residues, is used in comparisons. This domain is known as the peptidase unit. Peptidases with statistically significant peptidase unit sequence similarities are included in the same family. To date 186 families of peptidase have been detected. Examples from 86 of these families are known in humans. A family is named from a letter representing the catalytic type (A for aspartic, G for glutamic, M for metallo, C for cysteine, S for serine and T for threonine) plus a number. Examples of family names are shown in Table 1. There are 53 families of metallopeptidases (24 in human), 14 of aspartic peptidases (three of which are found in human), 62 of cysteine peptidases (19 in human), 42 of serine peptidases (17 in human), four of threonine peptidases (three in human), one of glutamic peptidases and nine families for which the catalytic type is unknown (one in human). It should be noted that within a family not all of the members will be peptidases. Usually non-peptidase homologues are a minority and can be easily detected because not all of the active site residues are conserved. [Pg.877]
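The naming rule described above (one catalytic-type letter followed by a number) can be sketched as a small parser. This is an illustrative sketch only: the function name `parse_family` is hypothetical, the mapping contains just the six letters listed in the text, and names such as `M10` are used purely to show the format. It is not the MEROPS database's own tooling.

```python
import re

# Catalytic-type letters exactly as listed in the excerpt above.
CATALYTIC_TYPES = {
    "A": "aspartic",
    "G": "glutamic",
    "M": "metallo",
    "C": "cysteine",
    "S": "serine",
    "T": "threonine",
}

# One capital letter followed by one or more digits, e.g. "M10".
_FAMILY_RE = re.compile(r"([A-Z])(\d+)")


def parse_family(name):
    """Split a family name into (catalytic type, family number).

    Hypothetical helper: raises ValueError for any name that does not
    follow the letter-plus-number pattern or uses a letter that the
    excerpt does not define.
    """
    match = _FAMILY_RE.fullmatch(name.strip())
    if match is None or match.group(1) not in CATALYTIC_TYPES:
        raise ValueError(f"not a recognised family name: {name!r}")
    return CATALYTIC_TYPES[match.group(1)], int(match.group(2))
```

Calling `parse_family("M10")` returns the catalytic type and the number as a tuple; a letter outside the six listed in the text is rejected rather than guessed at.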

This section focuses on the use of SWISS-PROT + TrEMBL for sequence similarity searches. Searches in protein sequence databases have now become a standard research tool in the life sciences. To produce valuable results, the source databases should be comprehensive, nonredundant, well annotated, and up-to-date. However, lack of a single protein sequence database that satisfies all four criteria has previously forced users to perform searches across multiple databases to avoid incomplete results. This strategy normally produces complete but redundant results owing to different versions of the same sequence report in different databases. [Pg.65]
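The redundancy problem described above can be illustrated with a minimal sketch that pools hits from several databases and groups the entries whose sequences are identical. The record layout (database name, accession, sequence) and the function `merge_nonredundant` are assumptions made for illustration. Exact-match grouping is also a simplification: it does not handle fragments, variants, or conflicting annotation, which a real nonredundant database has to resolve.

```python
def merge_nonredundant(hits):
    """Group search hits that share an identical sequence.

    `hits` is an iterable of (database, accession, sequence) tuples
    (a hypothetical layout). Returns a dict that maps each distinct
    sequence to the list of (database, accession) pairs reporting it,
    in first-seen order.
    """
    merged = {}
    for database, accession, sequence in hits:
        key = sequence.upper()  # treat case differences as the same sequence
        merged.setdefault(key, []).append((database, accession))
    return merged


# Made-up records: the accessions and sequences are placeholders.
example_hits = [
    ("db_one", "ACC0001", "MKTAYIAK"),
    ("db_two", "ACC9001", "mktayiak"),  # same sequence reported elsewhere
    ("db_two", "ACC9002", "MSLLTEVE"),
]
unique = merge_nonredundant(example_hits)
```

Here the three pooled hits collapse to two distinct sequences, and each one keeps a record of every database entry that reported it.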

SPTR is distributed in three files: sprot.dat.Z, trembl.dat.Z, and trembl_new.dat.Z. As their .Z extension indicates, these are Unix compress format files which, when decompressed, produce ASCII files in SWISS-PROT format. Three other files are also available (sprot.fas.Z, trembl.fas.Z, and trembl_new.fas.Z); these are compressed FASTA format sequence files that are useful for building the databases used by FASTA, BLAST, and other sequence similarity search programs. They should not be used for other purposes, because all annotation is lost in this format. The SPTR files are stored in the directory /pub/databases/sp_tr_nrdb on the EBI FTP server (ftp.ebi.ac.uk) and in the directory /databases/sp_tr_nrdb on the ExPASy FTP server (ftp.expasy.ch). [Pg.67]
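To show why the FASTA format files carry no annotation, here is a minimal sketch of a FASTA reader: each record is just a header line starting with `>` followed by sequence lines. The function name `read_fasta` is hypothetical, and the sketch assumes the .Z file has already been decompressed to plain text with an external tool, since the reader below does not handle compressed input.

```python
import io


def read_fasta(handle):
    """Yield (header, sequence) pairs from an open FASTA text stream.

    Only the header line and the residues are available in this
    format; nothing else about the entry can be recovered from it.
    """
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif line and header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


# Placeholder records standing in for a decompressed FASTA file.
demo = io.StringIO(">entry_one example header\nMKTAY\nIAK\n>entry_two\nMSLLTEVE\n")
records = list(read_fasta(demo))
```

With a real file the same function could be applied to `open("sprot.fas")` once the download has been decompressed; the in-memory stream is used here only so that the example is self-contained.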

Inhibition of trypsin is another mechanism of activity recently discovered in plant defensins. CfD1 and CfD2 from Cassia fistula were the first plant defensins to be identified as trypsin inhibitors. Cp-thionin from cowpea was more recently discovered to have inhibitory potency against trypsin. Searches of protein sequence databases have yielded a number of other plant proteins annotated as trypsin inhibitors or potential trypsin inhibitors. These annotations were most likely made on the basis of sequence similarities with other known trypsin inhibitors, namely the Bowman-Birk trypsin inhibitor. Since the actual framework of the disulfide bonds is not known, it is possible that structure and therefore activity differ from this prototype framework. ... [Pg.264]

BLAST (NCBI Sequence Similarity Search of Nucleotide and Protein Databases)... [Pg.372]

Genome sequencing has made a plethora of DNA sequence information accessible in databases, often publicly. Pieces of genome sequence have often been annotated as possessing a certain function on the basis of sequence similarity to other genes with similar function. From the gene sequence, the corresponding protein sequence of a putative enzyme can of course be readily derived, so that at this stage the target of the search for a new enzyme is defined. [Pg.414]
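Deriving the protein sequence from a gene sequence, as mentioned above, amounts to reading the coding strand three bases at a time and looking each codon up in the genetic code. The sketch below uses the standard genetic code and assumes the input is already an in-frame coding sequence on the sense strand; the function name `translate` is hypothetical, and organisms with alternative codes, introns, or ambiguous bases are outside its scope.

```python
from itertools import product

# Standard genetic code, written in the conventional T, C, A, G order:
# the 64 codons TTT, TTC, TTA, TTG, TCT, ... map onto this string,
# with "*" marking a stop codon.
_BASES = "TCAG"
_AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(_BASES, repeat=3), _AMINO_ACIDS)
}


def translate(dna):
    """Translate an in-frame coding DNA sequence into one-letter protein code.

    Translation stops at the first stop codon; trailing bases that do
    not fill a complete codon are ignored. Raises KeyError if a codon
    contains a character other than A, C, G, or T.
    """
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)
```

For instance, `translate("ATGGCCTAA")` reads ATG, GCC and TAA, and gives `"MA"` because TAA is a stop codon.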

Comparison of the DNA sequence of an open reading frame (orf) of the mandelate pathway with databases revealed sequence similarity between an otherwise unannotated piece of sequence and some amide hydrolases. It was suspected that this piece of DNA sequence codes for a mandelate amidase, which catalyzes hydrolysis of mandelamide into mandelate and ammonium ion. This hypothesis was recently borne out and the mandelamide hydrolase found (MacLeish, 2003). [Pg.479]
