Translated EMBL

TrEMBL (http://www.expasy.org/sprot), database of the European Bioinformatics Institute, translated EMBL. Generated by computer translation of genetic information from the EMBL database. Automatically annotated. [Pg.342]

SWISS-PROT (Bairoch and Apweiler, 2000) is a protein sequence database that, from its inception in 1986, was produced collaboratively by the Department of Medical Biochemistry at the University of Geneva and the EMBL. The database is now maintained collaboratively by the Swiss Institute of Bioinformatics (SIB) and EBI/EMBL. SWISS-PROT provides high-level annotations, including descriptions of the function of the protein and of the structure of its domains, its post-translational modifications, its variants, and so on. The database can be accessed from http://expasy.hcuge.ch/sprot/sprot-top.html or numerous mirror sites. In 1996, Translated EMBL (TrEMBL) was created as a computer-annotated supplement to SWISS-PROT (Bleasby et al., 1994). [Pg.214]

Swiss-Prot, TrEMBL: Annotated non-redundant protein sequence database. TrEMBL is a computer-annotated supplement to Swiss-Prot. TrEMBL contains the translations of all coding sequences present in the EMBL Nucleotide Sequence Database which are not yet integrated into Swiss-Prot... [Pg.571]

Biological raw data are stored in public databanks (such as GenBank or EMBL for primary DNA sequences). The data can be submitted and accessed via the World Wide Web. Protein sequence databanks like TrEMBL provide the most likely translation of all coding sequences in the EMBL databank. Sequence data are prominent, but other data are also stored, e.g., yeast two-hybrid screens, expression arrays, systematic gene-knock-out experiments, and metabolic pathways. [Pg.261]

Various verification steps have been introduced to ensure that SPTR is comprehensive and contains all relevant data sources. The main source of new protein sequences is the translations of CDS in the nucleotide sequence databases. The up-to-date inclusion of new protein sequence entries is ensured by the weekly translation of EMBL-NEW (the updates to the EMBL nucleotide sequence database). The three collaborating nucleotide sequence databases DDBJ, EMBL, and GenBank exchange their data on a daily basis. Therefore any protein coding sequence submitted to DDBJ/EMBL/GenBank will appear in SPTR within 2 weeks in the worst case and within less than 1 week in the average case. [Pg.66]

SWISS-PROT (Hofmann et al., 1999) is a curated protein sequence database maintained by the Swiss Institute of Bioinformatics and is a collaborative partner of EMBL. The database consists of SWISS-PROT and TrEMBL, which consists of entries in SWISS-PROT-like format derived from the translation of all CDS in the... [Pg.222]

EMBL Nucleotide Sequence Database. SWISS-PROT consists of core sequence data with minimal redundancy, citations, and extensive annotations including protein function, post-translational modifications, domain sites, protein structural information, diseases associated with protein deficiencies, and variants. SWISS-PROT and TrEMBL are available at the EBI site, http://www.ebi.ac.uk/swissprot/, and the ExPASy site, http://www.expasy.ch/sprot/. From the SWISS-PROT and TrEMBL page of the ExPASy site, click Full text search (under Access to SWISS-PROT and TrEMBL) to open the search page (Figure 11.3). Enter the keyword string (use a Boolean expression if required), check the SWISS-PROT box, and click the Submit button. Select the desired entry from the returned list to view the annotated sequence data in Swiss-Prot format. Output in FASTA format can also be requested. Links to BLAST, the feature table, some ExPASy proteomic tools (e.g., Compute pI/Mw, ProtParam, ProfileScan, ProtScale, PeptideMass, ScanProsite), and structure (SWISS-MODEL) are provided on the page. [Pg.223]
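The difference between the Swiss-Prot flat-file view and the FASTA output mentioned above can be illustrated with a small Python sketch. It assumes only the general flat-file layout (two-letter line codes such as ID, AC, and SQ, sequence lines indented by five spaces, and a `//` terminator). The sample entry, its accession numbers, and the simplified FASTA header are invented for illustration and do not correspond to a real database record or to the header format the ExPASy server returns.

```python
# Hypothetical entry in Swiss-Prot-like flat-file layout (all values invented).
SAMPLE = """\
ID   TEST_HUMAN              Reviewed;          12 AA.
AC   P00000; Q00000;
DE   Hypothetical example protein.
SQ   SEQUENCE   12 AA;  1300 MW;  0000000000000000 CRC64;
     MAKVLTGHWE RS
//
"""


def parse_swissprot_entry(text: str) -> dict:
    """Extract entry name, accession numbers, and sequence from one entry."""
    entry = {"id": None, "accessions": [], "sequence": ""}
    in_sequence = False
    for line in text.splitlines():
        if line.startswith("//"):  # end-of-entry marker
            break
        code, rest = line[:2], line[5:]
        if in_sequence:
            # Sequence lines hold residues in space-separated blocks.
            entry["sequence"] += "".join(rest.split())
        elif code == "ID":
            entry["id"] = rest.split()[0]
        elif code == "AC":
            entry["accessions"] += [a.strip() for a in rest.split(";") if a.strip()]
        elif code == "SQ":
            in_sequence = True  # everything until '//' is sequence data
    return entry


def to_fasta(entry: dict, width: int = 60) -> str:
    """Render a parsed entry as FASTA with a simplified, assumed header."""
    header = f">{entry['accessions'][0]}|{entry['id']}"
    seq = entry["sequence"]
    lines = [seq[i:i + width] for i in range(0, len(seq), width)]
    return "\n".join([header, *lines])


if __name__ == "__main__":
    print(to_fasta(parse_swissprot_entry(SAMPLE)))
```

The sketch ignores the annotation lines (description, features, cross-references) that make up most of a real entry; a production parser would need to handle those as well as multi-line AC fields.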

UniProtKB/TrEMBL: a computer-annotated supplement of Swiss-Prot that contains all the translations of EMBL nucleotide sequence entries not yet integrated in Swiss-Prot. [Pg.408]

The UniProt KB is an automatically and manually annotated protein database drawn from translation of DDBJ/EMBL-Bank/GenBank coding sequences and directly sequenced proteins. Each sequence receives a unique, stable identifier allowing unambiguous identification of any protein across datasets. The KB also provides cross-references to external data collections such as the underlying DNA sequence entries in the DDBJ/EMBL-Bank/GenBank nucleotide sequence databases, 2D PAGE and 3D protein structure databases, various protein domain... [Pg.23]

On the one hand, genome sequencing projects are generating a dramatically increasing number of sequences to be incorporated in SWISS-PROT. On the other, the number of annotators, who screen literature and sequence databases to populate SWISS-PROT with the high quality annotations, is limited. However, despite the increase in available raw sequence data it was not judged appropriate to automatically populate SWISS-PROT with data of lower quality standards. This is where TrEMBL (Translation of EMBL... [Pg.537]

Nucleotide Sequence Database [26]) steps in. TrEMBL was created in 1996 and consists of computer-annotated entries in SWISS-PROT-like format. It is populated by protein sequences translated from the coding sequences (CDS) in EMBL and is a supplement to SWISS-PROT. In a way, it can be considered a preliminary section of SWISS-PROT; indeed, once the manual annotation is performed, the entries move on to SWISS-PROT. [Pg.538]
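The core step described here, deriving a protein sequence from an annotated CDS, can be sketched in a few lines of Python. This is a minimal illustration that assumes the standard genetic code and a CDS already in the correct reading frame. It is not the actual TrEMBL production pipeline, which is not described in the text, and the function name `translate_cds` is hypothetical.

```python
from itertools import product

# Standard genetic code, laid out in the conventional TCAG base order;
# '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_cds(cds: str) -> str:
    """Translate an in-frame coding sequence up to the first stop codon."""
    cds = cds.upper().replace("U", "T")  # accept RNA input as well
    if len(cds) % 3 != 0:
        raise ValueError("CDS length must be a multiple of 3")
    protein = []
    for i in range(0, len(cds), 3):
        # Codons with ambiguous bases are reported as 'X'.
        aa = CODON_TABLE.get(cds[i:i + 3], "X")
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


if __name__ == "__main__":
    print(translate_cds("ATGGCCAAATAA"))  # MAK
```

Real nucleotide entries may use other genetic codes (for example, mitochondrial ones) and may have CDS features that span several joined segments, so a faithful implementation would have to read that information from each entry's feature table.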

The protein nr database consists of conceptual translations of the coding regions annotated in the GenBank/EMBL/DDBJ databases and protein sequences from databases such as SwissProt and Protein Data Bank. Information about other possible databases can be obtained from http://www.ncbi.nlm.nih.gov/blast/producttable.shtml#db. [Pg.185]

The UniProt Archive (UniParc) provides a stable, comprehensive, nonredundant sequence collection by storing the complete body of publicly available protein sequence data. Although most protein sequence data are derived from the translation of DDBJ/EMBL/GenBank sequences, primary protein sequence data are also submitted directly to UniProt or derived from the PDB entries. The Archive also captures protein sequence data from other sources such as Ensembl, International Protein Index (IPI), NCBI-RefSeq, FlyBase, and WormBase. Each protein sequence is assigned a unique UniParc identifier (UPI) and represented only once in the Archive. In UniParc, the... [Pg.601]
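The idea of storing each distinct sequence only once, while recording every source it came from, can be sketched as a small Python class. This is only an illustration of the principle. The class name, the sequential numbering scheme, and the accession numbers in the usage example are assumptions, and the sketch does not reflect how UniParc actually generates or stores its identifiers.

```python
class SequenceArchive:
    """Toy nonredundant archive: one identifier per distinct sequence."""

    def __init__(self):
        self._ids = {}      # normalized sequence -> identifier
        self._sources = {}  # identifier -> set of (database, accession)

    def add(self, sequence: str, database: str, accession: str) -> str:
        """Register a sequence from a source and return its identifier."""
        key = "".join(sequence.split()).upper()  # ignore case and whitespace
        if key not in self._ids:
            # Illustrative identifier: 'UPI' plus a 10-digit hex counter.
            self._ids[key] = f"UPI{len(self._ids) + 1:010X}"
        upi = self._ids[key]
        self._sources.setdefault(upi, set()).add((database, accession))
        return upi

    def sources(self, upi: str) -> list:
        """Return all (database, accession) pairs seen for an identifier."""
        return sorted(self._sources[upi])

    def __len__(self) -> int:
        return len(self._ids)


if __name__ == "__main__":
    archive = SequenceArchive()
    first = archive.add("MAKVL", "EMBL", "X1")     # hypothetical accessions
    second = archive.add("makvl", "PDB", "1ABC")   # same sequence, new source
    print(first == second, len(archive), archive.sources(first))
```

Keying on the sequence itself means that two entries from different databases collapse into one archive record, which is the property the text calls nonredundancy.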

The increasing numbers of stored protein and nucleic acid sequences, and the recognition that functionally related proteins often had similar sequences, catalyzed the development of statistical techniques for sequence comparison which underlie many of the core bioinformatic methods used in proteomics today. Nucleic acid sequences are stored in three primary sequence databases - GenBank, the EMBL nucleotide sequence database, and the DNA database of Japan (DDBJ) - which exchange data every day. These databases also contain protein sequences that have been translated from DNA sequences. A dedicated protein sequence database, SWISS-PROT, was founded in 1986 and contains highly curated data concerning over 70 000 proteins. A related database, TrEMBL, contains automatic translations of the nucleotide sequences in the EMBL database and is not manually curated. [Pg.3960]

TREMBL TRanslation from EMBL sequence database... [Pg.2164]
