Statistical methods, structure prediction from sequence

STRUCTURE PREDICTION FROM SEQUENCE BY STATISTICAL METHODS... [Pg.276]

There is considerable impetus to predict protein structures accurately from sequence information because of the protein sequence/structure deficit created by the genome and full-length cDNA sequencing projects. The molecular mechanical (MM) approach to modeling protein structures has been discussed in section 9.2, and the prediction of protein secondary structure from sequence by statistical methods has been treated in section 9.5. The prediction of protein structure using bioinformatic resources will be described in this subsection. The approaches to protein structure prediction from amino acid sequences (Tsigelny, 2002; Webster, 2000) include ... [Pg.616]

The statistical methods for predicting secondary structures of proteins from amino acid sequences are widely practiced among investigators in biochemistry and can be accessed at Network Protein Sequence Analysis (NPS@) via http://npsa-pbil.ibcp.fr... [Pg.279]

To gain the most predictive utility as well as conceptual understanding from the sequence and structure data available, careful statistical analysis will be required. The statistical methods needed must be robust to the variation in amounts and quality of data in different protein families and for structural features. They must be updatable as new data become available. And they should help us generate as much understanding of the determinants of protein sequence, structure, dynamics, and functional relationships as possible. [Pg.314]

A common use of statistics in structural biology is as a tool for deriving predictive distributions of structural parameters based on sequence. The simplest of these are predictions of secondary structure and side-chain surface accessibility. Various algorithms that can learn from data and then make predictions have been used to predict secondary structure and surface accessibility, including ordinary statistics [79], information theory [80], neural networks [81-86], and Bayesian methods [87-89]. A disadvantage of some neural network methods is that the parameters of the network sometimes have no physical meaning and are difficult to interpret. [Pg.338]

For example, Stolorz et al. [88] derived a Bayesian formalism for secondary structure prediction, although their method does not use Bayesian statistics. They attempt to find an expression for $p(\mu \mid \mathrm{seq}) = p(\mathrm{seq} \mid \mu)\,p(\mu)/p(\mathrm{seq})$, where $\mu$ is the secondary structure at the middle position of seq, a sequence window of prescribed length. As described earlier in Section II, this is a use of Bayes' rule but is not Bayesian statistics, which depends on the equation $p(\theta \mid y) = p(y \mid \theta)\,p(\theta)/p(y)$, where $y$ is data that connect the parameters in some way to observables. The data are not sequences alone but the combination of sequence and secondary structure that can be culled from the PDB. The parameters we are after are the probabilities of each secondary structure type as a function of the sequence in the sequence window, based on PDB data. The sequence can be thought of as an explanatory variable. That is, we are looking for... [Pg.338]
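As a rough illustration of applying Bayes' rule to a sequence window, the sketch below estimates the probability of the secondary structure state at the middle of a window from counts of labeled examples. It is not the method of Stolorz et al.: the three-state alphabet (H, E, C), the assumption that window positions are independent given the state, the pseudocount smoothing, and the toy training strings are all assumptions made only for this example.

```python
import math
from collections import Counter

STATES = "HEC"                        # assumed three-state reduction: helix, strand, coil
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def train(examples, window=3, pseudocount=1.0):
    """Count central-state and per-position residue frequencies.

    examples: list of (sequence, structure) strings of equal length.
    """
    half = window // 2
    prior_counts = Counter()
    # cond_counts[state][offset] counts residues seen at that window offset
    cond_counts = {s: [Counter() for _ in range(window)] for s in STATES}
    for seq, ss in examples:
        for i in range(half, len(seq) - half):
            state = ss[i]
            prior_counts[state] += 1
            for k in range(window):
                cond_counts[state][k][seq[i - half + k]] += 1
    return prior_counts, cond_counts, pseudocount


def posterior(model, win_seq):
    """Return p(state | window) for each state via Bayes' rule.

    Assumes positions in the window are independent given the state,
    a simplification introduced for this sketch.
    """
    prior_counts, cond_counts, pc = model
    total = sum(prior_counts.values())
    log_joint = {}
    for s in STATES:
        # log p(state), smoothed with a pseudocount
        lp = math.log((prior_counts[s] + pc) / (total + pc * len(STATES)))
        # log p(window | state) as a product over positions
        denom = prior_counts[s] + pc * len(AMINO_ACIDS)
        for k, aa in enumerate(win_seq):
            lp += math.log((cond_counts[s][k][aa] + pc) / denom)
        log_joint[s] = lp
    # p(window) is the sum of the joint over states; normalize in a stable way
    m = max(log_joint.values())
    evidence = sum(math.exp(v - m) for v in log_joint.values())
    return {s: math.exp(v - m) / evidence for s, v in log_joint.items()}


if __name__ == "__main__":
    # Hypothetical toy data, not taken from the PDB
    toy = [("AAAAAGGGGG", "HHHHHCCCCC"), ("VVVVVAAAAA", "EEEEEHHHHH")]
    model = train(toy, window=3)
    print(posterior(model, "AAA"))
```

In practice the counts would come from sequence and secondary structure pairs culled from the PDB, and the independence assumption would likely need to be relaxed.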

With regard to theoretical methods, several approaches based on statistical, hydrophobic and pattern recognition methods have been proposed (Sawyer and Holt, 1993). Cumulative or joint prediction methods, with supplementary information from spectroscopic methods and the use of templates and sequence information from related proteins, were shown to improve the confidence of prediction, as assessed by comparison to X-ray crystallographic structures. Despite the great interest and advances in research in these areas, the accuracy of these secondary structure predictions (i.e. theoretical methods) still remains at only about 60%. Even when the structure of structurally related or homologous proteins is known, the accuracy of prediction is only 70.9% (Mehta et al., 1995). Furthermore, these methods cannot easily be applied to monitor changes in protein secondary structure induced by processing. [Pg.20]
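The excerpt does not say how these accuracy figures were computed. One common convention, assumed here, is per-residue accuracy: the fraction of residues whose predicted state matches the state assigned from the experimental structure. A minimal sketch, with a hypothetical function name and made-up strings:

```python
def per_residue_accuracy(predicted, observed):
    """Fraction of positions where the predicted state equals the observed state."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("sequences must not be empty")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


if __name__ == "__main__":
    # Hypothetical five-residue example: four of five states agree
    print(per_residue_accuracy("HHHEC", "HHHCC"))
```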
