Finding New Domains

In sequence comparison, common protein domains such as the tyrosine kinase domain can mask other interesting matches (Sonnhammer and Durbin, 1994). Other weak but interesting matches may be lost in a large list of matches to the common domain. Thus domain databases are a useful way to identify these common domains so that they can be removed or masked in the sequence to allow the detection of weaker or less common domain similarities. [Pg.148]
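The masking step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `mask_regions`, the toy sequence, and the domain coordinates are all hypothetical, and in practice the coordinates would come from a search against a domain database.

```python
def mask_regions(sequence, regions, mask_char="X"):
    """Replace each 1-based, inclusive (start, end) region with mask_char.

    `regions` is assumed to hold the coordinates of common-domain hits
    reported by some domain search; the masked sequence can then be
    re-searched so that weaker matches are not buried.
    """
    residues = list(sequence)
    for start, end in regions:
        if start < 1 or end > len(sequence) or start > end:
            raise ValueError(
                f"invalid region ({start}, {end}) for length {len(sequence)}"
            )
        for i in range(start - 1, end):
            residues[i] = mask_char
    return "".join(residues)


# Hypothetical input: a made-up sequence with a made-up domain hit
# at residues 5-12 (not real data).
toy_sequence = "MKTAYIAKQRQISFVKSHFSRQ"
domain_hits = [(5, 12)]

masked = mask_regions(toy_sequence, domain_hits)
print(masked)
```

Masking with a placeholder residue rather than deleting the region keeps the residue numbering intact, so any later match can still be mapped back to the original sequence.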

PSI-BLAST (Altschul et al., 1997) is rapidly becoming the tool of choice for protein database searching because of its speed, sensitivity, and ease of use. It presents a very large improvement over the older versions of BLAST and new gapped-BLAST implementations (Altschul et al., 1997; Altschul et al., 1990). However, a naive user can easily be misled or not get the best possible results. PSI-BLAST is available as a stand-alone program from ftp:// and as an interactive web tool at http://www.ncbi.nlm.nih.gov/cgi-bin/BLAST/nph-psi_blast [Pg.151]

Generic Problems with Iterative Multiple Sequence Methods [Pg.154]

Profile methods have some inherent problems. These are discussed in the following sections. [Pg.154]

The final-round scores from iterative profile methods do not reflect the real significance of the match to the query sequence. Instead, the reported significance indicates how well the protein segment matches the profile constructed in the previous round. For example, if a false-positive match with an E [Pg.154]
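The effect of scoring against the previous round's profile can be illustrated with a toy model. The sketch below is not PSI-BLAST's scoring scheme: it assumes ungapped, equal-length sequences, a uniform background distribution, and simple add-one pseudocounts, and all sequences and function names are made up for illustration. It shows only that once an unrelated sequence has been included in the profile, its own score against that profile rises.

```python
import math
from collections import Counter

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"


def build_profile(aligned, pseudocount=1.0):
    """Build per-column log-odds scores from equal-length aligned sequences.

    Assumes a uniform background frequency of 1/20 for every residue.
    """
    background = 1.0 / len(ALPHABET)
    total = len(aligned) + pseudocount * len(ALPHABET)
    profile = []
    for col in range(len(aligned[0])):
        counts = Counter(seq[col] for seq in aligned)
        profile.append(
            {
                aa: math.log(((counts[aa] + pseudocount) / total) / background)
                for aa in ALPHABET
            }
        )
    return profile


def score(profile, segment):
    """Sum the column scores for an ungapped segment."""
    return sum(column[aa] for column, aa in zip(profile, segment))


# Hypothetical data: three related toy sequences and one unrelated one.
family = ["ACDEF", "ACDEY", "ACDDF"]
false_positive = "WWKKF"

# Round n: the profile contains only the true family members.
score_before = score(build_profile(family), false_positive)

# Round n + 1: the false positive was wrongly included in the profile.
score_after = score(build_profile(family + [false_positive]), false_positive)

print(f"score before inclusion: {score_before:.2f}")
print(f"score after inclusion:  {score_after:.2f}")
```

In this toy setting the second score is higher purely because the profile now contains the sequence being scored, not because the sequence is more similar to the original query.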

New domains and their boundaries have been defined manually from sequence alone for literally hundreds of protein domains. Finding regions of similarity between proteins allows detection of domains. However, defining the exact boundaries of the domain is often a more difficult problem. Certain rules can be used to find the maximum size of a domain from pairwise comparisons of proteins in a related family. [Pg.141]

It can be difficult, if not impossible, to find the domain structure of a protein of interest from the primary literature. The sequence may contain many common domains, but these are usually not apparent from literature searches. Articles defining new domains may include the protein, but only in an alignment figure, which is not searchable. Perhaps, with the advent of online access to articles, the full text including figures may become searchable. Fortunately, there have been several attempts to make this hidden information available in a way that can be easily searched. These resources, called domain family databases, are exemplified by Prosite, Pfam, Prints, and SMART. These databases gather information from the literature about common domains and make it searchable in a variety of ways. They usually allow a researcher to look at the precalculated domain organization of proteins in the sequence database and also provide a way to search new sequences... [Pg.143]

The current state of design processes can essentially not be improved by making only small steps. Instead, a new approach is necessary. Thereby, we face principal questions and nontrivial problems. We find new questions and corresponding problems by coherently and uniformly modeling the application domain and by defining new and substantial tool functionality. The layered process/product model is a scientific question which - even in a long-term project like IMPROVE - can only be answered partially. [Pg.65]

Do we still need additional molecular descriptors? Well, there will always be room for new and novel molecular descriptors that incorporate structural information not well captured by current molecular descriptors. But these days, novel molecular descriptors are hard to find, particularly if they are to be conceptually and computationally simple and have a straightforward structural interpretation. Finding them will continue to be a rare event. However, recall a statement of E. Bright Wilson [1]: "... every once in a while some new theory or new experimental method or apparatus makes it possible to enter a new domain. ..." [Pg.219]

Procedures like the 15 percent rule and bootlegging encourage exploration, and people further enhance their chances of finding new ideas by seeking input from new domains. Evaluation metrics focus on development processes and project milestones, and they do not try to assess the impact on current results. [Pg.207]

Most of the microporous and mesoporous compounds require the use of structure-directing molecules under hydro(solvo)thermal conditions [14, 15, 171, 172]. A serious handicap is the need for high-temperature calcination to develop their porosity. It usually results in inferior textural and acidic properties, and even full structural collapse occurs in the case of open frameworks, (proto)zeolites containing small-crystalline domains, and mesostructures. These materials could show very interesting properties if their structure could be fully maintained. A principal question is whether there is any alternative to calcination. There is manifest interest in finding such alternatives to show the potential of new structures. [Pg.132]

This chapter has reviewed the application of ROA to studies of unfolded proteins, an area of much current interest central to fundamental protein science and also to practical problems in areas as diverse as medicine and food science. Because of the many discrete structure-sensitive bands present in protein ROA spectra, the technique provides a fresh perspective on the structure and behavior of unfolded proteins, and of unfolded sequences in proteins such as A-gliadin and prions which contain distinct structured and unstructured domains. It also provides new insight into the complexity of order in molten globule and reduced protein states, and of the more mobile sequences in fully folded proteins such as β-lactoglobulin. With the promise of commercial ROA instruments becoming available in the near future, ROA should find many applications in protein science. Since many gene sequences code for natively unfolded proteins in addition to those coding for proteins with well-defined tertiary folds, both of which are equally accessible to ROA studies, ROA should find wide application in structural proteomics. [Pg.109]
