Other Motif Region and Site Prediction

Neural networks have been used successfully for the detection of binding or regulatory sites in nucleic acid sequences that lack clear consensus sequences. This is also true for the prediction of cleavage or acceptor sites in protein sequences. [Pg.133]

From the analyses of many glycosylation sites (GalNAc transferase acceptor sites), a few rules of thumb have been formulated (Wilson et al., 1991) and motif patterns have been proposed (Pisano et al., 1993). Matrix statistics have been compiled for site prediction (Elhammer et al., 1993). However, since there is no clear consensus acceptor sequence pattern and glycosylation is strongly influenced by the local conformation, a neural network is appropriate for the task. [Pg.133]

Hansen et al. (1998) extended the work by using a jury of four differently trained glycosylation neural networks and one surface accessibility network. The surface accessibility network was used to derive a modulated threshold (i.e., cutoff value for glycosylation network output) because O-glycosylation sites were found exclusively on the surface of proteins. If the site and surroundings were predicted surface accessible, the [Pg.133]
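The jury-plus-threshold idea described above can be sketched roughly as follows. This is a minimal illustration, not the published method: averaging the four network outputs, the function name `jury_predict`, the `surface_cutoff` parameter, and the two threshold arguments are all assumptions made for the example, and the numeric values in the usage lines are illustrative only.

```python
from statistics import mean


def jury_predict(site_scores, surface_score, thr_exposed, thr_buried,
                 surface_cutoff=0.5):
    """Combine several glycosylation-network outputs with a modulated cutoff.

    site_scores   -- outputs of the individual glycosylation networks (0..1)
    surface_score -- output of the surface accessibility network (0..1)
    thr_exposed   -- cutoff used when the site is predicted surface accessible
    thr_buried    -- cutoff used otherwise
    All parameter names and the averaging rule are hypothetical.
    """
    # Assumption: the jury vote is the plain average of the network outputs.
    jury_score = mean(site_scores)
    # The accessibility prediction selects which cutoff is applied.
    exposed = surface_score >= surface_cutoff
    threshold = thr_exposed if exposed else thr_buried
    return jury_score >= threshold, jury_score, threshold


# Illustrative values only: the same jury score passes or fails
# depending on the predicted surface accessibility.
scores = [0.6, 0.7, 0.5, 0.6]
print(jury_predict(scores, surface_score=0.9, thr_exposed=0.5, thr_buried=0.8))
print(jury_predict(scores, surface_score=0.1, thr_exposed=0.5, thr_buried=0.8))
```

The point of the design is that the accessibility network never votes on glycosylation directly; it only shifts the cutoff that the jury score must reach.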

In these studies, the input sequence was represented by seven non-orthogonal physicochemical properties, namely, hydrophobicity, volume, surface area, hydrophilicity, bulkiness, refractivity, and polarity, each normalized to the range (-1, 1). Thirteen-amino acid sequence windows were used, resulting in an input vector of 91 (i.e., 13 x 7) units. Because the number of training examples was small, the input units were only partially connected to the hidden layer in order to reduce the number of free parameters and avoid overtraining the network. The constraints were that each first hidden layer unit was connected to one amino acid property exclusively, and that the second hidden layer was no larger than the first hidden layer. The output layer had exactly one unit to indicate whether the middle residue was at a membrane/non-membrane border. The process of development started with randomly generated architectures and resulted in 18 and 8 units in the first and second hidden layers, with a total number of free parameters of [Pg.134]
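The window encoding and the property-restricted connectivity described above can be sketched as follows. This is only an illustration under stated assumptions: the property table is filled with synthetic placeholder numbers (not real physicochemical scales), the input ordering (position-major, property-minor) is a choice made here, and the round-robin assignment of hidden units to properties in `property_mask` is hypothetical, since the text does not say how the units were distributed.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
PROPERTIES = ["hydrophobicity", "volume", "surface_area", "hydrophilicity",
              "bulkiness", "refractivity", "polarity"]
WINDOW = 13  # residues per window, as stated in the text


def placeholder_table(seed=0):
    """Synthetic values in (-1, 1); NOT real physicochemical data."""
    rng = random.Random(seed)
    return {aa: [rng.uniform(-0.99, 0.99) for _ in PROPERTIES]
            for aa in AMINO_ACIDS}


def encode_window(window, table):
    """Concatenate the 7 property values of each residue: 13 x 7 = 91 inputs."""
    if len(window) != WINDOW:
        raise ValueError("window must contain exactly 13 residues")
    vec = []
    for aa in window:
        vec.extend(table[aa])
    return vec


def sliding_windows(seq):
    """Yield (centre_index, window) for every full 13-residue window."""
    half = WINDOW // 2
    for i in range(half, len(seq) - half):
        yield i, seq[i - half:i + half + 1]


def property_mask(n_hidden):
    """Return mask[h][j] = 1 when input j may feed first-layer unit h.

    Assumption: unit h reads only property h % 7 (round-robin assignment).
    Input j holds property j % 7 because of the position-major ordering.
    """
    n_inputs = WINDOW * len(PROPERTIES)
    mask = []
    for h in range(n_hidden):
        prop = h % len(PROPERTIES)
        mask.append([1 if j % len(PROPERTIES) == prop else 0
                     for j in range(n_inputs)])
    return mask


table = placeholder_table()
vector = encode_window("ACDEFGHIKLMNP", table)
mask = property_mask(18)
print(len(vector), len(mask), sum(mask[0]))
```

With such a mask, each first-layer unit has 13 incoming weights instead of 91, which is how restricting a unit to a single property cuts the number of free parameters.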

Considering the convergence properties of the evolutionary algorithm, and comparing the number of different structures that occurred during the process (about 655,000) with the size of the stock of structures (10^70), it was suggested that a local search was carried out in the network architecture and parameter spaces. [Pg.135]
