Amino Acid Adjacency Matrix

Let us introduce the amino acid adjacency matrix (AAA matrix) [24] as a tool of graphical bioinformatics. This is a 20 x 20 matrix whose rows and columns correspond to the 20 natural amino acids, which have been ordered as follows ... [Pg.339]

The matrix element (i,j) of the AAA matrix counts the occurrences of the adjacent pair of amino acids (i,j) in the protein sequence. The element is zero if the ... [Pg.339]

In Table 13.7, we have listed two proteins that we will continue to use for illustration. In Matrix 13.2, we show the AAA matrix for a short portion of 30 amino acids of the first protein of Table 13.7, the segment (61-90) [Pg.340]

As one can see from Matrix 13.2, if one goes to row L and column G, there is an entry 1, which stands for LG, the first pair of amino acids. The next pair, GP, is registered in row G and column P. Going to row G and column P, the entry is 2 because the pair GP also appears in location 7. The considered AAA matrix registers 29 adjacent pairs for a list of 30 amino acids. [Pg.341]
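
The counting procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation. The row and column order `ARNDCQEGHILKMFPSTWYV` is an assumption, since the ordering is elided in the excerpt, and the sequence `toy` is a hypothetical example, not the segment (61-90) of Table 13.7.

```python
# Assumed row/column order; the excerpt does not show the actual ordering.
AMINO_ACIDS = "ARNDCQEGHILKMFPSTWYV"
INDEX = {aa: k for k, aa in enumerate(AMINO_ACIDS)}


def aaa_matrix(sequence):
    """Return the 20 x 20 matrix of adjacent-pair counts for a one-letter sequence."""
    matrix = [[0] * 20 for _ in range(20)]
    # Walk over every adjacent pair (first, second) and increment its cell.
    for first, second in zip(sequence, sequence[1:]):
        matrix[INDEX[first]][INDEX[second]] += 1
    return matrix


# Hypothetical toy sequence in which GP occurs at locations 2 and 7.
toy = "LGPAVLGPW"
m = aaa_matrix(toy)

print(m[INDEX["G"]][INDEX["P"]])    # 2: the pair GP occurs twice
print(sum(sum(row) for row in m))   # 8: a list of 9 amino acids has 8 adjacent pairs
```

The sum of all entries is always one less than the sequence length, which matches the 29 pairs for 30 amino acids noted in the text.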

AAA matrices have been found of interest for the characterization of segments of proteins and of whole proteins in comparative studies. Invariants of the AAA matrix have been used for characterization of proteins. As we will see in the next section, 20-component vectors based on the differences in rows and columns of the AAA matrix were used in neural network analysis to discriminate between transmembrane and non-transmembrane regions of ND6 proteins [25]. In another study, El-Lakkani and El-Sherif [26] presented a 3-D amino acid adjacency matrix, applied it to nine ND5 proteins of different species, and found high correlation between their results and those obtained with the ClustalW program [27]. Similarly, invariants from the amino acid adjacency matrix and the decagonal isometric matrix were independently constructed and used in neural network analysis. [Pg.341]

The two matrix representations of the protein segments, the amino acid adjacency matrix and the decagonal isometric matrix, are derived from the sequence information alone. As has been demonstrated, mathematical descriptors that depend on the sequence information alone have successfully revealed the underlying characteristics and patterns of given sequences. Their numerical nature also makes them easier to incorporate into a mathematical model. In addition, as has been well illustrated in chemical graph theory, when considering characterization of molecules, one can... [Pg.343]

One of the present authors arrived at the exact solution of the protein alignment problem [34] by modifying the amino acid adjacency matrix [24]. The amino acid adjacency matrix (AAA), as mentioned before, is a 20 x 20 matrix in which each row and each column belongs to one of the 20 natural amino acids. The matrix element (AAA)ij counts how many times amino acid i is followed by amino acid j in the primary sequence of a protein. The 20 amino acids have been ordered as follows ... [Pg.345]

The AAA matrix for the ND6 human protein of Table 13.9 is shown in Matrix 13.3. The first row of the amino acid adjacency matrix for the ND6 human protein thus is... [Pg.345]

It registers the presence of the pairs AR, AG, AI, and AW occurring once and the pairs AL and AM occurring twice in the primary sequence of the ND6 human protein. The zeros indicate that in the sequence of this protein there are no adjacent amino acids AA, AN, AD, AC, AQ, and so on. Similarly, the first row of the amino acid adjacency matrix for the ND6 protein of mouse, the second protein of Table 13.9, is... [Pg.346]
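
Reading off a single row in this way amounts to counting which amino acids directly follow a given residue. The sketch below shows the idea for the alanine row. The function name `successors` and the sequence `toy` are hypothetical and chosen for illustration only; the sequence is not the ND6 protein.

```python
from collections import Counter


def successors(sequence, residue):
    """Count which amino acids directly follow `residue` in the sequence."""
    return Counter(b for a, b in zip(sequence, sequence[1:]) if a == residue)


# Hypothetical toy sequence, not the ND6 protein of Table 13.9.
toy = "MALAGALAMW"
row_a = successors(toy, "A")

print(row_a["L"])  # 2: the pair AL occurs twice
print(row_a["G"])  # 1: the pair AG occurs once
print(row_a["A"])  # 0: the pair AA does not occur
```

A `Counter` returns zero for absent keys, which mirrors the zero entries of the matrix row.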

In the case of the AAA matrix, in order to recover the lost information on the location of individual amino acids in the sequence of proteins, one should record the sequential numbers of adjacent amino acids rather than their abundance. This is possible by modifying the amino acid adjacency matrix and using the sequential labels of amino acids as input so that the resulting matrix has full information on protein sequences. The so-modified AAA matrix allows, if required, reconstruction of the protein sequence. Observe that now the matrix elements are not numbers, but sets of numbers, the special case of which are numbers that can also be viewed... [Pg.347]

M. Randić, M. Novič, and M. Vračko, On novel representation of proteins based on amino acid adjacency matrix, SAR QSAR Environ. Res. 19 (2008) 339-349. [Pg.362]

Amino acid adjacency matrices, which in addition to the information on abundance of individual amino acids also give the count of successive pairs of amino acids, are accompanied by some loss of information, in that the locations of the various pairs are not recorded. A remedy for this loss of information is to record the locations of pairs of adjacent amino acids rather than just their occurrence. In this way, the AAA matrix of Matrix 13.13 has a new form shown in Matrix 13.4, in which we show only the portion of the matrix involving elements G-V. The modified AAA matrix shown in Matrix 13.5 is associated with the sequence of the first 45 amino acids of the human ND6 protein. Observe that the elements (G, L) and (G, F), which have the abundance count 2 in the corresponding AAA matrix, are now replaced by two sequential numbers. Although many matrix elements of the modified AAA matrix may remain single numbers, in general the elements of the modified AAA matrix are not numbers but sets, that is, collections of numbers. In Matrix 13.5, we show the same portion of the AAA matrix for the second protein of Table 13.10. [Pg.348]
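
A minimal sketch of the modified matrix, assuming that each pair is labelled by the 1-based position of its first amino acid (the excerpt does not state the exact labelling convention). The matrix is stored sparsely as a dictionary from pairs to lists of locations; the names `modified_aaa` and `reconstruct` and the sequence `toy` are hypothetical.

```python
from collections import defaultdict


def modified_aaa(sequence):
    """Map each adjacent pair (i, j) to the 1-based locations at which it occurs."""
    positions = defaultdict(list)
    for k, pair in enumerate(zip(sequence, sequence[1:]), start=1):
        positions[pair].append(k)
    return dict(positions)


def reconstruct(positions):
    """Rebuild the sequence from the location-labelled pairs.

    Assumes the input came from a sequence with at least two residues.
    """
    n_pairs = sum(len(labels) for labels in positions.values())
    residues = [None] * (n_pairs + 1)
    for (first, second), labels in positions.items():
        for k in labels:
            residues[k - 1] = first   # residue at location k
            residues[k] = second      # residue at location k + 1
    return "".join(residues)


# Hypothetical toy sequence, not the human ND6 protein.
toy = "LGPAVLGPW"
mod = modified_aaa(toy)

print(mod[("G", "P")])          # [2, 7]: a set of locations instead of the count 2
print(reconstruct(mod) == toy)  # True: no sequence information is lost
```

The ordinary AAA entry is recovered as the length of each list, so the modified matrix contains the abundance information as a special case.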

Elastin is a macromolecule synthesized as a single peptide chain of molecular weight 70,000, termed tropoelastin, and secreted into the extracellular matrix, where it is rapidly crosslinked to form mature elastin. The carboxy-terminal end of elastin is highly conserved, with the sequence Gly-Gly-Ala-Cys-Leu-Gly-Leu-Ala-Cys-Gly-Arg-Lys-Arg-Lys. The two Cys residues that form disulfide crosslinks are found in this region, as well as a positively charged pocket of residues that is believed to be the site of interaction with microfibrillar protein residues. Hydrophobic alanine-rich sequences are known to form α-helices in elastin; these sequences are found near lysine residues that form crosslinks between two or more chains. Alanine residues not adjacent to lysine residues found near proline and other bulky hydrophobic amino acids inhibit α-helix formation. Additional evidence exists for β structures and β turns within elastin, thereby giving an overall model of the molecule that contains helical stiff segments connected by flexible segments. [Pg.56]

In these mechanisms, electron transport through the various components of the electron-transport chain leads to structural changes in the proteins of the chain, such that changes occur in the pKa values (Chap. 3) of their ionizable amino acid residues. For example, an increase in the pKa of a residue adjacent to the matrix side of the membrane would lead to proton uptake from the matrix, while a decrease in the pKa of a residue adjacent to the intermembranous side of the membrane could lead to release of a proton. The net effect of these processes is the transfer of protons from the matrix to the intermembranous side of the membrane. However, proton-pump mechanisms do not make strong predictions of the H+/e− stoichiometries. [Pg.410]

Protein TOMOCOMD descriptors [Marrero-Ponce, Marrero et al, 2004; Marrero-Ponce, Castillo-Garit et al, 2005a] are linear, bilinear, and quadratic indices calculated from a matrix encoding adjacencies of the α-carbon atoms in the protein. Each α-carbon atom is described by any property of the corresponding side-chain amino acid. The vector w is thus comprised of numeric values that represent a certain property of all the amino acids in the protein. [Pg.807]

If the protein comprises n amino acids, then the vector w of weights is n-dimensional. The protein adjacency matrix has off-diagonal elements equal to 1 if there is a covalent bond between two α-carbon atoms and zero otherwise; the diagonal elements are equal to 1 if the amino acid has a hydrogen-bond interaction between its side-chain and the main chain atom. [Pg.807]
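
The following sketch builds such a matrix for a short chain and evaluates a generic linear map and quadratic form with it. Several things here are assumptions: consecutive residues in the chain are taken as the bonded α-carbon pairs, the hydrogen-bonded residues are supplied by hand, the weights in `w` are made-up property values, and the plain forms M w and wᵀ M w are used for illustration rather than the exact TOMOCOMD index definitions, which the excerpt does not give.

```python
def protein_adjacency(n, hbonded=()):
    """n x n matrix: 1 between consecutive residues, 1 on the diagonal for flagged residues.

    `hbonded` holds 0-based indices of residues assumed to have a
    side-chain to main-chain hydrogen bond (hypothetical input).
    """
    m = [[0] * n for _ in range(n)]
    for k in range(n - 1):
        m[k][k + 1] = m[k + 1][k] = 1
    for k in hbonded:
        m[k][k] = 1
    return m


def linear_map(m, w):
    """Return the vector M w."""
    n = len(w)
    return [sum(m[i][j] * w[j] for j in range(n)) for i in range(n)]


def quadratic_form(m, w):
    """Return the scalar w^T M w."""
    n = len(w)
    return sum(w[i] * m[i][j] * w[j] for i in range(n) for j in range(n))


# Hypothetical property values for a four-residue chain.
w = [1.0, 2.0, 3.0, 4.0]
m = protein_adjacency(4, hbonded=[1])

print(linear_map(m, w))      # [2.0, 6.0, 6.0, 3.0]
print(quadratic_form(m, w))  # 44.0
```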

By comparing only the first rows of the two ND6 proteins, one can already see that, besides being similar, they also show dissimilarity. Although the AAA matrix is accompanied by a loss of information on the sequential locations of pairs of adjacent amino acids, it nevertheless offers a useful characterization of proteins. The sequential labels, sites, or locations here indicate the exact position of an amino acid in the sequence of the protein. For example, the sequential labels for alanine are 4, 72, 74, 81, 97, 142, 144, and 171. [Pg.347]
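
Sequential labels of this kind can be collected with a single pass over the sequence. The sketch below uses 1-based positions, as in the alanine example; the function name `sequential_labels` and the sequence `toy` are hypothetical, so the labels printed are not those of the ND6 protein.

```python
def sequential_labels(sequence, residue):
    """Return the 1-based positions at which `residue` occurs in the sequence."""
    return [k for k, aa in enumerate(sequence, start=1) if aa == residue]


# Hypothetical toy sequence, not the ND6 protein.
toy = "MALAGALAMW"
labels = sequential_labels(toy, "A")

print(labels)  # [2, 4, 6, 8]
```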
