
Shannon Entropy Information Content

Many papers in the QSAR literature describe the use of the discontinuous form of the Shannon entropy (Shannon 1948) as a molecular descriptor. This descriptor is computed from the values of certain other descriptors using Formula (4.34). In QSAR calculations, the Shannon entropy (SE) is considered a measure of the diversity of descriptor values and is usually called information content. [Pg.117]

If each category includes the same number of values, SE in Formula (4.34) takes its maximum value. If k = 1 (all values fall within the same category), SE = 0, because the single category has probability 1 and log 1 = 0. The ratio between SE and its maximum value is another descriptor. A very simple example is the computation of SE for guanine (C5H5N5O) when the descriptor is the atomic number. The total number of atoms is 16 (N = 16). There are four nonempty categories of atoms (k = 4). These categories include 5, 5, 5, and 1 atoms, respectively. Consequently, the value of SE for the atoms is [Pg.117]

In the guanine molecule, which has 12 chemical bonds, there are, according to Table 4.1, only two nonempty categories of chemical bonds (single and aromatic; k = 2). These categories include 2 and 10 bonds, respectively. Consequently, the value of SE for the bonds is [Pg.118]
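A minimal sketch of these two calculations, assuming that Formula (4.34) is the usual discrete Shannon entropy SE = -Σ p_i log2 p_i with p_i the fraction of atoms (or bonds) in category i (the function name below is ours, not the source's):

```python
from math import log2

def shannon_entropy(counts):
    """Discrete Shannon entropy (in bits) of a list of category counts."""
    n = sum(counts)
    return -sum((c / n) * log2(c / n) for c in counts if c > 0)

# Guanine (C5H5N5O): 16 atoms grouped by atomic number -> 5 C, 5 H, 5 N, 1 O
se_atoms = shannon_entropy([5, 5, 5, 1])      # ~1.82 bits
print(se_atoms, se_atoms / log2(4))           # ratio to maximum (4 categories) ~0.91

# Guanine: 12 bonds grouped by bond order -> 2 single, 10 aromatic (per the text)
se_bonds = shannon_entropy([2, 10])           # ~0.65 bits
print(se_bonds, se_bonds / log2(2))           # ratio to maximum (2 categories) ~0.65
```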

In guanine, the diversity of chemical bonds (in terms of bond orders) is much smaller than the diversity of atoms (in terms of atomic numbers). In addition, one can say that the amide chemical group in guanine seems to be nonexistent, because the bond order of the C-N bond is too small. [Pg.118]

Putting the values of many descriptors (net charges, geometric distances, features of molecular fragments, etc.) into categories is a difficult task. In contrast, it is easy to put the features of kenographs into categories. [Pg.118]


What is complexity? There is no good general definition of complexity, though there are many. Intuitively, complexity lies somewhere between order and disorder, between regularity and randomness, between a perfect crystal and a gas. Complexity has been measured by logical depth, metric entropy, information content (Shannon's entropy), fluctuation complexity, and many other techniques, some of which are discussed below. These measures are well suited to specific physical or chemical applications, but none describes the general features of complexity. Obviously, the lack of a definition of complexity does not prevent researchers from using the term. [Pg.28]

The Shannon equation (Eq. (1)) [4] enables one to evaluate the information content, I (also known as the Shannon entropy), of the system. [Pg.208]

Jeffrey W. Godden and Jürgen Bajorath, Analysis of Chemical Information Content Using Shannon Entropy. [Pg.450]

Information theory [71, 72] is a convenient basis for the quantitative characterization of structures. It introduces simple structural indices, called the information content (total or mean), of any structured system. For such a system having N elements distributed into classes of equivalence N1, N2, ..., Nk, a probability distribution P = {p1, p2, ..., pk} is constructed (pi = Ni/N). The entropy of this distribution is calculated [71] by the Shannon formula: [Pg.42]
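In the usual notation (the original equation is not reproduced in this excerpt), the Shannon formula referred to here is

$$
H \;=\; -\sum_{i=1}^{k} p_i \log_2 p_i, \qquad p_i = \frac{N_i}{N}.
$$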

The Shannon information content (in thermodynamics, the entropy) can be calculated from the probability distribution of allowed amino acid substitutions (Fontana and Schuster, 1987; Saven and Wolynes, 1997; Dewey and Donne, 1998). Counting the number of sequences at a given fitness is mathematically isomorphic to calculating the entropy at a given fitness, S(F) = ln Ω, where the number of states Ω is the number of sequences at fitness F. The entropy, s_i, for a given site i can also be calculated from Eq. (25)... [Pg.128]

If I is large, then the amino acid found at that position is highly conserved. Conversely, a low I indicates that many amino acids are allowed at that site. This information content is directly related to the Shannon entropy through a Boltzmann weighting of the fitness changes w_i(a), used to calculate the probability p_i(a) that the residue at site i is in amino acid state a. [Pg.129]
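A schematic sketch of this relation, assuming the standard Boltzmann form p_i(a) ∝ exp(β·w_i(a)) and the per-site entropy s_i = -Σ_a p_i(a) ln p_i(a); the value of β, the sign convention for w_i(a), and the numbers below are illustrative assumptions, not details taken from the cited work:

```python
import numpy as np

def site_probabilities(w, beta=1.0):
    """Boltzmann-weight the fitness changes w[a] over the 20 amino acids at one site."""
    x = beta * np.asarray(w, dtype=float)
    x -= x.max()                       # numerical stabilisation before exponentiating
    p = np.exp(x)
    return p / p.sum()

def site_entropy(p):
    """Per-site Shannon entropy s_i = -sum_a p_i(a) ln p_i(a), in nats."""
    p = p[p > 0]
    return float(-(p * np.log(p)).sum())

def site_information(p):
    """Information content relative to a uniform background over the 20 amino acids."""
    return np.log(len(p)) - site_entropy(p)

# A conserved site: one amino acid strongly favoured -> low entropy, high information.
w_conserved = [5.0] + [0.0] * 19
p = site_probabilities(w_conserved)
print(site_entropy(p), site_information(p))
```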

Binary descriptors should be used when the considered characteristic is truly a dual characteristic of the molecule, or when the considered quantity cannot be represented in a more informative numerical form. In any case, the mean information content of a binary descriptor, I_char, is low (the maximum value is 1 when the proportions of 0 and 1 are equal); thus the standardized Shannon's entropy, I_char/log2(n), where n is the number of elements, gives a measure of the efficiency of the collected information. [Pg.234]
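For a binary descriptor in which a fraction p of the elements take the value 1, the mean information content is the familiar binary entropy (a standard form, stated here for clarity rather than quoted from the source):

$$
I_{\mathrm{char}} \;=\; -\,p\,\log_2 p \;-\;(1-p)\,\log_2(1-p) \;\le\; 1,
$$

with the maximum value 1 reached at p = 1/2.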

From the obtained equivalence classes in the hydrogen-filled multigraph, for each rth order (usually r = 0-6), the rth order neighbourhood Information Content, IC, is calculated as defined by Shannon's entropy: [Pg.235]

The mean information content, I, also called Shannon's entropy H [Shannon and Weaver, 1949], is defined as: [Pg.239]
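The formula itself is truncated in this excerpt; the standard definition, for n elements partitioned into G equivalence classes of n_g elements each, is

$$
\bar{I} \;=\; H \;=\; -\sum_{g=1}^{G} \frac{n_g}{n}\,\log_2\frac{n_g}{n}.
$$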

The standardized Shannon's entropy (or standardized information content) is the ratio between the actual mean information content and the maximum available information content (i.e., the Hartley information): [Pg.240]
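In the same notation, and taking the Hartley information for n elements to be log2 n, this ratio reads

$$
H' \;=\; \frac{H}{H_{\max}} \;=\; \frac{-\sum_{g} \frac{n_g}{n}\log_2\frac{n_g}{n}}{\log_2 n}.
$$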

Model complexity is defined as the ratio between the multivariate entropy S_X of the X-block (n objects and p variables) of the model and Shannon's entropy H_y of the y response vector, thus also accounting for the information content of the y response [Authors, This Book]: [Pg.296]

Shannon's entropy ≡ mean information content → information content; → shape descriptors

When the total information content is calculated on molecules, n being the total number of atoms and n_g the number of equivalent atoms of type g, it is often referred to as molecular negentropy. The term H is Shannon's entropy, as defined below. [Pg.413]
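Written out in the usual way (n times the mean information content), the total information content referred to here is

$$
I_{\mathrm{tot}} \;=\; n\,\bar{I} \;=\; n\log_2 n \;-\; \sum_{g} n_g \log_2 n_g.
$$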

Several descriptors are based on the concepts of information content and entropy; among these are the topological information indices, indices of neighborhood symmetry, Shannon... [Pg.416]

Shannon's entropy is also widely applied to the analysis of the information content of molecular descriptors within data sets of molecules and to assess the diversity of chemical libraries [Lin, 1996b; Bajorath, 2001]. [Pg.416]

Information content and Shannon's entropy of molecular descriptors were extensively studied by Bajorath, Godden, and coworkers in several papers [Godden, Stahura et al., 2000; Godden and Bajorath, 2000, 2002, 2003]. [Pg.516]

Shannon's entropy ≡ mean information content → information content

Shannon entropy applied to gene expression time series. The equation for Shannon entropy (H) is shown above. For each gene, the entropy accounts for the probability, p (or frequency), of a level of gene expression, i. Data from [8] (see Figure 5.1) are binned into three expression levels. SOD, superoxide dismutase; NFM, neurofilament medium; GRalpha4, GABA receptor alpha 4 subunit. Actin has zero entropy (zero information content) because its expression is... [Pg.561]
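A minimal sketch of the binning-and-entropy step described in this caption; the expression profiles below are invented for illustration and are not the data of [8]:

```python
import numpy as np

def expression_entropy(series, n_bins=3):
    """Shannon entropy (bits) of a gene-expression time series binned into n_bins levels."""
    counts, _ = np.histogram(series, bins=n_bins)
    p = counts[counts > 0] / counts.sum()
    return float(-(p * np.log2(p)).sum())

# A flat profile (e.g. a housekeeping gene such as actin) has zero entropy;
# a profile visiting all three levels equally often reaches the maximum log2(3) bits.
flat = [1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
varying = [0.1, 0.5, 0.9, 0.1, 0.5, 0.9]
print(expression_entropy(flat))       # 0.0
print(expression_entropy(varying))    # ~1.58
```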

Going beyond statistical analyses, information encoded in molecular descriptor value distributions in databases of natural or synthetic compounds was analyzed in quantitative terms by application of an entropy-based, information-theoretic approach. Descriptor value distributions, represented as histograms, can be reduced to their information content using Shannon entropy (SE) calculations. Differences in information content between databases can be quantified using differential SE (DSE) analysis. An extension of this approach, SE-DSE analysis, makes it possible to classify molecular descriptors according to their relative information content in diverse databases and to identify those descriptors that are most responsive to systematic differences between compound databases. Figure... [Pg.58]
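A rough sketch of the SE and DSE idea. The binning scheme, the synthetic "databases", and in particular the differential term used below (SE of the pooled data minus the mean of the individual SEs) are our assumptions for illustration and should be checked against the exact formulation in the cited papers:

```python
import numpy as np

def descriptor_se(values, bins):
    """Shannon entropy (bits) of a molecular-descriptor value distribution,
    computed from a histogram over a fixed set of bin edges."""
    counts, _ = np.histogram(values, bins=bins)
    p = counts[counts > 0] / counts.sum()
    return float(-(p * np.log2(p)).sum())

# Hypothetical descriptor values (e.g. a logP-like quantity) for two compound databases.
rng = np.random.default_rng(0)
db_a = rng.normal(2.0, 1.0, 1000)
db_b = rng.normal(3.5, 0.3, 1000)

bins = np.linspace(-2, 8, 21)                     # shared binning scheme for both databases
se_a, se_b = descriptor_se(db_a, bins), descriptor_se(db_b, bins)
se_ab = descriptor_se(np.concatenate([db_a, db_b]), bins)

# One plausible differential-SE term (an assumption, not necessarily the cited definition):
dse = se_ab - 0.5 * (se_a + se_b)
print(se_a, se_b, dse)
```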

Consider the Fisher information for locality [1, 2], called intrinsic accuracy, which historically predates the Shannon entropy by about 25 years, having been proposed at about the same time that the final form of quantum mechanics was shaped. This local measure of the information content of the continuous (normalized) classical probability density p(r) reads: [Pg.151]
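The equation itself is cut off in this excerpt; the standard form of this locality functional is

$$
I[p] \;=\; \int \frac{\left|\nabla p(\mathbf{r})\right|^{2}}{p(\mathbf{r})}\,\mathrm{d}\mathbf{r}.
$$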

Quantitative measures of the compactness of an N-electron wave function have been reported in Ref. [18] by means of the informational content Q_c (or Shannon entropy) within the traditional CI expansion method, based on Slater determinants classified according to the excitation level with respect to a given reference determinant. Assuming that the N-electron wave function is normalized to unity (Σ_α |C_α|² = 1), the counterpart formulation of that index for the seniority-based CI approach is... [Pg.117]

