Big Chemical Encyclopedia


Entropy-based information theory

Two conclusions can be derived from these results. First, it is feasible to use entropy-based information theory to select fewer than 10 chemical descriptors that can systematically distinguish between compounds from different sources. Second, when selecting descriptors to distinguish between compounds, it is important that these descriptors have high information content, so that they support separability and can differentiate compounds between the datasets. The power of the entropic separation revealed in this analysis gave rise to the development of the DSE and, ultimately, the SE-DSE metric, as described earlier. [Pg.283]
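As a hedged illustration of this kind of descriptor scoring, the sketch below ranks descriptors by a histogram-based Shannon entropy difference between two datasets. The function names, the bin count and the exact scoring formula are illustrative assumptions and need not match the published SE/DSE definitions.

```python
import numpy as np

def shannon_entropy(values, bins=32, value_range=None):
    # Histogram-based Shannon entropy (in bits) of one descriptor's value distribution.
    counts, _ = np.histogram(values, bins=bins, range=value_range)
    p = counts[counts > 0] / counts.sum()
    return float(-np.sum(p * np.log2(p)))

def separation_score(desc_a, desc_b, bins=32):
    # Entropy of the pooled values minus the mean entropy of the two individual sets:
    # a descriptor distributed very differently in A and B scores high.
    # (Illustrative stand-in only; the published DSE/SE-DSE definitions may differ.)
    lo = min(desc_a.min(), desc_b.min())
    hi = max(desc_a.max(), desc_b.max())
    se_ab = shannon_entropy(np.concatenate([desc_a, desc_b]), bins, (lo, hi))
    se_a = shannon_entropy(desc_a, bins, (lo, hi))
    se_b = shannon_entropy(desc_b, bins, (lo, hi))
    return se_ab - 0.5 * (se_a + se_b)

# With descriptor matrices A and B (compounds x descriptors), keep the ten most separating columns:
# ranked = sorted(range(A.shape[1]), key=lambda j: separation_score(A[:, j], B[:, j]), reverse=True)[:10]
```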

Coifman, R. R., and Wickerhauser, M. V., Entropy-based algorithms for best basis selection, IEEE Trans. Inform. Theory 38(2), 713-718 (1992). [Pg.98]

The recent theoretical approach based on information theory (IT) to the study of aqueous solutions and hydration phenomena [62-66] points in such a direction. IT is a probabilistic framework for thinking about communication, introduced in 1948 by Shannon and subsequently developed [114]. It provides a quantitative description of information by defining entropy as a function of probability... [Pg.707]

The maximum entropy method (MEM) is an information-theory-based technique that was first developed in the field of radioastronomy to enhance the information obtained from noisy data (Gull and Daniell 1978). The theory is based on the same equations that are the foundation of statistical thermodynamics. Both the statistical entropy and the information entropy deal with the most probable distribution. In the case of statistical thermodynamics, this is the distribution of the particles over position and momentum space ("phase space"), while in the case of information theory, the distribution of numerical quantities over the ensemble of pixels is considered. [Pg.115]
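A minimal sketch of this information entropy for a distribution of values over an ensemble of pixels, assuming a NumPy array as input; the function name and the example maps are illustrative only.

```python
import numpy as np

def information_entropy(pixel_values):
    # Information entropy S = -sum_i p_i ln p_i of the (normalised) distribution
    # of a non-negative quantity over the ensemble of pixels.
    p = np.asarray(pixel_values, dtype=float).ravel()
    p = p / p.sum()
    p = p[p > 0]
    return float(-np.sum(p * np.log(p)))

# A flat map (the "most probable" distribution, carrying the least information)
# has maximum entropy; concentrating everything in one pixel gives zero entropy.
flat = np.ones((8, 8))
peaked = np.zeros((8, 8)); peaked[0, 0] = 1.0
print(information_entropy(flat))    # ln(64) ~ 4.16
print(information_entropy(peaked))  # 0.0
```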

The maximum entropy method (MEM) is based on the philosophy of using a number of trial spectra generated by the computer to fit the observed FID by a least squares criterion. Because noise is present, there may be a number of spectra that provide a reasonably good fit, and the distinction is made within the computer program by looking for the one with the maximum entropy as defined in information theory, which means the one with the minimum information content. This criterion ensures that no extraneous information (e.g., additional spectral... [Pg.74]
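As a rough illustration of this selection criterion, the sketch below picks, from a set of trial spectra, the one with maximum entropy among those whose implied FID fits the observed data to within the noise. The chi-squared threshold, the use of an inverse FFT as the spectrum-to-FID model, and all names are simplifying assumptions, not the actual MEM implementation.

```python
import numpy as np

def spectral_entropy(spectrum):
    # Shannon entropy of a non-negative trial spectrum, normalised to unit area;
    # the maximum-entropy spectrum is the least "informative" one.
    p = np.abs(spectrum) / np.abs(spectrum).sum()
    p = p[p > 0]
    return float(-np.sum(p * np.log(p)))

def pick_max_entropy(trial_spectra, observed_fid, noise_sigma):
    # Among trial spectra whose implied FID matches the data to within the noise
    # (a least-squares / chi-squared test), return the one with maximum entropy.
    best, best_s = None, -np.inf
    n = len(observed_fid)
    for spec in trial_spectra:
        model_fid = np.fft.ifft(spec)                       # crude spectrum -> FID model
        chi2 = np.sum(np.abs(model_fid - observed_fid) ** 2) / noise_sigma ** 2
        if chi2 <= n:                                       # "reasonably good fit"
            s = spectral_entropy(spec)
            if s > best_s:
                best, best_s = spec, s
    return best
```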

The solution of a protein crystal structure can still be a lengthy process, even when crystals are available, because of the phase problem. In contrast, small molecule (< 100 atoms) structures can be solved routinely by direct methods. In the early fifties it was shown that certain mathematical relationships exist between the phases and the amplitudes of the structure factors if it is assumed that the electron density is positive and atoms are resolved [255]. These mathematical methods have been developed [256,257] so that it is possible to solve a small molecule structure directly from the intensity data [258]. For example, the crystal structure of gramicidin S [259] (a cyclic polypeptide of 10 amino acids, 92 atoms) has been solved using the computer programme MULTAN. Traditional direct methods are not applicable to protein structures, partly because the diffraction data seldom extend to atomic resolution. Recently, a new method derived from information theory and based on the maximum entropy (minimum information) principle has been developed. In the immediate future the application will require an approximate starting phase set. However, the method has the potential for an ab initio structure determination from the measured intensities and a very small sub-set of starting phases, once the formidable problems in providing numerical methods for the solution of the fundamental equations have been solved. [Pg.406]

The maximum entropy method achieves a remarkable universality and unification based on "common sense reduced to calculation". It has been applied to information theory, statistical mechanics, image processing in radio astronomy, and now to X-ray crystallography. The prospects for a computational solution to the phase problem in protein crystallography appear promising and developments in the field are awaited eagerly. [Pg.408]

Coifman, R. R., and Wickerhauser, M. V., Entropy-based algorithms for best basis selection, IEEE Trans. Inform. Theory 38(2), 713-718 (1992). [Pg.164]

The effect of various assumed entropies (based on the frequencies in table 7.2) and activation energies on the slopes of the k(E) curves for the dissociation of the bromobenzene ion is shown in figure 7.3 (Baer et al., 1991). While the bromobenzene ion has no barrier in the dissociation channel, it is here treated with a vibrator TS. It is evident that two parameters can generate a whole family of k(E) curves. Thus, if neither the activation energy nor the transition-state structure is known, any set of data can be fit with RRKM theory. The lower the TS frequencies, the steeper the slope. However, if either the activation energy is known from other information, or the frequencies are known from calculations, then the RRKM equation reduces to a one-parameter model in which either the magnitude of the rate or the slope can be adjusted, but not both. [Pg.218]

Another related measure is the Shannon entropy [7], which is best known for its use in information theory. Direct application is to the probability distribution of discrete particles in the system. This approach is not as widely used as the above variance-based methods, but it provides... [Pg.2268]

Much of the current work on biomedical image registration utilises voxel similarity measures, in particular mutual information based on the Shannon definition of entropy. The mutual information (MI) concept comes from information theory, measuring the dependence between two variables or, in other words, the amount of information that one variable contains about the other. The mutual information measures the relationship between two random variables, i.e. intensity values in two images: if the two variables are independent, MI is equal to 0; if one variable provides some information about the second one, MI becomes > 0. The MI is related to the image entropy by ... [Pg.82]
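A minimal sketch of estimating MI from a joint intensity histogram of two images, using the standard relation MI = H(A) + H(B) - H(A, B); the bin count and function names are illustrative assumptions.

```python
import numpy as np

def mutual_information(img_a, img_b, bins=32):
    # MI = H(A) + H(B) - H(A, B), estimated from a joint intensity histogram.
    joint, _, _ = np.histogram2d(img_a.ravel(), img_b.ravel(), bins=bins)
    p_ab = joint / joint.sum()
    p_a = p_ab.sum(axis=1)
    p_b = p_ab.sum(axis=0)

    def H(p):
        p = p[p > 0]
        return -np.sum(p * np.log2(p))

    return float(H(p_a) + H(p_b) - H(p_ab.ravel()))

# Independent images give MI ~ 0; a registration search would move one image
# relative to the other and keep the transform that maximises this value.
```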

Probabilistic reliability of systems of events is analysed by Ziba (2000). The analysis is based on the concepts of entropy as defined in information theory and applied to probability theory. The recommended approach allows the system analysis to concentrate only on important failure modes, and connects uncertainty, redundancy and robustness of systems of events. Even so, the system analysis leads to complicated computations. [Pg.1742]

Information Theory [23-26] demonstrates that an equiprobable, nonconstrained alphabet using n total characters has the maximum possible entropy (H, i.e. information content) per character: H = log2(n). If twenty different amino acid residues are used, then n = 20 and H = log2(20) = 4.32. Since the logarithm is taken to base two, the entropy units can be considered as binary bits. If the polypeptide is N amino acid residues long, then at least (4.32 x N) total bits are required to encode the entire... [Pg.216]
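A short worked example of this count, with an assumed (illustrative) chain length of N = 100:

```python
import math

n = 20                 # size of the amino acid alphabet
H = math.log2(n)       # maximum entropy per residue, equiprobable and unconstrained
N = 100                # illustrative chain length
print(round(H, 2))     # 4.32 bits per residue
print(round(H * N))    # ~432 bits to encode the whole sequence
```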

Generally, many different tree structures are possible. Shorter trees are usually preferred since they express simpler classification rules which are easier to understand. The cleverness of the decision tree approach is that information theory is used to select an optimal order of the attributes to produce a shallow tree. It is common to base the attribute selection criterion on entropy. In communication theory, entropy is related to the expected information conveyed by a message. The entropy of a message, I(m), depends on the probability of the message. [Pg.1521]
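A minimal sketch of this entropy-based attribute selection, in the spirit of ID3-style decision trees; the data layout (rows as dictionaries of discrete attribute values) and the helper names are assumptions for illustration.

```python
import math
from collections import Counter

def entropy(labels):
    # Expected information (in bits) carried by the class labels at a node.
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def information_gain(rows, labels, attribute):
    # Reduction in entropy obtained by splitting the node on one discrete attribute.
    groups = {}
    for row, y in zip(rows, labels):
        groups.setdefault(row[attribute], []).append(y)
    remainder = sum(len(ys) / len(labels) * entropy(ys) for ys in groups.values())
    return entropy(labels) - remainder

# ID3-style choice: split on the attribute with the largest gain, which tends to
# yield the shallow, easy-to-read trees described above.
# best_attribute = max(attributes, key=lambda a: information_gain(rows, labels, a))
```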

Maximum Entropy Method (MEM). MEM is based on the information-theoretic idea that the information content of a physical system is related to its entropy in a logarithmic formalism. The inferred probability distribution of the system is the one having maximum entropy, which corresponds to making the minimum assumptions about the system. In a PCS application, reconstruction of q(Γ) is based on the Shannon-Jaynes entropy model. The most probable solution of q(Γ) or q(Γi) will be the one that maximizes the entropy function. In its discrete form, the function is... [Pg.252]
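Since the discrete expression itself is not reproduced above, the sketch below assumes one commonly used Shannon-Jaynes form, S = -sum_i q_i ln(q_i / m_i) with a prior m; the function name and the uniform default prior are illustrative assumptions.

```python
import numpy as np

def shannon_jaynes_entropy(q, m=None):
    # One common discrete Shannon-Jaynes form: S = -sum_i q_i ln(q_i / m_i),
    # with m_i a prior model (uniform by default).
    q = np.asarray(q, dtype=float)
    q = q / q.sum()
    m = np.full_like(q, 1.0 / q.size) if m is None else np.asarray(m, dtype=float)
    mask = q > 0
    return float(-np.sum(q[mask] * np.log(q[mask] / m[mask])))

# A MEM reconstruction would search for the discrete q(Gamma_i) that maximises
# this entropy while still reproducing the measured correlation data within the noise.
```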

Note that the logarithm in Eq. [1] is taken to base 2. Although this amounts to a simple scaling factor, it is a convention adopted in information theory so that entropy can be considered equivalent to the number of bifurcating (binary) choices made in the distribution of the data. In other words, using base 2 allows us to address this question: how many yes/no decisions do we need to make, for data counts to fall into specific channels or bins, in order to reproduce the observed data distribution? The higher the information content, the more numerous are the decisions required to place each data point. [Pg.266]
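For example, if the counts were spread uniformly over eight bins, the entropy would be log2(8) = 3 bits: three successive yes/no choices suffice to place each count in its bin, while a less uniform distribution requires fewer decisions on average.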

Information theory has been applied to the F + HD reaction. An impressive feat was the prediction of a large difference in the rotational entropy deficiency for the HF and DF channels, based upon the known ratio (= 1.45) and the difference in vibrational entropy deficiencies,... [Pg.183]

In Chapter 1 we introduced thermodynamics as the central macroscopic physical theory that allows us to deal with thermophysical phenomena in confined fluids. However, as we mentioned at the outset, thermodynamics as such does not permit us to draw any quantitative conclusions about a specific physical system without taking recourse to additional sources of information such as experimental data or (empirical) equations of state based on these data. Instead, thermodynamics makes rigorous statements about the relations among its key quantities such as temperature, internal energy, entropy, heat, and work. It does not permit one to calculate any numbers for these quantities. [Pg.35]

Considerable effort has been expended in the attempt to develop a general theory of reaction rates through some extension of thermodynamics or statistical mechanics. Since neither of these sciences can, by itself, yield any information about rates of reactions, some additional assumptions or postulates must be introduced. An important method of treating systems that are not in equilibrium has acquired the title of irreversible thermodynamics. Irreversible thermodynamics can be applied to those systems that are not too far from equilibrium. The theory is based on the thermodynamic principle that in every irreversible process, that is, in every process proceeding at a finite rate, entropy is created. This principle is used together with the fact that the entropy of an isolated system is a maximum at equilibrium, and with the principle of microscopic reversibility. The additional assumption involved is that systems that are slightly removed from equilibrium may be described statistically in much the same way as systems in equilibrium. [Pg.853]

