Hash coding

Over and beyond the representations of chemical structures presented so far, there are others for specific applications. Some of the representations discussed in this section, e.g., fragment coding or hash coding, can also be seen as structure descriptors, but this is a more philosophical question. Structure descriptors are introduced in Chapter 8. [Pg.70]

Hash coding is an established method in computer science, e.g., in registration procedures [94, 95]. In chemoinformatics the structure input occurs as a sequence of characters (names) or numbers (which may also be obtained, e.g., from a connection table (see Section 2.4) by conversion of a structure drawing). Both names and numbers may be quite long and may not be usable as an address... [Pg.72]

The ciphered code has a defined length, i.e., a fixed bit/byte length. A hash code of 32 bits can take 2^32 (or 4 294 967 296) possible values, whereas one of 64 bits can take 2^64 values. However, because of the fixed length, several different data entries may be assigned the same hash code ("address collision"). The probability of collision rises as the number of input data increases in relation to the range of values (bit length). In fact, if collisions in databases are to be avoided, the limits of hash coding are reached with about 10 000 compounds with 32 bits and over 100 million with 64 bits [97]. ... [Pg.73]
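The dependence of the collision risk on the number of entries and the bit length can be estimated with the standard birthday approximation. The following is a minimal sketch and is not taken from the cited source: it assumes an idealized hash function that distributes codes uniformly, and the function name `collision_probability` is chosen here for illustration only.

```python
import math


def collision_probability(n_items: int, bits: int) -> float:
    """Approximate chance that at least two of n_items share a hash code.

    Assumes an idealized, uniformly distributed hash of the given bit
    length and uses the birthday approximation
    p ~ 1 - exp(-n * (n - 1) / (2 * 2**bits)).
    """
    space = 2 ** bits
    exponent = -n_items * (n_items - 1) / (2 * space)
    # expm1 keeps precision when the probability is very small.
    return -math.expm1(exponent)


if __name__ == "__main__":
    for n, b in [(10_000, 32), (100_000_000, 64)]:
        print(f"{n} entries, {b} bits: p = {collision_probability(n, b):.2e}")
```

Under this assumption, the risk of at least one collision is on the order of one percent for 10 000 entries at 32 bits and well below one percent for 100 million entries at 64 bits.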

Thus the hash code is not used as a direct way to access data; rather, it serves as an index or key to the filed data entry (Figure 2-66). Since hash coding obtains unique codes by reducing multidimensional data to only one dimension, information gets lost. This loss prevents a reconstruction of the complete data from the hash code. [Pg.74]
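The use of a hash code as a key to the filed entry, rather than as the data itself, might be sketched as follows. This is an illustration under stated assumptions, not the scheme of Figure 2-66: the class `HashIndex`, the helper `hash_code`, and the use of a truncated SHA-256 digest are all hypothetical choices. Because the code cannot be converted back into the input, each bucket keeps the original key so that entries with colliding codes can still be told apart.

```python
import hashlib
from collections import defaultdict


def hash_code(text: str, bits: int = 32) -> int:
    """Reduce a string to a fixed-length integer code (bits <= 64).

    Hypothetical helper: truncates a SHA-256 digest to the requested length.
    """
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") >> (64 - bits)


class HashIndex:
    """Maps hash codes to filed entries; collisions share one bucket."""

    def __init__(self, bits: int = 32):
        self.bits = bits
        self.buckets = defaultdict(list)

    def add(self, key: str, record: object) -> None:
        # The full key is stored alongside the record because the hash
        # code alone does not allow the input to be reconstructed.
        self.buckets[hash_code(key, self.bits)].append((key, record))

    def get(self, key: str):
        # Only the bucket selected by the hash code is inspected.
        for stored_key, record in self.buckets.get(hash_code(key, self.bits), []):
            if stored_key == key:
                return record
        return None
```

With a deliberately short code, e.g. `HashIndex(bits=4)`, many keys fall into the same bucket, yet lookups still return the correct entry because the stored key is compared as well.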

The hash code has specific characteristics and provides numerous possibilities for use ... [Pg.75]

Pre-computed hash codes of molecules are suitable for use in full structure searches in database applications. The compression of the code of a chemical structure into only one number also makes it possible to compute in advance the transformation results for a whole catalog. The files can be stored and kept complete in the core memory during execution of the program, so that a search can be accomplished within seconds. [Pg.75]
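A full structure search over pre-computed codes could look roughly like the sketch below. It makes several assumptions that are not in the source: every structure is already available as a canonical string (plain SMILES-like strings are used purely as placeholders), the helper names `hash_code`, `build_table`, and `contains` are hypothetical, and a truncated SHA-256 digest stands in for a chemistry-aware hash code.

```python
import bisect
import hashlib


def hash_code(structure: str) -> int:
    """64-bit code of a canonical structure string (hypothetical helper)."""
    digest = hashlib.sha256(structure.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def build_table(catalog):
    """Pre-compute the codes once and keep them sorted in memory."""
    return sorted((hash_code(s), s) for s in catalog)


def contains(table, query: str) -> bool:
    """Binary search on the code, then confirm by exact comparison."""
    code = hash_code(query)
    i = bisect.bisect_left(table, (code, ""))
    while i < len(table) and table[i][0] == code:
        if table[i][1] == query:
            return True
        i += 1
    return False


if __name__ == "__main__":
    table = build_table(["CCO", "c1ccccc1", "CC(=O)O"])
    print(contains(table, "CCO"), contains(table, "CCN"))
```

The sketch only matches identical strings, so a different spelling of the same structure (for example "OCC" instead of "CCO") is not found. In practice the canonicalization step that precedes hashing carries that burden.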

Atomic and bond hash codes are helpful in structure manipulation programs, e.g., in reaction prediction or in synthesis design [99]. [Pg.75]

The problem of the perception of complete structures is related to the problem of their representation, for which the basic requirements are to represent as much as possible of the functionality of the structure, to be unique, and to allow the restoration of the structure. Various approaches have been devised to this end. They comprise the use of molecular formulas, molecular weights, trade and/or trivial names, various line notations, registry numbers, constitutional diagrams (2D representations), atom coordinates (2D or 3D representations), topological indices, hash codes, and others (see Chapter 2). [Pg.292]

Extract the next layer, decrypt the contents with the public key provided by the author, and check the contents. This is the hash code for the document generated when it was originally signed. [Pg.212]

Ihlenfeldt, W. D., Gasteiger, J. Hash codes for the identification and classification of molecular structure elements. J. Comput. Chem. 1994, 15, 793-813. [Pg.460]

For all nitrogen configurations
    Calculate hash code
    If not duplicate hash code
        Write isomer
        Save hash code... [Pg.277]
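The duplicate check in this loop can be sketched in runnable form. Everything chemical here is a placeholder assumption, since the excerpt does not say how configurations are generated or hashed: configurations are modelled as strings of "0"/"1" labels, a configuration and its reverse are treated as the same isomer, and the names `canonical`, `hash_code`, and `unique_isomers` are hypothetical.

```python
import hashlib
from itertools import product


def canonical(config):
    """Hypothetical symmetry rule: a configuration equals its reverse."""
    return min(config, config[::-1])


def hash_code(config) -> int:
    """64-bit code computed from the canonical form of a configuration."""
    text = "".join(canonical(config))
    return int.from_bytes(hashlib.sha256(text.encode("utf-8")).digest()[:8], "big")


def unique_isomers(n_centers: int):
    """Enumerate all configurations and keep one per distinct hash code."""
    seen = set()       # hash codes saved so far
    isomers = []       # stands in for "write isomer"
    for config in product("01", repeat=n_centers):
        code = hash_code(config)
        if code not in seen:
            isomers.append("".join(config))
            seen.add(code)
    return isomers


if __name__ == "__main__":
    print(unique_isomers(2))
```

Storing only the codes keeps the `seen` set small. The price is that two different isomers with colliding codes would be treated as duplicates, which is why a sufficiently long code matters.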

Hash codes for the identification and classification of molecular structure... [Pg.284]

An alternative methodology based on the ring content of a database, using precalculated structure-based hash codes, has been proposed (110). The comparison of the hash code tables can be used to compare two databases, and the number of distinct ring-system combinations can be used as an indicator of database diversity. A method for diversity assessment called the saturation diversity approach, based on picking as many mutually dissimilar compounds as possible from a database, was also proposed. The methods were used to compare a number of public databases and gave similar results. [Pg.223]

Ihlenfeldt, W.D. and Gasteiger, J. (1994). Hash Codes for the Identification and Classification of Molecular Structure Elements. J. Comput. Chem., 15, 793-813. [Pg.588]

Hash coding (hashing) is a method of converting data into a small unique representation that serves as a digital fingerprint of the data. [Pg.356]
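The fingerprint idea can be illustrated with a general-purpose hash function from the Python standard library. This is only a sketch: SHA-256 is an assumption chosen for convenience, the name `fingerprint` is hypothetical, and "unique" holds only in the practical sense, since two different inputs can in principle share a code.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Return a fixed-length hexadecimal fingerprint of arbitrary data."""
    return hashlib.sha256(data).hexdigest()


if __name__ == "__main__":
    short = fingerprint(b"CCO")
    long_ = fingerprint(b"C" * 10_000)
    # Both fingerprints have the same length regardless of input size.
    print(len(short), len(long_))
```

The same input always yields the same fingerprint, while even a one-character change in the input produces a different one.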
