320 - canonization connection table

As a compound is represented by a square matrix with as many rows and columns as it has atoms, such a representation clearly cannot be used for input purposes; it is rather an internal representation of structures and reactions. Furthermore, as the numbering of the atoms in a chemical compound is a priori arbitrary, several connectivity tables can be derived from a given compound. Thus, special algorithms are required to obtain a canonical connectivity table. For example, the Chemical Abstracts Service uses an algorithm devised by Morgan [234]. [Pg.320]

In contrast to canonical linear notations and connection tables (see Sections 2.3 and 2.4), fragment codes are ambiguous. Several different structures could all possess an identical fragment code, because the code does not describe how the fragments are interconnected. Moreover, it is not always evident to the user whether all possible fragments of the structures are at all accessible. Thus, the fragments more or less characterise a class of molecules; this is also important in generic structures that arise in chemical patents (see Section 2.7.1)... [Pg.71]

Algorithm I - Registration - Canonicalization of Connection Tables. A connection table for a chemical substance with n atoms can be numbered in as many as n! different ways. The problem of generating a canonical form involves selecting a... [Pg.142]
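To make the size of the numbering problem concrete, the following is a minimal sketch that simply tries every numbering and keeps the lexicographically smallest bond list. This brute-force search is an illustration only, not the algorithm of the excerpt above; the function names are hypothetical, and atom types and bond orders are ignored for brevity.

```python
from itertools import permutations
from math import factorial

def renumber(bonds, perm):
    """Apply a numbering (perm[old] = new) and return the bonds as a sorted tuple."""
    return tuple(sorted(tuple(sorted((perm[a], perm[b]))) for a, b in bonds))

def brute_force_canonical(n_atoms, bonds):
    """Try all n! numberings and keep the lexicographically smallest bond list.

    Illustrative only: the cost grows as n!, which is why practical
    canonicalization algorithms avoid exhaustive enumeration.
    """
    return min(renumber(bonds, p) for p in permutations(range(n_atoms)))

# Two arbitrary numberings of the same four-atom chain.
chain_a = [(0, 1), (1, 2), (2, 3)]
chain_b = [(2, 0), (0, 3), (3, 1)]

print(factorial(4))                       # number of possible numberings
print(brute_force_canonical(4, chain_a))
print(brute_force_canonical(4, chain_b))  # same canonical form as chain_a
```

Both inputs collapse to the same bond list, which is the property a canonical form must have regardless of how it is computed.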

Such SRSGs are labelled graphs which can be easily recorded using connectivity tables based on some arbitrary (or canonical) numbering of their atoms (vertices). Such numbers are represented in the above examples by encircled Greek letters. Allowing one byte for an atom symbol, six bits for an SRSG atom number and four bits for an... [Pg.373]

When we proceed from the kth to the (k + 1)th canonical value and there is no decrease in the number of equivalent atoms, the calculation of the canonical values terminates. All atoms are then numbered in the order of the value of the first 10 columns of the connection table. Whenever these are equal, the final canonical values are consulted to determine which atom precedes which. If the final canonical values are the same, we have good grounds to suspect that the atoms are equivalent, but they must meet two more requirements to be judged equivalent ... [Pg.140]

Calculate the initial canonical values for all the atoms by copying the first 32 bits of each connection table row and shifting right two bits to eliminate the chirality information. [Pg.145]
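The step above can be sketched as a mask and a shift. This is a minimal sketch under assumptions that the excerpt does not state: that the first 32 bits of a row are available as an unsigned integer, and that the chirality information occupies its two least significant bits. The function name is hypothetical.

```python
def initial_canonical_value(first_word: int) -> int:
    """Drop the two low-order bits (assumed to hold chirality) from a 32-bit word."""
    word32 = first_word & 0xFFFFFFFF   # keep only 32 bits
    return word32 >> 2                 # shift right two bits

# Two rows that differ only in the assumed chirality bits
row_r = 0b101101
row_s = 0b101110
print(initial_canonical_value(row_r) == initial_canonical_value(row_s))
```

Because the chirality bits are discarded, atoms that differ only in chirality start from the same initial canonical value.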

Sort the rows of the connection table and the stereorelation table according to the precedence determined first by the initial canonical values and in case where the initial canonical values are equal, by the final canonical values. In this process prepare a translation table such that the ith entry of the table is the new number of old atom number i. [Pg.145]
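A minimal sketch of the sorting step and its translation table follows. The ascending sort direction, the 0-based indices and the function name are assumptions made for illustration; the excerpt numbers atoms from 1 and does not state the direction of precedence.

```python
def sort_with_translation(initial, final):
    """Order atoms by (initial, final) canonical value and build a translation table.

    Returns (order, translation), where order[new] = old atom index and
    translation[old] = new atom index.
    """
    n = len(initial)
    # Primary key: initial canonical value; tie-break: final canonical value.
    order = sorted(range(n), key=lambda i: (initial[i], final[i]))
    translation = [0] * n
    for new, old in enumerate(order):
        translation[old] = new
    return order, translation

initial_values = [5, 1, 3, 5]
final_values = [2, 4, 9, 1]
order, translation = sort_with_translation(initial_values, final_values)
print(order)        # rows of the table in their new sequence
print(translation)  # translation[i] is the new number of old atom i
```

The same translation table can then be applied to every atom reference inside the connection table and the stereorelation table, so that both stay consistent with the new numbering.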

The first solution uses some algorithm that transforms any connection table of a molecule into a unique, canonical form. The best known of these, the Morgan algorithm, chooses the numbering based on the numbers and properties of the neighbors of each atom of the structure. It is the basis of the Chemical Abstracts Service Chemical Registry System. There is also a canonicalization scheme for the SMILES notation of a chemical structure. ... [Pg.220]
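The core idea of neighbor-based ranking can be sketched as follows. This is a simplified illustration of the extended-connectivity step commonly associated with the Morgan algorithm, not the Registry System implementation: it ignores atom and bond types, and it stops as soon as the number of distinct values no longer increases. The function name is hypothetical.

```python
def extended_connectivity(neighbors):
    """Iteratively sum neighbor values until the number of distinct values stops growing.

    neighbors: dict mapping each atom to a list of its bonded atoms.
    Returns the values from the last iteration that increased discrimination.
    """
    values = {atom: len(nbrs) for atom, nbrs in neighbors.items()}  # start from degrees
    n_classes = len(set(values.values()))
    while True:
        new_values = {atom: sum(values[b] for b in nbrs)
                      for atom, nbrs in neighbors.items()}
        new_classes = len(set(new_values.values()))
        if new_classes <= n_classes:   # no further discrimination: stop
            return values
        values, n_classes = new_values, new_classes

# Five-atom chain 0-1-2-3-4
pentane = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
print(extended_connectivity(pentane))
```

In this example the degrees alone give two classes (terminal and internal atoms); one round of summation separates the central atom from its neighbors, and a further round gains nothing. Atoms that still share a value need an additional tie-breaking rule before a unique numbering is obtained.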

Regardless of whether a canonical description of a complete structure is available at the time of redrawing, it is not useful because the acyclic portions destroy the symmetry that would otherwise be present between different ring atoms. The first step, therefore, is to create a scratch structure to which only the ring system is copied. Heteroelements are converted to carbon, charges and all stereochemistry are removed, and all bond orders are set to 1. Other types of connection table may require further reduction. The second step is to canonicalize the reduced connection table, either with the same algorithm used for regular molecules, or with a specialized version that is more efficient with reduced connection tables. [Pg.367]
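The first step described above can be sketched as follows. The data layout (dictionaries of atoms and bonds) and the function name are hypothetical, and ring perception is approximated by repeatedly pruning atoms with at most one bond; that simple pruning also keeps any chain linking two rings, which a real implementation might treat differently.

```python
def reduce_to_ring_skeleton(atoms, bonds):
    """Copy only cyclic atoms into a scratch structure with uniform atoms and bonds.

    atoms: dict atom_id -> dict with keys such as "element", "charge", "stereo"
    bonds: dict (a, b) -> bond order
    """
    keep = set(atoms)
    pairs = {frozenset(pair) for pair in bonds}
    changed = True
    while changed:                       # strip acyclic appendages layer by layer
        changed = False
        for atom in list(keep):
            degree = sum(1 for pair in pairs if atom in pair)
            if degree <= 1:
                keep.discard(atom)
                pairs = {pair for pair in pairs if atom not in pair}
                changed = True
    # Every remaining atom becomes neutral carbon without stereo; every bond becomes single.
    reduced_atoms = {a: {"element": "C", "charge": 0} for a in sorted(keep)}
    reduced_bonds = {tuple(sorted(pair)): 1 for pair in pairs}
    return reduced_atoms, reduced_bonds

# Six-membered ring with a charged heteroatom (0) carrying one substituent (6)
atoms = {i: {"element": "C", "charge": 0, "stereo": None} for i in range(6)}
atoms[0] = {"element": "N", "charge": 1, "stereo": None}
atoms[6] = {"element": "O", "charge": -1, "stereo": None}
bonds = {(0, 1): 2, (1, 2): 1, (2, 3): 2, (3, 4): 1, (4, 5): 2, (5, 0): 1, (0, 6): 1}

ring_atoms, ring_bonds = reduce_to_ring_skeleton(atoms, bonds)
print(sorted(ring_atoms), sorted(ring_bonds))
```

After the reduction all six ring atoms are indistinguishable, so a canonicalization run on the scratch structure can expose the full ring symmetry.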

Figure 10 Imaginary transition structure of the Diels-Alder reaction of cyclopentadiene. The canonical numbering of the ITS gives the connection table from Table 1. Adapted with permission from Ref. 23. Copyright 1988, American Chemical Society...

Figure 4 59-05-2 node numbering in canonical form of a connection table... [Pg.281]

Systematic names of compounds have been used in many systems to identify and describe their chemical structures. Until recently, however, those names were mere identifiers of the structures and could not be decoded into connection tables. Stereochemical naming schemes have used the Cahn-Ingold-Prelog (CIP) priority system extensively. As it associates a label (R, S, r, s, 0, or E, Z, e, z, see structures 3-5 for some examples) with an atom or bond, it has been used to code stereochemical information as node and edge attributes in standard connection tables. The labels were manually assigned and could be used in canonical numbering methods because their values did not depend on the particular numbering. [Pg.2728]

The problem of canonically numbering a connection table with stereochemical features can be considered solved. However, there are problems akin to the representation issues treated by step I of the unique naming algorithm. They are caused by incomplete description of stereochemistry. This may be due to lack of knowledge or because a mixture is being described. [Pg.2734]

One purpose of canonical numbering is the construction of a unique name of a compound. The canonically numbered connection table representation is uniquely defined. However, it usually contains a lot of redundancy and perceived information. The final unique name is normally a compressed form of the connection table which contains just enough information to reconstruct the connection table in its canonical form. Examples of such codes are the SEMA (stereochemically extended Morgan algorithm) name and canonical SMILES. [Pg.2735]

An encoding of a molecular structure based on its connection table, which represents only the constitution of a molecule without consideration of atom coordinates, bond lengths, or bond angles. See Canonical Numbering and Constitutional Symmetry; Graph Theory in Chemistry; and Structure Representation. [Pg.3055]

Figure 5 Protein-RNA interactions of aaRSs. The cloverleaf secondary structure of tRNA folds into an L-shaped tertiary molecule. The tRNA can bind in an aminoacylation complex, where the 3′ end is located in the canonical Class I or Class II core as shown in the upper right for the P. horikoshii LeuRS-tRNA aminoacylation complex. In aaRSs that edit, a second complex can be formed, where the 3′ end interacts with a separate domain such as the connective polypeptide insertion (CPI) that contains a hydrolytic active site as shown in the lower right for the T. thermophilus LeuRS-tRNA editing complex. (Table (1) PDB files 1WZ2 and 2BYT).

On the other hand, groups that have a multiple-bonded electronegative atom directly connected to an unsaturated system are -M groups. In such cases, we can draw canonical forms in which electrons have been taken from the unsaturated system into the group, as in nitrobenzene, 1. Table 9.1 contains a list of some +M and -M groups. [Pg.396]

The first structure determined for a sialyltransferase (CstII from Campylobacter jejuni of family GT42, see Fig. 8c) revealed an unusual variant of the GT-A fold [350]. The protein displays a similar type of fold as the canonical GT-A fold, but with some differences in the connectivity of β-strands (parallel β-sheet of topology 8712456) and it has no DxD motif. Therefore the CstII structure represents a new type of fold. Another prokaryotic sialyltransferase has, though, a GT-B fold (see Table 2). [Pg.2294]

Big Chemical Encyclopedia
