Big Chemical Encyclopedia

Chemical substances, components, reactions, process design ...


SMILES atomic coordinates

In the following, we will discuss two-dimensional (2D)-to-3D conversion in this context. However, it should be emphasized that we do so only for the sake of brevity. In reality, none of the conversion programs utilizes information from a 2D image of a chemical structure. Only the information on the atoms of a molecule and how they are connected is used (i.e., the starting information is the constitution of the molecule). One could even refer to linear structure representations such as SMILES as one-dimensional. However, this is not accurate, since SMILES allows for branches and ring closures, which make its information content essentially 2D. Thus, all structure representations that lack 3D atomic coordinates will in the following simply be referred to as 2D. [Pg.159]
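To illustrate that a SMILES string carries only constitution (atoms and how they are connected), the following minimal Python sketch parses a small SMILES subset into an atom list and a bond list. The function name `smiles_to_graph` and the supported subset (upper-case organic atoms, `-`, `=`, `#`, parentheses, single-digit ring closures) are assumptions made for this illustration; it is not the parser of any program mentioned here.

```python
def smiles_to_graph(smiles):
    """Parse a tiny SMILES subset into (atoms, bonds).

    atoms: element symbols in SMILES order.
    bonds: (atom_index_a, atom_index_b, bond_order) tuples.
    Aromatic (lower-case) atoms, brackets, charges and '.' are NOT handled.
    """
    atoms, bonds = [], []
    prev = None          # atom the next atom will be bonded to
    branch_stack = []    # saved `prev` for each open '('
    ring_open = {}       # ring digit -> (atom index, bond order at opening)
    pending = 1          # bond order announced by the last bond symbol
    orders = {"-": 1, "=": 2, "#": 3}
    i = 0
    while i < len(smiles):
        ch = smiles[i]
        two = smiles[i:i + 2]
        if ch in orders:
            pending = orders[ch]
        elif ch == "(":
            branch_stack.append(prev)
        elif ch == ")":
            prev = branch_stack.pop()
        elif ch.isdigit():
            if ch in ring_open:
                # Second occurrence of the digit closes the ring.
                other, open_order = ring_open.pop(ch)
                bonds.append((other, prev, max(open_order, pending)))
            else:
                ring_open[ch] = (prev, pending)
            pending = 1
        elif two in ("Cl", "Br") or ch in "BCNOSPFI":
            symbol = two if two in ("Cl", "Br") else ch
            atoms.append(symbol)
            index = len(atoms) - 1
            if prev is not None:
                bonds.append((prev, index, pending))
            prev = index
            pending = 1
            i += len(symbol) - 1   # skip second letter of Cl / Br
        else:
            raise ValueError(f"unsupported SMILES character: {ch!r}")
        i += 1
    return atoms, bonds


# Acetic acid: one branch, no coordinates anywhere in the result.
atoms, bonds = smiles_to_graph("CC(=O)O")
# atoms == ['C', 'C', 'O', 'O']
# bonds == [(0, 1, 1), (1, 2, 2), (1, 3, 1)]
```

The result is a connection table with no geometric information at all, which is why such representations are grouped with 2D structures in the text above.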

Similar to SMILES, InChI does not store atom coordinates. In contrast to SMILES, which by default omits hydrogen atoms that are then added implicitly to match the most common valency of an atom, InChI stores hydrogen atoms but does not store bond orders. These two techniques are just different approaches to the same problem: for a given molecular skeleton, the bond orders and number of hydrogen atoms... [Pg.86]
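The implicit-hydrogen rule for SMILES can be sketched in a few lines of Python. The valence table below follows the "organic subset" convention as described in the OpenSMILES specification (lowest normal valence that accommodates the explicit bonds); the names `NORMAL_VALENCES` and `implicit_hydrogens` are hypothetical and chosen only for this illustration.

```python
# Normal valences of the SMILES organic subset (assumption: OpenSMILES convention).
NORMAL_VALENCES = {
    "B": (3,), "C": (4,), "N": (3, 5), "O": (2,),
    "P": (3, 5), "S": (2, 4, 6),
    "F": (1,), "Cl": (1,), "Br": (1,), "I": (1,),
}


def implicit_hydrogens(symbol, bond_order_sum):
    """Number of hydrogens implied for an unbracketed atom.

    bond_order_sum is the sum of the orders of the atom's explicit bonds.
    The smallest normal valence that is >= bond_order_sum is filled up
    with hydrogens; if none fits, no hydrogens are added.
    """
    for valence in NORMAL_VALENCES[symbol]:
        if valence >= bond_order_sum:
            return valence - bond_order_sum
    return 0


# Ethanol, "CCO": explicit bond-order sums are 1, 2 and 1.
h_counts = [implicit_hydrogens(s, n) for s, n in (("C", 1), ("C", 2), ("O", 1))]
# h_counts == [3, 2, 1], i.e. CH3-CH2-OH
```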

Babel [http://openbabel.sourceforge.net/]), it is important to keep in mind the limitations of each format. For example, conversion from SMILES, which does not store atom coordinates, into Molfile would result either in the molecule having undefined geometry or the geometry being supplied by the conversion tool, which might be completely different from the real one. [Pg.93]

Another choice for the internal representation of molecular structure is a molfile. It would be possible to construct SQL functions like those described in this chapter that would operate on this type of data. One disadvantage of molfiles is their greater size compared with SMILES. One advantage is that it is possible to store atomic coordinates, which is not possible with SMILES. There are other molecular file formats, but these are substantially the same as a molfile, except perhaps for specific atom types that may be of use in some database applications. [Pg.84]

The recommendation here is to use SMILES to store the molecular structure itself. If other features of the molecule or atoms need to be stored, other data types and columns can be added to the row describing the molecule. It is the "SQL way" not to encode a lot of information into one data type. When a molfile is used as the structural data type, too much data is encoded in a single data type. The individual data items must be parsed and validated. Errors creep into the data due to missing, extra, or invalid portions of the molfile. Ways of storing atomic coordinates, atom types, and molecular properties are discussed in Chapter 11. [Pg.84]

Each row in the coordtest table represents a molecule. The smiles column is a string of atom symbols and bonds, and the coord column is an array of atom coordinates. How is it possible to keep the ordering of atoms in the smiles string in sync with the ordering of atom coordinates in the coord array? When the coordinates are initially entered from the external source, they are likely to be in a common chemical file format. The program that converts from that file format to SMILES would have to output the atom coordinates in the same order as the atoms in the SMILES. [Pg.116]
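The ordering requirement can be sketched as a simple permutation. The sketch assumes the converter reports, for each atom it writes into the SMILES, the index of that atom in the source file; the function name `coords_in_smiles_order`, the argument `smiles_atom_order`, and the coordinates are all made up for this illustration.

```python
def coords_in_smiles_order(file_coords, smiles_atom_order):
    """Reorder coordinates from file order into SMILES atom order.

    file_coords:        list of (x, y, z) tuples in source-file order.
    smiles_atom_order:  smiles_atom_order[k] is the 0-based file index of
                        the k-th atom written in the SMILES string.
    """
    if sorted(smiles_atom_order) != list(range(len(file_coords))):
        raise ValueError("atom order must list every file atom exactly once")
    return [file_coords[i] for i in smiles_atom_order]


# Heavy atoms of ethanol as they might appear in a file: O, CH2 carbon, CH3 carbon.
file_coords = [(2.3, 0.4, 0.0), (1.2, -0.5, 0.0), (0.0, 0.3, 0.0)]
# The SMILES "CCO" lists CH3 first, then CH2, then O.
coord = coords_in_smiles_order(file_coords, [2, 1, 0])
# coord[0] now belongs to the first atom of "CCO", and so on.
```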

In a molecular structure file, an atom record typically contains all of the information about that atom: the atomic number or symbol, the charge, coordinates, etc. When such a file is parsed into a SMILES string and an array of coordinates, it is important to be able to associate the proper coordinate with the proper atom. The use of canonical SMILES ensures this. Because canonical SMILES defines a unique order of the atoms in a molecule, that order is used to store the coordinates. Later sections of this chapter will discuss ways in which atomic coordinates might be stored in columns of a table. [Pg.125]

The column structure.id is a unique integer relating the structure, sdf and property tables. The sdf.molfile column contains the molfile for each structure as defined by the vendor. The structure.name and structure.cansmiles columns contain the name and canonical SMILES parsed and computed from the molfile. The structure.coord column will contain an array of atomic coordinates. The structure.atom column will contain an array of atom numbers from the file in canonical order to correspond to the atom order in the canonical SMILES. The OpenBabel/plpythonu extension functions molfile_mol and molfile_properties will be used to parse the vendor SDF molfiles and populate these tables. The molfile column of the sdf table is first populated from the SDF file, using the following Perl script. [Pg.126]
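The Perl script itself is not reproduced in this excerpt. As a stand-in, the following Python sketch shows one way to split an SD file into the molfile portion of each record, relying on the SD file convention that records end with a `$$$$` line and the molfile part ends with `M  END`. The function names are hypothetical, and the database insert step is left out.

```python
def _molfile_part(record_lines):
    """Return the lines up to and including 'M  END' joined as one string.

    If no 'M  END' line is found, the whole record is returned unchanged.
    """
    for i, line in enumerate(record_lines):
        if line.rstrip() == "M  END":
            return "\n".join(record_lines[: i + 1])
    return "\n".join(record_lines)


def split_sdf(sdf_text):
    """Split SD file text into a list of molfile strings, one per record.

    Data items that follow 'M  END' (the '> <name>' blocks) are dropped.
    """
    molfiles = []
    record = []
    for line in sdf_text.splitlines():
        if line.strip() == "$$$$":       # record delimiter
            if record:
                molfiles.append(_molfile_part(record))
            record = []
        else:
            record.append(line)
    # Tolerate a final record that lacks the trailing delimiter.
    if any(line.strip() for line in record):
        molfiles.append(_molfile_part(record))
    return molfiles
```

Each returned string could then be inserted into a column such as sdf.molfile with a parameterized INSERT statement.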

Babel (we tested Version 1.6) is a utility for converting computational chemistry input files from one format to another. It is able to interconvert about 50 different file formats, including conversions between SMILES, Cartesian coordinate, and Z-matrix input. The algorithm that generates a Z-matrix from Cartesian coordinates is fairly simplistic, so the Z-matrix will correctly represent the geometry, but will not include symmetry, dummy atoms, and the like. Babel can be run with command line options or in a menu-driven mode. There have been some third-party graphic interfaces created for Babel. [Pg.352]

The first dimension refers to the atom number and the second dimension refers to the Cartesian coordinates of the atom. So, the following SQL would select the smiles and the coordinates of the first two atoms of 1-methylthymine. [Pg.115]
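The SQL statement is not included in this excerpt. The same two-dimensional layout can be mimicked in Python with a nested list, as in the sketch below. The coordinates are made up, and the helper `atom_slice` is hypothetical; it uses 1-based, inclusive atom numbers to mirror the way SQL arrays are commonly indexed (PostgreSQL arrays start at 1 by default).

```python
# One [x, y, z] row per atom: first index = atom, second index = coordinate.
# Values are invented for illustration only.
coord = [
    [0.00, 0.00, 0.00],
    [1.40, 0.00, 0.00],
    [2.10, 1.21, 0.00],
]


def atom_slice(coord, first, last):
    """Return the rows for atoms first..last (1-based, inclusive)."""
    return coord[first - 1:last]


first_two = atom_slice(coord, 1, 2)   # coordinates of atoms 1 and 2
x_of_atom_1 = coord[0][0]             # x coordinate of the first atom
```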

The prediction of three-dimensional chemical structure from a list of atoms in a molecule and their connectivity is a good example of a chemical problem that may be solved by an expert system. We have already seen (Fig. 9.2) how the SMILES interpreter can construct a two-dimensional representation of a structure from its one-dimensional representation as a SMILES string. The CONCORD program (CONnection table to CoORDinates) takes a SMILES string and, very rapidly, produces a three-dimensional model of an input molecule. This system is a hybrid between an expert system and a molecular mechanics program, molecular mechanics being the method by which molecular structures are minimized in most molecular modelling systems. The procedure operates as follows. [Pg.203]

The basis of ALADDIN is the Daylight Chemical Information Systems software, particularly GENIE, a substructure specification language. When GENIE finds a query substructure in an input SMILES structure, it can return to the user those atoms in the structure that correspond to those hit. Since in a MENTHOR database the coordinates of the atoms are stored in the order in which they occur in the SMILES for that molecule, the coordinates of the atoms of interest are thereby identified. Thus, our geometric objects are established from this set of atoms, and geometric tests are performed on them. Steric tests are performed on molecules that meet the geometric criteria. [Pg.243]
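The benefit of storing coordinates in SMILES atom order can be sketched as follows: the atom indices returned by a substructure match index directly into the coordinate array, so a geometric test needs no further lookup. The function names, the hit list, and the coordinates below are hypothetical; this is not the GENIE or ALADDIN interface.

```python
import math


def hit_coordinates(coords, hit_atoms):
    """Pick the coordinates of the matched atoms.

    coords:    list of (x, y, z) tuples stored in SMILES atom order.
    hit_atoms: 0-based SMILES atom indices returned by a substructure match.
    """
    return [coords[i] for i in hit_atoms]


def distance_in_range(coords, atom_a, atom_b, low, high):
    """Geometric test: is the a-b distance within [low, high]?"""
    return low <= math.dist(coords[atom_a], coords[atom_b]) <= high


# Invented coordinates for a four-atom molecule, in SMILES order.
coords = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (2.2, 1.3, 0.0), (3.7, 1.3, 0.0)]
hits = [0, 3]                      # atoms matched by some query substructure
points = hit_coordinates(coords, hits)
passes = distance_in_range(coords, hits[0], hits[1], 3.5, 4.5)
```

Only molecules that pass such geometric tests would then go on to the more expensive steric tests.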

De novo generation. If there are no preexisting coordinates, the process is called de novo generation. This is the most common case, arising in a number of situations, in particular chemical name translation, isomer enumeration, translation from a linear notation such as SMILES, atom label expansion. [Pg.313]

De novo. Ignores initial coordinates, and all preservation flags are off. Useful for atom label expansion, SMILES translation, and 3-D -> 2-D structure conversion. [Pg.322]







© 2024 chempedia.info