Big Chemical Encyclopedia

Chemical substances, components, reactions, process design ...


Input and Output of Molecular Structures

As with all data in an RDBMS, molecular structures have both an external and an internal representation. This was discussed in an earlier chapter for standard data types, such as text and numeric. For molecular structures, there is of course no SQL standard. When building a database containing molecular structures, the first decision is which internal representation and which external representation to use. [Pg.83]

This chapter focused primarily on SMILES and canonical SMILES. It is feasible and common to use SMILES as the internal representation of molecular structure. Using the SQL functions described in this chapter, [Pg.83]

Another choice for the internal representation of molecular structure is a molfile. It would be possible to construct SQL functions like those described in this chapter that would operate on this type of data. One disadvantage of molfiles is their greater size compared with SMILES. One advantage is that it is possible to store atomic coordinates, which is not possible with SMILES. There are other molecular file formats, but these are substantially the same as a molfile, except perhaps for specific atom types that may be of use in some database applications. [Pg.84]
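To make the size and coordinate trade-off concrete, the following minimal Python sketch builds a small V2000-style molfile for water and reads the atom block back out by fixed-width columns. The helper names (`make_atom_line`, `parse_atom_block`) and the coordinates are illustrative assumptions, not part of the source text, and the parser handles only the counts line and atom block of a simple V2000 file.

```python
def make_atom_line(x, y, z, symbol):
    # V2000 atom line: three 10.4 coordinate fields, a space, a 3-char symbol,
    # then fixed-width fields that are left at zero in this sketch.
    return f"{x:10.4f}{y:10.4f}{z:10.4f} {symbol:<3} 0  0  0  0  0  0  0  0  0  0  0  0"


def parse_atom_block(molfile_text):
    """Return a list of (symbol, x, y, z) tuples from a V2000 molfile."""
    lines = molfile_text.splitlines()
    counts_line = lines[3]              # line 4 holds the atom and bond counts
    n_atoms = int(counts_line[0:3])
    atoms = []
    for line in lines[4:4 + n_atoms]:
        x = float(line[0:10])
        y = float(line[10:20])
        z = float(line[20:30])
        symbol = line[31:34].strip()
        atoms.append((symbol, x, y, z))
    return atoms


# Illustrative geometry for water (coordinates are assumed example values).
atom_lines = [
    make_atom_line(0.0, 0.0, 0.0, "O"),
    make_atom_line(0.9572, 0.0, 0.0, "H"),
    make_atom_line(-0.2400, 0.9266, 0.0, "H"),
]
bond_lines = ["  1  2  1  0", "  1  3  1  0"]
molfile_text = "\n".join(
    ["water", "", "", f"{len(atom_lines):3d}{len(bond_lines):3d}  0  0  0  0  0  0  0  0999 V2000"]
    + atom_lines
    + bond_lines
    + ["M  END"]
)

atoms = parse_atom_block(molfile_text)
smiles = "O"  # SMILES for water; carries no coordinates

print(atoms)
print(len(molfile_text), "characters as molfile vs", len(smiles), "as SMILES")
```

The parsed tuples carry the coordinates that a SMILES string cannot hold, at the cost of a much longer record.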

The recommendation here is to use SMILES to store the molecular structure itself. If other features of the molecule or atoms need to be stored, other data types and columns can be added to the row describing the molecule. It is the SQL way not to encode a lot of information into one data type. When a molfile is used as the structural data type, too much data is encoded in a single data type. The individual data items must be parsed and validated, and errors creep into the data through missing, extra, or invalid portions of the molfile. Ways of storing atomic coordinates, atom types, and molecular properties are discussed in Chapter 11. [Pg.84]
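One way to picture this recommendation is a schema in which the SMILES column holds only the structure and everything else lives in its own column or table. The sketch below uses Python's built-in `sqlite3` module purely for illustration; the table and column names (`molecule`, `atom_coordinate`, `mol_weight`) are hypothetical, the molecular weights are rounded example values, and the SMILES strings are assumed to have been canonicalized elsewhere before insertion.

```python
import sqlite3


def build_demo_db():
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE molecule (
            id         INTEGER PRIMARY KEY,
            name       TEXT NOT NULL,
            smiles     TEXT NOT NULL UNIQUE,  -- structure only; assumed canonical
            mol_weight REAL                   -- a property kept in its own column
        );
        CREATE TABLE atom_coordinate (
            molecule_id INTEGER NOT NULL REFERENCES molecule(id),
            atom_index  INTEGER NOT NULL,     -- position of the atom in the SMILES
            x REAL, y REAL, z REAL,
            PRIMARY KEY (molecule_id, atom_index)
        );
        """
    )
    conn.executemany(
        "INSERT INTO molecule (id, name, smiles, mol_weight) VALUES (?, ?, ?, ?)",
        [(1, "ethanol", "CCO", 46.07), (2, "water", "O", 18.02)],
    )
    # Coordinates are stored per atom instead of being packed into the structure.
    conn.execute(
        "INSERT INTO atom_coordinate VALUES (?, ?, ?, ?, ?)",
        (2, 1, 0.0, 0.0, 0.0),
    )
    conn.commit()
    return conn


conn = build_demo_db()
row = conn.execute(
    "SELECT name, mol_weight FROM molecule WHERE smiles = ?", ("CCO",)
).fetchone()
n_coords = conn.execute("SELECT COUNT(*) FROM atom_coordinate").fetchone()[0]
print(row, n_coords)
```

Because each item sits in a typed column, the database itself can validate it, which is the point made above about not overloading a single data type.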

The external representation of molecular structure is less rigorously defined. For example, there are many programs available that can convert to and from SMILES and molfiles. These can be used when a molfile (the external representation) needs to be imported as a SMILES (the internal representation) into the database. Similarly, a SMILES can be easily exported as a SMILES or converted to a molfile or other file format. It is useful to have these conversion functions as SQL extensions. [Pg.84]


Graphical manipulations: these are important for the rapid input and output of molecular structures, and their correct 3D manipulation. [Pg.26]

To date, EROS development has concentrated very much on the system itself and its chemistry. Far less attention has been paid to the interface between EROS and user. To rectify this situation, work has begun on implementing procedures for graphical input and output of molecular structures. [Pg.71]

Tel. 412-621-2050, fax 412-621-3563, e-mail info@gaussian.com. Gaussian 92. Ab initio molecular orbital calculations (Hartree-Fock, Direct HF, Møller-Plesset, CI, Reaction Field Theory, electrostatic potential-derived charges, vibrational frequencies, etc.). Input and output of molecular structures in formats of many other molecular modeling systems. Browse for archival storage of computed results. VAX, Cray, DEC-RISC (Ultrix), Fujitsu (UXP/M), Kubota, IBM RS/6000, Multiflow, Silicon Graphics, Sun, and other versions. Gaussian 90 for Convex, FPS-500, Fujitsu (MSP), IBM (VM, MVS), HP-700, and NEC SX/3 systems. [Pg.241]

As was said in the introduction (Section 2.1), chemical structures are the universal and the most natural language of chemists, but not for computers. Computers work with bits packed into words or bytes, and they perceive neither atoms nor bonds. On the other hand, human beings do not cope with bits very well. Instead of thinking in terms of 0 and 1, chemists try to build models of the world of molecules. The models are conceptually quite simple: 2D plots of molecular structures or projections of 3D structures onto a plane. The problem is how to transfer these models to computers and how to make computers understand them. This communication must somehow be handled by widely understood input and output processes. The chemist's way of thinking about structures must be translated into the computer's internal machine representation through one or more intermediate steps or representations (see Figure 2-23, The input/output processes defined... [Pg.42]

To demonstrate the use of binary substructure descriptors and Tanimoto indices for cluster analysis of chemical structures, we consider the 20 standard amino acids (Figure 6.3) and characterize each molecular structure by eight binary variables describing the presence or absence of eight substructures (Figure 6.4). Note that in most practical applications (for instance, evaluation of results from searches in structure databases) more diverse molecular structures have to be handled, and usually several hundred different substructures are considered. Table 6.1 contains the binary substructure descriptors (variables), with value 0 if the substructure is absent and 1 if the substructure is present in the amino acid; these numbers form the A-matrix. Binary substructure descriptors have been calculated by the software SubMat (Scsibrany and Varmuza 2004), which requires as input the molecular structures in one file and the substructures in another file. All structures are in Molfile format (Gasteiger and Engel 2003); the output is an ASCII file with the binary descriptors. [Pg.270]
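As a small illustration of how a Tanimoto index is computed from binary substructure descriptors, the Python sketch below compares two eight-bit vectors. The vectors are made-up examples and are not taken from Table 6.1; the function name `tanimoto` and the choice to return 0.0 when both vectors are all zeros are assumptions of this sketch.

```python
def tanimoto(a, b):
    """Tanimoto index of two equal-length binary descriptor vectors.

    T = c / (n_a + n_b - c), where c counts substructures present in both
    molecules and n_a, n_b count those present in each molecule.
    """
    if len(a) != len(b):
        raise ValueError("descriptor vectors must have the same length")
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    if either == 0:
        return 0.0  # assumed convention when neither molecule has any substructure
    return both / either


# Hypothetical descriptors: 1 = substructure present, 0 = absent.
mol_a = [1, 1, 0, 0, 1, 0, 0, 0]
mol_b = [1, 1, 0, 1, 0, 0, 0, 0]

similarity = tanimoto(mol_a, mol_b)
distance = 1.0 - similarity  # a common dissimilarity for cluster analysis
print(similarity, distance)  # 0.5 0.5
```

Applying this function to every pair of rows in a descriptor matrix gives the pairwise values that a cluster analysis would start from.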

Measures such as the number of compounds synthesized and the number of patents issued have been criticized on the grounds that they are measures of R&D activity (input) rather than of output.(11) Novelty of molecular structure represents a technically difficult assessment which, if performed at the time of synthesis, involves molecules with unknown pharmacologic and therapeutic properties. Novelty of pharmacologic action represents a fundamental measure of at least the potential for therapeutic innovation. In practice, however, this represents a judgmental issue, and the necessary data on untested or unmarketed drugs would be difficult to obtain. [Pg.134]

Internal coordinate systems include normal coordinates, which are symmetry adapted and used in spectroscopy, and coordinate systems based on interatomic distances ("bond lengths"), three-center angles ("valence angles") and four-center angles ("torsion angles"). In the latter case a Z-matrix of the form shown in Table 3.1 defines the structure of a molecule. The input and output files of nearly all molecular mechanics programs are in cartesian coordinates. [Pg.41]
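The relationship between the two descriptions can be sketched by computing the three kinds of internal coordinates from cartesian positions. The Python example below is a minimal illustration using only the standard library; the function names and the four test points are assumptions of this sketch, and it does not reproduce the Z-matrix layout of Table 3.1.

```python
import math


def sub(p, q):
    return tuple(pi - qi for pi, qi in zip(p, q))


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def cross(u, v):
    return (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )


def norm(u):
    return math.sqrt(dot(u, u))


def bond_length(a, b):
    """Interatomic distance between points a and b."""
    return norm(sub(a, b))


def valence_angle(a, b, c):
    """Three-center angle a-b-c in degrees, with b at the vertex."""
    u, v = sub(a, b), sub(c, b)
    cos_theta = dot(u, v) / (norm(u) * norm(v))
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_theta))))


def torsion_angle(a, b, c, d):
    """Signed four-center angle a-b-c-d in degrees, in the range (-180, 180]."""
    b1, b2, b3 = sub(b, a), sub(c, b), sub(d, c)
    n1, n2 = cross(b1, b2), cross(b2, b3)   # normals of the two planes
    y = dot(b2, cross(n1, n2)) / norm(b2)
    x = dot(n1, n2)
    return math.degrees(math.atan2(y, x))


# Four illustrative points (not a real molecule).
a, b, c, d = (1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.0, 1.0, 1.0)

length_ab = bond_length(a, b)
angle_abc = valence_angle(a, b, c)
torsion_abcd = torsion_angle(a, b, c, d)
print(length_ab, angle_abc, torsion_abcd)
```

A Z-matrix stores one such length, angle, and torsion per atom relative to previously defined atoms, whereas a cartesian file stores the three coordinates directly.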

Searching of the database can be based on molecular formula, journal name, year, author, property, and a number of other key words. The database can also be searched according to a two-dimensional representation of the molecule of interest or even by a substructure. In recent years many advances have been made in the input and output interfaces to the database. In Version 4 a graphical input/output interface was introduced, which made the setting up and analysis of searches much easier. A similarity searching technique was also introduced, which allowed matches to structures close to but not exactly the same as the query molecule. In Version 5 three-dimensional matching was introduced, thereby allowing particular patterns to be examined both within and between molecules. Clearly, this is a powerful search facility which increases considerably the value of the information contained within the database. [Pg.130]



© 2024 chempedia.info