Big Chemical Encyclopedia

Chemical substances, components, reactions, process design ...

Standardization SMILES

The use of Simplified Molecular Input Line Entry System (SMILES) as a string representation of chemical structure makes possible much of what has been discussed in earlier chapters of this book. A chemical reaction could be represented as a collection of SMILES, some identified as reactants and some as products. It is possible to define a table to do this, or perhaps to use arrays of character data types, but a syntax extension of standard SMILES allows reactions to be expressed easily. SMIRKS is an extension of SMILES and SMiles ARbitrary Target Specification (SMARTS). It is used to represent chemical transformations. SMIRKS can also be used in a transformation function to combine SMILES reactants to produce SMILES products. [Pg.99]

Reaction SMILES is an extension to standard SMILES used to represent a specific reaction. It uses punctuation to distinguish reactants from products. For example, the reaction SMILES CC(=O)O.CN>>CC(=O)NC.O represents the reaction of acetic acid with methylamine to form N-methylacetamide plus water. As with standard SMILES, explicit H atoms are typically not shown, although they may be. For example, the same reaction can be represented as [CH3]C(=O)[OH].[CH3][NH2]>>[CH3]C(=O)[NH][CH3].[H]O[H]. The >> punctuation separates reactants from products, and the period separates individual reactants or products from each other. There are no rules in reaction SMILES that enforce correct reaction stoichiometry or other aspects of actual chemical reactions. [Pg.99]
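
The punctuation rules above can be sketched in a few lines of Python. This is a minimal illustration, not code from the excerpted book: the function name split_reaction_smiles is hypothetical, and the middle "agents" field between the two > characters follows the Daylight reaction SMILES convention, which the excerpt does not discuss. No chemistry is checked, in keeping with the remark that reaction SMILES enforces no stoichiometry.

```python
def split_reaction_smiles(rxn: str) -> dict:
    """Split a reaction SMILES into reactant, agent, and product SMILES lists.

    Hypothetical helper: it only splits on punctuation and does not
    validate the individual SMILES or the reaction's stoichiometry.
    """
    parts = rxn.strip().split(">")
    if len(parts) != 3:
        raise ValueError(f"expected two '>' separators, got {len(parts) - 1}")
    # Each field is a period-separated list; an empty field yields an empty list.
    reactants, agents, products = (
        [s for s in part.split(".") if s] for part in parts
    )
    return {"reactants": reactants, "agents": agents, "products": products}


rxn = "CC(=O)O.CN>>CC(=O)NC.O"
print(split_reaction_smiles(rxn))
```

One caveat: a period can also join disconnected parts of a single species (a salt, for instance), so this sketch would report such a species as two separate components.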

Reaction SMILES can be used to store and search chemical reactions using the same functions described earlier for standard SMILES. For example, cansmiles('CCC(=O)O.NC>>CC(=O)NC.O') returns... [Pg.99]

When the button "submit smiles" is pressed, the SMILES string is sent to the Molsoft web server and converted to 3D, and the 3D structure is displayed in a Java molecule viewer on an automatically created web page (see Figure 2-139). Unfortunately, the Molsoft server does not support downloading of the 3D structures in a standard file format. [Pg.158]

The text/plain example above demonstrates that HTTP networks can support distributed information systems when given appropriate languages, that is, languages that describe abstractions appropriate to that information system. Many other standard MIME types are useful. Most are very specific: for example, image/gif is a specific format for bitmapped images, application/pdf is a page description format, and application/tar is a 4.3 BSD archive. Some describe more general abstractions, for example, application/xml. Private (unauthorized) MIME types are also available, for example, chemical/x-pdb and chemical/x-smiles. [Pg.250]

Preparation of virtual screening databases starts with standardization of the input SMILES. This procedure was originally developed to deal with databases from commercial suppliers. Preferred tautomeric forms are generated in this step and ionized species are neutralized. Ionization states are set in the second step for biased equilibria and multiple forms are enumerated in a third step to represent balanced equilibria. The model treats an equilibrium as balanced if the equilibrium constant associated with its defining rule is likely to be less than about 1.5 log units. [Pg.281]

Although we have made use of SD files up to this point, at this stage we switch to SMILES files (19). This becomes necessary because even for small libraries the file size for a fully enumerated set can be quite large. For example, a sample library of just 2500 compounds resulted in a 4.85 MB SD file, while the SMILES file was only 384 KB. The one caveat with the SMILES format is that there is no standard for handling data fields. Our solution was to reformat the SD file type data field tags into the SMILES file,... [Pg.81]

Standardized and consistent representations of stereoisomers and stereoisomeric mixtures are similarly important for the unique representations of distinct compounds. Recent file formats such as SDF v3000 and ChemAxon Extended SMILES provide clear definition and representation of complex relative and absolute stereochemical configurations. In practice these are not widely used because many commercially available files are represented by established v2000 or SMILES formats and also because HTS compounds are mostly relatively simple low molecular weight structures. [Pg.241]

JME molecular editor: JME is a free Java applet which allows generation and editing of molecules and reactions, and creation of molecule SMILES. The JME applet, written by Peter Ertl from Novartis, has become a standard for molecule structure input on the internet. http://www.molinspiration.com/jme/index.html... [Pg.265]

SMILES (Simplified Molecular Input Line Entry System) was invented by Weininger [5] to facilitate the representation and manipulation of molecular structures using computers. It uses standard atomic symbols to represent atoms and the symbols - for single bond, = for double bond, and # for triple bond. Hydrogen atoms can be represented explicitly but are almost always represented implicitly using normal conventions of valence bond theory. Single bonds need not be explicitly written. For example, propane is C-C-C or simply CCC. Methylamine is CN, and C#N is hydrogen cyanide. Propene is C=CC. For more complex structures with branched bonds, parentheses are used. For example, CC(C)O is isopropyl alcohol, whereas CCCO is propanol. [Pg.72]

Notice that there are several ways in which SMILES could be written for the same structure, even the simplest ones. For example, hydrogen cyanide can be written as C#N or N#C, and propene is either C=CC or CC=C. More complex structures can have three or many more SMILES that represent the same structure. If there were one standard way to write SMILES, then standard SQL text comparisons could be used to locate any particular structure. SMILES would become a uniquely spelled "name" for each unique structure. Canonical SMILES does just that. Using rules about which atoms should come before other atoms in the spelling of each SMILES, a unique name for each molecular structure can be provided [6]... [Pg.72]

Operators such as + and ||, and functions such as sqrt, round, and upper, can be used with these data types. SQL has the ability to search data, using operators such as =, <, and the like. The goal of the SQL extensions is to enable SMILES to be handled as readily as any standard data type. This requires that SQL be extended to validate and standardize, or canonicalize, SMARTS. In addition, these SQL extensions provide functions and operators to allow comparisons and searches of molecular structures stored as SMILES. [Pg.73]

It is possible to use SMIRKS transformations to modify SMILES to conform to a standard valence model. For example, if a SMILES for nitromethane is entered in the charge-separated form C[N+](=O)[O-], it can be transformed to the other form CN(=O)=O. Chapter 9 discusses transformations and gives examples that will help resolve issues with structures that can be represented equally well using two distinct valence forms. [Pg.80]

The standard SQL data type Text has been used to store SMILES. This is appropriate because every SMILES is a valid text string. But not every text string is a valid SMILES. Without additional information about SMILES, the RDBMS cannot enforce any rules about which text strings ought to be in a column intended to contain SMILES. [Pg.86]

Using a domain like this, the smiles data type behaves much like a standard data type. When one attempts to insert an invalid number into a numeric column, an SQL error is reported and the value is not inserted. This fundamental behavior of an RDBMS is readily extended to SMILES using a domain. [Pg.86]
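
The reject-on-insert behavior described above can be approximated with Python's built-in sqlite3 module. This is only a sketch under stated assumptions: SQLite has no CREATE DOMAIN, so a CHECK constraint on the column stands in for the domain; the table name structure and the function looks_like_smiles are hypothetical; and the check itself is a toy (allowed characters plus balanced parentheses and brackets), not a real SMILES parser.

```python
import sqlite3
import string

# Characters this toy check accepts; a real validator would parse the SMILES.
ALLOWED = set(string.ascii_letters + string.digits + "()[]=#+-.@/\\%:*$")


def looks_like_smiles(text):
    """Toy validity test: allowed characters, balanced () and []. Returns 1 or 0."""
    if not isinstance(text, str) or not text:
        return 0
    if any(ch not in ALLOWED for ch in text):
        return 0
    depth = 0            # nesting depth of branch parentheses
    in_bracket = False   # inside a [...] atom expression
    for ch in text:
        if ch == "[":
            if in_bracket:
                return 0
            in_bracket = True
        elif ch == "]":
            if not in_bracket:
                return 0
            in_bracket = False
        elif ch == "(" and not in_bracket:
            depth += 1
        elif ch == ")" and not in_bracket:
            depth -= 1
            if depth < 0:
                return 0
    return int(depth == 0 and not in_bracket)


conn = sqlite3.connect(":memory:")
# Make sure application-defined functions may be used inside the schema.
conn.execute("PRAGMA trusted_schema = ON")
# SQLite only accepts deterministic functions in CHECK constraints.
conn.create_function("looks_like_smiles", 1, looks_like_smiles, deterministic=True)
conn.execute(
    "CREATE TABLE structure ("
    " id INTEGER PRIMARY KEY,"
    " smiles TEXT NOT NULL CHECK (looks_like_smiles(smiles)))"
)

conn.execute("INSERT INTO structure (smiles) VALUES (?)", ("CC(C)O",))
try:
    conn.execute("INSERT INTO structure (smiles) VALUES (?)", ("CC(C",))
    rejected = False
except sqlite3.IntegrityError:
    rejected = True  # the unbalanced string never reaches the table

stored = [row[0] for row in conn.execute("SELECT smiles FROM structure")]
print(rejected, stored)
```

As with an invalid number in a numeric column, the bad value raises an error and is not stored; the user is still responsible for correcting it and retrying.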

Suppose it is decided that the valence 5, noncharge-separated representation of the nitro group is to be used throughout the database. The SMIRKS [O:2]=[N+:1][O-:3]>>[O:2]=[N+0:1]=[O+0:3], when applied to any charge-separated nitro group, will transform it into the proper form. This is accomplished by creating another new SQL function, xform(smiles, smarts). As with the cansmiles and matches functions, this is an extension to standard SQL. Some form of this transformation function is... [Pg.102]

Table 9.1 Example SMIRKS for SMILES Standardization Name SMIRKS...

Examples of the is_std_smiles and make_std_smiles functions are not shown here because neither of these approaches is ideal. In the first case, using a check constraint, the nonstandard SMILES would not be inserted, but the user would still be responsible for standardizing the SMILES and attempting the insert again. The second case, using a function, is better, but it would still be possible to accidentally insert a SMILES directly without the make_std_smiles function. [Pg.104]

The standardize function checks whether any SMIRKS in the std_smirks table, when used in the xform function, results in a modification of the input SMILES stored in NEW.smiles. If the xform function does return a transformed SMILES, then that transformed value is used in place of the value the user attempted to insert. Finally, the trigger is created using the standardize function to possibly modify any SMILES before inserting into or updating a table. [Pg.104]
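
The control flow of such a standardize function can be sketched in plain Python. Everything here is an assumption made for illustration: STD_RULES is a hypothetical stand-in for the std_smirks table, and this xform does a literal substring rewrite rather than a true SMIRKS transformation, which would need a cheminformatics toolkit. Only the loop structure (try each rule, keep the transformed value if one is returned) mirrors the description above.

```python
# Hypothetical stand-in for the std_smirks table: (name, pattern, replacement).
# Real rules would be SMIRKS; these are literal strings, for illustration only.
STD_RULES = [
    ("nitro", "[N+](=O)[O-]", "N(=O)=O"),
]


def xform(smiles, pattern, replacement):
    """Toy xform: literal substring rewrite; returns None when nothing matches."""
    if pattern not in smiles:
        return None
    return smiles.replace(pattern, replacement)


def standardize(smiles, rules=STD_RULES):
    """Apply each rule in turn, keeping the transformed SMILES when a rule fires."""
    for _name, pattern, replacement in rules:
        transformed = xform(smiles, pattern, replacement)
        if transformed is not None:
            smiles = transformed
    return smiles


print(standardize("C[N+](=O)[O-]"))  # charge-separated nitromethane
print(standardize("CCO"))            # no rule applies; returned unchanged
```

The literal match is deliberately naive: it would miss the same nitro group spelled in another atom order, such as [O-][N+](=O)C. Handling every spelling is what a structure-aware SMIRKS transformation provides, and in a database the function would be called from a trigger rather than directly.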





© 2024 chempedia.info