Big Chemical Encyclopedia

Chemical substances, components, reactions, process design ...


SMILES notation described

Stereochemistry can also be expressed in the SMILES notation [113]. Depending on the clockwise or anti-clockwise ordering of the atoms, the stereocenter is specified in the SMILES code with @@ or @, respectively (Figure 2-78). The atoms around this stereocenter are then assigned by the sequence of the atom symbols following the identifier @ or @@. This means that, reading the SMILES code from the left, the three atoms behind the identifier (@ or @@) describe the stereochemistry of the stereocenter. The sequence of these three atoms depends only on the order of writing and is independent of the priorities of the atoms. [Pg.84]
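Because the identifier depends only on the written order of the neighbors, reordering them may require switching between @ and @@. The following is a minimal sketch of that bookkeeping, not code from the cited text. It assumes the stereocenter has four distinct neighbors, listed in the order they appear in the SMILES string (the preceding atom first, then the three atoms behind the identifier). The function names `permutation_is_odd` and `retag` are hypothetical.

```python
from itertools import combinations

def permutation_is_odd(old_order, new_order):
    """Return True if new_order is an odd permutation of old_order."""
    position = {atom: i for i, atom in enumerate(old_order)}
    indices = [position[atom] for atom in new_order]
    # Count pairs that appear in the opposite relative order (inversions).
    inversions = sum(1 for i, j in combinations(range(len(indices)), 2)
                     if indices[i] > indices[j])
    return inversions % 2 == 1

def retag(tag, old_order, new_order):
    """Return the identifier that keeps the same stereocenter after reordering."""
    if tag not in ("@", "@@"):
        raise ValueError("tag must be '@' or '@@'")
    if len(set(old_order)) != len(old_order) or sorted(old_order) != sorted(new_order):
        raise ValueError("both orders must list the same distinct neighbors")
    if permutation_is_odd(old_order, new_order):
        return "@@" if tag == "@" else "@"
    return tag

# L-alanine written as N[C@@H](C)C(=O)O lists the neighbors N, H, CH3, COOH.
old = ["N", "H", "CH3", "COOH"]
new = ["N", "H", "COOH", "CH3"]   # methyl and carboxyl swapped
print(retag("@@", old, new))       # prints @  (i.e. N[C@H](C(=O)O)C)
```

Swapping any two neighbors flips the identifier, while a cyclic rotation of the last three leaves it unchanged. The labels such as "CH3" are placeholders for the neighbor atoms, not SMILES fragments.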

The solubility is expressed as the logarithm of the molar fraction, log(S). A recommended partition of the data into training and test sets is also taken from the mentioned paper. Six outliers described in Huanxiang et al. (2005) were removed from consideration. SMILES notations of the organic solvents for this study were obtained with ACD/ChemSketch software (http://www.acdlabs.com/) according to CAS numbers from the US National Library of Medicine (http://toxnet.nlm.nih.gov/). [Pg.341]

The method described by Meylan et al. (1992) has also been encoded in a computer program, PCKOCWIN. After the user enters the structure of the chemical of interest represented in SMILES notation (Anderson et al., 1987; Weininger, 1988), the program automatically calculates y and determines the appropriate fragments and correction factors. [Pg.177]

The software now uses structurally intrinsic parameters for only one QSAR model (LSER), and the results are used to predict one property (acute toxicity) to four aquatic species by one mechanism (nonreactive, non-polar narcosis); however, we intend to continue to refine our equations as databases grow, incorporate other models, predict other properties, and include other organisms. We will attempt to differentiate between modes of toxic action and improve our estimates accordingly. For the widely divergent classes of chemicals and types of environmental behavior, no one model will best describe every situation and no one species is the optimal organism to monitor. As the software evolves, the expert system should choose the best model based on the contaminant, the species, and the property to be predicted (e.g., toxicity or bioaccumulation). In addition, we envision an interactive screen system for data entry that will bypass the SMILES notation and allow the user to describe the molecule by posing a series of questions about the compound's backbone and functional groups. The responses will translate directly into values of LSER variables. [Pg.110]

The SMILES notation is a means by which certain chemical structures can be described using a series of simple letters and numbers expressed in linear fashion, even for complex cyclic structures. This approach is particularly useful as input for computer models when chemical names and CASRN are unknown. As mentioned above, SMILES is an important tool in hazard and exposure modeling used in EPA's voluntary Sustainable Futures Program [1]. It can also be used to identify substances under REACH, and examples are shown in the nomenclature Technical Guidance Document (TGD) along with molecular and structural formulas. [Pg.28]

For many computer tasks and for the transfer of structural information from one computer program to another, a linear representation of the chemical structure may be more suitable. A popular linear representation is the SMILES notation. Part of its appeal is that for acyclic structures the SMILES is similar to the traditional linear diagram. For example, ethane is denoted by CC and ethylene by C=C. Examples of additional SMILES are given in Figure 4. SMILES is the basis of a chemical information system, and this notation provides a convenient framework for the more sophisticated computer coding of chemistry described below. For some internal computer functions, structures encoded in a linear notation may be converted to connection tables. [Pg.218]
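The conversion from a linear notation to a connection table can be sketched for a small subset of SMILES. The example below is an illustrative assumption, not the algorithm of any program mentioned here. It handles only uppercase organic-subset atoms, the bond symbols -, = and #, parenthesized branches, and single-digit ring closures. Aromatic (lowercase) atoms, bracket atoms, charges, and stereochemistry are rejected, and implicit hydrogens are not listed. The name `connection_table` is hypothetical.

```python
import re

# Two-letter halogens must be tried before one-letter atoms.
TOKEN = re.compile(r"Cl|Br|[BCNOPSFI]|[-=#]|[()]|\d")
BOND_ORDER = {"-": 1, "=": 2, "#": 3}

def connection_table(smiles):
    """Return (atoms, bonds) for a SMILES string in the supported subset.

    atoms is a list of element symbols; bonds is a list of
    (atom_index, atom_index, bond_order) tuples with 0-based indices.
    """
    tokens = TOKEN.findall(smiles)
    if "".join(tokens) != smiles:
        raise ValueError(f"unsupported SMILES syntax: {smiles!r}")
    atoms, bonds = [], []
    previous = None      # index of the atom the next atom will bond to
    order = 1            # bond order pending for the next bond
    branch_stack = []    # atoms to return to when a branch closes
    open_rings = {}      # ring digit -> (atom index, bond order)
    for tok in tokens:
        if tok in BOND_ORDER:
            order = BOND_ORDER[tok]
        elif tok == "(":
            branch_stack.append(previous)
        elif tok == ")":
            if not branch_stack:
                raise ValueError("unmatched ')'")
            previous = branch_stack.pop()
        elif tok.isdigit():
            if previous is None:
                raise ValueError("ring digit before any atom")
            if tok in open_rings:
                start, ring_order = open_rings.pop(tok)
                bonds.append((start, previous, max(order, ring_order)))
            else:
                open_rings[tok] = (previous, order)
            order = 1
        else:
            atoms.append(tok)
            index = len(atoms) - 1
            if previous is not None:
                bonds.append((previous, index, order))
            previous = index
            order = 1
    if open_rings or branch_stack:
        raise ValueError("unclosed ring or branch")
    return atoms, bonds

print(connection_table("CC"))    # ethane:   (['C', 'C'], [(0, 1, 1)])
print(connection_table("C=C"))   # ethylene: (['C', 'C'], [(0, 1, 2)])
```

The same function handles branches and rings, for example `connection_table("CC(=O)O")` for acetic acid or `connection_table("C1CCCCC1")` for cyclohexane. A production toolkit would also perceive aromaticity and assign implicit hydrogens.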

In 1986, David Weininger created the SMILES (Simplified Molecular Input Line Entry System) notation at the US Environmental Research Laboratory, USEPA, Duluth, MN, for chemical data processing. The chemical structure information is highly compressed and simplified in this notation. The flexible, easy-to-learn language describes chemical structures as a line notation [20, 21]. The SMILES language has found widespread distribution as a universal chemical nomenclature... [Pg.26]

Topological indices are used to describe some components of connectivity. A more complete description is afforded by unidimensional codes (linear line notations) such as SMILES. Connectivity plus explicit attention to valence electrons is afforded by the electrotopological indices... [Pg.6]

Computer-Aided Property Estimation. Computer-aided structure estimation requires the structure of the chemical compounds to be encoded in a computer-readable language. Computers most efficiently process linear strings of data, and hence linear notation systems were developed for chemical structure representation. Several such systems have been described in the literature. SMILES, the Simplified Molecular Input Line Entry System, by Weininger and collaborators [2-4], has found wide acceptance and is being used in the Toolkit. Here, only a brief summary of SMILES rules is given. A more detailed description, together with a tutorial and examples, is given in Appendix A. [Pg.5]

In order to calculate a physicochemical property, the structure of a molecule must be entered in some manner into an algorithm. Chemical structure notations for input of molecules into calculation software are described in Chapter 2, Section VII and may be considered as either being a 2D string, a 2D representation of the structure, or (very occasionally) a 3D representation of the structure. Of this variety of methods, the simplicity and elegance of the 2D linear molecular representation known as the Simplified Molecular Input Line Entry System (SMILES) stand out. Many of the packages that calculate physicochemical descriptors use the SMILES chemical notation system, or some variant of it, as the means of structure input. The use of SMILES is well described in Chapter 2, Section VII.B, and by Weininger (1988). There is also an excellent tutorial on the use of SMILES at www.daylight.com/dayhtml/smiles/smiles-intro.html. [Pg.45]

These identifiers were developed as an IUPAC project in 2000-2004. They are the most recent technology aimed at an unambiguous text-string representation of chemical structures. (Earlier technologies included Wiswesser line notation, which is not described here, and SMILES, described below.)... [Pg.165]

Another approach for representing 2D chemical structures is the linear notation. Linear notations are strings that represent the 2D structure as a more or less complex set of characters and symbols. Characters represent the atoms in a linear manner, whereas symbols are used to describe information about the connectivity [3]. The most commonly used notations are the Wiswesser line notation (WLN) and the simplified molecular input line entry specification (SMILES) [2]. The WLN, invented by William J. Wiswesser in 1949, was the first line notation capable of precisely describing complex molecules [4]. It consists of a series of uppercase characters (A-Z), numerals (0-9), the ampersand (&), the hyphen (-), the oblique stroke (/), and a blank space. [Pg.63]
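The WLN character set listed above is small enough to check with a single pattern. The sketch below only tests whether a string stays within that alphabet. It does not validate WLN grammar, and the name `uses_wln_alphabet` is hypothetical.

```python
import re

# Character set named in the text: A-Z, 0-9, ampersand, hyphen,
# oblique stroke, and blank space.
WLN_ALPHABET = re.compile(r"[A-Z0-9&\-/ ]+")

def uses_wln_alphabet(text):
    """Return True if every character of text belongs to the WLN alphabet."""
    return WLN_ALPHABET.fullmatch(text) is not None

print(uses_wln_alphabet("1V1"))      # WLN for acetone: True
print(uses_wln_alphabet("CC(=O)C"))  # SMILES for acetone: False
```

The SMILES string fails because parentheses and the equals sign fall outside the WLN character set, which gives a quick way to tell the two notations apart in mixed input.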

Simplified Molecular Input Line Entry Specification (SMILES) is a simplistic line notation for describing chemical structures as a set of characters, numbers, and symbols that represent atoms, bonds, and stereochemistry. [Pg.115]

Even though chemical structures are reported in many of the public domain and commercially available databases described above, they are not readily available for download as structure files, for instance in .MOL2 or structure-data format (.SDF). Nonetheless, there is software available that can convert structure names, SMILES, SMARTS, or InChI notation into molecular or structural files. This is discussed in the next section of this chapter. [Pg.240]

A different approach to the notation of chemical structures is taken by the SMILES language. It describes a structure with a concise linear notation. SMILES are created and understood by many programs and can be used to represent a chemical structure very compactly. Given a canonical numbering and suitable bonding conventions, SMILES can also be used as a structural index in database systems. [Pg.2732]
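The need for a canonical form before indexing can be illustrated with a deliberately restricted toy. The sketch below assumes unbranched, acyclic chains written with uppercase organic-subset atoms and the bond symbols -, = and #. Such a chain has only two spellings, forward and reversed, so the lexicographically smaller one can serve as a key. This is not a general canonicalization algorithm, and the names `chain_key` and `build_index` are hypothetical.

```python
import re
from collections import defaultdict

# Two-letter halogens must be tried before one-letter atoms.
TOKEN = re.compile(r"Cl|Br|[BCNOPSFI]|[-=#]")

def chain_key(smiles):
    """Return one key for both spellings of an unbranched chain."""
    tokens = TOKEN.findall(smiles)
    if "".join(tokens) != smiles:
        raise ValueError(f"not an unbranched chain in the toy subset: {smiles!r}")
    # An explicit single bond is equivalent to an omitted one, so drop it.
    tokens = [t for t in tokens if t != "-"]
    forward = "".join(tokens)
    backward = "".join(reversed(tokens))
    return min(forward, backward)

def build_index(records):
    """Group (name, smiles) records under their canonical chain key."""
    index = defaultdict(list)
    for name, smiles in records:
        index[chain_key(smiles)].append(name)
    return dict(index)

records = [("ethanol", "CCO"), ("ethanol, reversed", "OCC"), ("ethylene", "C=C")]
print(build_index(records))
# {'CCO': ['ethanol', 'ethanol, reversed'], 'C=C': ['ethylene']}
```

Without the key function, CCO and OCC would be stored as two different entries for the same compound. Real systems replace `chain_key` with a canonical numbering of the full molecular graph.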



© 2024 chempedia.info