Big Chemical Encyclopedia

Chemical substances, components, reactions, process design ...


Simplified molecular input line entry system notation

In 1986, David Weininger created the SMILES (Simplified Molecular Input Line Entry System) notation at the US Environmental Research Laboratory, USEPA, Duluth, MN, for chemical data processing. The chemical structure information is highly compressed and simplified in this notation. The flexible, easy-to-learn language describes chemical structures as a line notation [20, 21]. The SMILES language has found widespread distribution as a universal chemical nomenclature... [Pg.26]

SMILES (Simplified Molecular Input Line Entry System) is a line notation system based on principles of molecular graph theory for entering and representing molecules and reactions in computers (10-13). It uses a set of simple specification rules to derive a SMILES string for a given molecular structure (or more precisely, a molecular graph). A simplified set of rules is as follows ... [Pg.30]
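The idea of deriving a string from a molecular graph can be sketched with a depth-first walk. The following is a minimal illustration, not the rule set of the excerpt above (which is truncated in the source): it assumes an acyclic graph with single bonds only and implicit hydrogens, and the function name tree_to_smiles and its arguments are hypothetical.

```python
def tree_to_smiles(atoms, bonds, start=0):
    """Write a SMILES-like string for an acyclic, single-bonded molecular graph.

    atoms: list of element symbols, e.g. ["C", "C", "O"]
    bonds: list of (i, j) index pairs
    Assumption: no rings, no multiple bonds, no charges (illustrative subset only).
    """
    adjacency = {i: [] for i in range(len(atoms))}
    for a, b in bonds:
        adjacency[a].append(b)
        adjacency[b].append(a)

    visited = set()

    def walk(node, parent):
        if node in visited:
            raise ValueError("ring detected; this sketch handles acyclic graphs only")
        visited.add(node)
        children = [n for n in adjacency[node] if n != parent]
        text = atoms[node]
        # Every child except the last is written as a parenthesized branch.
        for child in children[:-1]:
            text += "(" + walk(child, node) + ")"
        # The last child continues the main chain.
        if children:
            text += walk(children[-1], node)
        return text

    return walk(start, None)


# Ethanol and isobutane as hydrogen-suppressed graphs
print(tree_to_smiles(["C", "C", "O"], [(0, 1), (1, 2)]))
print(tree_to_smiles(["C", "C", "C", "C"], [(0, 1), (1, 2), (1, 3)]))
```

Choosing a different start atom yields a different but equally valid string for the same graph, which is why canonicalization procedures exist for applications that need a unique SMILES per structure.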

Computer-Aided Property Estimation. Computer-aided structure estimation requires the structure of the chemical compounds to be encoded in a computer-readable language. Computers most efficiently process linear strings of data, and hence linear notation systems were developed for chemical structure representation. Several such systems have been described in the literature. SMILES, the Simplified Molecular Input Line Entry System, by Weininger and collaborators [2-4], has found wide acceptance and is being used in the Toolkit. Here, only a brief summary of SMILES rules is given. A more detailed description, together with a tutorial and examples, is given in Appendix A. [Pg.5]

The Simplified Molecular Input Line Entry System (SMILES) is frequently used for computer-aided evaluation of molecular structures [1-3]. SMILES is widely accepted and computationally efficient because it uses atomic symbols and a set of intuitive rules. Before presenting examples, the basic rules needed to enter molecular structures as SMILES notation are given. [Pg.178]
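To make the "atomic symbols plus a few rules" idea concrete, here is a small tokenizer sketch. It covers only a subset of the notation (bracket atoms, organic-subset atoms, aromatic lowercase atoms, bond symbols, branches, ring-closure labels, and the dot for disconnected parts); the pattern TOKEN_RE and the function tokenize are illustrative names, not part of any particular toolkit.

```python
import re

# Assumption: a simplified subset of SMILES, sufficient for illustration.
TOKEN_RE = re.compile(
    r"(\[[^\]]+\]"      # bracket atom, e.g. [Na+]
    r"|Br|Cl"           # two-letter organic-subset atoms (tried before one-letter ones)
    r"|[BCNOPSFI]"      # one-letter organic-subset atoms
    r"|[bcnops]"        # aromatic atoms
    r"|%\d{2}|\d"       # ring-closure labels
    r"|[-=#:/\\]"       # bond symbols
    r"|[()]"            # branch open/close
    r"|\.)"             # disconnection
)


def tokenize(smiles):
    """Split a SMILES string into tokens; raise ValueError on unrecognized input."""
    tokens = TOKEN_RE.findall(smiles)
    # findall silently skips characters it cannot match, so compare lengths back.
    if "".join(tokens) != smiles:
        raise ValueError(f"unrecognized characters in {smiles!r}")
    return tokens


print(tokenize("CC(=O)O"))      # acetic acid
print(tokenize("c1ccccc1"))     # benzene
print(tokenize("[Na+].[Cl-]"))  # sodium chloride
```

A tokenizer like this only checks the lexical level; it does not verify that branches are balanced, that ring closures are paired, or that valences are chemically sensible.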

The most commonly used identifiers today include line notation identifiers (e.g., Simplified Molecular Input Line Entry System [SMILES] and International Chemical Identifier [InChI]), tabular identifiers (e.g., Molfile and Structure-Data [SD] file types), and portable mark-up language identifiers (e.g., Chemical Markup Language [CML] and FlexMol). Each identifier has its strengths and weaknesses as detailed in Chapter 5. Chapters 5 and 6 provide enough information to guide researchers in choosing the most appropriate formats for their individual use. [Pg.14]

WLN and SMILES fragments correspond, respectively, to substrings of the Wiswesser Line Notation or Simplified Molecular Input Line Entry System strings used for encoding the chemical structures. Since simple... [Pg.5]
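As a rough illustration of fragments as substrings, the sketch below counts how often each candidate fragment occurs in a SMILES string. The excerpt is truncated, so this is not the method it goes on to describe; the function names and the example fragment list are hypothetical.

```python
def count_overlapping(text, fragment):
    """Count occurrences of fragment in text, allowing overlaps."""
    if not fragment:
        return 0
    return sum(
        1
        for i in range(len(text) - len(fragment) + 1)
        if text.startswith(fragment, i)
    )


def fragment_counts(smiles, fragments):
    """Map each candidate fragment to its substring count in a SMILES string."""
    return {fragment: count_overlapping(smiles, fragment) for fragment in fragments}


# Ethyl acetate written as CC(=O)OCC
print(fragment_counts("CC(=O)OCC", ["C(=O)O", "CC", "N"]))
```

Because the same structure can be written as many different SMILES strings, counts like these depend on how the string was generated; a consistent (for example, canonical) writing convention is needed before such counts are comparable between molecules.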

SMILES. Simplified Molecular Input Line Entry System—linear notation used in Day-... [Pg.410]

The Simplified Molecular Input Line Entry System (SMILES) is the most popular line notation in use today, and though it technically remains a proprietary product of Daylight Chemical Information Systems Inc., it has been widely implemented by other vendors. Unfortunately this has led to some divergence of dialects of the notation, especially with respect to extensions to... [Pg.167]

The simplified molecular input line entry system (SMILES) [68-71] is a compact and comfortable representation of the molecular structure from a chemical point of view. An increasing number of SMILES-based databases are gradually appearing on the internet, and thus it is interesting and important to search for suitable ways of using such a representation in QSPR-QSAR analyses. It has to be noted that the molecular graph contains details of the molecular architecture that are absent in SMILES. For instance, an extended connectivity of increasing order cannot be calculated directly from this notation. [Pg.31]
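The point about extended connectivity can be illustrated by first rebuilding the graph from the string and only then iterating over neighbors. The sketch below assumes a small SMILES subset (organic-subset atoms, branches, single-digit ring closures; bond orders are read but ignored) and a Morgan-style definition in which order 0 is the atom degree and each further order replaces a value by the sum of its neighbors' values. Both function names are hypothetical.

```python
import re

ATOM_RE = re.compile(r"Br|Cl|[BCNOPSFI]|[bcnops]")


def smiles_to_graph(smiles):
    """Parse a small SMILES subset into (atoms, adjacency list of sets)."""
    atoms, adjacency = [], []
    previous = None      # atom that the next atom will bond to
    branch_stack = []    # atoms to return to when a branch closes
    open_rings = {}      # ring label -> atom index awaiting closure
    i = 0
    while i < len(smiles):
        match = ATOM_RE.match(smiles, i)
        if match:
            atoms.append(match.group())
            adjacency.append(set())
            index = len(atoms) - 1
            if previous is not None:
                adjacency[previous].add(index)
                adjacency[index].add(previous)
            previous = index
            i = match.end()
            continue
        char = smiles[i]
        if char == "(":
            branch_stack.append(previous)
        elif char == ")":
            previous = branch_stack.pop()
        elif char.isdigit():
            if char in open_rings:
                other = open_rings.pop(char)
                adjacency[previous].add(other)
                adjacency[other].add(previous)
            else:
                open_rings[char] = previous
        elif char in "-=#:":
            pass  # assumption: bond order does not affect connectivity counts
        else:
            raise ValueError(f"unsupported character {char!r} in this sketch")
        i += 1
    return atoms, adjacency


def extended_connectivity(adjacency, order):
    """Order 0 is the degree; each further order sums the neighbors' values."""
    values = [len(neighbors) for neighbors in adjacency]
    for _ in range(order):
        values = [sum(values[j] for j in neighbors) for neighbors in adjacency]
    return values


atoms, adjacency = smiles_to_graph("CC(C)C")  # isobutane
for order in range(3):
    print(order, extended_connectivity(adjacency, order))
```

The values are computed per atom of the reconstructed graph, not per character of the string, which is the sense in which the string alone does not expose them.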

SMILES notation (Simplified Molecular Input Line Entry System), used in the Pomona College MedChem project ( ); the Dyson notation (5), adopted as a standard by the International Union of Pure and Applied Chemistry (IUPAC); and Wiswesser Line Notation (WLN), which became the actual standard used in the chemical industry (6). The rules of WLN were standardized by the Chemical Notation Association (CNA), but the two major WLN databases, Index Chemicus Registry System (ICRS) from the Institute for Scientific Information and the Commercially Available Organic Chemicals Index (CAOCI) (7), used somewhat different conventions. [Pg.2]

Rule scripts operate on substances defined in a data file in either SMILES (simplified molecular input line entry specification) or CMP (compound) format. The conventional SMILES notation as developed by Weininger [28] provides a basic description of molecules in terms of two-dimensional chemical graphs. The CMP file format developed with the OASIS system [29] provides separate logical records for information about connectivity, three-dimensional structure, electronic structure from quantum-chemical molecular-orbital computations, as well as physicochemical and experimental toxicological data. [Pg.56]

In order to calculate a physicochemical property, the structure of a molecule must be entered in some manner into an algorithm. Chemical structure notations for input of molecules into calculation software are described in Chapter 2, Section VII and may be considered as either being a 2D string, a 2D representation of the structure, or (very occasionally) a 3D representation of the structure. Of this variety of methods, the simplicity and elegance of the 2D linear molecular representation known as the Simplified Molecular Input Line Entry System (SMILES) stands out. Many of the packages that calculate physicochemical descriptors use the SMILES chemical notation system, or some variant of it, as the means of structure input. The use of SMILES is well described in Chapter 2, Section VII.B, and by Weininger (1988). There is also an excellent tutorial on the use of SMILES at www.daylight.com/dayhtml/smiles/smiles-intro.html. [Pg.45]

Devillers et al. (1996) have commented that most QSARs for the prediction of BCF perform similarly up to log Kow 6. In view of the fact that the computer program BCFWIN version 2.14 is freely available from the EPA website (www.epa.gov/oppt/exposure/docs/episuitedl.htm), it is recommended that this be used for BCF prediction for chemicals with log Kow < 6; the proviso is that highly reactive chemicals will probably have a higher than predicted BCF, perhaps by up to two orders of magnitude. BCFWIN requires that the chemical structure be input using Simplified Molecular Input Line Entry System (SMILES) notation (Weininger, 1988) or as a Chemical Abstracts Service (CAS) number. [Pg.355]

