Notation, linear

The input of chemical species into the computer by means of a linear notation necessitates only standard computer devices (such as perforated cards), but requires a preliminary coding of the chemical formulae. The methods of coding will be classified according to their increasing sophistication. [Pg.318]

BUTANE for 2-methylbutane, or a global formula, e.g. 2- C4H9 . for the butyl radical, or a detailed description of the bonds in the molecule, e.g. (OH)2C CHCOOH for malonic acid, (HO)2C=CH— COOH. [Pg.319]

Côme and co-workers [182] have described a linear notation which, at least for acyclic compounds, is very close to the notation familiar to a chemist. For example, the molecules of ethane, ethylene and acetylene are denoted by CH3/CH3, CH2//CH2 and CH///CH, respectively. The neopentane molecule and neopentyl free radical are denoted by C(CH3)4 and C(CH3)3/CH2., respectively. The main rules of this notation are as follows. [Pg.319]

With a few more rules, the notation can be extended to cyclic and stereo compounds. [Pg.319]

A very similar notation has been proposed by Kirby and Morgan [225]. Such languages are very easy to learn to read and to write. [Pg.319]
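The ethane, ethylene and acetylene examples suggest that a run of slashes between two groups encodes the bond order. The following is a minimal sketch under that assumption only; the full rule set is not reproduced in the excerpt, and the function name `split_bonds` is hypothetical.

```python
import re

def split_bonds(formula):
    """Split a slash-delimited formula into groups and bond orders.

    Assumption (inferred from the quoted examples only): a run of n
    slashes between two groups denotes a bond of order n.
    """
    parts = re.split(r"(/+)", formula)
    groups = parts[0::2]                    # text between slash runs
    orders = [len(p) for p in parts[1::2]]  # length of each slash run
    return groups, orders

for text in ("CH3/CH3", "CH2//CH2", "CH///CH", "C(CH3)3/CH2."):
    print(text, split_bonds(text))
```

Parenthesised groups and the trailing radical dot are passed through untouched here; interpreting them would need the rules that the excerpt omits.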

Another approach for representing 2D chemical structures is the linear notation. Linear notations are strings that represent the 2D structure as a more or less complex set of characters and symbols. Characters represent the atoms in a linear manner, whereas symbols are used to describe information about the connectivity [3]. The most commonly used notations are the Wiswesser line notation (WLN) and the simplified molecular input line entry specification (SMILES) [2]. The WLN, invented by William J. Wiswesser in 1949, was the first line notation capable of precisely describing complex molecules [4]. It consists of a series of uppercase characters (A-Z), numerals (0-9), the ampersand (&), the hyphen (-), the oblique stroke (/), and a blank space. [Pg.63]
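The character set listed above is enough to write a simple pre-check for WLN strings. This is a sketch of an alphabet test only; it does not validate WLN grammar, and the name `uses_wln_alphabet` and the sample strings are illustrative assumptions.

```python
import string

# Character set as listed in the text: A-Z, 0-9, '&', '-', '/', and blank.
WLN_ALPHABET = set(string.ascii_uppercase + string.digits + "&-/ ")

def uses_wln_alphabet(notation):
    """Return True if every character belongs to the WLN character set.

    This checks the alphabet only; it says nothing about whether the
    string is a well-formed WLN code.
    """
    return all(ch in WLN_ALPHABET for ch in notation)

print(uses_wln_alphabet("1V1"))       # uppercase letters and digits only
print(uses_wln_alphabet("c1ccccc1"))  # lowercase letters are outside the set
```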

4 Simplified Molecular Input Line Entry Specification (SMILES) [Pg.63]

Examples of Chemical Compounds and Their SMILES Notation [Pg.63]

Carbon dioxide, hydrogen cyanide, triethylamine, acetic acid, cyclohexane, benzene, hydronium ion, E-difluoroethene, L-alanine, D-alanine, nicotine, vitamin A [Pg.63]

Figure 6. Bending potential curves for the X ²A₁, A ²B₁ electronic system of BH2 [33,34]. Full horizontal lines: K = 0 vibronic levels; dashed lines: K = 1 levels; dash-dotted lines: K = 2 levels; dotted lines: K = 3 levels. Vibronic levels of the lower electronic state are assigned in bent notation, those of the upper state in linear notation (see text). Zero on the energy scale corresponds to the energy of the lowest vibronic level.

The ROSDAL syntax is characterized by a simple coding of a chemical structure using alphanumeric symbols which can easily be learned by a chemist [14]. In the linear structure representation, each atom of the structure is arbitrarily assigned a unique number, except for the hydrogen atoms. Carbon atoms are shown in the notation only by digits. The other types of atoms carry, in addition, their atomic symbol. In order to describe the bonds between atoms, bond symbols are inserted between the atom numbers. Branches are marked and separated from the other parts of the code by commas [15, 16] (Figure 2-9). The ROSDAL linear notation is unambiguous but not unique. [Pg.25]
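The description above can be turned into a small parser for a simplified, ROSDAL-style string. This is a sketch under stated assumptions, not the official ROSDAL grammar: the bond symbols (`-`, `=`, `#`), the example strings, and the names `parse_rosdal_like` and `bond_set` are all chosen here for illustration.

```python
import re

# Assumed bond symbols for this sketch: '-' single, '=' double, '#' triple.
BOND_ORDER = {"-": 1, "=": 2, "#": 3}
TOKEN = re.compile(r"(\d+)([A-Z][a-z]?)?|([-=#])")

def parse_rosdal_like(code):
    """Parse a simplified ROSDAL-style string into atoms and bonds.

    Returns (atoms, bonds): atoms maps atom number -> element symbol,
    bonds is a list of (from_atom, to_atom, order) tuples.
    """
    atoms, bonds = {}, []
    for branch in code.split(","):          # commas separate branches
        previous, pending = None, None
        for number, symbol, bond in TOKEN.findall(branch):
            if bond:
                pending = BOND_ORDER[bond]
                continue
            atom = int(number)
            # Digits alone mean carbon; keep a symbol seen earlier.
            if symbol or atom not in atoms:
                atoms[atom] = symbol or "C"
            if previous is not None and pending is not None:
                bonds.append((previous, atom, pending))
            previous, pending = atom, None
    return atoms, bonds

def bond_set(bonds):
    """Direction-independent view of a bond list, for comparisons."""
    return {(min(a, b), max(a, b), order) for a, b, order in bonds}

# Two different strings for the same numbered acetic acid skeleton.
first = parse_rosdal_like("1-2=3O,2-4O")
second = parse_rosdal_like("4O-2=3O,2-1")
print(first[0] == second[0], bond_set(first[1]) == bond_set(second[1]))
```

Both strings decode to the same atoms and bonds, which illustrates "unambiguous but not unique": each string describes exactly one structure, yet one structure can be written as several strings.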

In contrast to canonical linear notations and connection tables (see Sections 2.3 and 2.4), fragment codes are ambiguous. Several different structures could all possess an identical fragment code, because the code does not describe how the fragments are interconnected. Moreover, it is not always evident to the user whether all possible fragments of the structures are at all accessible. Thus, the fragments more or less characterise a class of molecules; this is also important in generic structures that arise in chemical patents (see Section 2.7.1)... [Pg.71]
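The ambiguity can be seen with a toy fragment code that records only which groups occur and how often. This is an illustrative sketch, not any real fragment-coding system; the two chain molecules and the helper names `fragment_code` and `neighbour_pairs` are assumptions made for the example.

```python
from collections import Counter

def fragment_code(chain):
    """Toy fragment code: each group with its count, connectivity ignored."""
    return tuple(sorted(Counter(chain).items()))

def neighbour_pairs(chain):
    """Bonded group pairs along an unbranched chain (the connectivity)."""
    return sorted(tuple(sorted(pair)) for pair in zip(chain, chain[1:]))

# Two different unbranched molecules built from the same five groups.
methoxyethanol = ["CH3", "O", "CH2", "CH2", "OH"]   # CH3-O-CH2-CH2-OH
ethoxymethanol = ["CH3", "CH2", "O", "CH2", "OH"]   # CH3-CH2-O-CH2-OH

print(fragment_code(methoxyethanol) == fragment_code(ethoxymethanol))
print(neighbour_pairs(methoxyethanol) == neighbour_pairs(ethoxymethanol))
```

The first comparison is true and the second is false: the codes coincide although the fragments are connected differently.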

W.C. Herndon, Canonical labelling and linear notation for chemical graphs, in Chemical Applications of Topology and Graph Theory, R.B. King (Ed.), Elsevier, Amsterdam, 1983, pp. 231-242. [Pg.164]

An alternative way to represent molecules is to use a linear notation. A linear notation uses alphanumeric characters to code the molecular structure. These have the advantage of being much more compact than the connection table and so can be particularly useful for transmitting information about large numbers of molecules. The most famous of the early line notations is the Wiswesser line notation [Wiswesser 1954]; the SMILES notation is a more recent example that is increasingly popular [Weininger 1988]. To construct the Wiswesser... [Pg.659]

Four main approaches have been suggested for the representation of chemical structures in machine-readable form: fragment codes, systematic nomenclature, linear notations, and connection tables. [Pg.188]

Figure 1. Five representations of the same chemical information. The canonical chemical reaction graph (a) can be represented in linear notation (b, see Appendix) or as a bond-centered labeled graph (c) by using time-variant bonds. The labeled graph affords an adjacency table (d) and a LISP list representation (e).

Interactive Sessions. The system has been implemented in LISP (Franz Lisp) and is running on a Digital Corporation VAX 11/780 at the University of Texas. Interactive sessions with the system are illustrated in Figures 4-7. (During the development stages of this project a linear notation was created for reaction input and output. A brief description of this notation is provided in the Appendix.)... [Pg.219]

The representation is unambiguous since it corresponds to one and only one substance, but it is not unique because alternative numberings of the connection table would result in different representations for the same chemical substance (the connection table representation is discussed in more detail below). In addition to being categorized according to their uniqueness and ambiguity, chemical substance representations commonly used within computer-based systems can be further classified as systematic nomenclature, fragment codes, linear notations, connection tables, and coordinate representations. [Pg.130]
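The effect of alternative numberings can be shown on a three-atom example. The sketch below assumes a connection table is just an atom list plus a bond list, and it uses a brute-force search over all numberings to pick a canonical form. That search is chosen here for clarity; it is not the method of any registry system and is only practical for very small molecules.

```python
from itertools import permutations

def connection_table(elements, bonds):
    """Return a hashable connection table: (atom list, sorted bond list)."""
    return (tuple(elements),
            tuple(sorted((min(a, b), max(a, b), o) for a, b, o in bonds)))

def renumber(elements, bonds, order):
    """Renumber atoms so that old atom order[i] becomes new atom i."""
    new_index = {old: new for new, old in enumerate(order)}
    return ([elements[old] for old in order],
            [(new_index[a], new_index[b], o) for a, b, o in bonds])

def canonical_table(elements, bonds):
    """Brute-force canonical form: the smallest table over all numberings."""
    return min(connection_table(*renumber(elements, bonds, order))
               for order in permutations(range(len(elements))))

# Ethanol heavy atoms, numbered from the carbon end and from the oxygen end.
table_a = (["C", "C", "O"], [(0, 1, 1), (1, 2, 1)])
table_b = (["O", "C", "C"], [(0, 1, 1), (1, 2, 1)])

print(connection_table(*table_a) == connection_table(*table_b))
print(canonical_table(*table_a) == canonical_table(*table_b))
```

The raw tables differ (not unique), while the canonical forms agree, because both numberings describe the same substance.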

Dittmar, Stobaugh, and Watson [8] describe the connection table utilized in the CAS Chemical Registry System. Lefkowitz [9] describes a concise form of a connection table, called the Mechanical Chemical Code, which does not explicitly identify the bonds and has attributes of both a connection table and linear notation. [Pg.133]

With the variety of chemical substance representations, i.e., fragment codes, systematic nomenclature, linear notations, and connection tables, a diversity of approaches and techniques are used for substructure searching. Whereas unique, unambiguous representations are essential for some registration processes, it is important to note that this often cannot be used to advantage in substructure searching. With connection tables, there is no assurance that the atoms cited in the substructure will be cited in the same order as the corresponding atoms in the structure. With nomenclature or notation representation systems, a substructural unit may be described by different terms or... [Pg.135]

Substructure searching based on linear notations can be accomplished in both an automated and non-automated mode. Dyson... [Pg.136]

Programs now exist to convert Wiswesser Line Notation [29], Hayward [30], and IUPAC [18] linear notations to connection tables. Because fragment codes alone do not provide the complete description of all structural detail, conversion to other representations is typically not possible. [Pg.141]

The conversion from a connection table to other unambiguous representations is substantially more difficult. The connection table is the least structured representation and incorporates no concepts of chemical significance beyond the list of atoms, bonds, and connections. A complex set of rules must be applied in order to derive nomenclature and linear notation representations. To translate from these more structured representations to a connection table requires primarily the interpretation of symbols and syntax. The opposite conversion, from the connection table to linear notation, nomenclature, or coordinate representation first requires the detailed analysis of the connection table to identify appropriate substructural units. The complex ordering rules of the nomenclature or notation system or the esthetic rules for graphic display are then applied to derive the desired representation. [Pg.141]
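The easier direction, from a linear notation to a connection table, is mostly symbol interpretation, as the following sketch shows for a deliberately restricted SMILES subset. The function name `smiles_subset_to_table` and the choice of subset are assumptions for illustration; rings, aromatic atoms, bracket atoms, charges and stereochemistry are not handled.

```python
import re

BOND_ORDER = {"-": 1, "=": 2, "#": 3}
TOKEN = re.compile(r"Cl|Br|[BCNOPSFI]|[-=#()]")

def smiles_subset_to_table(smiles):
    """Convert a restricted SMILES string into (atoms, bonds).

    Supported: unbracketed atoms B, C, N, O, P, S, F, Cl, Br, I; bond
    symbols - = #; parenthesised branches. Hydrogens stay implicit.
    """
    if TOKEN.sub("", smiles):
        raise ValueError(f"unsupported characters in {smiles!r}")
    atoms, bonds = [], []
    stack = []        # atoms to return to when a branch closes
    previous = None   # index of the atom the next atom bonds to
    order = 1         # bond order pending for the next atom
    for token in TOKEN.findall(smiles):
        if token == "(":
            stack.append(previous)
        elif token == ")":
            previous = stack.pop()
        elif token in BOND_ORDER:
            order = BOND_ORDER[token]
        else:
            atoms.append(token)
            index = len(atoms) - 1
            if previous is not None:
                bonds.append((previous, index, order))
            previous, order = index, 1
    return atoms, bonds

print(smiles_subset_to_table("CC(=O)O"))  # acetic acid heavy atoms
```

A single left-to-right pass with a small stack is sufficient here. The reverse conversion has no such shortcut, since the ordering rules of the notation must first be applied to the unordered table.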

Computer-Aided Property Estimation. Computer-aided structure estimation requires the structure of the chemical compounds to be encoded in a computer-readable language. Computers most efficiently process linear strings of data, and hence linear notation systems were developed for chemical structure representation. Several such systems have been described in the literature. SMILES, the Simplified Molecular Input Line Entry System, by Weininger and collaborators [2-4], has found wide acceptance and is being used in the Toolkit. Here, only a brief summary of SMILES rules is given. A more detailed description, together with a tutorial and examples, is given in Appendix A. [Pg.5]

Linear Notation. A SMILES notation is a string consisting of alphanumeric and certain punctuation characters. The notation terminates at the first space encountered while reading sequentially from left to right. [Pg.178]
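The termination rule means that anything after the first space, such as a compound name, is not part of the notation. A minimal sketch of a reader that applies the rule follows; the function name `read_smiles` and the sample line are illustrative assumptions.

```python
def read_smiles(line):
    """Return the SMILES part of a line: everything before the first space."""
    return line.split(" ", 1)[0]

print(read_smiles("CC(=O)O acetic acid"))  # prints CC(=O)O
```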

Vogin et al. [235] have created a program for the computer design of a free radical reaction mechanism in the gas phase, in agreement with the rules formulated in Sect. 2.5.3. An algorithm has been devised that lets the computer transform the formula of a compound, written in the linear notation described in Sect. 6.2.1 [182], into a canonical notation. Thus, the system both preserves the flexibility of a simple natural language and gains the sophistication of a canonical notation. [Pg.322]

Sequences, names, linear notations: 1-dimensional information... [Pg.364]

Figure 9.7. Various linear notation schemes for chemical representation. Some contain only atom types and connectivity (WLN, ROSDAL, SMILES, SLN) and are chemist-readable. Others are compressed versions of molecule file formats (CHIME) and are meant for computer interpretation.
