Function RDBMS

New SQL functions and data types can be used to extend a relational database. This is explained in Chapter 10 using PostgreSQL as an example. Ways in which three-dimensional molecular structures can be stored are examined in Chapter 11. This chapter also advocates using an RDBMS instead of molecular structure files and shows how this transition might be accomplished. [Pg.3]

A table is said to be in first normal form if each row has the same number of columns, each column has a value, and there are no duplicate rows. Because an RDBMS uses a table defined with a fixed number of columns, it is always true that each row contains the same number of columns. If one allows that null is a value, then every column will have a value. It should be obvious that repeating a row in a table is wasteful, but also potentially confusing and prone to error. For example, if two rows in a table of logP contained the same name and logP, one row may have the logP changed at some point. Then which row would be the correct row? This condition also illustrates the final aspect of first normal form: there should be at least one column, or combination of columns, that could function as a key that uniquely identifies the row. This is the name column or compound id column in the above examples. The data in this column must be unique. [Pg.17]
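The uniqueness requirement can be illustrated with a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for a full RDBMS server; the table name logp, its columns, and the sample value are assumptions made only for illustration. Declaring the name column as a primary key makes the database itself refuse a duplicate row.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for an RDBMS server (assumption).
conn = sqlite3.connect(":memory:")

# Hypothetical table: the name column is the key that uniquely identifies a row.
conn.execute("CREATE TABLE logp (name TEXT PRIMARY KEY, logp REAL)")
conn.execute("INSERT INTO logp VALUES ('benzene', 2.13)")

# A second row with the same key violates the primary key and is refused.
try:
    conn.execute("INSERT INTO logp VALUES ('benzene', 2.13)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT count(*) FROM logp").fetchone()[0]
print(duplicate_rejected, row_count)
```

With the key in place, the question of which of two identical rows is the correct one cannot arise, because the second row never enters the table.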

In Chapter 2, the concept of relational tables was introduced. In this chapter, the most common way of working with tables in an RDBMS is introduced. The SQL language provides ways to create tables, insert data, select data, delete data, update data, join tables, create table schemas, define functions, etc. SQL has many other features, not all of which are covered here. [Pg.21]
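As a rough sketch of the operations listed above, the following creates two tables, then inserts, updates, deletes, and joins rows. It runs against an in-memory SQLite database through Python's sqlite3 module rather than PostgreSQL, and the table names, column names, and values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for an RDBMS server (assumption)
cur = conn.cursor()

# Create tables (hypothetical names and columns).
cur.execute("CREATE TABLE compound (id INTEGER PRIMARY KEY, name TEXT)")
cur.execute(
    "CREATE TABLE property (compound_id INTEGER REFERENCES compound(id), logp REAL)"
)

# Insert data.
cur.executemany(
    "INSERT INTO compound VALUES (?, ?)",
    [(1, "benzene"), (2, "toluene"), (3, "phenol")],
)
cur.executemany(
    "INSERT INTO property VALUES (?, ?)",
    [(1, 2.13), (2, 2.73), (3, 1.46)],
)

# Update one row and delete another.
cur.execute("UPDATE property SET logp = 2.70 WHERE compound_id = 2")
cur.execute("DELETE FROM property WHERE compound_id = 3")

# Select data by joining the two tables on the shared key.
joined = cur.execute(
    "SELECT c.name, p.logp FROM compound c "
    "JOIN property p ON p.compound_id = c.id ORDER BY c.id"
).fetchall()
print(joined)
```

The join returns only compounds that still have a property row, so the deleted row for the third compound drops out of the result.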

A database can be thought of as a collection of schemas. It is possible to have many databases managed by one RDBMS, but each database is independent of any other. SQL was not designed to facilitate access to data in different databases. Recently, methods such as dbSwitch1 or dblink2 have made it possible to link together different databases. However, these are not considered here because they do not conform to the SQL standard and are implemented in various ways in different RDBMS. In the examples in this book, all schemas, tables, functions, etc., are contained within one database. [Pg.22]

Some of the more advanced methods described in this book require a more specific use of the RDBMS. The choice made for this book is PostgreSQL. In cases where a particular feature of PostgreSQL is used, a note is added to alert the reader. For example, the array data type in SQL2003 is implemented in PostgreSQL very differently than in Oracle. The list_matches function described in a later chapter of this book returns an array of integers that denote which atoms in a structure match a substructure query. The integration of this function into SQL would be handled quite differently in PostgreSQL, Oracle, and MySQL. [Pg.32]

The RDBMS is installed and runs on a computer that functions as a database server. Any SQL commands are executed on the server by the RDBMS. Functions written in SQL or in any of the procedural languages mentioned above are also executed by the RDBMS. This has the advantage that the data tables used by these SQL commands or procedural functions are under the control of the server. This is the most efficient way to access the data. The disadvantage is that the server may have many requests to handle from many users. Another way to operate on data tables is indirectly, using a client program typically (although not necessarily) run from another computer. [Pg.33]

There is a smaller set of tools that are typically run on the server. Any SQL commands and any procedural language functions are run on the server. There is complete flexibility in the server-side tools, since in principle any computer program can be written in any computer language. Later chapters of this book show how the RDBMS server itself can be extended using server-side programming to handle chemical information. These extensions may directly solve the needs of a particular project, but more importantly they increase the flexibility of the RDBMS to handle chemical information. Client programs can use the results of chemical searches and other computations as well. [Pg.34]

The connect function opens the connection to the RDBMS using the appropriate database name, host name, username, and password. The query and getresult methods execute the SQL statement and get the results. There are other methods available to get results, but these are discussed elsewhere.13... [Pg.44]

The SQL domain allows one to define which values are to be allowed in a particular column of a table. A domain is created by stating the underlying built-in SQL data type used to store the domain data type. In addition, a check constraint function may be used to allow or forbid certain values. This can be used to great advantage for SMILES and canonical SMILES. Using a domain improves the ability of the RDBMS to maintain the integrity of the data contained in its tables. [Pg.86]

When a value is inserted into this table, the valid function will be called by the RDBMS. If the function returns true, then the value will be allowed into the column smi. Otherwise, an SQL error will be reported and the value will not be allowed. [Pg.86]
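The accept-or-reject behavior described here can be sketched as follows. This is not the book's valid function, which is not reproduced in this excerpt: the check below is a deliberately crude stand-in (non-empty text containing no spaces) and does not validate SMILES. It uses a column CHECK constraint in SQLite through Python's sqlite3 module instead of a PostgreSQL domain, and the table name structure is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for an RDBMS server (assumption)

# Hypothetical table: the CHECK expression plays the role of the validity test.
# It is a toy rule, not real SMILES validation.
conn.execute(
    """
    CREATE TABLE structure (
        smi TEXT CHECK (length(smi) > 0 AND instr(smi, ' ') = 0)
    )
    """
)


def try_insert(value):
    """Return True if the database accepts the value, False if it rejects it."""
    try:
        conn.execute("INSERT INTO structure (smi) VALUES (?)", (value,))
        return True
    except sqlite3.IntegrityError:
        return False


accepted = try_insert("c1ccccc1")      # passes the toy check
rejected = try_insert("not a smiles")  # contains spaces, so the check fails
stored = [row[0] for row in conn.execute("SELECT smi FROM structure")]
print(accepted, rejected, stored)
```

Because the rule lives in the table definition, every client that inserts into the column is held to the same test, which is the integrity benefit the text attributes to domains.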

This canonicalize function uses NEW to refer to the row being inserted or updated. NEW.cansmi refers to the value under question. The canonical SMILES is computed and compared to NEW.cansmi. If they are not the same, the NEW.smi value is replaced by the canonical SMILES value and the NEW row is returned. This NEW row is used by the RDBMS in place of the original row. The create trigger command causes this operation to be put into effect in the RDBMS. [Pg.87]

Many other uses of the xform function are possible. Because the function is an extension of SQL, it can be easily used with all the other features of the SQL language and capabilities of an RDBMS. [Pg.105]

The CHORD6 chemical cartridge is a commercial product from gNova, Inc. It is written using C functions and the OEChem toolkit from OpenEye. It provides the core functions discussed in this book, such as cansmiles, matches, count_matches, list_matches, smiles_to_molfile, molfile_to_smiles, and xform. CHORD makes it possible to efficiently process RDBMS tables containing many millions of chemical structures. [Pg.120]

Unlike the procedural languages discussed above, C language functions are compiled separately. The code itself is not included in the SQL create function command. Instead, the create function command refers to a compiled object such as a shared object (.so) file located in some directory on the server running the RDBMS. For example, the CHORD oe smiles function is defined as follows. [Pg.120]

The previous section shows how molecular structures stored in an RDBMS can be made available to client programs that traditionally read molecular structure files. The advantage of storing molecular structures in an RDBMS is that the information can be used from within the database, as well as by external clients. For example, it would be possible to search a table of molecular structures for three-dimensional overlap, much as it might be searched for a substructure match. Of course, such search functions need to be written and installed as extensions to an RDBMS, just as was done with the matches function for substructure searches. This section shows some possible ways this might be accomplished. [Pg.133]

One disadvantage of using client programs is that data must be transferred to and from the server. Depending on how much data is required, this can cause a client program to run less efficiently than a server function run as an extension of the RDBMS. [Pg.137]

Most new client programs will benefit from a judicious use of both new server-side SQL functions and new client functions. It is wise to carefully consider which operations are best done on the RDBMS server and which are done using a client program. There are several suggestions to consider in designing the best system for a project. [Pg.137]

Do not implement the same functionality in different clients. If many different clients require the same functionality, it is better to encapsulate that in one central location—namely the RDBMS. For example, if it is necessary to compute molecular weight, it is better to have a server-side function do this in a consistent way rather than to implement such a function in each of the languages used for client applications. This may be more obvious for functions like fingerprints that are more difficult to re-implement in various languages. Moreover, it is essential that fingerprints be... [Pg.138]
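The idea of one central implementation can be sketched loosely as follows. SQLite runs in-process, so this is only an analogy to a true server-side function: a single mol_weight function is registered with the database connection and then called from SQL, instead of each client computing the value itself. The function name, the formula-string input, the four-element mass table, and the sample compounds are all assumptions for illustration and are not the book's implementation.

```python
import re
import sqlite3

# Small illustrative table of standard atomic masses (only four elements).
ATOMIC_MASS = {"H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999}


def mol_weight(formula):
    """Sum atomic masses for a simple formula such as 'C2H6O' (no parentheses)."""
    total = 0.0
    for symbol, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += ATOMIC_MASS[symbol] * (int(count) if count else 1)
    return round(total, 3)


conn = sqlite3.connect(":memory:")  # stand-in for an RDBMS server (assumption)

# Register the single shared implementation so that SQL statements can call it.
conn.create_function("mol_weight", 1, mol_weight)

conn.execute("CREATE TABLE compound (name TEXT, formula TEXT)")
conn.executemany(
    "INSERT INTO compound VALUES (?, ?)",
    [("ethanol", "C2H6O"), ("benzene", "C6H6")],
)

# Any query can now use the same definition of molecular weight.
weights = dict(conn.execute("SELECT name, mol_weight(formula) FROM compound"))
print(weights)
```

In PostgreSQL the equivalent would be a function created once on the server, so that clients written in different languages all receive the same answer from the same code.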

Writing a Web application to help such users search or update a database is more than simply offering a text box for them to type in SQL statements. The focus of this section is not to show how a full Web application can be developed that uses SQL and an RDBMS server. There are some useful SQL functions that can benefit any Web application. [Pg.143]

There are many uses for a chemical relational database. Some of these have been mentioned in earlier chapters. In this chapter, three general types of applications will be discussed. The purpose is not to present complete working applications, but to indicate important issues to consider when designing such applications. Sample schemas are proposed. The use within each application of the core functions described earlier is discussed. Each of these applications might be developed as a Web application or a client application on a user's desktop. Any computer language might be used, although the ability to connect to an RDBMS is essential. [Pg.155]

If SMILES is used to store molecular structures in a relational database management system (RDBMS), it may be necessary to extract the symbol and bond information for some client programs that expect a connection table. The smiles_to_symbol and smiles_to_bonds functions shown in the next sections allow the symbol and bond information in a SMILES to be extracted as an array. Some client programs may prefer to process this information in rows, as if they were records in a file. The following plpgsql functions can be used to present the array elements as rows. Two functions are shown: ctable (connection table) and symbol_coords. The symbol_coords function requires an array of coordinates in addition to the symbols. [Pg.173]
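The plpgsql functions themselves are not reproduced in this excerpt. As a loose illustration of the array-to-rows idea only, the Python sketch below expands arrays into one row per bond or per atom. The array layouts are assumptions: bonds are taken to be (atom, atom, order) triples with 1-based atom numbers, and coordinates to be (x, y, z) triples in atom order. The function names mirror those in the text, but the bodies are hypothetical.

```python
def ctable(symbols, bonds):
    """Yield one row per bond: (atom_a, symbol_a, atom_b, symbol_b, order).

    Assumed layout: symbols is a list of atomic symbols and bonds is a list
    of (atom_a, atom_b, order) triples using 1-based atom numbers.
    """
    for atom_a, atom_b, order in bonds:
        yield (atom_a, symbols[atom_a - 1], atom_b, symbols[atom_b - 1], order)


def symbol_coords(symbols, coords):
    """Yield one row per atom: (atom_number, symbol, x, y, z).

    Assumed layout: coords holds one (x, y, z) triple per symbol, in order.
    """
    if len(symbols) != len(coords):
        raise ValueError("symbols and coords must have the same length")
    for number, (symbol, (x, y, z)) in enumerate(zip(symbols, coords), start=1):
        yield (number, symbol, x, y, z)


# Illustrative three-atom chain C-C-O with made-up coordinates.
symbols = ["C", "C", "O"]
bonds = [(1, 2, 1), (2, 3, 1)]
coords = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (2.2, 1.2, 0.0)]

bond_rows = list(ctable(symbols, bonds))
atom_rows = list(symbol_coords(symbols, coords))
print(bond_rows)
print(atom_rows)
```

A client that expects file-like records can then iterate over these rows one at a time instead of indexing into arrays.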

The following code will define the core functions described in Chapter 7 of this book. The isosmiles function is not included here because of a limitation of PerlMol. These functions apply only to the PostgreSQL RDBMS. [Pg.188]

Big Chemical Encyclopedia