Databases Machine-Readable

CORE. The CORE Electronic Chemistry Library is a joint project of Cornell University, OCLC (On-line Computer Library Center), Bell Communications Research (Bellcore), and the American Chemical Society. The CORE database will contain the full text of American Chemical Society journals from 1980, associated information from Chemical Abstracts Service, and selected reference texts. It will provide machine-readable text that can be searched and displayed, graphical representations of equations and figures, and full-page document images. The project will examine the performance obtained by the use of a traditional printed index as compared with a hypertext system (SUPERBOOK) and a document retrieval system (Pixlook) (6,116). [Pg.131]

Computer-readable formats for chemical structures have become a compelling need, whether to capture this information in databases or simply to annotate documents (Degtyarenko et al. 2007). Annotating full-text documents with these machine-readable forms can make documents easier to search, and the intended structure can be visualized in context. The development of computer-readable formats started in earnest around 1990 (Borkent et al. 1988; Weininger 1988; Contreras et al. 1990; Ibison et al. 1993). [Pg.14]
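As a rough illustration of how such annotation can make a document searchable, the sketch below tags compound names in running text with a structure string. It assumes SMILES (the line notation introduced in Weininger 1988) as the machine-readable form; the lookup table, the `[SMILES:...]` tag syntax, and the function names are hypothetical choices made for this example, not part of any cited system.

```python
import re

# Hypothetical lookup table: compound name -> SMILES string.
STRUCTURES = {
    "ethanol": "CCO",
    "benzene": "c1ccccc1",
    "acetic acid": "CC(=O)O",
}


def annotate(text, structures=STRUCTURES):
    """Append a machine-readable tag after each known compound name."""
    # Try longer names first so multi-word names win over shorter ones.
    names = sorted(structures, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(n) for n in names) + r")\b",
        re.IGNORECASE,
    )

    def tag(match):
        name = match.group(0)
        return f"{name} [SMILES:{structures[name.lower()]}]"

    return pattern.sub(tag, text)


def find_structures(annotated):
    """Recover every structure string embedded in an annotated document.

    Limitation of this sketch: SMILES that themselves contain square
    brackets (for example charged atoms) would need a different delimiter.
    """
    return re.findall(r"\[SMILES:([^\]]+)\]", annotated)
```

Calling `annotate("Ethanol was dissolved in benzene.")` leaves the sentence readable while embedding the two structure strings, which `find_structures` can then pull back out for indexing.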

In the mid-1960s, the first publicly available systems for accessing machine-readable databases appeared. With the explosion of information in the sciences, coupled with the vast number of resources (journals, reports, newspapers, and monographs) in which this information was appearing, the use of computers to store and retrieve information has become an effective means to handle the increasing flow. As a result, computer use has increased exponentially, as evidenced by the growth in the number of host computers that store data for use on the Internet. In 1969, four computers held all the information available to users of the nascent World Wide Web, whereas today it is estimated that over 132 million host computers serve in this capacity. [Pg.1428]

CSD. The Cambridge Structural Database, produced by the Cambridge Crystallographic Data Centre, contains bibliographic, chemical, and numerical data of X-ray structures. This machine-readable file is a comprehensive compendium of molecular geometries of organic and organometallic compounds. [Pg.751]

The MCDF, ICSD and CSD currently acquire the vast bulk of their input data by visual scanning of the literature, followed by manual keyboarding of the relevant information. Only the PDB is compiled in its entirety from data submitted in machine-readable form. Whilst all four databases perform extensive checks of data integrity, both computerized and visual, to ensure the accuracy of their final products, this function is particularly important for data acquired by manual methods. Statistics from the CSD show that some 15% of published structural papers contain at least one numerical error, and this is, of course, compounded by the inevitable additional risk of error due to the manual keyboarding. [Pg.75]
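One kind of computerized integrity check can be sketched as a comparison between a value reported in a paper and the same quantity recomputed from the keyboarded data. The example below is only an assumption about what such a check might look like, not a description of the procedures these databases actually use: it takes Cartesian atomic coordinates in ångströms, and the function name and the 0.01 Å tolerance are hypothetical.

```python
import math


def bond_length_consistent(atom_a, atom_b, reported_length, tolerance=0.01):
    """Return True if a reported bond length agrees with the coordinates.

    atom_a, atom_b: Cartesian (x, y, z) coordinates in angstroms (assumed).
    reported_length: bond length as printed in the paper, in angstroms.
    tolerance: largest accepted discrepancy (hypothetical default).
    """
    computed = math.dist(atom_a, atom_b)
    return abs(computed - reported_length) <= tolerance
```

A transposed pair of digits at the keyboard, such as 1.45 entered for a bond whose coordinates imply 1.54 Å, fails the check and can be routed to a human for visual inspection.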

The production of further comprehensive compendia of X-ray and neutron diffraction results has been precluded by the steep rise in the number of published crystal structures, as illustrated by Figure A.1. Printed compilations have been effectively superseded by computerized databases. In particular, the Cambridge Structural Database (CSD) now (October 1992) contains bibliographic, chemical, and numerical results for over 100,000 organo-carbon crystal structures. This machine-readable file fulfils the function of a comprehensive structure-by-structure compendium of molecular geometries. However, the amount of data now held in CSD is so large that there is also a need for concise, printed tabulations of average molecular dimensions. [Pg.751]

Because the internal dissemination of this database within AstraZeneca R&D (a company with 11 R&D sites across four continents) was not deemed a success, AstraZeneca decided to discontinue the project as of May 2002. Backed by private funding, the database, renamed WOrld of Molecular BioAcTivity (WOMBAT) in 2003, continued to evolve [13] as discussed for WOMBAT 2006.1, below. Recognizing the paucity of chemical databases that capture clinical pharmacokinetics data in a searchable manner, we further developed WOMBAT-PK (WOMBAT-Pharmacokinetics) to index such data from the literature [14]. This chapter summarizes the contents of WOMBAT and WOMBAT-PK [15], some of the problems encountered in appropriately indexing biological activities and correct chemical structures (with focus on machine-readable contents for data mining), and provides some examples of data mining with WOMBAT. Other bioactivity databases [16], focused mostly on patent literature, are shown in Table 13.2-1 together with the on-line references. [Pg.761]

A large number of kinetic data are well known and compiled in individual tables, such as Landolt-Börnstein [4], and further in the review literature [5]. Machine-readable databases have also been compiled, including the NIST database for chemical kinetics [6]. Version 7.0 contains over 38,000 records on over 11,700 reactant pairs. It also includes reactions of excited states of atoms and molecules [7]. Kinetic calculations are substantially more complex than those in the field of equilibrium thermodynamics. [Pg.487]

Computer-based chemical information systems are now widely used for the storage and retrieval of chemical structure information. Efficient searching algorithms are available which allow structure-based searches to be carried out on databases containing many thousands, or even millions, of chemical compounds. In addition, the ease with which machine-readable structural data can be manipulated has led to a wide range of related activities, such as computer-aided synthesis design and studies of quantitative structure-activity relationships (QSAR). ... [Pg.131]

The main goal for the software system under development is to make available in machine-readable form all the information contained in Cheminform. In this way different kinds of indexes may be produced as well as data dealing with selected parts of chemistry for specialised information services, either printed or on computer-readable media. But the most important fact is that the information in Cheminform contains all the data needed for a reaction database. [Pg.409]

Most of the major indexes and abstracts are now available in machine-readable form. For a comprehensive list of databases and online vendors see Information Industry Market Place (International Directory of Information Products & Services from R. R. Bowker, New York). The names of online databases frequently differ from their paper counterparts. Engineering Index (monthly from Engineering Information Inc.) for... [Pg.424]

All the information from the printed Handbook has been thoroughly checked for errors and redundancies and published in the Main Series and the first four Supplementary Series (Table 1). In addition to this information, a large collection of unreviewed excerpts is available on cards, covering the time span from 1960 to 1979; these contain only the substance identification, an indicator of the type of factual information, and a reference to the literature. All the cards have been added to the database. The publication of the 5th Supplementary Series has started and the corresponding information is also being added to the database, thus replacing the former short file data. Since 1980 the excerpts from the primary literature have been directly input to the database and have thus become available in machine-readable form in a very short time. The... [Pg.1973]

A much more ambitious database that builds on the IUBMB classification is BRENDA, maintained by the Institute of Biochemistry at the University of Cologne. In addition to the data provided by the ENZYME database, the BRENDA curators have extracted a large body of information from the enzyme literature and incorporated it into the database. The database format strives to be readable by both humans and machines. The categories of data stored in BRENDA comprise the EC-number, systematic and recommended names, synonyms, CAS-registry numbers, the reaction catalyzed, a list of known substrates and products, the natural substrates, specific activities, KM values, pH and temperature optima, cofactor and ion requirements, inhibitors, sources, localization, purification schemes, molecular weight, subunit structure, posttranslational modifications, enzyme stability, database links, and last but not least an extensive bibliography. Currently, BRENDA holds entries for approximately 3500 different enzymes. [Pg.152]
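To make the idea of a format readable by both humans and machines concrete, the sketch below models a handful of the categories listed above as a typed record and serializes it to indented JSON. This is a hypothetical layout chosen for illustration and is not BRENDA's actual file format; the class name, field names, and helper functions are all assumptions.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class EnzymeEntry:
    """A small, hypothetical subset of the categories an entry might hold."""
    ec_number: str
    recommended_name: str
    synonyms: list = field(default_factory=list)
    substrates: list = field(default_factory=list)
    products: list = field(default_factory=list)
    cofactors: list = field(default_factory=list)


def to_text(entry):
    """Serialize an entry as indented JSON: legible to a person, parseable by a program."""
    return json.dumps(asdict(entry), indent=2, sort_keys=True)


def from_text(text):
    """Rebuild an entry from the text produced by to_text."""
    return EnzymeEntry(**json.loads(text))


# Illustrative entry; only well-known identifiers are filled in.
adh = EnzymeEntry(
    ec_number="1.1.1.1",
    recommended_name="alcohol dehydrogenase",
    synonyms=["ADH"],
    cofactors=["NAD+"],
)
```

Round-tripping `adh` through `to_text` and `from_text` returns an equal record, which is the property a curator-editable, program-parseable format needs.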
