PubChem BioAssay

Figure 13.1 hERG compound data obtained from PubChem BioAssay AID 376. Upper line: conventional hERG blockers. Lower line: atypical hERG blockers, which do not contain the conventional pharmacophore of a basic nitrogen decorated by a set of lipophilic rings (see also Chapter 16). [Pg.299]

D Domains PubChem BioAssay PubChem Compound PubChem Substance Gene LocusLink UniGene HomoloGene... [Pg.498]

PubChem BioAssay: Search bioassay records using terms from the bioassay description, for example "cancer cell line". Links to active compounds and bioassay results are provided. [Pg.206]

PubChem is organized as three linked databases within the NCBI's Entrez information retrieval system. These are PubChem Substance, PubChem Compound, and PubChem BioAssay. PubChem also provides a fast chemical structure similarity search tool. More information about using each component database may be found using the links above. [Pg.206]

One obvious application of these databases is to use them as sources of training data when developing predictive models. For example, Novotarskyi et al. (46) employed PubChem Bioassay data to develop models to predict CYP450 1A2 inhibition, and Shen et al. (47) employed the database to develop a support vector... [Pg.87]

Chen B, Wild DJ (2010) PubChem BioAssays as a data source for predictive models. J Mol Graph Model 28(5):420-426... [Pg.94]

The U.S. National Institutes of Health PubChem project contains information on millions of chemical compounds.1 The data are divided into three main sections. PubChem Substance contains structures supplied by depositors. PubChem Compound contains unique structures with computed properties. PubChem BioAssay contains bioactivity assay results supplied by depositors. The data in these three sections are recorded independently, yet there are chemical relationships among these sections. For example, information available as a PubChem BioAssay is associated with a particular substance for which the data were collected. A substance may be a single compound or a mixture of several compounds. [Pg.53]

PubChem BioAssay is available as hundreds of different files.3 The files are named, for example, 1.csv.gz, 1.descr.xml, 2.csv.gz, 2.descr.xml. Each xml file describes the data contained in the corresponding csv file, which results when the csv.gz file is unzipped. For example, the file 1.descr.xml contains the information "Growth inhibition of the NCI H23 human Non-Small Cell Lung tumor cell line is measured as a screen for anti-cancer activity" as well as information about the various columns of data in the 1.csv file. This information is used to define a table to hold the data in the 1.csv file. Figure 6.2 shows a representation of the table, named nci_h23. Using additional information in the 1.descr.xml file and using the capabilities of the RDBMS to incorporate comments on tables and columns, the following SQL defines the nci_h23 table. [Pg.54]
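The SQL referred to in the excerpt above is not reproduced here. As a rough illustration of the same idea (turning one gzipped assay csv file into one table), the following Python sketch loads a tiny made-up sample into an in-memory SQLite table. The table name nci_h23 is an assumed spelling, the column names and sample rows are assumptions for illustration only, and SQLite stands in for whatever RDBMS the original author used.

```python
import csv
import gzip
import io
import sqlite3


def load_assay_csv(conn, table, gz_bytes):
    """Create `table` from gzipped CSV bytes; return the number of rows loaded."""
    # Decompress and parse the CSV; the first row supplies the column names.
    with gzip.open(io.BytesIO(gz_bytes), mode="rt", newline="") as handle:
        reader = csv.reader(handle)
        header = next(reader)
        rows = list(reader)
    # Every column is stored as TEXT to keep the sketch simple.
    columns = ", ".join(f'"{name}" TEXT' for name in header)
    conn.execute(f'CREATE TABLE "{table}" ({columns})')
    placeholders = ", ".join("?" for _ in header)
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', rows)
    return len(rows)


# Made-up stand-in for the contents of 1.csv.gz (column names are assumed).
sample = (
    "PUBCHEM_SID,PUBCHEM_CID,PUBCHEM_ACTIVITY_OUTCOME\n"
    "101,11,active\n"
    "102,12,inactive\n"
)
sample_gz = gzip.compress(sample.encode("utf-8"))

conn = sqlite3.connect(":memory:")
loaded = load_assay_csv(conn, "nci_h23", sample_gz)
active = conn.execute(
    "SELECT COUNT(*) FROM nci_h23 WHERE PUBCHEM_ACTIVITY_OUTCOME = 'active'"
).fetchone()[0]
print(loaded, active)
```

A real loader would read the column descriptions from the matching descr.xml file and assign proper column types and comments; that step is omitted here.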

All of the other xml and csv data files from PubChem BioAssay can be used in a similar way to define other tables in the pubchem schema. Many of the assays use the same column names and descriptions as the above... [Pg.55]

PubChem BioAssay. http://www.ncbi.nlm.nih.gov/sites/entrez?db=pcassay (accessed April 18, 2008). [Pg.70]

PubChem is organized as three distinct databases: PubChem Substance, PubChem Compound, and PubChem BioAssay. PubChem Substance contains descriptions of chemical samples, provided by data depositors, and links to information on their biological activities. The description includes PubChem Compound identifiers in cases where the chemical structures of compounds in the sample are known. Links providing information on biological activity include those to PubMed [8] citations, protein 3-D structures [9], links to contributor websites, and to biological testing results available in PubChem BioAssay. [Pg.218]

Filters are related to links in that the majority of filters in the PubChem databases are generated automatically based on the presence of links. In the above example the "pcsubstance_pcassay" filter has a "true" bit for every substance for which a PubChem BioAssay link is present (e.g., in the pop-up menus of the Entrez DocSum for that substance). [Pg.225]

It is important to note that Entrez history is database-specific. One cannot use it to combine search results between databases (e.g., to AND together a CID list with an AID list). Cross-database links must first be used as set transformation operators, so all ID lists are in the same database. For example, following the "PubChem BioAssays" link from a set of CIDs will create a new set of AIDs that have any test results for the set of CIDs (again with the implicit understanding that CID is first expanded to SID, which is built into the CID-AID links). From there, one may combine this set of AIDs with other search results in the BioAssay database using the Entrez Boolean logic. [Pg.226]
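The idea of a link acting as a set transformation can be sketched with plain Python sets. This is only an illustration of the logic, not how Entrez is implemented: the two lookup tables, the SID values, and the second search result are invented for the example.

```python
# Hypothetical link tables; in practice these relationships live inside Entrez.
CID_TO_SIDS = {2244: {101, 102}, 3672: {103}}
SID_TO_AIDS = {101: {1, 376}, 102: {376}, 103: {410}}


def follow_bioassay_link(cids):
    """Transform a set of CIDs into the set of AIDs with any test result."""
    # Step 1: expand each CID to its SIDs (the implicit CID-to-SID expansion).
    sids = set()
    for cid in cids:
        sids |= CID_TO_SIDS.get(cid, set())
    # Step 2: collect every AID linked to any of those SIDs.
    aids = set()
    for sid in sids:
        aids |= SID_TO_AIDS.get(sid, set())
    return aids


# After the transformation both ID lists are AIDs, so Boolean logic applies.
aids_from_cids = follow_bioassay_link({2244})
other_search = {376, 410, 999}  # stand-in for another BioAssay search result
combined = aids_from_cids & other_search  # the Entrez "AND"
print(sorted(aids_from_cids), sorted(combined))
```

Trying to intersect the original CID set with `other_search` directly would be meaningless, which is the point the passage makes about history being database-specific.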

The primary eUtil tools of most interest to PubChem users are eSearch, eFetch, ePost, eLink, eHistory, and eInfo. eSearch performs an Entrez search, with the same query syntax as web-based Entrez queries (e.g., to query PubChem Compound for the chemical name "aspirin"). eFetch returns an ID list from a prior search (e.g., the list of PubChem Compound identifiers (CIDs) from the aforementioned query of "aspirin"). ePost creates a new ID list by upload of a list of identifiers (e.g., substance identifiers (SIDs)). eLink follows a given link type to create a new ID list from an existing one (e.g., to find all PubChem BioAssay identifiers (AIDs) associated with a list of SIDs). eHistory returns information on current Entrez History entries. eInfo lists available Entrez indices and links for a given database. [Pg.236]
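As a minimal sketch of what two of these calls look like as HTTP requests, the Python below only builds the request URLs and does not contact the server. The base URL and the parameter names (db, term, dbfrom, id) reflect the NCBI E-utilities interface as commonly documented and should be checked against the current documentation; the helper name and the SID values are hypothetical.

```python
from urllib.parse import urlencode

# Assumed base URL of the NCBI E-utilities service.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def eutil_url(tool, **params):
    """Build the request URL for one eUtil (hypothetical helper)."""
    # Keep commas unescaped so ID lists stay readable in the query string.
    return f"{EUTILS_BASE}/{tool}.fcgi?{urlencode(params, safe=',')}"


# eSearch: query PubChem Compound for the chemical name "aspirin".
esearch_url = eutil_url("esearch", db="pccompound", term="aspirin")

# eLink: find the AIDs associated with a list of SIDs (placeholder SIDs).
elink_url = eutil_url("elink", dbfrom="pcsubstance", db="pcassay", id="101,102")

print(esearch_url)
print(elink_url)
```

Fetching either URL (for example with `urllib.request.urlopen`) would return XML that still has to be parsed; that step is left out to keep the sketch free of network access.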

GFA PLS regression QSAR with 4D fingerprints and MOE descriptors N = 250 from the literature, r = 0.58, q = 0.54. PubChem bioassay database N = 250 active, N = 1703 inactive. QSAR used to make classifications with 10 pM IC50 cut off. 65% accuracy. With a smaller test set (N = 876) the overall accuracy is 82%. 182... [Pg.316]

PubChem: A chemical database is a database specifically designed to store chemical information. Chemical structures are traditionally represented using lines indicating chemical bonds between atoms and drawn on paper (2D structural formulae). Various chemical databases are freely available on the Internet. Large chemical databases are expected to handle the storage and searching of information on millions of molecules. PubChem is one of the free chemical databases and is developed by the National Center for Biotechnology Information (NCBI). More than 24 million compound structures and descriptive datasets can be freely downloaded from PubChem. PubChem is a user-friendly database: compounds can be searched by compound name or keyword, and also by chemical properties. Compounds can be downloaded in SDF format, which is the standard format for various structural viewers. PubChem has three components, namely PubChem Compound, PubChem Substance, and PubChem BioAssay, described below. [Pg.77]

PubChem BioAssay: The PubChem BioAssay database contains bioactivity screens of chemical substances described in PubChem Substance. It provides searchable descriptions of each bioassay, including descriptions of the conditions and readouts. We can search bioassay records using terms from the bioassay... [Pg.77]

Wang, Y., Suzek, T., Zhang, J. et al. (2014) PubChem BioAssay: 2014 update. Nucleic Acids Research, 42 (1), D1075-D1082. [Pg.90]
