Chemical space cell-based

In chemoinformatics research, partitioning algorithms are applied in diversity analysis of large compound libraries, subset selection, or the search for molecules with specific activity (1-4). Widely used partitioning methods include cell-based partitioning in low-dimensional chemical spaces (1,3) and decision tree methods, in particular, recursive partitioning (RP) (5-7). Partitioning in low-dimensional chemical spaces is based on various dimension reduction methods (4,8) and often permits simplified three-dimensional representation of... [Pg.291]

A major potential drawback with cluster analysis and dissimilarity-based methods for selecting diverse compounds is that there is no easy way to quantify how completely one has filled the available chemical space or to identify whether there are any holes. This is a key advantage of the partition-based approaches (also known as cell-based methods). A number of axes are defined, each corresponding to a descriptor or some combination of descriptors. Each axis is divided into a number of bins. If there are N axes and each is divided into b bins then the number of cells in the multidimensional space so created is ... [Pg.701]

Partitioning or cell-based methods provide an absolute measure of the chemical space covered by a collection of compounds. They are based on the definition of a low-dimensional chemistry space, for example, one based on a small number of physicochemical properties such as molecular weight, calculated logP, and number of hydrogen bond donors [45]. Each property defines an axis of the chemistry-space. The range of values for each property is divided into a set of bins, and the combinatorial product of all bins then defines the set of cells or partitions that make up the space. [Pg.201]
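The construction described above can be sketched in a few lines of Python. This is a minimal illustration, assuming equal-width bins; the axis names, value ranges, and bin counts are hypothetical and do not come from the cited sources.

```python
from math import prod

# Hypothetical axes: (property name, lower bound, upper bound, number of bins).
AXES = [
    ("mol_weight", 0.0, 1000.0, 5),
    ("clogp", -5.0, 10.0, 5),
    ("h_bond_donors", 0.0, 10.0, 5),
]


def bin_index(value, low, high, n_bins):
    """Return the equal-width bin holding `value`; out-of-range values go to the edge bins."""
    width = (high - low) / n_bins
    idx = int((value - low) // width)
    return min(max(idx, 0), n_bins - 1)


def cell_of(compound):
    """Map a compound (dict of property values) to its cell, a tuple of bin indices."""
    return tuple(bin_index(compound[name], lo, hi, n) for name, lo, hi, n in AXES)


def total_cells():
    """The cells are the combinatorial product of the bins on every axis."""
    return prod(n for _, _, _, n in AXES)


example = {"mol_weight": 320.0, "clogp": 2.5, "h_bond_donors": 2}
print(cell_of(example))   # (1, 2, 1)
print(total_cells())      # 125
```

Clamping out-of-range values to the edge bins is one possible design choice; a real implementation might instead reject such compounds or add open-ended bins.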

While not convincing from a statistical perspective, the results in this section are consistent with a trend: high-activity molecules published in the past decade of medicinal chemistry literature are more likely to be found in the large, hydrophobic and poor solubility corner of chemical property space. These results are not consistent with, for example, cell-based [41] and median-based [42] partitioning of biologically active compounds; however, such analyses were performed in the presence of inactive compounds selected from MDDR [41] or ACD [42], with quite probably unrelated chemotypes. ACD, the Available Chemicals Directory [43], and MDDR, the MDL Drug Data Report [43], are databases commonly used by the pharmaceutical industry. [Pg.32]

Perhaps the most popular current approach to cell-based partitioning in low-dimensional chemical spaces focuses on the so-called BCUT metric... [Pg.282]

The cluster-based selection procedure starts with classifying compounds into clusters of similar molecules with a clustering algorithm followed by selection of representative(s) from each cluster (24). On the other hand, the partition-based selection procedure partitions chemical space into cells by dividing values of each dimension into various intervals and selects representative... [Pg.39]

Cell-based partitioning can be used not only for compound selection but also to aid in combinatorial diversity design. In this case, a chemical descriptor space is defined and empty partitions are generated by binning. Test compounds are then enumerated on the computer based on reaction schemes and selected to evenly populate these partitions. [Pg.15]
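One simple way to populate partitions evenly is a per-cell quota. The following Python sketch assumes the enumerated candidates are already available as descriptor values; the function name `fill_evenly`, the quota rule, and the toy one-dimensional descriptor are all hypothetical, not a method taken from the cited text.

```python
from collections import Counter


def fill_evenly(candidates, cell_of, per_cell):
    """Accept candidates in order, but never more than `per_cell` from any one cell."""
    counts = Counter()
    chosen = []
    for candidate in candidates:
        cell = cell_of(candidate)
        if counts[cell] < per_cell:
            counts[cell] += 1
            chosen.append(candidate)
    return chosen


def quarter_of(x):
    """Toy one-dimensional descriptor space on [0, 1), split into four equal bins."""
    return int(x * 4)


enumerated = [0.05, 0.10, 0.15, 0.55, 0.60, 0.95]
print(fill_evenly(enumerated, quarter_of, per_cell=1))  # [0.05, 0.55, 0.95]
```

With `per_cell=1`, the crowded first bin contributes only one compound, and the bin that no candidate reaches simply stays empty.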

Burden, CAS, and University of Texas (BCUT) descriptors are well suited and widely used to describe diversity of a chemical population in a low-dimensional Euclidean space and they allow for fast cell-based diversity selection algorithms (Pearlman and Smith, 1998). The DiverseSolutions... [Pg.255]

Compound selection is a core process of library design, and three main methods can be mentioned. Dissimilarity-based methods select compounds in terms of similarity/distance between individuals in chemical space. Clustering methods first group compounds into clusters based on similarity/distance and then choose representative compounds from different clusters. Partitioning methods first create a uniform cell space that subdivides the chemical space, then assign all virtual compounds to the corresponding cells according to their properties, and finally choose representative compounds from different cells. [Pg.184]
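The three steps of the partitioning approach (create cells, assign compounds, choose representatives) might look like the Python sketch below. The bin widths in `toy_cell`, the sample library, and the rule of taking the first compound encountered in each cell are illustrative assumptions; the source does not specify how a representative is chosen.

```python
from collections import defaultdict


def assign_to_cells(compounds, cell_of):
    """Group compounds by the cell that `cell_of` maps them to."""
    cells = defaultdict(list)
    for compound in compounds:
        cells[cell_of(compound)].append(compound)
    return cells


def pick_representatives(compounds, cell_of):
    """Return one compound per occupied cell (here: the first one encountered)."""
    cells = assign_to_cells(compounds, cell_of)
    return [members[0] for _, members in sorted(cells.items())]


def toy_cell(compound):
    """Hypothetical 2D space: molecular weight in steps of 250, logP in steps of 2."""
    _, mol_weight, logp = compound
    return (int(mol_weight // 250), int(logp // 2))


library = [
    ("A", 180.0, 1.2),
    ("B", 210.0, 0.8),
    ("C", 420.0, 3.5),
    ("D", 610.0, 4.9),
    ("E", 450.0, 3.1),
]
print([c[0] for c in pick_representatives(library, toy_cell)])  # ['A', 'C', 'D']
```

Compounds A and B share a cell, as do C and E, so only three representatives are returned for five compounds.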

A diversity metric is a function to aid the quantification of the diversity of a set of compounds in some predefined chemical space. Diversity metrics fall into three main classes: (1) Distance-based methods, which express diversity as a function of the pairwise molecular dissimilarities defined through measurement. (2) Cell-based methods, which define diversity in terms of occupancy of a finite number of cells that represent disjoint regions of chemical space. (3) Variance-based methods, which quantify diversity based on the degree of correlation between a compound's important features. [Pg.138]

Cell-based diversity attempts to quantify diversity by dividing chemical space into hyper-rectangular regions and measuring the occupancy of the resulting cells. One advantage of these methods is that, unlike distance-based techniques, they can... [Pg.140]
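As a rough illustration, one possible cell-based diversity score is the fraction of cells that contain at least one compound. The Python sketch below assumes equal-width bins and the same number of bins on every axis; the function name and the choice of score are assumptions, since the source does not give a specific formula.

```python
def occupancy_fraction(points, bounds, bins_per_axis):
    """Fraction of hyper-rectangular cells that contain at least one point.

    points:        iterable of descriptor tuples, one value per axis
    bounds:        list of (low, high) ranges, one per axis
    bins_per_axis: number of equal-width bins on every axis
    """
    occupied = set()
    for point in points:
        cell = []
        for value, (low, high) in zip(point, bounds):
            width = (high - low) / bins_per_axis
            idx = int((value - low) // width)
            cell.append(min(max(idx, 0), bins_per_axis - 1))  # clamp to edge bins
        occupied.add(tuple(cell))
    return len(occupied) / bins_per_axis ** len(bounds)


unit_square = [(0.0, 1.0), (0.0, 1.0)]
clustered = [(0.1, 0.1), (0.2, 0.3), (0.4, 0.2)]
spread = [(0.1, 0.1), (0.9, 0.1), (0.1, 0.9), (0.9, 0.9)]
print(occupancy_fraction(clustered, unit_square, 2))  # 0.25
print(occupancy_fraction(spread, unit_square, 2))     # 1.0
```

The clustered set falls entirely into one of four cells, while the spread set touches all four, which is the behaviour an occupancy-based measure is meant to capture.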

An advantage of cell-based methods is that they allow the explicit identification of those regions of the chemical space that are underrepresented, or even unrepresented (i.e., diversity voids), in a database, thus suggesting alternative potential structures to those of the existing chemicals [Pearlman and Smith, 1998]. [Pg.86]
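Because the cells form a finite, enumerable grid, diversity voids can be listed directly. A minimal Python sketch, assuming cells are represented as tuples of bin indices and that the set of occupied cells has already been computed (the function name `empty_cells` is hypothetical):

```python
from itertools import product


def empty_cells(occupied, bins_per_axis):
    """List every cell of the grid that holds no compound.

    occupied:      set of cells, each a tuple of bin indices
    bins_per_axis: list with the number of bins on each axis
    """
    all_cells = product(*(range(n) for n in bins_per_axis))
    return [cell for cell in all_cells if cell not in occupied]


occupied = {(0, 0), (0, 1), (1, 1)}
print(empty_cells(occupied, [2, 3]))  # [(0, 2), (1, 0), (1, 2)]
```

Enumerating the full grid is only practical in low-dimensional spaces, since the number of cells grows multiplicatively with every added axis.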

Property filters are a particular implementation of partitioning methods; they are used to select drug-like or lead-like compounds from large chemical libraries. Like the cell-based methods, these filters are based on a partition of the chemical space, but each selected molecular descriptor is divided into only two or three subranges of values. While property filters mainly aim at optimizing drug-likeness, cell-based methods aim at optimizing the diversity of chemical libraries. [Pg.87]
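A property filter of this kind reduces to a range check per descriptor. In the Python sketch below each descriptor is split into an accepted subrange and everything outside it; the descriptor names and cutoff values are placeholders chosen for illustration and are not taken from the source.

```python
# Hypothetical accepted subrange (low, high) for each descriptor.
FILTER_RANGES = {
    "mol_weight": (0.0, 500.0),
    "clogp": (-2.0, 5.0),
    "h_bond_donors": (0, 5),
}


def passes_filter(compound, ranges=FILTER_RANGES):
    """True if every filtered descriptor of the compound lies in its accepted subrange."""
    return all(low <= compound[name] <= high for name, (low, high) in ranges.items())


candidates = [
    {"id": "X1", "mol_weight": 350.0, "clogp": 2.1, "h_bond_donors": 2},
    {"id": "X2", "mol_weight": 720.0, "clogp": 6.3, "h_bond_donors": 1},
]
print([c["id"] for c in candidates if passes_filter(c)])  # ['X1']
```

Unlike the finer grid used for diversity analysis, the filter only distinguishes inside from outside the accepted range on each axis, so it selects a single region of the space rather than sampling across many cells.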

Unlike other cell-based diversity methods, which select one compound from each cell of the chemical space, JEDA allows selection of more than one compound belonging to the same cell. [Pg.88]
