Algorithm dissimilarity

In dissimilarity-based compound selection, the required subset of molecules is identified directly, using an appropriate measure of dissimilarity (often taken to be the complement of the similarity). This contrasts with the two-stage procedure in cluster analysis, where it is first necessary to group the molecules together and then decide which to select. Most methods for dissimilarity-based selection fall into one of two categories: maximum dissimilarity algorithms and sphere exclusion algorithms [Snarey et al. 1997]. [Pg.699]
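As a minimal sketch of "dissimilarity as the complement of similarity", the Python below assumes that each molecule is described by a binary fingerprint stored as a set of on-bit indices and that the similarity measure is the Tanimoto coefficient. Both choices, and the function names, are illustrative assumptions rather than something the excerpt prescribes.

```python
def tanimoto_similarity(a, b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    shared = len(a & b)
    union = len(a) + len(b) - shared
    # Assumed convention: two empty fingerprints count as identical.
    return shared / union if union else 1.0


def dissimilarity(a, b):
    """Dissimilarity taken as the complement of the similarity."""
    return 1.0 - tanimoto_similarity(a, b)


# Hypothetical fingerprints: 2 shared bits out of 5 distinct bits.
fp_a = frozenset({1, 4, 7, 9})
fp_b = frozenset({1, 4, 8})
print(dissimilarity(fp_a, fp_b))
```

Any similarity coefficient bounded between 0 and 1 could be substituted for the Tanimoto coefficient here without changing the complement step.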

The maximum dissimilarity algorithm works in an iterative manner: at each step one compound is selected from the database and added to the subset [Kennard and Stone 1969]. The compound selected is chosen to be the one most dissimilar to the current subset. There are many variants on this basic algorithm, which differ in the way in which the first compound is chosen and how the dissimilarity is measured. Three possible choices for the initial compound are (a) select it at random, (b) choose the molecule which is most representative (e.g. has the largest sum of similarities to the other molecules) or (c) choose the molecule which is most dissimilar (e.g. has the smallest sum of similarities to the other molecules). [Pg.699]
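The iteration described above can be sketched in Python as follows. This is one possible variant, not the definitive algorithm: it assumes set-based fingerprints, a Tanimoto-complement dissimilarity, and the "MaxMin" definition of dissimilarity to the subset (the distance to the nearest already-selected compound). The function names and the `how` parameter are hypothetical; the three seed options correspond to choices (a), (b) and (c).

```python
import random


def tanimoto(a, b):
    shared = len(a & b)
    union = len(a) + len(b) - shared
    return shared / union if union else 1.0


def pick_seed(fps, how="dissimilar", rng=None):
    """Choose the first compound: 'random', 'representative' or 'dissimilar'."""
    if how == "random":
        return (rng or random).randrange(len(fps))
    # Sum of similarities from each molecule to all the others.
    sums = [sum(tanimoto(fp, other) for j, other in enumerate(fps) if j != i)
            for i, fp in enumerate(fps)]
    if how == "representative":
        return max(range(len(fps)), key=sums.__getitem__)
    if how == "dissimilar":
        return min(range(len(fps)), key=sums.__getitem__)
    raise ValueError(f"unknown seed rule: {how!r}")


def max_dissimilarity_select(fps, k, how="dissimilar", rng=None):
    """Return the indices of k compounds chosen by MaxMin selection."""
    if not 0 < k <= len(fps):
        raise ValueError("k must be between 1 and the number of compounds")
    selected = [pick_seed(fps, how, rng)]
    # min_dist[i]: dissimilarity from compound i to its nearest selected compound.
    min_dist = [1.0 - tanimoto(fp, fps[selected[0]]) for fp in fps]
    while len(selected) < k:
        candidates = [i for i in range(len(fps)) if i not in selected]
        nxt = max(candidates, key=min_dist.__getitem__)
        selected.append(nxt)
        for i, fp in enumerate(fps):
            min_dist[i] = min(min_dist[i], 1.0 - tanimoto(fp, fps[nxt]))
    return selected


# Hypothetical toy database: two structural families.
fps = [frozenset(s) for s in ({1, 2, 3}, {1, 2, 4}, {1, 2, 3, 4}, {7, 8, 9}, {7, 8, 10})]
print(max_dissimilarity_select(fps, 2))
```

Keeping a running `min_dist` list means each step compares every compound only with the newly added one, rather than recomputing distances to the whole subset. Replacing the minimum with a sum would give a "MaxSum" variant instead.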

Holliday J D, S S Ranade and P Willett 1995. A Fast Algorithm For Selecting Sets Of Dissimilar Molecules From Large Chemical Databases. Quantitative Structure-Activity Relationships 14 501-506. [Pg.739]

The biggest limitation of the CoMFA method is the alignment step. The algorithm superimposes the portions of the inhibitors that are of similar structure, assuming that they bind with similar orientations in the active site of the enzyme, which is not necessarily the case. Also, because of a problem with alignment, a CoMFA may fail when a few molecules are very dissimilar from all others in the series. Like QSAR, CoMFA does not require a structure of the relevant biological receptor, but does require knowledge about a series of inhibitory compounds. [Pg.328]

Snarey M, Terrett NK, Willett P, Wilton DJ. Comparison of algorithms for dissimilarity-based compound selection. J Mol Graph Model 1997 15 372-85. [Pg.206]

Holliday, J.D., Ranade, S.S., and Willett, P. A fast algorithm for selecting sets of dissimilar molecules from large chemical databases. Quant. Struc.-Act. Relat. 1995, 14, 501-506. [Pg.109]

Willett, P. Dissimilarity-based algorithms for selecting structurally diverse sets of compounds. J. Comput. Biol. 1999, 6, 447-457. [Pg.172]

The D-score is computed using the maximum dissimilarity algorithm of Lajiness (20). This method uses a Tanimoto-like similarity measure defined on a 360-bit fragment descriptor used in conjunction with the Cousin/ChemLink system (21). The important feature of this method is that it starts with the selection of a seed compound, with subsequent compounds selected based on maximum diversity relative to all compounds already selected. Thus, the most obvious seed to use in the current scenario is the compound that has the best profile based on the already computed scores. This requires computing a preliminary consensus score based on the Q-score and the B-score using weights as defined previously. To summarize this, one needs to... [Pg.121]

Once a database of candidate molecules has been prepared, it may be desirable to select a diverse set of molecules. Diversity algorithms are designed to select sets of molecules in such a way that the chemical space from which they have been extracted is sampled democratically [29]. Molecules are represented in this space using molecular descriptors, and dissimilarity between them is quantified using metrics derived from the values of the descriptors. In terms of descriptors that have been used for fragment molecules,... [Pg.45]

The DIVSEL program was developed by Pickett et al. for combinatorial reagent selection using three-point pharmacophores as the descriptor for similarity calculations [2]. The algorithm starts by selecting the compound most dissimilar to the others in the set and then iteratively selects compounds most dissimilar to those already selected. DIVSEL was used to select a set of carboxylic acids from a collection of 1100 monocarboxylic acids for an amide library, based on the pharmacophoric diversity of the products. Eleven diverse amines were selected based on pharmacophoric diversity. A virtual library of 12100 amides was constructed from the 11 amines and 1100 carboxylic acids. The DIVSEL program used the pharmacophore fingerprints for the product virtual library to select a diverse set of the carboxylic acids. The products of 90 acids with the 11 amines selected with DIVSEL covered 85% of the three-point pharmacophores represented by the entire 12100 compound virtual library. [Pg.194]
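The coverage figure quoted above can be illustrated with a small Python sketch. It is not the DIVSEL code: the toy reagents, the key numbers and the function names are hypothetical, and treating a product fingerprint as the union of its reagents' keys is a deliberate simplification made only to keep the example short. Coverage is assumed here to mean the fraction of all pharmacophore keys in the full virtual library that also occur in the products of the chosen acids.

```python
from itertools import product


def coverage(selected_fps, all_fps):
    """Fraction of the keys present in all_fps that also occur in selected_fps."""
    all_keys = set().union(*all_fps)
    if not all_keys:
        return 0.0
    covered = set().union(*selected_fps)
    return len(covered & all_keys) / len(all_keys)


def acid_subset_coverage(library, chosen_acids):
    """Coverage achieved by the products of the chosen acids with every amine."""
    chosen = set(chosen_acids)
    selected = [fp for (amine, acid), fp in library.items() if acid in chosen]
    return coverage(selected, library.values())


# Hypothetical toy reagents, each with a set of pharmacophore keys.
amine_keys = {"A1": {1, 2}, "A2": {2, 3}}
acid_keys = {"C1": {10, 11}, "C2": {11, 12}, "C3": {10, 11, 12, 13}}

# Full virtual library: every amine combined with every acid (2 x 3 = 6 products),
# in the same way that 11 amines x 1100 acids gives 12100 amides.
library = {(am, ac): amine_keys[am] | acid_keys[ac]
           for am, ac in product(amine_keys, acid_keys)}

print(len(library))
print(acid_subset_coverage(library, ["C1"]))
```

Scoring acids by the fingerprints of their products, rather than by the acids alone, mirrors the product-based selection described in the excerpt.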

Spatially resolved material identification and classification is currently the prevalent application for SI systems. Of the many powerful spectral classifiers available, only two types, each with a number of different algorithms [14], could successfully be applied for real-time SI applications: discriminant classifiers and dissimilarity-based classifiers. In addition, dedicated algorithms, such as fuzzy classifiers, may occasionally be useful for special applications, for example, when there is no ab initio knowledge about the number and properties of the classification classes. [Pg.166]
