Similarity Measure Selection

In general, different similarity measures yield different rankings, except when one measure is a monotonic function of the other. Improved results can be obtained by using data fusion methods to combine the rankings resulting from different coefficients. [Pg.312]
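The point about monotonic measures can be illustrated with a minimal sketch. It assumes fingerprints are represented as Python sets of "on" bit positions; the toy fingerprints, the helper names, and the rank-sum fusion rule are illustrative assumptions, not taken from the cited text. Dice is a monotonic function of Tanimoto (Dice = 2T / (1 + T)), so the two always rank identically, whereas cosine may not.

```python
from math import sqrt

def tanimoto(a, b):
    c = len(a & b)
    return c / (len(a) + len(b) - c)

def dice(a, b):
    return 2 * len(a & b) / (len(a) + len(b))

def cosine(a, b):
    return len(a & b) / sqrt(len(a) * len(b))

def ranks(scores):
    """Map each name to its rank (1 = most similar to the query)."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {name: i + 1 for i, name in enumerate(ordered)}

def fuse_by_rank_sum(*rankings):
    """One simple fusion rule (an assumption): order by the sum of ranks."""
    names = list(rankings[0])
    total = {n: sum(r[n] for r in rankings) for n in names}
    return sorted(names, key=total.get)

# Hypothetical fingerprints as sets of "on" bits.
query = set(range(9))
library = {
    "m1": set(range(36)),                # contains the query but is much larger
    "m2": {0, 1, 2, 3} | set(range(100, 105)),
    "m3": {0, 200, 201, 202},
}

rank_t = ranks({k: tanimoto(query, v) for k, v in library.items()})
rank_d = ranks({k: dice(query, v) for k, v in library.items()})
rank_c = ranks({k: cosine(query, v) for k, v in library.items()})

print(rank_t, rank_d, rank_c)
print(fuse_by_rank_sum(rank_t, rank_c))
```

With this toy data Tanimoto and Dice agree, while cosine swaps the first two molecules because it penalizes the size mismatch of m1 less strongly.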

Empirically, the Dice coefficient has worked better than cosine similarity in retrieving actives and is the standard choice for use with the ap and tt descriptors. [Pg.312]

Asymmetric similarity measures allow fuzzy super- and substructure searching. A substructure search is defined as looking for structures containing the given query, and a superstructure search is defined as looking for structures embedded in the given query. In both cases asymmetric local similarity is estimated. [Pg.312]
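As a hedged illustration of an asymmetric measure, the sketch below uses the Tversky index. The cited text does not name a specific coefficient, so this choice, the set-based fingerprint representation, and the example data are all assumptions. Weighting only the query's unmatched features gives a fuzzy substructure score; weighting only the target's unmatched features gives a fuzzy superstructure score.

```python
def tversky(query, target, alpha, beta):
    """Tversky index on sets of features.

    alpha weights features found only in the query,
    beta weights features found only in the target.
    """
    common = len(query & target)
    only_query = len(query - target)
    only_target = len(target - query)
    denom = common + alpha * only_query + beta * only_target
    return common / denom if denom else 0.0

# Hypothetical feature sets: the query is fully contained in the target.
query = {1, 2, 3}
target = {1, 2, 3, 4, 5, 6}

substructure_like = tversky(query, target, alpha=1.0, beta=0.0)    # fraction of query found in target
superstructure_like = tversky(query, target, alpha=0.0, beta=1.0)  # fraction of target found in query
symmetric = tversky(query, target, alpha=1.0, beta=1.0)            # reduces to Tanimoto

print(substructure_like, superstructure_like, symmetric)
```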

Lewis R A, J S Mason and I M McLay 1997. Similarity Measures for Rational Set Selection and Analysis of Combinatorial Libraries: The Diverse Property-Derived (DPD) Approach. Journal of Chemical Information and Computer Science 37:599-614. [Pg.740]

Lewis RA, Mason JS, McLay IM. Similarity measures for rational set selection and analysis of combinatorial libraries: The diverse property-derived (DPD) approach. J Chem Inf Comput Sci 1997; 37: 599-614. [Pg.207]

The NO + CO reaction is only partially described by the reactions (2)-(7), as there should also be steps to account for the formation of N2O, particularly at lower reaction temperatures. Figure 10.9 shows the rates of CO2, N2O and N2 formation on the (111) surface of rhodium in the form of Arrhenius plots. Comparison with similar measurements on the more open Rh(110) surface confirms again that the reaction is strongly structure sensitive. As N2O is undesirable, it is important to know under what conditions its formation is minimized. First, the selectivity to N2O, expressed as the ratio given in Eq. (7), decreases drastically at the higher temperatures where the catalyst operates. Secondly, real three-way catalysts contain rhodium particles in the presence of CeO promoters, and these appear to suppress N2O formation [S.H. Oh, J. Catal. 124 (1990) 477]. Finally, N2O undergoes further reaction with CO to give N2 and CO2, which is also catalyzed by rhodium. [Pg.390]

The similarities between all pairs of objects are measured using one of the measures described earlier. This yields the similarity matrix or, if the distance is used as a measure of (dis)similarity, the distance matrix. It is a symmetrical n × n matrix containing the similarities between each pair of objects. Let us suppose, for example, that the meteorites A, B, C, D, and E in Table 30.3 have to be classified and that the distance measure selected is Euclidean distance. Using eq. (30.4), one obtains the similarity matrix in Table 30.4. Because the matrix is symmetrical, only half of this matrix needs to be used. [Pg.68]
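The construction of such a distance matrix can be sketched as follows. The values of Table 30.3 are not reproduced in this excerpt, so the coordinates below are hypothetical placeholders chosen only to make the arithmetic easy to check; only the lower triangle is computed and then mirrored, reflecting the symmetry noted above.

```python
from math import dist  # Euclidean distance, available from Python 3.8

def distance_matrix(points):
    """Return the symmetric n x n Euclidean distance matrix as nested lists."""
    n = len(points)
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i):          # lower triangle only
            d[i][j] = d[j][i] = dist(points[i], points[j])
    return d

# Hypothetical two-variable measurements for objects A, B, C (not Table 30.3).
labels = ["A", "B", "C"]
points = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]

D = distance_matrix(points)
for label, row in zip(labels, D):
    print(label, row)
```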

The instrument scan mode called selected reaction monitoring (SRM) is generally used for quantitative applications. SRM is similar to selected ion monitoring (SIM) in single quadrupole MS. The difference is that a product ion from the decomposition reaction in the collision cell is measured instead of a single ion formed in the... [Pg.831]

A number of performance criteria are not primarily dedicated to the users of a model but are applied in model generation and optimization. For instance, the mean squared error (MSE) or similar measures are considered for optimization of the number of components in PLS or PCA. For variable selection, the models to be compared have different numbers of variables; in this case, and especially if a fit criterion is used, the performance measure must consider the number of variables. Appropriate measures are the adjusted squared correlation coefficient, adjR², or Akaike's information criterion (AIC); see Section 4.2.3. [Pg.124]
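A minimal sketch of two such penalized criteria follows. It assumes the commonly used forms adjR² = 1 - (1 - R²)(n - 1)/(n - m - 1) and AIC = n ln(RSS/n) + 2m for a least-squares model with n objects and m variables; the exact definitions in the referenced Section 4.2.3 may differ (for example in how the intercept is counted), and the numbers are invented for illustration.

```python
from math import log

def adjusted_r2(r2, n, m):
    """Adjusted R^2 for n objects and m variables (intercept not counted)."""
    return 1 - (1 - r2) * (n - 1) / (n - m - 1)

def aic(rss, n, m):
    """AIC for a least-squares fit, up to an additive constant."""
    return n * log(rss / n) + 2 * m

# Hypothetical comparison: a larger model fits slightly better but uses more variables.
n = 21
small = {"m": 5, "r2": 0.90, "rss": 10.0}
large = {"m": 15, "r2": 0.92, "rss": 8.0}

for name, model in (("small", small), ("large", large)):
    print(name,
          round(adjusted_r2(model["r2"], n, model["m"]), 3),
          round(aic(model["rss"], n, model["m"]), 2))
```

In this invented case the larger model has the higher R² but the lower adjusted R² and the higher AIC, so both criteria prefer the smaller model.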

The distribution of Tanimoto indices for randomly selected (or all) pairs of structures characterizes the diversity of a chemical structure database (Demuth et al. 2004; Scsibrany et al. 2003). For structure similarity searches, a number of other similarity measures have been suggested (Gasteiger and Engel 2003; Willett 1987). [Pg.270]
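The idea of characterizing a database by its pairwise Tanimoto distribution can be sketched as below. Fingerprints are assumed to be sets of "on" bits, the four structures are toy data, and the five-bin histogram is an arbitrary illustrative choice.

```python
from itertools import combinations
from statistics import mean

def tanimoto(a, b):
    union = len(a | b)
    return len(a & b) / union if union else 0.0

def pairwise_tanimoto(fingerprints):
    """Tanimoto index for every unordered pair of structures."""
    return [tanimoto(a, b) for a, b in combinations(fingerprints, 2)]

def histogram(values, n_bins=5):
    """Count values in equal-width bins over [0, 1]; 1.0 goes in the last bin."""
    counts = [0] * n_bins
    for v in values:
        counts[min(int(v * n_bins), n_bins - 1)] += 1
    return counts

# Hypothetical database of four fingerprints.
database = [{1, 2, 3}, {1, 2, 4}, {5, 6, 7}, {1, 2, 3}]

values = pairwise_tanimoto(database)
print("mean Tanimoto:", round(mean(values), 3))
print("histogram:", histogram(values))
```

A distribution concentrated near zero suggests a diverse database; one concentrated near one suggests many near-duplicates.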

The ability of a liquid to "wet" the membrane material is an indication of that liquid's ability to establish and maintain such an interfacial layer. Liquids with surface tension values less than the critical surface tension (γc) of the membrane material are capable of completely "wetting" the polymer. It may therefore be possible to select membrane materials capable of accomplishing specific separations by their ability to be wet by one solution component but not by the other. For this reason, the γc of membrane materials is important. By employing the standard techniques of Zisman (43), the critical surface tensions for PSF and CA were determined to be 43.0 and 36.5 dynes/cm, respectively. These data indicate that PSF is more readily wet by a larger number of liquids than is CA. Similar measurements for the various sulfonated polysulfones are underway. [Pg.337]

The D-score is computed using the maximum dissimilarity algorithm of Lajiness (20). This method utilizes a Tanimoto-like similarity measure defined on a 360-bit fragment descriptor used in conjunction with the Cousin/ChemLink system (21). The important feature of this method is that it starts with the selection of a seed compound with subsequent compounds selected based on the maximum diversity relative to all compounds already selected. Thus, the most obvious seed to use in the current scenario is the compound that has the best profile based on the already computed scores. Thus, one needs to compute a preliminary consensus score based on the Q-score and the B-score using weights as defined previously. To summarize this, one needs to... [Pg.121]
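The seeded selection described above can be sketched as a greedy procedure. This is a MaxMin-style illustration under stated assumptions, not necessarily the exact procedure of Lajiness: fingerprints are toy sets of "on" bits rather than the 360-bit fragment descriptor, dissimilarity is taken as 1 minus Tanimoto, and each new compound is the one whose nearest already-selected compound is farthest away.

```python
def tanimoto(a, b):
    union = len(a | b)
    return len(a & b) / union if union else 0.0

def max_dissimilarity_pick(fingerprints, seed, k):
    """Greedily pick k indices, starting from the seed index."""
    selected = [seed]
    # Distance from each unselected compound to its nearest selected compound.
    nearest = {
        i: 1 - tanimoto(fp, fingerprints[seed])
        for i, fp in enumerate(fingerprints)
        if i != seed
    }
    while len(selected) < k and nearest:
        nxt = max(nearest, key=nearest.get)   # most dissimilar to the current selection
        selected.append(nxt)
        del nearest[nxt]
        for i in nearest:
            d = 1 - tanimoto(fingerprints[i], fingerprints[nxt])
            nearest[i] = min(nearest[i], d)
    return selected

# Hypothetical fingerprints; index 0 plays the role of the best-scoring seed.
fps = [{1, 2, 3}, {1, 2, 4}, {7, 8, 9}, {1, 2, 3, 4}, {3, 7, 8}]
picks = max_dissimilarity_pick(fps, seed=0, k=3)
print(picks)
```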

With Ga-Beta it was found that, when the Si/Ga ratio increased from 10 to 40, the number of strong sites decreased drastically for Si/Ga between 10 and 25 and then reached a plateau above Si/Ga = 25 [53]. The strength and density of acid sites in H(Ga, La)-Y were also found to be lower than those in HY crystals of the type used in FCC preparation (LZY-82) [250]. Similar catalytic selectivities were obtained for both Ga-ZSM5 and Al-ZSM5 in Prins condensation of isobutylene with formaldehyde. Catalytic tests coupled with microcalorimetric measurements have shown that medium to weak acid strength sites favor the selectivity to isoprene [254]... [Pg.247]

First, let us define some key terms. One method for quantifying a reaction's efficiency is by examining the reactant conversion, the product selectivity, and the product yield over time. The reactant conversion is the fraction of reactant molecules that have transformed to product molecules (regardless of which product it is). The selectivity to product P is the fraction (or percentage) of the converted reactant that has turned into this specific product P. The yield of P is simply conversion × selectivity. High conversions in short time spans mean smaller and safer reactors. Similarly, high selectivity means less waste, and simpler and cheaper separation units. Thus, conversion, selectivity, and yield are all measures of the reaction efficiency. [Pg.4]
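These three definitions can be checked with a short worked example. The sketch assumes 1:1 stoichiometry between reactant and product P, and the mole amounts are invented for illustration.

```python
from math import isclose

def conversion(n_start, n_left):
    """Fraction of the reactant that has been converted."""
    return (n_start - n_left) / n_start

def selectivity(n_product, n_start, n_left):
    """Fraction of the converted reactant that ended up as product P (1:1 stoichiometry assumed)."""
    return n_product / (n_start - n_left)

def product_yield(n_product, n_start, n_left):
    """Yield of P as conversion times selectivity."""
    return conversion(n_start, n_left) * selectivity(n_product, n_start, n_left)

# Hypothetical batch: 100 mol reactant fed, 20 mol left, 60 mol of P formed.
x = conversion(100.0, 20.0)
s = selectivity(60.0, 100.0, 20.0)
y = product_yield(60.0, 100.0, 20.0)
print(f"conversion={x:.2f} selectivity={s:.2f} yield={y:.2f}")
```

Under the 1:1 assumption the yield equals the moles of P formed divided by the moles of reactant fed, here 60/100.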

After selecting a measure one has to decide which clustering algorithm (strategy) may be appropriate. Sometimes it is necessary for the algorithm to fit the similarity measure. In most cases one wishes to use the algorithm which yields the most interpretable or plausible data structure. [Pg.156]

The objective of a spread design is to identify a subset of molecules in which the molecules are as dissimilar as possible under a given similarity metric. For a given metric to measure the similarity of a subset, all subsets of size k (plus any molecules previously selected) could be evaluated and the subset that produces the lowest similarity measure chosen. In practice, simple non-optimal sequential algorithms are often used to approximate the maximally dissimilar subset; two such algorithms are described below. [Pg.84]
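The exhaustive evaluation mentioned above can be sketched for a very small library. The subset similarity used here (mean pairwise Tanimoto) is an assumed choice of metric, the fingerprints are toy sets of "on" bits, and previously selected molecules are ignored for brevity. The number of subsets grows combinatorially with library size, which is why sequential approximations are preferred in practice.

```python
from itertools import combinations

def tanimoto(a, b):
    union = len(a | b)
    return len(a & b) / union if union else 0.0

def subset_similarity(fingerprints, indices):
    """Mean pairwise Tanimoto within a subset (requires at least two members)."""
    pairs = list(combinations(indices, 2))
    return sum(tanimoto(fingerprints[i], fingerprints[j]) for i, j in pairs) / len(pairs)

def best_spread_subset(fingerprints, k):
    """Evaluate every subset of size k and return the least self-similar one."""
    candidates = combinations(range(len(fingerprints)), k)
    return min(candidates, key=lambda idx: subset_similarity(fingerprints, idx))

# Hypothetical library of four fingerprints.
fps = [{1, 2, 3}, {1, 2, 4}, {3, 7, 8}, {20, 21}]
best = best_spread_subset(fps, k=3)
print(best, subset_similarity(fps, best))
```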

This example shows how spread designs can be used to solve practical problems. There is always a descriptor selection problem, as chemists continue to invent new molecular descriptors. Which should be used? Which molecular similarity measure performs best? Controlled experiments are expensive. Simulation can be used as a guide. [Pg.88]

Another approach of great importance for studies of excited state dynamics is sub-picosecond time resolved spectroscopy. A number of authors have reported femtosecond pump-probe measurements of excited state lifetimes in A, C, T, and G [13-16] and base pair mimics [17]. Schultz et al. have reported time resolved photoelectron spectroscopy and electron-ion coincidence of base pair mimics [18]. These studies can also be compared with similar measurements in solution [19-24]. While time resolved measurements provide direct lifetime data, they do have the limitation that the inherent bandwidth reduces the spectral resolution required for selecting specific electronic states and for selecting single isomers, such as cluster structure and tautomeric form. [Pg.326]
