Measures of (dis)similarity

The similarities between all pairs of objects are measured using one of the measures described earlier. This yields the similarity matrix or, if distance is used as the measure of (dis)similarity, the distance matrix. It is a symmetrical n × n matrix containing the similarities between each pair of objects. Let us suppose, for example, that the meteorites A, B, C, D, and E in Table 30.3 have to be classified and that the distance measure selected is the Euclidean distance. Using eq. (30.4), one obtains the similarity matrix in Table 30.4. Because the matrix is symmetrical, only half of this matrix needs to be used. [Pg.68]
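The construction of such a distance matrix can be sketched in a few lines of Python. This is a minimal illustration, assuming that eq. (30.4) is the usual Euclidean distance; the values in `samples` are hypothetical and are not the meteorite data of Table 30.3, and the function name `euclidean_distance_matrix` is chosen here for illustration only.

```python
import math


def euclidean_distance_matrix(objects):
    """Return the symmetric n x n matrix of pairwise Euclidean distances."""
    n = len(objects)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # Only the upper triangle is computed; the lower one is mirrored,
        # because the matrix is symmetrical.
        for j in range(i + 1, n):
            d = math.sqrt(sum((a - b) ** 2 for a, b in zip(objects[i], objects[j])))
            dist[i][j] = d
            dist[j][i] = d
    return dist


# Hypothetical two-variable measurements for five objects (not Table 30.3).
samples = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0), (0.0, 5.0), (1.0, 1.0)]
D = euclidean_distance_matrix(samples)
```

The diagonal is zero and `D[i][j] == D[j][i]`, which is why only half of the matrix needs to be stored or inspected.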

Applications. Quantitative dry ashing (typically at 800 °C to 1200 °C for at least 8 h), followed by acid dissolution and subsequent measurement of metals in an aqueous solution, is often a difficult task, as such treatment frequently results in loss of analyte (e.g. in the cases of Cd, Zn and P because of their volatility). Nagourney and Madan [20] have compared the ashing/acid dissolution and direct organic solubilisation procedures for stabiliser analysis in the determination of phosphorus in tri-(2,4-di-t-butylphenyl)phosphite. Dry ashing is of limited value for polymer analysis. Crompton [21] has reported the analysis of Li, Na, V and Cu in polyolefins. Similarly, for the determination of Al and V catalyst residues in polyalkenes and polyalkene copolymers, the sample was ignited and the ash dissolved in acids; V5+ was determined photo-absorptiometrically and Al3+ by complexometric titration [22]. [Pg.594]

On the other hand, factor analysis (FA) involves other manipulations of the eigenvectors and aims to gain insight into the structure of a multidimensional data set. The use of this technique was first proposed in biological structure-activity relationship (SAR) studies and illustrated with an analysis of the activities of 21 di-phenylaminopropanol derivatives in 11 biological tests [116-119, 289]. The method has been more commonly used to determine the intrinsic dimensionality of certain experimentally determined chemical properties, that is, the number of fundamental factors required to account for the variance. One of the best FA techniques is the Q-mode, which groups a multivariate data set according to the data structure defined by the similarity between samples [1, 313-316]. It is devoted exclusively to the interpretation of the inter-object relationships in a data set, rather than to the inter-variable (or covariance) relationships explored with R-mode factor analysis. The measure of similarity used is the cosine theta matrix, i.e., the matrix whose elements are the cosines of the angles between all sample pairs [1, 313-316]. [Pg.269]
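The cosine theta matrix mentioned above can be illustrated with a short Python sketch. It treats each sample as a vector of variables and computes the cosine of the angle between every pair of sample vectors. The function name `cosine_theta_matrix` and the data in `rows` are hypothetical, and no scaling or preprocessing of the variables is assumed.

```python
import math


def cosine_theta_matrix(samples):
    """Return the n x n matrix of cosines of the angles between sample vectors."""
    norms = [math.sqrt(sum(x * x for x in row)) for row in samples]
    if any(norm == 0.0 for norm in norms):
        raise ValueError("cosine is undefined for an all-zero sample vector")
    n = len(samples)
    cos = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            dot = sum(a * b for a, b in zip(samples[i], samples[j]))
            cos[i][j] = dot / (norms[i] * norms[j])
    return cos


# Hypothetical samples (rows) described by two variables (columns).
rows = [(1.0, 0.0), (0.0, 1.0), (2.0, 0.0)]
C = cosine_theta_matrix(rows)
```

Samples whose vectors point in the same direction get a cosine of 1 regardless of their magnitude, while orthogonal samples get 0. In Q-mode analysis it is this sample-by-sample matrix, rather than the variable-by-variable covariance matrix, that is decomposed.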

The greater stability of the 1:1 adducts in aprotic solvents is, then, attributed mainly to the enhanced reactivity of the attacking nucleophiles in these solvents. This factor should also favour the production of di-adducts in aprotic solvents, and NMR measurements do indicate that these are formed from trinitro-substituted compounds and methoxide ions in media rich in dimethyl sulphoxide. However, there is some evidence that the di-adducts are not particularly well solvated by dimethyl sulphoxide and are in fact better solvated by water. Thus it has been found that 1:2 adducts are very readily formed in water. For example, 1,3,5-trinitrobenzene gives both 1:1 and 1:2 adducts in fairly dilute solutions of hydroxide ions in water, while dimethyl-picramide and the picrate ion give evidence only for the production of 1:2 adducts. Similarly, a variety of trinitro-compounds are readily converted into di-adducts in aqueous sodium sulphite solution, although... [Pg.253]

An alternative way of measuring the dissimilarity of one compound to a set of compounds is to sum the pairwise dissimilarities between the compound and all compounds in the set, a method known as MaxSum. The most dissimilar compound to a set of compounds is the compound which has the maximum sum of pairwise dissimilarities. Holliday et al. [42] have implemented an efficient version of MaxSum that uses the cosine coefficient as the (dis)similarity coefficient. Their algorithm operates in O(nN) time complexity and can thus be applied to very large datasets. However, as Snarey et al. [43] have pointed out, there is a tendency for the algorithm to focus on outliers. [Pg.353]
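A MaxSum-style selection can be sketched as follows. The sketch takes the dissimilarity to be 1 minus the cosine coefficient. It relies on the fact that, for unit-length vectors, the sum of cosines between a candidate and all selected compounds equals the dot product of the candidate with the sum of the selected vectors, so each candidate can be scored without looping over the selected set. Whether this matches the implementation of Holliday et al. in detail is an assumption; the function names, the seed choice and the descriptor values are hypothetical.

```python
import math


def normalize(vector):
    """Scale a descriptor vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        raise ValueError("cannot normalize an all-zero descriptor vector")
    return [x / norm for x in vector]


def maxsum_select(compounds, n_select, seed_index=0):
    """Greedily pick compounds that maximize the summed (1 - cosine) dissimilarity."""
    unit = [normalize(v) for v in compounds]
    selected = [seed_index]
    # Running sum of the unit vectors of all selected compounds.
    selected_sum = list(unit[seed_index])
    remaining = set(range(len(unit))) - {seed_index}
    while len(selected) < n_select and remaining:
        # Sum of dissimilarities = len(selected) - dot(candidate, selected_sum),
        # so the maximum sum corresponds to the minimum dot product.
        best = min(
            sorted(remaining),
            key=lambda i: sum(a * b for a, b in zip(unit[i], selected_sum)),
        )
        selected.append(best)
        remaining.remove(best)
        selected_sum = [s + u for s, u in zip(selected_sum, unit[best])]
    return selected


# Hypothetical two-dimensional descriptors for four compounds.
descriptors = [(1.0, 0.0), (0.9, 0.1), (0.0, 1.0), (1.0, 1.0)]
picked = maxsum_select(descriptors, 3)
```

Starting from compound 0, the sketch first picks compound 2, whose vector is orthogonal to the seed, and then compound 1. Each round costs one dot product per remaining compound, so selecting n compounds from N takes on the order of nN dot products.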

Big Chemical Encyclopedia