Internal clustering criterion

The internal clustering criterion allows us to formulate clustering as an optimization problem. Unfortunately, this optimization problem is NP-hard, making it intractable for all but the smallest problem instances. Hence, a number of heuristic approaches have been advocated, and in many cases these approaches do not explicitly specify the internal criterion being optimized. [Pg.135]

Clustering problems can have numerous formulations depending on the choices for data structure, similarity/distance measure, and internal clustering criterion. This section first describes a very general formulation, then it details special cases that correspond to two popular classes of clustering algorithms: partitional and hierarchical. [Pg.135]

Equations (1) and (2) represent the most general form of the optimal clustering problem. The objective is to find the clustering c that minimizes an internal clustering criterion J. J typically employs a similarity/dissimilarity measure to judge the quality of any c. The set C defines c's data structure and includes all the feasible clusterings of the set Q of all objects to be clustered. [Pg.136]
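The excerpt does not reproduce equations (1) and (2) themselves. Read against the description above, a plausible rendering of this general form (notation assumed, not quoted from the source) is:

```latex
% General optimal-clustering problem (assumed notation):
% find the clustering c* in the feasible set C that minimizes
% the internal criterion J, where Q is the set of objects to cluster.
c^{*} \;=\; \operatorname*{arg\,min}_{c \,\in\, C} J(c),
\qquad
C \;=\; \{\, c : c \ \text{is a feasible clustering of } Q \,\}.
```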

Agglomerative methods, such as single link and complete link, are stepwise procedures. The formulation in (5)-(7) allows us to define the hierarchical clustering problem in terms of combinatorial optimization. To do this, however, we need an appropriate internal clustering criterion. The most obvious is squared error. [Pg.139]
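For reference, the squared-error criterion mentioned here is conventionally written as follows (standard textbook form, assumed rather than quoted from the source):

```latex
% Squared-error criterion over K clusters C_1, ..., C_K with centroids m_k.
J_{\mathrm{SE}} \;=\; \sum_{k=1}^{K} \sum_{x \in C_k} \lVert x - m_k \rVert^{2},
\qquad
m_k \;=\; \frac{1}{\lvert C_k \rvert} \sum_{x \in C_k} x .
```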

We emphasize that the choice of the internal criterion for use in a partitional clustering problem is critical to the interpretability and usefulness of the resulting output. However, for computational reasons, it is desirable that the criterion be simple. One of the simplest internal clustering criteria is the total within-cluster distance ... [Pg.148]
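Because the formula itself is truncated in this excerpt, the sketch below uses one common definition of total within-cluster distance: the sum of pairwise Euclidean distances between objects assigned to the same cluster. The function name and this particular choice of definition are assumptions for illustration only.

```python
import numpy as np

def total_within_cluster_distance(X, labels):
    """Total within-cluster distance of a partitional clustering.

    Assumed definition (the source's exact formula is truncated):
    the sum, over all clusters, of pairwise Euclidean distances
    between points assigned to the same cluster.

    X      : (n, d) array of objects
    labels : length-n array of cluster indices
    """
    total = 0.0
    for k in np.unique(labels):
        members = X[labels == k]                      # points in cluster k
        # pairwise distances within the cluster, each pair counted once
        diffs = members[:, None, :] - members[None, :, :]
        dists = np.sqrt((diffs ** 2).sum(axis=-1))
        total += dists[np.triu_indices(len(members), k=1)].sum()
    return total

# Example: three 2-D points, two clusters
X = np.array([[0.0, 0.0], [1.0, 0.0], [10.0, 10.0]])
print(total_within_cluster_distance(X, np.array([0, 0, 1])))  # -> 1.0
```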

These results show clearly the importance of the optimization criterion to clustering. The computationally simple Ward's method performs better than the simulated annealing approach with a simplistic criterion. However, a criterion that more correctly accounts for the hierarchy, by minimizing the sum of squared error at each level, performs much better. As with partitional clustering, the application of simulated annealing to hierarchical clustering requires careful selection of the internal clustering criterion. [Pg.151]

Classical trajectory studies of the association reactions M+ + H2O and M+ + D2O with M = Li, Na, K (Hase et al. 1992; Hase and Feng 1981; Swamy and Hase 1982, 1984), Li+(H2O) + H2O (Swamy and Hase 1984), Li+ + (CH3)2O (Swamy and Hase 1984; Vande Linde and Hase 1988), and Cl- + CH3Cl (Vande Linde and Hase 1990a,b) are particularly relevant to cluster dynamics. In these studies, the occurrence of multiple inner turning points in the time dependence of the association radial coordinate was taken as the criterion for complex formation. A critical issue (Herbst 1982) is whether the collisions transfer enough energy from translation to internal motions to result in association. Comparison of association probabilities from various studies leads to the conclusion that softer and/or floppier ions and molecules that have low-frequency vibrations typically recombine the most efficiently. Thus, it has been found that Li+ + (CH3)2O association is more likely than Li+ + H2O association, and similarly H2O association with Li(H2O)+ is more likely than with the bare cation Li+. The authors found a nonmonotonic dependence of association probability on the assumed H2O bend frequency, and also a dependence on the impact parameter, the rotational temperature, and the orientation of the H2O dipole during the collision. [Pg.16]
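As an illustration of that complex-formation criterion (a sketch only, not the authors' trajectory code), counting inner turning points amounts to counting local minima of the radial coordinate sampled along a trajectory; the threshold of two turning points is an assumption:

```python
import numpy as np

def count_inner_turning_points(r):
    """Count inner turning points (local minima) of the association
    radial coordinate r(t) sampled along a trajectory."""
    r = np.asarray(r)
    # interior samples strictly smaller than both neighbours
    minima = (r[1:-1] < r[:-2]) & (r[1:-1] < r[2:])
    return int(minima.sum())

def formed_complex(r, min_turning_points=2):
    """Sketch of the criterion described in the text: multiple inner
    turning points are taken to indicate complex formation (the exact
    threshold used in the cited studies is an assumption here)."""
    return count_inner_turning_points(r) >= min_turning_points

# Example: a radial coordinate that oscillates several times before the
# fragments separate again
t = np.linspace(0.0, 10.0, 500)
r = 4.0 + 1.5 * np.cos(2.0 * t) + 0.2 * t
print(count_inner_turning_points(r), formed_complex(r))
```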

In this general formulation of the hierarchical clustering problem, the internal criterion J(t) is calculated recursively from all the subtrees u of t. The value e(u) is sometimes called the level of the subtree u in the dendrogram. In keeping with this interpretation, e is nonincreasing along paths from the root to the leaves. [Pg.138]
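A minimal sketch of that recursive evaluation, assuming a simple tree representation and a user-supplied per-subtree cost (the source's concrete criterion and level function are not reproduced here):

```python
from dataclasses import dataclass, field

@dataclass
class Subtree:
    """Dendrogram node: leaves carry objects, internal nodes carry a
    level e(u) and children (representation assumed for illustration)."""
    level: float = 0.0                        # e(u); nonincreasing root -> leaves
    children: list = field(default_factory=list)
    obj: object = None                        # payload for leaf nodes

def criterion(t, local_cost):
    """Evaluate an internal criterion J(t) recursively over all subtrees
    u of t, as described in the text; local_cost(u) is any per-subtree
    cost supplied by the caller (an assumption, since the source's J is
    not given here)."""
    return local_cost(t) + sum(criterion(u, local_cost) for u in t.children)

# Example: a two-level dendrogram whose criterion is the sum of levels
leaves = [Subtree(obj="a"), Subtree(obj="b"), Subtree(obj="c")]
t = Subtree(level=2.0, children=[Subtree(level=1.0, children=leaves[:2]), leaves[2]])
print(criterion(t, local_cost=lambda u: u.level))   # -> 3.0
```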

Lastly, we demonstrated the use of simulated annealing on examples from multi-sensor data fusion. These examples showed the effectiveness of simulated annealing in performing both hierarchical and partitional clustering. They also showed the importance of the internal criterion to the results obtained. [Pg.153]

Our results also demonstrated how simulated annealing can help choose the most appropriate internal criterion. In both the partitional and hierarchical cases, clustering performance was dramatically affected by this choice. In the partitional case, our testing results showed that Barker's criterion outperformed within-cluster distance. In fact, the worst Jaccard score for Barker's criterion was better than the average Jaccard score for within-cluster distance. [Pg.153]
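The Jaccard score used in these comparisons is, in its usual pairwise form (the standard definition, assumed to be the one intended here), computed from pairs of objects grouped together in the output clustering and in the reference clustering:

```python
from itertools import combinations

def jaccard_score(labels_pred, labels_true):
    """Pairwise Jaccard score between a clustering and a reference.

    Standard definition (assumed): J = a / (a + b + c), where
      a = pairs together in both clusterings,
      b = pairs together only in the predicted clustering,
      c = pairs together only in the reference clustering.
    """
    a = b = c = 0
    for i, j in combinations(range(len(labels_true)), 2):
        same_pred = labels_pred[i] == labels_pred[j]
        same_true = labels_true[i] == labels_true[j]
        if same_pred and same_true:
            a += 1
        elif same_pred:
            b += 1
        elif same_true:
            c += 1
    return a / (a + b + c) if (a + b + c) else 1.0

# Example: one object assigned to the wrong cluster
print(jaccard_score([0, 0, 1, 1], [0, 0, 0, 1]))  # -> 0.25
```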

