Big Chemical Encyclopedia


Between-class variance

When we consider the multivariate situation, it is again evident that the discriminating power of the combined variables will be good when the centroids of the two sets of objects are sufficiently distant from each other and when the clusters are tight or dense. In mathematical terms this means that the between-class variance is large compared with the within-class variances. [Pg.216]

However, in contrast to PCA, it is a supervised method that uses the information about which data point belongs to which class. The discriminants are linear combinations of the measured variables (e.g., sensor responses). A discriminant function is found that maximizes the ratio of between-class variance to within-class variance. [Pg.173]
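As a sketch of this criterion (the function name, the data, and the scatter convention are illustrative choices, not from the text), the per-variable ratio of between-class to within-class variance can be computed directly:

```python
import numpy as np

def between_within_ratio(X, y):
    """Per-variable ratio of between-class to within-class variance.

    X: (n_samples, n_vars) data matrix; y: class labels.
    Uses raw sums of squares; other texts divide by degrees of freedom,
    which rescales but does not reorder the ratios.
    """
    X = np.asarray(X, dtype=float)
    grand_mean = X.mean(axis=0)
    between = np.zeros(X.shape[1])
    within = np.zeros(X.shape[1])
    for c in np.unique(y):
        Xc = X[y == c]
        between += len(Xc) * (Xc.mean(axis=0) - grand_mean) ** 2
        within += ((Xc - Xc.mean(axis=0)) ** 2).sum(axis=0)
    return between / within

# Two classes: well separated in variable 0, overlapping in variable 1
X = np.array([[0.0, 5.0], [0.2, 4.8], [5.0, 5.1], [5.2, 4.9]])
y = np.array([0, 0, 1, 1])
ratios = between_within_ratio(X, y)
```

A variable with tight, well-separated class clusters gets a large ratio; an overlapping variable gets a small one, matching the verbal criterion above.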

Supervised learning methods:
- multivariate analysis of variance and discriminant analysis (MVDA)
- k-nearest neighbors (kNN)
- linear learning machine (LLM)
- Bayes classification
- soft independent modeling of class analogy (SIMCA)
- UNEQ classification

Quantitative demarcation of a priori classes, relationships between class properties and variables... [Pg.7]

Matrix B expresses the variance between the means of the classes; matrix W expresses the pooled within-class variance of all classes. The two matrices B and W are the starting point both for multivariate analysis of variance and for discriminant analysis. [Pg.183]
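A minimal sketch of how B and W might be assembled (the scaling convention, raw sums of squares and cross-products rather than variances, is an assumption; textbooks differ on this but the ratio-based criteria are unaffected):

```python
import numpy as np

def scatter_matrices(X, y):
    """Between-class matrix B and pooled within-class matrix W.

    B accumulates the (class-size-weighted) scatter of class means
    around the grand mean; W accumulates the scatter of samples
    around their own class mean.
    """
    X = np.asarray(X, dtype=float)
    p = X.shape[1]
    grand_mean = X.mean(axis=0)
    B = np.zeros((p, p))
    W = np.zeros((p, p))
    for c in np.unique(y):
        Xc = X[y == c]
        d = (Xc.mean(axis=0) - grand_mean).reshape(-1, 1)
        B += len(Xc) * d @ d.T
        Xc_centred = Xc - Xc.mean(axis=0)
        W += Xc_centred.T @ Xc_centred
    return B, W

X = np.array([[0.0, 1.0], [0.2, 1.1], [2.0, 3.0],
              [2.2, 3.1], [4.0, 0.0], [4.1, 0.2]])
y = np.array([0, 0, 1, 1, 2, 2])
B, W = scatter_matrices(X, y)
```

With this convention B + W equals the total scatter matrix of the centred data, which is the decomposition the analysis-of-variance view relies on.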

These observations may be summarized conveniently in an analysis-of-variance table. Table 26-7 illustrates this type of table for the above case. The overall variance (total mean square), the total sum of squares divided by N − 1, contains contributions from variances within as well as between classes. The variation between classes contains both variation within classes and a variation associated with the classes themselves; it is given by the expected mean square σ² + nσ_α², where σ_α² is the between-class variance component and n the number of observations per class. Whether nσ_α² is significant can be determined by the F test. Under the null hypothesis, σ_α² = 0. Whether the ratio... [Pg.550]
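The between/within decomposition behind such a table can be computed directly (the numbers below are made up for illustration, not from Table 26-7):

```python
import numpy as np

# Three classes of replicate measurements (illustrative data)
groups = [np.array([5.1, 5.3, 4.9, 5.2]),
          np.array([6.0, 6.2, 5.8, 6.1]),
          np.array([5.5, 5.4, 5.6, 5.7])]

k = len(groups)                            # number of classes
N = sum(len(g) for g in groups)            # total number of observations
grand = np.concatenate(groups).mean()

ss_between = sum(len(g) * (g.mean() - grand) ** 2 for g in groups)
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)

ms_between = ss_between / (k - 1)          # between-class mean square
ms_within = ss_within / (N - k)            # within-class mean square
F = ms_between / ms_within                 # compare with tabulated F(k-1, N-k)
```

Under the null hypothesis (between-class variance component equal to zero), F follows an F distribution with (k − 1, N − k) degrees of freedom; for these data F is far above the 5% critical value of about 4.26 for (2, 9), so the class effect would be judged significant.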

The discriminant power of the variables will be high when the centroids of the two classes of samples are sufficiently distant from each other and when the samples within each class are densely clustered. This means that the variance between classes is higher than the variances within the classes. LDA searches for a linear function, D, of the variables that maximizes the ratio of the between-class to the within-class variance for the two classes K and L (8). The discriminant function for n variables is given by the following equation ... [Pg.305]
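For two classes, the direction that maximizes this ratio has a standard closed form, w ∝ W⁻¹(mean_K − mean_L); whether it matches the (omitted) equation in the excerpt, which may differ in scaling, is an assumption that does not affect the classification rule:

```python
import numpy as np

def fisher_direction(XK, XL):
    """Two-class Fisher direction w = W^{-1} (mean_K - mean_L)."""
    XK, XL = np.asarray(XK, float), np.asarray(XL, float)
    mK, mL = XK.mean(axis=0), XL.mean(axis=0)
    # pooled within-class scatter matrix W
    W = (XK - mK).T @ (XK - mK) + (XL - mL).T @ (XL - mL)
    return np.linalg.solve(W, mK - mL)

def discriminant(w, X):
    """Linear discriminant scores D(x) = w . x for the rows of X."""
    return np.asarray(X, float) @ w

# Illustrative data: two tight, well-separated classes
XK = np.array([[0.0, 0.0], [0.1, 0.2], [-0.1, 0.1], [0.2, -0.1]])
XL = np.array([[3.0, 3.0], [3.1, 2.9], [2.9, 3.2], [3.2, 3.1]])
w = fisher_direction(XK, XL)
```

Projecting onto w separates the two classes completely for data like these: every score in class K falls on one side of every score in class L.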

That is, a is the direction that maximizes the separation between the classes, both by having compact classes (a small within-groups variance) and by having the class centers far apart (a large between-groups variance). Large values in a indicate which variables are important in the discrimination. Another formulation is to calculate the Mahalanobis distance of a new sample x to the class centers... [Pg.143]
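The Mahalanobis formulation mentioned above can be sketched as follows (function names and the example data are illustrative, not from the text): a new sample is assigned to the class whose center is nearest in the metric of the pooled variance-covariance matrix.

```python
import numpy as np

def mahalanobis_sq(x, center, pooled_cov):
    """Squared Mahalanobis distance of sample x to a class center."""
    d = np.asarray(x, float) - np.asarray(center, float)
    return float(d @ np.linalg.solve(pooled_cov, d))

def classify(x, centers, pooled_cov):
    """Assign x to the class with the smallest Mahalanobis distance."""
    return min(centers, key=lambda c: mahalanobis_sq(x, centers[c], pooled_cov))

# Illustrative class centers and pooled covariance
centers = {"K": np.array([0.0, 0.0]), "L": np.array([3.0, 3.0])}
pooled_cov = np.array([[1.0, 0.3], [0.3, 1.0]])
label = classify(np.array([0.5, 0.4]), centers, pooled_cov)
```

Unlike Euclidean distance, this metric down-weights directions in which the classes themselves are spread out, which is exactly the within-groups variance idea above.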

The selection of variables can separate relevant information from unwanted variability and at the same time allows data compression, that is, more parsimonious models, simplification or improvement of model interpretation, and so on. Although many approaches can be used for feature selection, in this work a wavelet-based supervised feature selection/classification algorithm, WPTER [12], was applied. The best-performing model was obtained using a Daubechies 10 wavelet, a maximum decomposition level equal to 10, the between-class/within-class variance ratio criterion for the thresholding operation, and a percentage of selected coefficients equal to 2%. Six wavelet coefficients were selected, belonging to the 4th, 5th, 6th, 8th, and 9th levels of decomposition. [Pg.401]

These weights depend on several characteristics of the data. To understand which ones, let us first consider the univariate case (Fig. 33.7). Two classes, K and L, have to be distinguished using a single variable, x1. It is clear that the discrimination will be better when the distance between the mean values (centroids) of x1 for classes K and L is large and the width of the distributions is small or, in other words, when the ratio of the squared difference between the means to the variance of the distributions is large. Analytical chemists would be tempted to say that the resolution should be as large as possible. [Pg.216]

A simple two-dimensional example concerns the data from Table 33.1 and Fig. 33.9. The pooled variance-covariance matrix is obtained as [K'K + L'L]/(nK + nL − 2), i.e. by first computing for each class the centred sum of squares (for the diagonal elements) and the cross-products between variables (for the other... [Pg.217]

LC-TSP-MS without tandem mass capabilities has met with only limited success for additive analysis in most laboratories. Thermospray ionisation was applied especially between 1987 and 1992 in combination with LC-MS for a wide variety of compound classes, e.g. dyes (Fig. 7.31). Thermospray, particle-beam and electrospray LC-MS were used for the analysis of 14 commercial azo and diazo dyes [594]. No significant problems were met in the LC-TSP-MS analysis of neutral and basic azo dyes [594,595], at variance with that of thermolabile sulfonated azo dyes [596,597]. LC-TSP-MS has been used to elucidate the structure of Basic Red 14 [598]. The applications of LC-TSP-MS and LC-TSP-MS/MS in dye analysis have been reviewed [599]. [Pg.513]

FIGURE 6.2 Representation of multivariate data by icons, faces, and music for human cluster analysis and classification in a demo example with mass spectra. Mass spectra have first been transformed by modulo-14 summation (see Section 7.4.4), and from the resulting 14 variables, 8 variables with maximum variance have been selected and scaled to integer values between 1 and 5. A, typical pattern for aromatic hydrocarbons; B, typical pattern for alkanes; C, typical pattern for alkenes; 1 and 2, unknowns (2-methyl-heptane and meta-xylene). The 5x8 data matrix has been used to draw faces (by function faces in the R library TeachingDemos), segment icons (by R function stars), and to create small melodies (Varmuza 1986). Both unknowns can be easily assigned to the correct class by all three representations. [Pg.267]

Number concentrations are dominated by submicron particles, whereas mass concentrations are strongly influenced by particle concentrations in the 0.1–10 μm diameter range [13]. Similarly, the variability of number-based measurements is strongly dominated by variability in the smaller diameter ranges, whereas the variability of mass-based properties, such as PM10, is dominated by variability in the accumulation mode (usually around 500 nm mass mean diameter) and in the coarse mode. This means the variabilities of these properties are not necessarily similar on shorter timescales, because the variance is sensitive to very different air masses and thus aerosol types. This is demonstrated in Fig. 1b, where the variance of each size class of particle number concentrations between 3 and 1,000 nm is shown for the SMEAR II station in Hyytiälä, Finland. The variance has similarities to the particle number size distribution (Fig. 1a), but there are also significant differences, especially at smaller particle sizes. Even though the nucleation mode is only weakly visible in the median particle number size distribution, it is a major contributor to submicron particle number concentration variability. [Pg.301]

Some common classification parameters are the mean values of the classes in the classification space, the variance of the class's calibration samples around the class mean, and the unmodeled variance in the calibration samples. Classification logic varies widely between classification methods. The following section provides details on some commonly encountered classification methods. [Pg.289]
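A minimal sketch of the first two parameters named above, class mean and variance around it, used in a simple membership test (the 3-sigma cutoff and all names are illustrative choices, not from the text):

```python
import numpy as np

class ClassModel:
    """Per-class mean and variance estimated from calibration samples."""

    def __init__(self, X_cal):
        X_cal = np.asarray(X_cal, dtype=float)
        self.mean = X_cal.mean(axis=0)
        self.var = X_cal.var(axis=0, ddof=1)   # variance around the class mean

    def accepts(self, x, k=3.0):
        """True if every variable of x lies within k standard deviations
        of the class mean (an illustrative acceptance rule)."""
        z = np.abs(np.asarray(x, float) - self.mean) / np.sqrt(self.var)
        return bool(np.all(z <= k))

# Calibration samples for one class (made-up data)
model = ClassModel([[1.0, 10.0], [1.2, 10.4], [0.9, 9.8], [1.1, 10.2]])
```

A sample near the calibration cloud is accepted; one far from it in any variable is rejected, which is the basic idea that methods such as SIMCA refine with modeled and unmodeled variance.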

NIR spectroscopy was utilized by Aldridge and coworkers [86] to determine, in a rapid manner, the polymorphic quality of a solid drug substance. Two computational methods, Mahalanobis distance and soft independent modeling of class analogy (SIMCA) residual variance, were used to distinguish between acceptable and unacceptable samples. The authors not only determined that the Mahalanobis distance classification yielded the best results, but also addressed one of the key implementation issues regarding NIR as a PAT tool. [Pg.349]

In discriminant analysis, in a manner similar to factor analysis, new synthetic features have to be created as linear combinations of the original features; these should best indicate the differences between the classes, in contrast with the variances within the classes. These new features are called discriminant functions. Discriminant analysis is based on the same matrices B and W as above. The tested groups or classes of data are modeled with the aim of reclassifying the given objects with a low error risk and of classifying ('discriminating') other objects using the model functions. [Pg.184]
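The reclassification-then-prediction workflow described above can be sketched with scikit-learn's LDA (using scikit-learn and synthetic data is my choice for illustration, not something the text prescribes):

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(0)
# Two synthetic, well-separated classes in two variables (made-up data)
X = np.vstack([rng.normal(0.0, 0.3, size=(20, 2)),
               rng.normal(2.0, 0.3, size=(20, 2))])
y = np.array([0] * 20 + [1] * 20)

lda = LinearDiscriminantAnalysis().fit(X, y)

# Reclassify the calibration objects: the error risk should be low
reclassification_rate = (lda.predict(X) == y).mean()

# 'Discriminate' a new object using the fitted model functions
new_label = lda.predict([[1.9, 2.1]])[0]
```

For well-separated classes like these, the reclassification rate is essentially 100% and the new object lands in the nearer class.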

