Big Chemical Encyclopedia

Chemical substances, components, reactions, process design ...


Class membership information

In contrast to unsupervised methods, supervised pattern-recognition methods (Section 4.3) use class membership information in the calculations. The goal of these methods is to construct models that use analytical measurements to predict the class membership of future samples. Class location and sometimes shape are used in the calibration step to construct the models. In prediction, these models are applied to the analytical measurements of unknown samples to predict class membership. [Pg.36]
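The calibrate/predict workflow described above can be sketched with the simplest possible "class location" model, a centroid per class. This is a minimal illustration, not any specific method from the text; the measurements and labels are hypothetical.

```python
# Minimal sketch of supervised calibration and prediction: model each
# class by the centroid of its calibration samples, then assign an
# unknown sample to the nearest centroid. Data are illustrative.
import math

def calibrate(samples, labels):
    # Group calibration samples by class, then take the mean of each group.
    groups = {}
    for x, lab in zip(samples, labels):
        groups.setdefault(lab, []).append(x)
    return {lab: [sum(col) / len(xs) for col in zip(*xs)]
            for lab, xs in groups.items()}

def predict(model, x):
    # Assign an unknown sample to the class with the nearest centroid.
    return min(model, key=lambda lab: math.dist(model[lab], x))

model = calibrate([(0.0, 0.1), (0.1, 0.0), (1.0, 1.0), (0.9, 1.1)],
                  ["A", "A", "B", "B"])
print(predict(model, (0.2, 0.2)))   # closer to class A's centroid
```

Real methods refine this by also modeling class shape (dispersion), as the excerpt notes.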

The preprocessed data and class membership information are submitted to the analysis software. Euclidean distance and leave-one-out cross-validation are used to determine the value for K and the cutoff for G. [Pg.69]
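Choosing K by leave-one-out cross-validation, as mentioned above, can be sketched for a KNN classifier. The data set, labels, and candidate K values below are illustrative assumptions; the cutoff for G is omitted.

```python
# Sketch: pick K for a K-nearest-neighbors classifier by leave-one-out
# cross-validation with Euclidean distance. Data are hypothetical.
import math
from collections import Counter

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_predict(train, labels, query, k):
    # Rank training samples by distance and vote among the k nearest.
    ranked = sorted(range(len(train)), key=lambda i: euclidean(train[i], query))
    votes = Counter(labels[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]

def loo_accuracy(data, labels, k):
    # Leave each sample out once and predict it from all the others.
    hits = 0
    for i in range(len(data)):
        train = data[:i] + data[i + 1:]
        lab = labels[:i] + labels[i + 1:]
        hits += knn_predict(train, lab, data[i], k) == labels[i]
    return hits / len(data)

# Two well-separated hypothetical classes
data = [(0.1, 0.2), (0.2, 0.1), (0.0, 0.3), (1.0, 1.1), (1.2, 0.9), (0.9, 1.0)]
labels = ["A", "A", "A", "B", "B", "B"]
best_k = max([1, 3, 5], key=lambda k: loo_accuracy(data, labels, k))
```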

Before using the class-membership information, unusual sample 2 might have been considered an outlier. However, it no longer appears to be unusual because it is the only member of class E in the data set. Be aware that it is not possible to characterize the dispersion of the class with only one sample. [Pg.221]

There are several distinctions of the PLS-DA method versus other classification methods. First of all, the classification space is unique. It is not based on the X-variables or on PCs obtained from PCA, but rather on the latent variables obtained from PLS or PLS-2 regression. Because these compressed variables are determined using the known class membership information in the calibration data, they should be more relevant for separating the samples by their classes than the PCs obtained from PCA. Secondly, the classification rule is based on results obtained from quantitative PLS prediction. When this method is applied to an unknown sample, one obtains a predicted number for each of the Y-variables. Statistical tests, such as the t-test discussed earlier (Section 8.2.2), can then be used to determine whether these predicted numbers are sufficiently close to 1 or 0. Another advantage of the PLS-DA method is that it can, in principle, handle cases where an unknown sample belongs to more than one class, or to no class at all. [Pg.293]
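The class coding and decision step of PLS-DA can be sketched without the regression itself. In the sketch below, `y_pred` stands in for the Y-values an actual PLS model would return, and a fixed 0.5 threshold is a simple stand-in for the statistical test the excerpt mentions; both are assumptions for illustration.

```python
# Sketch of the PLS-DA class coding and decision rule. The PLS
# regression step is omitted; `y_pred` plays the role of predicted
# Y-values, and 0.5 is an illustrative threshold.

def encode_classes(labels, classes):
    # One column of 0/1 dummy variables per class (the Y matrix).
    return [[1.0 if lab == c else 0.0 for c in classes] for lab in labels]

def assign(y_pred, classes, threshold=0.5):
    # A sample may match several classes, one class, or none at all.
    return [c for c, y in zip(classes, y_pred) if y > threshold]

classes = ["A", "B", "C"]
Y = encode_classes(["A", "B", "A"], classes)      # calibration coding
print(assign([0.93, 0.07, 0.12], classes))        # clearly class A
print(assign([0.61, 0.55, 0.02], classes))        # overlaps A and B
print(assign([0.10, 0.20, 0.15], classes))        # matches no class
```

The last two calls show the property noted above: the rule naturally allows membership in multiple classes or in none.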

Class membership information shows groups or clusters in the data. [Pg.164]

Supervised pattern-recognition methods are methods that use class membership information while revealing the dominant patterns in the data. [Pg.165]

Class identifier: this gives the column number that contains information about the class membership. [Pg.464]

Exploratory analysis is not adequate when the task of the analysis is clearly defined. An example is the attribution of each measurement to a pre-defined set of classes. In these cases it is necessary to find a sort of regression able to assign each measurement to a class according to some pre-defined criteria of class membership selection. This kind of analysis is called supervised classification. The information about which classes are present has to be acquired from other considerations about the application under study. Once classes are defined, supervised classification may be described as the search for a model of the following kind ... [Pg.157]

This criterion for selection of features leads to a set of descriptors that contain optimal information about class membership (as opposed to information about class differences). [Pg.247]

It is used to examine the similarities and differences between samples without imposing a priori information regarding class membership. [Pg.43]

Supervised versus Unsupervised Pattern Recognition In some situations the class membership of the samples is unknown. For example, an analyst may simply want to examine a data set to see what can be learned. Are there any groupings of samples? Are there any outliers (i.e., a small number of samples that are not grouped with the majority)? Even if class information is known, the analyst may want to identify and display natural groupings in the data without imposing class membership on the samples. For example, assume a series of spectra have been collected and the goal is to... [Pg.214]

Studies of the extent of absorption in humans, or intestinal permeability methods, can be used to determine the permeability class membership of a drug. To be classified as highly permeable, a test drug should have an extent of absorption >90% in humans. Supportive information on the permeability characteristics of the drug substance should also be derived from its physicochemical properties (e.g., the octanol-water partition coefficient). [Pg.225]

Let us assume that a known set of samples is available, where the category or class membership of every sample is known a priori. Then a suitable planning of the data-acquisition process is needed. At this point, chemical experience, savoir faire, and intuition are invaluable in order to decide which measurements should be made on the samples and which variables of these measurements are most likely to contain class information. [Pg.23]

Parametric/non-parametric techniques This first distinction can be made between techniques according to whether they take account of information on the population distribution. Non-parametric techniques such as KNN, ANN, CAIMAN and SVM make no assumption about the population distribution, while parametric methods (LDA, SIMCA, UNEQ, PLS-DA) are based on information about the distribution functions. LDA and UNEQ are based on the assumption that the population distributions are multivariate normal. SIMCA is a parametric method that constructs a PCA model for each class separately, and it assumes that the residuals are normally distributed. PLS-DA is also a parametric technique, because the prediction of class memberships is performed by means of a model that can be formulated as a regression equation of the Y matrix (class membership codes) against the X matrix (Gonzalez-Arjona et al., 1999). [Pg.31]

Unsupervised Competitive Learning This learning algorithm is used if no information about the class membership of the training data vectors is available. The change of the weights at iteration t is updated by... [Pg.313]
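The weight update elided above can be sketched with the standard winner-take-all rule: the unit whose weight vector lies closest to the input is moved toward it by a learning rate. The rule, data, and learning rate here are the textbook-standard form assumed for illustration, not necessarily the exact equation from the cited source.

```python
# Sketch of a standard winner-take-all update for unsupervised
# competitive learning: only the unit nearest the input x moves,
# by w <- w + eta * (x - w). Data and eta are illustrative.
import math

def winner(weights, x):
    # Index of the weight vector nearest to x (Euclidean distance).
    return min(range(len(weights)),
               key=lambda j: math.dist(weights[j], x))

def update(weights, x, eta):
    j = winner(weights, x)
    # Move only the winning unit toward the input vector.
    weights[j] = [w + eta * (xi - w) for w, xi in zip(weights[j], x)]
    return weights

weights = [[0.0, 0.0], [1.0, 1.0]]
weights = update(weights, [0.2, 0.0], eta=0.5)   # unit 0 wins and moves
```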

These classification methods use different principles and rules for learning and prediction of class membership, but will usually produce comparable results. Some comparisons of the methods have been given (e.g., Kotsiantis, 2007; Rani et al., 2006). Although modern methods such as SVM have demonstrated very good performance, the drawback is that the model becomes an incomprehensible black box that removes the explanatory information provided by, for example, a logistic regression model. However, classification performance usually outweighs the need for a comprehensible model. PCA has been used for classification based on bioimpedance measurements. Technically, PCA is not a method for classification but rather a method of data reduction, more suitable as a parameterization step before the classification analysis. [Pg.386]

Cluster analysis provides a method for discrimination of different classes without a priori information about possible class memberships. A group of objects is classified into smaller subgroups through the different realization of the corresponding characteristic features (variables). Objects in the same class should be as similar as possible and significantly different from objects in other classes. Similarity can be most easily defined in terms of the distance of objects in the variable space. In cluster analysis, most frequently the Euclidean distance d(i,j) between two objects i and j with p variables is used ... [Pg.703]
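The Euclidean distance referred to above is d(i,j) = sqrt(sum over the p variables of (x_ik - x_jk)^2), and clustering algorithms typically start from the matrix of all pairwise distances. A minimal sketch, with an illustrative data matrix:

```python
# Euclidean distance between objects i and j over p variables, and the
# symmetric pairwise distance matrix used as input to cluster analysis.
# The small data matrix X is an illustrative assumption.
import math

def euclidean(xi, xj):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(xi, xj)))

def distance_matrix(X):
    # D[i][j] is the distance between rows i and j of X.
    n = len(X)
    return [[euclidean(X[i], X[j]) for j in range(n)] for i in range(n)]

X = [[0.0, 0.0], [3.0, 4.0], [0.0, 1.0]]
D = distance_matrix(X)   # D[0][1] is the classic 3-4-5 distance, 5.0
```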

It may happen that only a single rule provides information about a particular output variable. When this is true, that rule can be used immediately as a measure of the membership for the variable in the corresponding set. In the enzyme problem, only one rule predicts that the rate is high, therefore, we can provisionally assign a membership of 0.2 for the rate in this fuzzy class. Often though, several rules provide fuzzy information about the same variable and these different outputs must be combined in some way. This is done by aggregating the outputs of all rules for each output variable. [Pg.255]
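The aggregation step described above can be sketched by combining the membership values fired by each rule for the same output fuzzy set. Taking the maximum is one common aggregation choice (an assumption here, since the excerpt does not name the operator); the second rule below is hypothetical, added only to show the combination.

```python
# Sketch of aggregating fuzzy rule outputs for one output variable:
# collect every fuzzy set mentioned by any rule and take the maximum
# membership fired for it. The rule outputs are illustrative.

def aggregate(rule_outputs):
    # Union of all output fuzzy sets, each with its maximum fired value.
    sets = {s for r in rule_outputs for s in r}
    return {s: max(r.get(s, 0.0) for r in rule_outputs) for s in sets}

rule1 = {"high": 0.2}               # the single rule mentioned in the text
rule2 = {"low": 0.7, "high": 0.4}   # a second, hypothetical rule
combined = aggregate([rule1, rule2])
print(combined)   # high: 0.4, low: 0.7
```

With only one rule firing, as in the enzyme example, aggregation simply returns that rule's membership value unchanged.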


© 2024 chempedia.info