Categories, data analysis

Besides these main categories, a large number of hybrid visualization techniques also exist, which are combinations of the methods described. Well-known hybrid approaches are the 2D or 3D glyph displays. These techniques combine the multidimensional representation capabilities of icon-based methods with the easy and intuitive representations of scatter-plot displays. Therefore these techniques can also be frequently found within chemical data analysis applications. [Pg.477]

Often the goal of a data analysis problem requires more than simple classification of samples into known categories. It is very often desirable to have a means to detect outliers and to derive an estimate of the level of confidence in a classification result. These are things that go beyond strictly nonparametric pattern recognition procedures. Also of interest is the ability to empirically model each category so that it is possible to make quantitative correlations and predictions with external continuous properties. As a result, a modeling and classification method called SIMCA has been developed to provide these capabilities (29–31). [Pg.425]

An optimization criterion common to all input-output modeling methods is to minimize the output prediction error when determining the output parameters and basis functions. The activation or basis functions used in data analysis methods may be broadly divided into the following two categories ... [Pg.12]

As discussed and illustrated in the introduction, data analysis can be conveniently viewed in terms of two categories of numeric-numeric manipulation, input and input-output, both of which transform numeric data into more valuable forms of numeric data. Input manipulations map from input data without knowledge of the output variables, generally to transform the input data to a more convenient representation that has unnecessary information removed while retaining the essential information. As presented in Section IV, input-output manipulations relate input variables to numeric output variables for the purpose of predictive modeling and may include an implicit or explicit input transformation step for reducing input dimensionality. When applied to data interpretation, the primary emphasis of input and input-output manipulation is on feature extraction, driving extracted features from the process data toward useful numeric information on plant behaviors. [Pg.43]

A fundamental idea in multivariate data analysis is to regard the distance between objects in the variable space as a measure of the similarity of the objects. Distance and similarity are inverse: a large distance means a low similarity. Two objects are considered to belong to the same category or to have similar properties if their distance is small. The distance between objects depends on the selected distance definition, on the variables used, and on the scaling of the variables. Distance measurements in high-dimensional space are extensions of distance measures in two dimensions (Table 2.3). [Pg.58]
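
The dependence of distance on variable scaling can be illustrated with a minimal sketch. It assumes the Euclidean distance as the distance definition and autoscaling (mean-centering and division by the sample standard deviation) as the scaling; the excerpt names neither choice. The four objects and their two variables are invented for illustration only.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two objects given as equal-length lists."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def autoscale(rows):
    """Center each variable (column) to mean 0 and scale to unit sample std."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [
        math.sqrt(sum((v - m) ** 2 for v in c) / (len(c) - 1))
        for c, m in zip(cols, means)
    ]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]

# Hypothetical objects A, B, C, D: variable 1 spans about 0..1,
# variable 2 spans several hundred units.
objects = [[0.1, 500.0], [0.9, 505.0], [0.15, 520.0], [0.5, 900.0]]

# Unscaled distances are dominated by variable 2.
raw_ab = euclidean(objects[0], objects[1])
raw_ac = euclidean(objects[0], objects[2])

# After autoscaling, both variables contribute on a comparable footing.
scaled = autoscale(objects)
scaled_ab = euclidean(scaled[0], scaled[1])
scaled_ac = euclidean(scaled[0], scaled[2])

print(f"raw:    d(A,B)={raw_ab:.2f}  d(A,C)={raw_ac:.2f}")
print(f"scaled: d(A,B)={scaled_ab:.2f}  d(A,C)={scaled_ac:.2f}")
```

With these invented values, B is the closer neighbor of A on the raw data, while C is the closer neighbor after autoscaling, so the judgment of which objects are "similar" changes with the scaling alone.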

The potential of modern chemical instrumentation to detect and measure the composition of complex mixtures has made it necessary to consider the use of methods of multivariable data analysis in the overall evaluation of environmental measurements. In a number of instances, the category (chemical class) of the compound that has given rise to a series of signals may be known but the specific entity responsible for a given signal may not be. This is true, for example, for the polychlorinated biphenyls (PCBs), in which the clean-up procedure and use of specific detectors eliminate most possibilities except PCBs. Such hierarchical procedures simplify the problem somewhat, but it is still advantageous to apply data reduction methods during the course of the interpretation process. [Pg.243]

Aside from applications to specific regions or locations, new developments in receptor modeling have tended to take place in one of three broad categories: experimental methods, data analysis and... [Pg.3]

PLS falls in the category of multivariate data analysis whereby the X-matrix, containing the independent variables, is related to the Y-matrix, containing the dependent variables, through a process where the variance in the Y-matrix influences the calculation of the components (latent variables) of the X-block and vice versa. It is important that the number of latent variables is correct so that overfitting of the model is avoided; this can be achieved by cross-validation. The relevance of each variable in the PLS method is judged by the modelling power, which indicates how much the variable participates in the model. A value close to zero indicates an irrelevant variable which may be deleted. [Pg.103]

Stratification. This technique is used to separate data into groups based on categories or characteristics. It is the basis for the application of other tools, or it can be used with other data analysis tools such as scatter diagrams. [Pg.292]
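
Stratification as described above amounts to grouping records by a categorical field before further analysis. The following is a minimal sketch; the record fields (`shift`, `temperature`, `defects`) and all values are hypothetical and serve only to show the grouping step.

```python
from collections import defaultdict
from statistics import mean

def stratify(records, key):
    """Split a list of dict records into groups sharing the same value of `key`."""
    groups = defaultdict(list)
    for record in records:
        groups[record[key]].append(record)
    return dict(groups)

# Hypothetical process records; field names and values are illustrative only.
records = [
    {"shift": "day", "temperature": 70.0, "defects": 2},
    {"shift": "day", "temperature": 72.0, "defects": 3},
    {"shift": "night", "temperature": 71.0, "defects": 5},
    {"shift": "day", "temperature": 69.0, "defects": 1},
    {"shift": "night", "temperature": 73.0, "defects": 7},
]

strata = stratify(records, "shift")

# Each stratum can now be summarized, or plotted as its own scatter diagram.
summary = {name: mean(r["defects"] for r in rows) for name, rows in strata.items()}
for name, avg in summary.items():
    print(f"{name}: {len(strata[name])} records, mean defects = {avg}")
```

Keeping the strata as separate lists makes it straightforward to pass each group to another tool, for instance plotting temperature against defects per shift.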

Some of the factors to be considered in evaluating published material in kinetics have been outlined by Hampson and Garvin [26], Baulch and Montague [27] and Cohen and Westberg [28]. The factors fall into two categories: first, the evaluation of the details of the technique and data analysis and, second, comparison of the results with material external to the study, e.g., other measurements of the rate parameters, theoretical predictions, etc. [Pg.259]

Another form of random matrix-related interference is more rarely occurring gross errors, which typically are seen in the context of immunoassays and relate to unexpected antibody interactions (see interference section). Such an error will usually show up as an outlier in method comparison studies. A well-known source is the occurrence of heterophilic antibodies. This is the reason why outliers should be carefully considered and not just discarded from the data analysis procedure. Supplementary studies may help clarify such random matrix-related interferences and may provide specifications for the assay that limit its application in certain contexts (e.g., with regard to samples from certain patient categories). [Pg.370]

Newer data analysis methods overcome the difficulties that small sample-to-variables ratios create for traditional statistical methods. These new methods fall into two major categories: (1) support-vector classification and regression methods, and (2) feature selection and construction techniques. The former are effectively determined by only a small portion of the training data (sample), while the latter select only a small subset of variables such that the available sample is enough for traditional and newer classification techniques. [Pg.418]

At the same time, the performance of the immunoassay for atrazine in RM08 demands more critical attention. The Z-score obtained was 23; because it falls in the Z > 3 category, the result is unsatisfactory (Figure 5.2.6). The reported value (0.98 ± 0.06 pg L-1) was about 6 times higher than the consensus value (0.17 ± 0.02 pg L-1) calculated from the results of 11 participants. At first glance the difference seems to be the typical and thus often described matrix effect in immunoassays. However, a more careful data analysis suggests that it might be due to the presence of other triazines in the sample. [Pg.363]
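
The Z-score categories mentioned above can be sketched as follows. This assumes the common proficiency-testing form z = (result - consensus) / sigma with the usual limits of 2 and 3; the excerpt itself states only the Z > 3 category. The standard deviation used to compute the score is not given in the excerpt, so `sigma` below is a hypothetical value chosen only so that the score lands near the reported 23.

```python
def z_score(result, consensus, sigma):
    """Z-score of a laboratory result relative to the consensus value."""
    return (result - consensus) / sigma

def classify(z):
    """Assumed conventional limits: |z| <= 2, 2 < |z| < 3, |z| >= 3."""
    magnitude = abs(z)
    if magnitude <= 2:
        return "satisfactory"
    if magnitude < 3:
        return "questionable"
    return "unsatisfactory"

result = 0.98      # immunoassay value quoted in the excerpt
consensus = 0.17   # consensus value quoted in the excerpt
sigma = 0.035      # HYPOTHETICAL: not stated in the excerpt

z = z_score(result, consensus, sigma)
ratio = result / consensus

print(f"z = {z:.1f} -> {classify(z)}")
print(f"result / consensus = {ratio:.1f}")
```

The ratio of the two quoted values is close to the "about 6 times higher" stated in the text, and any score of this size falls well inside the unsatisfactory category regardless of the exact limits adopted.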

Big Chemical Encyclopedia
