Problems with scatter plots

Scatter plots in PCA have special properties because the scores are plotted on the basis P, and the columns of P are orthonormal vectors. Hence, the scores in PCA are plotted on an orthonormal basis. This means that, apart from the projection step, Euclidean distances in the space of the original variables are preserved in the PCA scores. Stated otherwise, distances between two points in a score plot can be understood in terms of Euclidean distances in the space of the original variables. This is not the case for score plots in PARAFAC and Tucker models, because these are usually not expressed on an orthonormal basis. This issue was studied by Kiers [2000], together with problems of differences in horizontal and vertical scales. The basic conclusion is that careful consideration should be given to the interpretation of scatter plots. This is illustrated in Example 8.3. [Pg.192]
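
The distance argument can be checked numerically. The following is a minimal sketch in Python with NumPy, not taken from the source: the data are random numbers, and the matrix M used to build a non-orthonormal set of axes is a hypothetical choice made only for illustration. It compares pairwise distances between PCA scores with distances between the projected rows of X, and then repeats the comparison for scores expressed on oblique axes.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 5))        # synthetic data, 20 objects x 5 variables
X = X - X.mean(axis=0)              # column-center before PCA

# PCA via SVD: the columns of P are orthonormal loading vectors
U, s, Vt = np.linalg.svd(X, full_matrices=False)
P = Vt.T[:, :2]                     # keep two components
T = X @ P                           # scores on the orthonormal axes P
X_hat = T @ P.T                     # rows of X projected onto the model plane


def pairwise_distances(A):
    """Euclidean distances between all pairs of rows of A."""
    diff = A[:, None, :] - A[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


d_original = pairwise_distances(X)
d_projected = pairwise_distances(X_hat)
d_scores = pairwise_distances(T)

# Hypothetical non-orthonormal axes spanning the same plane
M = np.array([[2.0, 1.0],
              [0.0, 0.5]])
B = P @ M                           # columns of B are no longer orthonormal
T_oblique = X_hat @ np.linalg.pinv(B).T   # scores such that X_hat = T_oblique @ B.T
d_oblique = pairwise_distances(T_oblique)

print(np.allclose(d_scores, d_projected))    # distances kept on orthonormal axes
print(np.allclose(d_oblique, d_projected))   # distances distorted on oblique axes
```

In this sketch the score distances match the distances between the projected points exactly, and they never exceed the distances in the full variable space because projection can only shorten them. On the oblique axes the same model plane is described, but the plotted distances no longer correspond to Euclidean distances in the original variables.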

In order to illustrate the importance of orthogonal and orthonormal bases, a small real example is given. Consider a mixture of two chemical compounds, ethanol and isopropanol. A good way to show plotting principles is by making an orthogonal design in the concentrations. The columns of C are the concentrations of ethanol (column 1) and isopropanol (column 2). [Pg.195]

Multi-way Analysis With Applications in the Chemical Sciences [Pg.196]

The above has repercussions for component models. In component models, X is decomposed as X = X̂ + E, where X̂ is the model part and E holds the residuals, and score plots are used to study the model. Such score plots only reflect the rows of X̂ in the original domain (of X) if an orthonormal basis is used to express the scores. For curve resolution types of studies, it is usually more insightful to express the scores (e.g. concentrations) on the basis of the estimated spectra (Z). [Pg.197]
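
The difference between the two choices of basis can be sketched with a two-compound mixture. The example below is an illustration under stated assumptions, not the data of the source: the 3 x 3 concentration design, the Gaussian band shapes standing in for spectra, and the noise-free model X = C Zᵀ are all hypothetical. Scores are computed once on the basis of the spectra Z and once on an orthonormal basis Q for the same subspace.

```python
import numpy as np

# Hypothetical 3 x 3 orthogonal design in two concentrations (arbitrary units)
levels = np.array([0.0, 1.0, 2.0])
C = np.array([[c1, c2] for c1 in levels for c2 in levels])   # 9 mixtures x 2

# Hypothetical overlapping band shapes used as "spectra" (not measured data)
wl = np.linspace(0.0, 1.0, 50)
Z = np.column_stack([np.exp(-((wl - 0.4) / 0.15) ** 2),
                     np.exp(-((wl - 0.6) / 0.15) ** 2)])     # 50 channels x 2

X = C @ Z.T                         # noise-free mixture data, so E = 0


def pairwise_distances(A):
    """Euclidean distances between all pairs of rows of A."""
    diff = A[:, None, :] - A[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


# Scores on the basis of the spectra: X = T_spec @ Z.T
T_spec = X @ np.linalg.pinv(Z).T

# Scores on an orthonormal basis Q for the same subspace (Z = Q @ R)
Q, R = np.linalg.qr(Z)
T_orth = X @ Q

# Cosine of the angle between the two design directions in the orthonormal plot
cos_axes = (R[:, 0] @ R[:, 1]) / (np.linalg.norm(R[:, 0]) * np.linalg.norm(R[:, 1]))

print(np.allclose(T_spec, C))       # the design is recovered on the spectral basis
print(np.allclose(pairwise_distances(T_orth), pairwise_distances(X)))
print(round(float(cos_axes), 2))    # nonzero: the design looks skewed
```

Under these assumptions, the scores on the spectral basis reproduce the orthogonal concentration design, which is what makes them easy to read in a curve resolution setting. The scores on the orthonormal basis keep the distances between the rows of X, but the design grid appears skewed because the two band shapes overlap.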

In Figure 2, the distribution of each variable for each group is plotted along with a bivariate scatter plot of the data, and it is clear that the two groups form distinct clusters. However, it is equally evident that both variables must be considered in order to achieve a clear separation. The problem facing us is to determine the best line between the data clusters, the discriminant function, and this can be achieved by consideration of probability and Bayes' theorem. [Pg.127]
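
One common way to obtain such a line is linear discriminant analysis. The sketch below assumes Gaussian groups with a shared covariance matrix and equal prior probabilities; under those assumptions, Bayes' theorem leads to a linear boundary. The excerpt does not say which variant is used, so this is an illustration rather than the source's method, and the two groups are synthetic.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic groups: strongly correlated variables, means offset in both
cov = np.array([[1.0, 0.9],
                [0.9, 1.0]])
g1 = rng.multivariate_normal([0.0, 1.0], cov, size=100)
g2 = rng.multivariate_normal([1.0, 0.0], cov, size=100)

m1, m2 = g1.mean(axis=0), g2.mean(axis=0)

# Pooled within-group covariance (assumes both groups share one covariance)
n1, n2 = len(g1), len(g2)
S = ((n1 - 1) * np.cov(g1, rowvar=False)
     + (n2 - 1) * np.cov(g2, rowvar=False)) / (n1 + n2 - 2)

# Linear discriminant function f(x) = w @ x + b, equal priors assumed
w = np.linalg.solve(S, m1 - m2)
b = -0.5 * w @ (m1 + m2)


def classify(x):
    """Assign group 1 where f(x) > 0, otherwise group 2."""
    return np.where(x @ w + b > 0, 1, 2)


acc_both = (np.sum(classify(g1) == 1) + np.sum(classify(g2) == 2)) / (n1 + n2)


def single_variable_accuracy(j):
    """Accuracy when thresholding variable j alone, midway between the means."""
    threshold = 0.5 * (m1[j] + m2[j])
    sign = np.sign(m1[j] - m2[j])
    ok1 = np.sum(sign * (g1[:, j] - threshold) > 0)
    ok2 = np.sum(sign * (g2[:, j] - threshold) < 0)
    return (ok1 + ok2) / (n1 + n2)


print(acc_both, single_variable_accuracy(0), single_variable_accuracy(1))
```

With these synthetic groups, either variable on its own separates the clusters poorly, while the line defined by w and b uses both variables and separates them much better, which mirrors the point made about the bivariate scatter plot.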

A two-component PARAFAC model is fitted to a sensory data set of different breads. Ten breads are assessed by eight assessors on eleven different attributes. The left plot of Figure 7.14 shows a scatter plot of the attribute loadings (B) of the PARAFAC model. It is seen that two attributes have low loadings in both components. The right plot shows that this is also reflected by the small Mahalanobis distances, indicating that these attributes are not relevant for the problem or at least not consistent with the main variation in the data. [Pg.173]

The distribution of the observations can be visualized using scatter plots. For obvious reasons, scatter plots are limited to three dimensions at most, and typically to two dimensions. Therefore, direct observation of the data distribution is not possible in data sets with several tens, hundreds or even thousands of variables. One can always construct scatter plots for selected pairs or triplets of variables, but this is an overwhelming and often misleading approach. Projection models overcome this problem. PCA and PLS can be used straightforwardly to visualize the distribution of the data in the latent subspace, considering only a few latent variables (LVs) which contain most of the variability of interest. Scatter plots of the scores corresponding to the LVs, the so-called score plots, are used for this purpose. [Pg.64]
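
As a rough illustration of how a score plot is obtained from many variables, the sketch below builds a synthetic data set with 200 variables driven by two hidden factors (an assumption made for the example) and computes PCA scores by singular value decomposition. The two columns of the score matrix are the coordinates that would be drawn in a two-dimensional score plot.

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic data: 30 observations, 200 variables, two underlying factors
n_obs, n_vars = 30, 200
latent = rng.normal(size=(n_obs, 2)) * np.array([5.0, 3.0])
loadings_true = rng.normal(size=(2, n_vars))
X = latent @ loadings_true + 0.1 * rng.normal(size=(n_obs, n_vars))

Xc = X - X.mean(axis=0)             # mean-center each variable

# PCA via SVD of the centered data
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
n_lv = 2
P = Vt[:n_lv].T                     # loadings, 200 x 2
T = Xc @ P                          # scores, 30 x 2: the score plot coordinates

# Fraction of the total variance captured by each latent variable
explained = s ** 2 / np.sum(s ** 2)

print(T.shape)
print(explained[:n_lv].sum())
```

Plotting the first column of T against the second gives the score plot. In this constructed case two LVs capture nearly all of the variance, so a single plot summarizes what 200 separate variables could not show pairwise; real data will usually need a check of how much variance the retained LVs explain.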

To deal with this problem of intermixing, atomic ratios for EDS point analyses are commonly plotted on 2D (or 3D) scatter plots, which can be interpreted to give the compositions of the phases present and the intermixing between them. This is particularly useful for estimating the average C-S-H composition (or C-A-S-H composition when aluminium is present). [Pg.382]

Big Chemical Encyclopedia
