Compositional data matrices

This discussion provides only an outline of the techniques that have been used to search for structure in compositional data matrices generated by the analysis of archaeological materials. Before many of the techniques are used, however, some pretreatment of the data may be necessary. [Pg.67]

This chapter stresses the notion of modeling as it pertains to a structure or structures contained within a compositional data matrix and as revealed or imposed by choice of algorithmic approach. Using a generated example, the influence of such factors as outliers, transformations, interelemental correlation, choice of resemblance coefficients, grouping procedures, and group summary evaluation has been discussed. All of these factors are variable within the context of specific problem formulation. [Pg.87]

Let us suppose that dust particles have been collected in the air above a city and that the amounts of p constituents, e.g. Si, Al, Ca, ..., Pb, have been determined in these samples. The elemental compositions obtained for n (e.g. 100) samples, taken over a grid of sampling points, can be arranged in a data matrix X (Fig. 34.1). Each row of the table represents the elemental composition of one of the samples. A column represents the amount of one of the elements found in the sample set. Let us further suppose that there are two main sources of dust in the neighbourhood of the sampled area, and that the particles originating from each source have a specific concentration pattern for the elements Si to Pb. These concentration patterns are described by the vectors s_1 and s_2. For instance, the dust in the air may originate from a power station and from an incinerator, each having a specific concentration pattern, s_k = [Si, Al, Ca, ..., Pb] with k = 1, 2. [Pg.243]
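The two-source picture described above can be sketched numerically. This is only a minimal illustration, assuming a linear mixing model in which each sample is a non-negative combination of the two source patterns; the four-element subset, the profile values, and the contribution amounts are invented for the example and do not come from the source.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical subset of the p measured elements
elements = ["Si", "Al", "Ca", "Pb"]

# Hypothetical concentration patterns: row 0 = power station, row 1 = incinerator
S = np.array([
    [0.50, 0.30, 0.15, 0.05],
    [0.20, 0.10, 0.30, 0.40],
])

n = 100  # number of samples on the grid

# Hypothetical amount of dust contributed by each source to each sample (n x 2)
contributions = rng.uniform(0.0, 10.0, size=(n, 2))

# Data matrix X (n x p): one row per sample, one column per element
X = contributions @ S

print(X.shape)                    # (100, 4)
print(np.linalg.matrix_rank(X))   # 2, because only two sources generate the data
```

With noise-free data the rank of X equals the number of sources, which is why factor-analytic methods can hope to recover the number of sources from X alone.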

Three main patterns of contamination were resolved by MCR-ALS analysis of the [SE SO] data matrix (105 samples x 15 variables). Composition profiles (loadings) of the resolved components are shown in Fig. 11 (plots on the left). Variables are identified with a number on the x axis. On the y axis, the relative contribution of every scaled variable to the identified contamination pattern is given. Temporal and spatial sample distribution profiles of the contamination patterns (scores) are represented in Fig. 11 (plots on the right). On the x axis, samples are identified for the two compartments, SE and SO, successively ordered from the first to the third campaign and, within each campaign, from North-West to South-East. The y axis displays the contribution of every resolved contamination pattern to the samples. [Pg.363]

While there are transformations that normalize a data matrix X to row sums equal to 1 (see Section 2.2.3), some data sets are originally provided in a form in which the values of each object sum to a constant, for instance 100%, like relative concentrations of chemical compounds in a mixture. Such data are called compositional data or closed data. [Pg.51]
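The row-sum normalization mentioned above can be sketched as follows. The helper name `close_rows` and the example values are assumptions made for illustration; the transformation in the cited Section 2.2.3 may differ in detail.

```python
import numpy as np

def close_rows(X, total=1.0):
    """Scale each row of X so that its entries sum to `total`."""
    X = np.asarray(X, dtype=float)
    row_sums = X.sum(axis=1, keepdims=True)
    if np.any(row_sums == 0):
        raise ValueError("cannot close a row whose sum is zero")
    return total * X / row_sums

# Hypothetical raw amounts of three compounds in two mixtures
raw = np.array([
    [2.0, 6.0, 2.0],
    [1.0, 1.0, 2.0],
])

closed = close_rows(raw, total=100.0)
print(closed)
# [[20. 60. 20.]
#  [25. 25. 50.]]
```

After closure the variables are no longer free to vary independently: raising one percentage forces at least one other to fall.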

After finding NC, we must determine the composition of each mineral. It is very helpful at this point to have a qualitative mineralogical analysis, such as XRD, to provide initial estimates of compositions. In addition, libraries of mineral compositions are extremely useful. Methods based on searching the original data matrix for candidate minerals also are helpful and in some instances may provide the best compositions for real samples. [Pg.58]

The second difference is that the correlations between samples are calculated rather than the correlations between elements. In the terminology of Rozett and Peterson ( ), the correlation between elements would be an R analysis while the correlation between samples would be a Q analysis. Thus, the applications of factor analysis discussed above are R analyses. Imbrie and Van Andel ( 6) and Miesch (J 7) have found Q-mode analysis more useful for interpreting geological data. Rozett and Peterson (J ) compared the two methods for mass spectrometric data and concluded that the Q-mode analysis provided more significant information. Thus, a Q-mode analysis on the correlation about the origin matrix for correlations between samples has been made (18,19) for aerosol composition data from Boston and St. Louis. [Pg.35]
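The R-mode and Q-mode distinction can be made concrete with a small sketch. The data values are invented, and the function `correlation_about_origin` is an assumed reading of "correlation about the origin" as a correlation computed without subtracting the mean (a cosine-type measure); the cited studies may define it with additional scaling.

```python
import numpy as np

def correlation_about_origin(A):
    """Cosine-type correlation between the rows of A (no mean subtraction)."""
    A = np.asarray(A, dtype=float)
    norms = np.linalg.norm(A, axis=1, keepdims=True)
    return (A @ A.T) / (norms @ norms.T)

# Hypothetical data matrix: 5 samples (rows) x 3 elements (columns)
X = np.array([
    [10.0,  2.0, 1.0],
    [ 8.0,  3.0, 1.5],
    [ 2.0,  9.0, 4.0],
    [ 1.0, 10.0, 5.0],
    [ 5.0,  5.0, 3.0],
])

# R-mode: correlations between elements (columns) -> 3 x 3
R = np.corrcoef(X, rowvar=False)

# Q-mode: correlations between samples (rows) -> 5 x 5
Q = np.corrcoef(X)

# Q-mode correlation about the origin between samples -> 5 x 5
Q0 = correlation_about_origin(X)

print(R.shape, Q.shape, Q0.shape)
```

The size of the matrix being factored is the practical difference: R-mode scales with the number of elements, Q-mode with the number of samples.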

The chemical mass balance method starts with a single column vector from the ambient data matrix, C_ik. This vector represents the chemical concentrations for the kth filter, which is combined with the best available estimates of the source compositions from the fractional composition matrix, F_ij, to form a series of linear equations in which the M_j are the only unknowns. This set of equations is then solved by the least squares method to obtain the best fit of the ambient chemical data on a single filter. [Pg.79]
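The least squares step described above can be sketched for a single filter. This is a simplified illustration: ordinary (unweighted) least squares is assumed, since a weighted fit would need measurement uncertainties that the excerpt does not give, and the fractional compositions and source masses are invented values.

```python
import numpy as np

# Hypothetical fractional composition matrix F: rows = chemical species i,
# columns = sources j (mass fraction of species i in emissions of source j)
F = np.array([
    [0.30, 0.02],
    [0.10, 0.01],
    [0.05, 0.20],
    [0.01, 0.15],
])

# Hypothetical true source mass contributions M_j on one filter
M_true = np.array([12.0, 5.0])

# Synthetic, noise-free ambient concentration vector for that filter
c_k = F @ M_true

# Solve the overdetermined system F @ M = c_k for the unknown M_j
M_est, residuals, rank, _ = np.linalg.lstsq(F, c_k, rcond=None)

print(M_est)   # recovers the hypothetical contributions, about [12.  5.]
```

The fit is only as good as F: it needs at least as many species as sources, and source profiles that are not nearly collinear.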

Multivariate methods, on the other hand, resolve the major sources by analyzing the entire ambient data matrix. Factor analysis, for example, examines elemental and sample correlations in the ambient data matrix. This analysis yields the minimum number of factors required to reproduce the ambient data matrix, their relative chemical composition and their contribution to the mass variability. A major limitation in common and principal component factor analysis is the abstract nature of the factors and the difficulty these methods have in relating these factors to real world sources. Hopke, et al. (13,14) have improved the method's ability to associate these abstract factors with controllable sources by combining source data from the F matrix with Malinowski's target transformation factor analysis program (15). Hopke, et al. (13,14) as well as Kleinman, et al. (10) have used the results of factor analysis along with multiple regression to quantify the source contributions. Their approach is similar to the chemical mass balance approach except they use a least squares fit of the total mass on different filters instead of a least squares fit of the chemicals on an individual filter. [Pg.79]

Some methods are based on the knowledge of the experimental error in the measurement of the original variables. Thus, the number of significant components is that by which the original data matrix is reproduced within the measurement error. This does not usually happen with food data, where analytical error is frequently smaller than the other individual sources of variation. The number of sources of variability in food composition is very high, and it is almost impossible that the experiment has been designed to cover all these sources of variability uniformly. So, some sources of variability appear in only one or a few objects, a minority, which behaves differently from the majority. [Pg.100]
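The error-based criterion described above can be sketched as follows. This is one possible reading, assuming the criterion is "the smallest number of components whose reconstruction leaves a root-mean-square residual no larger than the known measurement error"; the function name, the synthetic two-component data, and the noise level are assumptions for illustration.

```python
import numpy as np

def n_components_within_error(X, sigma):
    """Smallest number of SVD components with RMS residual <= sigma."""
    X = np.asarray(X, dtype=float)
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    for k in range(len(s) + 1):
        # Rank-k reconstruction (k = 0 gives an all-zero matrix)
        X_hat = (U[:, :k] * s[:k]) @ Vt[:k, :]
        rms = np.sqrt(np.mean((X - X_hat) ** 2))
        if rms <= sigma:
            return k
    return len(s)

rng = np.random.default_rng(0)
sigma = 0.01  # assumed known measurement error

# Synthetic data: two underlying components plus measurement noise
profiles = np.array([
    [0.50, 0.30, 0.10, 0.05, 0.03, 0.02],
    [0.05, 0.10, 0.20, 0.30, 0.20, 0.15],
])
amounts = rng.uniform(0.0, 10.0, size=(50, 2))
X = amounts @ profiles + rng.normal(0.0, sigma, size=(50, 6))

k = n_components_within_error(X, sigma)
print(k)
```

When a source of variation affects only one or a few objects, as the passage notes for food data, extra components are needed to reach the error level even though they describe a minority of samples.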

As discussed earlier, the selectivity rate constant matrix K contains one element that is unity, that is, k /k = 1. This property allows the elements of K to be determined from composition data alone. The selectivity time t, defined in Eq. (11), does not need to be independently known. [Pg.214]

During the selectivity kinetic parameter estimation, the relationship for x in terms of C5- is determined from Eq. (12). For an assumed set of rate constants K, x is calculated for each composition data point such that the experimentally measured C5- equals that estimated from Eq. (12). Selectivity composition profiles as a function of C5- are generated in this manner. The proper selectivity matrix K will be that which minimizes the deviation between experimental and predicted profiles for the hydrocarbons other than C5-, as illustrated in Fig. 10. [Pg.214]

The term factor is a catch-all for the concept of an identifiable property of a system whose quantity value might have some effect on the response. Factor tends to be used synonymously with the terms variable and parameter, although each of these terms has a special meaning in some branches of science. In factor analysis, a multivariate method that decomposes a data matrix to identify independent variables that can reconstitute the observed data, the term latent variable or latent factor is used to identify factors of the model that are composites of input variables. A latent factor may not exist outside the mathematical model, and it might not therefore influence... [Pg.69]

Gu J, Pitz M, Schnelle-Kreis J, Diemer J, Reller A, Zimmermann R, Soentgen J, Stoelzel M, Wichmann H-E, Peters A, Cyrys J (2011) Source apportionment of ambient particles: comparison of positive matrix factorization analysis applied to particle size distribution and chemical composition data. Atmos Environ 45(10):1849-1857... [Pg.190]

In Section I we introduce the gas-polymer-matrix model for gas sorption and transport in polymers (10, 11), which is based on the experimental evidence that even permanent gases interact with the polymeric chains, resulting in changes in the solubility and diffusion coefficients. Just as the dynamic properties of the matrix depend on gas-polymer-matrix composition, the matrix model predicts that the solubility and diffusion coefficients depend on gas concentration in the polymer. We present a mathematical description of the sorption and transport of gases in polymers (10, 11) that is based on the thermodynamic analysis of solubility (12), on the statistical mechanical model of diffusion (13), and on the theory of corresponding states (14). In Section II we use the matrix model to analyze the sorption, permeability and time-lag data for carbon dioxide in polycarbonate, and compare this analysis with the dual-mode model analysis (15). In Section III we comment on the physical implication of the gas-polymer-matrix model. [Pg.117]

Another possibility of finding relationships between the impact of emissions in a territory and existing emission sources is the use of PLS modeling. For the above discussed case PLS modeling between the data matrix of the pollutant load in territory B and the data vector for the composition of the emitted dust was performed according to the mathematical basis described in Section 5.7.2. The elemental compositions both of the emitted dust and the impact of emissions were normalized to their concentrations, thus giving a uniform data basis. [Pg.263]

The resolution of a multicomponent system involves the description of the variation of measurements as an additive model of the contributions of their pure constituents [1-10]. To do so, relevant and sufficiently informative experimental data are needed. These data can be obtained by analyzing a sample with a hyphenated technique (e.g., HPLC-DAD, high-performance liquid chromatography with diode array detection) or by monitoring a process in a multivariate fashion. In these and similar examples, all of the measurements performed can be organized in a table or data matrix where one direction (the elution or the process direction) is related to the compositional variation of the system, and the other direction refers to the variation in the response collected. The existence of these two directions of variation helps to differentiate among components (Figure 11.1). [Pg.418]

Equation 11.20 describes the factorization of the experimental data matrix into two factor matrices, the loadings matrix V^T and the augmented scores matrix U_aug. The loadings matrix V^T identifies the nature and composition of the N main contamination sources defined by means of their chemical composition (SVOC concentrations)... [Pg.456]
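The idea of factoring an augmented data matrix into an augmented scores matrix and a shared loadings matrix can be sketched as below. Equation 11.20 itself is not reproduced in this excerpt, so the sketch uses a truncated singular value decomposition as a stand-in for the factorization; a method such as MCR-ALS would additionally impose constraints (for example non-negativity) that are not applied here. All data values and dimensions are invented.

```python
import numpy as np

rng = np.random.default_rng(1)

n_sources = 2   # hypothetical number of contamination sources N
n_vars = 6      # hypothetical number of measured variables

# Hypothetical source composition profiles shared by both data sets
profiles = rng.uniform(0.0, 1.0, size=(n_sources, n_vars))

# Two hypothetical data sets measured on the same variables
D1 = rng.uniform(0.0, 5.0, size=(20, n_sources)) @ profiles
D2 = rng.uniform(0.0, 5.0, size=(15, n_sources)) @ profiles

# Augment by stacking the samples of both sets on top of each other
D_aug = np.vstack([D1, D2])                      # 35 x 6

# Truncated SVD as a stand-in factorization: D_aug ~ scores @ loadings
U, s, Vt = np.linalg.svd(D_aug, full_matrices=False)
scores_aug = U[:, :n_sources] * s[:n_sources]    # augmented scores, 35 x 2
loadings_T = Vt[:n_sources, :]                   # shared loadings, 2 x 6

D_hat = scores_aug @ loadings_T

# The augmented scores split back into one block per original data set
scores_1, scores_2 = scores_aug[:20], scores_aug[20:]
print(scores_1.shape, scores_2.shape, loadings_T.shape)
```

Because both blocks share one loadings matrix, the recovered profiles are common to all data sets while the scores describe how strongly each profile appears in every individual sample.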

FIGURE 11.19 Composition (loading) profiles of resolved components in the MCR-ALS analysis of the raw augmented data matrix. Top components explain more variance; bottom components explain less variance. Names for the compounds are defined in the caption to Figure 11.17. [Pg.461]

A data matrix produced by compositional analysis commonly contains 10 or more metric variables (elemental concentrations) determined for an even greater number of observations. The bridge between this multidimensional data matrix and the desired archaeological interpretation is multivariate analysis. The purposes of multivariate analysis are data exploration, hypothesis generation, hypothesis testing, and data reduction. Application of multivariate techniques to data for these purposes entails an assumption that some form of structure exists within the data matrix. The notion of structure is therefore fundamental to compositional investigations. [Pg.63]

Most of the matrix data that we review are chemical and mineralogical, as there is little isotopic information about individual matrix grains. Bulk oxygen isotopic compositions for matrix samples commonly differ from those of associated chondrules (e.g., Scott et al., 1988), but lack of oxygen-isotope data for samples of key chondrites and specific matrix components severely limits inferences about matrix origins. [Pg.180]

In the future, models will exist which will link constants for in vitro binding to cloned human receptors (Kd), data from in vitro functional assays (IC50) and animal and human in vivo EC50 values. A composite prediction matrix will be applied rapidly and accurately to the process of synthesis of new compounds for phase I testing. [Pg.95]

Inductively coupled argon plasma emission spectrophotometry (ASTM D-5708) has an advantage over atomic absorption spectrophotometry (ASTM D-4628, ASTM D-5863) because it can provide more complete elemental composition data than the atomic absorption method. Flame emission spectroscopy is often used successfully in conjunction with atomic absorption spectrophotometry (ASTM D-3605). X-ray fluorescence spectrophotometry (ASTM D-4927, ASTM D-6443) is also sometimes used, but matrix effects can be a problem. [Pg.42]
