
Data compression

Typically, compression and filtering of spectroscopic data go hand-in-hand. Often, the process of compressing the data leads to a certain amount of noise filtering. [Pg.86]

In general, there are two types of compression: (1) individual spectra can be compressed and filtered, and (2) the entire dataset can be compressed and filtered by representing each of the individual spectra as a linear combination of some smaller set of data, which is referred to as a basis set. In this section, we will address the processing of individual spectra by applying the fast Fourier transform (FFT) algorithm and follow this discussion with one on processing sets of spectra with principal component analysis (PCA). [Pg.87]
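As a concrete illustration of the first approach, here is a minimal sketch of FFT-based compression of a single spectrum. The synthetic band, the noise level, and the cutoff n_keep are illustrative assumptions, not values from the text.

```python
import numpy as np

def fft_compress(spectrum, n_keep):
    """Keep only the first n_keep (lowest-frequency) Fourier coefficients.
    Storing these few coefficients is the compression; zeroing the rest
    filters out high-frequency noise on reconstruction."""
    coeffs = np.fft.rfft(spectrum)                # real FFT of the spectrum
    coeffs[n_keep:] = 0.0                         # discard high-frequency terms
    return np.fft.irfft(coeffs, n=len(spectrum))  # smoothed reconstruction

# Illustrative use on a noisy synthetic absorption band
x = np.linspace(0.0, 1.0, 512)
spectrum = np.exp(-((x - 0.5) / 0.05) ** 2) + 0.02 * np.random.randn(512)
smoothed = fft_compress(spectrum, n_keep=32)
```

Only the retained coefficients need to be stored, which is the compression; discarding the rest is the noise filtering that, as noted above, comes along with it.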

The idea behind PCA is that a spectral dataset of mixtures of the same components can be expressed as a linear combination of a small set of spectral representations. This is most easily understood if we consider a set of mixture spectra for three components. Assuming that there is no noise and that the mixture spectra are simply the sum of absorptivity spectra for the pure components, each of the mixture spectra can be expressed as a weighted sum of the three pure-component spectra. [Pg.87]
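In symbols (the notation here is assumed for illustration, not taken from the book), the i-th mixture spectrum is

$$\mathbf{a}_i = c_{i1}\,\mathbf{k}_1 + c_{i2}\,\mathbf{k}_2 + c_{i3}\,\mathbf{k}_3,$$

where $\mathbf{k}_j$ is the absorptivity spectrum of pure component $j$ and $c_{ij}$ is its concentration in mixture $i$; collecting the mixture spectra as rows gives $\mathbf{A} = \mathbf{C}\mathbf{K}$.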

Principal component analysis makes it possible to find a set of representations for mixture spectra in which noise and interactions are taken into account without knowing anything about the spectra of the pure components or their concentrations. The basic idea is to find a set of representations that can be linearly combined to reproduce the original mixture spectra. In PCA, Equation (4.3) is rewritten as... [Pg.89]

The loadings and scores for PCA can be generated by singular value decomposition (SVD). Instead of expressing the matrix containing the mixture spectra, A, as a product of two matrices as in Equation (4.4), SVD expresses it as a product of three matrices... [Pg.89]
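A minimal numpy sketch of this decomposition, assuming A holds the mixture spectra as rows; the matrix sizes and the number of retained factors k are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((50, 512))   # stand-in for 50 mixture spectra at 512 wavelengths

# SVD expresses A as a product of three matrices: A = U @ diag(s) @ Vt
U, s, Vt = np.linalg.svd(A, full_matrices=False)

k = 3                                # retain k factors (e.g., three components)
scores = U[:, :k] * s[:k]            # PCA scores   (50 x k)
loadings = Vt[:k, :]                 # PCA loadings (k x 512)

A_rank_k = scores @ loadings         # compressed, rank-k reconstruction of A
```

For a true PCA the columns of A would normally be mean-centered first, as the excerpt from Pg.358 further below notes.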

It was mentioned earlier that empirical multivariate modeling often requires a very large amount of data. These data can contain a very large number of samples (N), a very large number of variables (M), or both. In the case of PAC, where the analytical method is often a form of spectroscopy or chromatography, the number of variables collected per process sample can range from the hundreds to the thousands... [Pg.243]

Data compression is the process of reducing data into a representation that uses fewer variables, yet still expresses most of its information. There are many different types of data compression that are applied to a wide range of technical fields, but only those that are most relevant to process analytical applications are discussed here. [Pg.243]

The most straightforward means of data compression is to simply select a subset of variables that is determined to be relevant to the problem (or, conversely, to remove a subset of variables determined to be irrelevant to the problem). Assessment of relevance can be done manually (using a priori knowledge of the system being analyzed) or empirically (using statistical methods on the data itself). [Pg.243]

Some of the earliest applications of chemometrics in PAC involved the use of an empirical variable selection technique commonly known as stepwise multiple linear regression (SMLR) [8,26,27]. As the name suggests, this is a technique in which the relevant variables are selected sequentially. This method works as follows: [Pg.243]

Step 1: A series of linear regressions of each x-variable against the property of interest is performed.
Step 2: The single variable that has the best linear regression fit to the property of interest is selected (x1). [Pg.243]
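The excerpt stops after these two steps. Below is a hedged sketch of the full greedy loop; the continuation (refitting against the residual and repeating) is the usual stepwise procedure, assumed here rather than quoted from the source.

```python
import numpy as np

def smlr_select(X, y, n_vars):
    """Greedy stepwise selection: repeatedly pick the single x-variable
    whose univariate linear fit to the current residual is best."""
    selected = []
    residual = y.astype(float).copy()
    for _ in range(n_vars):
        r2 = np.full(X.shape[1], -np.inf)
        for j in range(X.shape[1]):
            if j in selected:
                continue
            # Step 1: univariate regression of variable j on the residual
            slope, intercept = np.polyfit(X[:, j], residual, 1)
            fit = slope * X[:, j] + intercept
            ss_res = np.sum((residual - fit) ** 2)
            ss_tot = np.sum((residual - residual.mean()) ** 2)
            r2[j] = 1.0 - ss_res / ss_tot
        best = int(np.argmax(r2))                  # Step 2: best univariate fit wins
        selected.append(best)
        # Remove the explained part before the next pass
        slope, intercept = np.polyfit(X[:, best], residual, 1)
        residual = residual - (slope * X[:, best] + intercept)
    return selected
```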


The conditions of these target applications are fulfilled by WORM disks. Apart from data compression on a small volume, WORM filing systems offer the advantage of fast access from the workplace at all times, including a simplified document search and retrieval strategy. [Pg.140]

A special implementation of the CD-R disk is the Photo-CD by Kodak, which is a 5.25 in. WORM disk employing the dye-in-polymer principle for storage of up to 100 slides/pictures on a CD (after data compression), with the possibility of interactive picture processing. [Pg.140]

M. R. Nelson, The Data Compression Book, M&T Books, Redwood City, Calif., 1991. [Pg.58]

The historical data is sampled at user-specified intervals. A typical process plant contains a large number of data points, but it is not feasible to store data for all points at all times. The user determines if a data point should be included in the list of archive points. Most systems provide archive-point menu displays. The operators are able to add or delete data points to the archive point lists. The sampling periods are normally some multiples of their base scan frequencies. However, some systems allow historical data sampling at arbitrary intervals. This is necessary when intermediate virtual data points that do not have the scan frequency attribute are involved. The archive point lists are continuously scanned by the historical database software. On-line databases are polled for data. The times of data retrieval are recorded with the data obtained. To conserve storage space, different data compression techniques are employed by various manufacturers. [Pg.773]

Then, with data compression techniques (sending only changes for most of the words in the frame and rotating the data), an effective transmission rate of 10 bits/s can be achieved. [Pg.937]
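A toy sketch of the change-only idea (not the actual transmission protocol of the source): only words that differ from the previous frame are sent, each tagged with its position.

```python
def frame_delta(prev_frame, cur_frame):
    """Return only the (index, word) pairs that changed since the last frame."""
    return [(i, w) for i, (p, w) in enumerate(zip(prev_frame, cur_frame)) if w != p]

def apply_delta(frame, delta):
    """Receiver side: patch the previous frame with the transmitted changes."""
    frame = list(frame)
    for i, w in delta:
        frame[i] = w
    return frame
```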

Data compression. As mentioned in the previous chapter, PCA and PLS provide us with an optimum way to reduce the dimensionality of our data so that we can use ILS to develop calibrations. [Pg.81]

The primary objective of any data compression technique is to transform the data to a form that requires the smallest possible amount of storage space, while retaining all the relevant information. The desired qualities of a technique for efficient storage and retrieval of chemical process data are as follows ... [Pg.215]

The ideas presented in Section III are used to develop a concise and efficient methodology for the compression of process data, which is presented in Section IV. Of particular importance here is the conceptual foundation of the data compression algorithm: instead of seeking noninterpretable numerical compaction of the data, it strives for explicit retention of the distinguished features in a signal. It is shown that this approach is both numerically efficient and amenable to explicit interpretations of historical process trends. [Pg.216]

The practical implementation issues that arise when using the data compression techniques presented in the previous two subsections are discussed in the following paragraphs. [Pg.251]

The speed with which the data need to be compressed depends on the stage of data acquisition at which compression is desired. In intelligent sensors it may be necessary to do some preliminary data compression as the data are collected. Often data are collected for several days or weeks without any compression, and then stored in the company data archives. These data may be retrieved at a later stage for studying various aspects of the process operation. [Pg.251]

In order to compress the measured data through a wavelet-based technique, it is necessary to perform a series of convolutions on the data. Because of the finite size of the convolution filters, the data may be decomposed only after enough data has been collected to allow convolution and decomposition on a wavelet basis. Therefore, point-by-point data compression as done by the boxcar or backward slope methods is not possible using wavelets. Usually, a window of data of length 2^m, m ∈ Z, is collected before decomposition and selection of the appropriate... [Pg.251]
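For contrast with the windowed wavelet approach, here is a minimal sketch of the point-by-point boxcar method mentioned above: a sample is recorded only when it leaves a tolerance band around the last recorded value. The tolerance is an illustrative parameter.

```python
import numpy as np

def boxcar_compress(t, y, tol):
    """Point-by-point boxcar compression: keep a sample only when it
    deviates from the last *recorded* value by more than tol."""
    kept_t, kept_y = [t[0]], [y[0]]
    for ti, yi in zip(t[1:], y[1:]):
        if abs(yi - kept_y[-1]) > tol:
            kept_t.append(ti)
            kept_y.append(yi)
    return np.array(kept_t), np.array(kept_y)

# A wavelet-based scheme, by contrast, must first buffer a window of
# 2^m samples before it can decompose and threshold the coefficients.
```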

Fig. 20. Performance of data compression techniques: (a) orthonormal wavelet; (b) backward slope; (c) boxcar.
Bader, F. P., and Tucker, T. W., Data compression applied to a chemical plant using a distributed historian station. ISA Trans. 26(4), 9-14 (1987a). [Pg.267]

Feehs, R. J., and Arce, G. R., Vector Quantization for Data Compression of Trend Recordings, Tech. Rep. 88-11-1, University of Delaware, Dept. Elect. Eng., Newark,... [Pg.268]

The application of principal components regression (PCR) to multivariate calibration introduces a new element, viz. data compression through the construction of a small set of new orthogonal components or factors. Henceforth, we will mainly use the term "factor" rather than "component" in order to avoid confusion with the chemical components of a mixture. The factors play an intermediary role as regressors in the calibration process. In PCR the factors are obtained as the principal components (PCs) from a principal component analysis (PCA) of the predictor data, i.e. the calibration spectra S (n×p). In Chapters 17 and 31 we saw that any data matrix can be decomposed ("factored") into a product of (object) score vectors T (n×r) and (variable) loadings P (p×r). The number of columns in T and P is equal to the rank r of the matrix S, usually the smaller of n or p. It is customary and advisable to do this factoring on the data after column-centering. This allows one to write the mean-centered spectra S0 as... [Pg.358]
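A compact sketch of the PCR steps just described, assuming S holds the calibration spectra as rows and y the property of interest; the number of factors r is a user choice.

```python
import numpy as np

def pcr_fit(S, y, r):
    """Principal component regression: column-center the spectra,
    factor S0 = T @ P.T by SVD, keep r factors, regress y on the scores."""
    s_mean = S.mean(axis=0)
    S0 = S - s_mean                                  # mean-centered spectra
    U, sv, Vt = np.linalg.svd(S0, full_matrices=False)
    T = U[:, :r] * sv[:r]                            # scores   (n x r)
    P = Vt[:r, :].T                                  # loadings (p x r)
    b, *_ = np.linalg.lstsq(T, y - y.mean(), rcond=None)
    return s_mean, y.mean(), P, b

def pcr_predict(S_new, s_mean, y_mean, P, b):
    T_new = (S_new - s_mean) @ P                     # project onto the loadings
    return T_new @ b + y_mean
```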

Fig. 40.31. Data compression by a Fourier transform: (a) a spectrum measured at 512 wavelengths; (b) the spectrum after reconstruction with 2, 4, ..., 256 Fourier coefficients.
Vedam, H., Venkatasubramanian, V., and Bhalodia, M., A B-spline based method for data compression, process monitoring and diagnosis. Comput. Chem. Eng. 22(13), S827-S830 (1998). [Pg.102]

Methods for unsupervised learning invariably aim at compression or the extraction of information present in the data. Most prominent in this field are clustering methods [140], self-organizing networks [141], any type of dimension reduction (e.g., principal component analysis [142]), or the task of data compression itself. All of the above may be useful to interpret and potentially to visualize the data. [Pg.75]

PCA is a data compression method that reduces a set of data collected on M variables over N samples to a simpler representation that uses a much smaller number (A ≪ M) of "compressed variables", called principal components (or PCs). The mathematical model for the PCA method is provided below... [Pg.362]
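The equation itself is cut off in the excerpt; the standard bilinear PCA model it refers to is

$$\mathbf{X}_{(N \times M)} = \mathbf{T}_{(N \times A)}\,\mathbf{P}^{\mathrm{T}}_{(A \times M)} + \mathbf{E}_{(N \times M)},$$

where T holds the scores of the N samples on the A principal components, P the loadings of the M variables, and E the residual (unmodeled) part of the data.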

The difference between PLS and PCR is the manner in which the x data are compressed. Unlike the PCR method, where x-data compression is done solely on the basis of explained variance in X, followed by subsequent regression of the compressed variables (PCs) onto y (a simple two-step process), PLS data compression is done such that the most variance in both x and y is explained. Because the compressed variables obtained in PLS are different from those obtained in PCA and PCR, they are not principal components (or PCs). Instead, they are often referred to as latent variables (or LVs). [Pg.385]
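The two-step PCR route versus the one-step PLS route can be made concrete with scikit-learn; the synthetic data and the choice of three factors are illustrative only.

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(1)
X = rng.standard_normal((40, 200))                 # stand-in calibration spectra
y = X[:, :5].sum(axis=1) + 0.1 * rng.standard_normal(40)

# PCR: compress X on explained variance in X alone, then regress the scores on y
pcr = make_pipeline(PCA(n_components=3), LinearRegression()).fit(X, y)

# PLS: compression and regression in one step, choosing latent variables
# that explain variance in both X and y
pls = PLSRegression(n_components=3).fit(X, y)
```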

