Redundant data sets

Completeness and non-redundancy. Does the strategy guarantee to find all and only those solutions which are in the data set ... [Pg.292]

In spite of this redundancy of results, discrepancies among data sets obtained in different laboratories for the antimicrobial activity of these O-heterocycles against both prokaryotic and eukaryotic microorganisms have sometimes been observed. This is probably due to various causes. [Pg.258]

To reduce intensity effects, the data were normalized by scaling the area under each spectrum to a value of 1 [42]. Principal component analysis (PCA) was applied to the normalized data. This method is well suited to optimizing the description of the fluorescence data sets, extracting the most useful data and rejecting the redundant data [43]. From a data set, PCA extracts the principal components and their corresponding spectral patterns. The principal components are used to draw maps that describe the physical and chemical variation observed between the samples. Software for PCA has been written by D. Bertrand (INRA Nantes) and is described elsewhere [44]. [Pg.283]
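
As an illustration of the normalization and PCA steps described above (not the INRA software cited in [44]), here is a minimal Python sketch; the array shape and values are invented, and the area under each spectrum is approximated by the sum over wavelengths.

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical fluorescence data: each row is one emission spectrum,
# each column one wavelength (shape and values are invented for illustration).
rng = np.random.default_rng(0)
spectra = rng.random((30, 200))

# Normalize so the area under each spectrum equals 1 (reduces intensity effects);
# the area is approximated here by the sum over wavelengths.
spectra_norm = spectra / spectra.sum(axis=1, keepdims=True)

# PCA keeps the components that carry most of the variance and rejects the
# redundant ones; the scores are used to draw the sample maps, the loadings
# give the corresponding spectral patterns.
pca = PCA(n_components=3)
scores = pca.fit_transform(spectra_norm)   # coordinates of the samples (maps)
patterns = pca.components_                 # spectral pattern of each component

print(pca.explained_variance_ratio_)
```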

In order to obtain the preferred term for a given drug, you have to pull drug record numbers, where sequence number 1 is 01 and sequence number 2 is 001. Also, redundancies in the resulting data set must be stripped out with first-dot processing or a NODUPKEY on a PROC SORT. The following SAS code shows how to get preferred terms from WHODrug. [Pg.112]
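
The SAS listing referred to in this excerpt is not reproduced here. Purely as an illustration of the selection and de-duplication logic it describes, the following Python/pandas sketch plays the role of the NODUPKEY on a PROC SORT; the column names (drug_record_number, seq1, seq2, preferred_term) are assumptions, not the actual WHODrug layout.

```python
import pandas as pd

# Hypothetical WHODrug-like dictionary extract (column names are assumptions).
whodrug = pd.DataFrame({
    "drug_record_number": ["000001", "000001", "000002"],
    "seq1": ["01", "02", "01"],
    "seq2": ["001", "001", "001"],
    "preferred_term": ["ASPIRIN", "ASPIRIN BUFFERED", "PARACETAMOL"],
})

# Keep only the preferred-name records: sequence number 1 = "01"
# and sequence number 2 = "001".
preferred = whodrug[(whodrug["seq1"] == "01") & (whodrug["seq2"] == "001")]

# Strip redundant rows: the pandas analogue of NODUPKEY on a PROC SORT.
preferred = (preferred.sort_values("drug_record_number")
                      .drop_duplicates(subset="drug_record_number"))

print(preferred)
```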

A data reconciliation procedure was applied to the subset of redundant equations. The results are displayed in Table 4. A global test for gross error detection was also applied, and the χ² value was found to be equal to 17.58, indicating the presence of a gross error in the data set. Using the serial elimination procedure described in Chapter 7, a gross error was identified in the measurement of stream 26. The procedure for estimating the amount of bias was then applied and the amount of bias was found... [Pg.251]
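
One common form of such a global test compares the weighted sum of squared balance residuals with a χ² critical value whose degrees of freedom equal the number of redundant equations. The sketch below illustrates that idea with made-up numbers; it does not reproduce the Table 4 results or the book's exact procedure.

```python
import numpy as np
from scipy.stats import chi2

# Hypothetical redundant balance equations A @ x = 0 and measured flows y
# (all numbers are invented for illustration).
A = np.array([[1.0, -1.0, -1.0],      # node balance: stream1 = stream2 + stream3
              [0.0,  1.0, -1.0]])     # a second, redundant balance
y = np.array([100.0, 60.0, 45.0])     # measured flows
sigma = np.diag([1.0, 0.8, 0.8]) ** 2 # measurement error covariance

r = A @ y                              # balance residuals
V = A @ sigma @ A.T                    # covariance of the residuals
gamma = r @ np.linalg.solve(V, r)      # global test statistic

dof = np.linalg.matrix_rank(A)         # number of redundant equations
critical = chi2.ppf(0.95, dof)
print(f"test statistic {gamma:.2f}, chi-square critical value {critical:.2f}")
if gamma > critical:
    print("gross error suspected somewhere in the data set")
```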

Check for completeness and redundancy at the desired resolution; usually a complete data set at 10–4 Å is fine. [Pg.99]

Online analytical processing (OLAP) mainly comprises the interactive exploration of multidimensional data sets, or data cubes, which are manipulated by operations from matrix algebra, for example, slice-and-dice, roll-up, and drill-down. Computing performance is related to data warehouse size and also to data quality, for example, missing data, unsharpness, and redundancy. The multidimensionality issue is critical for extracting pertinent information and selecting the results to be stored and visualized. [Pg.359]
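
A minimal pandas sketch of roll-up, slicing, and drill-down on a toy data cube follows; the dimensions (plant, year, compound) and the measure (yield_t) are invented for the example.

```python
import pandas as pd

# A tiny "data cube": three dimensions (plant, year, compound) and one measure.
cube = pd.DataFrame({
    "plant":    ["A", "A", "B", "B", "A", "B"],
    "year":     [2022, 2023, 2022, 2023, 2023, 2023],
    "compound": ["X", "X", "X", "Y", "Y", "Y"],
    "yield_t":  [120.0, 130.0, 95.0, 80.0, 60.0, 75.0],
})

# Roll-up: aggregate away the compound dimension.
rollup = cube.groupby(["plant", "year"], as_index=False)["yield_t"].sum()

# Slice: fix one dimension (year == 2023) and keep the others.
slice_2023 = cube[cube["year"] == 2023]

# Drill-down: return to the finer grouping that includes compound.
drilldown = cube.groupby(["plant", "year", "compound"], as_index=False)["yield_t"].sum()

print(rollup)
```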

During data collection from crystals exposed to fragment(s), collect a data set that is complete in the low-resolution shells and has high redundancy. The highest-resolution data possible are also beneficial, so examining multiple crystals to select the one with suitable qualities is crucial (see Note 12). [Pg.248]

You can see from Fig. 9.8 that a Laue diffraction pattern is much more complex than a diffraction pattern from monochromatic X rays. But modern software can index Laue patterns and thus allow accurate measurement of many diffraction intensities from a single brief pulse of X rays through a still crystal. If the crystal has high symmetry and is oriented properly, a full data set can in theory be collected in a single brief X-ray exposure. In practice, this approach usually does not provide sufficiently accurate intensities because the data lack the redundancy necessary for high accuracy. Multiple exposures at multiple orientations are the rule. [Pg.211]

Spectral data are highly redundant (many vibrational modes of the same molecules) and sparse (large spectral segments with no informative features). Hence, before a full-scale chemometric treatment of the data is undertaken, it is very instructive to understand the structure and variance in recorded spectra. Eigenvector-based analyses of spectra are therefore common, and a primary technique is principal component analysis (PCA). PCA is a linear transformation of the data into a new coordinate system (axes) such that the largest variance lies on the first axis and decreases thereafter for each successive axis. PCA can also be considered a view of the data set that aims to explain all deviations from an average spectral property. Data are typically mean centered prior to the transformation, and the mean spectrum is used as the base comparator. The transformation to a new coordinate set is performed via matrix multiplication as... [Pg.187]
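
The excerpt breaks off before the transformation itself; in the usual notation the mean-centered data X_c are multiplied by the matrix W of retained eigenvectors (loadings), giving the scores T = X_c W. A minimal NumPy sketch of that calculation, with an invented spectra matrix, is:

```python
import numpy as np

# Hypothetical spectra: rows = samples, columns = wavenumbers (shape is an assumption).
rng = np.random.default_rng(1)
X = rng.random((50, 300))

# Mean centering: the mean spectrum is the base comparator.
mean_spectrum = X.mean(axis=0)
Xc = X - mean_spectrum

# Eigenvectors of the covariance matrix give the new coordinate axes,
# ordered by decreasing explained variance.
cov = np.cov(Xc, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)
order = np.argsort(eigvals)[::-1]
W = eigvecs[:, order[:5]]          # keep the first 5 principal components

# The transformation to the new coordinate system is a matrix multiplication.
T = Xc @ W                         # scores: samples expressed in the PC axes

print(T.shape)                     # (50, 5)
```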

Both the Parameter and Reconcile cases determine (calculate) the same set of parameters. However, these cases do not get the same values for each parameter. A Parameter case has an equal number of unknowns and equations and is therefore considered "square" in mathematical jargon. In the Parameter case, there is no objective function that drives or affects the solution. Both the Parameter and Reconcile cases typically use the same measurements, and typically many of those measurements are redundant. In the Parameter case we determine beforehand, by engineering analysis (before commissioning an online system, for instance) and by looking at numerous data sets, which measurements are most reliable (consistent and accurate). We "believe" these, that is, we force the model and measurements to be exactly the same at the solution. Some of these measurements may have final control elements (valves) associated with them and others do not. The former are of FIC, TIC, PIC, AIC type, whereas the latter are of FI, TI, PI, AI type. How is any model value forced to be exactly equal to the measured value? The "offset" between plant and model value is forced to be zero. For normally independent variables such as plant feed rate, tower... [Pg.128]
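
A toy contrast between the two cases (not the commercial modeling system the passage describes): in the Parameter case the number of believed measurements equals the number of parameters, so the square system is solved exactly and the offsets are zero; in a Reconcile case redundant measurements are fitted in a weighted least-squares sense. All matrices and numbers below are invented.

```python
import numpy as np

# Parameter case: two believed measurements, two parameters; square system,
# so the model matches the measurements exactly (zero offsets).
H_square = np.array([[1.0, 0.0],
                     [1.0, 1.0]])
y_square = np.array([10.0, 25.0])
p_parameter_case = np.linalg.solve(H_square, y_square)

# Reconcile case: four (redundant) measurements for the same two parameters,
# fitted by weighted least squares; weights reflect measurement confidence.
H_red = np.array([[1.0, 0.0],
                  [1.0, 1.0],
                  [0.0, 1.0],
                  [2.0, 1.0]])
y_red = np.array([10.2, 24.6, 15.1, 35.3])
W = np.diag([1.0, 1.0, 0.5, 0.25])
p_reconcile_case = np.linalg.solve(H_red.T @ W @ H_red, H_red.T @ W @ y_red)

print(p_parameter_case, p_reconcile_case)
```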

Compression of large data sets, elimination of redundancy and noise... [Pg.14]

The correlations between the original features of one set and the canonical variables of the second set are called inter-set loadings. They are also redundancy measures and demonstrate the possibility of describing the first data set with the features of the second set. The inter-set loadings characterize the overlapping of both sets or, in other words, the variance of the first set explained by the second set of features. [Pg.180]
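
A minimal sketch of computing inter-set loadings, assuming the canonical variables are obtained with scikit-learn's CCA; the data are random and serve only to show the mechanics (each original feature of the first set is correlated with the canonical variables of the second set).

```python
import numpy as np
from sklearn.cross_decomposition import CCA

rng = np.random.default_rng(2)
X = rng.random((40, 4))       # first feature set
Y = rng.random((40, 3))       # second feature set

cca = CCA(n_components=2)
Xc, Yc = cca.fit_transform(X, Y)   # canonical variables of each set

# Inter-set loadings: correlations of the original X features with the
# canonical variables of the second set (Yc).
inter_set = np.array([[np.corrcoef(X[:, j], Yc[:, k])[0, 1]
                       for k in range(Yc.shape[1])]
                      for j in range(X.shape[1])])

# Redundancy of the first set given the second: mean squared inter-set
# loading per canonical variable.
redundancy = (inter_set ** 2).mean(axis=0)
print(inter_set.round(2), redundancy.round(3))
```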

When several variables load on a given component, they are considered to be redundant. The first component extracted explains the greatest percentage of the variation within the data set, the second component explains the next highest percentage of variation not explained by the first, and so on. [Pg.116]

With respect to agreement with experimental data, the most obvious criterion is the size and number of violations of experimental data. Of course, this figure does not include variation caused by the nature of the data set. For example, a small set of loose restraints with redundant information should always be less violated than a large set of restraints interpreted as tightly as possible. Ignoring variation resulting from data sets, a measure analogous to... [Pg.164]


