Big Chemical Encyclopedia


Sparse data matrix

Variable selection is particularly important in LC-MS and GC-MS. Raw data form what is sometimes called a sparse data matrix, in which the majority of data points are zero or represent noise. In fact, only a small percentage (perhaps 5% or less) of the measurements are of any interest. The trouble with this is that if multivariate methods are applied to the raw data, the results are often nonsense, dominated by noise. Consider the case of performing LC-MS on two closely eluting isomers, whose fragment ions are of principal interest. The most intense peak might be the molecular... [Pg.360]
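As a hedged illustration of this kind of variable selection, the sketch below builds a synthetic, mostly-noise data matrix and retains only the channels whose maximum intensity rises well above the noise floor. The matrix size, the positions of the "real" peaks, and the threshold factor are all invented for the example:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical raw LC-MS data: 50 time points x 200 m/z channels,
# mostly noise with three informative channels (positions invented).
X = rng.normal(0.0, 1.0, size=(50, 200))                     # baseline noise
X[:, [10, 55, 120]] += rng.normal(50.0, 5.0, size=(50, 3))   # "real" peaks

# Simple variable selection: keep channels whose maximum intensity
# rises well above the noise level (the factor 10 is an assumption).
noise_level = np.median(np.abs(X))
keep = X.max(axis=0) > 10.0 * noise_level
X_selected = X[:, keep]

print(keep.sum(), "of", X.shape[1], "channels retained")
```

Only the selected sub-matrix would then be passed on to multivariate analysis, avoiding results dominated by noise.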

The second step concerns distance selection and metrization. Bound smoothing only reduces the possible intervals for interatomic distances from the original bounds. However, the embedding algorithm demands a specific distance for every atom pair in the molecule. These distances are chosen randomly within the interval, from either a uniform or an estimated distribution [48,49], to generate a trial distance matrix. Uniform distance distributions seem to provide better sampling for very sparse data sets [48]. [Pg.258]
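A minimal sketch of this metrization step, drawing each pairwise distance uniformly within its smoothed bounds to produce a symmetric trial distance matrix (the bound values here are invented, not smoothed from real data):

```python
import numpy as np

rng = np.random.default_rng(42)
n = 5  # atoms (toy size)

# Hypothetical smoothed bound matrices: lower[i, j] <= d(i, j) <= upper[i, j]
lower = np.full((n, n), 1.0)
upper = np.full((n, n), 4.0)
np.fill_diagonal(lower, 0.0)
np.fill_diagonal(upper, 0.0)

# Draw each upper-triangle distance uniformly within its interval,
# then mirror to keep the trial matrix symmetric with a zero diagonal.
iu = np.triu_indices(n, k=1)
d = np.zeros((n, n))
d[iu] = rng.uniform(lower[iu], upper[iu])
d += d.T
```

The resulting matrix `d` would then be handed to the embedding algorithm.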

Note that although the bounds on the distances satisfy the triangle inequalities, particular choices of distances between these bounds will in general violate them. Therefore, if all distances are chosen within their bounds independently of each other (the method that is used in most applications of distance geometry for NMR structure determination), the final distance matrix will contain many violations of the triangle inequalities. The main consequence is a very limited sampling of the conformational space of the embedded structures for very sparse data sets [48,50,51] despite the intrinsic randomness of the tech-... [Pg.258]
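To see how readily independent draws break the triangle inequality, the toy check below samples a trial matrix within deliberately wide, invented bounds and counts how many atom pairs have at least one violating third atom:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 8  # toy number of atoms

# Trial distances drawn independently within wide bounds (1 to 10, invented);
# independent draws frequently violate the triangle inequality.
iu = np.triu_indices(n, k=1)
d = np.zeros((n, n))
d[iu] = rng.uniform(1.0, 10.0, size=iu[0].size)
d += d.T

# Count pairs (i, j) for which some k gives d(i, j) > d(i, k) + d(k, j).
violations = 0
for i in range(n):
    for j in range(i + 1, n):
        for k in range(n):
            if k != i and k != j and d[i, j] > d[i, k] + d[k, j]:
                violations += 1
                break  # one violating k is enough for this pair

print(violations, "of", n * (n - 1) // 2, "pairs violate a triangle inequality")
```

Metrization schemes that re-smooth the bounds after each choice avoid exactly this problem, at extra cost.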

The 3D HNCO data matrix is very sparsely populated with crosspeaks because there is only one correlation per residue in the protein (Fig. 12.59). The main purpose of this experiment is to count residues and make sure all of the peaks can be found and identified. Once we have the assignments for each H-N pair, the data can be arranged in a strip plot in order of residue number. [Pg.614]

The first attempt at estimating interindividual pharmacokinetic variability without neglecting the difficulties (data imbalance, sparse data, subject-specific dosing history, etc.) associated with data from patients undergoing drug therapy was made by Sheiner et al. using the Non-linear Mixed-effects Model Approach. The vector θ of population characteristics is composed of all quantities of the first two moments of the distribution of the parameters: the mean values (fixed effects) and the elements of the variance-covariance matrix that characterize the random effects. [Pg.2951]

In the following we will use a matrix notation for the double-substitution amplitudes, T_ij, two-electron integrals, K_ij, and residual elements, R_ij, where, for instance, T_ij contains all the double-substitution amplitudes T_ij^ab for a fixed ij pair. Using the matrix notation is convenient in the discussion but does not reflect the actual data representation. The double-substitution amplitudes, integrals, and residual elements are stored as four-dimensional sparse arrays using the employed sparse data representation. In this representation, however, indices can be frozen; for instance, freezing the indices i and j in the four-index array T with indices i, j, a, and b allows the sub-array T_ij to be manipulated as a matrix. [Pg.171]
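The frozen-index idea can be sketched as follows. This is an assumption-laden toy, not the actual representation used in the work: the sparse four-index array is modeled as a dictionary keyed by the frozen (i, j) pair, holding only the blocks that carry amplitudes:

```python
import numpy as np

# Hypothetical sparse storage for double-substitution amplitudes T[i, j, a, b]:
# only occupied pairs with non-negligible amplitudes get a stored ab-block.
n_virt = 4
T = {
    (0, 1): np.arange(n_virt * n_virt, dtype=float).reshape(n_virt, n_virt),
    (2, 3): np.eye(n_virt),
}

def block(T, i, j):
    """Return the sub-array T_ij as a matrix (zeros if the block is absent)."""
    return T.get((i, j), np.zeros((n_virt, n_virt)))

# Freezing i and j lets the ab-block be manipulated with matrix algebra,
# e.g. a contraction K_ij @ T_ij of the kind appearing in residual evaluation
# (K01 here is an invented placeholder integral block).
K01 = np.eye(n_virt)
R01 = K01 @ block(T, 0, 1)
```

Missing (i, j) keys simply yield zero blocks, which is what makes the storage sparse.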

Objects in the BzzMatrixSparseSymmetricLocked class collect the data for a sparse symmetric matrix. [Pg.154]
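The original storage listing does not survive in this excerpt. As a hedged sketch of one common scheme for a sparse symmetric matrix (not necessarily the one used by the BzzMath class), only the lower-triangle nonzeros are kept in coordinate form and mirrored on the fly:

```python
import numpy as np

# Assumed scheme: coordinate storage of the lower triangle (incl. diagonal)
# for the symmetric matrix A = [[4, 0, 1], [0, 3, 0], [1, 0, 5]].
rows = np.array([0, 1, 2, 2])
cols = np.array([0, 1, 0, 2])
vals = np.array([4.0, 3.0, 1.0, 5.0])
n = 3

def matvec(x):
    """y = A @ x using only the stored lower-triangle entries."""
    y = np.zeros(n)
    for r, c, v in zip(rows, cols, vals):
        y[r] += v * x[c]
        if r != c:          # mirror the off-diagonal entry
            y[c] += v * x[r]
    return y

x = np.ones(n)
print(matvec(x))  # equivalent to the dense A @ x
```

Storing one triangle roughly halves the memory for the off-diagonal nonzeros.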

Different processes such as eddy turbulence, bottom currents, stagnation of flows, and storm-water events can be simulated, using either a laminar or a turbulent flow model. All processes are displayed in real-time graphical mode (history, contour graph, surface, etc.), and you can also record them to data files. Thanks to innovative sparse matrix technology, the calculation process is fast and stable; a large number of layers in the vertical and horizontal directions can be used, as well as a small time step. You can hunt for these programs on the Web. [Pg.305]

For medium and large networks, the occurrence matrix that is of the same structure (isomorphic) as the coefficient matrix of the governing equations is usually quite sparse. For example, Stoner (S5) showed a 155-vertex network with a density of 3.2% for the occurrence matrix (i.e., 775 nonzeros out of a total of 155² = 24,025 entries) using formulation C. Still lower densities have been observed on larger networks. In these applications it is of paramount importance that the data structure and data manipulations take full advantage of the sparsity of the governing equations. Sparse computation techniques are also needed in order to capture the full benefit of cycle selection and row and column reordering. [Pg.166]
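The quoted density figure can be reproduced directly:

```python
# Stoner's 155-vertex example: 775 nonzeros in a 155 x 155 occurrence matrix.
n = 155
nonzeros = 775
density = nonzeros / (n * n)
print(f"{density:.1%}")  # 3.2%
```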

Spectral data are highly redundant (many vibrational modes of the same molecules) and sparse (large spectral segments with no informative features). Hence, before a full-scale chemometric treatment of the data is undertaken, it is very instructive to understand the structure and variance in recorded spectra. Accordingly, eigenvector-based analyses of spectra are common, and a primary technique is principal components analysis (PCA). PCA is a linear transformation of the data into a new coordinate system (axes) such that the largest variance lies on the first axis and decreases thereafter for each successive axis. PCA can also be considered a view of the data set that aims to explain all deviations from an average spectral property. Data are typically mean-centered prior to the transformation, and the mean spectrum is used as the base comparator. The transformation to a new coordinate set is performed via matrix multiplication as... [Pg.187]
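A minimal sketch of this procedure on invented spectral data, using the SVD of the mean-centered matrix (one standard way to compute PCA; the data sizes are assumptions):

```python
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical spectral data: 20 spectra x 100 wavenumber channels.
X = rng.normal(size=(20, 100))

# Mean-center: the mean spectrum serves as the base comparator.
mean_spectrum = X.mean(axis=0)
Xc = X - mean_spectrum

# PCA via the SVD: rows of Vt are the new axes (loadings); the scores
# are the projections of each spectrum onto those axes.
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
scores = Xc @ Vt.T   # the matrix multiplication performing the transform
explained_variance = s**2 / (len(X) - 1)
```

Because singular values are returned in decreasing order, the variance along the first axis is largest and decreases for each successive axis, as described above.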

Global two-stage: applicable to experimental and observational (sparse) pharmacokinetic data. It may provide unbiased estimates under certain conditions, but the asymptotic covariance matrix used in the calculations is approximate and may lead to poor variance estimates when the variance is not normally distributed. [Pg.2954]

Substantial advantages are derived from the separable form of the electron interaction. Seven one-particle Hermitian matrices are required for the generation of the Hamiltonian in the present, reduced form. The matrices will be sparse and demand modest storage. Savings in storage become essential with increasing basis sets, but even for the present case it is notable that seven 10-by-10 matrices hold the data for the full 210-by-210 Fock-space Hamiltonian. Symmetry and number conservation further reduce the number of non-vanishing matrix elements. [Pg.49]
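The storage saving is easy to quantify. As an aside, the dimension 210 matches C(10, 4); interpreting it as a 4-particle sector of a 10-orbital Fock space is an assumption on our part, but the entry-count comparison holds regardless:

```python
from math import comb

# Storage comparison quoted in the text: seven 10-by-10 one-particle
# matrices versus the full 210-by-210 Fock-space Hamiltonian.
one_particle_entries = 7 * 10 * 10      # 700
full_hamiltonian_entries = 210 * 210    # 44100

print(one_particle_entries, "entries vs", full_hamiltonian_entries)
print("C(10, 4) =", comb(10, 4))        # 210 (assumed interpretation)
```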

Sometimes, instead, if data are gathered without any specific project in mind, it happens that the result is a sparse matrix containing some blank cells. In such cases, if the percentage of missing data is quite high, the whole data set is not suitable for a multivariate analysis; as a consequence, the variables and/or the objects with the lowest number of data must be removed, and therefore a huge amount of experimental effort can be lost. [Pg.222]
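One possible sketch of this pruning, with blanks encoded as NaN (the table values and the 50% cutoff are invented for the example):

```python
import numpy as np

# Hypothetical sparse data table with blanks encoded as NaN.
X = np.array([
    [1.0,    2.0, np.nan, 4.0],
    [2.0, np.nan, np.nan, 3.0],
    [1.5,    2.5, np.nan, 3.5],
    [np.nan, 2.2, np.nan, 4.1],
])

# Remove first the variables (columns) whose fraction of missing entries
# exceeds a chosen cutoff (0.5 here, an assumption), then any objects
# (rows) that are still incomplete.
max_missing = 0.5
col_ok = np.isnan(X).mean(axis=0) <= max_missing
X1 = X[:, col_ok]
row_ok = ~np.isnan(X1).any(axis=1)
X_complete = X1[row_ok]

print(X_complete.shape)
```

Here a 4 x 4 table shrinks to a 2 x 3 complete block: the experimental effort behind the discarded cells is lost, exactly as the text warns.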

These blind predictions of the FEBEX data do not make a strong case that, for this particular geomechanical situation, a coupled analysis is entirely necessary. The granite in this case is sparsely fractured, and most of the inflow occurs at the lamprophyre and other more fractured areas. Also, the rock mass is sufficiently nonporous and saturated that inelastic deformation of the rock matrix is not a significant issue for repository performance. However, the exercise was very valuable for developing rationale for modeling the more complex coupled problems associated with the introduction of the bentonite barrier and the heat of the simulated waste. [Pg.130]

Iterative algorithms are recommended for some linear systems Ax = b as an alternative to direct algorithms. An iteration usually amounts to one or two multiplications of the matrix A by a vector and to a few linear operations with vectors. If A is sparse, small storage space suffices. This is a major advantage of iterative methods in cases where the direct methods have large fill-in. Furthermore, with appropriate data structures, arithmetic operations are actually performed only where both operands are nonzero; then D(A) or 2D(A) flops per iteration and D(A) + 2n units of storage space suffice, where D(A) denotes the number of nonzeros in A. Finally, iterative methods allow implicit symmetrization, when the iteration applies to the symmetrized system A^T A x = A^T b without explicit evaluation of A^T A, which would have replaced A by the less sparse matrix A^T A. [Pg.194]
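The two key points above can be sketched together: a coordinate-format matvec that performs work only on the D(A) nonzeros, and a conjugate-gradient iteration on the normal equations A^T A x = A^T b that never forms A^T A explicitly (the small matrix and right-hand side are invented for the example):

```python
import numpy as np

# COO storage of a sparse A = [[4, 0, 1], [0, 3, 0], [1, 0, 5]]; each
# matvec touches only the nonzeros, so it costs on the order of D(A) flops.
rows = np.array([0, 0, 1, 2, 2])
cols = np.array([0, 2, 1, 0, 2])
vals = np.array([4.0, 1.0, 3.0, 1.0, 5.0])
n = 3

def matvec(x):        # y = A @ x
    y = np.zeros(n)
    for r, c, v in zip(rows, cols, vals):
        y[r] += v * x[c]
    return y

def matvec_T(x):      # y = A.T @ x, reusing the same nonzeros
    y = np.zeros(n)
    for r, c, v in zip(rows, cols, vals):
        y[c] += v * x[r]
    return y

# Implicit symmetrization: conjugate gradients on A^T A x = A^T b,
# applied purely through matvec and matvec_T.
b = np.array([1.0, 2.0, 3.0])
x = np.zeros(n)
r = matvec_T(b) - matvec_T(matvec(x))
p = r.copy()
for _ in range(50):
    Ap = matvec_T(matvec(p))
    alpha = (r @ r) / (p @ Ap)
    x += alpha * p
    r_new = r - alpha * Ap
    if np.linalg.norm(r_new) < 1e-12:
        break
    p = r_new + (r_new @ r_new) / (r @ r) * p
    r = r_new
```

For this nonsingular A, the iterate x converges to the solution of Ax = b while A^T A is never stored.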







© 2024 chempedia.info