Big Chemical Encyclopedia


Variable space

Factorial design methods cannot always be applied to QSAR-type studies. For example, it may not be practically possible to make any compounds at all with certain combinations of factor values (in contrast to the situation where the factors are physical properties such as temperature or pH, which can be easily varied). Under these circumstances, one would like to know which compounds from those that are available should be chosen to give a well-balanced set with a wide spread of values in the variable space. D-optimal design is one technique that can be used for such a selection. This technique chooses subsets of... [Pg.713]
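
The idea can be illustrated with a minimal sketch (not the cited text's specific procedure): greedily add, at each step, the candidate compound that most increases det(XᵀX) of the chosen subset, which is the quantity a D-optimal criterion maximizes. The candidate matrix, subset size, and ridge term below are assumptions made for the example.

```python
import numpy as np

def d_optimal_greedy(X, k, seed=0):
    """Greedy forward selection of k rows of the candidate matrix X that
    (approximately) maximize det(X'X) -- a sketch of D-optimal selection."""
    rng = np.random.default_rng(seed)
    n, m = X.shape
    chosen = [int(rng.integers(n))]          # random starting compound
    while len(chosen) < k:
        best, best_logdet = None, -np.inf
        for i in range(n):
            if i in chosen:
                continue
            S = X[chosen + [i]]
            # slogdet with a tiny ridge for numerical stability
            _, logdet = np.linalg.slogdet(S.T @ S + 1e-9 * np.eye(m))
            if logdet > best_logdet:
                best, best_logdet = i, logdet
        chosen.append(best)
    return sorted(chosen)

# 20 hypothetical available compounds described by 3 property variables
X = np.random.default_rng(1).normal(size=(20, 3))
subset = d_optimal_greedy(X, k=6)
```

A full exchange algorithm would also swap chosen rows in and out; the greedy pass above only shows the selection criterion.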

Discriminant analysis is a supervised learning technique which uses classified dependent data. Here, the dependent data (y values) are not on a continuous scale but are divided into distinct classes. There are often just two classes (e.g. active/inactive, soluble/not soluble, yes/no), but more than two are also possible (e.g. high/medium/low, 1/2/3/4). The simplest situation involves two variables and two classes, and the aim is to find a straight line that best separates the data into its classes (Figure 12.37). With more than two variables, the line becomes a hyperplane in the multidimensional variable space. Discriminant analysis is characterised by a discriminant function, which in the particular case of linear discriminant analysis (the most popular variant) is written as a linear combination of the independent variables ... [Pg.719]
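
The two-variable, two-class case can be sketched with Fisher's linear discriminant, one standard way to obtain such a separating line; the data below are synthetic and purely illustrative.

```python
import numpy as np

def fisher_direction(X1, X2):
    """Fisher linear discriminant: weight vector w of the linear
    discriminant function; the separating line is w.x = c."""
    m1, m2 = X1.mean(axis=0), X2.mean(axis=0)
    # pooled within-class scatter matrix
    Sw = np.cov(X1.T) * (len(X1) - 1) + np.cov(X2.T) * (len(X2) - 1)
    w = np.linalg.solve(Sw, m1 - m2)
    c = w @ (m1 + m2) / 2            # threshold halfway between class means
    return w, c

rng = np.random.default_rng(0)
active = rng.normal([2.0, 2.0], 0.5, size=(30, 2))     # class 1
inactive = rng.normal([0.0, 0.0], 0.5, size=(30, 2))   # class 2
w, c = fisher_direction(active, inactive)
# an object x is assigned to class 1 when w @ x > c
```

With more than two variables the same w defines the separating hyperplane mentioned in the text.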

FIG. 3-50 Finite difference grid with variable spacing. [Pg.476]

Qualitative description of physical behaviors requires that each continuous variable space be quantized. Quantization is typically based on landmark values, that is, boundary points separating qualitatively distinct regions of continuous values. By using these qualitative quantity descriptions, dynamic relations between variables can be modeled as qualitative equations that represent the structure of the system. The... [Pg.509]
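
Landmark-based quantization amounts to a lookup into the regions between boundary points. A minimal sketch (the landmarks and labels are invented for the example; the boundary convention is a choice):

```python
import bisect

def qualitative_value(x, landmarks, labels):
    """Map a continuous value x to the qualitative region it falls in.
    landmarks are sorted boundary points; len(labels) == len(landmarks) + 1.
    A value exactly on a landmark is assigned to the lower region here."""
    return labels[bisect.bisect_left(landmarks, x)]

# hypothetical landmarks for water temperature in deg C
landmarks = [0.0, 100.0]
labels = ["solid", "liquid", "gas"]
state = qualitative_value(25.0, landmarks, labels)   # "liquid"
```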

The known models for describing the retention factor in the whole variable space are based on the three-phase model and contain from three to six parameters and various combinations of two independent factors (micelle concentration, volume fraction of organic modifier). When retention models are compared, or the accuracy of fitting is established, the closeness of the correlation coefficient to 1 and the sum of squared residuals, or the sum of absolute deviations and their relative values, are taken into account. A number of problems appear in this case ... [Pg.45]
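
The comparison criteria mentioned (correlation coefficient close to 1, sum of squared residuals) can be sketched on a hypothetical three-parameter retention model. The functional form 1/k = c0 + c1·[M] + c2·φ and all numbers below are assumptions made for the illustration, not the models of the cited work.

```python
import numpy as np
from scipy.optimize import curve_fit

# hypothetical three-parameter model: 1/k = c0 + c1*[M] + c2*phi
# ([M] = micelle concentration, phi = volume fraction of organic modifier)
def model(X, c0, c1, c2):
    M, phi = X
    return 1.0 / (c0 + c1 * M + c2 * phi)

M = np.tile([0.02, 0.04, 0.06, 0.08, 0.10], 3)
phi = np.repeat([0.05, 0.10, 0.15], 5)
k_obs = model((M, phi), 0.1, 2.0, 1.5)        # synthetic retention factors

popt, _ = curve_fit(model, (M, phi), k_obs, p0=[0.2, 1.0, 1.0])
k_fit = model((M, phi), *popt)
ssr = float(np.sum((k_obs - k_fit) ** 2))      # sum of squared residuals
r = float(np.corrcoef(k_obs, k_fit)[0, 1])     # correlation coefficient
```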

The last equation ensured a better description of the experimental data in the whole variable space than the other known three-parameter equations. [Pg.81]

The retinoid X receptor forms heterodimers that recognize tandem repeats with variable spacings... [Pg.185]

It is a first-order differential equation in time, but second-order in the spatial variables. Space and time do not enter on an equal footing, as required by the special theory of relativity. [Pg.305]

Figure 14.10 Illustration of the sphere method. Energy minima on the hyperspheres are denoted by , while R indicates a (local) minimum in the full variable space...
Once a model has been fitted to the available data and parameter estimates have been obtained, the experimenter may pose two further questions: How important is a single parameter in modifying the prediction of the model in a certain region of independent variable space, say at a certain point in time? And how important is the numerical value of a specific observation in determining the estimated value of a particular parameter? Although both questions fall within the domain of sensitivity analysis, in the following we shall address the first. The second question is addressed in Section 3.6 on optimal design. [Pg.86]
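
The first question can be made concrete with a finite-difference parameter sensitivity coefficient, dy/dθᵢ, evaluated at chosen points of the independent variable. The first-order decay model below is an invented example, not the model of the cited text.

```python
import numpy as np

def param_sensitivity(model, theta, i, t, h=1e-6):
    """Central finite-difference sensitivity dy/dtheta_i of a model
    prediction at points t in the independent-variable space."""
    tp, tm = theta.copy(), theta.copy()
    tp[i] += h
    tm[i] -= h
    return (model(t, tp) - model(t, tm)) / (2.0 * h)

# illustrative model y = A*exp(-k*t), parameters theta = [A, k]
def decay(t, theta):
    A, k = theta
    return A * np.exp(-k * t)

t = np.linspace(0.0, 5.0, 6)
s_k = param_sensitivity(decay, np.array([1.0, 0.5]), 1, t)
# analytic check: dy/dk = -A * t * exp(-k*t)
```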

Any data matrix can be considered in two spaces the column or variable space (here, wavelength space) in which a row (here, spectrum) is a vector in the multidimensional space defined by the column variables (here, wavelengths), and the row space (here, retention time space) in which a column (here, chromatogram) is a vector in the multidimensional space defined by the row variables (here, elution times). This duality of the multivariate spaces has been discussed in more detail in Chapter 29. Depending on the chosen space, the PCs of the data matrix... [Pg.246]
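
A tiny numeric illustration of this duality, using invented numbers in place of a real HPLC-DAD matrix: each row is a spectrum (a point in wavelength space), each column a chromatogram (a point in elution-time space).

```python
import numpy as np

# toy data matrix: 4 elution times (rows) x 3 wavelengths (columns)
X = np.arange(12.0).reshape(4, 3)

spectrum_at_t2 = X[2, :]   # a row: one point in the 3-dim column (wavelength) space
chromatogram_w1 = X[:, 1]  # a column: one point in the 4-dim row (elution-time) space
```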

The two eigenvectors define a plane in the original variable space. This process can be repeated systematically until the eigenvalue associated with each new eigenvector is of such a small magnitude that it represents the noise associated with the observations more than it does information. In the limit where the number of significant eigenvectors equals the number... [Pg.26]

Alternatively, methods based on nonlocal projection may be used for extracting meaningful latent variables and applying various statistical tests to identify kernels in the latent variable space. Figure 17 shows how projections of data on two hyperplanes can be used as features for interpretations based on kernel-based or local methods. Local methods do not permit arbitrary extrapolation owing to the localized nature of their activation functions. [Pg.46]

Finally, electrodeposition in general is orthogonal to MBE and CVD, as it involves growth in a condensed phase under potential control instead of thermal control. This increases the variable space for producing materials: the diversity of conditions under which compounds can be formed. [Pg.8]

Software for numerical integration of equations includes the calculator HP-32SII, POLYMATH, and the programs of CONSTANTINIDES and CHAPRA & CANALE. The last of these can also handle tabular data with variable spacing. POLYMATH fits a polynomial to the tabular data and then integrates. A comparison is made in problem P1.03.03 of the integration of an equation by the trapezoidal and Runge-Kutta rules. One hundred intervals with the trapezoidal rule takes little time and the result is usually accurate enough, so it is often convenient to standardize on this number. [Pg.15]
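
The trapezoidal rule handles tabular data with variable spacing directly, since each interval is weighted by its own width. A short sketch, with an invented integrand whose exact integral (8/3 for x² on [0, 2]) lets the result be checked, including the one-hundred-interval case mentioned in the text:

```python
import numpy as np

def trapezoid(x, y):
    """Composite trapezoidal rule; valid for unequally spaced tabular data."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

# tabular data with variable spacing; integrand x^2 on [0, 2], exact value 8/3
x = np.array([0.0, 0.1, 0.25, 0.5, 1.0, 2.0])
coarse = trapezoid(x, x ** 2)

# one hundred equal intervals, the standardization the text suggests
x100 = np.linspace(0.0, 2.0, 101)
fine = trapezoid(x100, x100 ** 2)
```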

A fundamental idea in multivariate data analysis is to regard the distance between objects in the variable space as a measure of the similarity of the objects. Distance and similarity are inverse: a large distance means a low similarity. Two objects are considered to belong to the same category or to have similar properties if their distance is small. The distance between objects depends on the selected distance definition, the variables used, and the scaling of the variables. Distance measures in high-dimensional space are extensions of distance measures in two dimensions (Table 2.3). [Pg.58]

The Mahalanobis distance considers the distribution of the object points in the variable space (as characterized by the covariance matrix) and is independent of the scaling of the variables. The Mahalanobis distance between a pair of objects xA and xB is defined as... [Pg.60]
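
The definition itself is cut off in this snippet; the common form is d = √((xA − xB)ᵀ C⁻¹ (xA − xB)) with C the covariance matrix of the data. A sketch on synthetic data, including a check of the scale independence claimed in the text:

```python
import numpy as np

def mahalanobis(xa, xb, X):
    """Mahalanobis distance between objects xa and xb, using the covariance
    matrix of the data X to account for the distribution of the points."""
    C_inv = np.linalg.inv(np.cov(X, rowvar=False))
    d = xa - xb
    return float(np.sqrt(d @ C_inv @ d))

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
d = mahalanobis(X[0], X[1], X)

# scale independence: rescaling the variables leaves the distance unchanged
Xs = X * np.array([1.0, 10.0, 0.5])
d_scaled = mahalanobis(Xs[0], Xs[1], Xs)
```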

The loading vector is usually scaled to a length of 1, that is, bᵀb = 1; it defines a direction in the variable space. [Pg.65]

Nonlinear methods can also be applied to represent the high-dimensional variable space in a space of smaller dimension (possibly a two-dimensional plane); in general such a data transformation is called a mapping. Widely used in chemometrics are Kohonen maps (Section 3.8.3) as well as latent variables based on artificial neural networks (Section 4.8.3.4). These methods may be necessary if linear methods fail; however, they are more delicate to use properly and less strictly defined than linear methods. [Pg.67]

FIGURE 2.17 Projection of the object points from a two-dimensional variable space onto a direction b45, giving a latent variable with a high variance of the scores and therefore a good preservation of the distances in the two-dimensional space. [Pg.69]

For univariate data, only one variable is measured on a set of objects (samples), or on one object a number of times. For multivariate data, several variables are under consideration. The resulting numbers are usually stored in a data matrix X of size n × m, where the n objects are arranged in the rows and the m variables in the columns. In a geometric interpretation, each object can be considered as a point in an m-dimensional variable space. Additionally, a property of the objects can be stored in a vector y (n × 1), or several properties in a matrix Y (n × q) (Figure 2.19). [Pg.70]

Projection of the variable space onto a plane (defined by two loading vectors) is a powerful approach to visualize the distribution of the objects in the variable space, that is, to detect clusters and possibly outliers. Another aim of projection can be an optimal separation of given classes of objects. The score plot shows the result of a projection onto a plane; it is a scatter plot with a point for each object. The corresponding loading plot (with a point for each variable) indicates the relevance of the variables for certain object clusters. [Pg.71]

Principal component analysis (PCA) can be considered the mother of all methods in multivariate data analysis. The aim of PCA is dimension reduction, and PCA is the most frequently applied method for computing linear latent variables (components). PCA can be seen as a method to compute a new coordinate system formed by the latent variables, which is orthogonal, and in which only the most informative dimensions are used. Latent variables from PCA optimally represent the distances between the objects in the high-dimensional variable space (remember, the distance between objects is considered an inverse measure of their similarity). PCA considers all variables and accommodates the total data structure; it is a method for exploratory data analysis (unsupervised learning) and can be applied to practically any X-matrix; no y-data (properties) are considered and therefore none are necessary. [Pg.73]

The direction in the variable space that best preserves the relative distances between the objects is a latent variable with maximum variance of the scores (the projected data values on the latent variable). By definition, this direction is called the first principal component (PC1). It is defined by a loading vector... [Pg.73]
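
One standard way to compute this loading vector is the singular value decomposition of the mean-centered data: the first right singular vector is the unit-length loading of PC1, and no other unit direction yields a higher score variance. The data below are synthetic, built so that most variance lies along the first coordinate.

```python
import numpy as np

rng = np.random.default_rng(0)
# synthetic data: standard deviations 3, 1, 0.2 along the three variables
X = rng.normal(size=(100, 3)) * np.array([3.0, 1.0, 0.2])
Xc = X - X.mean(axis=0)                  # mean-center before PCA

U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
p1 = Vt[0]                               # first loading vector, length 1
t1 = Xc @ p1                             # scores: projections onto PC1
```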

FIGURE 3.4 Different distributions of object points in a three-dimensional variable space. [Pg.77]

Nonlinear mapping (NLM) as described by Sammon (1969) and others (Sharaf et al. 1986) has been popular in chemometrics. The aim of NLM is a two- (possibly one- or three-) dimensional scatter plot with a point for each of the n objects, optimally preserving the relative distances in the high-dimensional variable space. The starting point is a distance matrix for the m-dimensional space, applying the Euclidean distance or any other monotonic distance measure; this matrix contains the distances d_ij of all pairs of objects. A two-dimensional representation requires two map coordinates for each object; in total 2n numbers have to be determined. The starting map coordinates can be chosen randomly or can be, for instance, PCA scores. The distances in the map are denoted by d*_ij. A mapping error (stress, loss function) E_NLM can be defined as... [Pg.101]
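
A minimal sketch of the idea, assuming the usual Sammon-type stress Σ (d_ij − d*_ij)²/d_ij (shown here unnormalized) minimized by plain gradient descent from random starting coordinates; step size, iteration count, and data are choices made for the example, not Sammon's original steepest-descent variant.

```python
import numpy as np

def sammon_map(X, n_iter=300, lr=0.002, seed=0):
    """Minimal nonlinear mapping to two dimensions: gradient descent on the
    stress E = sum_{i<j} (d_ij - d*_ij)^2 / d_ij, where d_ij are distances
    in the m-dim variable space and d*_ij the distances in the 2-dim map."""
    X = np.asarray(X, float)
    n = len(X)
    D = np.sqrt(((X[:, None] - X[None, :]) ** 2).sum(-1))
    Dn = D + np.eye(n)                      # guard the zero diagonal
    rng = np.random.default_rng(seed)
    Y = 0.1 * rng.normal(size=(n, 2))       # random starting map coordinates
    stress = []
    for _ in range(n_iter):
        diff = Y[:, None] - Y[None, :]      # y_i - y_j, shape (n, n, 2)
        d = np.sqrt((diff ** 2).sum(-1)) + np.eye(n)
        W = (d - Dn) / (Dn * d)
        np.fill_diagonal(W, 0.0)
        stress.append(float((((d - Dn) ** 2) / Dn)[np.triu_indices(n, 1)].sum()))
        # dE/dy_i = 2 * sum_j W_ij * (y_i - y_j)
        Y = Y - lr * 2.0 * (W[:, :, None] * diff).sum(axis=1)
    return Y, stress

rng = np.random.default_rng(1)
X = rng.normal(size=(10, 5))               # 10 objects in a 5-dim variable space
Y, stress = sammon_map(X)                  # stress decreases over the iterations
```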

