Big Chemical Encyclopedia


Multidimensional space

The Kohonen network or self-organizing map (SOM) was developed by Teuvo Kohonen [11]. It can be used to classify a set of input vectors according to their similarity. The result of such a network is usually a two-dimensional map; thus, the Kohonen network is a method for projecting objects from a multidimensional space into a two-dimensional space. This projection preserves the topology of the multidimensional space, i.e., points that are close to one another in the multidimensional space are neighbors in the two-dimensional space as well. An advantage of this method is that the results of such a mapping can easily be visualized. [Pg.456]
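The topology-preserving mapping described above can be sketched in a few lines of Python. This is a minimal illustration, not Kohonen's original implementation; the grid size, linear decay schedules, and Gaussian neighborhood function are assumptions chosen for brevity, and all names are illustrative.

```python
import numpy as np

def train_som(data, grid=(5, 5), epochs=20, lr0=0.5, sigma0=2.0, seed=0):
    """Train a minimal self-organizing map: a 2-D grid of weight vectors is
    pulled toward the input vectors, with neighboring grid cells updated
    together so the map preserves the topology of the input space."""
    rng = np.random.default_rng(seed)
    rows, cols = grid
    weights = rng.random((rows, cols, data.shape[1]))
    # grid coordinates of every map cell, used by the neighborhood function
    coords = np.stack(
        np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"), axis=-1
    )
    for epoch in range(epochs):
        lr = lr0 * (1 - epoch / epochs)              # learning rate decays
        sigma = sigma0 * (1 - epoch / epochs) + 0.5  # neighborhood shrinks
        for x in data:
            # best-matching unit: the cell whose weight vector is closest to x
            d = np.linalg.norm(weights - x, axis=-1)
            bmu = np.unravel_index(np.argmin(d), d.shape)
            # Gaussian neighborhood around the BMU on the 2-D grid
            grid_dist = np.linalg.norm(coords - np.array(bmu), axis=-1)
            h = np.exp(-grid_dist**2 / (2 * sigma**2))
            weights += lr * h[..., None] * (x - weights)
    return weights

def project(weights, x):
    """Map an input vector to the grid position of its best-matching unit."""
    d = np.linalg.norm(weights - x, axis=-1)
    return np.unravel_index(np.argmin(d), d.shape)
```

After training on two well-separated clusters, projecting members of different clusters yields different map cells, which is the classification-by-similarity behavior the text describes.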

Other methods consist of algorithms based on multivariate classification techniques or neural networks; they are constructed for automatic recognition of structural properties from spectral data, or for simulation of spectra from structural properties [83]. Multivariate data analysis for spectrum interpretation is based on the characterization of spectra by a set of spectral features. A spectrum can be considered as a point in a multidimensional space with the coordinates defined by spectral features. Exploratory data analysis and cluster analysis are used to investigate the multidimensional space and to evaluate rules for distinguishing structure classes. [Pg.534]

The simplest and fastest techniques for grouping molecules are partitioning methods. Every molecule is represented by a point in an n-dimensional space, the axes of which are defined by the n components of the descriptor vector. The range of values for each component is then subdivided into a set of subranges (or bins). As a result, the entire multidimensional space is partitioned into a number of hypercubes (or cells) of fixed size, and every molecule (represented as a point in this space) falls into one of these cells [57]. [Pg.363]
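The cell assignment described above can be sketched as follows. This is an illustrative implementation assuming equal-width bins spanning the observed range of each descriptor; the function name and parameters are not taken from reference [57].

```python
import numpy as np

def assign_cells(X, bins=4):
    """Partition descriptor space into hypercubes of fixed size: each of the
    n descriptor axes is cut into `bins` equal-width subranges, and every
    molecule (row of X) is assigned the index tuple of the cell it falls in."""
    X = np.asarray(X, dtype=float)
    lo, hi = X.min(axis=0), X.max(axis=0)
    # digitize each descriptor into `bins` equal-width subranges
    idx = np.floor((X - lo) / (hi - lo + 1e-12) * bins).astype(int)
    # points sitting exactly on the upper edge belong to the last bin
    idx = np.clip(idx, 0, bins - 1)
    return [tuple(row) for row in idx]
```

Molecules that fall into the same cell are treated as belonging to the same group, which is why partitioning is so fast: no pairwise comparisons are required.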

A similarity-related approach is k-nearest neighbor (KNN) analysis, based on the premise that similar compounds have similar properties. Compounds are distributed in multidimensional space according to their values of a number of selected properties; the toxicity of a compound of interest is then taken as the mean of the toxicities of a number (k) of nearest neighbors. Cronin et al. [65] used KNN to model the toxicity of 91 heterogeneous organic chemicals to the alga Chlorella vulgaris, but found it no better than MLR. [Pg.481]
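The prediction step just described (mean over the k nearest neighbors) can be sketched as below; this is a generic illustration assuming Euclidean distance in the property space, not the exact protocol of Cronin et al. [65].

```python
import numpy as np

def knn_predict(X_train, y_train, x, k=3):
    """Predict a property of compound x as the mean of the property values
    of its k nearest neighbors in multidimensional descriptor space."""
    d = np.linalg.norm(np.asarray(X_train, dtype=float) - np.asarray(x, dtype=float), axis=1)
    nearest = np.argsort(d)[:k]          # indices of the k closest compounds
    return float(np.mean(np.asarray(y_train, dtype=float)[nearest]))
```

With a one-dimensional property and k = 3, an unknown at 0.6 surrounded by training points at 0, 1, and 2 would receive the mean of their responses.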

For example, the ZN theory, which overcomes all the defects of the Landau-Zener-Stueckelberg theory, can be incorporated into various simulation methods in order to clarify the mechanisms of dynamics in realistic molecular systems. Since the nonadiabatic coupling is a vector, we can always determine the relevant one-dimensional (1D) direction of the transition in multidimensional space, and thus the 1D ZN theory can be usefully utilized. Furthermore, the comprehension of reaction mechanisms can be deepened, since the formulas are given as simple analytical expressions. Since it is not feasible to treat realistic large systems fully quantum mechanically, it is appropriate to incorporate the ZN theory into some kind of semiclassical method. The promising semiclassical methods are (1) the initial value... [Pg.96]

Now the general formulation of the problem is finished and ready to be applied to real systems without relying on any local coordinates. The next problems to be solved for practical applications are (1) how to find the instanton trajectory q0(t) efficiently in multidimensional space and (2) how to incorporate high-level, accurate ab initio quantum chemical calculations, which are very time consuming. These problems are discussed in the following Section III.A.2. [Pg.119]

One might think that it would be easy to find the instanton trajectory by running classical trajectories, even in a multidimensional space. This is actually not true at all. Instead, we introduce a new parameter z, which spans the interval [-1, 1], in place of the time variable, and employ the variational principle with a set of basis functions to express the trajectory. The one-to-one correspondence between the time and z can be found from energy conservation, and the time variation of z is expressed as... [Pg.120]

Using this procedure is analogous to finding a set of orthogonal axes that represent the directions of greatest variation in the data. In PCA, one considers each row in the data matrix to be a point in multidimensional space, with coordinates defined by the values in the corresponding n columns of the data matrix. [Pg.94]
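The search for orthogonal axes of greatest variation can be sketched compactly via the singular value decomposition of the mean-centered data matrix; this is a standard minimal PCA, and the function name and return values are illustrative.

```python
import numpy as np

def pca_scores(X, n_components=2):
    """Project the rows of X (points in n-dimensional space) onto the
    orthogonal axes of greatest variation, found by SVD of the
    mean-centered data matrix."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)            # center: axes pass through the centroid
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = Xc @ Vt[:n_components].T  # coordinates along the first PCs
    explained = s**2 / np.sum(s**2)    # fraction of variance per component
    return scores, explained[:n_components]
```

For nearly collinear data, the first axis captures almost all of the variation, which is what makes the low-dimensional representation useful.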

In Chapter 29 we introduced the concept of the two dual data spaces. Each of the n rows of the data table X can be represented as a point in the p-dimensional column-space S^p. In Fig. 31.2a we have represented the n rows of X by means of the row-pattern F. The curved contour represents an equiprobability envelope, e.g., a curve that encloses 99% of the points. In the case of multinormally distributed data this envelope takes the form of an ellipsoid. For convenience we have represented only two of the p dimensions of S^p, which is in reality a multidimensional space rather than a two-dimensional one. One must also imagine the equiprobability envelope as an ellipsoidal (hyper)surface rather than the elliptical curve in the figure. The assumption that the data are distributed in a multinormal way is seldom fulfilled in practice, and the patterns of points often possess a more complex structure than is shown in our illustrations. In Fig. 31.2a the centroid or center of mass of the pattern of points appears at the origin of the space, but in the general case this need not be so. [Pg.104]

It has been shown in Chapter 29 that a set of vectors of the same dimension defines a multidimensional space S in which the vectors can be represented as points (or as directed line segments). If this space is equipped with a weighted metric defined by W, it will be denoted by the symbol S_W. The squared weighted distance between two points representing the vectors x and y in this space is defined by the weighted scalar product... [Pg.171]

Any data matrix can be considered in two spaces: the column or variable space (here, wavelength space), in which a row (here, a spectrum) is a vector in the multidimensional space defined by the column variables (here, wavelengths), and the row space (here, retention time space), in which a column (here, a chromatogram) is a vector in the multidimensional space defined by the row variables (here, elution times). This duality of the multivariate spaces has been discussed in more detail in Chapter 29. Depending on the chosen space, the PCs of the data matrix... [Pg.246]

Given these tables of multivariate data, one might be interested in various relationships. For example, do the two panels have a similar perception of the different olive oils (Tables 35.1 and 35.2)? Are the oils more or less similarly scattered in the two multidimensional spaces formed by the Dutch and by the British attributes? How are the two sets of sensory attributes related? Does the... [Pg.308]

Matrices such as X and B in Eq. (33), which are composed of a single column, are usually referred to as vectors. In fact, the vectors introduced in Chapter 4 can be written as column matrices in which the elements are the corresponding components. Of course, the vector X = [x_j] in Eq. (33) is of dimension n, while those in Chapter 4 were in three-dimensional space. It is apparent that the matrix notation introduced here is a more general method of representing vector algebra in multidimensional spaces. This idea is developed further in Section 7.7. [Pg.293]

Each object or data point is represented by a point in a multidimensional space. These plots or projected points are arranged in this space so that the distances between pairs of points have the strongest possible relation to the degree of similarity among the pairs of objects. That is, two similar objects are represented by two points that are close together, and two dissimilar objects are represented by a pair of points that are far apart. The space is usually a two- or three-dimensional Euclidean space, but may be non-Euclidean and may have more dimensions. [Pg.948]
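The arrangement described here is what classical (metric) multidimensional scaling computes. Below is a compact sketch using double centering of the squared-distance matrix; it assumes the dissimilarities are Euclidean distances, and the function name is illustrative.

```python
import numpy as np

def classical_mds(D, dims=2):
    """Embed objects in a low-dimensional Euclidean space so that
    inter-point distances reproduce the dissimilarity matrix D as well
    as possible (classical MDS via double centering)."""
    D = np.asarray(D, dtype=float)
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n   # centering matrix
    B = -0.5 * J @ (D**2) @ J             # inner-product (Gram) matrix
    vals, vecs = np.linalg.eigh(B)
    order = np.argsort(vals)[::-1][:dims] # keep the largest eigenvalues
    L = np.sqrt(np.clip(vals[order], 0, None))
    return vecs[:, order] * L             # coordinates in `dims` dimensions
```

When the input dissimilarities are genuine Euclidean distances between points in two dimensions, a two-dimensional embedding reproduces them exactly; for non-Euclidean dissimilarities the reproduction is only approximate, matching the text's remark that the space may be non-Euclidean.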

Exactly as in univariate analysis, once a model is created it can be applied to predict unknown samples. The difference with respect to the univariate case is that it is impossible to plot the model directly, because it is an equation in a multidimensional space. Hence, plots of predicted values vs. experimental values for standard samples of a training set are used to evaluate a model's reliability (validation). [Pg.64]

In order to apply the SA protocol, one of the keys is to design a mathematical function that adequately measures the diversity of a subset of selected molecules. Because each molecule is represented by molecular descriptors, geometrically it is mapped to a point in a multidimensional space. The distance between two points, such as Euclidean distance, Tanimoto distance, and Mahalanobis distance, then measures the dissimilarity between any two molecules. Thus, the diversity function to be designed should be based on all pairwise distances between molecules in the subset. One of the functions is as follows ... [Pg.382]
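One simple diversity function of the kind described (built from all pairwise distances in the subset) is sketched below. The specific functional form used in the source is not reproduced here; the sum of pairwise Euclidean distances is an illustrative choice, and the names are hypothetical.

```python
import numpy as np

def diversity(points, subset):
    """Score a subset of molecules by the sum of all pairwise Euclidean
    distances between their descriptor-space points; a larger value
    indicates a more diverse selection."""
    P = np.asarray(points, dtype=float)[list(subset)]
    diff = P[:, None, :] - P[None, :, :]
    d = np.linalg.norm(diff, axis=-1)   # symmetric distance matrix, zero diagonal
    return float(d.sum() / 2)           # count each pair once
```

A simulated-annealing selection would then propose swaps of molecules in and out of the subset and accept or reject them according to the change in this score.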

R is called the relaxation superoperator. Expanding the density operator in a suitable basis (e.g., product operators [7]), the σ above acquires the meaning of a vector in a multidimensional space, and eq. (2.1) is thereby converted into a system of linear differential equations. R in this formulation is a matrix, sometimes called the relaxation supermatrix. The elements of R are given as linear combinations of the spectral density functions J(ω), taken at frequencies corresponding to the energy level differences in the spin system. [Pg.328]

K-nearest neighbor classification is a general approach for classifying unknown samples. The predicted class of an unknown is assigned to the class of the sample(s) lying nearest to it in multidimensional space. [Pg.95]
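For k > 1, the class assignment is usually made by majority vote among the k nearest training samples. A minimal sketch, assuming Euclidean distance (the function name and tie-breaking behavior are illustrative):

```python
from collections import Counter
import numpy as np

def knn_classify(X_train, labels, x, k=3):
    """Assign the unknown x to the majority class among its k nearest
    training samples in multidimensional descriptor space."""
    d = np.linalg.norm(np.asarray(X_train, dtype=float) - np.asarray(x, dtype=float), axis=1)
    nearest = np.argsort(d)[:k]                 # indices of the k closest samples
    votes = Counter(labels[i] for i in nearest)
    return votes.most_common(1)[0][0]           # most frequent class wins
```

With k = 1 this reduces exactly to the rule stated in the text: the unknown takes the class of its single nearest neighbor.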

In fact, the space described by two or three PCs can be used to represent the objects (score plot), the original variables (loading plot), or both objects and variables (biplot). For instance, if the first two (low-order) PCs are drawn as the axes of a Cartesian plane, we may observe in this plane a fraction of the information enclosed in the original multidimensional space, corresponding to the sum of the variances explained by the two PCs. Since the PCs are uncorrelated, no duplicate information is shown in PC plots. [Pg.80]


