Training sets

Neural network classifiers. Neural network and other statistical classifiers impose strong requirements on the data and on the inspection; when these are fulfilled, however, good fully automatic classification systems can be developed within a short period of time. This is the case, for example, when the inspection is part of a manufacturing process in which the inspected pieces and the possible defect mechanisms are well known and the whole NDT inspection is done under repeatable conditions. In such cases it is possible to collect (or manufacture) a set of defect pieces, which can be used to obtain a training set. There are some commercially available tools (like ICEPAK [Chan, et al., 1988]) which can construct classifiers without any a-priori information, based only on the training sets of data. One must, however, always keep the limitations of this technique in mind; otherwise serious misclassifications may go unnoticed. [Pg.100]
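
A minimal sketch of such a fully automatic classifier, using scikit-learn rather than a dedicated package like ICEPAK; the inspection features, class labels, and network size below are placeholders, not taken from the source:

```python
# Hypothetical NDT defect classifier: features and labels are placeholders.
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 8))        # 200 inspected pieces, 8 features
y_train = rng.integers(0, 3, size=200)     # 3 hypothetical defect classes

clf = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)

# Cross-validation estimates performance only inside the space covered by
# the training set -- the limitation noted above still applies outside it.
scores = cross_val_score(clf, X_train, y_train, cv=5)
print(f"CV accuracy: {scores.mean():.2f} +/- {scores.std():.2f}")

clf.fit(X_train, y_train)                  # final classifier for deployment
```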

Table 2. Blind test results (54 spectra) using final networks obtained from training set TS I.
In Eq. (12), SE is the standard error, c is the number of selected variables, p is the total number of variables (which can differ from c), and d is a smoothing parameter to be set by the user. As mentioned above, there is a certain threshold beyond which an increase in the number of variables results in a decrease in the quality of modeling. In effect, the smoothing parameter reflects the user's guess of how much detail is to be modeled in the training set. [Pg.218]

In Eq. (14), "est" stands for the calculated (estimated) response, "exp" for the experimental one, and n and m are the numbers of objects in the training set and the test set, respectively. CoSE stands for Compound Standard Error. As an option, one can employ several test sets, if needed. [Pg.218]
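
Eq. (14) itself is not reproduced in the excerpt, so the following is only a guessed reading: a Compound Standard Error that pools squared residuals over the n training and m test objects. Treat the formula as an assumption, not the source's definition.

```python
import numpy as np

def compound_standard_error(est_train, exp_train, est_test, exp_test):
    """Assumed form of CoSE: pooled RMS residual over the n training
    and m test objects. Eq. (14) is not shown in the excerpt, so this
    is a labeled guess, not the source's definition."""
    r_train = np.asarray(est_train) - np.asarray(exp_train)
    r_test = np.asarray(est_test) - np.asarray(exp_test)
    n, m = r_train.size, r_test.size
    return np.sqrt((np.sum(r_train**2) + np.sum(r_test**2)) / (n + m))
```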

Our recommendation is that one should use leave-n-out cross-validation rather than leave-one-out. Nevertheless, there is a possibility that test sets derived in this way would be incompatible with the training sets with respect to information content, i.e., the test sets could well lie outside the modeling space [8]. [Pg.223]
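
A sketch of leave-n-out cross-validation in Python, taking KFold with N/n splits as the practical realization (each round leaves roughly n objects out); the PLS model, toy data, and n = 5 are illustrative assumptions:

```python
# Leave-n-out cross-validation sketched with scikit-learn: KFold with
# N // n splits leaves roughly n objects out per round.
import numpy as np
from sklearn.model_selection import KFold
from sklearn.cross_decomposition import PLSRegression

rng = np.random.default_rng(1)
X = rng.normal(size=(60, 10))
y = X[:, 0] - 2 * X[:, 1] + rng.normal(scale=0.1, size=60)   # toy response

n_out = 5
cv = KFold(n_splits=len(X) // n_out, shuffle=True, random_state=1)

press = 0.0                              # predictive residual sum of squares
for train_idx, test_idx in cv.split(X):
    model = PLSRegression(n_components=3).fit(X[train_idx], y[train_idx])
    resid = y[test_idx] - model.predict(X[test_idx]).ravel()
    press += np.sum(resid**2)
print("Q2 =", round(1 - press / np.sum((y - y.mean())**2), 3))
```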

First, one can check whether a randomly compiled test set is within the modeling space, before employing it for PCA/PLS applications. Suppose one has calculated the scores matrix T and the loading matrix P with the help of a training set. Let z be the characteristic vector (that is, the set of independent variables) of an object in a test set. Then, we first must calculate the scores vector of the object (Eq. (14)). [Pg.223]
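
The check described here can be sketched as follows. The data are assumed mean-centered with X approximately equal to T Pᵀ, and the acceptance thresholds are simple heuristics, not the criteria of reference [8]:

```python
# Heuristic modeling-space check for a PCA model: project the test
# object's vector z onto the loadings P to get its scores, then inspect
# the score range and the reconstruction residual.
import numpy as np

def inside_modeling_space(z, T, P, k=3.0):
    """T: (n, a) training scores; P: (p, a) loadings; z: (p,) test vector."""
    t = z @ P                      # scores vector of the test object
    residual = z - t @ P.T         # part of z the model cannot reproduce
    score_ok = np.all(np.abs(t) <= k * T.std(axis=0))
    return bool(score_ok), float(np.linalg.norm(residual))
```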

The Kohonen Self-Organizing Maps can be used in a similar manner. Suppose x_k, k = 1, ..., N, is the set of input (characteristic) vectors and w_ij, i = 1, ..., I, j = 1, ..., J, are the weight vectors of the trained network, one for each (i, j) cell of the map; N is the number of objects in the training set, and I and J are the dimensionalities of the map. Now we can compare each x_k with the w_ij of the particular cell to which the object was allocated. This procedure enables us to detect the maximal (e_max) and minimal (e_min) errors of fitting. Hence, if the error calculated in this way falls outside the range between e_min and e_max, the object probably does not belong to the training population. [Pg.223]
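
A direct transcription of this procedure, assuming the trained map's weights are available as an I x J x p array:

```python
# Error-of-fit test for a trained Kohonen map: record the range
# [e_min, e_max] of fit errors over the training set, then flag objects
# whose error falls outside it.
import numpy as np

def fit_error(x, weights):
    """Distance from x to the weight vector of its best-matching cell."""
    return np.linalg.norm(weights - x, axis=-1).min()

def training_error_range(X_train, weights):
    errors = np.array([fit_error(x, weights) for x in X_train])
    return errors.min(), errors.max()

def in_training_population(x, weights, e_min, e_max):
    return e_min <= fit_error(x, weights) <= e_max
```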

Therefore the 28 analytes and their enantiomers were encoded by the conformation-dependent chirality code (CDCC) and submitted to a Kohonen neural network (Figure 8-13). They were divided into a test set of six compounds, chosen to cover a variety of skeletons, which were not used for the training. That left a training set containing the remaining 50 compounds. [Pg.424]

One application of clustering could, for example, be the comparison of compound libraries: a training set is chosen which contains members of both libraries. After the structures are coded (cf. Chapter 8), a Kohonen network (cf. Section 9.5.3) is trained and arranges the structures within the Kohonen map in relation to their structural similarity. Thus, the overlap between the two different libraries of compounds can be determined. [Pg.473]

A data set can be split into a training set and a test set randomly or according to a specific rule. The 1293 compounds were divided into a training set of 741 compounds and a test set of 552 compounds, based on their distribution in a KNN map. From each occupied neuron, one compound was selected and taken into the training set, and the other compounds were put into the test set. This selection ensured that both the training set and the test set contained as much information as possible and covered the chemical space as widely as possible. [Pg.500]
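
The splitting rule is easy to reproduce. The sketch below uses the MiniSom library as a stand-in Kohonen map; the map size and training length are arbitrary choices, not those of the original study:

```python
# Map-based train/test split: one compound per occupied neuron goes into
# the training set, all remaining compounds into the test set.
import numpy as np
from minisom import MiniSom

def som_split(X, map_x=10, map_y=10, n_iter=5000, seed=0):
    som = MiniSom(map_x, map_y, X.shape[1], random_seed=seed)
    som.train_random(X, n_iter)
    train_idx, test_idx, seen = [], [], set()
    for i, x in enumerate(X):
        cell = som.winner(x)          # (row, col) of the winning neuron
        if cell not in seen:          # first compound in this neuron
            seen.add(cell)
            train_idx.append(i)       # -> training set
        else:
            test_idx.append(i)        # -> test set
    return train_idx, test_idx
```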

Multiple linear regression analysis is a widely used method, in this case assuming that a linear relationship exists between solubility and the 18 input variables. The multilinear regression analysis was performed with the SPSS program [30]. The training set was used to build a model, and the test set was used for the prediction of solubility. The MLRA model provided, for the training set, a correlation coefficient of r = 0.92 and a standard deviation of s = 0.78, and for the test set, r = 0.94 and s = 0.68. [Pg.500]
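
The same build-on-training, predict-on-test workflow, sketched with scikit-learn in place of SPSS; the descriptor matrices and responses are placeholders supplied by the caller:

```python
# MLR workflow: fit on the training set, report r and s for both sets.
import numpy as np
from sklearn.linear_model import LinearRegression

def fit_and_score(X_train, y_train, X_test, y_test):
    model = LinearRegression().fit(X_train, y_train)
    for name, X, y in (("training", X_train, y_train), ("test", X_test, y_test)):
        y_pred = model.predict(X)
        r = np.corrcoef(y, y_pred)[0, 1]          # correlation coefficient
        s = np.sqrt(np.mean((y - y_pred) ** 2))   # standard error of prediction
        print(f"{name} set: r = {r:.2f}, s = {s:.2f}")
    return model
```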

A relatively small training set of 744 NMR chemical shifts for protons from 120 molecular structures was collected from the literature. This set was designed to cover as many situations of protons in organic structures as possible. Only data from spectra obtained in CDCl3 were considered. The collection was restricted to CH protons and to compounds containing the elements C, H, N, O, S, F, Cl, Br, or I. [Pg.524]

Figure 10.2-5. Procedure for spectra simulation: the query structure is coded, a training set of structure-spectra pairs is selected from the database, and the counterpropagation network is trained.
Empirical methods, such as group additivity, cannot be expected to be any more accurate than the uncertainty in the experimental data used to parameterize them. They can be much less accurate if the functional form is poorly chosen or if properties are predicted for compounds significantly different from those in the training set. [Pg.121]

Ideally, the results should be validated. One of the best methods for doing this is to make predictions for compounds known to be active that were not included in the training set. It is also desirable to eliminate compounds that are statistical outliers in the training set. Unfortunately, some studies, such as drug activity prediction, may not have enough known active compounds to make this step feasible. In this case, the estimated error in prediction should be increased accordingly. [Pg.248]
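
One simple way to flag statistical outliers in a training set is to drop compounds whose cross-validated residual exceeds k standard deviations; this particular criterion, and the linear model used, are illustrative choices rather than a procedure prescribed by the source:

```python
# Illustrative outlier filter for a training set.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_predict

def drop_outliers(X, y, k=3.0):
    y_cv = cross_val_predict(LinearRegression(), X, y, cv=5)
    resid = y - y_cv
    keep = np.abs(resid) <= k * resid.std()   # retain non-outliers only
    return X[keep], y[keep]
```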

Many of the tools are aimed at classification and prediction problems, such as the handwriting example, where a training set of data vectors for which the property is known is used to develop a classification rule. Then the rule can be applied to a test set of data vectors for which the property is... [Pg.417]

Clean and inspect equipment after each change
Segregate incompatible materials
Label materials, lines, pumps, and valves
Use a double-check system
Check labels against batch sheets
Use procedures and training
Set valves to the correct flow path [Pg.101]

For either (ii) or (iii), it may well be that a local minimum has been located. Under these conditions the BPA may be restarted, and, if again unsuccessful, a new training set may be required. [Pg.354]
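
Assuming BPA denotes the backpropagation algorithm, restarting from fresh random weights might be sketched as below; the model, score threshold, and restart count are arbitrary choices:

```python
# Restarting training from fresh random weights after a suspected local
# minimum; MLPRegressor stands in for the BPA-trained network.
from sklearn.neural_network import MLPRegressor

def train_with_restarts(X, y, n_restarts=5, good_enough=0.9):
    best = None
    for seed in range(n_restarts):
        net = MLPRegressor(hidden_layer_sizes=(10,), max_iter=3000,
                           random_state=seed).fit(X, y)
        score = net.score(X, y)                 # R^2 on the training set
        if best is None or score > best[0]:
            best = (score, net)
        if score >= good_enough:                # converged; stop restarting
            break
    # If every restart fails, a new training set may be required.
    return best[1]
```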



