Pseudo-datasets

In another example, the null distribution obtained from 2000 pseudo-datasets was compared with the real distribution generated from 2000 runs of 10-fold cross-validation for ER232. The prediction accuracy for the real dataset was centered around 82%, while that for the pseudo-datasets was near 50% [60] (Figure 6.12). The distribution turned out to be much narrower for the real dataset than... [Pg.169]
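The comparison described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the procedure used in the cited study: the pseudo-datasets are assumed to be built by shuffling the class labels, the data are synthetic, the classifier is a toy one-dimensional nearest-centroid rule, and the number of runs is reduced from 2000 to 200 to keep the sketch fast. The names `cv_accuracy`, `real`, and `null` are hypothetical.

```python
import random
import statistics


def cv_accuracy(x, y, k=10, rng=None):
    """k-fold cross-validated accuracy of a 1-D nearest-centroid classifier."""
    rng = rng or random.Random()
    idx = list(range(len(x)))
    rng.shuffle(idx)                      # new random fold assignment per run
    folds = [idx[i::k] for i in range(k)]
    correct = 0
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        # Class centroids are computed from the training folds only.
        c0 = statistics.fmean(x[i] for i in train if y[i] == 0)
        c1 = statistics.fmean(x[i] for i in train if y[i] == 1)
        for i in fold:
            pred = 0 if abs(x[i] - c0) <= abs(x[i] - c1) else 1
            correct += pred == y[i]
    return correct / len(x)


rng = random.Random(0)
# Synthetic two-class data (an assumption; not the ER232 data).
y = [i % 2 for i in range(100)]
x = [rng.gauss(1.8 * label, 1.0) for label in y]

n_runs = 200  # the excerpt reports 2000 runs; reduced here for speed

# Real distribution: repeated 10-fold CV with the true labels.
real = [cv_accuracy(x, y, rng=rng) for _ in range(n_runs)]

# Null distribution: each pseudo-dataset keeps the features but shuffles labels.
null = []
for _ in range(n_runs):
    y_perm = y[:]
    rng.shuffle(y_perm)
    null.append(cv_accuracy(x, y_perm, rng=rng))

print("real: mean", statistics.fmean(real), "sd", statistics.stdev(real))
print("null: mean", statistics.fmean(null), "sd", statistics.stdev(null))
```

A real distribution that sits well above the null distribution indicates that the model has learned structure that is absent when the labels carry no information.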

Step 1 (Generating pseudo-data). Based on the ML Estimate 6, generate the 5-th pseudo-dataset pseudo ( ) of records. That is, to produce the SP Hessian estimates, we implement the following procedure to generate a dataset similar to the original ... [Pg.876]
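The idea of generating pseudo-datasets from a fitted model can be illustrated as follows. This is only a sketch under stated assumptions: the model is taken to be a normal distribution, the "original" data are synthetic, and the Hessian is evaluated analytically rather than by the SP (simultaneous perturbation) estimates mentioned in the excerpt. All function and variable names are hypothetical.

```python
import math
import random
import statistics


def ml_estimate(records):
    """ML estimates (mean, standard deviation) for a normal model."""
    mu = statistics.fmean(records)
    sigma = math.sqrt(statistics.fmean((r - mu) ** 2 for r in records))
    return mu, sigma


def generate_pseudo_dataset(mu, sigma, n_records, rng):
    """Draw one pseudo-dataset of n_records from the fitted model."""
    return [rng.gauss(mu, sigma) for _ in range(n_records)]


def neg_hessian_sigma(records, mu, sigma):
    """Negative second derivative of the normal log-likelihood w.r.t. sigma."""
    ss = sum((r - mu) ** 2 for r in records)
    return 3.0 * ss / sigma ** 4 - len(records) / sigma ** 2


rng = random.Random(1)
# Stand-in for the original dataset (an assumption for illustration).
original = [rng.gauss(5.0, 2.0) for _ in range(200)]
mu_hat, sigma_hat = ml_estimate(original)

# Average the negative Hessian over many pseudo-datasets drawn at the
# ML estimate; this is a Monte Carlo approximation of the Fisher information.
n_pseudo = 500
hessians = []
for _ in range(n_pseudo):
    pseudo = generate_pseudo_dataset(mu_hat, sigma_hat, len(original), rng)
    hessians.append(neg_hessian_sigma(pseudo, mu_hat, sigma_hat))

fisher_mc = statistics.fmean(hessians)
fisher_exact = 2 * len(original) / sigma_hat ** 2  # known result for sigma
print(fisher_mc, fisher_exact)
```

For this simple model the Fisher information is known in closed form, so the Monte Carlo average can be checked against it; in a realistic setting the closed form is usually unavailable, which is why pseudo-data are generated in the first place.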

Numerous experimental combinations of process conditions (SS or US), hydrogenation gas (H2 or D2), and solvent (H2O or D2O) have been explored. Table 2 summarizes the combinations chosen for study. The experiments are labeled B1-B7 for 3B20L and P1-P6 for 14PD30L. The second column lists the experimental conditions, and the third column lists the initial system concentration based on 100 mM of substrate and the amount of catalyst used. The penultimate column lists the final (extent of reaction > 95%) selectivity to ketone (2-butanone or 3-pentanone), and the final column lists the pseudo-first-order rate coefficient for substrate loss. The dataset in Table 2 supports numerous conclusions about the reaction systems. The differences in initial concentrations (e.g., 67 versus 100 M/g-cat.) arise from the choice, made for convenience, of having similar activities and therefore comparable reaction times. [Pg.219]
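A pseudo-first-order rate coefficient of the kind tabulated above is commonly obtained from the slope of -ln(C/C0) against time. The sketch below shows that calculation on hypothetical, noise-free concentration data; the values, units, and the function name `pseudo_first_order_k` are assumptions and are not taken from Table 2. It also assumes that the first measurement is taken at time zero.

```python
import math


def pseudo_first_order_k(times, concentrations):
    """Least-squares slope through the origin of -ln(C/C0) against time.

    Assumes times[0] == 0, so concentrations[0] is the initial value C0.
    """
    c0 = concentrations[0]
    y = [-math.log(c / c0) for c in concentrations]
    return sum(t * v for t, v in zip(times, y)) / sum(t * t for t in times)


# Hypothetical data: 100 mM substrate decaying with k = 0.05 per minute.
k_true = 0.05
times = [0, 10, 20, 30, 40, 60]                      # minutes
conc = [100.0 * math.exp(-k_true * t) for t in times]  # mM

k_fit = pseudo_first_order_k(times, conc)

# Time needed to reach a 95% extent of reaction: C/C0 = 0.05, so t = ln(20)/k.
t95 = math.log(20) / k_fit
print(k_fit, t95)
```

Forcing the fit through the origin reflects the fact that -ln(C/C0) is zero at time zero by construction.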

Although Eq. [29] has the appearance of a multiple regression, remember that the parameter estimates were not calculated by OLS. Instead, they were found by a biased regression method. Consequently, these parameters, which are referred to as pseudo-β's, will not in general equal the OLS values because they have been shrunken (some more than others). However, as more components are added to the latent variable model (Eq. [27]), i.e., as p increases toward k, these pseudo-β's approach the values obtained by OLS. In the limit p = k, Eq. [29] will be identical to the OLS model, a result that will be illustrated later when we apply the methods to a real dataset. [Pg.318]
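The limiting behavior described above can be checked numerically on a small case. The sketch below assumes principal component regression as the biased method (the excerpt does not say which latent variable method Eq. [27] refers to) and uses k = 2 synthetic, correlated predictors so that the eigen-decomposition can be written in closed form. All names and data are hypothetical.

```python
import math
import random


def center(values):
    m = sum(values) / len(values)
    return [v - m for v in values]


def gram(x1, x2, y):
    """Entries of X'X = [[a, b], [b, c]] and X'y = (g1, g2)."""
    a = sum(u * u for u in x1)
    b = sum(u * v for u, v in zip(x1, x2))
    c = sum(v * v for v in x2)
    g1 = sum(u * t for u, t in zip(x1, y))
    g2 = sum(v * t for v, t in zip(x2, y))
    return a, b, c, g1, g2


def ols_coefficients(x1, x2, y):
    a, b, c, g1, g2 = gram(x1, x2, y)
    det = a * c - b * b
    return [(c * g1 - b * g2) / det, (a * g2 - b * g1) / det]


def pcr_coefficients(x1, x2, y, n_components):
    """Coefficients from regression on the first n_components principal axes."""
    a, b, c, g1, g2 = gram(x1, x2, y)
    # Eigenvalues of the symmetric 2x2 matrix, largest first.
    half_gap = math.hypot((a - c) / 2, b)
    eigenvalues = [(a + c) / 2 + half_gap, (a + c) / 2 - half_gap]
    beta = [0.0, 0.0]
    for lam in eigenvalues[:n_components]:
        v = (b, lam - a)                  # eigenvector; valid when b != 0
        norm = math.hypot(*v)
        v = (v[0] / norm, v[1] / norm)
        score = (v[0] * g1 + v[1] * g2) / lam
        beta[0] += score * v[0]
        beta[1] += score * v[1]
    return beta


rng = random.Random(2)
z = [rng.gauss(0, 1) for _ in range(50)]
x1 = center([t + 0.3 * rng.gauss(0, 1) for t in z])
x2 = center([t + 0.3 * rng.gauss(0, 1) for t in z])
y = center([2.0 * u - 1.0 * v + 0.1 * rng.gauss(0, 1) for u, v in zip(x1, x2)])

beta_p1 = pcr_coefficients(x1, x2, y, n_components=1)  # p = 1 < k: shrunken
beta_p2 = pcr_coefficients(x1, x2, y, n_components=2)  # p = k: matches OLS
beta_ols = ols_coefficients(x1, x2, y)
print(beta_p1, beta_p2, beta_ols)
```

With one component the coefficient vector is the projection of the OLS solution onto the leading principal axis, so its length cannot exceed that of the OLS vector; with both components retained the two solutions coincide up to rounding.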

The performance and reliability of a new method should be inspected before it is applied to landslide susceptibility mapping. We randomly selected 50% of the samples from the entire dataset (2260 failed and 2260 pseudo-stable) to train the model, used the remainder to test it, and repeated this 50 times to obtain the mean and standard deviation of the prediction accuracy. The mean (± standard deviation) was 0.751 ± 0.00836 for the two-class SVM method and 0.723 ± 0.006125 for the LR method, which shows that two-class SVM has better predictive performance than LR. Even though the two-class SVM (with the larger standard deviation) is slightly less reliable than LR, its lowest accuracy (0.735) is still higher than the mean accuracy of LR (0.723). We can therefore expect two-class SVM to produce an effective landslide susceptibility map. [Pg.206]
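The repeated random-split evaluation described above can be sketched as follows. This is a minimal illustration, not the authors' code: the data are synthetic, and a toy midpoint-threshold classifier stands in for the SVM and LR models. The names `repeated_holdout`, `fit_midpoint`, and `predict_midpoint` are hypothetical.

```python
import random
import statistics


def repeated_holdout(features, labels, fit, predict,
                     n_repeats=50, train_frac=0.5, seed=0):
    """Mean, standard deviation, and minimum accuracy over random splits."""
    rng = random.Random(seed)
    idx = list(range(len(labels)))
    n_train = int(train_frac * len(idx))
    accuracies = []
    for _ in range(n_repeats):
        rng.shuffle(idx)                  # new random split each repeat
        train, test = idx[:n_train], idx[n_train:]
        model = fit([features[i] for i in train], [labels[i] for i in train])
        hits = sum(predict(model, features[i]) == labels[i] for i in test)
        accuracies.append(hits / len(test))
    return (statistics.fmean(accuracies),
            statistics.stdev(accuracies),
            min(accuracies))


def fit_midpoint(xs, ys):
    """Toy classifier: threshold halfway between the two class means."""
    m0 = statistics.fmean(x for x, lab in zip(xs, ys) if lab == 0)
    m1 = statistics.fmean(x for x, lab in zip(xs, ys) if lab == 1)
    return (m0 + m1) / 2, m1 >= m0


def predict_midpoint(model, x):
    threshold, one_is_high = model
    return int((x > threshold) == one_is_high)


rng = random.Random(3)
# Synthetic balanced data (an assumption; not the landslide dataset).
labels = [i % 2 for i in range(400)]
features = [rng.gauss(1.4 * lab, 1.0) for lab in labels]

mean_acc, std_acc, min_acc = repeated_holdout(
    features, labels, fit_midpoint, predict_midpoint)
print(mean_acc, std_acc, min_acc)
```

Running the same function once per candidate model, with the same seed so that both see identical splits, gives the kind of mean, standard deviation, and lowest-accuracy figures that the excerpt compares.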

Big Chemical Encyclopedia
