Huuskonen dataset

Several other investigations of solubility using the Huuskonen dataset and other datasets using ANNs and various other NN methods, for example, Bayesian NNs,... [Pg.1023]

The Huuskonen dataset [31] consists of 1297 compounds compiled from the AQUASOL dATAbASE of the University of Arizona (Yalkowsky, S. H.; Dannenfelser, R. M. The ARIZONA dATAbASE of Aqueous Solubility; College of Pharmacy, University of Arizona ... [Pg.1037]

The Huuskonen Dataset. This set of 1274 experimental solubility values (log S) was one of the first large solubility datasets published [15,16] and has subsequently been used in a number of other publications [14,17]. The data in this set were extracted from the AQUASOL [18,19] database, compiled by the Yalkowsky group at the... [Pg.3]

TABLE 1.2 Statistics for Models Built Based on the Full Huuskonen Dataset and Subset with Log S Between -6 and -3... [Pg.10]

Aqueous solubility is selected to demonstrate the E-state application in QSPR studies. Huuskonen et al. modeled the aqueous solubility of 734 diverse organic compounds with multiple linear regression (MLR) and artificial neural network (ANN) approaches [27]. The set of structural descriptors comprised 31 E-state atomic indices and three indicator variables for pyridine, aliphatic hydrocarbons and aromatic hydrocarbons, respectively. The dataset of 734 chemicals was divided into a training set (n=675), a validation set (n=38) and a test set (n=21). A comparison of the MLR results (training, r = 0.94, s = 0.58; validation, r = 0.84, s = 0.67; test, r = 0.80, s = 0.87) and the ANN results (training, r = 0.96, s = 0.51; validation, r = 0.85, s = 0.62; test, r = 0.84, s = 0.75) indicates a small improvement for the neural network model with five hidden neurons. These QSPR models may be used for a fast and reliable computation of the aqueous solubility for diverse organic compounds. [Pg.93]

Refinement of a QSPR model requires experimental solubilities to train the model. Several models have used the dataset of Huuskonen [44] who sourced experimental data from the AQUASOL [45] and PHYSPROP [46] databases. The original set had a small number of duplicates, which have been removed in most subsequent studies using this dataset, leaving 1290 compounds. When combined, the log Sw... [Pg.302]

One problem highlighted by several reviewers [14,20] is that datasets like the Huuskonen set cover unnecessarily large ranges of solubility. The Huuskonen set covers the range log S (log of solubility in mol/L) from -11.62 to +1.58, which converts approximately to 9.6 x 10^-7 to 1.5 x 10^7 µg/mL for a MW of 400 Da. [Pg.453]
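The conversion from log S to a mass concentration can be checked with a short calculation. The sketch below is illustrative only: the function name `log_s_to_ug_per_ml` is hypothetical, and it assumes log S is the base-10 logarithm of solubility in mol/L and that MW is given in g/mol (400 Da, as in the excerpt).

```python
def log_s_to_ug_per_ml(log_s: float, mw: float) -> float:
    """Convert log10 molar solubility (mol/L) to µg/mL.

    log_s: base-10 log of solubility in mol/L (assumed definition of log S)
    mw:    molecular weight in g/mol
    """
    mol_per_l = 10.0 ** log_s      # undo the logarithm
    g_per_l = mol_per_l * mw       # mol/L * g/mol = g/L
    return g_per_l * 1000.0        # 1 g/L = 1 mg/mL = 1000 µg/mL


# Range quoted for the Huuskonen set, with the MW of 400 Da used in the text
low = log_s_to_ug_per_ml(-11.62, 400.0)
high = log_s_to_ug_per_ml(1.58, 400.0)
print(f"{low:.1e} to {high:.1e} ug/mL")
```

The two endpoints differ by more than 13 orders of magnitude, which is the span the reviewers describe as unnecessarily large.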

Table 16-2 Summary of different methods and models for the Huuskonen aqueous solubility dataset... [Pg.1025]

Fig. 16-5 Model of the Huuskonen aqueous solubility dataset using PLS [34]. Triangles - training set, circles - test set. The plot shows the "deceptively good...

The Huuskonen test set is similar to the training set. The median similarity for the training set is 0.78 and the mean is 0.76. By most standards, one would assume that approximately half of the molecules in the Huuskonen test set are similar to molecules in the training set. We would expect the performance of the model on this dataset to be reasonably good. [Pg.14]
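One way such test-to-training similarity statistics could be computed is sketched below. The excerpt does not say which similarity measure or fingerprints were used, so this is an assumption: it uses the Tanimoto coefficient on fingerprints represented as sets of "on" bits, and scores each test molecule by its most similar training molecule. The function names and the toy fingerprints are hypothetical and are not taken from the Huuskonen data.

```python
from statistics import mean, median


def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto coefficient between two fingerprints given as sets of on-bits."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(a | b)


def nearest_training_similarity(test_fps, train_fps):
    """For each test fingerprint, return its highest similarity to any training fingerprint."""
    return [max(tanimoto(t, tr) for tr in train_fps) for t in test_fps]


# Toy fingerprints (made up for illustration only)
train = [frozenset({1, 2, 3, 4}), frozenset({2, 3, 5}), frozenset({7, 8, 9})]
test = [frozenset({1, 2, 3}), frozenset({7, 8, 10}), frozenset({2, 3, 5})]

sims = nearest_training_similarity(test, train)
print("median:", median(sims), "mean:", mean(sims))
```

With real data, the fingerprints would come from a cheminformatics toolkit, and the resulting median and mean would play the role of the 0.78 and 0.76 values quoted above.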
