Big Chemical Encyclopedia

Chemical substances, components, reactions, process design ...


Descriptors descriptor medians

Once a basis set of descriptor medians is obtained, MP proceeds in a stepwise manner. In each of n subsequent steps, molecules with a value of the particular descriptor above (or equal to) the median are assigned 1 and molecules with a value below the median are assigned 0. For n descriptors, a total of 2^n unique partitions are created, each of which is characterized by a unique n-digit partitioning code (for example, 10 descriptors produce 1024 partitions). Ultimately, each test molecule falls into a unique partition and is assigned its signature code... [Pg.293]
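
The stepwise assignment described above can be sketched in a few lines of Python. This is a minimal illustration only, not the published MP implementation: the function name `median_partition`, the dictionary layout, and the toy descriptor values are all assumptions made for the example.

```python
from statistics import median

def median_partition(descriptor_table):
    """Assign each molecule an n-digit binary code, one bit per descriptor.

    descriptor_table (hypothetical layout): dict mapping a molecule name
    to a list of n descriptor values.
    """
    names = list(descriptor_table)
    n = len(descriptor_table[names[0]])
    # One median per descriptor, computed over the whole database.
    medians = [median(descriptor_table[m][j] for m in names) for j in range(n)]
    # Bit is 1 if the value is above or equal to the median, else 0.
    codes = {
        m: "".join(
            "1" if descriptor_table[m][j] >= medians[j] else "0"
            for j in range(n)
        )
        for m in names
    }
    return medians, codes

# Hypothetical toy data: 4 molecules, 2 descriptors (values are made up).
toy = {
    "mol_a": [1.0, 10.0],
    "mol_b": [2.0, 40.0],
    "mol_c": [3.0, 20.0],
    "mol_d": [4.0, 30.0],
}
medians, codes = median_partition(toy)
n_partitions = 2 ** len(medians)  # 2^n possible partitions
```

With two descriptors the toy database is split into 2^2 = 4 partitions, and each of the four molecules lands in a different one.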

Descriptor median values naturally depend on the composition and size of compound databases. Whenever source databases are changed, reduced, or extended in size, descriptor medians need to be re-calculated to ensure accurate MP analysis. Relatively small changes in median values can significantly alter partitioning results. [Pg.299]
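
A small sketch, under assumed toy data, of why medians have to be re-calculated: extending the database shifts both medians, and the same query molecule receives a different partitioning code. The helper names `column_medians` and `partition_code` and all values are hypothetical.

```python
from statistics import median

def column_medians(rows):
    """Median of each descriptor (column) over all database rows."""
    return [median(col) for col in zip(*rows)]

def partition_code(values, medians):
    """Binary code: 1 if value >= median, else 0, per descriptor."""
    return "".join("1" if v >= m else "0" for v, m in zip(values, medians))

# Hypothetical database of 3 compounds with 2 descriptors each.
original = [[1.0, 5.0], [2.0, 6.0], [3.0, 7.0]]
# The same database after two (made-up) compounds are added.
extended = original + [[4.0, 1.0], [5.0, 2.0]]

query = [2.5, 5.5]  # hypothetical test molecule

med_before = column_medians(original)
med_after = column_medians(extended)
code_before = partition_code(query, med_before)
code_after = partition_code(query, med_after)
```

Here the query molecule moves from partition "10" to partition "01" even though its own descriptor values are unchanged.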

Figure 9.2 represents a plot of errors in the methods reviewed above as well as in those of several other works [22,44,59]. As the figure demonstrates, there is no apparent difference in the performance of methods developed with any particular group of descriptors, that is, quantum-chemical (median RMSE = 0.47) and topological (median RMSE = 0.53), or using physicochemical descriptors (median RMSE = 0.50). The median error of all methods (RMSE = 0.51) approximately corresponds to the experimental error of solubility measurements. [Pg.250]

Figure 9.3 shows a plot of the errors from methods to predict lipophilicity of chemicals reviewed in this and in several other works [84,96] according to the same types of descriptors used in Figure 9.2. There are only four methods (see above) that were classified as those that use physicochemical descriptors. The median RMSE errors for 2D and 3D methods are 0.38 and 0.34 log units, respectively.

In addition to the usual statistical methods based on univariate descriptors (mean, median, and standard deviation) and analysis of variance, multivariate techniques of statistics and chemometrics are increasingly being used in data evaluation. Whereas the former are more rigorous in theoretical background and assumptions, the latter are useful in the presentation of the data, pattern recognition, and multivariate calibrations. Several good monographs on chemometrics are available (see for example [58-61]). [Pg.83]

The mass median diameter (MMD) is the most common descriptor of primary particle size and may be determined by sieving or centrifugal sedimentation. The volume median diameter, as determined by laser diffraction, may be used as an approximation of MMD, provided that the particle density is known and does not vary with size, and that the particle shape is near spherical. The MMD of a powder can be used as a predictor of aerodynamic diameter by Eq. (1),... [Pg.98]

Key Words: Biological activity; chemical descriptors; chemical spaces; classification methods; compound databases; decision trees; diversity selection; partitioning algorithms; space transformation; statistics; statistical medians. [Pg.291]

The table reports median values for a number of descriptors that were calculated for two overlapping compound datasets. Median 1 was calculated for 317 active compounds belonging to different biological activity classes. Median 2 was calculated after 2000 randomly collected molecules were added to this set of active compounds. Most median values differ for these two compound sets (see Note 3). VDW stands for van der Waals. Data were taken from ref. 14. [Pg.293]

Table 1 shows some examples of descriptors and calculated medians. [Pg.293]

Fig. 1. Median partitioning and compound selection. In this schematic illustration, a two-dimensional chemical space is shown as an example. The axes represent the medians of two uncorrelated (and, therefore, orthogonal) descriptors and dots represent database compounds. In A, a compound database is divided into equal subpopulations in two steps and each resulting partition is characterized by a unique binary code (shared by molecules occupying this partition). In B, diversity-based compound selection is illustrated. From the center of each partition, a compound is selected to obtain a representative subset. By contrast, C illustrates activity-based compound selection. Here, a known active molecule (gray dot) is added to the source database prior to MP and compounds that ultimately occur in the same partition as this bait molecule are selected as candidates for testing. Finally, D illustrates the effects of descriptor correlation. In this case, the two applied descriptors are significantly correlated and the dashed line represents a diagonal of correlation that affects the compound distribution. As can be seen, descriptor correlation leads to over- and underpopulated partitions.

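
The activity-based selection in panel C can be sketched as follows. This is an illustrative reading of the caption, not the authors' code: the function names `assign_codes` and `select_by_bait`, the compound names, and the descriptor values are all hypothetical.

```python
from statistics import median

def assign_codes(compounds):
    """Median-partition a dict of name -> descriptor values into binary codes."""
    names = list(compounds)
    medians = [median(col) for col in zip(*(compounds[n] for n in names))]
    return {
        n: "".join("1" if v >= m else "0" for v, m in zip(compounds[n], medians))
        for n in names
    }

def select_by_bait(database, bait_name, bait_values):
    """Return database compounds that share a partition with the bait."""
    pool = dict(database)
    pool[bait_name] = bait_values  # bait is added BEFORE partitioning
    codes = assign_codes(pool)
    return sorted(
        n for n, code in codes.items()
        if code == codes[bait_name] and n != bait_name
    )

# Hypothetical source database with two descriptors per compound.
database = {
    "cpd1": [0.1, 0.2],
    "cpd2": [0.9, 0.8],
    "cpd3": [0.2, 0.9],
    "cpd4": [0.8, 0.7],
}
# Hypothetical known active ("bait") molecule.
hits = select_by_bait(database, "bait", [0.85, 0.75])
```

In this toy case only `cpd2` shares the bait's partition and would be selected as a candidate for testing.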
Re-calculate medians, re-initialize descriptor selection, and re-partition... [Pg.297]

The only real alternative to the mean and SD as descriptors for sets of measurements is the system of quartiles. This enjoys a certain vogue in some research areas, but there are others where you will almost never see it used. We have already seen that the median is a value chosen to cut a set of data into two equal-sized groups. Quartiles are an extension of that idea. Three quartiles are chosen so as to cut a data set into four equal-sized groups. [Pg.20]
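
As a brief worked example, the three quartiles of a small data set can be computed with Python's standard library. The data values are made up, and the interpolation convention is an assumption: `statistics.quantiles` defaults to the "exclusive" method, and other conventions can give slightly different cut points.

```python
from statistics import median, quantiles

# Hypothetical set of 11 measurements (order does not matter).
data = [7, 1, 10, 4, 6, 2, 9, 11, 3, 8, 5]

# n=4 requests the three cut points that divide the data into quarters.
q1, q2, q3 = quantiles(data, n=4)

# The second quartile is the median.
same_as_median = (q2 == median(data))

# Count the observations falling strictly between the cut points.
groups = [
    sum(x < q1 for x in data),
    sum(q1 < x < q2 for x in data),
    sum(q2 < x < q3 for x in data),
    sum(x > q3 for x in data),
]
```

For this data set the quartiles are 3, 6, and 9, and two observations fall into each of the four groups between them.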

Compounds are described by a number of molecular descriptors; these are first normalized and then subjected to Principal Component Analysis to reduce the dimensionality of the chemical space. The M most significant principal components are successively transformed into binary vectors, where each bit corresponds to a single principal component (PC); the bit can be either 0 or 1, depending on whether the PC value is smaller or greater than the median of that component calculated on the whole library [Xue, Godden et al., 2003b]. [Pg.88]

The median is here calculated as the value at which the entropy H of a molecular descriptor is maximal for the considered library ... [Pg.88]
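
One way to read this statement is sketched below, assuming that H is the Shannon entropy of the two-way split (values at or above a threshold versus values below it). The source excerpt does not give the formula, so this definition, the function name `split_entropy`, and the descriptor values are assumptions for illustration.

```python
import math

def split_entropy(values, threshold):
    """Shannon entropy (bits) of the split: value >= threshold vs. below."""
    p = sum(v >= threshold for v in values) / len(values)
    if p in (0.0, 1.0):
        return 0.0  # everything on one side: no information
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

# Hypothetical descriptor values for a library of 8 compounds.
values = [0.3, 1.2, 2.5, 2.9, 4.0, 7.5, 8.1, 9.6]

# Try each observed value as a threshold and keep the entropy maximum.
best = max(sorted(values), key=lambda t: split_entropy(values, t))
h_best = split_entropy(values, best)
```

The entropy peaks at 1 bit when the threshold leaves half of the library on each side, which is exactly the property of a median split; any threshold between 2.9 and 4.0 (including the conventional median 3.45) achieves the same maximum here.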

Because of the ease of assessing lethality, many toxicity studies have concentrated upon determination of LD50 or LCt50. Of more importance in the military context are the incapacitating effects of compounds, and the ID50 (median incapacitating dose) or the ICt50 are important descriptors of a compound's toxicity. These are, of course, much more difficult to determine in animals and the relevance of such determinations to likely effects in humans is open to doubt. Of even more military importance are the parameters ICt5 and ICt10, although to determine these... [Pg.55]

Least Median Squares regression N = 1075 patch clamping IC50 data. N = 1679 IC50 and single point data. Test set R2 = 0.54 RMSE = 0.63 Volsurf and other descriptors. 178... [Pg.316]

Table 7.13 contains the distributions of experimental density and of descriptors for the size of molecules, i.e. number of atoms (A), molecular weight (MW), and van der Waals volume (V_vdW). Minimum, maximum, first and third quartile, median and mean are given for each quantity, for the library as a whole and for LS and TS separately. LS and TS are similar according to this information. Important predictors will be constructed from A, MW and... [Pg.272]

Descriptor P r Median Difference Difference 9th Decile Median Diff. in Random Pairs 9th Decile of Diff. in Random Pairs... [Pg.95]

To provide a comparison, we also evaluated forecast errors and cost for the rules used by the planners at this retailer, the k-median method based on store descriptors, alone and combined with sales mix differences, and two standard approaches to variable selection in linear regression, since the problem of choosing k test stores and a linear prediction function based on test sales at these stores can be viewed as choosing the best k out of n possible variables in a linear regression. Given actual sales S_p and test sales S_ip for i = 1, ..., n and p = 1, ..., m, we used the forward selection and backward elimination methods (Myers (1990)) to choose k out of the n test... [Pg.119]



© 2024 chempedia.info