Big Chemical Encyclopedia

Chemical substances, components, reactions, process design ...


Molecular datasets

How is dimension reduction of chemical spaces achieved? There are a number of different concepts and mathematical procedures to reduce the dimensionality of descriptor spaces with respect to a molecular dataset under investigation. These techniques include, for example, linear mapping, multidimensional scaling, factor analysis, or principal component analysis (PCA), as reviewed in ref. 8. Essentially, these techniques either try to identify those descriptors among the initially chosen ones that are most important for capturing the chemical information encoded in a molecular dataset or, alternatively, attempt to construct new variables from original descriptor contributions. A representative example will be discussed below in more detail. [Pg.282]
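As a rough illustration of "constructing new variables from original descriptor contributions", the sketch below computes the first principal component of a small descriptor table by power iteration on the covariance matrix. The descriptor values, the function names, and the choice of power iteration are assumptions made for this example; the excerpt does not prescribe a particular algorithm.

```python
import math

def center(data):
    """Subtract each column mean so every descriptor has mean zero."""
    n = len(data)
    means = [sum(col) / n for col in zip(*data)]
    return [[x - m for x, m in zip(row, means)] for row in data]

def covariance(centered):
    """Sample covariance matrix of already centered descriptors."""
    n = len(centered)
    cols = list(zip(*centered))
    return [[sum(a * b for a, b in zip(ci, cj)) / (n - 1) for cj in cols]
            for ci in cols]

def first_component(cov, iterations=200):
    """Leading eigenvector of cov by power iteration, scaled to unit length.

    Assumes the start vector is not orthogonal to the leading eigenvector.
    """
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return v

def project(centered, axis):
    """Score of each compound along the given axis (the new variable)."""
    return [sum(x * a for x, a in zip(row, axis)) for row in centered]

# Hypothetical toy data: 5 compounds x 3 descriptors (values are made up).
descriptors = [
    [1.0, 2.0, 0.5],
    [2.0, 4.1, 0.4],
    [3.0, 6.0, 0.6],
    [4.0, 7.9, 0.5],
    [5.0, 10.0, 0.5],
]

centered = center(descriptors)
pc1 = first_component(covariance(centered))
scores = project(centered, pc1)
```

Here `pc1` holds the contribution of each original descriptor to the new variable, and `scores` gives each compound's coordinate along it. In this toy table the first two descriptors are strongly correlated, so a single component captures most of the variance.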

Because principal component analysis attempts to account for all of the variance within a molecular dataset, it can be negatively affected by outliers, i.e., compounds having at least some descriptor values that are very different from others. Therefore, it is advisable to scale principal component axes or, alternatively, pre-process compound collections using statistical filters to identify and remove such outliers prior to the calculation of principal components. [Pg.287]
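One possible statistical filter of the kind mentioned above is a modified z-score based on the median and the median absolute deviation (MAD). This is a sketch under stated assumptions: the excerpt does not say which filter is meant, and the descriptor values, the function name `mad_outlier_flags`, and the 3.5 cutoff (a commonly used default for modified z-scores) are chosen only for illustration.

```python
import statistics

def mad_outlier_flags(data, threshold=3.5):
    """Flag rows in which any descriptor has a modified z-score above threshold.

    Modified z-score: 0.6745 * (x - median) / MAD, computed per descriptor.
    """
    flags = [False] * len(data)
    for col in zip(*data):
        med = statistics.median(col)
        mad = statistics.median([abs(x - med) for x in col])
        if mad == 0:
            continue  # descriptor is essentially constant; skip it
        for i, x in enumerate(col):
            if abs(0.6745 * (x - med) / mad) > threshold:
                flags[i] = True
    return flags

# Hypothetical compounds: (molecular weight, logP); values are made up.
compounds = [
    [300.0, 2.1],
    [310.0, 2.4],
    [295.0, 1.9],
    [305.0, 2.2],
    [1200.0, 2.0],  # extreme molecular weight
    [298.0, 2.3],
]

flags = mad_outlier_flags(compounds)
kept = [row for row, is_outlier in zip(compounds, flags) if not is_outlier]
```

Only the rows in `kept` would then be passed on to the principal component calculation. A median-based filter is used here because the mean and standard deviation are themselves distorted by the outliers one is trying to detect.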

Descriptor requirements present a significant difference between MP and decision tree methods such as RP. Whereas two-state descriptors are not suitable for MP, these types of descriptors are typically required for decision tree algorithms because at each branch the presence or absence of specific feature(s) must be detected in order to recursively divide a molecular dataset. [Pg.298]
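The role of two-state descriptors in recursive division can be sketched as follows: at each branch, one bit is tested for presence or absence and the compounds are split accordingly. The fingerprints, the activity labels, and the use of Gini impurity to choose the bit are assumptions for this example; the excerpt does not specify a splitting criterion.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 activity labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)

def best_bit(rows, labels):
    """Index of the bit whose presence/absence split lowers impurity most."""
    best, best_score = None, gini(labels)
    n = len(rows)
    for bit in range(len(rows[0])):
        present = [l for r, l in zip(rows, labels) if r[bit]]
        absent = [l for r, l in zip(rows, labels) if not r[bit]]
        if not present or not absent:
            continue  # this bit does not divide the subset
        score = (len(present) * gini(present) + len(absent) * gini(absent)) / n
        if score < best_score:
            best, best_score = bit, score
    return best

def partition(rows, labels, depth=0, max_depth=3):
    """Recursively divide the dataset on presence or absence of single bits."""
    bit = best_bit(rows, labels) if depth < max_depth else None
    if bit is None:
        return {"leaf": True, "n": len(rows),
                "active_fraction": sum(labels) / len(labels)}
    with_bit = [(r, l) for r, l in zip(rows, labels) if r[bit]]
    without = [(r, l) for r, l in zip(rows, labels) if not r[bit]]
    return {
        "leaf": False,
        "bit": bit,
        "present": partition([r for r, _ in with_bit],
                             [l for _, l in with_bit], depth + 1, max_depth),
        "absent": partition([r for r, _ in without],
                            [l for _, l in without], depth + 1, max_depth),
    }

# Hypothetical binary fingerprints (3 structural keys) and activity labels.
fingerprints = [(1, 0, 1), (1, 1, 0), (1, 0, 0), (0, 1, 1), (0, 0, 1), (0, 1, 0)]
activity = [1, 1, 1, 0, 0, 0]

tree = partition(fingerprints, activity)
```

In this toy set the first key separates actives from inactives completely, so the tree stops after a single split.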

Golbraikh, A. (2000) Molecular dataset diversity indices and their applications to comparison of chemical databases and QSAR analysis. J. Chem. Inf. Comput. Sci., 40, 414-425. [Pg.1047]

Good AC, Hodgkin EE, Richards WG. Similarity screening of molecular datasets. J Comput-Aided Mol Des 1992 6 513-520. [Pg.478]

FIGURE 3.1 Strict consensus of 24 MPTs based on three-gene molecular dataset as published in Bell and Newton (2004), with significant groups highlighted. Numbers above branches are bootstrap percentages, numbers below are decay indices. Pleurocarpans taxa (sensu Newton and De Luna, 1999) are in small capitals. [Pg.43]

In addition to our MP analyses, we performed Bayesian inferences with MrBayes 3.0 (Huelsenbeck and Ronquist, 2001). Modeltest 3.5 (Posada, 2004) was used to select DNA substitution models for the molecular dataset (gamma shape distribution, six substitution types). Six data... [Pg.86]

The topology of the trees obtained from the MP analyses of the morphological dataset closely resembles that obtained by Kruijer (2002) and is not shown here. The trees obtained from the MP analyses of the molecular dataset are similar to the ones obtained from the MP analyses of the combined dataset. The analyses of the combined dataset, however, generally resulted in better-supported branches and a higher resolution of, e.g., the Hypopterygiaceae clade; the results of these analyses are therefore presented here. [Pg.94]

Morphologically, some of these results make little sense, even if the support for some of the suggested relationships was moderately strong to strong. This example demonstrates that results from cladistic analyses of limited molecular datasets may not be more reliable than those from limited morphological and anatomical datasets, and that neither should be used uncritically for suggesting relationships among taxa. [Pg.230]

The first part of QSAR analysis includes selection of a molecular dataset for QSAR studies, acquisition or calculation of molecular descriptors (quantities characterizing molecular... [Pg.1311]





© 2024 chempedia.info