Big Chemical Encyclopedia


Chemical space representation molecular similarity

In this chapter, we will give a brief introduction to the basic concepts of chemoinformatics and their relevance to chemical library design. In Section 2, we will describe chemical representation, molecular data, and molecular data mining in the computer; we will introduce some chemoinformatics concepts such as molecular descriptors, chemical space, dimension reduction, similarity, and diversity; and we will review the most useful methods and applications of chemoinformatics: the quantitative structure-activity relationship (QSAR), the quantitative structure-property relationship (QSPR), multiobjective optimization, and virtual screening. In Section 3, we will outline some of the elements of library design and connect chemoinformatics tools, such as molecular similarity, molecular diversity, and multiobjective optimization, with designing optimal libraries. Finally, we will put library design into perspective in Section 4. [Pg.28]

In systematic SAR analysis, molecular structure and similarity need to be represented and related to each other in a measurable form. Just like any molecular similarity approach, SAR analysis critically depends on molecular representations and the way similarity is measured. The nature of the chemical space representation determines the positions of the molecules in space and thus ultimately the shape of the activity landscape. Hence, SARs may differ considerably when changing chemical space and molecular representations. In this context, it becomes clear that one must discriminate between SAR features that reflect the fundamental nature of the underlying molecular structures as opposed to SAR features that are merely an artifact of the chosen chemical space representation. Consequently, activity cliffs can be viewed as either fundamental or descriptor- and metrics-dependent. The latter occur as a consequence of an inappropriate molecular representation or similarity metrics and can be smoothed out by choosing a more suitable representation, e.g., by considering activity-relevant physicochemical properties. By contrast, activity cliffs fundamental to the underlying SARs cannot be circumvented by changing the reference space. In this situation, molecules that should be recognized as... [Pg.129]
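
The dependence of activity cliffs on the chosen representation can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the feature sets, the potency values, and both cutoffs are invented here and do not come from the source, and the Tanimoto coefficient is used simply as one common similarity measure.

```python
def tanimoto(a: set, b: set) -> float:
    """Tanimoto (Jaccard) coefficient of two feature sets."""
    union = len(a | b)
    return len(a & b) / union if union else 1.0


def is_activity_cliff(feat_a, feat_b, act_a, act_b,
                      sim_cutoff=0.7, act_cutoff=2.0):
    """Flag a pair that is similar in structure but far apart in activity.

    Both cutoffs are assumptions chosen for this example.
    """
    similar = tanimoto(feat_a, feat_b) >= sim_cutoff
    return similar and abs(act_a - act_b) >= act_cutoff


# Hypothetical potencies (e.g. pIC50) of the same two molecules.
act_a, act_b = 8.5, 5.9

# Representation 1: coarse substructure keys only (hypothetical).
coarse_a = {"ring", "amide", "halogen", "ether", "amine"}
coarse_b = {"ring", "amide", "halogen", "ether"}

# Representation 2: the same keys plus activity-relevant property flags.
prop_a = coarse_a | {"charged", "donor"}
prop_b = coarse_b | {"neutral"}

cliff_coarse = is_activity_cliff(coarse_a, coarse_b, act_a, act_b)
cliff_property = is_activity_cliff(prop_a, prop_b, act_a, act_b)
print(cliff_coarse, cliff_property)
```

In this toy case the pair counts as a cliff only under the coarse representation. Once the property flags lower the computed similarity, the same activity difference no longer looks surprising, which is the descriptor-dependent kind of cliff described above.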

This chapter provides a brief overview of chemoinformatics and its applications to chemical library design. It is meant to be a quick starter and to serve as an invitation to readers for more in-depth exploration of the field. The topics covered in this chapter are chemical representation, chemical data and data mining, molecular descriptors, chemical space and dimension reduction, quantitative structure-activity relationship, similarity, diversity, and multiobjective optimization. [Pg.27]

The first and most important step is to choose properties that describe the molecules as numerical values. In the following text these numerical representations of some molecular properties will be denoted as descriptors. The use of numeric descriptors prepares the problem for subsequent computer processing (Fig. 5). All subsequent steps of a study look at the descriptors instead of at the molecules themselves. Therefore, diversity or similarity is defined in the descriptor space instead of the chemical space. The relevance of descriptor similarity for the similarity of the molecules must be ensured by appropriate choice of the descriptor set [16, 17]. Obviously, it is very important to know the characteristics and applicability of various descriptor sets. [Pg.567]
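
As a minimal sketch of what "similarity in descriptor space" can mean in practice, the following Python example represents three molecules by small descriptor vectors, autoscales each descriptor, and compares Euclidean distances. The molecule names, the descriptor values, and the choice of autoscaling plus Euclidean distance are assumptions made for illustration; the source does not prescribe them.

```python
import math
import statistics

# Hypothetical descriptor table: (molecular weight, logP, H-bond donors).
descriptors = {
    "mol_A": [180.2, 1.2, 1],
    "mol_B": [194.2, 1.4, 1],
    "mol_C": [342.3, -3.0, 8],
}


def autoscale(table):
    """Scale every descriptor column to zero mean and unit variance."""
    columns = list(zip(*table.values()))
    means = [statistics.mean(col) for col in columns]
    # Fall back to 1.0 for constant columns to avoid division by zero.
    stdevs = [statistics.pstdev(col) or 1.0 for col in columns]
    return {
        name: [(v - m) / s for v, m, s in zip(values, means, stdevs)]
        for name, values in table.items()
    }


scaled = autoscale(descriptors)

# Distances are computed between descriptor vectors, not between molecules.
d_ab = math.dist(scaled["mol_A"], scaled["mol_B"])
d_ac = math.dist(scaled["mol_A"], scaled["mol_C"])
print(f"A-B: {d_ab:.3f}  A-C: {d_ac:.3f}")
```

Without scaling, molecular weight would dominate the distance purely because of its larger numerical range, which is one reason the choice and treatment of the descriptor set matters.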

In both cases, a structural representation of a small molecule is the input parameter to a conceptual set of operations that give rise to numerical outputs such as molecular descriptors, physicochemical properties, or biological outcomes (Fig. 13.1-1(a)). However, to be useful in predictive ways, such as when used to support prospective decisions about the investment of synthetic chemistry resources, at least some of these numerical outputs must be computable given only a structure representation. Only this situation allows relationships between experimentally determined values and computed values to be used to predict experimental outcomes for new molecules, based on their structural similarity to molecules that have already been experimentally tested (Fig. 13.1-1(b)). Most broadly, chemical space is a colloquialism that refers to the ranges and distributions of computed or measured outputs based on chemical structure inputs, and serves as a mathematical framework for quantitative comparisons of similarities and differences between small molecules (Fig. 13.1-1(c)). [Pg.725]
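
One simple way to predict an outcome for a new molecule from its similarity to tested molecules is a similarity-weighted nearest-neighbor estimate. The sketch below assumes molecules are represented as sets of integer feature identifiers and uses the Tanimoto coefficient; the data, the value of `k`, and the weighting scheme are hypothetical and are not taken from the source.

```python
def tanimoto(a: set, b: set) -> float:
    """Tanimoto (Jaccard) coefficient of two feature sets."""
    union = len(a | b)
    return len(a & b) / union if union else 1.0


def predict_knn(query, tested, k=2):
    """Similarity-weighted mean of the k most similar tested molecules.

    tested: list of (feature_set, measured_value) pairs.
    """
    ranked = sorted(tested, key=lambda item: tanimoto(query, item[0]),
                    reverse=True)[:k]
    weights = [tanimoto(query, feats) for feats, _ in ranked]
    total = sum(weights)
    if total == 0:
        # No neighbor shares any feature: fall back to a plain mean.
        return sum(value for _, value in ranked) / len(ranked)
    return sum(w * value for w, (_, value) in zip(weights, ranked)) / total


# Hypothetical molecules that have already been tested experimentally.
tested = [
    ({1, 2, 3, 4}, 7.0),
    ({1, 2, 3, 5}, 6.0),
    ({8, 9, 10}, 3.0),
]

new_molecule = {1, 2, 3, 4, 5}
prediction = predict_knn(new_molecule, tested, k=2)
print(round(prediction, 3))
```

The prediction needs only the structure-derived features of the new molecule, which is the condition the passage places on predictive use.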

One key aspect of model applicability is the definition of the chemical space and the way in which chemical similarity is measured, as chemical similarity is a relative concept. The similarity or distance values depend on both the type of molecular representation and the distance measure used. Due to this lack of... [Pg.466]
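
That similarity values depend on the measure can be seen by scoring one pair of fingerprints with three widely used coefficients. The two feature sets below are hypothetical; Tanimoto, Dice, and cosine are chosen here only as familiar examples.

```python
import math


def tanimoto(a: set, b: set) -> float:
    """Shared features divided by features present in either set."""
    return len(a & b) / len(a | b)


def dice(a: set, b: set) -> float:
    """Twice the shared features divided by the summed set sizes."""
    return 2 * len(a & b) / (len(a) + len(b))


def cosine(a: set, b: set) -> float:
    """Shared features divided by the geometric mean of the set sizes."""
    return len(a & b) / math.sqrt(len(a) * len(b))


# Hypothetical fingerprints given as sets of "on" bit positions.
fp_1 = {1, 2, 3, 4}
fp_2 = {3, 4, 5, 6, 7, 8}

t, d, c = tanimoto(fp_1, fp_2), dice(fp_1, fp_2), cosine(fp_1, fp_2)
print(f"Tanimoto={t:.3f}  Dice={d:.3f}  cosine={c:.3f}")
```

The same pair receives three different scores, so a fixed similarity threshold means something different under each measure.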

While molecular similarity has no doubt played and will continue to play an important role in drug research and, to a lesser extent, other chemical-oriented fields, it is not without its issues. Primary among them is the sensitivity of similarity measures to the representation and similarity function used; as is well known, different measures tend to yield different similarity values. Such differences can radically transform chemical spaces so that nearest neighbors in one space are no longer nearest neighbors in another space [40,41]. This behavior has significant consequences for many research activities, such as LBVS, that typically employ similarity methods. To counter this deficiency, a variety of methods have been developed based on combining, in some fashion, the results from multiple similarity procedures. These methods, typically called fusion or consensus methods, have led to some improvements and are discussed in Section 15.5.3. [Pg.347]
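
The following sketch shows, on invented data, how the nearest neighbor of a query can change with the measure, and how a simple rank-sum rule combines two rankings. Rank-sum is only one possible fusion rule, chosen here for brevity; it is not claimed to be the method discussed in the cited section, and the fingerprints and names are hypothetical.

```python
def tanimoto(a: set, b: set) -> float:
    """Tanimoto (Jaccard) coefficient of two feature sets."""
    union = len(a | b)
    return len(a & b) / union if union else 1.0


def neg_hamming(a: set, b: set) -> int:
    """Negated Hamming distance, so that larger means more similar."""
    return -len(a ^ b)


def rank_by(query, library, score):
    """Return {name: rank}; rank 1 is the most similar under `score`."""
    ordered = sorted(library, key=lambda name: score(query, library[name]),
                     reverse=True)
    return {name: pos for pos, name in enumerate(ordered, start=1)}


def fuse_rank_sum(rankings):
    """Order names by the sum of their ranks (lower sum is better)."""
    totals = {name: sum(r[name] for r in rankings) for name in rankings[0]}
    return sorted(totals, key=totals.get)


# Hypothetical query and library fingerprints (sets of "on" bits).
query = {1, 2, 3, 4, 5, 6}
library = {
    "X": {1, 2, 3},
    "Y": set(range(1, 12)),
    "Z": {1, 2, 3, 4, 5, 20, 21, 22},
}

tanimoto_rank = rank_by(query, library, tanimoto)
hamming_rank = rank_by(query, library, neg_hamming)
fused = fuse_rank_sum([tanimoto_rank, hamming_rank])
print(tanimoto_rank, hamming_rank, fused)
```

Here the Tanimoto coefficient puts Z first while the Hamming distance puts X first, and the fused list is a compromise between the two orderings.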

In actual applications of MSA, many different types of representations are utilized to compute molecular similarities [41, 52-54]. Johnson [55] has provided a detailed discussion of the manifold types of mathematical spaces and their associated representations. The information contained in the representations is usually in the form of molecular or chemical features called descriptors that are derived from the structural and chemical properties of molecules. Descriptors are nominally classified as 1D (one-dimensional), 2D, or 3D. 1D descriptors are usually associated with whole molecule properties such as molecular weight, logP, solubility, number of hydrogen bond donors, number of rotatable bonds, and so on. 2D descriptors are associated with the topological structure of molecules as typically depicted in chemists' drawings. Such depictions show the atoms, the bonds connecting them, and in some cases include stereochemical features, but they do not explicitly depict the 3D structures of molecules. 3D descriptors, as their name implies, are associated with the 3D structures of molecules. Todeschini and Consonni [56] have compiled an extensive reference containing many of the descriptors used in chemical informatics applications. [Pg.351]

Since different molecular representations capture different structural, chemical, and biological aspects of molecules [117,124], there is not always a clear answer as to which molecular descriptions perform best in similarity searching and related similarity-based activities. In addition, different similarity measures tend to exhibit different performance characteristics in different application domains. Thus, it is unlikely that a single measure will be sufficient to effectively treat all regions of chemical space uniformly. As an alternative to using a single search method, it has been proposed... [Pg.371]

Since most molecular representations are actually of high dimension (cf. [74]), their corresponding chemical spaces are intrinsically of high dimension as well. The spatial properties of high-dimensional spaces can, in some cases, give rise to surprising problems since they tend to behave in a manner that is uncharacteristic of low-dimensional spaces [158, 159]. It is possible, however, to construct lower-dimensional representations of chemical spaces by computing the similarity of or... [Pg.378]
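
One well-known example of unintuitive high-dimensional behavior is distance concentration: the nearest and farthest neighbors of a point become almost equally distant. The source does not say which problems it has in mind, so the sketch below is only an illustration of this one effect, using uniformly random points rather than real descriptor data.

```python
import math
import random


def relative_contrast(dim, n_points=200, seed=0):
    """Return (d_max - d_min) / d_min for one random query point.

    Distances are measured from the query to n_points random points
    drawn uniformly from the unit hypercube of the given dimension.
    """
    rng = random.Random(seed)
    query = [rng.random() for _ in range(dim)]
    dists = [
        math.dist(query, [rng.random() for _ in range(dim)])
        for _ in range(n_points)
    ]
    return (max(dists) - min(dists)) / min(dists)


low_dim = relative_contrast(2)
high_dim = relative_contrast(500)
print(f"2 dimensions:   {low_dim:.2f}")
print(f"500 dimensions: {high_dim:.2f}")
```

As the relative contrast shrinks, "nearest neighbor" becomes a weaker notion, which is one motivation for working in a lower-dimensional representation.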

Chemoinformatics (or cheminformatics) deals with the storage, retrieval, and analysis of chemical and biological data. Specifically, it involves the development and application of software systems for the management of combinatorial chemical projects, rational design of chemical libraries, and analysis of the obtained chemical and biological data. The major research topics of chemoinformatics involve QSAR and diversity analysis. The researchers should address several important issues. First, chemical structures should be characterized by calculable molecular descriptors that provide quantitative representation of chemical structures. Second, special measures should be developed on the basis of these descriptors in order to quantify structural similarities between pairs of molecules. Finally, adequate computational methods should be established for the efficient sampling of the huge combinatorial structural space of chemical libraries. [Pg.363]
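
The third issue, sampling a large library efficiently, is often approached with greedy diversity selection. The sketch below uses a MaxMin-style heuristic with Tanimoto distance; the heuristic is named here as an assumption, not as the method the source refers to, and the library contents and the seed molecule are hypothetical.

```python
def tanimoto_distance(a: set, b: set) -> float:
    """One minus the Tanimoto coefficient of two feature sets."""
    union = len(a | b)
    return 1.0 - (len(a & b) / union if union else 1.0)


def maxmin_pick(library, n_pick, seed_name):
    """Greedily pick names so each new pick is far from all earlier picks.

    At every step the candidate whose minimum distance to the already
    picked molecules is largest gets added.
    """
    picked = [seed_name]
    while len(picked) < min(n_pick, len(library)):
        candidates = [name for name in library if name not in picked]
        best = max(
            candidates,
            key=lambda name: min(
                tanimoto_distance(library[name], library[p]) for p in picked
            ),
        )
        picked.append(best)
    return picked


# Hypothetical library: names mapped to sets of feature identifiers.
library = {
    "m1": {1, 2, 3, 4},
    "m2": {1, 2, 3, 5},
    "m3": {10, 11, 12},
    "m4": {1, 2, 10, 20},
    "m5": {30, 31},
}

subset = maxmin_pick(library, 3, "m1")
print(subset)
```

Each step compares every candidate only with the molecules already picked, so the full pairwise distance matrix is never built.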



© 2024 chempedia.info