For statistical reasons, multiple regression analysis cannot be used for 3D-QSAR methods that consider many more 3D descriptors than compounds or for which the descriptors are mutually correlated. The alternative strategies described next can be used to find a quantitative model in such situations. As will be seen, cross-validation is an important technique for assessing the robustness of a proposed model. [Pg.189]
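The statistical reason can be made explicit with the least-squares normal equations. This is a sketch under assumed notation that does not appear in the excerpt: X is the n × p descriptor matrix (n compounds, p descriptors) and y is the vector of activities.

```latex
\hat{\boldsymbol{\beta}} = \left(X^{\mathsf{T}} X\right)^{-1} X^{\mathsf{T}} \mathbf{y},
\qquad
\operatorname{rank}\!\left(X^{\mathsf{T}} X\right) = \operatorname{rank}(X) \le \min(n, p).
```

When p > n, the p × p matrix X^T X has rank at most n and is therefore singular, so the coefficients are not uniquely determined. When the descriptors are strongly correlated, X^T X is nearly singular and the estimated coefficients become unstable.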

Equation (13.7) represents the dual form of the regression model, since in it the activity is predicted by considering similarity measures of a test compound in relation to the training set compounds M. To obtain the traditional primal form of the 3D-QSAR model, which involves an explicit consideration of molecular descriptors and regression coefficients, one can substitute Eqs. (13.1) and (13.3)-(13.6) into Eq. (13.7) to obtain [Pg.438]

Avram et al. [118] used 3D-QSAR CoMFA techniques to study a dataset of symmetric and non-symmetric cyclic urea HIVPI. The anti-HIVPR inhibitory activity data were taken from the literature [72,77]. The electrostatic and steric fields were calculated with default settings, using an sp³ C-atom probe with a +1 charge. Regression models were derived using PLS with leave-one-out (LOO) cross-validation. The best model, with a correlation coefficient (r²) of 0.981 and a cross-validated correlation coefficient (q²) of 0.525, showed a higher contribution from steric fields (58.6%) than from electrostatic fields (41.1%). Two additional models were reported with q² = 0.627 and 0.536, respectively. All the models were improved further by omitting outliers. [Pg.209]
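The gap between a fitted r² and a leave-one-out q² can be illustrated with a minimal sketch. It uses a one-descriptor least-squares line rather than CoMFA fields and PLS, and the data and function names are hypothetical. Here q² is taken as 1 - PRESS/SS_tot with SS_tot computed around the mean of all activities, which is one common convention and an assumption.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def total_sum_of_squares(ys):
    """Sum of squared deviations of the activities from their mean."""
    my = sum(ys) / len(ys)
    return sum((y - my) ** 2 for y in ys)


def r_squared(xs, ys):
    """Fitted r^2: every compound is used both to fit and to evaluate."""
    a, b = fit_line(xs, ys)
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return 1.0 - ss_res / total_sum_of_squares(ys)


def q_squared_loo(xs, ys):
    """Leave-one-out q^2: each compound is predicted by a model fitted without it."""
    press = 0.0  # predictive residual sum of squares
    for i in range(len(xs)):
        a, b = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (a + b * xs[i])) ** 2
    return 1.0 - press / total_sum_of_squares(ys)


# Hypothetical data: one descriptor, six compounds, the last one an outlier.
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
activity = [1.1, 1.9, 3.2, 3.8, 5.1, 9.0]

print("r2 =", round(r_squared(descriptor, activity), 3))
print("q2 =", round(q_squared_loo(descriptor, activity), 3))
```

For a least-squares model, q² computed this way can never exceed r², because each left-out residual is at least as large in magnitude as the corresponding fitted residual. A large difference between the two, as in the reported 0.981 versus 0.525, suggests that the model fits the training compounds much better than it predicts compounds it has not seen.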

We validated the CMF approach in two case studies and obtained preliminary results, which have been published as short communications [7, 13]. The first one dealt with the use of the CMF to build 3D-QSAR regression models [7]. In the second case study [13], the performance of a new method for virtual screening of organic compounds based on the combination of the CMF methodology with the one-class SVM method (1-SVM) has been assessed. In both cases the CMF has not only proven its efficiency, but has also demonstrated some advantages compared to state-of-the-art approaches in chemoinformatics. [Pg.441]

It is clear that for an unsymmetrical data matrix that contains more variables (the field descriptors at each point of the grid for each probe used for calculation) than observables (the biological activity values), classical correlation analysis such as multilinear regression analysis would fail. All 3D QSAR methods benefit from the development of PLS analysis, a statistical technique that aims to find the multidimensional direction in the X space that explains the maximum multidimensional variance direction in the Y space. PLS is related to principal component analysis (PCA). However, instead of finding the hyperplanes of maximum variance, it finds a linear model describing some predicted variables in terms of other observable variables and therefore can be used directly for prediction. Complexity reduction and data [Pg.592]
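How PLS copes with more descriptors than compounds can be sketched with a small single-response PLS (PLS1) routine in the NIPALS style. This is an illustrative implementation under stated assumptions, not the procedure of any particular 3D-QSAR package: the function names and the 4 × 6 data matrix are hypothetical, and only mean-centering is applied (no field scaling).

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def pls1_fit(X, y, n_components):
    """Fit PLS1 by successive deflation; returns (x_mean, y_mean, components)."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(p)] for row in X]  # centered X
    f = [v - y_mean for v in y]                                # centered y
    components = []
    for _ in range(n_components):
        # Weight vector: direction in X space with maximal covariance with y.
        w = [dot([E[i][j] for i in range(n)], f) for j in range(p)]
        norm = dot(w, w) ** 0.5
        if norm < 1e-12:  # nothing left in y that X can explain
            break
        w = [v / norm for v in w]
        t = [dot(row, w) for row in E]  # latent-variable scores
        tt = dot(t, t)
        load = [dot([E[i][j] for i in range(n)], t) / tt for j in range(p)]
        c = dot(f, t) / tt              # regression of y on the scores
        # Deflate: remove what this component explains from X and y.
        E = [[E[i][j] - t[i] * load[j] for j in range(p)] for i in range(n)]
        f = [f[i] - c * t[i] for i in range(n)]
        components.append((w, load, c))
    return x_mean, y_mean, components


def pls1_predict(model, x):
    """Predict one sample by replaying the deflation steps on it."""
    x_mean, y_mean, components = model
    e = [a - m for a, m in zip(x, x_mean)]
    y_hat = y_mean
    for w, load, c in components:
        t = dot(e, w)
        y_hat += c * t
        e = [a - t * b for a, b in zip(e, load)]
    return y_hat


# Hypothetical data: 4 compounds, 6 descriptors (more variables than observables).
X = [
    [1.0, 0.0, 2.0, 1.0, 0.5, 3.0],
    [0.0, 1.0, 1.0, 2.0, 1.5, 1.0],
    [2.0, 1.0, 0.0, 0.0, 2.5, 2.0],
    [1.0, 2.0, 3.0, 1.0, 0.0, 0.0],
]
y = [5.2, 6.1, 7.3, 4.8]

model = pls1_fit(X, y, n_components=1)
fitted = [pls1_predict(model, row) for row in X]
print("one-component fit:", [round(v, 3) for v in fitted])
```

The routine never inverts the descriptor covariance matrix, so it runs even though ordinary multilinear regression has no unique solution for this matrix. With as many components as the centered matrix has rank, the training activities are reproduced exactly, which is why the number of components is normally chosen by cross-validation rather than by the quality of the fit.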

© 2019 chempedia.info