Big Chemical Encyclopedia

Chemical substances, components, reactions, process design ...

Unsupervised variable selection

With the exception of variables having zero variance (which pick themselves for removal), the decision about which variables to eliminate or include, and the method by which this is done, depends on several factors. The two most important are whether the dataset consists of two blocks of variables, a response block (Y) and a descriptor/predictor block (X), and whether the purpose of the analysis is to predict or describe values for one or more of the response variables from a model relating the variables in the two blocks. If prediction is indeed the aim of the analysis, then it seems reasonable that the choice of variables to be included should depend, to some extent, on the response variable or variables being modeled. This approach is referred to as supervised variable selection. On the other hand, if the dataset consists of only one block of variables, the choice of variables in any analysis is made by what is referred to as unsupervised variable selection. [Pg.307]
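As a minimal sketch of the one step that needs no decision, removing zero-variance variables, the following assumes the descriptors are held in a NumPy array with one column per variable. The function name `drop_zero_variance` and the `tol` parameter are illustrative and do not come from the source.

```python
import numpy as np

def drop_zero_variance(X, tol=0.0):
    """Return the columns of X whose variance exceeds tol, plus their original indices.

    `tol` is a hypothetical tolerance; the default removes only exactly constant columns.
    """
    X = np.asarray(X, dtype=float)
    keep = X.var(axis=0) > tol          # boolean mask over columns
    return X[:, keep], np.flatnonzero(keep)

# Column 1 is constant, so it is dropped whatever the aim of the analysis is.
X_demo = np.array([[1.0, 5.0, 0.0],
                   [2.0, 5.0, 0.0],
                   [3.0, 5.0, 1.0]])
X_kept, kept_idx = drop_zero_variance(X_demo)
```

This filter uses no response values, so it applies equally in the supervised and unsupervised settings.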

Whitley DC, Ford MG, Livingstone DJ. Unsupervised forward selection: a method for eliminating redundant variables. J Chem Inf Comput Sci 2000;40:1160-8. [Pg.489]

A set of molecules is commonly described with anywhere from 4 to 10,000 descriptors. It is also possible to represent molecules with sparse descriptors numbering up to 2 million. Variable selection, or descriptor subset selection, or descriptor validation, is important, whether the context is supervised or unsupervised learning (Section 6). [Pg.79]

This procedure can be performed either with item 1, in which case it is considered supervised variable selection because the response variable influences which variables are selected, or without item 1, in which case it falls into the unsupervised selection category. [Pg.334]

Adopting the unsupervised option initially, the first two variables to be selected are those with the lowest pairwise correlation. The next variable selected is the one with the smallest squared multiple correlation with those first two variables. This process continues until the preset maximum level of multicollinearity (measured by the squared multiple correlation coefficient) is reached. Whitley et al. refer to this procedure as unsupervised forward selection (UFS). UFS can also be performed with a minimum variance criterion, under which only variables with variance above the minimum are selected. The two criteria can be applied simultaneously. With supervised variable selection, only those variables having a sufficiently high correlation with the response are considered, and what is effectively UFS is then run on this reduced set of variables. We will term this latter process supervised forward selection (SFS). To see how these options work and to examine the effect they have on the model produced, we performed PLS on the data with both UFS and SFS configured to run with a range of response variable correlations (Table 8). [Pg.335]
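The selection loop described above can be sketched in a few lines of NumPy. This is an illustrative reading of the prose, not the published UFS implementation of Whitley et al.: the function names `ufs` and `sfs`, the parameters `r2_max`, `min_variance` and `r_min`, the default thresholds, the use of ordinary least squares to obtain the squared multiple correlation, and the use of the absolute correlation with the response are all assumptions made for the example.

```python
import numpy as np

def squared_multiple_correlation(X_sel, x):
    """R^2 from regressing column x on the columns of X_sel (with an intercept)."""
    A = np.column_stack([np.ones(len(x)), X_sel])
    coef, *_ = np.linalg.lstsq(A, x, rcond=None)
    resid = x - A @ coef
    return 1.0 - np.sum(resid ** 2) / np.sum((x - x.mean()) ** 2)

def ufs(X, r2_max=0.9, min_variance=0.0):
    """Unsupervised forward selection sketch; returns selected column indices."""
    X = np.asarray(X, dtype=float)
    # Minimum variance criterion (zero-variance columns are always excluded).
    candidates = [j for j in range(X.shape[1]) if X[:, j].var() > min_variance]
    if len(candidates) < 2:
        return candidates
    # Start with the pair having the lowest squared pairwise correlation.
    r2_pair = np.corrcoef(X[:, candidates], rowvar=False) ** 2
    np.fill_diagonal(r2_pair, np.inf)
    a, b = np.unravel_index(np.argmin(r2_pair), r2_pair.shape)
    selected = [candidates[a], candidates[b]]
    remaining = [j for j in candidates if j not in selected]
    while remaining:
        # Squared multiple correlation of each remaining variable with the selected set.
        r2 = [squared_multiple_correlation(X[:, selected], X[:, j]) for j in remaining]
        best = int(np.argmin(r2))
        if r2[best] > r2_max:   # preset multicollinearity limit reached
            break
        selected.append(remaining.pop(best))
    return selected

def sfs(X, y, r_min=0.3, **ufs_kwargs):
    """Supervised variant: pre-filter on |correlation with y|, then run ufs."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    keep = [j for j in range(X.shape[1])
            if X[:, j].var() > 0 and abs(np.corrcoef(X[:, j], y)[0, 1]) >= r_min]
    return [keep[i] for i in ufs(X[:, keep], **ufs_kwargs)]
```

In this sketch the starting pair is accepted without being checked against `r2_max`, and refitting a least-squares model for every candidate is simple rather than efficient. Raising `r_min` in `sfs` shrinks the pool handed to `ufs`, which corresponds to running the procedure over a range of response variable correlations.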

Unsupervised Forward Selection: A Method for Eliminating Redundant Variables. [Pg.344]

The main method of modification is by multiplication of the compressed spectrum by a vector of coefficients calculated from a stepwise multiple regression analysis (SMLR) of the values of the analyte of interest in the database on the compressed spectral data. This both selects a subset of the variables and weights them according to their importance for predicting the analyte. The SMLR is run after the compression because FT and WT coefficients are essentially uncorrelated, which makes the unsupervised use of SMLR much less problematic. [Pg.785]







© 2024 chempedia.info