Feature Selection and Ranking

A crude criterion for identifying the genes that discriminate between all the classes is the F-test. The F-statistic selects features under the following ranking criterion [Pg.140]
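
The ranking criterion itself is not reproduced in this excerpt. As a minimal sketch, assuming the standard one-way ANOVA F-statistic is what is meant, each feature can be scored and ranked as follows. The function names and the small data set are hypothetical and serve only as an illustration.

```python
from statistics import mean

def f_statistic(values, labels):
    """One-way ANOVA F-statistic for a single feature (assumed criterion)."""
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(y, []).append(v)
    n, k = len(values), len(groups)
    grand = mean(values)
    # Between-group and within-group sums of squares.
    between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups.values())
    within = sum((v - mean(g)) ** 2 for g in groups.values() for v in g)
    # Assumes within > 0, i.e. at least one class has non-constant values.
    return (between / (k - 1)) / (within / (n - k))

def rank_by_f(X, labels):
    """Return feature indices sorted from highest to lowest F, plus the scores."""
    n_features = len(X[0])
    scores = [f_statistic([row[j] for row in X], labels) for j in range(n_features)]
    order = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return order, scores

# Hypothetical data: 6 samples (rows), 3 features (columns), 3 classes.
X = [
    [1.0, 5.0, 2.0],
    [1.2, 4.0, 2.1],
    [3.0, 5.5, 2.0],
    [3.1, 4.5, 1.9],
    [5.0, 5.0, 2.2],
    [5.2, 4.2, 2.0],
]
labels = ["a", "a", "b", "b", "c", "c"]
order, scores = rank_by_f(X, labels)
print(order)
```

Feature 0 separates the three classes cleanly in this toy data, so it is ranked first.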

Dudoit et al. (2002) introduced the ratio of between-group to within-group sums of squares (BW ratio). The BW ratio for a gene j, BW(j), is defined as [Pg.141]
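
The definition is cut off in this excerpt. For reference, the form in which the ratio is usually stated is sketched below. The notation is an assumption, since the excerpt defines no symbols: x_{ij} is the expression of gene j in sample i, y_i is the class of sample i, \bar{x}_{kj} is the mean of gene j in class k, \bar{x}_{\cdot j} is the overall mean of gene j, and I(\cdot) is the indicator function.

```latex
\[
\mathrm{BW}(j) =
\frac{\sum_{i}\sum_{k} I(y_i = k)\,\bigl(\bar{x}_{kj} - \bar{x}_{\cdot j}\bigr)^{2}}
     {\sum_{i}\sum_{k} I(y_i = k)\,\bigl(x_{ij} - \bar{x}_{kj}\bigr)^{2}}
\]
```

Genes with a large ratio vary more between classes than within them and are ranked first.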

TABLE 6.1 Algorithms for Feature Selection and Ranking by Category [Pg.142]

MDA Embedded (Tree-based Bagging) R randomForest, randomForestSRC [Pg.142]

An efficient method of feature selection, and hence sensor optimization, for an e-nose system is described in this work using the problem of black tea quality prediction. The work shows that a feature set comprising a few of the highest-ranked features from a feature selection algorithm will not necessarily produce the best classification performance, because the performance of a classifier also depends on the choice of its parameter. Therefore, the features and the parameter of a classifier should be selected simultaneously to obtain optimum performance. In our future work we shall address this issue by using a wrapper or embedded method of feature selection in this application. [Pg.203]
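
The idea of choosing the features and the classifier parameter together can be sketched as a small wrapper-style grid search. This is only an illustration under stated assumptions, not the method of the cited work: the classifier (k-nearest neighbours), the leave-one-out scoring, the feature ranking, and the data are all hypothetical.

```python
from collections import Counter
from itertools import product

def knn_predict(train_X, train_y, x, k):
    """Majority vote among the k nearest training samples (squared Euclidean)."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), label)
        for row, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

def loo_accuracy(X, y, features, k):
    """Leave-one-out accuracy of k-NN restricted to the given feature indices."""
    sub = [[row[j] for j in features] for row in X]
    hits = 0
    for i in range(len(sub)):
        train_X = sub[:i] + sub[i + 1:]
        train_y = y[:i] + y[i + 1:]
        hits += knn_predict(train_X, train_y, sub[i], k) == y[i]
    return hits / len(sub)

def joint_search(X, y, ranking, sizes, ks):
    """Score every (number of top-ranked features, k) pair; return best and all."""
    results = {
        (m, k): loo_accuracy(X, y, ranking[:m], k)
        for m, k in product(sizes, ks)
    }
    best = max(results, key=results.get)
    return best, results

# Hypothetical data: 8 samples, 3 sensor features, 2 quality classes.
X = [
    [0.1, 0.9, 0.2], [0.2, 0.1, 0.1], [0.3, 0.8, 0.3], [0.4, 0.2, 0.2],
    [0.6, 0.9, 0.8], [0.7, 0.1, 0.9], [0.8, 0.7, 0.7], [0.9, 0.3, 0.8],
]
y = ["low"] * 4 + ["high"] * 4
ranking = [0, 2, 1]  # hypothetical output of some feature-ranking step
best, results = joint_search(X, y, ranking, sizes=(1, 2, 3), ks=(1, 3))
print(best, results[best])
```

Because every subset size is scored with every parameter value, the search can show when adding lower-ranked features or changing k changes the accuracy, which a ranking-only selection cannot reveal.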

Sometimes, several feature selection methods are used for a given analysis. For example, an analyst might reduce a chromatogram to a peak table, selecting a series of candidate variables of interest, and then perform further variable ranking and optimization on the integrated peak table, especially in the case of multidimensional separations, where hundreds, if not thousands, of compounds can be resolved (Felkel et al., 2010). [Pg.318]

Figure 6.4 shows the 20 genes selected in classification, ranked with the F-test, the weighted frequency measure from CHRP, and the BW ratio. The performance of the feature selection methods applied with the classification algorithms for the pediatric AML data showed about 70% generalized accuracy (Baek et al., 2008). [Pg.143]

Fig. 6 describes the proposed combinational feature selection technique. The first two features of each individual ranking are taken together to form a feature set. Table 3 shows that the combined feature set consists of only three features, i.e., feature numbers 4, 6, and 8. [Pg.202]
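
The combination step described above amounts to taking the union of the top two features from each ranking. A minimal sketch follows. The rankings are hypothetical and are not the values from Table 3; they are chosen only so that the union has three members, as in the text.

```python
def combine_top_features(rankings, top_n=2):
    """Union of the top_n features from each ranking, in first-seen order."""
    combined = []
    for ranking in rankings.values():
        for feature in ranking[:top_n]:
            if feature not in combined:
                combined.append(feature)
    return combined

# Hypothetical rankings (best feature first) from three ranking methods.
rankings = {
    "method_a": [4, 6, 1, 8, 2],
    "method_b": [6, 8, 4, 3, 5],
    "method_c": [8, 4, 6, 7, 1],
}
combined = combine_top_features(rankings)
print(sorted(combined))
```

When the individual rankings largely agree on their leading features, the union stays small even though several methods contribute to it.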

One group of feature selection methods uses the statistics of the data (means, variances) to select the most important features. The features are ranked according to their importance and less important features are discarded. [Pg.106]
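
As a minimal sketch of such a statistics-based filter, the example below ranks features by their variance and discards the rest. Variance is only one possible statistic and is an assumption here. It depends on the scale of each feature, so in practice features would normally be put on comparable scales first. The function name and data are hypothetical.

```python
from statistics import pvariance

def filter_by_variance(X, keep):
    """Keep the `keep` features with the largest population variance."""
    n_features = len(X[0])
    scores = [pvariance([row[j] for row in X]) for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    kept = sorted(ranked[:keep])  # restore original column order
    reduced = [[row[j] for j in kept] for row in X]
    return kept, reduced

# Hypothetical data: 4 samples, 3 features; the last feature is constant.
X = [
    [1.0, 10.0, 0.5],
    [1.1, 20.0, 0.5],
    [0.9, 30.0, 0.5],
    [1.0, 40.0, 0.5],
]
kept, reduced = filter_by_variance(X, keep=2)
print(kept)
```

The constant third feature carries no information for separating samples and is the one discarded.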

The purpose of this chapter is twofold: first, to introduce a methodology for equipment selection and, second, to describe the principal features of Filter Design Software (FDS). With respect to the former, a technique for preliminary equipment selection is presented, and it is shown how an equipment list can be ranked to help refine further selection considerations. Descriptions of FDS illustrate how equipment selection, data analysis and equipment simulation procedures can be combined into computer software; a basic flowsheet is shown in Figure 5.1. Worked examples are given. [Pg.201]

Figure 7.5. Simulation results that elucidate how the sensitivity and the selectivity of a proteomics experiment depend on various features. (a) The choice of algorithm. The probity algorithm displays better sensitivity and selectivity than an algorithm that ranks strictly based on the number of matches. (b) The search conditions. Increasing the mass window of a search 10 times when searching with data that display small mass errors yields worse sensitivity and selectivity. (c) The quality of the data. Data with less noise yield better sensitivity and selectivity.
