Big Chemical Encyclopedia

Chemical substances, components, reactions, process design ...


Prevalence in Classified Datasets

The Cooper statistics do not take into account the prevalence of each class within the training set, which biases the model toward predicting one class over the other. For example, if 75% of the training set is active and 25% is inactive, a null model that assigns classes according to these proportions is three times as likely to predict a compound as active as it is to predict it as inactive. Cohen defined the kappa index to overcome the problem of prevalence when assessing the significance of a classification. [Pg.255]
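As a rough illustration of why a prevalence correction matters, the sketch below computes Cohen's kappa from the four cells of a two-class confusion matrix, using the usual definition kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e is the agreement expected by chance from the row and column totals. The function name `cohen_kappa`, the argument names, and the example counts are assumptions made for this illustration and do not come from the source.

```python
import math


def cohen_kappa(tp: int, fn: int, fp: int, tn: int) -> float:
    """Cohen's kappa for a 2x2 table (hypothetical helper, not from the source).

    tp: actives predicted active        fn: actives predicted inactive
    fp: inactives predicted active      tn: inactives predicted inactive
    """
    n = tp + fn + fp + tn
    if n == 0:
        return math.nan  # no compounds, kappa is undefined
    # Observed agreement: fraction of compounds classified correctly.
    observed = (tp + tn) / n
    # Chance agreement: product of actual and predicted class totals,
    # summed over both classes and normalised by n squared.
    expected = ((tp + fn) * (tp + fp) + (fp + tn) * (fn + tn)) / (n * n)
    if expected == 1.0:
        return math.nan  # only one class present and predicted, kappa is undefined
    return (observed - expected) / (1.0 - expected)


# Illustrative counts only: 100 compounds, 75% active.
# A model that labels every compound active is 75% accurate by prevalence alone.
always_active = cohen_kappa(tp=75, fn=0, fp=25, tn=0)
# A model that classifies every compound correctly.
perfect = cohen_kappa(tp=75, fn=0, fp=0, tn=25)
print(always_active, perfect)
```

With these made-up counts the always-active model reaches 75% accuracy yet has a kappa of 0, while the perfect model has a kappa of 1, which is the distinction the prevalence correction is meant to expose.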

Sensitivity (true positive rate): a/(a + b). Fraction of actives correctly ... [Pg.256]

Positive predictivity: a/(a + c). Fraction of chemicals correctly assigned as active out of all predicted actives. [Pg.256]

Negative predictivity: d/(b + d). Fraction of compounds correctly assigned as not-active out of all predicted not-actives. [Pg.256]

False positive (over-classification) rate: c/(c + d), equal to 1 - specificity. Fraction of not-actives falsely ... [Pg.256]
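The four statistics listed above can be sketched as one small function. The excerpt does not show the table that defines the cell labels, so the reading used here is an assumption inferred from the formulas: a = actives predicted active, b = actives predicted not-active, c = not-actives predicted active, d = not-actives predicted not-active. The function name `cooper_statistics` and the example counts are likewise hypothetical.

```python
import math


def _ratio(numerator: int, denominator: int) -> float:
    """Return numerator/denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else math.nan


def cooper_statistics(a: int, b: int, c: int, d: int) -> dict:
    """Cooper statistics from a 2x2 table (cell labels are an assumed reading).

    a: active, predicted active          b: active, predicted not-active
    c: not-active, predicted active      d: not-active, predicted not-active
    """
    return {
        # Fraction of actual actives that were predicted active.
        "sensitivity": _ratio(a, a + b),
        # Fraction of predicted actives that are actually active.
        "positive_predictivity": _ratio(a, a + c),
        # Fraction of predicted not-actives that are actually not-active.
        "negative_predictivity": _ratio(d, b + d),
        # Fraction of actual not-actives predicted active (1 - specificity).
        "false_positive_rate": _ratio(c, c + d),
    }


# Illustrative counts only: 75 actives and 25 not-actives.
stats = cooper_statistics(a=60, b=15, c=10, d=15)
for name, value in stats.items():
    print(f"{name}: {value:.3f}")
```

Returning NaN for an empty denominator is a design choice of this sketch. It keeps a dataset with, say, no predicted actives from raising an error while still flagging that the statistic is undefined.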





© 2024 chempedia.info