Exploratory data analysis statistical significance

Exploratory data analysis (EDA). This analysis, also called data pretreatment, is essential to avoid wrong or obvious conclusions. The objective of EDA is to extract the maximum useful information from each piece of chemico-physical data, because a researcher's perception and experience alone may not be sufficient to single out all the significant information. This step comprises descriptive univariate statistics (e.g. mean, normality checks, skewness, kurtosis, variance, coefficient of variation), detection of outliers, cleansing of the data matrix, measures of analytical method quality (e.g. precision, sensitivity, robustness, uncertainty, traceability) (Eurachem, 1998), and the use of basic graphical tools such as box-and-whisker and stem-and-leaf plots. [Pg.157]
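As an illustration of the univariate step described above, the following is a minimal Python sketch using only the standard library. The function names `describe` and `iqr_outliers`, the 1.5 × IQR fence (the usual box-and-whisker convention), and the `readings` values are assumptions made for the example; they do not come from the cited text.

```python
import statistics

def describe(values):
    """Univariate summary: mean, sample variance, CV, skewness, excess kurtosis."""
    n = len(values)
    mean = statistics.fmean(values)
    var = statistics.variance(values)  # sample variance (divides by n - 1)
    sd = var ** 0.5
    # Simple (biased) moment estimators for the shape statistics.
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    return {
        "mean": mean,
        "variance": var,
        "cv": sd / mean,                       # coefficient of variation
        "skewness": m3 / m2 ** 1.5,
        "excess_kurtosis": m4 / m2 ** 2 - 3.0,
    }

def iqr_outliers(values, k=1.5):
    """Flag points outside the box-and-whisker fences Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [x for x in values if x < low or x > high]

# Hypothetical replicate measurements with one suspicious value.
readings = [10.1, 9.8, 10.0, 10.3, 9.9, 10.2, 14.7]
summary = describe(readings)
suspects = iqr_outliers(readings)
```

A flagged value is only a candidate outlier; whether to remove it during cleansing of the data matrix remains an analytical judgment.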

So how is multiple imputation (MI) incorporated into exploratory data analysis, given that one would obviously not wish to analyze m different data sets? A simple method is to impute m + 1 data sets, perform the exploratory analysis on one of the imputed data sets, and obtain the final model of interest. Then, using the remaining m data sets, compute the imputed parameter estimates and standard errors of the final model. It should be kept in mind, however, that because a single imputed data set is used to develop the model, the standard errors seen during development will be smaller than they should be, since one imputed data set fails to take into account the sampling variability in the missing values. Hence, a more conservative test of statistical significance for either model entry or removal should be considered during model development. [Pg.90]
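The final pooling step can be sketched as follows. The excerpt does not say how the m sets of estimates are combined; this sketch assumes the standard combining rules for multiple imputation (Rubin's rules), in which the total variance is the average within-imputation variance plus (1 + 1/m) times the between-imputation variance. The function name `pool_rubin` and the numbers are hypothetical.

```python
import statistics

def pool_rubin(estimates, std_errors):
    """Pool one parameter across m imputed data sets (Rubin's rules, assumed here)."""
    m = len(estimates)
    pooled_estimate = statistics.fmean(estimates)
    within = statistics.fmean([se ** 2 for se in std_errors])  # mean of squared SEs
    between = statistics.variance(estimates)                   # variance across imputations
    total = within + (1 + 1 / m) * between
    return pooled_estimate, total ** 0.5

# Hypothetical estimates of one coefficient from m = 5 imputed data sets.
estimates = [1.0, 1.2, 0.8, 1.1, 0.9]
std_errors = [0.2, 0.2, 0.2, 0.2, 0.2]
pooled_estimate, pooled_se = pool_rubin(estimates, std_errors)
```

In this made-up example the pooled standard error exceeds the 0.2 reported by each single imputed data set, which is the effect the passage warns about: a single imputed data set understates the uncertainty due to the missing values.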

Exploratory methods are used not to test hypotheses but rather to get an overview of data. Various clustering methods and ordination are excellent tools for exploratory analysis of microarray data. These unsupervised methods do not require external class or group information. Clusters are generated purely based on the intrinsic similarity of the gene or sample expression profiles. No null hypothesis can be rejected, and p values are not generated to test statistical significance. Methods that... [Pg.129]
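To make the unsupervised idea concrete, here is a minimal sketch of agglomerative clustering in plain Python. Single linkage with Euclidean distance is an assumption chosen for brevity, not a method named in the excerpt, and the `profiles` values are invented. No class labels are supplied and no p value is produced; the groups arise only from the similarity of the profiles.

```python
import math

def single_linkage(profiles, n_clusters):
    """Merge the two closest clusters repeatedly until n_clusters remain."""
    clusters = [[i] for i in range(len(profiles))]

    def cluster_distance(a, b):
        # Single linkage: distance between the closest pair of members.
        return min(math.dist(profiles[i], profiles[j]) for i in a for j in b)

    while len(clusters) > n_clusters:
        pairs = [
            (cluster_distance(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        ]
        _, i, j = min(pairs)
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i stays valid
    return [sorted(c) for c in clusters]

# Hypothetical expression profiles: four genes measured in three samples.
profiles = [
    [1.0, 1.1, 0.9],
    [1.1, 1.0, 1.0],
    [5.0, 5.2, 4.9],
    [5.1, 5.0, 5.0],
]
groups = single_linkage(profiles, n_clusters=2)
```

The number of clusters is chosen by the analyst here; in practice the full merge sequence is often inspected as a dendrogram instead of fixing that number in advance.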

Big Chemical Encyclopedia
