Big Chemical Encyclopedia

Chemical substances, components, reactions, process design ...

Data analysis, exploratory

Exploratory data analysis, EDA, is an essential prerequisite of the examination of data by confirmatory methods. Time spent here can lead to a much greater appreciation of the data's structure and the selection of the most appropriate confirmatory technique. This has parallels in the analytical world. The story of the student's reply to the question "Is the organic material a carboxylic acid?", which was "I don't know because the IR scan isn't back yet", poses questions about the approaches to preliminary testing. [Pg.43]

These EDA methods are essentially pictorial and can often be carried out using simple pencil and paper methods. Picturing data and displaying it accurately is an aspect of data analysis which is underutilised. Unless exploratory data analysis uncovers features and structures within the data set, there is likely to be nothing for confirmatory data analysis to consider. One of the champions of EDA, the American statistician John W. Tukey, in his seminal work on EDA captures the underlying principle in his comment that... [Pg.43]

The simple EDA approaches developed by Tukey have been greatly extended by Tufte, who remarked that... [Pg.43]

Tufte's books on this topic, The Visual Display of Quantitative Information (1983), Envisioning Information (1990) and Visual Explanations (1997), are recommended for further reading. [Pg.44]

[Table: Laboratory against % Protein, laid out in three column pairs; values not recovered] [Pg.46]

Some methods for illustrating structures, that is, multivariate relations between data items in high-dimensional data sets, will be described. We will only consider methods that regard the input as metric vectors, and that may be used without any assumptions about the distribution of data. We also assume that limited information is available about the data set (i.e., class labels). [Pg.250]

While we will present methods that produce a cluster substructure of the data, we need to emphasize that, sometimes, variable selection and data preprocessing may also be important. [Pg.250]

The following questions are important when discussing a method for large, high-dimensional data sets: What kind of structure does the method extract from the data set? How does it illustrate the structure? Does it reduce dimensionality of data? Does it reduce the number of data points? [Pg.250]

The simplest method for visualizing a data set is to plot profiles, that is, two-dimensional (2D) graphs in which the dimensions are enumerated on the x axis, with the corresponding values on the y axis. An alternative, also widely used, is to plot 2D representations of pairs of the original dimensions. There are also methods that produce different curves based on the data points' values; for example, the components of data vectors are used as coefficients of orthogonal sinusoids, which are then added together. The most important drawback of such methods is that they do not reduce the amount of data, and thus they cannot be used effectively with large, high-dimensional data sets. However, they can be used for illustrating data summaries. [Pg.250] [Pg.251]
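The sinusoid-based curves mentioned above can be sketched in a few lines. The excerpt does not name the method or fix the basis functions; the sketch below assumes the common Andrews-curve form, in which the first component is scaled by 1/sqrt(2) and the remaining components multiply sin(t), cos(t), sin(2t), cos(2t), and so on. The function names and the three sample vectors are illustrative, not taken from the source.

```python
import math

def andrews_curve(x, t):
    """Sum of orthogonal sinusoids with the components of x as coefficients.

    Assumed form: x[0]/sqrt(2) + x[1]*sin(t) + x[2]*cos(t)
                  + x[3]*sin(2t) + x[4]*cos(2t) + ...
    """
    total = x[0] / math.sqrt(2.0)
    for i, coeff in enumerate(x[1:], start=1):
        k = (i + 1) // 2                      # frequency: 1, 1, 2, 2, 3, ...
        basis = math.sin if i % 2 == 1 else math.cos
        total += coeff * basis(k * t)
    return total

def sample_curve(x, n_points=9):
    """Evaluate the curve for one data vector on an even grid over [-pi, pi]."""
    ts = [-math.pi + 2.0 * math.pi * j / (n_points - 1) for j in range(n_points)]
    return [(t, andrews_curve(x, t)) for t in ts]

# Made-up data: one curve is produced per data vector, so the amount of
# plotted material grows with the number of data points.
data = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
curves = [sample_curve(x) for x in data]
```

Each data vector yields its own curve, which is why such displays do not reduce the amount of data and suit data summaries better than raw large data sets.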


Other methods consist of algorithms based on multivariate classification techniques or neural networks; they are constructed for automatic recognition of structural properties from spectral data, or for simulation of spectra from structural properties [83]. Multivariate data analysis for spectrum interpretation is based on the characterization of spectra by a set of spectral features. A spectrum can be considered as a point in a multidimensional space with the coordinates defined by spectral features. Exploratory data analysis and cluster analysis are used to investigate the multidimensional space and to evaluate rules to distinguish structure classes. [Pg.534]

Spectral features and their corresponding molecular descriptors are then applied to mathematical techniques of multivariate data analysis, such as principal component analysis (PCA) for exploratory data analysis or multivariate classification for the development of spectral classifiers [84-87]. Principal component analysis results in a scatter plot that exhibits spectra-structure relationships by clustering similarities in spectral and/or structural features [88, 89]. [Pg.534]

Saraiva, P.M. and Stephanopoulos, G., 1992b. An exploratory data analysis robust optimization approach to continuous process improvement. Working Paper, Dept. Chem. Eng. MIT, Cambridge MA. [Pg.321]

Hoaglin, D. et al., Understanding Robust and Exploratory Data Analysis. John Wiley & Sons, New York, 1980. [Pg.236]

Tukey, J. W., Exploratory Data Analysis. Addison-Wesley, Reading, Massachusetts, 1977. [Pg.237]

Exploratory Data Analysis and Display Methods Visualization of Data Structures... [Pg.268]

Multivariate analytical images may be processed additionally by chemometrical procedures, e.g., by exploratory data analysis, regression, classification, and principal component analysis (Geladi et al. [1992b]). [Pg.281]

Tukey JW (1962) The future of data analysis. Ann Math Stat 33:1-81. Tukey JW (1977) Exploratory data analysis. Addison-Wesley, Reading, MA... [Pg.287]

Tukey, J. W. (1977). Exploratory data analysis. Reading, MA: Addison-Wesley. [Pg.187]

Exploratory data analysis shows the aptitude of an ensemble of chemical sensors to be utilized for a given application, leaving to the supervised classification the task of building a model to be used to predict the class membership of unknown samples. [Pg.153]

Velleman, P.F. and Hoaglin, D.C. (1981). Applications, Basics and Computing of Exploratory Data Analysis. Duxbury Press, Boston. [Pg.129]

We will skip (1) and (2) above as methods not to be preferred as global analyses. Graphical displays have tremendous value as exploratory data analysis (EDA) techniques with the type of data one encounters in these studies. For formal analyses, one could weigh univariate repeated and other factorial designs against their true multivariate counterparts. [Pg.624]

The above two objectives, data examination and preparation, are the primary focus of this section. For data examination, two major techniques are presented: the scattergram and Bartlett's test. Likewise, for data preparation (with the issues of rounding and outliers having been addressed in a previous chapter), two techniques are presented: randomization (including a test for randomness in a sample of data) and transformation. Exploratory data analysis (EDA) is presented and briefly reviewed later. This is a broad collection of techniques and approaches to probe data, that is, to both examine the data and perform some initial, flexible analysis of it. [Pg.900]

Over the past twenty years, an entirely new approach has been developed to get the most information out of the increasingly large and complex data sets that scientists are faced with. This approach involves the use of a very diverse set of fairly simple techniques which comprise exploratory data analysis (EDA). As expounded by Tukey (1977), there are four major ingredients to EDA. [Pg.908]

A concept related (but not identical) to resistance and exploratory data analysis is that of robustness. Robustness generally implies insensitivity to departures from assumptions surrounding an underlying model, such as normality. [Pg.909]
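A small numerical sketch may make the idea concrete. The excerpt does not define resistance; the example assumes its usual EDA sense, insensitivity to a few aberrant values, and compares the mean and standard deviation with the median and the median absolute deviation (MAD). The data values and the `mad` helper are illustrative assumptions, not taken from the source.

```python
import statistics

def mad(values):
    """Median absolute deviation from the median (unscaled)."""
    med = statistics.median(values)
    return statistics.median([abs(v - med) for v in values])

# Made-up measurements; in the second list one value is replaced by a
# gross error.
clean = [9.8, 9.9, 10.0, 10.1, 10.2]
contaminated = [9.8, 10.0, 10.1, 10.2, 99.0]

# Classical summaries move a long way when one value goes wrong ...
mean_shift = statistics.mean(contaminated) - statistics.mean(clean)
stdev_ratio = statistics.stdev(contaminated) / statistics.stdev(clean)

# ... while the order-based summaries barely change.
median_shift = statistics.median(contaminated) - statistics.median(clean)
mad_shift = mad(contaminated) - mad(clean)

print(f"mean shift   {mean_shift:.2f}, stdev ratio {stdev_ratio:.1f}")
print(f"median shift {median_shift:.2f}, MAD shift   {mad_shift:.2f}")
```

The same comparison also illustrates robustness in the sense described above, since a single gross error is one way in which data depart from an assumed normal model.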

Some of the above plots can be combined in one graphical display, like one-dimensional scatter plot, histogram, probability density plot, and boxplot. Figure 1.7 shows this so-called edaplot (exploratory data analysis plot) (Reimann et al. 2008). It provides deeper insight into the univariate data distribution: The single groups are... [Pg.29]
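As a rough sketch of the numbers that sit behind two of these panels, the following computes boxplot statistics and histogram counts for one variable. It is not the edaplot routine of Reimann et al.; the quartile convention (medians of the lower and upper halves), the 1.5 x IQR fences, the bin count, and the data are all assumptions made for illustration.

```python
import statistics

def boxplot_stats(values):
    """Median, quartiles, whisker ends, and outliers for one variable.

    Assumed convention: quartiles are the medians of the lower and upper
    halves (middle value excluded when n is odd); fences at 1.5 * IQR.
    """
    xs = sorted(values)
    n = len(xs)
    half = n // 2
    q1 = statistics.median(xs[:half])
    q3 = statistics.median(xs[half + n % 2:])
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    inside = [v for v in xs if low_fence <= v <= high_fence]
    return {
        "median": statistics.median(xs),
        "q1": q1,
        "q3": q3,
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        "outliers": [v for v in xs if v < low_fence or v > high_fence],
    }

def histogram_counts(values, n_bins):
    """Counts in equal-width bins spanning min..max (assumes max > min)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        idx = min(int((v - lo) / width), n_bins - 1)  # top edge joins last bin
        counts[idx] += 1
    return counts

# Made-up univariate sample with one unusually large value.
sample = [2.1, 2.4, 2.5, 2.7, 2.8, 3.0, 3.1, 3.3, 3.6, 7.9]
stats = boxplot_stats(sample)
counts = histogram_counts(sample, n_bins=4)
```

Looking at both summaries together shows what a combined display offers: the histogram counts reveal the gap in the upper tail, and the boxplot statistics flag the isolated value beyond the upper fence.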

Exploratory data analysis aims to learn about the data distribution (clusters, groups of similar objects). In multivariate data analysis, an X-matrix (objects/samples characterized by a set of variables/measurements) is considered. The most used method for this purpose is PCA, which uses latent variables with maximum variance of the scores (Chapter 3). Another approach is cluster analysis (Chapter 6). [Pg.71]

Principal component analysis (PCA) can be considered as the mother of all methods in multivariate data analysis. The aim of PCA is dimension reduction, and PCA is the most frequently applied method for computing linear latent variables (components). PCA can be seen as a method to compute a new coordinate system formed by the latent variables, which is orthogonal, and where only the most informative dimensions are used. Latent variables from PCA optimally represent the distances between the objects in the high-dimensional variable space; remember, the distance of objects is considered as an inverse similarity of the objects. PCA considers all variables and accommodates the total data structure; it is a method for exploratory data analysis (unsupervised learning) and can be applied to practically any X-matrix; no y-data (properties) are considered and therefore none are necessary. [Pg.73]
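A minimal sketch of PCA as described here, assuming NumPy is available and computing the components from a singular value decomposition of the mean-centered X-matrix. The excerpt does not prescribe an algorithm, so the SVD route, the `pca` function, and the synthetic data are assumptions for illustration only. No y-data enter the calculation.

```python
import numpy as np

def pca(X, n_components=2):
    """PCA via SVD of the mean-centered data matrix.

    Returns scores (objects in the new coordinate system), loadings
    (orthonormal directions of the latent variables), and the fraction
    of total variance carried by each retained component.
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                        # center each variable
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    loadings = Vt[:n_components].T                 # columns are components
    scores = Xc @ loadings                         # project the objects
    explained = (s ** 2) / np.sum(s ** 2)
    return scores, loadings, explained[:n_components]

# Synthetic X-matrix: 50 objects, 3 variables, two of them strongly related.
rng = np.random.default_rng(0)
t = rng.normal(size=50)
X = np.column_stack([
    t,
    2.0 * t + 0.1 * rng.normal(size=50),
    rng.normal(scale=0.1, size=50),
])

scores, loadings, explained = pca(X, n_components=2)
print("explained variance fractions:", np.round(explained, 3))
```

Plotting the two score columns against each other gives the usual PCA scatter plot; with these synthetic data most of the variance falls on the first component because two of the three variables are nearly collinear.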

Cluster analysis will be discussed in Chapter 6 in detail. Here we introduce cluster analysis as an alternative nonlinear mapping technique for exploratory data analysis. The method allows gaining more insight into the relations between the objects if a... [Pg.96]



© 2024 chempedia.info