Detecting and Removing Bad Data

When the data set is examined visually, some measurements can already be marked as potentially bad data, especially those that lie well outside the expected range of the process. Other bad data are not directly visible, so several methods have been developed to recognize them. One of these methods, principal component analysis, will be discussed in the following chapter, but its merits will already be shown here. [Pg.292]

In principal component analysis, the process input data can easily be analyzed: outliers can be detected and redundant measurements identified. Process input and output data can be combined in one data set. [Pg.292]
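As a minimal sketch of how outliers might be surfaced with a principal component model, the Python example below ranks samples by their squared residual after projection onto K components (often called the Q statistic or squared prediction error). The choice of this statistic, the autoscaling step, the function name `q_residuals`, and the synthetic data are all assumptions made for illustration; none of them comes from the source text, and no formal control limit is computed.

```python
import numpy as np

def q_residuals(A, K):
    """Squared residual of each sample after projecting onto K components."""
    # Autoscale: zero mean and unit variance per variable (an assumption).
    X = (A - A.mean(axis=0)) / A.std(axis=0, ddof=1)
    # Rows of Vt are the principal directions of X.
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    P = Vt[:K].T                  # loadings, one column per component
    E = X - (X @ P) @ P.T         # part of X the model does not explain
    return np.sum(E**2, axis=1)

# Synthetic data: 6 variables driven by 2 latent factors plus small noise.
rng = np.random.default_rng(0)
factors = rng.normal(size=(100, 2))
W = np.array([[1, 1, 1, 0, 0, 0],
              [0, 0, 0, 1, 1, 1]], dtype=float)
A = factors @ W + 0.05 * rng.normal(size=(100, 6))

# Inject one reading that breaks the correlation between variables.
A[17, 0] += 6.0

Q = q_residuals(A, K=2)
suspects = np.argsort(Q)[::-1][:3]   # samples with the largest residuals
print(suspects)
```

Samples at the top of the ranking are candidates for inspection rather than automatic removal; whether a reading is truly bad still depends on process knowledge.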

A matrix A (I × J), with one row per observation and one column per variable, is constructed containing all the process input data. This matrix is decomposed into a set of scores T (I × K) and loadings P (J × K), where K is the number of principal components, chosen so as to explain the important variation in the data using as few orthogonal components as possible. [Pg.292]
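The decomposition into scores and loadings can be sketched in Python with NumPy. This is an illustration under stated assumptions, not the method used in the source: the data are autoscaled first, the loadings are obtained from a singular value decomposition, and the function name `pca_decompose` and the random data are hypothetical.

```python
import numpy as np

def pca_decompose(A, K):
    """Return scores T (I x K), loadings P (J x K) and residual E (I x J)."""
    # Autoscale each column (an assumption; mean-centering alone is also used).
    X = (A - A.mean(axis=0)) / A.std(axis=0, ddof=1)
    # SVD of the scaled data: rows of Vt are orthonormal directions,
    # ordered by the amount of variance they capture.
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    P = Vt[:K].T          # loadings: first K directions as columns
    T = X @ P             # scores: coordinates of each sample on those directions
    E = X - T @ P.T       # residual left unexplained by K components
    return T, P, E

# Hypothetical data set: I = 50 observations of J = 10 variables.
rng = np.random.default_rng(0)
A = rng.normal(size=(50, 10))
T, P, E = pca_decompose(A, K=5)
print(T.shape, P.shape, E.shape)
```

The scaled data equal T Pᵀ + E, the columns of P are orthonormal, and the columns of T are mutually orthogonal, which is what allows a few components to summarize correlated variables.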

In a principal component model, each principal component is a linear combination of the original process variables defined in the data set. For a process with 10 process variables (file pv.mat), a principal component analysis was carried out using the PLS Toolbox (Eigenvector Research, 2004), and the result is shown in Table 21.1. [Pg.293]

Almost 90% of the variance in the data set can be explained by the first five principal components (PC). [Pg.293]
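A statement of this kind comes from the cumulative fraction of variance explained by the leading components. The Python sketch below shows one way to compute it. It uses synthetic data because pv.mat is not available here, so its numbers are unrelated to Table 21.1; the autoscaling step, the 90% threshold parameter, and the function names are assumptions for illustration.

```python
import numpy as np

def explained_variance(A):
    """Fraction of variance per component, and the cumulative fraction."""
    # Autoscale each column (an assumption about the preprocessing).
    X = (A - A.mean(axis=0)) / A.std(axis=0, ddof=1)
    # Squared singular values are proportional to the component variances.
    s = np.linalg.svd(X, compute_uv=False)
    frac = s**2 / np.sum(s**2)
    return frac, np.cumsum(frac)

def components_needed(cumulative, threshold=0.90):
    """Smallest number of components whose cumulative fraction reaches threshold."""
    return int(np.searchsorted(cumulative, threshold) + 1)

# Synthetic stand-in: 10 variables driven by 3 latent factors plus small noise.
rng = np.random.default_rng(1)
latent = rng.normal(size=(200, 3))
mixing = rng.normal(size=(3, 10))
A = latent @ mixing + 0.01 * rng.normal(size=(200, 10))

frac, cumulative = explained_variance(A)
k = components_needed(cumulative, threshold=0.90)
print(np.round(cumulative, 3), k)
```

Applied to the real data set, the same calculation would give the cumulative percentages from which a figure such as "the first five components" is read off.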
