Outlier and Influence Analysis

Once a suitable covariate model is identified and no further model development will be done, the next step is to examine the dataset for outliers and influential observations. It may be that a few subjects are driving the inclusion of a covariate in a model or that a few observations are biasing the parameter estimates. Examination of the weighted residuals under Eq. (9.14) with the model estimates given in Table 9.15 showed that the distribution was skewed, with two observations outside the acceptable limits of ±5. Patient 54 had an observed concentration of 4.05 mg/L 6 h postdose but a predicted concentration of 1.22 mg/L, a difference of 2.83 mg/L and a corresponding weighted residual of +5.4. Patient 84 had an observed concentration of 1.57 mg/L 7.5 h postdose but had a [Pg.328]
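The screening step described above can be sketched in a few lines of Python. This is only an illustration, assuming the weighted residuals have already been computed by the modeling software (Eq. (9.14) is not reproduced here). The Patient 54 row uses the values quoted in the text; the other rows, the `records` layout, and the `flag_outliers` helper are hypothetical.

```python
# Each row: (patient_id, observed mg/L, predicted mg/L, weighted residual).
# Only the Patient 54 row comes from the text; the other rows are made up
# purely to give the filter something to reject.
records = [
    (54, 4.05, 1.22, 5.4),
    (1, 2.10, 1.95, 0.6),   # hypothetical
    (2, 0.80, 1.10, -1.3),  # hypothetical
]

LIMIT = 5.0  # acceptable limits of +/-5 on the weighted residual


def flag_outliers(rows, limit=LIMIT):
    """Return the rows whose weighted residual lies outside +/- limit."""
    return [row for row in rows if abs(row[3]) > limit]


flagged = flag_outliers(records)
for pid, obs, pred, wres in flagged:
    # Report the raw difference alongside the weighted residual.
    print(f"Patient {pid}: observed - predicted = {obs - pred:.2f} mg/L, "
          f"weighted residual = {wres:+.1f}")
```

A flag of this kind only identifies candidates for review; whether a flagged observation is actually removed is a separate decision.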

To see what impact these observations might have on the parameter estimates, they were removed and the best model after backwards stepwise model development was refit. The results are shown in Table 9.15. Removal of these observations resulted in a decrease in the OFV, AIC, and condition number with little to no change in the parameter estimates, although, to be fair, direct comparison of the OFVs and AICs is not valid because the data sets contain unequal numbers of observations. More importantly, although the weighted residuals were still not normally distributed, their distribution was no longer skewed. Also, the standard errors of the estimates all decreased. Whether to remove these observations from the data set is not immediately clear, and if removal of observations was not specified a priori in the data analysis plan, then they should probably not be removed. In this case, it was decided to remove the observations from the data set. [Pg.329]

With a data set consisting of many patients, some patients may show influence over a single parameter. That is to be expected. The more important question is whether any patients exert profound influence over a single parameter or whether there are a few patients that might [Pg.329]

To get at the question of overall influence, the matrix of structural model parameters and variance components was subjected to principal component analysis. Principal component analysis (PCA) was introduced in the chapter on Nonlinear Mixed Effects Model Theory and transforms a matrix of values to another matrix such that the columns of the transformed matrix are uncorrelated and the first column contains the largest amount of variability, the second column contains the second largest, etc. Hopefully, just the first few principal components contain the majority of the variance in the original matrix. The outcome of PCA is to take X, a matrix of p variables, and reduce it to a matrix of q variables (q < p) that contain most of the information within X. In this PCA of the standardized parameters (fixed effects and all variance components), the first three principal components contained 74% of the total variability in the original matrix, so PCA was largely successful. PCA works best when a high correlation exists between the variables in the original data set. Usually more than 80% variability in the first few components is considered a success. [Pg.329]
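As a rough illustration of the transformation just described, the sketch below standardizes the columns of a matrix and computes principal component scores and the fraction of variance each component carries, using a singular value decomposition in NumPy. The input matrix is simulated and purely hypothetical, as are the function name `pca_explained_variance` and the assumption that each row holds one set of parameter estimates; it is not the analysis reported in the text.

```python
import numpy as np


def pca_explained_variance(X):
    """PCA of the column-standardized matrix X via SVD.

    Returns (scores, ratio): the principal component scores and the
    fraction of total variance carried by each component, largest first.
    """
    X = np.asarray(X, dtype=float)
    # Standardize each column to mean 0 and unit sample variance.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    scores = U * s                      # uncorrelated columns
    variances = s**2 / (Z.shape[0] - 1)
    return scores, variances / variances.sum()


# Hypothetical input: 50 rows of 6 correlated "parameters", simulated
# from 2 latent factors plus noise. Not data from the text.
rng = np.random.default_rng(0)
latent = rng.normal(size=(50, 2))
X = latent @ rng.normal(size=(2, 6)) + 0.3 * rng.normal(size=(50, 6))

scores, ratio = pca_explained_variance(X)
cumulative = np.cumsum(ratio)
# Smallest number of components q whose cumulative share reaches 80%.
q = int(np.searchsorted(cumulative, 0.80) + 1)
print("cumulative variance explained:", np.round(cumulative, 3))
print("components needed for 80%:", q)
```

Plotting one column of `scores` against another, or against the row index, gives the kind of scatter and index plots in which rows lying outside the bulk of the data can be picked out by eye.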

and 117) were identified in the index plot as potentially influential. In the plot of the first versus third principal component, Patients 114, 115, and 117 were again identified as outside the bulk of the data, as were Patients 57 and 115 in a plot of the second versus third principal component. A review of these patients' demographics failed to reveal anything unusual. It was decided that Patients 57, 114, 115, and 117 were too influential in the overall estimation of the model parameters and that they should be removed from the data set. [Pg.329]
