Missing and censored data

Missing and censored data should be handled exactly as in the case of linear regression. The analyst can use complete case analysis, naive substitution, conditional mean substitution, maximum likelihood, or multiple imputation. The same advantages and disadvantages for these techniques that were present with linear regression apply to nonlinear regression. [Pg.121]

Data sets used to estimate distributions of model inputs often have a portion of the data missing because the measurements fell below the detection limit of the measurement instrument. Such data sets are said to be censored. Commonly used methods for dealing with them are statistically biased. One example is replacing non-detected values with one half of the detection limit. Such methods yield biased estimates of the mean and provide no insight into the population distribution from which the measured data are a sample. Statistical methods can be used to make inferences regarding both the observed and unobserved (censored) portions of an empirical data set. For example, maximum likelihood estimation can be used to fit parametric distributions to censored data sets, including the portion of the distribution that is below one or more detection limits. Asymptotically unbiased estimates of statistics, such as the mean, can then be obtained from the fitted distribution. Bootstrap simulation can be used to estimate uncertainty in the statistics of the fitted distribution (e.g., Zhao and Frey, 2004). Imputation methods, such as... [Pg.50]
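The maximum likelihood and bootstrap ideas described above can be sketched in Python. This is a minimal illustration, not the procedure of the cited authors. It assumes a lognormal population, a single known detection limit, and a nonparametric (resampling) bootstrap. All function names are hypothetical and chosen for this example only.

```python
import numpy as np
from scipy import optimize, stats


def censored_lognormal_nll(params, detected, n_censored, detection_limit):
    """Negative log-likelihood of lognormal data left-censored at one limit."""
    mu, log_sigma = params
    sigma = np.exp(log_sigma)  # optimize log(sigma) so that sigma stays positive
    # Detected values contribute the lognormal log-density.
    ll_detected = stats.norm.logpdf(np.log(detected), mu, sigma) - np.log(detected)
    # Each non-detect contributes log P(X < detection_limit).
    ll_censored = n_censored * stats.norm.logcdf(np.log(detection_limit), mu, sigma)
    return -(ll_detected.sum() + ll_censored)


def fit_censored_lognormal(detected, n_censored, detection_limit):
    """Return ML estimates (mu, sigma) of log(X) from censored data."""
    logs = np.log(detected)
    start = np.array([logs.mean(), np.log(logs.std() + 1e-6)])
    result = optimize.minimize(
        censored_lognormal_nll,
        start,
        args=(detected, n_censored, detection_limit),
        method="Nelder-Mead",
    )
    mu, log_sigma = result.x
    return mu, np.exp(log_sigma)


def lognormal_mean(mu, sigma):
    """Mean of a lognormal distribution with log-scale parameters mu, sigma."""
    return np.exp(mu + 0.5 * sigma ** 2)


def bootstrap_mean_interval(values, censored, detection_limit, n_boot=200, seed=0):
    """Percentile bootstrap interval (2.5%, 97.5%) for the fitted mean.

    `censored` is a boolean mask; entries of `values` where it is True
    are ignored, since only their count enters the likelihood.
    """
    rng = np.random.default_rng(seed)
    n = values.size
    means = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample observations with replacement
        v, c = values[idx], censored[idx]
        if (~c).sum() < 2:
            continue  # too few detected values to fit a distribution
        mu, sigma = fit_censored_lognormal(v[~c], int(c.sum()), detection_limit)
        means.append(lognormal_mean(mu, sigma))
    return np.percentile(means, [2.5, 97.5])


if __name__ == "__main__":
    # Synthetic data (an assumption for illustration): true mu = 0, sigma = 1.
    rng = np.random.default_rng(42)
    x = rng.lognormal(mean=0.0, sigma=1.0, size=200)
    limit = 0.5
    is_censored = x < limit

    mu_hat, sigma_hat = fit_censored_lognormal(
        x[~is_censored], int(is_censored.sum()), limit
    )
    # Half-detection-limit substitution, shown only for comparison.
    substituted = np.where(is_censored, limit / 2, x)

    print("ML mean estimate:       ", lognormal_mean(mu_hat, sigma_hat))
    print("DL/2 substitution mean: ", substituted.mean())
    print("Bootstrap 95% interval: ", bootstrap_mean_interval(x, is_censored, limit))
```

The non-detects enter the fit only through their count and the detection limit, so no value is ever substituted for them. Extending the sketch to several detection limits would require one cumulative-probability term per limit.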

Left-censored data are characteristic of many bioassays because every assay has a lower limit of detection and quantification. An ad hoc approach to dealing with the left-censored values is to replace them with the limit of quantification (LOQ) or LOQ/2. Alternatively, one can borrow information from other variables related to the missing values and use multiple imputation (MI) to estimate the left-censored data. In addition, the left-censoring mechanism can be incorporated directly into a parametric model, and a maximum likelihood (ML) approach can be used to estimate the parameters (21). [Pg.254]

Truly accounting for left-censored data requires a likelihood approach that defines the total likelihood as the product of the likelihood contributions from the observed data and the missing data (equivalently, the sum of their log-likelihoods) and then maximizes the total censored and uncensored likelihood with respect to the model parameters. In the simplest case with n independent observations that are not longitudinal in nature, m of which are below the lower limit of quantification (LLOQ), the likelihood equals... [Pg.297]
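The excerpt breaks off before giving the expression. For orientation only, one standard textbook form of a left-censored likelihood is shown here, and it may differ from the notation of the source. The symbols are assumptions introduced for this sketch: f(y | θ) is the density of an observation under the model, F(· | θ) is the corresponding cumulative distribution function, and the observations are ordered so that the first n − m are quantifiable.

```latex
L(\theta) = \prod_{i=1}^{n-m} f(y_i \mid \theta)
            \;\times\; \prod_{j=1}^{m} F(\mathrm{LLOQ} \mid \theta),
\qquad
\log L(\theta) = \sum_{i=1}^{n-m} \log f(y_i \mid \theta)
            \;+\; m \,\log F(\mathrm{LLOQ} \mid \theta)
```

On the log scale the expression separates into an observed-data term and a censored-data term, and their sum is the quantity maximized over θ.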

Replacement of missing values or censored data with any value is always risky, since this can substantially change the correlation in the data. It is possible to deal with both missing values and outliers simultaneously (Stanimirova et al., 2007). An excellent review dealing with zeros and missing values in compositional data sets using non-parametric imputation has been performed by Martin-Fernandez et al. (2003). [Pg.24]

Homogeneity and scale testing for small samples with censored and missing data... [Pg.848]

Big Chemical Encyclopedia
