Big Chemical Encyclopedia


Parameter Estimation and Statistical Testing of Models

3 Parameter Estimation and Statistical Testing of Models and Parameters in Single Reactions [Pg.112]

In the examples given above, reference was made to parameter estimation using regression methods and to the statistical testing of models and parameters. In the present section this topic will be presented in a systematic way. [Pg.112]

Let the kinetic model of the reaction relating the dependent variable y, the settings of the independent variables x, and the p parameters β be an algebraic equation. For n observations of y (which can be conversion or rate) [Pg.112]

In Section 2.6.1, dealing with the differential method of kinetic analysis, [Pg.112]

Estimates b for the parameters p are determined by minimization of the objective function — the sum of squares of residuals  [Pg.112]
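As an illustration of this minimization (using hypothetical data, not values from the text), estimates b for a model linear in the parameters can be obtained by ordinary least squares:

```python
import numpy as np

# Hypothetical: dependent variable y modeled as linear in two parameters,
# y = b1*x1 + b2*x2, observed at n = 4 settings of the independent variables.
X = np.array([[1.0, 0.5],
              [2.0, 1.0],
              [3.0, 1.5],
              [4.0, 1.0]])
y = np.array([2.0, 4.1, 6.0, 6.9])

# The least-squares estimates b minimize SSres = sum((y - X @ b)**2).
b, _, _, _ = np.linalg.lstsq(X, y, rcond=None)
residuals = y - X @ b
ss_res = residuals @ residuals
print(b, ss_res)
```
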


Statistical testing of model adequacy and significance of parameter estimates is a very important part of kinetic modelling. Only those models with a positive evaluation in statistical analysis should be applied in reactor scale-up. The statistical analysis presented below is restricted to linear regression and normal or Gaussian distribution of experimental errors. If the experimental error has a zero mean, constant variance and is independently distributed, its variance can be evaluated by dividing SSres by the number of degrees of freedom, i.e. [Pg.545]
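The error-variance estimate described above can be sketched as follows; the residuals and the parameter count p are hypothetical:

```python
import numpy as np

# Hypothetical residuals from a fit with p = 2 estimated parameters.
residuals = np.array([0.12, -0.08, 0.05, -0.11, 0.02, 0.09])
n, p = residuals.size, 2

ss_res = np.sum(residuals**2)
s2 = ss_res / (n - p)  # error-variance estimate: SSres over degrees of freedom
print(s2)
```
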

Nonlinear Models in Parameters, Single Reaction In practice, the parameters often appear in nonlinear form in the rate expressions, requiring nonlinear regression. Nonlinear regression does not guarantee optimal parameter estimates even if the kinetic model adequately represents the true kinetics and the data width is adequate. Further, the statistical tests of model adequacy apply rigorously only to models linear in parameters, and can only be considered approximate for nonlinear models. [Pg.38]
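A minimal sketch of nonlinear regression, assuming a hypothetical power-law rate expression r = k·C^n with invented data; note that the estimates depend on the starting guess, echoing the caveat above that a global optimum is not guaranteed:

```python
import numpy as np
from scipy.optimize import curve_fit

# Hypothetical rate expression, nonlinear in the parameter n.
def rate(C, k, n):
    return k * C**n

C = np.array([0.5, 1.0, 2.0, 4.0, 8.0])
r = np.array([0.35, 0.70, 1.45, 2.75, 5.80])

# Nonlinear least squares starting from p0; a poor p0 can yield a local optimum.
(k_hat, n_hat), cov = curve_fit(rate, C, r, p0=[1.0, 1.0])
print(k_hat, n_hat)
```
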

All component-wise calculated families of models (PCA, PLS, etc.) are by definition nested. Nestedness is computationally convenient, but not by definition a desirable property. Hierarchical relationships between models are convenient because they allow for a general framework. It is then possible to think of a continuum of models, with increasing complexity, where complexity is defined as the number of (free) parameters which have to be estimated. For example, model (5.1) is less complex than model (5.2), and if model (5.1) can describe the variation well, there is no need for the added complexity of model (5.2). Given a particular data set, it holds in general that adding complexity to the model increases the fit to the data but also increases the variance of the estimated parameters. Hence, there is an optimal model complexity balancing both properties. This is the basic rationale in many statistical tests of model complexity [Fujikoshi & Satoh 1997; Mallows 1973]. Hierarchy is a desirable property from a statistical point of view, because it makes comparisons between... [Pg.90]

So how is MI incorporated in the context of exploratory data analysis, since obviously one would not wish to analyze m different data sets? A simple method would be to impute m + 1 data sets, perform the exploratory analysis on one of the imputed data sets, and obtain the final model of interest. Then, using the remaining m data sets, compute the imputed parameter estimates and standard errors of the final model. It should be kept in mind, however, that with the imputed data set being used to develop the model, the standard errors will be smaller than they actually are, since this data set fails to take into account the sampling variability in the missing values. Hence, a more conservative test of statistical significance for either model entry or removal should be considered during model development. [Pg.90]
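The pooling step over the m imputed data sets can be sketched with Rubin's rules; the per-imputation estimates and within-imputation variances below are hypothetical:

```python
import numpy as np

# Hypothetical: one parameter estimate and its squared standard error
# from each of m = 5 imputed data sets.
estimates = np.array([1.02, 0.97, 1.05, 1.00, 0.96])
variances = np.array([0.040, 0.038, 0.042, 0.039, 0.041])
m = estimates.size

pooled = estimates.mean()          # pooled point estimate
W = variances.mean()               # within-imputation variance
B = estimates.var(ddof=1)          # between-imputation variance
T = W + (1 + 1/m) * B              # total variance (Rubin's rules)
print(pooled, np.sqrt(T))          # pooled estimate and its standard error
```
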

This sum, when divided by the number of data points minus the number of degrees of freedom, approximates the overall variance of errors. It is a measure of the overall fit of the equation to the data. Thus, two different models with the same number of adjustable parameters yield different values for this variance when fit to the same data with the same estimated standard errors in the measured variables. Similarly, the same model, fit to different sets of data, yields different values for the overall variance. The differences in these variances are the basis for many standard statistical tests for model and data comparison. Such statistical tests are discussed in detail by Crow et al. (1960) and Brownlee (1965). [Pg.108]
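One standard comparison of this kind can be sketched by referring the ratio of two error-variance estimates to an F distribution; the variances and degrees of freedom below are hypothetical:

```python
import numpy as np
from scipy import stats

# Hypothetical residual variances from two rival models fitted to the same data.
s2_a, df_a = 0.021, 18  # model A: error-variance estimate, degrees of freedom
s2_b, df_b = 0.034, 18  # model B

F = s2_b / s2_a                     # ratio of the two variance estimates
p_value = stats.f.sf(F, df_b, df_a)  # upper-tail probability under H0: equal variances
print(F, p_value)
```
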

It was decided to fit the model expressed by Equation 10.6 to the most recent 100 data points only (i.e., starting with sequence number 72) and then use the earlier data points to test the predictive capability of the fitted model (although the prediction is backward in time; see Section 3.5 and Exercise 4.19). Table 10.2 lists the parameter estimates and levels of confidence. Table 10.3 gives the ANOVA table and other statistics for the fitted model. [Pg.192]

A dynamic experimental method for the investigation of the behaviour of a nonisothermal-nonadiabatic fixed bed reactor is presented. The method is based on the analysis of the axial and radial temperature and concentration profiles measured under the influence of forced uncorrelated sinusoidal changes of the process variables. A two-dimensional reactor model is employed for the description of the reactor behaviour. The model parameters are estimated by statistical analysis of the measured profiles. The efficiency of the dynamic method is shown for the investigation of a pilot plant fixed bed reactor using the hydrogenation of toluene with a commercial nickel catalyst as a test reaction. [Pg.15]

The text reviews the methodology of kinetic analysis for simple as well as complex reactions. Attention is focused on the differential and integral methods of kinetic modelling. The statistical testing of the model and the parameter estimates required by the stochastic character of experimental data is described in detail and illustrated by several practical examples. Sequential experimental design procedures for discrimination between rival models and for obtaining parameter estimates with the greatest attainable precision are developed and applied to real cases. [Pg.215]

The process of research in chemical systems is one of developing and testing different models for process behavior. Whether empirical or mechanistic models are involved, the discipline of statistics provides data-based tools for discrimination between competing possible models, parameter estimation, and model verification for use in this enterprise. In the case where empirical models are used, techniques associated with linear regression (linear least squares) are used, whereas in mechanistic modeling contexts nonlinear regression (nonlinear least squares) techniques most often are needed. In either case, the statistical tools are applied most fruitfully in iterative strategies. [Pg.207]

The preciseness of the primary parameters can be estimated from the final fit of the multiexponential function to the data, but they are of doubtful validity if the model is severely nonlinear (35). The preciseness of the secondary parameters (in this case variability) is likely to be even less reliable. Consequently, the results of statistical tests carried out with preciseness estimated from the final fit could easily be misleading—thus the need to assess the reliability of model estimates. A possible way of reducing bias in parameter estimates and of calculating realistic variances for them is to subject the data to the jackknife technique (36, 37). The technique requires little by way of assumption or analysis. A naive Student t approximation for the standardized jackknife estimator (34) or the bootstrap (31, 38, 39) (see Chapter 15 of this text) can be used. [Pg.393]
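A delete-one jackknife can be sketched as follows; the estimator here (the mean of log-transformed data) is a hypothetical stand-in for a fitted model parameter:

```python
import numpy as np

# Simulated data; in practice these would be the experimental observations.
rng = np.random.default_rng(0)
data = rng.lognormal(mean=0.0, sigma=0.5, size=30)

def estimator(x):
    # Hypothetical stand-in for refitting the model and extracting a parameter.
    return np.mean(np.log(x))

n = data.size
theta_full = estimator(data)
# Leave-one-out estimates: refit with each observation deleted in turn.
theta_i = np.array([estimator(np.delete(data, i)) for i in range(n)])
theta_bar = theta_i.mean()

bias = (n - 1) * (theta_bar - theta_full)                   # jackknife bias estimate
theta_jack = theta_full - bias                              # bias-corrected estimate
var_jack = (n - 1) / n * np.sum((theta_i - theta_bar)**2)   # jackknife variance
print(theta_jack, np.sqrt(var_jack))
```
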

With this book the reader can expect to learn how to formulate and solve parameter estimation problems, compute the statistical properties of the parameters, perform model adequacy tests, and design experiments for parameter estimation or model discrimination. [Pg.447]

Before collecting data, at least two lean/rich cycles of 15-min lean and 5-min rich were completed for the given reaction condition. These cycle times were chosen so that the effluent from all reactors reached steady state. After the initial lean/rich cycles were completed, IR spectra were collected continuously during the switch from fuel rich to fuel lean and then back again to fuel rich. The collection time in the fuel lean and fuel rich phases was maintained at 15 and 5 min, respectively. The catalyst was tested for SNS at all the different reaction conditions, and a qualitative discussion of the results can be found in [75]. Quantitative analysis of the data required the application of statistical methods to separate the effects of the six factors and their interactions from the inherent noise in the data. Table 11.5 presents the coefficients for all the normalized parameters that were statistically significant. It includes the estimated coefficients for the linear model, similar to Eqn (2), of how SNS is affected by the reaction conditions. [Pg.339]

Section 1.6.2 discussed some theoretical distributions which are defined by more or less complicated mathematical formulae; they aim at modeling real empirical data distributions or are used in statistical tests. There are some reasons to believe that phenomena observed in nature indeed follow such distributions. The normal distribution is the most widely used distribution in statistics, and it is fully determined by the mean value μ and the standard deviation σ. For practical data these two parameters have to be estimated using the data at hand. This section discusses some possibilities to estimate the mean or central value, and the next section mentions different estimators for the standard deviation or spread; the described criteria are listed in Table 1.2. The choice of the estimator depends mainly on the data quality. Do the data really follow the underlying hypothetical distribution? Or are there outliers or extreme values that could influence classical estimators and call for robust counterparts? [Pg.33]
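The contrast between a classical and a robust location estimator can be sketched with one hypothetical outlier:

```python
import numpy as np

# Hypothetical replicate measurements; the last value is an outlier.
x = np.array([9.8, 10.1, 10.0, 9.9, 10.2, 25.0])

mean_x = np.mean(x)      # classical estimate, pulled toward the outlier
median_x = np.median(x)  # robust estimate, nearly unaffected by it
print(mean_x, median_x)
```
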

ML is the approach most commonly used to fit a distribution of a given type (Madgett 1998; Vose 2000). An advantage of ML estimation is that it is part of a broad statistical framework of likelihood-based statistical methodology, which provides statistical hypothesis tests (likelihood-ratio tests) and confidence intervals (Wald and profile likelihood intervals) as well as point estimates (Meeker and Escobar 1995). MLEs are invariant under parameter transformations (the MLE for some 1-to-1 function of a parameter is obtained by applying the function to the untransformed parameter). In most situations of interest to risk assessors, MLEs are consistent and sufficient (one distribution for which sufficient statistics of dimension fewer than n do not exist, MLEs or otherwise, is the Weibull distribution, which is not an exponential family). When MLEs are biased, the bias ordinarily disappears asymptotically (as data accumulate). ML may or may not require numerical optimization skills (for optimization of the likelihood function), depending on the distributional model. [Pg.42]
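For the normal distribution the MLEs have closed forms (no numerical optimization needed), and the invariance property mentioned above lets the MLE of σ be obtained from that of σ²; a sketch with simulated data:

```python
import numpy as np

# Simulated sample from a normal distribution with mu = 5, sigma = 2.
rng = np.random.default_rng(1)
x = rng.normal(loc=5.0, scale=2.0, size=1000)

mu_hat = x.mean()                     # MLE of mu: the sample mean
sigma2_hat = np.mean((x - mu_hat)**2)  # MLE of sigma^2: divisor n, not n - 1

# Invariance: the MLE of sigma is the square root of the MLE of sigma^2.
sigma_hat = np.sqrt(sigma2_hat)
print(mu_hat, sigma_hat)
```
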

Finally, we compute Glejser's test statistics for the three models discussed in Section 14.3.5. We regress e², |e|, and log e² on 1, X1, and X2. We use the White estimator for the covariance matrix of the parameter estimates in these regressions, as there is ample evidence now that the disturbances are heteroscedastic related to X2. To compute the Wald statistic, we need the two slope coefficients, which we denote q, and the 2×2 submatrix of the 3×3 covariance matrix of the coefficients, which we denote Vq. The statistic is W = q′Vq⁻¹q. For the three regressions, the values are 4.13, 6.51, and 6.60, respectively. The critical value from the chi-squared distribution with 2 degrees of freedom is 5.99, so the second and third are statistically significant while the first is not. [Pg.44]
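The Wald statistic W = q′Vq⁻¹q can be sketched as follows; the coefficient vector and covariance submatrix here are hypothetical, not the values from the text:

```python
import numpy as np
from scipy import stats

# Hypothetical slope coefficients q and their 2x2 covariance submatrix Vq.
q = np.array([0.31, -0.45])
Vq = np.array([[0.020, 0.004],
               [0.004, 0.035]])

W = q @ np.linalg.inv(Vq) @ q       # Wald statistic, chi-squared(2) under H0
crit = stats.chi2.ppf(0.95, df=2)   # 5% critical value, approximately 5.99
print(W, W > crit)
```
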

Statistical analysis of the results was performed using the software Statistica 5.5 (StatSoft). Maximum lipase activities and the times to reach the maximum were calculated through fitting of kinetic curves. The maximum was estimated by differentiation of the fits. Empirical models were built to fit maximum lipase activity as a function of incubation temperature (T), moisture of the cake (%M), and supplementation (%00). The experimental error estimated from the duplicates was considered in the parameter estimation. The choice of the best model to describe the influence of the variables on lipase activity was based on the correlation coefficient (r²) and on the χ² test. The model that best fits the experimental data is presented in Table 2. [Pg.179]
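A sketch of the r² criterion used for such model choice, with hypothetical observed and fitted activity values:

```python
import numpy as np

# Hypothetical observed responses and the corresponding fitted model values.
y = np.array([12.0, 15.5, 19.8, 24.1, 27.9])
y_fit = np.array([12.3, 15.1, 20.0, 24.4, 27.5])

ss_res = np.sum((y - y_fit)**2)       # residual sum of squares
ss_tot = np.sum((y - y.mean())**2)    # total sum of squares about the mean
r2 = 1 - ss_res / ss_tot              # fraction of variation explained by the model
print(r2)
```
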

