
Statistical distributions maximum likelihood

In the maximum-likelihood method used here, the "true" value of each measured variable is also found in the course of parameter estimation. The differences between these "true" values and the corresponding experimentally measured values are the residuals (also called deviations). When there are many data points, the residuals can be analyzed by standard statistical methods (Draper and Smith, 1966). If, however, there are only a few data points, examination of the residuals for trends, when plotted versus other system variables, may provide valuable information. Often these plots can indicate at a glance excessive experimental error, systematic error, or "lack of fit." Data points that are obviously bad can also be readily detected. If the model is suitable and if there are no systematic errors, such a plot shows the residuals randomly distributed with zero means. This behavior is shown in Figure 3 for the ethyl acetate/n-propanol data of Murti and Van Winkle (1958), fitted with the van Laar equation. [Pg.105]
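A minimal sketch of such a residual check (hypothetical data and variable names, not those of Murti and Van Winkle): plot the residuals against a system variable and inspect them for random scatter about zero.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical arrays: x is a system variable (e.g., liquid composition),
# y_meas the measured values, y_true the "true" values recovered during
# maximum-likelihood parameter estimation.
x = np.linspace(0.05, 0.95, 12)
y_true = 0.8 * x + 0.1
rng = np.random.default_rng(7)
y_meas = y_true + rng.normal(0.0, 0.01, size=x.size)

residuals = y_meas - y_true

# A suitable model with no systematic error should give residuals that
# scatter randomly about zero when plotted against any system variable.
print("mean residual:", residuals.mean())
plt.axhline(0.0, color="gray")
plt.plot(x, residuals, "o")
plt.xlabel("system variable x")
plt.ylabel("residual")
plt.show()
```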

The converged parameter values represent the Least Squares (LS), Weighted LS or Generalized LS estimates depending on the choice of the weighting matrices Q_i. Furthermore, if certain assumptions regarding the statistical distribution of the residuals hold, these parameter values could also be the Maximum Likelihood (ML) estimates. [Pg.53]
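In generic notation (not necessarily the source's symbols), the objective minimized over the parameters θ has the form

```latex
S(\theta) \;=\; \sum_{i=1}^{N} \mathbf{e}_i^{\mathsf{T}}\, Q_i\, \mathbf{e}_i,
\qquad
\mathbf{e}_i \;=\; \mathbf{y}_i - \mathbf{f}(\mathbf{x}_i, \theta),
```

where Q_i = I gives ordinary LS, a diagonal Q_i gives weighted LS, and Q_i equal to the inverse of the measurement-error covariance matrix gives generalized LS; under normally distributed errors this last choice makes the minimizer the ML estimate.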

These considerations raise a question: how can we determine the optimal value of n and the coefficients for i < n in (2.54) and (2.56)? Clearly, if the expansion is truncated too early, some terms that contribute importantly to P0(ΔU) will be lost. On the other hand, terms above some threshold carry no information and, instead, only add statistical noise to the probability distribution. One solution to this problem is to use physical intuition [40]. Perhaps a better approach is one based on the maximum likelihood (ML) method, in which we determine the maximum number of terms supported by the provided information. For the expansion in (2.54), estimating the number of Gaussian functions and their mean values and variances using ML is a standard problem solved in many textbooks on Bayesian inference [43]. For the expansion in (2.56), the ML solution for n and the coefficients also exists. Just as in the case of the multistate Gaussian model, this approach appears to improve the free energy estimates considerably when P0(ΔU) is a broad function. [Pg.65]
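A hedged illustration of choosing the number of Gaussian terms, using the Bayesian information criterion from scikit-learn as a stand-in selection rule (not the specific ML procedure of [43]):

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Hypothetical sample playing the role of Delta-U values whose
# distribution P0 is to be represented by a sum of Gaussians.
rng = np.random.default_rng(3)
du = np.concatenate([rng.normal(-2.0, 0.5, 400), rng.normal(1.0, 1.0, 600)])
du = du.reshape(-1, 1)

# Adding components beyond what the data support only fits noise; a
# penalized criterion such as BIC flags where the gain stops.
for n in range(1, 6):
    gm = GaussianMixture(n_components=n, random_state=0).fit(du)
    print(n, "components, BIC =", round(gm.bic(du), 1))
```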

If the errors are normally distributed, the OLS estimates are the maximum likelihood estimates of θ, and the estimates are unbiased and efficient (minimum variance estimates) in the statistical sense. However, if there are outliers in the data, the underlying distribution is not normal and the OLS estimates will be biased. To solve this problem, a more robust estimation method is needed. [Pg.225]
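A minimal sketch of the contrast, using SciPy's least_squares with a Huber loss as one example of a robust alternative (the straight-line data and the outlier are made up):

```python
import numpy as np
from scipy.optimize import least_squares

# Straight-line data with one gross outlier.
x = np.arange(10.0)
y = 2.0 * x + 1.0
y[7] += 25.0  # outlier

def residuals(theta):
    slope, intercept = theta
    return slope * x + intercept - y

ols = least_squares(residuals, x0=[1.0, 0.0])                  # plain OLS
rob = least_squares(residuals, x0=[1.0, 0.0],
                    loss="huber", f_scale=1.0)                 # robust fit

print("OLS estimate   :", ols.x)   # dragged toward the outlier
print("robust estimate:", rob.x)   # close to (2, 1)
```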

Mendal et al. (1993) compared eight tests of normality to detect a mixture consisting of two normally distributed components with different means but equal variances. Fisher's skewness statistic was preferable when one component comprised less than 15% of the total distribution. When the two components comprised more nearly equal proportions (35-65%) of the total distribution, the Engelman and Hartigan test (1969) was preferable. For other mixing proportions, the maximum likelihood ratio test was best. Thus, the maximum likelihood ratio test appears to perform very well, with only a small loss of optimality, even when it is not the best procedure. [Pg.904]

W = ΔG. Of course, this relation can be tested only in the region of work values along the work axis where both distributions (forward and reverse) overlap. An overlap between the forward and reverse distributions is hardly observed if the molecules are pulled too fast or if the number of pulls is too small. In such cases, other statistical methods (Bennett's acceptance ratio or maximum likelihood methods, Section IV.B.3) can be applied to get reliable estimates of ΔG. The validity of the CFT has been tested in the case of the RNA hairpin CD4 previously mentioned and the three-way junction RNA molecule as well. Figure 9c,d and Fig. 10c show results for these two molecules. [Pg.72]

The log-likelihood function at the maximum likelihood estimates is -28.993171. For the model with only a constant term, the value is -31.19884. The t statistic for testing the hypothesis that β equals zero is 5.16577/2.51307 = 2.056. This is a bit larger than the critical value of 1.96, though our use of the asymptotic distribution for a sample of 10 observations might be a bit optimistic. The chi-squared value for the likelihood ratio test is 4.411, which is larger than the 95% critical value of 3.84, so the hypothesis that β equals zero is rejected on the basis of these two tests. [Pg.110]
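Both test statistics quoted above can be reproduced directly from the reported numbers; a quick check (SciPy supplies the critical values):

```python
from scipy.stats import norm, chi2

loglik_full = -28.993171   # log-likelihood at the ML estimates
loglik_null = -31.19884    # constant-only model
beta_hat, se = 5.16577, 2.51307

t_stat = beta_hat / se                       # 2.056
lr_stat = 2.0 * (loglik_full - loglik_null)  # 4.411

print("t  =", round(t_stat, 3), " vs", round(norm.ppf(0.975), 2))
print("LR =", round(lr_stat, 3), " vs", round(chi2.ppf(0.95, df=1), 2))
```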

Maximum likelihood (ML): A general statistical procedure to estimate one or more parameters (e.g., recombination fraction) of a distribution, provided that the distribution is specified. [Pg.573]

There are often data sets used to estimate distributions of model inputs for which a portion of data are missing because attempts at measurement were below the detection limit of the measurement instrument. These data sets are said to be censored. Commonly used methods for dealing with such data sets are statistically biased. An example is replacing non-detected values with one half of the detection limit. Such methods cause biased estimates of the mean and do not provide insight regarding the population distribution from which the measured data are a sample. Statistical methods can be used to make inferences regarding both the observed and unobserved (censored) portions of an empirical data set. For example, maximum likelihood estimation can be used to fit parametric distributions to censored data sets, including the portion of the distribution that is below one or more detection limits. Asymptotically unbiased estimates of statistics, such as the mean, can then be obtained from the fitted distribution. Bootstrap simulation can be used to estimate uncertainty in the statistics of the fitted distribution (e.g. Zhao & Frey, 2004). Imputation methods, such as... [Pg.50]
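A minimal sketch of such a fit, assuming a lognormal population and a single detection limit (the sample values, the detection limit and the number of non-detects below are hypothetical):

```python
import numpy as np
from scipy import optimize, stats

# Hypothetical censored data set: six detected values and four samples
# reported only as "below the detection limit" dl.
detected = np.array([1.2, 0.8, 2.5, 1.9, 3.1, 0.9])
n_censored = 4
dl = 0.5

def neg_log_lik(theta):
    mu, sigma = theta
    if sigma <= 0:
        return np.inf
    # Detected observations contribute their log-density; each non-detect
    # contributes the log-probability of falling below the detection limit.
    ll = stats.lognorm.logpdf(detected, s=sigma, scale=np.exp(mu)).sum()
    ll += n_censored * stats.lognorm.logcdf(dl, s=sigma, scale=np.exp(mu))
    return -ll

res = optimize.minimize(neg_log_lik, x0=[0.0, 1.0], method="Nelder-Mead")
mu_hat, sigma_hat = res.x
# Mean of the fitted lognormal: an asymptotically unbiased estimate that
# uses the censored portion of the distribution instead of substituting
# a constant for the non-detects.
print("fitted mean:", np.exp(mu_hat + sigma_hat**2 / 2))
```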

Since this monograph is devoted only to the conception of mathematical models, the inverse problem of estimation is not fully detailed. Nevertheless, estimating the parameters of the models is crucial for verification and applications. Any parameter in a deterministic model can be sensibly estimated from time-series data only by embedding the model in a statistical framework. This is usually done by assuming that, instead of exact measurements of concentration, we have these values blurred by observation errors that are independent and normally distributed. The parameters in the deterministic formulation are then estimated by nonlinear least-squares or maximum likelihood methods. [Pg.372]
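A minimal sketch under exactly those assumptions, with first-order decay as a stand-in deterministic model and SciPy's curve_fit performing the nonlinear least-squares step:

```python
import numpy as np
from scipy.optimize import curve_fit

# Deterministic model: first-order decay of a concentration.
def model(t, c0, k):
    return c0 * np.exp(-k * t)

t = np.linspace(0.0, 10.0, 20)
rng = np.random.default_rng(2)
c_obs = model(t, 5.0, 0.4) + rng.normal(0.0, 0.1, size=t.size)  # blurred data

# With independent, identically distributed Gaussian errors, nonlinear
# least squares coincides with maximum likelihood for (c0, k).
theta_hat, cov = curve_fit(model, t, c_obs, p0=[1.0, 0.1])
print("estimates      :", theta_hat)
print("standard errors:", np.sqrt(np.diag(cov)))
```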

A variety of techniques is nowadays available for the solution of inverse problems [26,27]. However, one common approach relies on the minimization of an objective function that generally involves the squared difference between measured and estimated variables, like the least-squares norm, as well as some kind of regularization term. Despite the fact that minimization of the least-squares norm is used indiscriminately, it only yields maximum likelihood estimates if the following statistical hypotheses are valid: the errors in the measured variables are additive, uncorrelated, and normally distributed, with zero mean and known, constant standard deviation; only the measured variables appearing in the objective function contain errors; and there is no prior information regarding the values and uncertainties of the unknown parameters. [Pg.44]
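Written out in generic notation (P for the unknowns, Y for the measurements, F(P) for the model predictions; these symbols are illustrative, not necessarily the source's), such an objective function reads

```latex
S(\mathbf{P}) \;=\; \bigl[\mathbf{Y} - \mathbf{F}(\mathbf{P})\bigr]^{\mathsf{T}}
\bigl[\mathbf{Y} - \mathbf{F}(\mathbf{P})\bigr]
\;+\; \alpha\,\Omega(\mathbf{P}),
```

where Ω is a regularization functional and α its weight; with α = 0 and the hypotheses listed above satisfied, minimizing S(P) yields the maximum likelihood estimate.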

Maximum likelihood was first presented by R.A. Fisher (1921) (when he was 22 years old) and is the backbone of statistical estimation. The object of maximum likelihood is to make inferences about the parameters θ of a distribution given a set of observed data. Maximum likelihood is an estimation procedure that finds an estimate of θ (an estimator, denoted θ̂) such that the likelihood of actually observing the data is maximal. The Likelihood Principle holds that all the information contained in the data can be summarized by a likelihood function. The standard approach (when a closed-form solution can be obtained) is to derive the likelihood function, differentiate it with respect to the model parameters, set the resulting equations equal to zero, and then solve for the model parameters. Often, however, a closed-form solution cannot be obtained, in which case optimization is used to find the set of parameter values that maximizes the likelihood (hence the name). [Pg.351]
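A minimal sketch of the numerical route for a case where the closed form is also known (a normal distribution, so the optimizer's answer can be checked against the sample mean and the 1/n standard deviation):

```python
import numpy as np
from scipy import optimize, stats

# Synthetic sample from N(5, 2^2); theta = (mu, sigma) is to be estimated.
rng = np.random.default_rng(0)
data = rng.normal(loc=5.0, scale=2.0, size=200)

def neg_log_lik(theta):
    mu, sigma = theta
    if sigma <= 0:
        return np.inf
    # Maximizing the likelihood = minimizing the negative log-likelihood.
    return -stats.norm.logpdf(data, loc=mu, scale=sigma).sum()

res = optimize.minimize(neg_log_lik, x0=[0.0, 1.0], method="Nelder-Mead")
print("numerical MLE:", res.x)
print("closed form  :", data.mean(), data.std(ddof=0))
```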

An overview of statistical methods covers mean values, standard deviation, variance, confidence intervals, Student's t distribution, error propagation, parameter estimation, objective functions, and maximum likelihood. [Pg.73]

Holland, D. M., and Fitz-Simons, T. (1982) Fitting statistical distributions to air quality data by the maximum likelihood method, Atmos. Environ. 16, 1071-1076. [Pg.1173]

In essence, this principle is the well-known Method of Maximum Likelihood (MML). The PDF, written as a function of the measurements, is called the likelihood function. The MML is one of the strategic principles of statistical estimation and provides statistically the best solution in many senses [27]. For example, the asymptotic error distribution (for an infinite number of realizations of f) of the MML estimates of a has the smallest possible variances of the components a_i. [Pg.70]

Difficulties arise when the concentration of an element is below the detection limit (DL). It is often standard practice to report these data simply as < DL. The appropriate treatment of such values generally depends on the amount of data below the detection limit, the size of the data set, and the probability distribution of the measurements. When the number of < DL observations is small, replacing them with a constant is generally satisfactory (Clarke, 1998). The values that are commonly used to replace the < DL values are 0, DL, or DL/2. Distributional methods such as marginal maximum likelihood estimation (Chung, 1993) or more robust techniques (Helsel, 1990) are often required when a large number of < DL observations are present. [Pg.23]

The fission process is less well understood than the others, and there is not a good theoretical argument for a particular statistical distribution. Porter and Thomas (42) predict that it should be a fairly broad distribution corresponding to a small value of v (say, in the range 1 to 4) rather than a very high value, as is the case with capture widths. One is forced to rely on the experimental data in spite of the poor statistics. Fischer (32) analyzed 12 resonances of and 19 resonances of Pu, both by the moments method (79) and the maximum likelihood method (87), with the results shown in Table V. [Pg.158]

A further argument concerns the parameter v for the neutron width distribution, as obtained by the maximum likelihood method from the three experiments. For both of the earlier experiments, v is amazingly close to the theoretical value of one, although the statistical uncertainty is at least 10%. Garg et al. (47) report that they found v = 0.89 if all the levels are counted, and v = 1.13 if the uncertain levels are excluded. One could speculate that inclusion of about half of the uncertain levels would yield the desired value v = 1, which corresponds to an average spacing of about 19.3 eV. Thus, one could favor a value in the range 18.5 to 19.3 eV for use in calculations. [Pg.164]
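As a hedged illustration of this kind of estimate: widths following a chi-squared (Porter-Thomas-type) distribution with v degrees of freedom can have v recovered by maximum likelihood, here with SciPy's built-in fitter on synthetic data:

```python
import numpy as np
from scipy import stats

# Synthetic reduced widths drawn from a chi-squared distribution with
# v = 1 degree of freedom (the Porter-Thomas case).
widths = stats.chi2.rvs(df=1.0, size=200, random_state=42)

# ML fit of v, with location and scale held fixed so that only the shape
# parameter (the quantity of interest) is estimated.
v_hat, loc, scale = stats.chi2.fit(widths, floc=0, fscale=1)
print("ML estimate of v:", round(v_hat, 2))
```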

Roughly 250 data points are required to fit the generalized hyperbolic distributions, although about 100 data points can still give reasonable results. While the maximum-likelihood estimation method can in principle be used to estimate the parameters, it is very difficult to solve such a complicated nonlinear system of five equations in five unknown parameters directly. Numerical algorithms, such as the modified Powell method, are therefore suggested (Wang, 2005). The Kolmogorov-Smirnov statistic can also be used here for the goodness-of-fit test. [Pg.397]
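A minimal sketch of that workflow on a simpler two-parameter family (a gamma distribution standing in for the generalized hyperbolic case, which SciPy does not provide directly): maximize the likelihood with Powell's derivative-free method, then compute the Kolmogorov-Smirnov statistic. Note that the KS p-value is only approximate when the parameters were estimated from the same data.

```python
import numpy as np
from scipy import optimize, stats

# Synthetic sample; the gamma family is a hypothetical stand-in here.
rng = np.random.default_rng(1)
data = rng.gamma(shape=2.0, scale=1.5, size=300)

def neg_log_lik(theta):
    a, scale = theta
    if a <= 0 or scale <= 0:
        return 1e12  # finite penalty keeps the search in the valid region
    return -stats.gamma.logpdf(data, a=a, scale=scale).sum()

# Powell's method needs no derivatives of the awkward likelihood system.
res = optimize.minimize(neg_log_lik, x0=[1.0, 1.0], method="Powell")
a_hat, scale_hat = res.x

# Kolmogorov-Smirnov statistic as a goodness-of-fit check.
ks = stats.kstest(data, "gamma", args=(a_hat, 0.0, scale_hat))
print("ML estimates:", res.x)
print("KS statistic:", ks.statistic, " p-value:", ks.pvalue)
```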

Saunders, S. C. and J. M. Myhre, Maximum Likelihood Estimation for Two-Parameter Decreasing Hazard Rate Distributions Using Censored Data, J. American Statistical Association, 78, 664 (1983). [Pg.429]

We assume that the errors in the measurements are statistically independent, scaled by the weights w_i in such a way that they have equal variance (σ²), and come from a Gaussian distribution. Under these reasonable assumptions, weighted least squares coincides with the maximum likelihood estimate. The (weighted) experimental errors of the measurements are given as in (6.2). This means that the covariance matrix of the experimental errors is given by ... [Pg.232]
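In generic notation (w_i for the weights and σ² for the common variance; illustrative symbols, not necessarily the source's), the assumption and its consequence can be written

```latex
\epsilon_i = w_i\,\eta_i, \qquad \eta_i \sim \mathcal{N}(0,\sigma^2)\ \text{i.i.d.}
\quad\Longrightarrow\quad
\operatorname{cov}(\boldsymbol{\epsilon}) \;=\; \sigma^2\,
\operatorname{diag}\bigl(w_1^2,\dots,w_m^2\bigr),
```

so that minimizing the weighted sum of squares, with each residual divided by its weight w_i, is equivalent to maximizing the Gaussian likelihood.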

