Estimation of Probability Densities

Estimation of the class-dependent probability densities p(x|m) is the most important problem in the implementation of a Bayes classifier [403]. [Pg.80]

A first approach is the estimation of p(x|m) for each pattern x to be classified by using known patterns in the neighbourhood of x. For these computations the KNN technique or potential functions may be used (Chapter 3). The advantage of this approach is that no special assumptions have to be made about the form of the probability density function; the disadvantage is that the whole set of known patterns is necessary for each classification. [Pg.80]
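The neighbourhood-based estimate can be illustrated with a minimal sketch. It assumes the common k-nearest-neighbour form, in which the density at x is k divided by the number of class patterns times the volume of the ball that reaches the k-th nearest pattern. The function names, the value of k and the training patterns are hypothetical and not taken from the source.

```python
import math

def ball_volume(radius, dim):
    """Volume of a dim-dimensional ball of the given radius."""
    return math.pi ** (dim / 2) / math.gamma(dim / 2 + 1) * radius ** dim

def knn_density(x, class_patterns, k=3):
    """k-nearest-neighbour estimate of p(x|m) from the known patterns of class m.

    Note: the whole set of known patterns is scanned for every query point.
    The estimate is undefined if the k-th nearest pattern coincides with x.
    """
    dists = sorted(math.dist(x, p) for p in class_patterns)
    radius = dists[k - 1]  # distance to the k-th nearest known pattern
    return k / (len(class_patterns) * ball_volume(radius, len(x)))

# Hypothetical training patterns of one class (two features each).
class_m = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3), (0.9, 0.8), (3.0, 3.0)]

near = knn_density((1.0, 1.0), class_m)  # inside the cluster
far = knn_density((4.0, 4.0), class_m)   # far from most patterns
print(near, far)
```

The estimate is large inside the cluster and small far away from it, without any assumption about the shape of the density.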

Another approach assumes that the actual probability density can be approximated by a mathematical function. The patterns of the training set are used to calculate the parameters of that function. [Pg.80]

A multidimensional Gaussian distribution is used, which is defined by the mean vector of a class and the covariance matrix [380, 389, 391]. [Pg.81]

Ellipsoidal clusters are well described by this function. [Pg.81]
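As a minimal sketch of the parametric approach, the code below estimates the mean vector and covariance matrix of one class from its training patterns and evaluates the resulting Gaussian density. It is restricted to two features so that the covariance matrix can be inverted explicitly; the function names and the data are hypothetical.

```python
import math

def fit_gaussian_2d(patterns):
    """Mean vector and (unbiased) covariance matrix of 2-D class patterns."""
    n = len(patterns)
    mx = sum(p[0] for p in patterns) / n
    my = sum(p[1] for p in patterns) / n
    sxx = sum((p[0] - mx) ** 2 for p in patterns) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in patterns) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in patterns) / (n - 1)
    return (mx, my), ((sxx, sxy), (sxy, syy))

def gaussian_density_2d(x, mean, cov):
    """Bivariate Gaussian density at x for the given mean and covariance."""
    (sxx, sxy), (_, syy) = cov
    det = sxx * syy - sxy * sxy
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    # Squared Mahalanobis distance via the explicit inverse of the 2x2 matrix.
    d2 = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return math.exp(-0.5 * d2) / (2 * math.pi * math.sqrt(det))

# Hypothetical training patterns of one class with correlated features.
class_m = [(1.0, 2.0), (2.0, 3.5), (3.0, 4.0), (4.0, 6.5), (5.0, 7.0)]
mean, cov = fit_gaussian_2d(class_m)

p_center = gaussian_density_2d(mean, mean, cov)
p_off_axis = gaussian_density_2d((5.0, 2.0), mean, cov)
print(mean, cov, p_center, p_off_axis)
```

Contours of equal density are ellipses whose orientation follows the covariance matrix, which is why a point off the main axis of the cluster receives a much lower density than a point at the same Euclidean distance along it.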

On the other hand, when the number of objects is low (remember that SIMCA has been developed for this special case) the use of a Kernel estimation of probability density can have no significance, as shown by the example of Fig. 34, where the true distribution is a rectangular one. [Pg.126]

Typical data sets in chemistry contain 20 to 100 objects with 3 to 20 features. This small number of objects is not sufficient for a reasonably secure estimation of probability densities. Hence the application of parametric methods is not possible. The use of nonparametric methods that make no assumptions about the underlying statistical distribution of the data is necessary. These methods, however, do not allow for statements about the confidence of the results. [Pg.49]

Murray, G. D., A note on the estimation of probability density functions, Biometrika, 64, 150-152 (1977). [Pg.93]

A statistically meaningful estimation of probability densities requires very large data sets. Therefore, chemical applications of parametric classification methods always include assumptions which are often not fulfilled or cannot be proved. A severe assumption is the statistical independence of the pattern components, which is certainly often not satisfied. Generation of new independent features is usually too laborious (Chapter 10). [Pg.87]

For the components used at each of the levels, a detailed finite element model and a dynamics model were built (see Fig. 3) and are used as the basis for training individual GP (Gaussian process) models. A significant amount of computation is needed when performing Bayesian updating, so a fast function evaluation is highly desirable. Some selected results of the BN implementation are shown below. For the estimation of probability density functions, 10,000 samples were used. These were in addition to 20,000 samples used as burn-in samples to allow the Markov chain to become stationary. [Pg.159]
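The role of the burn-in samples can be sketched with a simple random-walk Metropolis sampler. This is only an illustration under stated assumptions: the source does not say which sampler was used, and the target density, starting point and step size below are hypothetical. Only the sample counts (20,000 burn-in, 10,000 retained) follow the text.

```python
import math
import random

def metropolis(log_target, x0, n_burn_in, n_samples, step=1.0, seed=0):
    """Random-walk Metropolis sampler that discards the burn-in samples."""
    rng = random.Random(seed)
    x, logp = x0, log_target(x0)
    kept = []
    for i in range(n_burn_in + n_samples):
        proposal = x + rng.gauss(0.0, step)
        logp_new = log_target(proposal)
        # Accept with probability min(1, target(proposal) / target(x)).
        if rng.random() < math.exp(min(0.0, logp_new - logp)):
            x, logp = proposal, logp_new
        if i >= n_burn_in:  # keep samples only after the chain has settled
            kept.append(x)
    return kept

def log_posterior(x):
    """Hypothetical unnormalised log posterior: Gaussian, mean 2.0, sd 0.5."""
    return -0.5 * ((x - 2.0) / 0.5) ** 2

# Deliberately poor starting point so that the burn-in phase matters.
samples = metropolis(log_posterior, x0=10.0, n_burn_in=20_000, n_samples=10_000)
print(len(samples), sum(samples) / len(samples))
```

A histogram or kernel estimate built from the retained samples then approximates the posterior probability density function.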

Hall P, Hui TC, Marron JC (1995) Improved variable window kernel estimates of probability densities. Ann Stat 23:1-10. [Pg.326]

A number of methods allow the estimation of probability densities. (a) A multivariate Gaussian distribution can be assumed; the parameters are the class mean and the covariance matrix. (b) The p-dimensional probability density is estimated by the product of the probability densities of the p features, assuming they are independent. (c) The probability density at location x is estimated by a weighted sum of (Gaussian) kernel functions that have their centers at some prototype points of the class (neural network based on radial basis functions, RBF). (d) The probability density at location x is estimated from the neighboring objects (with known class memberships or known responses) by applying a voting scheme or by interpolation (KNN, Section 5.2). [Pg.357]
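Methods (b) and (c) can be sketched as follows. The sketch makes two assumptions that the text does not state: the per-feature densities in (b) are taken to be univariate Gaussians, and the kernels in (c) are isotropic with a common width. All names, prototype points, weights and data are hypothetical.

```python
import math
import statistics

def normal_pdf(x, mu, sigma):
    """Univariate Gaussian density."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def product_density(x, patterns):
    """(b) Product of per-feature densities, assuming independent features."""
    density = 1.0
    for j, xj in enumerate(x):
        column = [p[j] for p in patterns]
        density *= normal_pdf(xj, statistics.mean(column), statistics.stdev(column))
    return density

def kernel_density(x, prototypes, weights, width):
    """(c) Weighted sum of isotropic Gaussian kernels centred at prototypes.

    The weights should sum to 1 for the result to be a proper density.
    """
    norm = (2 * math.pi * width ** 2) ** (len(x) / 2)
    return sum(
        w * math.exp(-math.dist(x, c) ** 2 / (2 * width ** 2)) / norm
        for w, c in zip(weights, prototypes)
    )

# Hypothetical class patterns and prototype points.
class_m = [(1.0, 2.0), (1.2, 2.3), (0.8, 1.9), (1.1, 2.1), (0.9, 1.7)]
prototypes = [(1.0, 2.0), (1.1, 2.2)]
weights = [0.5, 0.5]

p_b = product_density((1.0, 2.0), class_m)
p_c = kernel_density((1.0, 2.0), prototypes, weights, width=0.3)
print(p_b, p_c)
```

Method (b) needs only p means and p standard deviations per class, at the price of ignoring correlations between features; method (c) can follow non-ellipsoidal shapes but requires a choice of prototypes and kernel width.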

Approximating multicanonical weights with the current estimate of the density of states, the update is accepted with probability min[1, g(E)/g(E')]. ... [Pg.600]
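A minimal sketch of such an acceptance test is given below. It assumes the usual flat-histogram form of the rule, min[1, g(E)/g(E')], with E the current energy and E' the proposed one, and it works with logarithms of the density of states because g typically spans many orders of magnitude. The function name and the numerical values are hypothetical.

```python
import math

def acceptance_probability(log_g_current, log_g_proposed):
    """min[1, g(E)/g(E')] computed from ln g to avoid overflow."""
    return math.exp(min(0.0, log_g_current - log_g_proposed))

# A move towards a less frequently visited energy (smaller g) is always accepted.
p_down = acceptance_probability(log_g_current=5.0, log_g_proposed=3.0)

# A move towards an energy with larger g is accepted only sometimes.
p_up = acceptance_probability(log_g_current=3.0, log_g_proposed=5.0)
print(p_down, p_up)
```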

A set of observed data points is assumed to be available as samples from an unknown probability density function. Density estimation is the construction of an estimate of the density function from the observed data. In parametric approaches, one assumes that the data belong to one of a known family of distributions and the required function parameters are estimated. This approach becomes inadequate when one wants to approximate a multimodal function, or for cases where the process variables exhibit nonlinear correlations [127]. Moreover, for most processes, the underlying distribution of the data is not known and most likely does not follow a particular class of density function. Therefore, one has to estimate the density function using a nonparametric (unstructured) approach. [Pg.65]

When estimating the probability density function of the number of fatalities, it was assumed (for reasons of simplicity) that the smallest probability goes with the largest number of fatalities and the largest probability with the lowest number of fatalities. This results in the FN-curve shown in figure 2. [Pg.1985]

The obtained conditional probability density functions (Fig. 4) are used to calculate characteristics of element criticality. The estimate of the probability of non-null criticality, P(X > 0 | e_k), given that the k-th element is out of order, k = 1, ..., 89, is presented in Figure 5. [Pg.185]

Retail inventory management is concerned with determining the amount and timing of receipts to inventory of a particular product at a retail location. Retail inventory management problems can be usefully segmented based on the ratio of the product's life cycle T to the replenishment lead-time L. If T/L < 1, then only a single receipt to inventory is possible at the start of the sales season. This is the case considered in the well-known newsvendor problem. At the other extreme, if T/L >> 1, then it's possible to assemble sufficient demand history to estimate the probability density function of demand and to apply one of several well-known approaches such as the (Q, R) model. [Pg.124]

We have applied this process at a catalog retailer and find that it improved over their current process for determining initial and replenishment quantities by enough to essentially double profits. Remarkably, compared to no replenishment, a single optimized replenishment improves profit by a factor of five. A key challenge in implementing short life-cycle replenishment is estimating a probability density function for demand with no demand history. To circumvent this problem in our application, we applied the committee-forecast process in Fisher and Raman (1996) and found that it worked well. [Pg.125]

We assume the number density to be 10 cm . The turbulent velocity is estimated to be of the order 1 km s^-1 from the FWHM of the Doppler-broadened lines of CS J=2-1 transitions (Mundy et al. 1988). With our measurement of the angular dispersion we obtain a value for the magnetic field of B 7.5 mG 1.5. We note, however, that this value is sensitive to our estimates of the density and turbulent velocity. The velocity could be as high as 3 km s^-1 and the density uncertain by a factor of 10. This gives a probable range of 1 - 40 mG for the strength of the magnetic field in OMC-1. [Pg.464]

When providing input for the STOMP calculation a range of values of porosity (and all of the other input parameters) should be provided, based on the measured data and estimates of how the parameters may vary away from the control points. The uncertainty associated with each parameter may be expressed in terms of a probability density function, and these may be combined to create a probability density function for STOMP. [Pg.159]

Combinations of weather conditions, wind speed and wind direction, along with boiling point, vapor density, diffusivity, and heat of vaporization of the chemical released, vary the health impact of the released chemical on the nearby population. To model a runaway reaction, the release of 10,000 gallons was assumed to occur over a 15-minute period. The concentration of the chemical released was estimated, using procedures described in Part III (Chapter 12), for each combination of weather condition, wind speed, and wind direction. The results, combined with population data for the area adjacent to the plant, led to probability estimates of the number of people affected. Table 21.5.3 summarizes the findings. [Pg.623]

Table 2.3 is used to classify the differing systems of equations, encountered in chemical reactor applications and the normal method of parameter identification. As shown, the optimal values of the system parameters can be estimated using a suitable error criterion, such as the methods of least squares, maximum likelihood or probability density function. [Pg.112]

The application of optimisation techniques for parameter estimation requires a useful statistical criterion (e.g., least squares). A very important criterion in non-linear parameter estimation is the likelihood or probability density function. This can be combined with an error model which allows the errors to be a function of the measured value. A simple but flexible and useful error model is used in SIMUSOLV (Steiner et al., 1986; Burt, 1989). [Pg.114]
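How a likelihood criterion can be combined with an error model may be sketched as follows. The sketch is not the SIMUSOLV implementation: it assumes Gaussian measurement errors whose standard deviation is s times |y|^(gamma/2), with y the measured value, a hypothetical first-order decay model exp(-k t), hypothetical data, and a plain grid search in place of a real optimiser.

```python
import math

def neg_log_likelihood(k, times, measured, s=0.05, gamma=2.0):
    """Negative log-likelihood of rate constant k under the assumed error model."""
    total = 0.0
    for t, y in zip(times, measured):
        sigma = s * abs(y) ** (gamma / 2)  # error grows with the measured value
        resid = y - math.exp(-k * t)       # measured minus model prediction
        total += math.log(sigma) + 0.5 * math.log(2 * math.pi)
        total += 0.5 * (resid / sigma) ** 2
    return total

# Hypothetical measurements, roughly following exp(-0.5 t).
times = [0.0, 1.0, 2.0, 3.0, 4.0]
measured = [1.02, 0.59, 0.38, 0.21, 0.14]

# Coarse grid search over candidate rate constants from 0.100 to 1.000.
candidates = [i / 1000 for i in range(100, 1001)]
k_hat = min(candidates, key=lambda k: neg_log_likelihood(k, times, measured))
print(k_hat)
```

With gamma = 0 the error is constant and the criterion reduces to ordinary least squares; with gamma = 2 the error is proportional to the measured value, so small concentrations are weighted as strongly as large ones.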

The knowledge required to implement Bayes' formula is daunting in that a priori as well as class-conditional probabilities must be known. Some reduction in requirements can be accomplished by using joint probability distributions in place of the a priori and class-conditional probabilities. Even with this simplification, few interpretation problems are so well posed that the information needed is available. It is possible to employ the Bayesian approach by estimating the unknown probabilities and probability density functions from exemplar patterns that are believed to be representative of the problem under investigation. This approach, however, implies supervised learning where the correct class label for each exemplar is known. The ability to perform data interpretation is determined by the quality of the estimates of the underlying probability distributions. [Pg.57]
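The estimation of priors and class-conditional densities from labelled exemplars can be sketched as below. The sketch assumes a single feature and Gaussian class-conditional densities, neither of which is prescribed by the text; the function names, class labels and exemplar values are hypothetical.

```python
import math
import statistics

def train(exemplars):
    """Estimate (prior, mean, sd) per class from (feature, label) exemplars."""
    by_class = {}
    for x, label in exemplars:
        by_class.setdefault(label, []).append(x)
    n = len(exemplars)
    return {
        label: (len(xs) / n, statistics.mean(xs), statistics.stdev(xs))
        for label, xs in by_class.items()
    }

def posterior(x, model):
    """Bayes' formula: posterior class probabilities for feature value x."""
    joint = {
        label: prior * math.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))
        for label, (prior, mu, sd) in model.items()
    }
    evidence = sum(joint.values())  # p(x), summed over all classes
    return {label: value / evidence for label, value in joint.items()}

# Hypothetical exemplars with known class labels (supervised learning).
exemplars = [(1.0, "A"), (1.2, "A"), (0.8, "A"),
             (3.0, "B"), (3.3, "B"), (2.7, "B"), (3.1, "B")]
model = train(exemplars)

post = posterior(1.0, model)
predicted = max(post, key=post.get)
print(post, predicted)
```

The classification is only as reliable as the estimated priors and densities, which here rest on three or four exemplars per class.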

Big Chemical Encyclopedia

Chemical substances, components, reactions, process design ...