Sample statistics and population parameters

Finite or infinite set of individuals (objects, items). A population implicitly contains all the useful information for calculating the true values of the population parameters, e.g., the mean μ and the standard deviation σ. Frank and Todeschini [1994]. → Sample (in the statistical sense) [Pg.317]

Dispersion parameter for the distribution of measured values or of analytical results: s² for a given sample, σ² for the population. Statistically defined as the second moment about the mean. [Pg.329]

In most natural situations, physical and chemical parameters are not defined by a unique deterministic value. Due to our limited comprehension of the natural processes and imperfect analytical procedures (notwithstanding the interaction of the measurement itself with the process investigated), measurements of concentrations, isotopic ratios and other geochemical parameters must be considered as samples taken from an infinite reservoir or population of attainable values. Defining random variables in a rigorous way would require a rather lengthy development of probability spaces and measure theory, which is beyond the scope of this book. For that purpose, the reader is referred to any of the many excellent standard textbooks on probability and statistics (e.g., Hamilton, 1964; Hoel et al., 1971; Lloyd, 1980; Papoulis, 1984; Dudewicz and Mishra, 1988). For most practical purposes, the statistical analysis of geochemical parameters will be restricted to the field of continuous random variables. [Pg.173]

If the value of a sample statistic θ̂ is used to estimate a parameter θ of the population, this statistic is called an estimator and its value for the sample the estimate. Sample mean x̄ and variance s² are the usual estimators of the population mean μ and... [Pg.185]

The statistical methods discussed up to now have required certain assumptions about the populations from which the samples were obtained. Among these was that the population could be approximated by a normal distribution and that, when dealing with several populations, these have the same variance. There are many situations where these assumptions cannot be met, and methods have been developed that are not concerned with specific population parameters or the distribution of the population. These are referred to as non-parametric or distribution-free methods. They are the appropriate methods for ordinal data and for interval data where the requirements of normality cannot be assumed. A disadvantage of these methods is that they are less efficient than parametric methods. By less efficient is meant... [Pg.305]

From the table of random numbers, draw 20 different samples of 10 random numbers each. Determine the sample mean and sample variance for each sample. Calculate the averages of the statistics obtained and compare them to the population parameters. [Pg.7]
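The exercise above can also be tried by simulation instead of a printed table. The following is a minimal sketch, assuming the table entries behave like uniformly distributed digits 0 to 9 (an assumption; the source does not describe the table). For that assumed population the mean is 4.5 and the variance is 8.25. The function name and the seed are illustrative only.

```python
import random
import statistics

# Population parameters of a uniformly distributed digit 0..9 (assumed population)
POP_MEAN = 4.5
POP_VARIANCE = 8.25  # (10**2 - 1) / 12


def sample_statistics(n_samples=20, sample_size=10, seed=1):
    """Draw n_samples samples of random digits; return their means and variances."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    means, variances = [], []
    for _ in range(n_samples):
        sample = [rng.randint(0, 9) for _ in range(sample_size)]
        means.append(statistics.mean(sample))
        variances.append(statistics.variance(sample))  # n - 1 denominator
    return means, variances


means, variances = sample_statistics()
print("average of sample means:    ", statistics.mean(means), "vs", POP_MEAN)
print("average of sample variances:", statistics.mean(variances), "vs", POP_VARIANCE)
```

Individual sample means and variances scatter noticeably around 4.5 and 8.25, while their averages over the 20 samples usually lie closer to the population values.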

Statistical estimation uses sample data to obtain the best possible estimate of population parameters. The p value of the binomial distribution, the μ value in Poisson's distribution, or the μ and σ values in the normal distribution are called parameters. Accordingly, to stress it once again, the part of mathematical statistics that deals with estimating the parameters of a population's probability distribution from sample statistics is called estimation theory. In addition, estimation furnishes a quantitative measure of the probable error involved in the estimate. As a result, the engineer not only has made the best use of the data, but also has a numerical estimate of the accuracy of the results. [Pg.30]

We need to make a decision related to the disposition of soil that has been excavated from the subsurface at a site with a history of lead contamination. Excavated soil suspected of containing lead has been stockpiled. We may use this soil as backfill (i.e. place it back into the ground) if the mean lead concentration in it is below the action level of 100 milligrams per kilogram (mg/kg). To decide whether the soil is acceptable as backfill, we will sample the soil and analyze it for lead. The mean concentration of lead in soil will represent the statistical population parameter. [Pg.22]

The mean TEQ concentration for all samples collected from the entire area will be the statistical parameter that characterizes the target population. A statistical evaluation will be conducted for the TEQ concentrations for all of the samples collected from the entire area. The mean TEQ concentration for each grid will be compared to the action level for a yes or no decision. This project is not based on a probabilistic sampling design and does not have statistical parameters. The parameters that characterize the population of interest are specified in the NPDES discharge permit. They are the VOC concentrations in every effluent sample collected. [Pg.24]

Then, given a model for data from a specific drug in a sample from a population, mixed-effect modeling produces estimates for the complete statistical distribution of the pharmacokinetic-dynamic parameters in the population. In particular, the variance in the pharmacokinetic-dynamic parameter distributions is a measure of the extent of inherent interindividual variability for the particular drug in that population (adults, neonates, etc.). The distribution of residual errors in the observations, with respect to the mean pharmacokinetic or pharmacodynamic model, reflects measurement or assay error, model misspecification, and, more rarely, temporal dependence of the parameters. [Pg.312]

On many occasions, sample statistics are used to provide an estimate of the population parameters. It is extremely useful to indicate the reliability of such estimates. This can be done by putting a confidence limit on the sample statistic. The most common application is to place confidence limits on the mean of a sample from a normally distributed population. This is done by working out the limits as $\bar{Y} - (t_{P[n-1]} \times SE)$ and $\bar{Y} + (t_{P[n-1]} \times SE)$, where $t_{P[n-1]}$ is the tabulated critical value of Student's t statistic for a two-tailed test with n − 1 degrees of freedom and SE is the standard error of the mean (p. 268). A 95% confidence limit (i.e. P = 0.05) tells you that on average, 95 times out of 100, this limit will contain the population... [Pg.278]
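The limits described above (sample mean minus and plus the critical t value times SE) can be sketched as follows. This is a minimal illustration, not the source's own procedure: the small lookup table holds a few standard two-tailed critical values of Student's t at P = 0.05, the function name is hypothetical, and the measurements are invented for the example.

```python
import math
import statistics

# Two-tailed critical values of Student's t at P = 0.05, keyed by degrees of freedom.
# Only a few rows of a standard t table are reproduced here.
T_CRIT_95 = {4: 2.776, 9: 2.262, 19: 2.093, 29: 2.045}


def confidence_limits_95(values):
    """Return (lower, upper) 95% confidence limits for the mean of `values`."""
    n = len(values)
    df = n - 1
    if df not in T_CRIT_95:
        raise ValueError(f"no tabulated t value for {df} degrees of freedom")
    mean = statistics.mean(values)
    se = statistics.stdev(values) / math.sqrt(n)  # standard error of the mean
    half_width = T_CRIT_95[df] * se
    return mean - half_width, mean + half_width


# Hypothetical measurements (n = 5, so 4 degrees of freedom)
data = [10.1, 9.8, 10.4, 10.0, 9.7]
lower, upper = confidence_limits_95(data)
print(f"95% confidence limits: {lower:.3f} to {upper:.3f}")
```

The lookup table keeps the sketch free of third-party dependencies; in practice the critical value would be read from a full t table or a statistics library.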

To the statistician the process of sampling consists of drawing from a population a finite number of units to be examined. From sample statistics, such as mean and standard deviation, estimates are made of the population parameters. By appropriate tests of significance, confidence limits are placed on the estimates. Sampling for chemical analysis is an example of statistical sampling in that conclusions are drawn about the composition of a much larger bulk of material from an analysis of a limited sample. [Pg.565]

It is impossible to conduct an infinite number of extractions of a specimen to determine the accuracy of a method. As a result, we estimate the accuracy of an assay by performing a finite number of extractions (n) on the specimen. We report the accuracy as the mean ($\bar{x} = \sum x_i / n$, $i = 1, 2, \ldots, n$) of the multiple determinations, expressed as a percent of the known concentration. The finite group of determinations is a sample from the population, and its mean is referred to as the sample mean. The sample mean is a statistic that estimates the population parameter μ. If we could obtain the means from an infinite number of same-size samples, regardless of their size, then the mean of these infinite sample means would equal μ. In statistical terminology, we say that the sample mean is an unbiased estimator of the population mean. Unbiasedness is a... [Pg.3484]
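The accuracy calculation described above (sample mean of n determinations, expressed as a percent of the known concentration) can be sketched in a few lines. The function name and the numbers are hypothetical and serve only to illustrate the arithmetic.

```python
def mean_recovery_percent(determinations, known_concentration):
    """Sample mean of the determinations, as a percent of the known concentration."""
    if not determinations:
        raise ValueError("at least one determination is required")
    if known_concentration <= 0:
        raise ValueError("known concentration must be positive")
    sample_mean = sum(determinations) / len(determinations)  # x-bar = sum(x_i) / n
    return 100.0 * sample_mean / known_concentration


# Hypothetical example: five determinations of a specimen whose known
# concentration is 50.0 (arbitrary units)
results = [49.0, 51.0, 50.5, 48.5, 50.0]
recovery = mean_recovery_percent(results, 50.0)
print(f"accuracy: {recovery:.1f}% of the known concentration")
```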

A data set is often considered as a sample from a population and the sample parameters calculated from the data set as estimates of the population parameters (→ statistical indices). Moreover, it is usually used to calculate statistical models such as quantitative → structure/response correlations. In this case the data set is organized into a data matrix X with n rows and p columns, where each row corresponds to an object of the data set and each column to a variable; therefore each element represents the value of the jth variable for the ith object (i = 1, ..., n; j = 1, ..., p). [Pg.98]

We can return to the data presented in Table 1 for the analysis of the mineral water. If the parent population parameters, σ and μ₀, are known to be 0.82 mg kg⁻¹ and 10.8 mg kg⁻¹ respectively, then we can answer the question of whether the analytical results given in Table 1 are likely to have come from a water sample with a mean sodium level similar to that providing the parent data. In statistical terminology, we wish to test the null hypothesis that the means of the sample and the suggested parent population are similar. This is generally written as... [Pg.6]
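When the parent standard deviation is known, one common way to carry out such a comparison is a two-tailed z-test on the sample mean. The sketch below assumes that approach; the excerpt does not show which test the source goes on to use. The parent parameters 0.82 and 10.8 mg/kg come from the text. The sample values are hypothetical and are not the contents of Table 1, which is not reproduced here.

```python
import math
from statistics import NormalDist, mean

SIGMA = 0.82  # parent population standard deviation, mg/kg (from the text)
MU_0 = 10.8   # parent population mean, mg/kg (from the text)


def z_test(sample, mu_0=MU_0, sigma=SIGMA):
    """Two-tailed z-test of the sample mean against mu_0 with known sigma."""
    n = len(sample)
    z = (mean(sample) - mu_0) / (sigma / math.sqrt(n))
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical sodium results in mg/kg (NOT the values of Table 1)
sample = [10.8, 11.2, 10.5, 11.0, 10.9]
z, p = z_test(sample)
print(f"z = {z:.3f}, two-tailed p = {p:.3f}")
```

A p value above the chosen significance level (commonly 0.05) means the null hypothesis is not rejected for that sample.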

Figure 6-4a shows two Gaussian curves in which we plot the relative frequency y of various deviations from the mean versus the deviation from the mean. As shown in the margin, curves such as these can be described by an equation that contains just two parameters, the population mean μ and the population standard deviation σ. The term parameter refers to quantities such as μ and σ that define a population or distribution. This is in contrast to quantities such as the data values x that are variables. The term statistic refers to an estimate of a parameter that is made from a sample of data, as discussed below. The sample mean and the sample standard deviation are examples of statistics that estimate parameters μ and σ, respectively. [Pg.111]

Such a data set has a mean and a SD. The mean of the data set of sample statistics will be the population parameter: on average the sample statistic equals the population parameter, and hence the statistic p from any sample is an unbiased estimator of it. The variation of the sample statistics, p, can be described in the same way as for any data set: the SD of the distribution of sample statistics is known as the standard error (SE) (of the estimate); here it would be SE_p. [Pg.375]
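The idea that the standard error is simply the SD of a data set of sample statistics can be checked by simulation. The sketch below uses the sample mean as the statistic (the passage's statistic p may be a different one) and assumes a normal population with mean 100 and SD 15. These values, the function name, and the seed are all illustrative. For the sample mean, the SD of the simulated statistics should be close to the population SD divided by the square root of the sample size.

```python
import math
import random
import statistics


def simulate_sample_means(pop_mean, pop_sd, sample_size, n_samples, seed=0):
    """Return the means of n_samples random samples from a normal population."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    sample_means = []
    for _ in range(n_samples):
        sample = [rng.gauss(pop_mean, pop_sd) for _ in range(sample_size)]
        sample_means.append(statistics.mean(sample))
    return sample_means


# Assumed population and sampling scheme (illustrative values only)
POP_MEAN, POP_SD, SAMPLE_SIZE = 100.0, 15.0, 25

sample_means = simulate_sample_means(POP_MEAN, POP_SD, SAMPLE_SIZE, n_samples=2000)
empirical_se = statistics.stdev(sample_means)      # SD of the sample statistics
theoretical_se = POP_SD / math.sqrt(SAMPLE_SIZE)   # sigma / sqrt(n)

print("mean of sample means:", round(statistics.mean(sample_means), 2))
print("empirical SE:", round(empirical_se, 2), "theoretical SE:", theoretical_se)
```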

The unknown quantities of interest described in the previous section are examples of parameters. A parameter is a numerical property of a population. One may be interested in measures of central tendency or dispersion in populations. Two parameters of interest for our purposes are the mean and standard deviation. The population mean and standard deviation are represented by μ and σ, respectively. The population mean, μ, could represent the average treatment effect in the population of individuals with a particular condition. The standard deviation, σ, could represent the typical variability of treatment responses about the population mean. The corresponding properties of a sample, the sample mean and the sample standard deviation, are typically represented by x̄ and s, which were introduced in Chapter 5. Recall that the term "parameter" was encountered in Section 6.5 when describing the two quantities that define the normal distribution. In statistical applications, the values of the parameters of the normal distribution cannot be known, but are estimated by sample statistics. In this sense, the use of the word "parameter" is consistent between the earlier context and the present one. We have adhered to convention by using the term "parameter" in these two slightly different contexts. [Pg.69]

A reasonable suggestion for devising a confidence interval for the population mean would be to substitute the sample estimate, s, for the corresponding population parameter, σ, and proceed as described earlier in Section 6.10. However, when the sample size is small (particularly < 30) the use of the Z distribution is less appropriate. William S. Gosset, writing anonymously as "Student" while employed at Guinness Brewery, proposed the following statistic as an alternative. When X is a normally distributed variable and the sample size is small, the statistic... [Pg.72]

To summarize, the computational aspects of confidence intervals involve a point estimate of the population parameter, some error attributed to sampling, and the amount of confidence (or reliability) required for interpretation. We have illustrated the general framework of the computation of confidence intervals using the case of the population mean. It is important to emphasize that interval estimates for other parameters of interest will require different reliability factors because these depend on the sampling distribution of the estimator itself and different calculations of standard errors. The calculated confidence interval has a statistical interpretation based on a probability statement. [Pg.74]

As before, these sample statistics are estimates of the unknown population parameters, the population means, and the population variances. If the population variances are assumed to be equal, each sample statistic is a different estimate of the same population variance. It is then reasonable to average or "pool" these estimates to obtain the following ... [Pg.120]

Big Chemical Encyclopedia