
Normal linear regression model

OBSERVATIONS FROM NORMAL LINEAR REGRESSION MODEL... [Pg.87]

In the normal linear regression model, we have n independent observations y_1, ..., y_n, where each observation y_i has its own mean μ_i and all observations have the same variance σ². The means are unknown linear functions of the p predictor variables x_1, ..., x_p. The values of the predictor variables are known for each observation. Hence we can write the mean as... [Pg.87]
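As an illustration, here is a minimal NumPy sketch of this data-generating setup (all numerical values and variable names are illustrative assumptions, not from the source):

```python
import numpy as np

rng = np.random.default_rng(0)

n, p = 50, 2                       # number of observations and predictors (illustrative)
beta_true = np.array([1.5, -0.7])  # the unknown linear coefficients
sigma = 0.5                        # common standard deviation of all observations

X = rng.normal(size=(n, p))        # known predictor values for each observation
mu = X @ beta_true                 # each y_i has its own mean mu_i, linear in x
y = mu + rng.normal(scale=sigma, size=n)  # same variance sigma^2 for every y_i

# Ordinary least-squares estimate of the coefficients
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta_hat)
```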

When we have n independent observations from the normal linear regression model where the observations all have the same known variance, the conjugate prior distribution for the regression coefficient vector β is multivariate normal(b_0, V_0). The posterior distribution of β will be multivariate normal(b_1, V_1), where... [Pg.91]
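The excerpt truncates the update formulas. For reference, a sketch of the standard conjugate update for this model (known variance σ²; the source's exact notation may differ) is:

```python
import numpy as np

def posterior(X, y, sigma2, b0, V0):
    """Conjugate update for the normal linear regression model with known
    variance sigma2 and a multivariate normal(b0, V0) prior on beta.
    Returns the posterior mean b1 and posterior covariance V1."""
    V0_inv = np.linalg.inv(V0)
    V1 = np.linalg.inv(V0_inv + X.T @ X / sigma2)  # posterior covariance
    b1 = V1 @ (V0_inv @ b0 + X.T @ y / sigma2)     # posterior mean
    return b1, V1
```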

A valuable inference for assessing the quality of the model predictions is the (1 − α)100% confidence interval of the predicted mean response at x_0. It should be noted that the predicted mean response of the linear regression model at x_0 is y_0 = F(x_0)k, or simply y_0 = X_0k. Although the error term e_0 is not included, there is still some uncertainty in the predicted mean response due to the uncertainty in k. Under the usual assumptions of normality and independence, the covariance matrix of the predicted mean response is given by... [Pg.33]
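A sketch of how this interval is typically computed (standard least-squares formulas; the function and variable names are mine, not the source's):

```python
import numpy as np
from scipy import stats

def mean_response_ci(X, y, x0, alpha=0.05):
    """(1 - alpha)100% confidence interval for the predicted mean
    response at x0, under normality and independence of the errors."""
    n, p = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    k = XtX_inv @ X.T @ y                  # least-squares estimate of k
    resid = y - X @ k
    s2 = resid @ resid / (n - p)           # estimate of the error variance
    y0 = x0 @ k                            # predicted mean response y0 = x0' k
    se = np.sqrt(s2 * x0 @ XtX_inv @ x0)   # uncertainty from k only (e0 excluded)
    t = stats.t.ppf(1 - alpha / 2, n - p)
    return y0 - t * se, y0 + t * se
```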

In this example one can see that the log-normal model, Fig. 4, is a better fit to the data than the normal distribution, Fig. 3. The parameter estimates for the median and standard deviation of each sample can be found from a straightforward application of the simple linear regression model expressed by equation (3)... [Pg.554]
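Equation (3) is not reproduced in this excerpt; as a hedged sketch, one common regression-based version of this fit regresses the ordered log-data on standard normal quantiles, so that the intercept estimates the log of the median and the slope the log-scale standard deviation (the plotting positions and names below are illustrative choices):

```python
import numpy as np
from scipy import stats

def lognormal_fit_by_regression(sample):
    """Estimate the median and log-scale standard deviation of a sample
    assumed log-normal, via simple linear regression of the sorted
    log-data on standard normal plotting-position quantiles."""
    x = np.sort(np.log(sample))
    n = len(x)
    pp = (np.arange(1, n + 1) - 0.5) / n  # one common plotting-position choice
    z = stats.norm.ppf(pp)                # standard normal quantiles
    slope, intercept, *_ = stats.linregress(z, x)
    return np.exp(intercept), slope       # exp(log-median), log-scale sd
```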

A second problem is that sometimes a nonlinear function cannot be found that adequately fits the response data. In this case, it may be possible to transform the independent variable so that the model becomes linear and a suitable regression function can easily be found. Another frequently used approach is to transform the dependent variable so that the model becomes linear. This approach can no longer be advocated, since the transformation, while creating a linear regression model, often leads to heteroscedasticity and non-normality in the residuals. In a sense, the analyst is robbing Peter to pay Paul. [Pg.138]
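A small simulation makes the point: additive constant-variance error becomes non-constant after a log transform of the dependent variable (the exponential model and all numbers below are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(0.1, 5.0, 200)
y = 5.0 * np.exp(0.8 * x) + rng.normal(scale=1.0, size=x.size)  # additive, homoscedastic error

ly = np.log(y)                 # the transform makes the model linear in x
b, a = np.polyfit(x, ly, 1)
resid = ly - (a + b * x)

# The residual spread now shrinks as x grows: the transform has traded
# nonlinearity for heteroscedasticity ("robbing Peter to pay Paul").
print(resid[x < 2.5].std(), resid[x >= 2.5].std())
```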

Figure 6.1 illustrates this relationship graphically for a 1-compartment open model plotted on a semi-log scale for two different subjects. Equation (6.15) has fixed effects β and random effects U. Note that if Z = 0, then Eq. (6.15) simplifies to a general linear model. If there are no fixed effects in the model and all model parameters are allowed to vary across subjects, then Eq. (6.16) is referred to as a random coefficients model. It is assumed that U is normally distributed with mean 0 and variance G (which assesses between-subject variability), ε is normally distributed with mean 0 and variance R (which assesses residual variability), and that the random effects and residuals are independent. Sometimes R is referred to as within-subject or intrasubject variability, but this is not technically correct because within-subject variability is only one component of residual variability. There may be other sources of variability in R, sometimes many others, such as model misspecification or measurement variability. However, in this book within-subject variability and residual variability will be used interchangeably. Notice that the model assumes that each subject follows a linear regression model where some parameters are population-specific and others are subject-specific. Also note that the residual errors are within-subject errors. [Pg.184]
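Eq. (6.15) itself is not reproduced here, but its structure (fixed effects β, subject-level random effects U ~ N(0, G), residuals ε ~ N(0, R), mutually independent) can be sketched with a simulation; the intercept/slope parameterization and all values are illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)

n_subj, n_obs = 20, 10
beta = np.array([4.0, -0.3])   # fixed effects: population intercept and slope
G = np.diag([0.5, 0.01])       # between-subject covariance of the random effects
R = 0.2                        # residual variance (more than just within-subject noise)

t = np.tile(np.linspace(0, 9, n_obs), n_subj)
subj = np.repeat(np.arange(n_subj), n_obs)
X = np.column_stack([np.ones_like(t), t])

U = rng.multivariate_normal(np.zeros(2), G, size=n_subj)  # U ~ N(0, G)
eps = rng.normal(scale=np.sqrt(R), size=t.size)           # eps ~ N(0, R), independent of U

# Each subject follows its own linear regression: population parameters
# beta shifted by that subject's random effects U[i].
y = np.einsum('ij,ij->i', X, beta + U[subj]) + eps
```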

Two datasets are first simulated. The first contains only normal samples, whereas the second contains 3 outliers; they are shown in Plots A and B of Figure 2, respectively. For each dataset, 70% of the samples are randomly selected to build a linear regression model, whose slope and intercept are recorded. Repeating this procedure 1000 times, we obtain 1000 values for both the slope and the intercept. For both datasets, the intercept is plotted against the slope, as displayed in Plots C and D, respectively. It can be observed that the joint distribution of the intercept and slope for the normal dataset appears to be multivariate normal. In contrast, the distribution for the dataset with outliers looks quite different, far from normal. Specifically, the distributions of slopes for the two datasets are shown in Plots E and F. These results show that the existence of outliers can greatly influence a regression model, which is reflected in the odd distributions of both slopes and intercepts. Conversely, a distribution of a model parameter that is far from normal would most likely indicate some abnormality in the data. [Pg.5]
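The resampling experiment described above is easy to reproduce; the following sketch (data, seed, and outlier magnitudes are invented for illustration) records the slope and intercept over 1000 random 70% subsets:

```python
import numpy as np

rng = np.random.default_rng(3)

def slope_intercept_distribution(x, y, n_rep=1000, frac=0.70):
    """Fit a straight line to a random 70% subset, n_rep times, and
    record the (slope, intercept) pair from each fit."""
    m = int(frac * len(x))
    out = np.empty((n_rep, 2))
    for i in range(n_rep):
        idx = rng.choice(len(x), size=m, replace=False)
        out[i] = np.polyfit(x[idx], y[idx], 1)  # [slope, intercept]
    return out

x = np.linspace(0, 10, 50)
y = 1.0 + 0.5 * x + rng.normal(scale=0.3, size=x.size)  # normal samples
y_out = y.copy()
y_out[:3] += 8.0                                        # inject 3 outliers

clean = slope_intercept_distribution(x, y)
dirty = slope_intercept_distribution(x, y_out)
# The (slope, intercept) cloud for the clean data is roughly bivariate
# normal; with outliers present it becomes distorted and far from normal.
```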

For example, the output rate of a simple SISO reactor depends on various conditions. To model the transformation of input to output, knowledge about the chemical reaction and the chemical reactor can be used. A linear model, for instance, might describe this relationship properly on an aggregated level (e.g., the hourly production rate). Neglecting minor influences leads to a simplification of the process model. Additionally, measurement errors may hinder a perfect description of the process and lead to uncertainty in the observed process measures. This uncertainty is expressed, e.g., by a (normal) error process. The resulting linear regression model can be verified using historical records of the process. Often such records allow analysts to deduce a proper stochastic model of the process. For more complex production processes, more sophisticated stochastic models (as described in Section 2.3) may be necessary. [Pg.145]

The models built in the previous steps can be parameterized based on physiogenomic data. The maximum likelihood method is used, which is a well-established method for obtaining optimal estimates of parameters. S-plus provides very good support for algorithms that provide these estimates for the initial linear regression models, as well as for other generalized linear models that we may use when the error distribution is not normal. [Pg.456]

Because it is of particular interest in the present context, we now obtain the normal equations for linear regression with a single independent variable. The model function is... [Pg.44]
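The model function itself is cut off in this excerpt; for the straight line y = a + b·x, the normal equations and their direct solution are standard (a sketch, with my variable names):

```python
import numpy as np

def normal_equations_fit(x, y):
    """Solve the normal equations for y = a + b*x directly:
         n*a      + b*sum(x)   = sum(y)
         a*sum(x) + b*sum(x^2) = sum(x*y)"""
    n = len(x)
    Sx, Sy = x.sum(), y.sum()
    Sxx, Sxy = (x * x).sum(), (x * y).sum()
    b = (n * Sxy - Sx * Sy) / (n * Sxx - Sx * Sx)
    a = (Sy - b * Sx) / n
    return a, b
```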

Statistical testing of model adequacy and of the significance of parameter estimates is a very important part of kinetic modelling. Only models with a positive evaluation in statistical analysis should be applied in reactor scale-up. The statistical analysis presented below is restricted to linear regression and a normal (Gaussian) distribution of experimental errors. If the experimental error has zero mean, constant variance and is independently distributed, its variance can be evaluated by dividing SSres by the number of degrees of freedom, i.e.... [Pg.545]
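In code, this variance estimate is a one-liner once the residuals are in hand (a sketch; X is assumed to be the n × p design matrix):

```python
import numpy as np

def residual_variance(X, y):
    """Estimate the error variance as SSres / (n - p), assuming zero-mean,
    constant-variance, independently distributed experimental errors."""
    n, p = X.shape
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta_hat
    return (resid @ resid) / (n - p)
```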

When this method is used, Table II shows the results obtained when the regression model is the normal first-order linear model. Since the maximum absolute studentized residual (Max ASR) found, 2.29, was less than the critical value for this model, 2.78, the conclusion is that there are no inconsistent values. [Pg.46]
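The test statistic can be computed as below (a sketch using internally studentized residuals; the critical value, 2.78 above, comes from tables keyed to the model and sample size and is not computed here):

```python
import numpy as np

def max_abs_studentized_residual(X, y):
    """Maximum absolute (internally) studentized residual, the Max ASR
    compared against a tabulated critical value in the test above."""
    n, p = X.shape
    H = X @ np.linalg.inv(X.T @ X) @ X.T      # hat matrix
    resid = y - H @ y
    s2 = (resid @ resid) / (n - p)
    stud = resid / np.sqrt(s2 * (1 - np.diag(H)))
    return np.abs(stud).max()
```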

There are several properties of linear regression that should be noted. First, it is assumed that the model errors are normally distributed. Second, the relationship between the x and y variables is assumed to be linear. In analytical chemistry, the first assumption is generally a reasonable one. However, the second assumption might not be sufficiently accurate in many situations, especially if a strong nonlinear relationship is suspected between x and y. There are some nonlinear remedies to deal with such situations, and these will be discussed later. [Pg.360]

A calibration curve is a model used to predict the value of an independent variable, the analyte concentration, when only the dependent variable, the analytical response, is known. The normal procedure used to establish a calibration curve is based on a least-squares fit of the best straight line, i.e. a linear regression, as indicated in... [Pg.232]
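Establishing the curve and then inverting it for an unknown sample looks like this in outline (all concentrations and responses below are invented for illustration):

```python
import numpy as np

conc = np.array([0.0, 1.0, 2.0, 4.0, 8.0])       # calibration standards
resp = np.array([0.02, 0.51, 1.00, 2.03, 3.98])  # measured responses

b, a = np.polyfit(conc, resp, 1)  # least-squares best straight line

y_unknown = 1.50                  # response measured for the unknown
c_unknown = (y_unknown - a) / b   # inverse prediction of the concentration
print(c_unknown)
```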

For the biotransformation assay results, concentration-normalized maximum induction values were modeled. Stepwise multiple linear regression analysis was used to select the most suitable parameters from log Kow, E_HOMO, E_LUMO, and the difference between E_LUMO and E_HOMO (E_LUMO − E_HOMO). [Pg.381]
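Stepwise selection over only four descriptors can be mimicked by an exhaustive best-subset search, which is simpler to sketch; the degrees-of-freedom-penalized criterion below is one reasonable stand-in, not the source's procedure:

```python
import numpy as np
from itertools import combinations

def best_subset(X, y, names):
    """Pick the predictor subset (with intercept) minimizing the residual
    variance SSres / (n - p), an adjusted-R^2-style criterion."""
    n = len(y)
    best_score, best_cols = np.inf, ()
    for k in range(1, X.shape[1] + 1):
        for cols in combinations(range(X.shape[1]), k):
            A = np.column_stack([np.ones(n), X[:, cols]])
            coef, *_ = np.linalg.lstsq(A, y, rcond=None)
            resid = y - A @ coef
            score = (resid @ resid) / (n - A.shape[1])
            if score < best_score:
                best_score, best_cols = score, cols
    return [names[c] for c in best_cols]

# e.g. names = ["logKow", "E_HOMO", "E_LUMO", "E_LUMO-E_HOMO"]
```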

A straight-line model is the most used, but also the most misused, model in analytical chemistry. The analytical chemist should check five basic assumptions during method validation before deciding whether to use a straight-line regression model for calibration purposes. These five assumptions are described in detail by MacTaggart and Farwell [6] and basically are linearity, error-free independent variable, random and homogeneous error, uncorrelated errors, and normal distribution of the error. The evaluation of these assumptions and the remedial actions are discussed hereafter. [Pg.138]
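Three of the five assumptions lend themselves to quick numeric screens (a rough sketch; the tests below are common choices of mine, not the remedial actions discussed in the source):

```python
import numpy as np
from scipy import stats

def basic_assumption_checks(x, y):
    """Screen a straight-line fit for normal, homogeneous, uncorrelated
    errors. (Linearity and an error-free independent variable are judged
    from the design and from a residual-versus-x plot.)"""
    b, a = np.polyfit(x, y, 1)
    resid = y - (a + b * x)

    _, p_normal = stats.shapiro(resid)          # normality of the error
    rho, _ = stats.spearmanr(np.abs(resid), x)  # spread trending with x -> heteroscedasticity
    dw = np.sum(np.diff(resid) ** 2) / np.sum(resid ** 2)  # Durbin-Watson, ~2 if uncorrelated
    return p_normal, rho, dw
```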

