

Estimation

Some definitions

Even efficient estimators will have some error when estimating the value of a population parameter. In many cases it is better to know an interval within which the population parameter can be found. This type of estimate is termed an interval estimate, i.e. we say that the population parameter lies within the interval

$\displaystyle \hat \theta_{L} < \theta < \hat \theta_{U}
$

Some features of interval estimates

Estimating the Mean

$\sigma $ known

The sampling distribution of $\overline X$ is centered at $\mu$ (the population mean), so $\overline X$ is an unbiased estimator of $\mu$, and its variance is smaller than that of other unbiased estimators of $\mu$. Furthermore, since the variance of $\overline X$ is defined as $\sigma^{2}_{\overline X} = \sigma^{2} / n$, the variance decreases with larger sample sizes. Thus $\overline x$ is the best point estimate of $\mu$.

Considering interval estimates, the Central Limit Theorem says that the sampling distribution of $\overline X$ is approximately normal with mean $\mu_{\overline X} = \mu$ and standard deviation $\sigma_{\overline
X} = \sigma / \sqrt{n}$. Thus if $z_{\alpha/2}$ is the $z$ value above which there is an area of $\alpha / 2$, then we can write

$\displaystyle P( -z_{\alpha/2} < Z < z_{\alpha/2}) = 1 - \alpha
$

and since $Z$ is defined as

$\displaystyle Z = \frac{\overline X - \mu}{\sigma / \sqrt{n}}
$

we can substitute and manipulate to get

$\displaystyle P(
\overline x - z_{\alpha/2} \frac{\sigma}{\sqrt{n}} < \mu <
\overline x + z_{\alpha/2} \frac{\sigma}{\sqrt{n}}) = 1 - \alpha
$

Essentially this says that if $\overline x$ is the mean of a random sample of size $n$ taken from a population with mean $\mu$ and standard deviation $\sigma $, then the $(1-\alpha)100\%$ confidence interval for $\mu$ is given by

$\displaystyle \overline x - z_{\alpha/2} \frac{\sigma}{\sqrt{n}} < \mu <
\overline x + z_{\alpha/2} \frac{\sigma}{\sqrt{n}}
$

The values of $\hat \theta_{L}$ and $\hat \theta_{U}$ are thus the left and right sides of the inequality.
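As a sketch of this computation, the interval above can be evaluated directly in Python. The sample values below are hypothetical; $z_{\alpha/2}$ is obtained from the standard normal quantile function in the standard library's statistics module.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical sample: n = 36 observations with sample mean 2.6,
# drawn from a population with known sigma = 0.3.
xbar, sigma, n = 2.6, 0.3, 36
alpha = 0.05  # for a 95% confidence interval

# z_{alpha/2}: the z value with area alpha/2 above it (about 1.96)
z = NormalDist().inv_cdf(1 - alpha / 2)
margin = z * sigma / sqrt(n)

lower, upper = xbar - margin, xbar + margin
print(f"{(1 - alpha) * 100:.0f}% CI for mu: ({lower:.3f}, {upper:.3f})")
```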

Thus for the mean we can say that if $\overline x$ is an estimator of $\mu$, then we can be $(1-\alpha)100\%$ sure that the error (i.e. $\vert \overline x - \mu \vert$) will not exceed $z_{\alpha/2}
\frac{\sigma}{\sqrt{n}}$.

The sample size required to achieve a $(1-\alpha)100\%$ confidence level for an error $e$ (i.e. to be $(1-\alpha)100\%$ sure that the error will not exceed $e$) is given by

$\displaystyle n = \left( \frac{z_{\alpha/2} \sigma}{e} \right)^{2}
$
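A short sketch of the sample-size formula, with hypothetical values for $\sigma$ and $e$; since $n$ must be a whole number of observations, the result is rounded up.

```python
from math import ceil
from statistics import NormalDist

# Hypothetical requirement: be 95% sure the error |xbar - mu|
# does not exceed e = 0.05, with known sigma = 0.3.
sigma, e, alpha = 0.3, 0.05, 0.05

z = NormalDist().inv_cdf(1 - alpha / 2)
n = ceil((z * sigma / e) ** 2)  # round up to the next whole observation
print(n)
```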

$\sigma $ unknown

When we have a sample from a normal distribution with an unknown standard deviation then the variable defined as

$\displaystyle T = \frac{\overline X - \mu}{S / \sqrt{n}}
$

has a $t$ distribution with $ n - 1$ degrees of freedom. Proceeding as above we can conclude that if $\overline x$ and $s$ are the sample mean and standard deviation of a sample from a normal population with unknown variance, then a $(1-\alpha)100\%$ confidence interval for $\mu$ is given by

$\displaystyle \overline x - t_{\alpha/2} \frac{s}{\sqrt{n}} < \mu <
\overline x + t_{\alpha/2} \frac{s}{\sqrt{n}}
$

where $t_{\alpha/2}$ is a $t$ value with $\nu = n - 1$ degrees of freedom.
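A sketch of the $t$ interval with a hypothetical sample; the critical value $t_{0.025}$ for $\nu = 9$ degrees of freedom is hard-coded from standard $t$ tables, since the standard library has no $t$ quantile function.

```python
from math import sqrt

# Hypothetical sample of n = 10 from a normal population with
# unknown sigma: sample mean 10.0 and sample standard deviation 0.283.
xbar, s, n = 10.0, 0.283, 10

# t_{0.025} with nu = n - 1 = 9 degrees of freedom, from standard tables.
t = 2.262

margin = t * s / sqrt(n)
lower, upper = xbar - margin, xbar + margin
print(f"95% CI for mu: ({lower:.3f}, {upper:.3f})")
```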

An important distinction: when $\sigma $ is known we can use the Central Limit Theorem (i.e. a normal distribution), and when $\sigma $ is unknown we use the sampling distribution of $T$ (i.e. a $t$ distribution). In many cases when $\sigma $ is unknown and $n > 30$, $s$ can be used in place of $\sigma $ to give the interval

$\displaystyle \overline x \pm z_{\alpha/2} \frac{s}{\sqrt{n}}
$

This is termed the large-sample confidence interval.

Standard Error of a Point Estimate

We know that the variance of the estimator $\overline X$ is

$\displaystyle \sigma_{\overline X}^{2} = \frac{\sigma^{2}}{n}
$

The standard deviation of $\overline X$ is also termed the standard error. Thus confidence intervals can also be written as

$\displaystyle \overline x \pm z_{\alpha/2} \mathrm{s.e.}(\overline x)
$

Tolerance Limits

Estimating the Variance

When using $S^{2}$ as an estimator for the population variance $\sigma^{2}$, we can get an interval estimate of $\sigma^{2}$ via the statistic

$\displaystyle X^{2} = \frac{(n-1)S^{2}}{\sigma^{2}}
$

which has a $ \chi^{2}$ distribution with $ n - 1$ degrees of freedom (when the samples are taken from a normal population). Rearranging and proceeding as before, the $(1-\alpha)100\%$ confidence interval for $\sigma^{2}$ is

$\displaystyle \frac{(n-1)s^{2}}{\chi^{2}_{\alpha/2}} < \sigma^{2} <
\frac{(n-1)s^{2}}{\chi^{2}_{1-\alpha/2}}
$

where $\chi^{2}_{\alpha/2}$ and $\chi^{2}_{1-\alpha/2}$ are values of the $ \chi^{2}$ distribution with $\nu = n - 1$ degrees of freedom leaving areas of $\alpha/2$ and $1 - \alpha/2$ to the right.
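As a numerical sketch, inverting the $ \chi^{2}$ statistic gives the bounds $(n-1)s^{2}/\chi^{2}_{\alpha/2}$ and $(n-1)s^{2}/\chi^{2}_{1-\alpha/2}$; the sample values below are hypothetical, and the critical values for $\nu = 9$ degrees of freedom are hard-coded from standard $ \chi^{2}$ tables.

```python
# Hypothetical sample of n = 10 from a normal population,
# with sample variance s2 = 1.2.
n, s2 = 10, 1.2

# chi-square critical values for nu = n - 1 = 9, from standard tables
chi2_upper = 19.023  # chi^2_{0.025, 9}, area 0.025 to the right
chi2_lower = 2.700   # chi^2_{0.975, 9}, area 0.975 to the right

lower = (n - 1) * s2 / chi2_upper
upper = (n - 1) * s2 / chi2_lower
print(f"95% CI for sigma^2: ({lower:.3f}, {upper:.3f})")
```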
2003-08-29