MATH 105, Topics in Mathematics

Lesson 8: Estimation

Introduction

For this chapter you need an advanced calculator. We discuss the concepts and theory and then use a calculator to solve the problems.

The name of the game in statistics is to understand the POPULATION on the basis of the information available in the SAMPLE. Part of what we mean by "understand" is estimating the values of the population PARAMETERS, and we do this by using suitable sample STATISTICS. For example, we may use the sample mean x̄ as an estimate for the population mean μ.

We consider two types of estimation of parameters.

The first one is called point estimation. In point estimation, we give a number as an estimate for the parameter. For example, if we are trying to estimate the mean height μ of the American population, then we may take a sample of a certain size, compute the sample mean height x̄, and call it an estimate for μ.
The second one is called interval estimation. In interval estimation we give an interval (L, U) and say that the parameter will be within this interval (with a certain degree of confidence). For example, while estimating the mean height μ of the American population, we may take a sample, compute the sample mean x̄, and say that the population mean μ is in the interval (x̄-1, x̄+1). In interval estimation, for a given degree of confidence, the shorter the length U-L, the better the interval; for a given length, the higher the degree of confidence, the better the interval.

8.1 Point and Interval Estimation

As we have already mentioned, we use a statistic to estimate a parameter. The statistic T used to estimate a parameter θ is called an estimator of θ. The computed value t of T is called a point estimate or an estimate of θ.
For example, the sample mean X̄ is an estimator of μ, and the computed value x̄ is an estimate of μ. An estimator is a random variable, because its value changes from sample to sample. Similarly, the sample variance S² is an estimator of the population variance σ², and the computed value s² is an estimate of σ².
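
As a small illustration of the difference between an estimator and an estimate, the sketch below computes the point estimates x̄ and s² from one sample. The data values are hypothetical and serve only to show the computation; a different sample would give different estimates.

```python
import statistics

# Hypothetical sample of heights in inches (illustrative numbers only).
heights = [66.1, 70.3, 68.4, 64.9, 71.2, 67.5, 69.0, 65.8]

x_bar = statistics.mean(heights)     # point estimate of the population mean mu
s2 = statistics.variance(heights)    # sample variance (divides by n - 1), estimate of sigma^2

print(f"estimate of mu: {x_bar:.3f}")
print(f"estimate of sigma^2: {s2:.3f}")
```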

It may be intuitively clear to you why X̄ and S² would be reasonable estimators for μ and σ², respectively. We will not analyze it more deeply. We also restrict ourselves to estimating the population mean μ and the population proportion p.

Interval Estimation

Almost never would we expect a point estimate t of a parameter θ to be exactly equal to the actual value of θ. This is why it is more reasonable that we give an interval (L,U) and say that θ would be within this interval. Here L, U will be statistics. Because the computed values of L = l, U = u will depend on the sample, we do not expect that the value of θ will always be within this computed interval (l,u). We are happy as long as the true value of θ falls within the interval (L,U) most often (or often enough) by allowing the possibility of being "wrong" a few times.

But how often is often enough? The probability P(L < θ < U) tells us how often the parameter will fall within (L,U), so it is reasonable to report this probability (or, equivalently, P(θ not in (L,U))) along with the interval. This is what we do in interval estimation; the resulting interval is also called a confidence interval for θ.

Definition. Let θ be a population parameter. An interval estimate for θ provides the following:

It gives an interval (L,U) as an estimate for θ. Here L,U are statistics.
It also gives the probability P(L < θ < U). This number
P(L < θ < U) = 1 - α
is called the level of confidence, and (L,U) is said to be a (1 - α)100 percent confidence interval for θ.
In practice, α will be a small number, such as 0.1, 0.05, or 0.01.

Definition. Given a number 0 < α < 1, the z-value z_α is defined by the formula

P(Z > z_α) = α,

where Z is a standard normal N(0,1) random variable.

The following is a table of such z-values:

z-Values : P(Z > z_α) = α
z_α	Value
z_.005	2.58
z_.01	2.33
z_.02	2.05
z_.025	1.96
z_.05	1.65
z_.1	1.28
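
If you prefer to compute z-values rather than read them from the table, the following is a minimal sketch using Python's standard library. The function name z_value is an assumption made for this illustration. Note that z_.05 is about 1.645, which the table reports as 1.65.

```python
from statistics import NormalDist

def z_value(alpha: float) -> float:
    """Return z_alpha, the number with P(Z > z_alpha) = alpha for Z ~ N(0,1)."""
    if not 0 < alpha < 1:
        raise ValueError("alpha must be strictly between 0 and 1")
    # P(Z > z) = alpha is the same as P(Z <= z) = 1 - alpha.
    return NormalDist().inv_cdf(1 - alpha)

for alpha in (0.005, 0.01, 0.02, 0.025, 0.05, 0.1):
    print(f"z_{alpha}: {z_value(alpha):.3f}")
```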

A (1- α )100 percent confidence interval for the mean μ:

Suppose X is a random variable with mean μ and variance σ². We want to construct a confidence interval for μ.

We assume that σ is known. Let X₁, X₂, ... , X_n be a sample from X, and let X̄ be the sample mean. Note that from the central limit theorem (CLT) we have, approximately,

P(-z_{α/2} < Z < z_{α/2}) = 1 - α

where Z = (X̄ - μ)/(σ/√n).

If we simplify, we get

P(X̄ - E < μ < X̄ + E) = 1 - α

where E = z_{α/2} σ/√n.
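
For readers who want to see the simplification step written out, here is one way to do the algebra, writing X̄ for the sample mean. Multiply through by σ/√n, and then solve the double inequality for μ:

```latex
\begin{align*}
-z_{\alpha/2} < \frac{\bar{X}-\mu}{\sigma/\sqrt{n}} < z_{\alpha/2}
&\iff -z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} < \bar{X}-\mu < z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} \\
&\iff -E < \bar{X}-\mu < E \qquad \text{with } E = z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} \\
&\iff \bar{X}-E < \mu < \bar{X}+E .
\end{align*}
```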

We have the following theorem.

Theorem. Assume that σ is known. Then a (1 - α)100 percent confidence interval for μ is given by

The Z-Interval

X̄ - E < μ < X̄ + E

where E = z_{α/2} σ/√n.

l = X̄ - E is called the left-end-point.
r = X̄ + E is called the right-end-point.
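
The Z-interval can also be computed in a few lines of code. The sketch below is one possible implementation, not the calculator's own routine; the function name z_interval and the input values are hypothetical and chosen only so that the arithmetic is easy to check by hand.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(x_bar: float, sigma: float, n: int, conf: float):
    """Two-sided Z-interval for mu when sigma is known.

    Returns (left end, right end, margin of error E).
    """
    alpha = 1 - conf
    z = NormalDist().inv_cdf(1 - alpha / 2)   # z_{alpha/2}
    e = z * sigma / sqrt(n)                   # margin of error E
    return x_bar - e, x_bar + e, e

# Hypothetical inputs, chosen only for illustration.
left, right, e = z_interval(x_bar=50.0, sigma=10.0, n=100, conf=0.95)
print(f"E = {e:.3f}, interval = ({left:.3f}, {right:.3f})")
# prints: E = 1.960, interval = (48.040, 51.960)
```

Here σ/√n = 1, so the margin of error is just z_.025 = 1.96.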

Remarks.

If you go on computing (1 - α)100 percent confidence intervals on a regular basis, then in the long run the true value of μ will fall outside the computed interval about α·100 percent of the time.
The confidence interval we computed above may also be called a (1 - α)100 percent two-sided confidence interval for μ. There are other kinds of confidence intervals. For example, if

P(L < μ < ∞) = 1 - α,
then (L, ∞) is a (1 - α)100 percent one-sided (upper) confidence interval for μ.
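
The first remark can be checked by simulation. The sketch below draws many samples from a hypothetical normal population (μ = 100, σ = 15 are assumed values, not taken from the lesson), builds a 95 percent Z-interval from each sample, and counts how often the interval contains μ. The fraction should come out close to 0.95, though it will vary a little with the random seed.

```python
import random
from math import sqrt
from statistics import NormalDist, mean

def coverage(mu, sigma, n, conf, trials, seed=0):
    """Fraction of simulated Z-intervals that contain the true mean mu."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)   # z_{alpha/2}
    e = z * sigma / sqrt(n)                        # margin of error E
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = mean(sample)
        if x_bar - e < mu < x_bar + e:
            hits += 1
    return hits / trials

# Hypothetical population: mu = 100, sigma = 15 (illustrative values only).
rate = coverage(mu=100.0, sigma=15.0, n=25, conf=0.95, trials=2000)
print(f"fraction of intervals containing mu: {rate:.3f}")
```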

Definitions and Formulas

The length of this (1 - α)100 percent confidence interval for μ is given by
length = r - l = 2 z_{α/2} σ/√n = 2E.
The margin of error E is defined as

E = z_{α/2} σ/√n.

Use of Calculators (if you have a TI-83):
Press STAT and then select TESTS. Select ZInterval and press ENTER. For the input, select Stats (not Data) in this section. Enter the values of σ, x̄, n, and C-Level. Select Calculate and press ENTER. The calculator will give you the confidence interval.
The margin of error = E = (width of the interval)/2 = (right end - left end)/2.

Problems on 8.1: Point and Interval Estimation

Use your calculator to solve the following problems. You can have a look at the flash-animated solutions for conceptual reasons.

Exercise 8.1.1. Assume that you have a normal population with mean μ and standard deviation σ = 15. Suppose you have collected a sample of size 25, and the sample mean x was found to be 81.

Find the margin of error at 99 percent level of confidence.
Find a 99 percent confidence interval for μ.

Exercise 8.1.2. Assume that you have a normal population with mean μ and standard deviation σ = 9.8. Suppose you have collected a sample of size 14, and the sample mean x was found to be 151.1.

Find the margin of error at 99 percent level of confidence.
Find a 99 percent confidence interval for μ.

Exercise 8.1.3. The time taken by an athlete to run an event is normally distributed with mean μ, and known standard deviation σ = 3.5 seconds. To estimate the mean μ he ran 16 times, and sample mean was found to be x = 33 seconds.

Find the margin of error in estimating the true mean μ with 99 percent level of confidence.
Find a 99 percent confidence interval for μ.

Exercise 8.1.4. A population has a normal distribution with standard deviation σ = 17. Suppose you collect a sample of size 211 and your sample mean is x = 18.

Find the margin of error in estimating the true mean μ with 95 percent level of confidence.
Find a 95 percent confidence interval for μ.

Exercise 8.1.5. The tuition X paid by a student per semester in a university has a distribution with mean μ and σ = $416. Suppose you collect a sample of 313 students and your sample mean is x = $1240.

Find the margin of error in estimating the true mean μ with 95 percent level of confidence.
Find a 95 percent confidence interval for the mean tuition μ paid.

Exercise 8.1.6. It is suspected that an industrial plant is polluting a water stream. To determine the extent of the damage, a water sample of size n = 13 was collected and the dissolved oxygen concentration was measured. The mean concentration was found to be x = 2.3. It is known from past experience that σ = 0.45.

Find the margin of error in estimating the true mean μ with 95 percent level of confidence.
Find a 99 percent confidence interval for the mean oxygen concentration μ.

8.2 When σ Is Unknown

Let X be a normal random variable with mean μ and variance σ². Unlike in the last section, in this section we assume that σ is not known, and we try to compute a confidence interval for μ. In the last section, the main tool (or fact) that we used was that

Z = (X̄ - μ)/(σ/√n)

has N(0,1) distribution. In this section, we use the distribution of

T = (X̄ - μ)/(S/√n),

where S is the sample standard deviation.

The distribution of T is known as the t-distribution with n-1 degrees of freedom, which we have not discussed. As we did for the N(0,1) random variable, we now give the properties of the t-distribution.

About t-distribution

Given a positive integer ν, there is a random variable T = t_ν that is said to have the t-distribution with ν degrees of freedom. The useful properties of the t-distribution are listed below:

A t-random variable has degrees of freedom. If a random variable T has the t-distribution with ν degrees of freedom, then we say that T has t_ν-distribution.
A t-random variable behaves very much like the Z-random variable N(0, 1).
The t-random variables are continuous random variables.
The mean of a t-random variable is ZERO (for ν > 1).
The graph of the pdf of a t-random variable is symmetric around the y-axis and has a bell shape.
For a T = t_ν random variable, if the degrees of freedom ν is large, then it can be approximated by a N(0,1) random variable.
For a number 0 < α < 1 and any positive integer ν, we define a number t_{ν, α} by the equation

P(T > t_{ν, α}) = α
where T has the t-distribution with ν degrees of freedom.
For each degree of freedom ν there is a table that gives probabilities. We will not discuss the table because we will use the calculator directly to compute confidence intervals.
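
To see the numbers t_{ν, α} without a table, the following sketch computes them numerically using only Python's standard library. It is an illustration under stated assumptions, not the method a calculator uses: the function names are made up for this example, the tail area is found by Simpson's rule applied to the t density, and the equation P(T > t) = α is solved by bisection. Running it for growing ν also shows the values approaching z_.025 = 1.96, as the list above says.

```python
from math import exp, lgamma, pi, sqrt

def t_pdf(x: float, nu: int) -> float:
    """Density of the t-distribution with nu degrees of freedom."""
    c = exp(lgamma((nu + 1) / 2) - lgamma(nu / 2)) / sqrt(nu * pi)
    return c * (1 + x * x / nu) ** (-(nu + 1) / 2)

def t_upper_tail(x: float, nu: int, steps: int = 1000) -> float:
    """P(T > x) for x >= 0: 1/2 minus the area over [0, x] (Simpson's rule)."""
    h = x / steps
    total = t_pdf(0.0, nu) + t_pdf(x, nu)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, nu)
    return 0.5 - total * h / 3

def t_value(nu: int, alpha: float) -> float:
    """Return t_{nu, alpha} with P(T > t_{nu, alpha}) = alpha, for 0 < alpha < 0.5."""
    if not 0 < alpha < 0.5:
        raise ValueError("this sketch handles 0 < alpha < 0.5 only")
    lo, hi = 0.0, 1.0
    while t_upper_tail(hi, nu) > alpha:   # widen until the tail is at most alpha
        hi *= 2
    for _ in range(50):                   # bisection on [lo, hi]
        mid = (lo + hi) / 2
        if t_upper_tail(mid, nu) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

for nu in (5, 15, 30, 1000):
    print(f"t_{{{nu}, 0.025}} = {t_value(nu, 0.025):.3f}")
```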

Theorem. Let X be a normal random variable with mean μ and standard deviation σ. Let X₁, X₂, ... , X_n be a sample of size n from the X population. Then

T = (X̄ - μ)/(S/√n)

has the t-distribution with n-1 degrees of freedom. Therefore,

P(-t_{n-1, α/2} < (X̄ - μ)/(S/√n) < t_{n-1, α/2}) = 1 - α.

If we simplify, then we get

P(X̄ - E < μ < X̄ + E) = 1 - α

where E = t_{n-1, α/2} S/√n.

A (1- α) 100 percent Confidence Interval for µ

Under the terms of the theorem, a (1 - α)100 percent confidence interval for μ is given by

The T-Interval

X̄ - E < μ < X̄ + E
where E = t_{n-1, α/2} s/√n and s is the computed sample standard deviation.

l = X̄ - E will be called the left-end-point
r = X̄ + E will be called the right-end-point

E is also called the margin of error.
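
The T-interval formula can be sketched in code as follows. To keep the example short, the critical value t_{n-1, α/2} is passed in by the caller (for example, read from a t-table) rather than computed. The function names and the summary statistics are hypothetical; 2.131 is the usual table value of t_{15, 0.025}.

```python
from math import sqrt
from statistics import mean, stdev

def t_interval(x_bar: float, s: float, n: int, t_crit: float):
    """T-interval for mu from summary statistics.

    t_crit is the critical value t_{n-1, alpha/2}; it is supplied by the
    caller (for example, from a t-table), not computed here.
    Returns (left end, right end, margin of error E).
    """
    e = t_crit * s / sqrt(n)          # margin of error E
    return x_bar - e, x_bar + e, e

def t_interval_from_data(data, t_crit: float):
    """Same interval, computing x_bar and s from raw data first."""
    return t_interval(mean(data), stdev(data), len(data), t_crit)

# Hypothetical summary statistics: n = 16, so n - 1 = 15 degrees of freedom,
# and 95 percent confidence uses t_{15, 0.025} = 2.131.
left, right, e = t_interval(x_bar=100.0, s=8.0, n=16, t_crit=2.131)
print(f"E = {e:.3f}, interval = ({left:.3f}, {right:.3f})")
# prints: E = 4.262, interval = (95.738, 104.262)
```

The second function mirrors the calculator's Data option: it computes x̄ and s from a list and then applies the same formula.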

Use of Calculators (if you have a TI-83):
If you have raw data, then enter the data into a list in the calculator. Press STAT and then select TESTS. Select TInterval and press ENTER. For the input, select Stats or Data, depending on what is given. Enter the values of x̄, the sample standard deviation s, and n (for Stats), or the list where you have the data (for Data), and then the C-Level. Select Calculate and press ENTER. The calculator will give you the confidence interval.
The margin of error = E = (width of the interval)/2 = (Right end - Left end)/2 .

Problems on 8.2: When σ is Unknown

Use your calculator to solve the following problems.
You can have a look at the flash-animated solutions for conceptual reasons.

Exercise 8.2.1. Assume that we have a normal population with mean μ and standard deviation σ. We have a sample of size n = 18 that has sample mean x = 170.5 and standard deviation s = 13.3. Compute a 99 percent confidence interval for μ.

Exercise 8.2.2. Suppose that the time taken to complete a problem on the Math 105 test is normally distributed with mean μ and standard deviation σ. A sample of size 23 was taken, and the sample mean and standard deviation were found to be x = 4.7 and s = 0.47. Estimate the mean time μ taken to complete a problem using a 98 percent confidence interval.

Exercise 8.2.3. It is assumed that the lifetime (in hours) of light bulbs produced in a factory is normally distributed with mean μ and standard deviation σ. To estimate μ the following data was collected on the lifetime of bulbs:

5110	4671	6441	3331	5055	5270	5335	4973	1837
7783	4560	6074	4777	4707	5263	4978	5418	5123

Compute a 95 percent confidence interval for μ.

Exercise 8.2.4. To estimate the mean time taken to complete an event by an athlete, the athlete ran several times, and the following sample of time taken to complete the event was collected:

24.7	23.8	28.2	25.3	21.8
35.3	33.1	31.3	22.5	22.3
21.8	31.5	34.5	24.2	21.3
22.6	29.5	23.1	33.3

Compute a 95 percent confidence interval for the mean μ.

Exercise 8.2.5. Suppose we collect a sample of size n = 40 from a normal population, with sample mean x = 18.6 and standard deviation s = 9.486. Construct a 95 percent confidence interval for the mean μ.

Exercise 8.2.6. The time taken by an athlete to run an event is normally distributed with mean μ and unknown standard deviation σ. To estimate the mean μ he ran 16 times, and the sample mean was found to be x = 33 seconds and sample standard deviation s = 3.5 seconds.

Find the margin of error in estimating the true mean μ with 99 percent level of confidence.
Find a 99 percent confidence interval for μ.

Exercise 8.2.7. Suppose that a sample of n = 40 pumpkins collected from a farm had a mean weight x = 18.6 pounds and standard deviation s = 7.3 pounds. Give an approximate 99 percent confidence interval for the mean weight μ of the pumpkins on the farm.

Exercise 8.2.8. A factory pays the workers depending on the number of units they produce. A sample of 72 workers produced a mean x = 13.4 units and standard deviation s = 2.1 units. Compute a 95 percent confidence interval for mean number μ of units produced by the workers.

Exercise 8.2.9. The mean μ daily number of classified ads published in a newspaper needs to be estimated. A sample over 84 days produced a mean x = 9910 ads and standard deviation s = 1105 ads. Give a 90 percent confidence interval for µ .

Exercise 8.2.10. To estimate the mean weight μ (in pounds) of salmon in a river the following sample was collected:

34.7	33.8	38.2	20.3	27.8	45.3	43.1	37.3	32.5	32.3
31.8	41.5	44.5	29.2	25.3	29.6	39.5	29.1	37.3

Compute a 99 percent confidence interval for the mean μ.

Exercise 8.2.11. To estimate the mean μ birth weight of the babies, the following data on birth weight was collected.

8.8	8.1	6.3	9.7	6.3
7.1	5.3	7.7	9.1	8.1
8.2	7.9	8.3	8.9	9.0
10.1	9.9	8.8	7.8	5.2
7.2	7.4	9.1	8.6	6.2
6.3	5.2	8.3	5.9	5.5
7.1	8.1	7.9	6.3	6.9
9.1	8.1	7.0	4.9	5.3
6.3	7.1	6.3	6.1	5.8
5.7	6.8	8.3	7.7

Find a 97 percent confidence interval for μ.

8.3 About the Population Proportion

Let p be the population proportion of a certain attribute. Typical examples are:

p may be the proportion of people who make more than $50,000 annually.
p may be the proportion of likely voters who are in favor of a candidate.
p may be the proportion of the population that has AIDS.
p may be the proportion of defective items produced by a machine.

We want to compute a confidence interval for p. We let

Y = 1	if success
Y = 0	if failure

where "success" means that the sampled member has the attribute.

Y is a Bernoulli(p) random variable. We draw a sample X₁, X₂, ... , X_n from the Y population and let

X = X₁ + ... + X_n

be the total number of successes and

X̄ = X/n

be the sample proportion of successes. It follows from the CLT that, approximately, the sample proportion X̄ has

N(μ_X̄, σ_X̄)-distribution

where μ_X̄ = p and σ_X̄ = √(p(1-p)/n).

Therefore,

P(-z_{α/2} < (X̄ - p)/σ_X̄ < z_{α/2}) = 1 - α.

In an attempt to compute a confidence interval for p, we simplify and get

P(X̄ - z_{α/2} σ_X̄ < p < X̄ + z_{α/2} σ_X̄) = 1 - α.

Because σ_X̄ depends on the unknown p, this does not yet produce a confidence interval for p. But the sample proportion x̄ of successes is a point estimate of p, so we replace p by x̄ in the formula for σ_X̄. We then have an approximate (1 - α)100 percent confidence interval for p given by

The 1-Proportion Z-Interval

x̄ - e < p < x̄ + e

where

x = the number of successes,
x̄ = x/n is the sample proportion of successes,
and
e = z_{α/2} √(x̄(1 - x̄)/n).

l = x̄ - e will be called the left-end-point
r = x̄ + e will be called the right-end-point

e is also called the margin of error.
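
The 1-proportion Z-interval can be sketched in code as follows. The function name prop_z_interval and the counts used in the example are hypothetical; they are only meant to show how the number of successes, the sample size, and the confidence level enter the formula.

```python
from math import sqrt
from statistics import NormalDist

def prop_z_interval(successes: int, n: int, conf: float):
    """Approximate 1-proportion Z-interval for p.

    Returns (left end, right end, margin of error e).
    """
    p_hat = successes / n                          # sample proportion of successes
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)   # z_{alpha/2}
    e = z * sqrt(p_hat * (1 - p_hat) / n)          # margin of error e
    return p_hat - e, p_hat + e, e

# Hypothetical counts: 60 successes in a sample of 100.
left, right, e = prop_z_interval(successes=60, n=100, conf=0.95)
print(f"e = {e:.4f}, interval = ({left:.4f}, {right:.4f})")
```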

A conservative margin of error E is defined as

E = z_{α/2} √(1/(4n)).

Because x̄(1 - x̄) ≤ 1/4 for every value of x̄ between 0 and 1, the margin of error e is always less than or equal to the conservative margin of error E.

Remark. During President Clinton's term in the White House we often heard TV news commentators read something like the following:

President Clinton has 64 percent approval rating. The poll has a margin of error plus or minus 3.1 percentage points. The poll surveyed 972 people.

They mean that the sample proportion x̄ of people who "approve" of President Clinton is 0.64. Normally they don't tell us the level of confidence they are using. Assuming that they use a 95 percent confidence interval, they mean that

E = z_{α/2} √(1/(4n)) = 1.96 √(1/(4·972)) = 0.031.
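
The poll computation above can be reproduced with a short sketch. The function names are assumptions made for this illustration. The last function is not stated in the lesson; it comes from solving E = z_{α/2} √(1/(4n)) for n and rounding up, and is included only to show how a sample size could be chosen for a target conservative margin of error.

```python
from math import ceil, sqrt
from statistics import NormalDist

def z_half_alpha(conf: float) -> float:
    """z_{alpha/2} for a two-sided interval with confidence level conf."""
    return NormalDist().inv_cdf(1 - (1 - conf) / 2)

def conservative_margin(n: int, conf: float) -> float:
    """Conservative margin of error E = z_{alpha/2} * sqrt(1 / (4n))."""
    return z_half_alpha(conf) * sqrt(1 / (4 * n))

def margin(p_hat: float, n: int, conf: float) -> float:
    """Margin of error e = z_{alpha/2} * sqrt(p_hat * (1 - p_hat) / n)."""
    return z_half_alpha(conf) * sqrt(p_hat * (1 - p_hat) / n)

def sample_size_for(target_margin: float, conf: float) -> int:
    """Smallest n whose conservative margin of error is at most target_margin.

    Derived by solving E = z * sqrt(1 / (4n)) for n (an assumption of this
    sketch; the formula is not written out in the lesson).
    """
    return ceil((z_half_alpha(conf) / (2 * target_margin)) ** 2)

# The poll from the remark: n = 972, sample proportion 0.64, 95 percent level.
print(f"conservative E = {conservative_margin(972, 0.95):.3f}")
print(f"margin e       = {margin(0.64, 972, 0.95):.3f}")
print(f"n for E = 0.03: {sample_size_for(0.03, 0.95)}")
```

As expected, the margin e computed from the sample proportion is slightly smaller than the conservative margin E.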

Use of Calculators (TI-83)

Press STAT and then select TESTS.
Select 1-PropZInt and press ENTER.
Enter the values of the number of successes x, n, and C-Level.
Select Calculate and press ENTER. The calculator will give you the confidence interval.
The margin of error = e = (width of the interval)/2 = (right end - left end)/2.
To compute the conservative margin of error, use the formula E = z_{α/2} √(1/(4n)).
To compute the sample size n needed for a given conservative margin of error E, solve the same formula for n: n = (z_{α/2}/(2E))².

Problems on 8.3: About the Population Proportion

Use your calculator to solve the following problems.
You can have a look at the flash-animated solutions for conceptual reasons.

Exercise 8.3.1. In a sample of 197 apples from a lot, 19 were found to be sour. Construct a 99 percent confidence interval for the proportion p of sour apples in the lot.

Exercise 8.3.2. A new vaccine was tried on 147 randomly selected individuals, and it was determined that 97 of them developed immunity. Find a 95 percent confidence interval for the proportion p of individuals in the population for whom the vaccine would help.

Exercise 8.3.3. For the coming congressional election, a poll was conducted: Out of 887 randomly selected voters interviewed, 389 said that they would vote for Candidate A, 359 said that they would vote for Candidate B.

Construct a 98 percent confidence interval for the proportion p of voters who would vote for A.
Construct a 98 percent confidence interval for the proportion q of voters who would vote for B.

Exercise 8.3.4. A pollster wants to estimate the proportion p of Americans who thought that President Clinton should not have been impeached in 1998. The pollster interviewed 711 individuals, and 420 agreed. Compute a 95 percent confidence interval for p.

Exercise 8.3.5. The proportion p of defective light bulbs produced by a machine needs to be estimated. A sample of 812 bulbs was tested, and 162 were found to be defective. Compute a 98 percent confidence interval for p.

Exercise 8.3.6. In a poll read on October 28, 1998, it was revealed that 60 percent of Americans wanted President Clinton rebuked but not impeached. It was also given that the poll was conducted among 1,013 adults, and it had a margin of error of 3 percentage points.

Can you relate the last two numbers?
What is the level of confidence used here?
Give a confidence interval for the proportion p of adults who feel this way.
