Math 365, Elementary Statistics

 

Lesson 8 : Comparing Two Populations

Satya Mandal

Due Date: Visit the homework site.

Introduction

In this lesson, two populations will be compared by interval estimation. The following will be considered:

  1. Compute confidence intervals for the difference μ1 - μ2 of the means of two populations. For example, the difference μ1 - μ2 between the mean annual income μ1 of the male population and the mean annual income μ2 of the female population could be of some interest.
  2. Compute a confidence interval for the difference p1 - p2 of the proportions of an attribute present (or proportions of "success") in two populations. For example, there may be some interest in the difference p1 - p2 between the proportion p1 of defective items produced by the new machine and the proportion p2 of defective items produced by the old machine.

8.1 Confidence Interval of μ1 - μ2

Suppose X, Y are two similar random variables. Let the mean and standard deviation of X be, respectively, μ1 and σ1. Let the mean and standard deviation of Y be, respectively, μ2 and σ2. We want to compute a confidence interval for the difference μ1 - μ2. We proceed as follows.

  1. A sample X1, X2, …, Xm, of size m, is drawn from the X population and a sample Y1, Y2, …, Yn, of size n, is drawn from the Y population. Let

    X̄  =  (X1 + X2 + … + Xm)/m

    Ȳ  =  (Y1 + Y2 + … + Yn)/n

    be the corresponding sample means.

  2. By the CLT, X̄ has the

    N(μ1, σ1/√m)

    distribution and Ȳ has the

    N(μ2, σ2/√n)

    distribution.

  3. The statistic X̄ - Ȳ will be used as an estimator of μ1 - μ2.
  4. Assume that the X samples and Y samples are mutually independent. In that case, it follows that X̄ - Ȳ has the

    N(μ1 - μ2, σ) distribution,          where       σ  =  √(σ1²/m + σ2²/n).

  5. It follows that

    P(-z_{α/2}  ≤  ((X̄ - Ȳ) - (μ1 - μ2))/σ  ≤  z_{α/2})  =  1 - α,

    where σ is as above in (4).

  6. If we simplify, we get

    P(X̄ - Ȳ - z_{α/2} σ  ≤  μ1 - μ2  ≤  X̄ - Ȳ + z_{α/2} σ)  =  1 - α,

    where σ is as above in (4).

  7. Theorem. A (1-α)100 percent confidence interval for μ1 - μ2 is given by

    x̄ - ȳ - z_{α/2} σ  ≤  μ1 - μ2  ≤  x̄ - ȳ + z_{α/2} σ,

    which is rewritten as

    x̄ - ȳ - E  ≤  μ1 - μ2  ≤  x̄ - ȳ + E,

    where    E = z_{α/2} σ    and    σ is as above in (4).

    This formula is usable if we know the values σ1 and σ2. Informally, we call this the 2-sample Z-interval.

  8. The margin of error (MOE) is defined as

    E  =  z_{α/2} σ  =  z_{α/2} √(σ1²/m + σ2²/n).

    As in lesson 7, we will use the terms LEP and REP.
  9. All the above would be "approximate", which we take the liberty to not mention. If X and Y are normal, then all the above are exact.
  10. When the sample sizes m and n are both large, we can use the sample standard deviations s1 ≅ σ1 and s2 ≅ σ2 in the formula for the MOE E.
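The steps above can be sketched in a few lines of code. This is a minimal Python sketch and not part of the original lesson: the function name `two_sample_z_interval` is hypothetical, and the standard-library `statistics.NormalDist` is assumed as a stand-in for the calculator.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z_interval(xbar, ybar, sigma1, sigma2, m, n, confidence):
    """Return (E, LEP, REP) for mu1 - mu2 when sigma1 and sigma2 are known."""
    alpha = 1 - confidence
    # z_{alpha/2}: the point with area alpha/2 to its right under N(0, 1)
    z = NormalDist().inv_cdf(1 - alpha / 2)
    # standard deviation of xbar - ybar, as in step (4)
    sigma = sqrt(sigma1**2 / m + sigma2**2 / n)
    e = z * sigma  # margin of error, as in step (8)
    diff = xbar - ybar
    return e, diff - e, diff + e


# Data of Exercise 8.1.1
e, lep, rep = two_sample_z_interval(3.7, 4.1, 8.1, 11.3, 64, 99, 0.99)
print(round(e, 4), round(lep, 4), round(rep, 4))
```

Because the sketch keeps z unrounded, its output may differ from a hand computation in the last printed digit.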

Problem Solving: As in sections 7.1 - 7.3, the TI-84 has a method that essentially computes the 2-Sample Z-interval and the other two confidence intervals in this lesson. In any case, we will use the formulas above along with the invNormal function of the TI-84 (that is, solve by a "Long Hand Method").
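For readers without a TI-84, the invNormal step can be reproduced in software. The following is a small Python sketch, assuming the standard-library `statistics.NormalDist`; the helper name `z_critical` is hypothetical.

```python
from statistics import NormalDist


def z_critical(confidence):
    """z_{alpha/2} for a given confidence level 1 - alpha."""
    alpha = 1 - confidence
    # same role as invNormal(1 - alpha/2) on the TI-84
    return NormalDist().inv_cdf(1 - alpha / 2)


# Confidence levels that appear in the exercises of this lesson
for level in (0.94, 0.95, 0.96, 0.97, 0.98, 0.99):
    print(level, round(z_critical(level), 4))
```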


Problems on 8.1: Confidence Interval of μ1 - μ2

Exercise 8.1.1. Suppose we have two normal populations with means μ1, μ2 and standard deviations σ1, σ2, respectively. It is known that σ1 = 8.1 and σ2 = 11.3. A sample of size m = 64 was collected from the first population, and the sample mean was found to be x̄ = 3.7. A sample of size n = 99 was collected from the second population, and the sample mean was found to be ȳ = 4.1. Compute a 99 percent confidence interval for the difference of means μ1 - μ2.

Solution:
The given data is summarized as follows:

|  | Population I (X) | Population II (Y) |
|---|---|---|
| Population st. deviation | σ1 = 8.1 | σ2 = 11.3 |
| Sample mean | x̄ = 3.7 | ȳ = 4.1 |
| Sample size | m = 64 | n = 99 |

Level of confidence = 99 percent. So
1 - α = .99,   α = .01   and   α/2 = .005.
Therefore, z_{α/2} = z_{.005} = invNormal(.995) = 2.5758

MOE  =  E  =  z_{α/2} σ  =  z_{α/2} √(σ1²/m + σ2²/n)  =  2.5758 √((8.1)²/64 + (11.3)²/99)  =  3.9191

LEP = x̄ - ȳ - E = 3.7 - 4.1 - 3.9191 = -4.3191
REP = x̄ - ȳ + E = 3.7 - 4.1 + 3.9191 = 3.5191

Exercise 8.1.2. The birth weights of babies in developed and developing countries are normally distributed with means μ1, μ2 and standard deviations σ1, σ2, respectively. (My data is not real.) It is given that σ1 = 2.3 pounds and σ2 = 2.9 pounds. A sample of m = 35 babies from developed nations was collected and the sample mean birth weight was found to be x̄ = 8.9 pounds. A sample of n = 48 babies from developing nations was collected and the sample mean birth weight was found to be ȳ = 7.1 pounds.

Determine the margin of error and a confidence interval for the difference μ1 - μ2 at the 95 percent level of confidence.

Solution:
The given data is summarized as follows:

|  | Population I (X) | Population II (Y) |
|---|---|---|
| Population st. deviation | σ1 = 2.3 | σ2 = 2.9 |
| Sample mean | x̄ = 8.9 | ȳ = 7.1 |
| Sample size | m = 35 | n = 48 |

Level of confidence = 95 percent. So
1 - α = .95,   α = .05   and   α/2 = .025.
Therefore, z_{α/2} = z_{.025} = invNormal(1-.025) = 1.9600

MOE  =  E  =  z_{α/2} σ  =  z_{α/2} √(σ1²/m + σ2²/n)  =  1.9600 √((2.3)²/35 + (2.9)²/48)  =  1.1197

LEP = x̄ - ȳ - E = 8.9 - 7.1 - 1.1197 = .6803
REP = x̄ - ȳ + E = 8.9 - 7.1 + 1.1197 = 2.9197

Exercise 8.1.3. African elephants and Indian elephants are different in height, weight, and length of ear and tusk. It is natural to assume that all these are normally distributed. The mean height and standard deviation of African elephants are μ1, σ1 = 1.2 feet, respectively. The mean height and standard deviation of Indian elephants are μ2, σ2 = 1.1 feet, respectively. A sample of 25 African elephants was collected and the sample mean height was found to be x̄ = 10.9 feet. A sample of 28 Indian elephants was collected and the sample mean height was found to be ȳ = 9.1 feet.

Determine the margin of error and a confidence interval for the difference μ1 - μ2 at the 98 percent level of confidence.

Solution:
The given data is summarized as follows:

|  | Population I (X) | Population II (Y) |
|---|---|---|
| Population st. deviation | σ1 = 1.2 | σ2 = 1.1 |
| Sample mean | x̄ = 10.9 | ȳ = 9.1 |
| Sample size | m = 25 | n = 28 |

Level of confidence = 98 percent. So
1 - α = .98,   α = .02   and   α/2 = .01.
Therefore, z_{α/2} = z_{.01} = invNormal(1-.01) = 2.3263

MOE  =  E  =  z_{α/2} σ  =  z_{α/2} √(σ1²/m + σ2²/n)  =  2.3263 √((1.2)²/25 + (1.1)²/28)  =  .7386

LEP = x̄ - ȳ - E = 10.9 - 9.1 - .7386 = 1.0614
REP = x̄ - ȳ + E = 10.9 - 9.1 + .7386 = 2.5386

Exercise 8.1.4. The mean weights of King salmon in the Kenai and Anchor Rivers are to be compared. The mean weight of King salmon in Kenai is μ1 and the standard deviation is σ1 = 7.7 pounds. The mean weight of King salmon in Anchor is μ2 and the standard deviation is σ2 = 9.1 pounds. A sample of 51 King salmon from Kenai had a mean x̄ = 31 pounds. A sample of 63 King salmon from Anchor had a mean ȳ = 33 pounds.

Determine the margin of error and a confidence interval for the difference μ1 - μ2 at the 97 percent level of confidence.

Solution:
The given data is summarized as follows:

|  | Population I (X): Kenai | Population II (Y): Anchor |
|---|---|---|
| Population st. deviation | σ1 = 7.7 | σ2 = 9.1 |
| Sample mean | x̄ = 31 | ȳ = 33 |
| Sample size | m = 51 | n = 63 |

Level of confidence = 97 percent. So
1 - α = .97,   α = .03   and   α/2 = .015.
Therefore, z_{α/2} = z_{.015} = invNormal(1-.015) = 2.1701

MOE  =  E  =  z_{α/2} σ  =  z_{α/2} √(σ1²/m + σ2²/n)  =  2.1701 √((7.7)²/51 + (9.1)²/63)  =  3.4154

LEP = x̄ - ȳ - E = 31 - 33 - 3.4154 = -5.4154
REP = x̄ - ȳ + E = 31 - 33 + 3.4154 = 1.4154

Exercise 8.1.5. There is a difference between fall semester grades and spring semester grades. The mean percentage score in fall is μ1 and the standard deviation is σ1 = 27 percent. The mean percentage score in spring is μ2 and the standard deviation is σ2 = 23 percent. A sample of 87 students in fall had a sample mean score x̄ = 73 percent. A sample of 77 students in spring had a sample mean score ȳ = 69 percent.

Determine the margin of error and a confidence interval for the difference μ1 - μ2 at the 96 percent level of confidence.

Solution:
The given data is summarized as follows:

|  | Population I (X): Fall | Population II (Y): Spring |
|---|---|---|
| Population st. deviation | σ1 = 27 | σ2 = 23 |
| Sample mean | x̄ = 73 | ȳ = 69 |
| Sample size | m = 87 | n = 77 |

Level of confidence = 96 percent. So
1 - α = .96,   α = .04   and   α/2 = .02.
Therefore, z_{α/2} = z_{.02} = invNormal(1-.02) = 2.0537

MOE  =  E  =  z_{α/2} σ  =  z_{α/2} √(σ1²/m + σ2²/n)  =  2.0537 √((27)²/87 + (23)²/77)  =  8.0198

LEP = x̄ - ȳ - E = 73 - 69 - 8.0198 = -4.0198
REP = x̄ - ȳ + E = 73 - 69 + 8.0198 = 12.0198

Exercise 8.1.6. The difference in mean annual salary of the professors in two state universities has to be estimated. The mean annual salary in University-I is μ1 and the standard deviation is σ1 = $16,000. The mean annual salary in University-II is μ2 and the standard deviation is σ2 = $11,500. A sample of 47 professors in University-I had a mean salary x̄ = $79,000. A sample of 58 professors in University-II had a mean salary ȳ = $71,500.

Determine the margin of error and a confidence interval for the difference μ1 - μ2 at the 94 percent level of confidence.

Solution:
The given data is summarized as follows:

|  | Population I (X): University-I | Population II (Y): University-II |
|---|---|---|
| Population st. deviation | σ1 = 16000 | σ2 = 11500 |
| Sample mean | x̄ = 79000 | ȳ = 71500 |
| Sample size | m = 47 | n = 58 |

Level of confidence = 94 percent. So
1 - α = .94,   α = .06   and   α/2 = .03.
Therefore, z_{α/2} = z_{.03} = invNormal(.97) = 1.8808

MOE  =  E  =  z_{α/2} σ  =  z_{α/2} √(σ1²/m + σ2²/n)  =  1.8808 √((16000)²/47 + (11500)²/58)  =  5228.1439

LEP = x̄ - ȳ - E = 79000 - 71500 - 5228.1439 = 2271.8561
REP = x̄ - ȳ + E = 79000 - 71500 + 5228.1439 = 12728.1439



8.2 When σ1 and σ2 are Unknown

As in the last section, X, Y represent two similar populations. In this section, we again construct confidence intervals for the difference of means μ1 - μ2, now for the case when the standard deviations σ1, σ2 are unknown.

As a price, we have to assume that X has the N(μ1, σ1) distribution and Y has the N(μ2, σ2) distribution.

Take a sample X1, X2, …, Xm of size m from the X population, and take another sample Y1, Y2, …, Yn of size n from the Y population. We proceed as follows.

  1. Further Assumption: It is further assumed that the variances σ1² and σ2² are equal. Use a common notation,

    σ1  =  σ2  =  σ.

    We also assume that the X-sample and the Y-sample are mutually independent.

  2. Let X̄ and SX denote the sample mean and sample standard deviation of the X-sample. Similarly, let Ȳ and SY denote the sample mean and sample standard deviation of the Y-sample.
  3. Definition. Define the pooled estimate Sp² for σ² as follows:

    Sp²  =  [(m-1)SX² + (n-1)SY²] / [m+n-2].

    Although both SX² and SY² are estimators of σ², Sp² would be a better estimator for σ² because it uses both samples.
    One can see that Sp² is a weighted average of SX² with weight (m-1) and SY² with weight (n-1).

    Therefore, the pooled estimate Sp for the standard deviation σ is given by

    Sp  =  √([(m-1)SX² + (n-1)SY²] / [m+n-2]).

  4. As in section 7.2, we state that

    T  =  [(X̄ - Ȳ) - (μ1 - μ2)] / [Sp √(1/m + 1/n)]

    has a t-distribution with degrees of freedom df = m+n-2.

  5. Using the same kind of computations that we have done before, we see that a (1-α)100 percent confidence interval for μ1 - μ2 is given by

    x̄ - ȳ - E  <  μ1 - μ2  <  x̄ - ȳ + E

    where

    E  =  t_{m+n-2, α/2} Sp √(1/m + 1/n).

    Informally, we call it the 2-Sample T-interval.
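The pooled estimate and the 2-Sample T-interval can be sketched in code as follows. This is a minimal Python sketch under stated assumptions: the function names `pooled_sd` and `two_sample_t_interval` are hypothetical, and the critical value t_{m+n-2, α/2} is passed in by the caller (for instance from invT on the TI-84), because the Python standard library has no inverse t-distribution function.

```python
from math import sqrt


def pooled_sd(s1, s2, m, n):
    """Pooled estimate Sp of the common standard deviation sigma."""
    return sqrt(((m - 1) * s1**2 + (n - 1) * s2**2) / (m + n - 2))


def two_sample_t_interval(xbar, ybar, s1, s2, m, n, t_crit):
    """Return (Sp, E, LEP, REP); t_crit is t_{m+n-2, alpha/2}, supplied by the caller."""
    sp = pooled_sd(s1, s2, m, n)
    e = t_crit * sp * sqrt(1 / m + 1 / n)  # margin of error
    diff = xbar - ybar
    return sp, e, diff - e, diff + e


# Data of Exercise 8.2.1; 2.1829 = invT(.98, 22) is taken from that solution
sp, e, lep, rep = two_sample_t_interval(13.2, 11.5, 2.33, 2.73, 11, 13, 2.1829)
print(round(sp, 4), round(e, 4), round(lep, 4), round(rep, 4))
```

If SciPy happens to be installed, `scipy.stats.t.ppf(1 - alpha/2, m + n - 2)` could supply the critical value instead of a calculator.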



Problems on 8.2: σ1 and σ2 are Unknown

Exercise 8.2.1. Suppose that two "similar" normal populations have means μ1, μ2, respectively, and the same standard deviation σ. In a sample of size m = 11 from the first population, the sample mean was found to be x̄ = 13.2 and the sample standard deviation s1 = 2.33. A sample of size n = 13 collected from the second population had a sample mean ȳ = 11.5 and sample standard deviation s2 = 2.73.

Compute the pooled estimate sp of σ and a confidence interval for μ1 - μ2 at the 96 percent level of confidence.

Solution:
We will use the 2-Sample T-interval, because σ1 and σ2 are unknown.
The given data is summarized as follows:

|  | Population I (X) | Population II (Y) |
|---|---|---|
| Sample size | m = 11 | n = 13 |
| Sample mean | x̄ = 13.2 | ȳ = 11.5 |
| Sample st. deviation | s1 = 2.33 | s2 = 2.73 |

The pooled estimate of σ is given by

Sp  =  √([(m-1)SX² + (n-1)SY²] / [m+n-2])
 =  √([(11-1)(2.33)² + (13-1)(2.73)²] / [11+13-2])  =  2.5560

The degrees of freedom:
df = m+n-2 = 11+13-2 = 22
Level of confidence = 96 percent. So
1 - α = .96,   α = .04   and   α/2 = .02. Therefore,
t_{m+n-2, α/2} = t_{22, .02} = invT(.98, 22) = 2.1829

MOE  =  E  =  t_{m+n-2, α/2} Sp √(1/m + 1/n)  =  2.1829*2.5560*√(1/11 + 1/13)  =  2.2858

LEP = x̄ - ȳ - E = 13.2 - 11.5 - 2.2858 = -.5858
REP = x̄ - ȳ + E = 13.2 - 11.5 + 2.2858 = 3.9858

Exercise 8.2.2. Suppose we have two normal populations with means μ1, μ2 and equal standard deviation σ. A sample of size m = 64 was collected from the first population and the sample mean and standard deviation were found to be x̄ = 3.7, s1 = 9.2. A sample of size n = 99 was collected from the second population and the sample mean and standard deviation were ȳ = 4.1, s2 = 8.7.

Compute the pooled estimate sp of σ and a confidence interval for μ1 - μ2 at the 95 percent level of confidence.

Solution:
We will use the 2-Sample T-interval, because σ1 and σ2 are unknown.
The given data is summarized as follows:

|  | Population I (X) | Population II (Y) |
|---|---|---|
| Sample size | m = 64 | n = 99 |
| Sample mean | x̄ = 3.7 | ȳ = 4.1 |
| Sample st. deviation | s1 = 9.2 | s2 = 8.7 |

The pooled estimate of σ is given by

Sp  =  √([(m-1)SX² + (n-1)SY²] / [m+n-2])
 =  √([(64-1)(9.2)² + (99-1)(8.7)²] / [64+99-2])  =  8.8990

The degrees of freedom:
df = m+n-2 = 64+99-2 = 161
Level of confidence = 95 percent. So
1 - α = .95,   α = .05   and   α/2 = .025. Therefore,
t_{m+n-2, α/2} = t_{161, .025} = invT(.975, 161) = 1.9748

MOE  =  E  =  t_{m+n-2, α/2} Sp √(1/m + 1/n)  =  1.9748*8.8990*√(1/64 + 1/99)  =  2.8187

LEP = x̄ - ȳ - E = 3.7 - 4.1 - 2.8187 = -3.2187
REP = x̄ - ȳ + E = 3.7 - 4.1 + 2.8187 = 2.4187

Exercise 8.2.3. The difference in mean monthly water consumption in two adjacent towns has to be estimated. A sample of 37 households in Town-I had a sample mean of 6300 gallons and a standard deviation of 450 gallons. A sample of 49 households in Town-II had a sample mean of 6800 gallons and a standard deviation of 650 gallons. Determine the margin of error and a 94 percent confidence interval for the difference μ1 - μ2.

Solution:
We will use the 2-Sample T-interval, because σ1 and σ2 are unknown.
The given data is summarized as follows:

|  | Population I (X) | Population II (Y) |
|---|---|---|
| Sample size | m = 37 | n = 49 |
| Sample mean | x̄ = 6300 | ȳ = 6800 |
| Sample st. deviation | s1 = 450 | s2 = 650 |

The pooled estimate of σ is given by

Sp  =  √([(m-1)SX² + (n-1)SY²] / [m+n-2])
 =  √([(37-1)(450)² + (49-1)(650)²] / [37+49-2])  =  572.8999

The degrees of freedom:
df = m+n-2 = 37+49-2 = 84
Level of confidence = 94 percent. So
1 - α = .94,   α = .06   and   α/2 = .03. Therefore,
t_{m+n-2, α/2} = t_{84, .03} = invT(.97, 84) = 1.9065

MOE  =  E  =  t_{m+n-2, α/2} Sp √(1/m + 1/n)  =  1.9065*572.8999*√(1/37 + 1/49)  =  237.8844

LEP = x̄ - ȳ - E = 6300 - 6800 - 237.8844 = -737.8844
REP = x̄ - ȳ + E = 6300 - 6800 + 237.8844 = -262.1156

Exercise 8.2.4. The birth weights of babies in developed and developing countries are normally distributed with means μ1, μ2 and equal standard deviation σ. (My data is not real.) Suppose the following data about birth weights from developed and developing nations were collected.

Developed
8.8 8.1 6.3 9.7 6.3
7.1 5.3 7.7 9.1 8.1
8.2 7.9 8.3 8.9 9.0
10.1 9.9 8.8 7.8 5.2
7.2        
 
Developing
6.3 5.2 8.3 5.9 5.5
7.1 8.1 7.9 6.3 6.9
9.1 8.1 7.0 4.9 5.3
6.3 7.1 6.3 6.1 5.8
5.7 6.8 8.3 7.7  

Compute the pooled estimate sp of σ and a confidence interval for μ1 - μ2 at the 97 percent level of confidence.

Solution:
We will use the 2-Sample T-interval, because σ1 and σ2 are unknown.
Using the TI-84, as in Lesson 2, we summarize the given data as follows:

|  | Population I (X) | Population II (Y) |
|---|---|---|
| Sample size | m = 21 | n = 24 |
| Sample mean | x̄ = 7.9905 | ȳ = 6.75 |
| Sample st. deviation | s1 = 1.3758 | s2 = 1.1417 |

The pooled estimate of σ is given by

Sp  =  √([(m-1)SX² + (n-1)SY²] / [m+n-2])
 =  √([(21-1)(1.3758)² + (24-1)(1.1417)²] / [21+24-2])  =  1.2560

The degrees of freedom:
df = m+n-2 = 21+24-2 = 43
Level of confidence = 97 percent. So
1 - α = .97,   α = .03   and   α/2 = .015. Therefore,
t_{m+n-2, α/2} = t_{43, .015} = invT(.985, 43) = 2.2445

MOE  =  E  =  t_{m+n-2, α/2} Sp √(1/m + 1/n)  =  2.2445*1.2560*√(1/21 + 1/24)  =  .8424

LEP = x̄ - ȳ - E = 7.9905 - 6.75 - .8424 = .3981
REP = x̄ - ȳ + E = 7.9905 - 6.75 + .8424 = 2.0829
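The summary statistics in the table of this solution can also be obtained without a calculator. The following Python sketch is an illustration only; it assumes the standard-library `statistics` module, whose `stdev` function computes the sample standard deviation (dividing by n - 1), and the list names are chosen here for the example.

```python
from statistics import mean, stdev

# Birth weight data of Exercise 8.2.4 (not real data)
developed = [
    8.8, 8.1, 6.3, 9.7, 6.3,
    7.1, 5.3, 7.7, 9.1, 8.1,
    8.2, 7.9, 8.3, 8.9, 9.0,
    10.1, 9.9, 8.8, 7.8, 5.2,
    7.2,
]
developing = [
    6.3, 5.2, 8.3, 5.9, 5.5,
    7.1, 8.1, 7.9, 6.3, 6.9,
    9.1, 8.1, 7.0, 4.9, 5.3,
    6.3, 7.1, 6.3, 6.1, 5.8,
    5.7, 6.8, 8.3, 7.7,
]

for name, data in (("Developed", developed), ("Developing", developing)):
    # sample size, sample mean, sample standard deviation
    print(name, len(data), round(mean(data), 4), round(stdev(data), 4))
```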

Exercise 8.2.5. African elephants and Indian elephants are different in height, weight, and length of ear and tusk. It is natural to assume that all these are normally distributed. Assume that the heights of African and Indian elephants have an equal standard deviation σ. The mean heights of African elephants and Indian elephants are μ1, μ2, respectively. Suppose the following data were collected on the height of elephants from the two continents (these are not real data).

African
10.9 11.7 9.3 9.9 11.5
8.8 12.9 11.7 9.1 11.1
9.1 8.7 10.5 11.3 12.3
13.1 12.9 9.5 10.7 11.3
  
Indian
7.1 8.3 8.2 9.1 10.3
9.3 9.7 8.9 8.8 9.1
7.9 9.9 9.2 8.8 8.1
8.7 8.8 9.3 10.1 9.9
9.9        

Compute the pooled estimate sp of σ and a confidence interval for μ1 - μ2 at the 99 percent level of confidence.

Solution:
We will use the 2-Sample T-interval, because σ1 and σ2 are unknown.
Using the TI-84, as in Lesson 2, we summarize the given data as follows:

|  | Population I (X) | Population II (Y) |
|---|---|---|
| Sample size | m = 20 | n = 21 |
| Sample mean | x̄ = 10.815 | ȳ = 9.0190 |
| Sample st. deviation | s1 = 1.4162 | s2 = .8072 |

The pooled estimate of σ is given by

Sp  =  √([(m-1)SX² + (n-1)SY²] / [m+n-2])
 =  √([(20-1)(1.4162)² + (21-1)(.8072)²] / [20+21-2])  =  1.1451

The degrees of freedom:
df = m+n-2 = 20+21-2 = 39
Level of confidence = 99 percent. So
1 - α = .99,   α = .01   and   α/2 = .005. Therefore,
t_{m+n-2, α/2} = t_{39, .005} = invT(.995, 39) = 2.7079

MOE  =  E  =  t_{m+n-2, α/2} Sp √(1/m + 1/n)  =  2.7079*1.1451*√(1/20 + 1/21)  =  .9688

LEP = x̄ - ȳ - E = 10.815 - 9.0190 - .9688 = .8272
REP = x̄ - ȳ + E = 10.815 - 9.0190 + .9688 = 2.7648


8.3 Comparing Two Population Proportions

In this section, we consider estimation, by confidence interval, of the difference p1 - p2 of the proportions of an attribute in two populations. Examples of such differences of proportions include (1) the difference p1 - p2 between the proportions of the male and female populations who earn more than fifty thousand dollars annually; (2) the difference p1 - p2 of the proportions of defective items produced by the old machine and the new machine in a factory.

Consider the proportions of an attribute A in two populations. Let p1 and p2 represent the proportions of the attribute A in Population I and Population II, respectively. A confidence interval for p1 - p2 will be constructed.

A sample of size m from Population I and a sample of size n from Population II are collected. Let X be the number of members of the Population I sample that have the attribute A, and let X̄ = X/m be the sample proportion that has the attribute A. Similarly, let Y be the number of members of the Population II sample that have the attribute A, and let Ȳ = Y/n be the sample proportion that has the attribute A. (In other words, X is the number of "successes" and X̄ = X/m is the proportion of successes in the Population I sample. Similarly, Y and Ȳ = Y/n for the Population II sample.) Importantly, these two samples are collected independently.

(An Example: Compare the proportions of the male and female members in a community who earn more than $50,000 annually. A sample of m male members is interviewed; X would be the number of those who make more than fifty thousand annually and X̄ = X/m would be the sample proportion of those who make more than fifty thousand annually. Similarly, interview n female members; Ȳ = Y/n would be the sample proportion of female members who make more than fifty thousand.)

We develop a confidence interval for p1-p2 as follows.

  1. Notation. For the sample proportions, we have the following notation:

    X̄ = X/m                  Ȳ = Y/n

  2. By the CLT, X̄ has the N(p1, σ1) distribution where σ1 = √(p1(1-p1)/m), and Ȳ has the N(p2, σ2) distribution where σ2 = √(p2(1-p2)/n).

  3. An estimator of p1 - p2 would be X̄ - Ȳ.
  4. Since the X samples and Y samples are mutually independent, it follows that X̄ - Ȳ has the N(p1 - p2, σ) distribution where σ = √(σ1² + σ2²).

  5. It follows that

    P(-z_{α/2}  ≤  ((X̄ - Ȳ) - (p1 - p2))/σ  ≤  z_{α/2})  =  1 - α.

    Simplify and get

    P((X̄ - Ȳ) - z_{α/2} σ  ≤  p1 - p2  ≤  (X̄ - Ȳ) + z_{α/2} σ)  =  1 - α.

  6. As in section 7.4, we use X̄ as an estimate for p1 and Ȳ as an estimate for p2 and get the following theorem.

    Theorem. An approximate (1-α)100 percent confidence interval for p1 - p2 is given by

    X̄ - Ȳ - E  ≤  p1 - p2  ≤  X̄ - Ȳ + E

    where

    E  =  z_{α/2} √(X̄(1-X̄)/m + Ȳ(1-Ȳ)/n).

  7. The quantity E is called the margin of error.
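The theorem above translates directly into a short computation. This is a minimal Python sketch, not part of the original lesson: the function name `two_proportion_interval` is hypothetical, and `statistics.NormalDist` from the standard library is assumed in place of invNormal.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_interval(x, m, y, n, confidence):
    """Return (E, LEP, REP) for p1 - p2 from success counts x, y and sample sizes m, n."""
    p1_hat = x / m  # sample proportion in the Population I sample
    p2_hat = y / n  # sample proportion in the Population II sample
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    e = z * sqrt(p1_hat * (1 - p1_hat) / m + p2_hat * (1 - p2_hat) / n)
    diff = p1_hat - p2_hat
    return e, diff - e, diff + e


# Data of Exercise 8.3.5
e, lep, rep = two_proportion_interval(129, 217, 158, 313, 0.98)
print(round(e, 6), round(lep, 6), round(rep, 6))
```

The sketch keeps the sample proportions unrounded, so its results can differ from a long-hand computation (which rounds the proportions to four decimals) from about the fifth decimal place on.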

Problems on 8.3: Comparing Two Population Proportions.

Exercise 8.3.1. Two independent samples were collected from two populations to compare the proportions p1, p2 of an attribute A present, respectively, in these two populations. Use a 95 percent confidence interval to estimate p1 - p2. It is given that x = 55 had the attribute A in a sample of size m = 117 from the first population and y = 37 had the attribute A in a sample of size n = 79 from the second population.

Solution:
The given data is summarized as follows:

|  | Population I (X) | Population II (Y) |
|---|---|---|
| Number of successes | X = 55 | Y = 37 |
| Sample size | m = 117 | n = 79 |
| Sample proportion | X̄ = X/m = 55/117 = .4701 | Ȳ = Y/n = 37/79 = .4684 |

Level of confidence = 95 percent. So
1 - α = .95,   α = .05   and   α/2 = .025.
Therefore, z_{α/2} = z_{.025} = invNormal(1-.025) = 1.9600

E  =  z_{α/2} √(X̄(1-X̄)/m + Ȳ(1-Ȳ)/n)
 =  1.9600*√(.4701*(1-.4701)/117 + .4684*(1-.4684)/79)  =  .142435       [In this section, for the error term, we retain at least 6 decimal places.]

LEP = X̄ - Ȳ - E = .4701 - .4684 - .142435 = -.140735
REP = X̄ - Ȳ + E = .4701 - .4684 + .142435 = .144135

Exercise 8.3.2. To compare the proportions p1, p2 of defective lamps produced by the new production center and the old production center, respectively, samples were collected. In a sample of 157 lamps from the new center, 26 were found to be defective; and in a sample of 141 lamps from the old center, 32 were defective. Compute a 99 percent confidence interval for p1 - p2.

Solution:
The given data is summarized as follows:

|  | Population I (X): New Center | Population II (Y): Old Center |
|---|---|---|
| Number of successes | X = 26 | Y = 32 |
| Sample size | m = 157 | n = 141 |
| Sample proportion | X̄ = X/m = 26/157 = .1656 | Ȳ = Y/n = 32/141 = .2270 |

Level of confidence = 99 percent. So
1 - α = .99,   α = .01   and   α/2 = .005.
Therefore, z_{α/2} = z_{.005} = invNormal(.995) = 2.5758

E  =  z_{α/2} √(X̄(1-X̄)/m + Ȳ(1-Ȳ)/n)
 =  2.5758*√(.1656*(1-.1656)/157 + .2270*(1-.2270)/141)  =  .118727       [In this section, for the error term, we retain at least 6 decimal places.]

LEP = X̄ - Ȳ - E = .1656 - .2270 - .118727 = -.180127
REP = X̄ - Ȳ + E = .1656 - .2270 + .118727 = .057327

Exercise 8.3.3. To compare the proportions p1, p2 of men and women, respectively, who watch football, data was collected. In a sample of 199 men, 83 said that they watch football; and in a sample of 161 women, 51 said they watch football. (These are not real data.) Construct a 99 percent confidence interval for p1 - p2.

Solution:
The given data is summarized as follows:

|  | Population I (X): Men | Population II (Y): Women |
|---|---|---|
| Number of successes | X = 83 | Y = 51 |
| Sample size | m = 199 | n = 161 |
| Sample proportion | X̄ = X/m = 83/199 = .4171 | Ȳ = Y/n = 51/161 = .3168 |

Level of confidence = 99 percent. So
1 - α = .99,   α = .01   and   α/2 = .005.
Therefore, z_{α/2} = z_{.005} = invNormal(.995) = 2.5758

E  =  z_{α/2} √(X̄(1-X̄)/m + Ȳ(1-Ȳ)/n)
 =  2.5758*√(.4171*(1-.4171)/199 + .3168*(1-.3168)/161)  =  .130481       [In this section, for the error term, we retain at least 6 decimal places.]

LEP = X̄ - Ȳ - E = .4171 - .3168 - .130481 = -.030181
REP = X̄ - Ȳ + E = .4171 - .3168 + .130481 = .230781

Exercise 8.3.4. Two varieties of grapes are compared. To compare the proportions p1, p2 of acceptable grapes in these two varieties, respectively, samples were drawn. In a sample of 131 grapes from variety I, 107 were acceptable. In a sample of 143 grapes from variety II, 113 were acceptable. Construct a 97 percent confidence interval for the difference p1 - p2.

Solution:
The given data is summarized as follows:

|  | Population I (X): Variety I | Population II (Y): Variety II |
|---|---|---|
| Number of successes | X = 107 | Y = 113 |
| Sample size | m = 131 | n = 143 |
| Sample proportion | X̄ = X/m = 107/131 = .8168 | Ȳ = Y/n = 113/143 = .7902 |

Level of confidence = 97 percent. So
1 - α = .97,   α = .03   and   α/2 = .015.
Therefore, z_{α/2} = z_{.015} = invNormal(1-.015) = 2.1701

E  =  z_{α/2} √(X̄(1-X̄)/m + Ȳ(1-Ȳ)/n)
 =  2.1701*√(.8168*(1-.8168)/131 + .7902*(1-.7902)/143)  =  .104111       [In this section, for the error term, we retain at least 6 decimal places.]

LEP = X̄ - Ȳ - E = .8168 - .7902 - .104111 = -.077511
REP = X̄ - Ȳ + E = .8168 - .7902 + .104111 = .130711

Exercise 8.3.5. To compare the proportions p1, p2 of students, respectively, in two state universities who pay more than $15 K tuition per year, samples were collected. In a sample of 217 students in university I, 129 paid more than $15 K. In a sample of 313 students in university II, 158 paid more than $15 K. Construct a 98 percent confidence interval for the difference p1 - p2.

Solution:
The given data is summarized as follows:

|  | Population I (X): University I | Population II (Y): University II |
|---|---|---|
| Number of successes | X = 129 | Y = 158 |
| Sample size | m = 217 | n = 313 |
| Sample proportion | X̄ = X/m = 129/217 = .5945 | Ȳ = Y/n = 158/313 = .5048 |

Level of confidence = 98 percent. So
1 - α = .98,   α = .02   and   α/2 = .01.
Therefore, z_{α/2} = z_{.01} = invNormal(1-.01) = 2.3263

E  =  z_{α/2} √(X̄(1-X̄)/m + Ȳ(1-Ȳ)/n)
 =  2.3263*√(.5945*(1-.5945)/217 + .5048*(1-.5048)/313)  =  .101656       [In this section, for the error term, we retain at least 6 decimal places.]

LEP = X̄ - Ȳ - E = .5945 - .5048 - .101656 = -.011956
REP = X̄ - Ȳ + E = .5945 - .5048 + .101656 = .191356

Exercise 8.3.6. To compare the proportions p1, p2 of college graduates who earn more than 50 K in two states, data was collected. In a sample of 444 college graduates in state I, 334 earn more than 50 K. In a sample of 546 college graduates in state II, 414 earn more than 50 K. Construct a 96 percent confidence interval for the difference p1 - p2.

Solution:
The given data is summarized as follows:

|  | Population I (X): State I | Population II (Y): State II |
|---|---|---|
| Number of successes | X = 334 | Y = 414 |
| Sample size | m = 444 | n = 546 |
| Sample proportion | X̄ = X/m = 334/444 = .7523 | Ȳ = Y/n = 414/546 = .7582 |

Level of confidence = 96 percent. So
1 - α = .96,   α = .04   and   α/2 = .02.
Therefore, z_{α/2} = z_{.02} = invNormal(1-.02) = 2.0537

E  =  z_{α/2} √(X̄(1-X̄)/m + Ȳ(1-Ȳ)/n)
 =  2.0537*√(.7523*(1-.7523)/444 + .7582*(1-.7582)/546)  =  .056448       [In this section, for the error term, we retain at least 6 decimal places.]

LEP = X̄ - Ȳ - E = .7523 - .7582 - .056448 = -.062348
REP = X̄ - Ȳ + E = .7523 - .7582 + .056448 = .050548

Exercise 8.3.7. It is believed that women are safer drivers than men. Let p1, p2 denote the proportions of women and men drivers, respectively, who were involved in an auto accident in a one-year period. In a sample of 739 women drivers, 39 were involved in an auto accident during this period. During the same period, in a sample of 1215 men, 79 were involved in an auto accident. Construct a 95 percent confidence interval for the difference p1 - p2.
