MATH 105, Topics in Mathematics

Lesson 8: Testing Hypotheses

Satya Mandal

The Philosophy of Testing Hypotheses

8.1 Significance Test for mean μ, when σ is known
(Homework 26)

8.2 Significance Test for μ, Case of σ Unknown
(Homework 27, 28)

8.3 Population Proportion
(Homework 29)

Homework 26 - 29

Due Date: See the Lecture Notes Site.

The Philosophy of Testing Hypotheses

Testing of hypotheses is another approach to the estimation of parameters. A hypothesis H₀, called the Null hypothesis, is tested against another hypothesis H_A, called the alternative hypothesis. Only one of these two hypotheses is true. Based on the collected sample and an established testing criterion, one of them is accepted and the other is rejected. The following two examples provide further insight.

Example 1. An assertion is made that the disparity between the wages (annual income) of working men and women no longer exists. To test this assertion, the mean annual incomes μ₁ and μ₂ of the working male and female populations, respectively, are compared. Our Null hypothesis H₀ would be that the mean annual income μ₁ of the working male population is higher than the mean annual income μ₂ of the working female population. The Alternative Hypothesis H_A would be, as the assertion suggests, that these two means are equal. We write them formally as:

H₀ : μ₁ − μ₂ > 0
H_A : μ₁ − μ₂ = 0

Example 2. A TV commentator mentioned that, during the last decade, the life expectancy of human beings has increased substantially from 75 years. To test this assertion, the mean life expectancy μ is compared with 75. The Null hypothesis H₀ would be that the mean life expectancy μ remains equal to 75, as it was before. The Alternative Hypothesis H_A would be that, as the assertion suggests, the mean μ has risen above 75 years by now. We write them formally as:

H₀ : μ = 75
H_A : μ > 75

Definitions and Terminologies.

Following are some definitions and terminologies.

Definition. A statistical hypothesis is defined to be a statement, claim, or proposition regarding a population. Usually, it is about the values of the population parameters. The hypotheses H₀ and H_A in the above two examples are examples of statistical hypotheses.
It is important to distinguish which one is the Null hypothesis and which one is the alternative hypothesis in a given context. One of them is, essentially, the negation of the other.
The Null hypothesis H₀ represents the status quo. It is the conventional wisdom. It represents something that was accepted for a long time, or some assumption or method that has been working reliably for a long time. The Null hypothesis remains the default, unless the collected data provides very strong evidence against it, in favor of the alternative. There is a clear bias in favor of the Null Hypothesis.

The alternative hypothesis represents a new claim or something out of the ordinary. It could be a researcher's new technology or a salesperson's claim. The bar for acceptance of the Alternative Hypothesis is very high. The burden of proof of its validity rests with those who assert it. There may even be resistance or skepticism about its validity. It is accepted only if the collected data provides very strong evidence in its support.

There is a reason for this bias in favor of the Null Hypothesis: an incorrect decision to reject the Null may have more serious consequences than incorrectly rejecting the Alternative. For example, in any medical test, erroneously concluding that the patient does not have an ailment would have more grievous consequences than erroneously concluding that the patient has the same ailment. Similarly, when one designs a pregnancy test, the priority would be to minimize the chances (probability) of erroneously concluding that one is not pregnant when one is indeed pregnant, rather than the converse. Common sense dictates that such a test could only allow a maximum of five percent of such erroneous conclusions. An erroneous conclusion of the first kind is known as a false negative, and one of the converse kind as a false positive.
Given a Null hypothesis H₀ and an alternative hypothesis H_A, a test of hypothesis is a rule or a procedure to decide, based on the collected sample, whether to accept H₀ or H_A. The rule is also called the decision rule.
A test of hypothesis is also known as a Significance Test. The test is based on the value of a test statistic.
Two Types of errors. In such testing of hypotheses, two types of mistaken conclusions (errors) are possible, as follows.
1. Rejecting the Null H₀ when it is in fact true is called the type one error. The analogy would be a false negative.
2. Accepting the Null H₀ when it is in fact false is called the type two error. The corresponding analogy would be a false positive.
The probability of type one error is called the level of significance. It is denoted by α. Since the priority is to minimize the frequency of false negatives, α is a small number. Most often, α will be .1, .05, .01 or some other small number.
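The meaning of the level of significance can be illustrated by simulation. The sketch below is only an illustration under stated assumptions: it uses the two tail Z-test of Section 8.1 with the standard statistic Z = (X̄ − μ₀)/(σ/√m), draws normal samples for which H₀ is true, and counts how often H₀ is rejected anyway. The function name and the values μ₀ = 75, σ = 15, m = 25 are chosen for illustration only.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

def estimated_type_one_rate(mu0, sigma, m, alpha, trials=2000, seed=1):
    """Fraction of samples, drawn with H0 true, for which H0 gets rejected."""
    rng = random.Random(seed)                     # fixed seed: repeatable run
    cutoff = NormalDist().inv_cdf(1 - alpha / 2)  # two tail critical value
    rejected = 0
    for _ in range(trials):
        # H0 is true here: the population mean really is mu0
        sample = [rng.gauss(mu0, sigma) for _ in range(m)]
        z = (fmean(sample) - mu0) / (sigma / sqrt(m))
        if abs(z) > cutoff:                       # type one error
            rejected += 1
    return rejected / trials

rate = estimated_type_one_rate(mu0=75, sigma=15, m=25, alpha=0.05)
print(rate)  # expected to land near alpha = 0.05
```

The observed fraction of type one errors should be close to the chosen α, which is what it means for α to be the probability of a type one error.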

The rest of this chapter would be analogous to Lesson 7. Corresponding to each interval estimation we considered, there would be one Significance Test.

8.1 A Significance Test for mean μ, when σ is known

Let X be a random variable with mean μ and standard deviation σ. Some of our hypothesis tests would look like the following.

Two Tail Test	Left Tail Test	Right Tail Test
H₀ : μ = 75	H₀ : μ = 75	H₀ : μ = 75
H_A : μ ≠ 75	H_A : μ < 75	H_A : μ > 75

In this course, every Null Hypothesis H₀ will be an equality. The alternative Hypothesis H_A will be one of the three inequalities above.

Develop a Significance Test

A Significance Test for the mean μ would be developed for the following Null and Alternative hypotheses:

Take a sample X₁, X₂, …, X_m of size m from the X population and let X̄ be the sample mean.

The Hypothesis Test for the mean μ, when σ is known

Arguing similarly, set the Decision Rules for all three tests for the mean μ. It is assumed that the value of σ is known.

Definition. The set of values (that is, the intervals) that leads to the rejection of the Null hypothesis H₀ is called the rejection region or the critical region.
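The critical region form of the decision rules can be sketched in Python. This is a minimal illustration, assuming the standard statistic Z = (X̄ − μ₀)/(σ/√m) and standard normal critical values. The function names z_statistic and in_critical_region are hypothetical, and the numbers in the usage lines are those of Exercise 8.1.1.

```python
from math import sqrt
from statistics import NormalDist

def z_statistic(sample_mean, mu0, sigma, m):
    """Observed value of Z = (sample_mean - mu0) / (sigma / sqrt(m))."""
    return (sample_mean - mu0) / (sigma / sqrt(m))

def in_critical_region(z, alpha, tail):
    """True when z falls in the rejection region of the chosen test."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "two":
        return abs(z) > std_normal.inv_cdf(1 - alpha / 2)
    if tail == "left":
        return z < std_normal.inv_cdf(alpha)
    if tail == "right":
        return z > std_normal.inv_cdf(1 - alpha)
    raise ValueError("tail must be 'two', 'left' or 'right'")

z = z_statistic(sample_mean=81, mu0=75, sigma=15, m=25)
print(z)                                   # 2.0
print(in_critical_region(z, 0.05, "two"))  # True: reject H0 at the 5 percent level
print(in_critical_region(z, 0.01, "two"))  # False: accept H0 at the 1 percent level
```

The two tail cutoff uses α/2 in each tail, while the one tail cutoffs put all of α in a single tail.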

p-Value based Decision Rules

Definition. Let T be a test statistic to test H₀ against H_A. Let the observed value of T be t. The p-value for this test is defined as the probability, assuming H₀ is true, that T will take a value at least as extreme as t. In the above decision rules, the test statistic is Z = (X̄ − μ₀)/(σ/√m), where X̄ is the mean of a sample of size m and μ₀ is the value of μ under H₀.

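In particular, for the Z-test, if Z = z is the observed value of Z, then the p-value is defined as follows. The normalcdf function of the TI-84 can be used to compute it.
Two Tail Test: p-value = P(|Z| ≥ |z|) = 2·P(Z ≥ |z|)
Left Tail Test: p-value = P(Z ≤ z)
Right Tail Test: p-value = P(Z ≥ z)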

p-value based Decision Rules:

For all three Z-Tests, the p-values can be computed as above. Then, at the level of significance α, the above decision rules can equivalently be written as: reject H₀ if the p-value < α; otherwise accept H₀.
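For readers without a TI-84, the same p-values can be sketched in Python, with the standard library's NormalDist().cdf playing the role of normalcdf. This is an illustrative sketch, not the notes' own procedure. The function names z_p_value and decide are hypothetical, and the observed value z = 2 is the one that arises in Exercise 8.1.1.

```python
from statistics import NormalDist

def z_p_value(z, tail):
    """p-value of a Z-test with observed statistic z."""
    phi = NormalDist().cdf          # standard normal P(Z <= x)
    if tail == "two":
        return 2 * (1 - phi(abs(z)))
    if tail == "left":
        return phi(z)
    if tail == "right":
        return 1 - phi(z)
    raise ValueError("tail must be 'two', 'left' or 'right'")

def decide(p_value, alpha):
    """p-value based decision rule at level of significance alpha."""
    return "reject H0" if p_value < alpha else "accept H0"

p = z_p_value(2.0, "two")
print(round(p, 4))      # 0.0455
print(decide(p, 0.05))  # reject H0
print(decide(p, 0.01))  # accept H0
```

The same observed value leads to rejection at α = .05 but not at α = .01, which is why the level of significance has to be fixed before the decision is read off.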

Remark. For the rest of this chapter, the decision rules for various significance tests will be described in two ways: (1) By checking whether the value of the test statistic T falls within the critical region or not. (2) By checking whether the p-value < α or not.

Exercise 8.1.1. The standard deviation of life expectancy of a population is σ = 15 years. A sample of size 25 had mean life expectancy X̄ = 81 years. Perform a significance test for the following null and alternative hypotheses regarding the mean life expectancy μ:
H₀ : μ = 75
H_A : μ ≠ 75

Solution by Long Hand Method:
Here the population standard deviation σ = 15,
the sample size n = 25,
the sample mean X̄ = 81,
Also, μ₀ = 75
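The remaining long hand steps can be sketched as follows, assuming the standard test statistic Z = (X̄ − μ₀)/(σ/√n) for known σ and the two tail alternative μ ≠ 75. The resulting p-value agrees with the value quoted in Exercise 8.1.2.

```latex
\[
z \;=\; \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}
  \;=\; \frac{81 - 75}{15/\sqrt{25}}
  \;=\; \frac{6}{3} \;=\; 2,
\qquad
p\text{-value} \;=\; 2\,P(Z \ge 2) \;\approx\; 0.0455 .
\]
```

The p-value is then compared with the chosen level of significance α.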

Exercise 8.1.2. (Change the level of significance.) Assume the same situation as in exercise 8.1.1. At the 1 percent level of significance will you reject or accept the null hypothesis?

Solution by Long Hand Method:
This is also a two tail test. From Exercise 8.1.1, p-value = .0455.

One percent level of significance means α = .01. Since the p-value = .0455 is not less than α = .01, we ACCEPT the null hypothesis at the one percent level of significance.
That means, at the one percent level of significance, we do not accept that the mean life expectancy μ ≠ 75.

Exercise 8.1.3. (Change the alternative hypothesis) Assume the same situation as in exercise 8.1.1 and change the hypotheses as follows:

Exercise 8.1.4. The time taken by an athlete to run an event is normally distributed with mean μ and known standard deviation σ = 3.5 seconds. The coach believes that the athlete's mean time μ has improved from last year's mean of 34 seconds. To test this, the athlete ran 16 times and the sample mean was found to be X̄ = 31 seconds.

Exercise 8.1.5. The effectiveness of a weight loss program is to be tested on a group of 83 participants. At the beginning of the program, the mean weight of the group is 210 pounds. At the end of the program, the mean weight of the group is 199 pounds. The standard deviation of weight is known to be σ = 53.1 pounds. In terms of mean weight μ, perform a significance test that the program is effective.

Solution by Long Hand Method:
The population mean weight (after completing such a program) will be denoted by μ.
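A sketch of how the solution might continue, under two assumptions not stated in the text above: that "effective" is read as the left tail alternative μ < 210, and that the test statistic is the standard Z = (X̄ − μ₀)/(σ/√m).

```latex
\[
H_0 : \mu = 210, \qquad H_A : \mu < 210,
\]
\[
z \;=\; \frac{199 - 210}{53.1/\sqrt{83}}
  \;\approx\; \frac{-11}{5.83}
  \;\approx\; -1.89,
\qquad
p\text{-value} \;=\; P(Z \le -1.89) \;\approx\; 0.03 .
\]
```

Under these assumptions, H₀ would be rejected at the 5 percent level of significance but not at the 1 percent level.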

Exercise 8.1.6. A manufacturer of heating furnaces is marketing a new model of energy efficient furnace. The mean gas consumption in January by ordinary furnaces is 153 CCF. A sample of 93 new model furnaces had a mean consumption of 142 CCF in January. The standard deviation of consumption in January is known to be σ = 46 CCF. In terms of mean consumption μ, perform a significance test that the new model is really energy efficient.

Solution by Long Hand Method:
The population mean consumption in January will be denoted by μ.
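A sketch of how the solution might continue, under two assumptions not stated in the text above: that "energy efficient" is read as the left tail alternative μ < 153, and that the test statistic is the standard Z = (X̄ − μ₀)/(σ/√m).

```latex
\[
H_0 : \mu = 153, \qquad H_A : \mu < 153,
\]
\[
z \;=\; \frac{142 - 153}{46/\sqrt{93}}
  \;\approx\; \frac{-11}{4.77}
  \;\approx\; -2.31,
\qquad
p\text{-value} \;=\; P(Z \le -2.31) \;\approx\; 0.01 .
\]
```

Under these assumptions, H₀ would be rejected at the 5 percent level of significance.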

Exercise 8.1.7. It is believed that, due to favorable weather conditions, the mean weight μ of King salmon in Anchor River would be higher than last year's mean of 33 pounds. The standard deviation of the weight is known to be σ = 16 pounds. A catch of 53 King salmon had a mean of 39 pounds. In terms of mean weight μ, perform a significance test that the weight would be higher.

Exercise 8.1.8. The instructor of Math 105 claims that, due to his updated method of teaching, the students' learning has improved. The mean percent score of all his Math 105 courses before this semester was 68 percent. This semester, in his class of 79 students, the mean percent score is 74 percent. The standard deviation of the percent score is known to be σ = 22 percent. In terms of mean percent score μ, perform a significance test that the percent score is higher.

Solution by Long Hand Method:
The population mean percent score will be denoted by μ.
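A sketch of how the solution might continue, under two assumptions not stated in the text above: that the claim is read as the right tail alternative μ > 68, and that the test statistic is the standard Z = (X̄ − μ₀)/(σ/√m).

```latex
\[
H_0 : \mu = 68, \qquad H_A : \mu > 68,
\]
\[
z \;=\; \frac{74 - 68}{22/\sqrt{79}}
  \;\approx\; \frac{6}{2.475}
  \;\approx\; 2.42,
\qquad
p\text{-value} \;=\; P(Z \ge 2.42) \;\approx\; 0.008 .
\]
```

Under these assumptions, H₀ would be rejected at both the 5 percent and the 1 percent levels of significance.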

Exercise 8.1.9. It is believed that the annual mean expenditure, including tuition, for students has increased from the corresponding mean in year 2000. In year 2000, the mean annual expenditure was $17,000. A sample of 87 students had an annual mean expenditure of $19,500. The standard deviation of annual expenditure is known to be σ = $7,500. In terms of mean expenditure μ, perform a significance test that the mean annual expenditure μ has increased.
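For Exercises 8.1.4, 8.1.7 and 8.1.9, which appear above without a solution, the test statistic and p-value can be computed with a short Python sketch. The choice of tail for each exercise is an interpretation of its wording (an improved running time is read as μ < 34, a higher weight or expenditure as a right tail alternative), and the function name one_tail_z_test is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

# (exercise, tail, mu0, sample mean, sigma, sample size), as read from the text
exercises = [
    ("8.1.4", "left", 34, 31, 3.5, 16),
    ("8.1.7", "right", 33, 39, 16, 53),
    ("8.1.9", "right", 17000, 19500, 7500, 87),
]

def one_tail_z_test(tail, mu0, sample_mean, sigma, m):
    """Return (z, p-value) for H0: mu = mu0 against a one tail alternative."""
    z = (sample_mean - mu0) / (sigma / sqrt(m))
    below = NormalDist().cdf(z)              # P(Z <= z)
    p_value = below if tail == "left" else 1 - below
    return z, p_value

results = {
    name: one_tail_z_test(tail, mu0, xbar, sigma, m)
    for name, tail, mu0, xbar, sigma, m in exercises
}
for name, (z, p_value) in results.items():
    print(f"Exercise {name}: z = {z:.2f}, p-value = {p_value:.4f}")
```

Under these readings, all three p-values fall below .01, so H₀ would be rejected at any of the usual levels of significance.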

8.2 Significance Test for μ, Case of σ Unknown

Let X be a random variable with mean μ and standard deviation σ. In this section, another significance test for the mean μ is developed. In contrast to the Z-Test, this section deals with the situation when σ is unknown. As in the case of T-intervals (section 7.2), X is assumed to have a normal distribution. Two Tail, Left Tail and Right Tail Tests are developed to test the null hypothesis H₀: μ = μ₀, in the case when the value of σ is not known.

A sample X₁, X₂, …, X_m of size m is drawn from the X population. Let X̄ and S² denote the sample mean and variance, respectively. The statistic T = (X̄ − μ₀)/(S/√m) is used as the test statistic.

Similar to the situation of T-intervals (section 7.2), when the null hypothesis H₀: μ = μ₀ is true, T has a t-distribution with m − 1 degrees of freedom. Using the same kind of arguments as in section 8.1, the decision rules are set as follows:

p-Value based Decision Rules

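For the T-Tests, if T = t is the observed value of T, then the p-value is defined as follows, where T has a t-distribution with m − 1 degrees of freedom. The tcdf function of the TI-84 can be used to compute it.
Two Tail Test: p-value = P(|T| ≥ |t|) = 2·P(T ≥ |t|)
Left Tail Test: p-value = P(T ≤ t)
Right Tail Test: p-value = P(T ≥ t)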

p-value based Decision Rules:

For all three T-Tests, the p-values can be computed as above. Then, at the level of significance α, the above decision rules can equivalently be written as: reject H₀ if the p-value < α; otherwise accept H₀.

Exercise 8.2.1. A supplier of lamps claims that the mean lifetime of his lamps is longer than that of the lamps in the market. The mean lifetime of the bulbs on the market is 3456 hours. To test the claim of the supplier, a sample of 26 bulbs was examined. The sample mean was found to be 3720 hours and the sample standard deviation was s = 552 hours. In terms of mean lifetime μ, perform a significance test that the supplier's lamps last longer.
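A sketch of how the test statistic and p-value for this exercise could be computed without a TI-84. It assumes the right tail alternative μ > 3456 and the statistic T = (X̄ − μ₀)/(S/√m) with m − 1 degrees of freedom. Since the Python standard library has no t-distribution, the helper t_cdf below (a hypothetical stand-in for tcdf) integrates the t density numerically by Simpson's rule.

```python
from math import gamma, pi, sqrt

def t_pdf(x, df):
    """Density of the t-distribution with df degrees of freedom."""
    coeff = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(t, df, steps=2000):
    """P(T <= t): Simpson's rule on [0, |t|], then symmetry about 0."""
    a = abs(t)
    h = a / steps                      # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3               # P(0 <= T <= |t|)
    return 0.5 + area if t >= 0 else 0.5 - area

def t_test_right_tail(sample_mean, mu0, s, m):
    """Return (t, p-value) for H0: mu = mu0 against HA: mu > mu0."""
    t = (sample_mean - mu0) / (s / sqrt(m))
    return t, 1 - t_cdf(t, m - 1)

t, p = t_test_right_tail(sample_mean=3720, mu0=3456, s=552, m=26)
print(round(t, 2), p < 0.05, p < 0.01)  # prints: 2.44 True False
```

Under these assumptions, H₀ would be rejected at the 5 percent level of significance but not at the 1 percent level. The numerical integration is adequate for a sample of this size; for very large degrees of freedom the gamma function overflows, and a statistics library would be the better tool.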