Math 105, Topics in Mathematics
Lesson 7: Distribution of the Sample Mean
Introduction

In this lesson, our final theorem will be that the sample mean X̄ = (X1 + X2 + ... + Xn)/n has an approximately normal distribution when n is large. Given a set of data, the mean (or average) x̄ that we computed in the previous chapters is, in fact, the observed value of a random variable X̄, called the sample mean. Similarly, the standard deviation s that we computed before is the observed value of a random variable S, called the sample standard deviation. Each time you collect a sample, the computed sample mean x̄ is the value of the random variable X̄ for that sample. Our point of view is explained in the following example.

Example 7.1.1. Suppose we want to study the height distribution of the U.S. population. We collect a data set of size n = 713 as follows:
x1 = 71, x2 = 62, and so on. Our point of view is that the heights x1, x2, x3, ..., xn are the observed values of random variables X1, X2, X3, ..., Xn.
Here X1 denotes the height of the first member of the sample, which could be the height of anybody in the whole U.S. population; in our sample, the value of X1 is 71. Similarly, X2 denotes the height of the second member of the sample, which could also be the height of anybody in the whole U.S. population; in our sample, the value of X2 is 62. Each time we collect a sample, we get different sample members x1, x2, x3, ..., xn, but they are always values of the same set of random variables X1, X2, X3, ..., Xn.

Definition: We define the sample mean X̄ as the random variable X̄ = (X1 + X2 + ... + Xn)/n. Each time we collect a sample of size n, we get a value of X̄, namely the average x̄ of the sample x1, x2, x3, ..., xn.

Remark: The main point here is that when we collect a sample and compute the mean x̄ (or average), the value of x̄ that we get is probabilistic, or "chancy." We must therefore talk about the probability distribution of X̄. If we know the distribution of X̄, then we can answer questions about the probability of the various values of x̄ that we may get. We could make similar comments and definitions about the standard deviation, but we will not need them.

If X denotes the random variable for the height of an American, then we also say that X1, X2, X3, ..., Xn is a sample from the X-population. We used the example of the height distribution of the U.S. population to explain our point of view, but given any random variable X (such as weight, wages, or a binomial random variable), we can talk about a sample X1, X2, X3, ..., Xn from the X-population.

Properties: Suppose X is a random variable with mean μ and standard deviation σ, and let X1, X2, X3, ..., Xn be a sample from the X-population. Then we have the following properties.
Theorem: The mean of the sample mean X̄ is equal to the population mean μ. So, mean(X̄) = μ. The standard deviation of the sample mean X̄ is given by σ_X̄ = σ/√n.

The Central Limit Theorem: Suppose X1, X2, X3, ..., Xn is a sample from a population X with mean μ and standard deviation σ. Then, when n is large, the sample mean X̄ has approximately a normal distribution with mean μ and standard deviation σ/√n.
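The two properties of the sample mean and the Central Limit Theorem can be checked empirically. The following is a minimal simulation sketch and not part of the original lesson. It assumes a hypothetical, deliberately non-normal population (uniform on [0, 1], so μ = 0.5 and σ = 1/√12), and the function name `simulate_sample_means` and its parameters are illustrative only.

```python
import math
import random
import statistics


def simulate_sample_means(n, repetitions, seed=0):
    """Draw `repetitions` samples of size n from Uniform(0, 1); return their means."""
    rng = random.Random(seed)
    return [
        statistics.fmean([rng.random() for _ in range(n)])
        for _ in range(repetitions)
    ]


# Assumed population: Uniform(0, 1), chosen only for illustration.
mu = 0.5
sigma = 1 / math.sqrt(12)

n = 36
means = simulate_sample_means(n, repetitions=20_000)

# Theoretical standard deviation of the sample mean: sigma / sqrt(n).
se = sigma / math.sqrt(n)

print("mean of sample means:", statistics.fmean(means))  # theory: mu
print("std of sample means: ", statistics.stdev(means))  # theory: se
print("theoretical std:     ", se)

# Rough normality check: for a normal distribution, about 68% of values
# lie within one standard deviation of the mean.
within_one = sum(abs(m - mu) <= se for m in means) / len(means)
print("fraction within one standard deviation:", within_one)
```

Increasing n should shrink the spread of the simulated sample means in proportion to 1/√n, while their average stays close to μ.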
Problems on Sampling

Exercise 7.1.1. It is known that the tuition X paid per semester by students in a university has a distribution with mean μ = $2,050 and standard deviation σ = $310. If 64 students are interviewed, what is the approximate probability that the sample mean tuition X̄ paid will be above $2,060?

Solution: Here we are asked to compute P(2,060 < X̄). The sample size is n = 64. The mean of X̄ = μ = 2,050 and the standard deviation of X̄ = σ_X̄ = σ/√n = 310/8 = 38.75. So, P(2,060 < X̄) = P((2,060 - 2,050)/38.75 < Z) = P(0.26 < Z) ≈ 0.40, where Z is a standard normal random variable.

Exercise 7.1.2. Monthly water consumption X per household, in a subdivision in Kansas City, has a normal distribution with mean 15,000 gallons and standard deviation 3,000 gallons. What is the probability that the mean consumption of the 44 households in the subdivision will exceed 16,000 gallons?

Solution: Here we are asked to compute P(16,000 < X̄). The sample size is n = 44. The mean of X̄ = μ = 15,000 and the standard deviation of X̄ = σ_X̄ = σ/√n = 3,000/√44 ≈ 452.3. So, P(16,000 < X̄) = P((16,000 - 15,000)/452.3 < Z) = P(2.21 < Z) ≈ 0.014.

Exercise 7.1.3. According to some data, the annual Kansas wheat export X has a mean of 733 million dollars and a standard deviation of 163 million dollars. What is the probability that over the next ten years the Kansas wheat export will exceed 8,040 million dollars?

Exercise 7.1.4. The lead content in blood, in a county, has mean μ = 2 (microgram/deciliter) with standard deviation σ = 0.02. If this claim is valid, then what is the approximate probability that n = 400 blood samples will have a sample mean lead content of more than 2.002?

Exercise 7.1.5. The annual salary in a local industry has mean μ = $67,000 and standard deviation σ = $16,000. You collect a sample of 256 employees. What is the probability that the sample mean salary will exceed $67,500?
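All five exercises reduce to the same computation: standardize the threshold using mean μ and standard deviation σ/√n, then take the upper tail of the normal distribution. The sketch below is one way to carry this out, assuming the normal approximation from the Central Limit Theorem applies in each case. The helper name `prob_sample_mean_exceeds` is hypothetical. For Exercise 7.1.3, the sketch also assumes the ten yearly exports form a sample of size 10, so a ten-year total above 8,040 corresponds to a yearly mean above 804.

```python
import math
from statistics import NormalDist


def prob_sample_mean_exceeds(threshold, mu, sigma, n):
    """P(sample mean > threshold), treating the sample mean as normal
    with mean mu and standard deviation sigma / sqrt(n)."""
    standard_error = sigma / math.sqrt(n)
    return 1 - NormalDist(mu, standard_error).cdf(threshold)


exercises = {
    "7.1.1": prob_sample_mean_exceeds(2060, mu=2050, sigma=310, n=64),
    "7.1.2": prob_sample_mean_exceeds(16000, mu=15000, sigma=3000, n=44),
    # Ten-year total > 8040 is rewritten as yearly mean > 8040 / 10.
    "7.1.3": prob_sample_mean_exceeds(8040 / 10, mu=733, sigma=163, n=10),
    "7.1.4": prob_sample_mean_exceeds(2.002, mu=2, sigma=0.02, n=400),
    "7.1.5": prob_sample_mean_exceeds(67500, mu=67000, sigma=16000, n=256),
}

for label, probability in exercises.items():
    print(f"Exercise {label}: {probability:.4f}")
```

Results may differ slightly from answers obtained with a printed z-table, because the table rounds the z-score to two decimal places.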