MATH 365, Elementary Statistics

Lesson 5 : Continuous Random Variables

5.1 Probability Density Function (pdf)	5.2 The Normal Random Variable	5.3 Normal Approximation to Binomial
Homework 14 - 16

5.1 Probability Density Function (pdf)

Given a sample space S, a continuous random variable was defined as a random variable X that can assume any value in an interval. The probability distribution of a continuous random variable is described very differently from that of a discrete random variable. We describe it as follows.

Definition. Let S be a sample space and X be a continuous random variable. Then there is a function f(x), defined for all real numbers x, called the probability density function (abbreviated pdf) of X. This pdf f(x) has the following properties:

We have f(x) ≥ 0 for all real numbers x.
For any two real numbers a ≤ b (also for a = -∞ and b = ∞), the probability that X will be between a and b is given by the area under the graph of y = f(x), above the x-axis and between the vertical lines x = a and x = b. In mathematical notation, we have

P(a ≤ X ≤ b) = P(a ≤ X < b) = P(a < X ≤ b) = P(a < X < b) =

the area under the graph of y = f(x), above x-axis, between the vertical lines x = a and x = b.

If you have had calculus, the same probability is given by an integral:

P(a ≤ X ≤ b) = P(a ≤ X < b) = P(a < X ≤ b) = P(a < X < b) = $\int_a^b f(x)\,dx$
It follows that for any real number a

P(X = a) = 0.
This is very much in contrast with the discrete random variables.
The whole area under the graph of y = f(x) above the x-axis must be one.

Remark. Given a continuous random variable X, to get a model for f(x) we look at a large sample and look at the relative frequency histogram of the X-values.
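The remark above can be tried on a computer. The following is only a minimal sketch: it assumes the sample comes from Python's random number generator (which is uniform between 0 and 1), and the helper name density_histogram, the seed and the bin count are choices made for this illustration. Each bar height is the relative frequency divided by the bin width, so that the total area of the bars is one, just like a pdf.

```python
import random

def density_histogram(values, low, high, bins):
    """Relative frequency histogram, scaled so that the total bar area is 1."""
    width = (high - low) / bins
    counts = [0] * bins
    for v in values:
        if low <= v < high:
            index = min(int((v - low) / width), bins - 1)
            counts[index] += 1
    n = len(values)
    # height = relative frequency / bin width
    return [c / (n * width) for c in counts]

rng = random.Random(365)                      # fixed seed, chosen arbitrarily
sample = [rng.random() for _ in range(100_000)]
heights = density_histogram(sample, 0.0, 1.0, 10)
print([round(h, 2) for h in heights])         # every bar should be close to 1
```

With a large sample the bars settle near a constant height, which suggests a flat pdf as a model for these X-values.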

Example. Let X have the following pdf:

f(x)	=		1 if 0 ≤ x ≤ 1
			0 Otherwise

Then we say X is uniformly distributed between 0 and 1 because it has the same density everywhere between 0 and 1.

Similarly, Y is said to be uniformly distributed between -1 and 3 if the pdf of Y is given by

g(x)	=		1/4 if -1 ≤ x ≤ 3
			0 Otherwise
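For a uniform random variable, the area under the pdf is the area of a rectangle, so probabilities can be computed without calculus. Here is a small sketch of that computation; the function name uniform_prob and the intervals used in the calls are made up for this illustration.

```python
def uniform_prob(a, b, low, high):
    """P(a <= Y <= b) for Y uniformly distributed between low and high."""
    # Only the part of [a, b] where the density is nonzero contributes area.
    left = max(a, low)
    right = min(b, high)
    if right <= left:
        return 0.0
    height = 1.0 / (high - low)      # constant density, e.g. 1/4 on [-1, 3]
    return (right - left) * height   # area of a rectangle: width * height

print(uniform_prob(0, 2, -1, 3))     # width 2, height 1/4
print(uniform_prob(0.2, 0.5, 0, 1))  # width 0.3, height 1
print(uniform_prob(-5, 10, -1, 3))   # the whole area under the pdf
```

The last call covers the whole interval from -1 to 3, so it returns the total area, which must be one.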

The Mean and Variance

The mean μ, variance σ² and standard deviation σ of continuous random variables X are interpreted as we did for discrete random variables. As before, the mean μ, which is also called the expectation E(X), represents the average value of X.

But the definitions involve some calculus, which we are trying to avoid. For those who have had calculus, the definitions are as follows.

Suppose f(x) is the pdf of a continuous random variable X. Then the mean of X is

$\mu = E(X) = \int_{-\infty}^{\infty} x f(x)\,dx$

and the variance of X is

$\sigma^2 = \mathrm{Variance}(X) = \int_{-\infty}^{\infty} (x-\mu)^2 f(x)\,dx$

and the standard deviation σ is the square root of the variance σ².
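If you would like to see these definitions at work without doing the calculus by hand, the integrals can be approximated numerically. The sketch below uses the uniform pdf g(x) on the interval from -1 to 3 from the example above. The midpoint rule and the helper name integrate are choices made for this illustration, not part of the lesson. Since g(x) is zero outside the interval, it is enough to integrate from -1 to 3.

```python
def integrate(func, a, b, steps=100_000):
    """Midpoint-rule approximation of the integral of func from a to b."""
    width = (b - a) / steps
    return sum(func(a + (i + 0.5) * width) for i in range(steps)) * width

def g(x):
    """pdf of Y, uniformly distributed between -1 and 3."""
    return 0.25 if -1 <= x <= 3 else 0.0

mean = integrate(lambda x: x * g(x), -1, 3)                     # E(Y)
variance = integrate(lambda x: (x - mean) ** 2 * g(x), -1, 3)   # Variance(Y)
sd = variance ** 0.5                                            # standard deviation

print(round(mean, 4), round(variance, 4), round(sd, 4))
```

The mean comes out at the midpoint of the interval, as we would expect from the symmetry of the pdf.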


5.2 The Normal Random Variable

The most commonly encountered random variable in nature is the normal random variable. As we have seen in the last section, the probability distribution of a random variable is determined by the pdf of the random variable. The pdf of a normal random variable is described below.

PDF of a Normal Random Variable: Suppose f(x) is the pdf of a normal random variable X. Then we have the following properties of f(x).

The graph of the pdf y = f(x) has a symmetric bell shape.
The pdf f(x) is completely known if we know the mean μ and the standard deviation σ.
The graph is symmetric around the vertical line x = μ. The graph is also peaked at x = μ.
The graph approaches the x-axis at both ends of the x-axis.
The larger the standard deviation σ is, the flatter the graph of y = f(x) will be.
In fact,
f(x) = 1/[σ √(2π)] exp[-(x-μ)²/(2σ²)] for -∞ < x < ∞.
If X is a normal random variable, we say X is normally distributed, or X has normal distribution. We also write X has N(μ,σ)-distribution.
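The properties listed above can be checked numerically from the formula for f(x). The following is a rough sketch only: the values μ = 2 and σ = 0.5, the helper names normal_pdf and area, and the midpoint rule used for the area are all choices made for this illustration.

```python
import math

def normal_pdf(x, mu, sigma):
    """The pdf f(x) of a N(mu, sigma) random variable."""
    coeff = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return coeff * math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))

def area(mu, sigma, a, b, steps=20_000):
    """Midpoint-rule approximation of the area under the pdf between a and b."""
    width = (b - a) / steps
    total = sum(normal_pdf(a + (i + 0.5) * width, mu, sigma) for i in range(steps))
    return total * width

mu, sigma = 2.0, 0.5
# Symmetric around x = mu: equal heights at equal distances from mu.
print(normal_pdf(mu - 0.3, mu, sigma), normal_pdf(mu + 0.3, mu, sigma))
# A larger sigma gives a lower peak, that is, a flatter graph.
print(normal_pdf(mu, mu, 0.5), normal_pdf(mu, mu, 1.0))
# Nearly all of the area lies within 8 standard deviations of the mean.
print(area(mu, sigma, mu - 8 * sigma, mu + 8 * sigma))
```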

Definition. A normal random variable is called a Standard Normal Random Variable if it has mean μ = 0 and standard deviation σ = 1. So, a N(0,1)-random variable is called a standard normal variable. In some textbooks the standard normal random variable is denoted by Z. The GOOD NEWS is that a table is available to compute probabilities for Z. The following properties of Z will be useful.

The graph of the pdf y = f(x) of the standard normal random variable Z is symmetric around the y-axis.
The total area under the graph above the x-axis is one.
So, on each side of the y-axis, the area under the graph above the x-axis is .5.

Using the Probability Tables: Tables are used widely to compute probability. However, due to the use of various software programs on probability, the importance of such tables has declined. In this chapter, we will use the Z-table to compute probability for the standard normal random variable. We note the following:

Tables are available in many different formats.
Visit the Z-table and try to understand it.
This table gives P(Z<z) for numbers z.
The probability P(Z<z) is the area on the left side of z, under the bell curve.
The number z is read from the left column and top. The probability P(Z<z) is given in the middle.
So, P(a < Z < b) = P(Z < b) - P(Z<a) = the difference between the probability P(Z < b) and P(Z<a) that we read from the table.
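If a Z-table is not at hand, the same left-side areas can be computed with software. Here is a minimal sketch in Python. It relies on the standard identity P(Z < z) = (1 + erf(z/√2))/2, which is not part of this lesson, and the function names phi and prob_between are made up for this illustration.

```python
import math

def phi(z):
    """P(Z < z): the area to the left of z under the standard bell curve."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def prob_between(a, b):
    """P(a < Z < b) = P(Z < b) - P(Z < a), as read from the Z-table."""
    return phi(b) - phi(a)

print(round(phi(1.96), 4))                 # table entry for z = 1.96
print(round(phi(0.0), 4))                  # half of the area lies left of 0
print(round(prob_between(-1.0, 1.0), 4))   # area between -1 and 1
```

The values should agree with the Z-table up to the rounding used in the table.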

Inverse Probability: Sometimes we will be given the probability and asked to compute a "cut off" point.

Example: We may be given P(Z < c) = .975 and asked to compute c. You will see from the table P(Z<1.96) = .975 and conclude that c=1.96.
Example: We may be given P(l < Z) = .005 and asked to compute l. P(l<Z) represents the area on the right side of l, under the bell curve. So, P(Z < l) = 1 - .005 = .995. From the table P(Z<2.58) = .995 (actually .9951, but the exact match is not always expected). So, l=2.58.
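The two cut-off examples above can also be checked with software instead of reading the table backwards. This sketch assumes Python 3.8 or later, where the standard library provides statistics.NormalDist; the variable names are chosen for this illustration.

```python
from statistics import NormalDist

Z = NormalDist(mu=0.0, sigma=1.0)   # the standard normal random variable

# P(Z < c) = .975: inv_cdf returns the point with the given area to its left.
c = Z.inv_cdf(0.975)

# P(l < Z) = .005 means P(Z < l) = 1 - .005 = .995.
l_cut = Z.inv_cdf(1 - 0.005)

print(round(c, 2), round(l_cut, 2))
```

The software works with more decimal places than the table, so small differences from the table values are to be expected.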

Given a N(μ,σ)-random variable X, we can use the Z-table to compute probabilities for X because of the following theorem.

Theorem. Let X be a N(μ, σ)-random variable. Then Z = (X-μ)/σ is a standard normal random variable. So,

P(a < X < b) = P((a-μ)/σ < Z < (b-μ)/σ),

that is,

P(a < X < b) = P(A < Z < B)

where A = (a-μ)/σ and B = (b-μ)/σ.

Now we can use the Z-table.
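The standardization step in the theorem can be sketched in a few lines of Python. This is an illustration only: the function name normal_prob and the numbers in the example (mean 10, standard deviation 2) are hypothetical, and statistics.NormalDist (Python 3.8 or later) stands in for the Z-table.

```python
from statistics import NormalDist

def normal_prob(a, b, mu, sigma):
    """P(a < X < b) for a N(mu, sigma) random variable X, via standardization."""
    Z = NormalDist()            # standard normal: mean 0, standard deviation 1
    A = (a - mu) / sigma        # standardized lower end
    B = (b - mu) / sigma        # standardized upper end
    return Z.cdf(B) - Z.cdf(A)  # P(A < Z < B) = P(Z < B) - P(Z < A)

# Hypothetical example: X is N(10, 2), so a = 8 and b = 12 give A = -1, B = 1.
p = normal_prob(8, 12, mu=10, sigma=2)
print(round(p, 4))

# The same probability computed directly from the N(10, 2) distribution.
X = NormalDist(mu=10, sigma=2)
print(round(X.cdf(12) - X.cdf(8), 4))
```

Both lines print the same number, which is the point of the theorem: probabilities for X reduce to probabilities for Z.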

Problem Solving: We will have two types of problems in this section—probability computation and problems of inverse probability (or cut-off points).

For a problem on normal random variables X with mean μ and standard deviation σ, the first step is STANDARDIZATION.
Then, we look at the Z-table.
Example: Suppose X is a N(2, .5) random variable and P(X < L) = .95. What is the cut-off L? First, we standardize: P((X-μ)/σ < (L-μ)/σ) = P(Z < (L-μ)/σ) = .95. From the table, P(Z < 1.65) = .95 (approximately). So, (L-μ)/σ = 1.65 and L = μ + 1.65σ = 2 + 1.65*.5 = 2.825.
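The cut-off computation in this example follows the pattern L = μ + zσ, where z is the cut-off for Z. A short sketch of this pattern is given below; the function name cutoff is made up for this illustration, and statistics.NormalDist (Python 3.8 or later) is used in place of the table.

```python
from statistics import NormalDist

def cutoff(p, mu, sigma):
    """The value L with P(X < L) = p for a N(mu, sigma) random variable X."""
    z = NormalDist().inv_cdf(p)   # cut-off for the standard normal Z
    return mu + z * sigma         # undo the standardization: L = mu + z*sigma

L = cutoff(0.95, mu=2, sigma=0.5)
print(round(L, 3))
```

The result differs slightly from 2.825 because the software uses a more precise value of z than the rounded table value 1.65.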

Ubiquity of Normal Random Variables: Any random variable that we encounter in nature is, almost certainly, either normal or approximately normal. If there is one concept that you take from this course it is this: nature's random variables are normal or approximately normal. You will hear about normal random variables and the bell curve in your workplace or anywhere you may have to use statistics.

Problems on 5.2: the Normal Random Variable

Exercise 5.2.1. Let Z be the standard normal random variable.

Find the probability P(-1.1 < Z < 2.5).
Find the probability P(Z < -2.1).
Find the probability P(-2.1 < Z < -1.5).
Find the probability P(1.5 < Z).


Exercise 5.2.2. Let X be a normal random variable with mean μ = 3 and standard deviation σ = 1.5 .

Find the probability P(-1.1 < X < 2.5).
Find the probability P(X < -2.1).
Find the probability P(-1.2 < X < -0.5).
Find the probability P(1.5 < X).


Exercise 5.2.3. The length of life of some light bulbs produced in a factory is normally distributed with mean 8640 hours and standard deviation 1440 hours. Find the probability that a bulb will last

less than 5040 hours;
between 5040 hours and 8640 hours.


Exercise 5.2.4. The length X of a fish in a lake has normal distribution with mean 67 cm and standard deviation 21 cm. What proportion (i.e., probability) of fish are between 44 cm and 110 cm long?

Exercise 5.2.5. The diameter of the pumpkins in my patch has normal distribution with mean 13 inches and standard deviation 4.5 inches. What proportion (i.e., probability) of pumpkins is above 22 inches?

Exercise 5.2.6. The annual expenditure X of a student is approximately normally distributed with mean μ = 11,000 dollars and standard deviation σ = 1500 dollars. What percent of students spend less than 10,000 dollars?

Exercise 5.2.7. Suppose the annual production X of milk per cow is normally distributed with μ = 5500 liters and standard deviation σ = 150 liters. What percent of cows have annual yield less than 5155 liters?

Exercise 5.2.8. The amount of vegetable oil X produced by a machine in a day is normally distributed with μ = 130 liters and standard deviation σ = 25 liters. What is the probability that a machine will produce between 120 liters and 150 liters on a day?

Exercise 5.2.9. The weight X at birth of babies is normally distributed with mean μ = 114 oz and standard deviation σ = 18 oz. What percent of babies will have birth weight below 141 oz?

Problems on Cut-off values

Exercise 5.2.10. Let Z be the standard normal random variable.

Given that P(-1.1 < Z < c)=.6881, find c.
Given that P(Z < c)=0.0222, find c.
Given that P(c < Z < 1.5) = 0.0919, find c.
Given that P(c < Z) = 0.102, find c.


Exercise 5.2.11. The length X of a fish in a lake has normal distribution with mean 67 cm and standard deviation 21 cm. On a fishing trip to the lake, you are instructed to release those in the lower 33 percent in length. What is the cut-off length?

Exercise 5.2.12. The telephone company's data shows that the length X of their international calls has normal distribution with mean 11.5 minutes and standard deviation 4.3 minutes. The company decided to give a special rate for the longest 20 percent of calls. What is the cut-off time length?

Exercise 5.2.13. The weight X of babies (of a fixed age) is normally distributed with mean μ = 212 oz and standard deviation σ = 25 oz. Doctors would be concerned (not necessarily alarmed) if a baby is among the lower 5.05 percent in weight. Find the cut-off weight L below which the doctors will be concerned.

Exercise 5.2.14. Monthly water consumption X per household, in a subdivision in Kansas City, has normal distribution with mean 15000 gallons and standard deviation 3000 gallons. It has been decided that a surcharge will be imposed for those in the top 25 percent. Find the cut-off consumption U in gallons.

5.3 Normal Approximation to Binomial

A wide range of random variables behave approximately like a normal random variable. One such example is binomial(n,p)-random variables.

Roughly, if X is a B(n,p) random variable, then X behaves approximately like a normal random variable with mean μ = np and standard deviation σ = √(np(1-p)).

As we know, a B(n,p) random variable X is discrete and

$P(X = r) = \binom{n}{r} p^r (1-p)^{n-r}$ for r = 0, 1, 2, …, n.

On the other hand, if Y is a N(μ, σ) random variable then

P(Y = r) = 0.

Because of this, some correction needs to be done. The following theorem states how to use normal approximation to binomial random variables.

Theorem. Suppose X is a B(n,p) random variable. If n is large and p is not very close to 0 or 1, then X behaves, approximately, like a N(μ, σ) random variable where

μ = np and standard deviation σ = √(np(1-p)).

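We have, for r = 0, 1, …, n,

P(X = r) = P(r - 0.5 < X < r + 0.5) ≈ P(L < Z < R)

where L = (r-0.5-μ)/σ and R = (r+0.5-μ)/σ. The first equality is exact because X takes only whole-number values; the approximation comes from treating X as a N(μ, σ) random variable.

More generally, for r, s = 0, 1, …, n with r ≤ s,

P(r ≤ X ≤ s) = P(r - 0.5 < X < s + 0.5) ≈ P(L < Z < R)

where L = (r-0.5-μ)/σ and R = (s+0.5-μ)/σ. Now use the Z table.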

This adjustment by .5 on two sides is called continuity correction.
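To get a feeling for how good the approximation is, we can compare it with the exact binomial probability on a computer. The sketch below is an illustration only: the values n = 100, p = 0.5, r = 45 and s = 55 are hypothetical, the function names are made up, and statistics.NormalDist (Python 3.8 or later) replaces the Z table.

```python
import math
from statistics import NormalDist

def binom_exact(n, p, r, s):
    """Exact P(r <= X <= s) for a B(n, p) random variable X."""
    return sum(math.comb(n, k) * p**k * (1 - p)**(n - k) for k in range(r, s + 1))

def binom_normal_approx(n, p, r, s):
    """Normal approximation to P(r <= X <= s) with continuity correction."""
    mu = n * p
    sigma = math.sqrt(n * p * (1 - p))
    left = (r - 0.5 - mu) / sigma    # L in the theorem
    right = (s + 0.5 - mu) / sigma   # R in the theorem
    Z = NormalDist()
    return Z.cdf(right) - Z.cdf(left)

n, p = 100, 0.5                      # hypothetical values
exact = binom_exact(n, p, 45, 55)
approx = binom_normal_approx(n, p, 45, 55)
print(round(exact, 4), round(approx, 4))
```

For these values the two answers agree closely. The agreement gets worse when n is small or p is very close to 0 or 1, which is why the theorem asks for large n and moderate p.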

Problems on 5.3: Normal Approximation to Binomial

Exercise 5.3.1. A Lawrence bank knows that 35 percent of its customers will visit the drive-through window. If 400 customers visit the bank, what is the approximate probability that more than 120 will visit the drive-through window?

Exercise 5.3.2. It is known that the probability that a household owns a food processor is 0.1. If 190 households are interviewed, find the approximate probability that

more than 26 households own a food processor;
less than 30 households own a food processor.


Exercise 5.3.3. The campaign committee of a candidate claims that sixty percent of the voters are in favor of the candidate. You interview 150 voters. Assuming that the campaign committee's claim is accurate, what is the approximate probability that less than 77 will favor the candidate?

Exercise 5.3.4. A technique is used to fertilize eggs in a fertility clinic laboratory. It is known that the probability that an egg will be fertilized by this technique is 0.1. If 500 eggs are treated, what is the probability that at least 60 eggs will be fertilized?

Exercise 5.3.5. The probability that a computer chip produced in a factory is defective is .2. If you have a sample of 60 chips, what is the probability that the number of defective chips will be less than 20?

Exercise 5.3.6. The probability that a light bulb produced by a machine is defective is p = 0.2. Suppose a quality control inspector takes a sample of 120 bulbs. What is the probability that more than 30 bulbs will be defective?

Exercise 5.3.7. Suppose the probability that a student has access to the Internet is p = 0.8. Suppose you interview 160 students. What is the probability that less than 120 students will have access to the Internet?

Exercise 5.3.8. Suppose that the probability that a person favors medical use of marijuana is p = 0.6. If 780 individuals are interviewed, what is the probability that less than 450 will be in favor?

Exercise 5.3.9. Suppose that the probability that a middle-income family invests in the stock market is p = 0.8. If we interview 880 middle-income families, what is the probability that more than 700 have invested in the stock market?

Exercise 5.3.10. Suppose that an insurance company knows from experience that the probability that a life-insurance policyholder will survive another 10 years is p = 0.9. The company has 2280 policyholders. What is the probability that more than 2025 will survive another 10 years?
