Math 365, Elementary Statistics
Chapter 4 : Random Variables
Satya Mandal
4.1 Random Variables
A random variable is a complete description of a numerical characteristic of the population. Examples include (1) the weight of the fish population in a lake, (2) the number of typos in the textbooks used at KU, (3) the GPA of the student population, (4) the delay in departure time of all commercial flights originating in the US. The following is a formal definition.
Definition. A random variable is a rule, formula, or mechanism that associates a numerical value with each member of the sample space S. So, given a member w of S, a random variable X assigns a numerical value X(w) to w. For us, X(w) will be a characteristic (like height, weight, time, salary) of the population.
Example. Suppose the KU student population is under study. Then the sample space S is the whole KU student population, and a KU student is a sample unit. The following are some random variables:
Definition. Random variables are classified into two types: continuous random variables and discrete random variables.
Examples of Continuous and Discrete Variables
Two Examples.
Value x | Probability p(x) |
---|---|
\(x_1\) | \(p(x_1)\) |
\(x_2\) | \(p(x_2)\) |
\(x_3\) | \(p(x_3)\) |
… | … |
Properties of the probability function. Suppose X is a discrete random variable that assumes the values \(x_1, x_2, x_3, \ldots\) and let p(x) be its probability function. Then we have the following:
\(0 \le p(x_i) \le 1\) for each i.
\(\sum_i p(x_i) = 1\).
Definitions. Let X be a discrete random variable that assumes the values \(x_1, x_2, x_3, \ldots\) and let p(x) be the probability function of X. Then the mean μ of X is defined as
\[ \mu = \sum_i x_i\, p(x_i). \]
The mean μ is also called the expected value of X and is denoted by E(X). The mean μ is also called the population mean.
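The variance \(\sigma^2\) of X is defined as
\[ \sigma^2 = \mathrm{Variance}(X) = \sum_i (x_i - \mu)^2\, p(x_i). \]
Some simplification will show
\[ \sigma^2 = \mathrm{Variance}(X) = \sum_i x_i^2\, p(x_i) - \mu^2. \]
The variance \(\sigma^2\) is also called the population variance. If we take a large sample and compute the sample variance \(s^2\), then \(s^2\) will be an estimate of \(\sigma^2\).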
The standard deviation σ of X is defined as the positive square root of the variance of X.
\[ \text{standard deviation of } X = \sigma = \sqrt{\mathrm{Variance}(X)} \]
The standard deviation σ is also called the population standard deviation.
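The definitions above can be sketched in a few lines of Python. This is only an illustration: the function name `summarize` and the small sample table are hypothetical and do not come from the course.

```python
import math

def summarize(values, probs):
    """Return (mean, variance, standard deviation) of a discrete random variable."""
    # Check the two properties of a probability function.
    if any(p < 0 or p > 1 for p in probs):
        raise ValueError("each p(x_i) must lie between 0 and 1")
    if not math.isclose(sum(probs), 1.0):
        raise ValueError("the probabilities must add up to 1")
    mean = sum(x * p for x, p in zip(values, probs))
    # Shortcut formula: sum of x_i^2 p(x_i), minus mu^2.
    variance = sum(x * x * p for x, p in zip(values, probs)) - mean ** 2
    return mean, variance, math.sqrt(variance)

# Hypothetical distribution: values 0, 1, 2 with probabilities 0.25, 0.5, 0.25.
mean, variance, sd = summarize([0, 1, 2], [0.25, 0.5, 0.25])
print(mean, variance, sd)
```

The validation step mirrors the two properties of a probability function, so a table whose probabilities do not add up to 1 is rejected before any computation.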
Remark. As was mentioned above, a random variable X represents numerical characteristics of the population. Statistics has a place in life, only because the population is unknown and it needs to be modeled and estimated. In particular,
Example. Suppose you design a coin toss game. In this game, you give the opponent $3 if a head comes up and you collect $1 if a tail comes up. Let X be the money you receive. Then X assumes the values -3 and 1. You also have a loaded coin so that
P(H) = 1/9, P(T) = 8/9.
Then the probability distribution of X is given by
Value x | Probability p(x) |
---|---|
-3 | 1/9 |
1 | 8/9 |
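So, the mean μ of X is given by
\[ \mu = \sum_i x_i\, p(x_i) = (-3)(1/9) + (1)(8/9) = 5/9. \]
The variance is given by
\[ \sigma^2 = \sum_i x_i^2\, p(x_i) - \mu^2 = (-3)^2 (1/9) + (1)^2 (8/9) - (5/9)^2 = 128/81 \approx 1.5802. \]
The standard deviation is given by
\[ \sigma = \sqrt{\mathrm{Variance}(X)} = \sqrt{1.5802} \approx 1.2571. \]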
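The arithmetic in this example can be double-checked with exact fractions. The following Python sketch uses only the values and probabilities from the table above; the variable names are chosen here for illustration.

```python
from fractions import Fraction
import math

# Distribution of X from the coin toss game.
values = [-3, 1]
probs = [Fraction(1, 9), Fraction(8, 9)]

mean = sum(x * p for x, p in zip(values, probs))
# Shortcut formula: sum of x_i^2 p(x_i), minus mu^2.
variance = sum(x * x * p for x, p in zip(values, probs)) - mean ** 2
sd = math.sqrt(variance)

print(mean, variance, round(sd, 4))
```

Working with `Fraction` keeps the mean and variance exact (5/9 and 128/81), so rounding only enters at the square root.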
Interpretation of the mean μ of X:
Similarly, if Z represents the height of the KU student population, then the mean μ = E(Z) is the actual mean height of the KU student population. If we take a large sample from the KU student population and compute the sample mean, it should approximate μ.
Problems on 4.2: Probability Distribution
Exercise 4.2.1. The number of passengers X in a car on a freeway has the following probability distribution.
X=x | 1 | 2 | 3 | 4 | 5 |
---|---|---|---|---|---|
p(x) | 0.35 | 0.30 | 0.15 | 0.15 | 0.05 |
Find:
Exercise 4.2.2. Karin is a plumber who works for 3 different employers. Employer A pays her $120 a day, employer B pays her $70 a day, and employer C pays her $180 a day. She works for whoever calls her first. The probability that employer A calls her first is 0.30; the probability that employer B calls first is 0.20; and the probability that employer C calls her first is 0.40 (the probability that no one calls is 0.10). What are the expected value and variance of Karin's income per day?
Solution
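One way to check an answer to Exercise 4.2.2 is the Python sketch below. It assumes, as the exercise seems to intend, that Karin earns $0 on a day when no one calls; that assumption is not stated explicitly in the problem.

```python
# Daily income in dollars: employers A, B, C, and "no one calls".
# Assumption: income is 0 when no one calls.
incomes = [120, 70, 180, 0]
probs = [0.30, 0.20, 0.40, 0.10]

mean = sum(x * p for x, p in zip(incomes, probs))
# Shortcut formula: sum of x_i^2 p(x_i), minus mu^2.
variance = sum(x * x * p for x, p in zip(incomes, probs)) - mean ** 2

print(round(mean, 2), round(variance, 2))
```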
Exercise 4.2.3. An insurance company sells a flight insurance policy at a flat rate of $500 per flight. If a policyholder dies in flight, the insurance company pays $100,000 to the survivors. The probability that a policyholder will die in flight is .003. What are the expected gain and the variance of the gain of the company per sale?
Solution
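A possible check for Exercise 4.2.3 is sketched below in Python. It assumes the company keeps the $500 premium in both cases, so its gain is 500 when the policyholder survives and 500 - 100000 otherwise; the variable names are illustrative.

```python
# Gain of the company per sale, in dollars.
# Assumption: the $500 premium is kept whether or not a claim is paid.
gains = [500, 500 - 100_000]
probs = [0.997, 0.003]

mean = sum(x * p for x, p in zip(gains, probs))
# Shortcut formula: sum of x_i^2 p(x_i), minus mu^2.
variance = sum(x * x * p for x, p in zip(gains, probs)) - mean ** 2

print(round(mean, 2), round(variance, 2))
```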
Exercise 4.2.4. The following table gives the proportion of credit hours that earned grades F, D, C, B and A at KU:
grade | A | B | C | D | F |
---|---|---|---|---|---|
proportion | .15 | .35 | .30 | .15 | .05 |
Let X represent the points earned for grades A, B, C, D and F. Write down the probability distribution of X and compute the mean (or the expected value E(X)) and the standard deviation.
Solution:
We have X = 0, 1, 2, 3, 4 respectively, when the grades are F, D, C, B, A. Therefore, the distribution of X is given by
x | 0 | 1 | 2 | 3 | 4 |
---|---|---|---|---|---|
p(x) | .05 | .15 | .30 | .35 | .15 |
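From this distribution, the mean and standard deviation follow directly from the definitions. The Python sketch below is one way to carry out the computation; the variable names are illustrative.

```python
import math

# Grade points and their proportions, from the distribution table of X.
points = [0, 1, 2, 3, 4]
probs = [0.05, 0.15, 0.30, 0.35, 0.15]

mean = sum(x * p for x, p in zip(points, probs))
# Shortcut formula: sum of x_i^2 p(x_i), minus mu^2.
variance = sum(x * x * p for x, p in zip(points, probs)) - mean ** 2
sd = math.sqrt(variance)

print(round(mean, 4), round(variance, 4), round(sd, 4))
```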
There are many random variables that we encounter fairly often. The first one that we discuss is called a Bernoulli random variable.
Definition. There are many statistical experiments that have only two outcomes. In such cases, the outcomes may be called a success or a failure. So the sample space is
S = {s, f}.
Here s means success and f means failure. Such an experiment is called a Bernoulli trial.
Given a Bernoulli trial, we can define a random variable as
X = 1 if success
X = 0 if failure
If the probability P(success) = p then we have P(failure) = 1-p. So, the probability distribution of a Bernoulli random variable is given by
Value x | Probability p(x) |
---|---|
0 | 1-p |
1 | p |
The mean of X is
\[ \mu = 0 \cdot (1-p) + 1 \cdot p = p. \]
The variance of X is
\[ \sigma^2 = \sum_i x_i^2\, p(x_i) - \mu^2 = \left(0^2 \cdot (1-p) + 1^2 \cdot p\right) - p^2 = p - p^2 = p(1-p). \]
Note that the variance \(\sigma^2\) = (probability of success)*(probability of failure).
The standard deviation of X is
\[ \sigma = \sqrt{p(1-p)} \]
Examples of Bernoulli random experiment (or trial) would be
Binomial Random Variable
Definition. An interesting statistical experiment is a combination of n "identical and independent" Bernoulli trials. Such an experiment is called a binomial experiment. More formally, given a positive integer n and a number p with 0 ≤ p ≤ 1, a binomial(n,p) experiment (or B(n,p) experiment) is characterized as follows:
Definition. Given a B(n,p)-experiment, let
X = total number of successes in these n trials.
Then X is called a binomial(n,p)-random (or B(n,p)-random) variable. The following are some important facts about a B(n,p)-random variable X. (We will not try to prove them in this course.)
\[ p(r) = P(X = r) = P(r \text{ successes}) = {}_nC_r\, p^r (1-p)^{n-r} \]
where r runs through 0,1,2,…,n.
μ = E(X) = np.
mean = μ = (number of trials)*(probability of success in each trial).
\(\sigma^2\) = Variance(X) = np(1-p).
variance = \(\sigma^2\) = (number of trials)*(probability of success in each trial)*(probability of failure in each trial).
\[ \sigma = \sqrt{np(1-p)} \]
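Although these facts are not proved in this course, they can be checked numerically for any particular n and p. The Python sketch below does so for n = 10 and p = 0.3, values picked here purely for illustration; `math.comb(n, r)` plays the role of nCr.

```python
from math import comb

# Hypothetical parameters, chosen only to illustrate the formulas.
n, p = 10, 0.3

# Probability function p(r) = nCr * p^r * (1-p)^(n-r) for r = 0, 1, ..., n.
pmf = [comb(n, r) * p**r * (1 - p)**(n - r) for r in range(n + 1)]

total = sum(pmf)
mean = sum(r * pr for r, pr in enumerate(pmf))
variance = sum(r * r * pr for r, pr in enumerate(pmf)) - mean ** 2

print(total, mean, n * p)                 # the last two numbers should agree
print(variance, n * p * (1 - p))          # these two should agree as well
```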
The following animation shows the graph of the probability function of a binomial random variable. It also computes probabilities.
Animation 4.3.1
Use of Calculators (TI-84):
Computing Binomial Probability:
In "binomialpdf", pdf stands for probability density function. (It was not proper for them to use the word "density".)
binomialcdf function in TI-84: Again, suppose X is a B(12, .6) random variable. To compute P(X is at most 8) = P(0 ≤ X ≤ 8), do the following.
a) Press 2nd and then Distr (VARS)
b) Scroll down to binomialcdf and ENTER
d) Type in "12, .6, 8)" and ENTER. TI will give the answer.
In "binomialcdf", cdf stands for cumulative density function. (It was not proper for them to use the word "density".)
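For readers without a calculator at hand, the same two computations can be sketched in Python. The helper names `binomial_pdf` and `binomial_cdf` are hypothetical; they are written here to mirror the calculator functions described above and are not part of any library.

```python
from math import comb

def binomial_pdf(n, p, r):
    """P(X = r) for a B(n, p) random variable."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

def binomial_cdf(n, p, r):
    """P(X <= r): add the probabilities of 0, 1, ..., r successes."""
    return sum(binomial_pdf(n, p, k) for k in range(r + 1))

# X is a B(12, .6) random variable; compute P(X is at most 8).
at_most_8 = binomial_cdf(12, 0.6, 8)
print(round(at_most_8, 4))
```

The argument order (n, p, r) follows the order typed into the calculator in the steps above.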
Problems on 4.3: Binomial Experiments
Exercise 4.3.1. Let X be a B(6,.3)-random
variable. Find P(X = 2). Also find the probability that X is at least
2.
Solution
Solution by TI-84:
Here n=6, p=.3. We use TI-84:
P(X=2) = binomialpdf(6,.3,2) = .324135
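The second part of the exercise, the probability that X is at least 2, can be obtained from the complement: P(X ≥ 2) = 1 - P(X = 0) - P(X = 1). A minimal Python sketch of both parts, with an illustrative helper `pmf`, is:

```python
from math import comb

n, p = 6, 0.3

def pmf(r):
    """P(X = r) for the B(6, .3) random variable of this exercise."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

p_two = pmf(2)
# Complement rule: "at least 2" excludes only the outcomes 0 and 1.
p_at_least_two = 1 - pmf(0) - pmf(1)

print(round(p_two, 6), round(p_at_least_two, 6))
```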
Exercise 4.3.2. According to a report entitled "Pediatric Nutrition Surveillance" published by Centers for Disease Control (CDC), 18 percent of children younger than 2 years had anemia in 1997. On a particular day, a pediatrician examined 11 children.
Solution by TI-84:
Here n=11, p=.18.
X = Number of children with anemia. X is a B(11, .18) random variable.
We use TI-84 :
Exercise 4.3.3. A gardener planted 15 seeds. The probability that a seed will germinate is 0.1.
Solution by TI-84:
Here n=15, p=.1.
X = Number of seeds that will germinate. X is a B(15, .1) random variable.
We use TI-84 :
Exercise 4.3.4. In a particular county, 60 percent of the population is Hispanic.
Solution by TI-84:
Here n=12, p=.6.
X = Number of Hispanic jurors. X is a B(12, .6) random variable.
We use TI-84 :
Exercise 4.3.5. From the hiring statistics of a corporation (say IBM), it is known that for every 4 interviews they give, they make 1 job offer. Suppose that the corporation interviews 8 candidates each time it comes to campus. What are the mean and standard deviation of the number of job offers made each time?
Solution
Solution by TI-84:
Here n=8, p=.25.
X = Number of job offers made each time. X is a B(8, .25) random variable.
We use TI-84 :
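For this exercise the formulas μ = np and σ = √(np(1-p)) are enough, so no summation is needed. A short Python sketch of the computation, using only n = 8 and p = .25 from the exercise, is:

```python
import math

n, p = 8, 0.25   # 8 interviews, 1 offer per 4 interviews

mean = n * p                       # mu = np
sd = math.sqrt(n * p * (1 - p))    # sigma = sqrt(np(1-p))

print(mean, round(sd, 4))
```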
Remark. In some of the problems above, we had to add fewer than 10 terms with the binomialpdf function. In a real-life situation, one may have to add a large number of such terms. In those cases, it is better to use the binomialcdf function. The following are some such problems.
Exercise 4.3.5. It is believed that the proportion of voters (in a county) who vote by absentee ballot is p=.18. You sample 725 voters.
Solution by TI-84:
Here n=725, p=.18.
X= Number of absentee votes among this sample of 725.
X is a B(725, .18) random variable.
We use TI-84 :
Exercise 4.3.6. About 27 percent of the population take flu shots. You are in a class of 750 students.
Solution by TI-84:
Here n=750, p=.27.
X= Number of students in this class who took the flu shot.
X is a B(750, .27) random variable.
We use TI-84 :
Exercise 4.3.7. It is believed that 35 percent of the population in a county shop at health food markets. If you sample 800 individuals, what is the probability that at least 400 of them shop at health food markets?
Solution by TI-84:
Here n=800, p=.35.
X = Number of individuals in the sample who shop at health food markets.
X is a B(800, .35) random variable.
We use TI-84 :
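As an alternative to the calculator, the probability P(X ≥ 400) for a B(800, .35) random variable can be sketched in Python. The helper `log_binomial_pmf` is a hypothetical name; it works with logarithms (via `math.lgamma`) so that the very large nCr and the very small powers of p never have to be formed separately.

```python
from math import exp, lgamma, log

def log_binomial_pmf(n, p, r):
    """Natural log of P(X = r) for a B(n, p) random variable, 0 < p < 1."""
    # log(nCr) written with the log-gamma function: lgamma(k + 1) = log(k!).
    log_ncr = lgamma(n + 1) - lgamma(r + 1) - lgamma(n - r + 1)
    return log_ncr + r * log(p) + (n - r) * log(1 - p)

n, p = 800, 0.35
# Add the upper tail directly: P(X = 400) + P(X = 401) + ... + P(X = 800).
tail = sum(exp(log_binomial_pmf(n, p, r)) for r in range(400, n + 1))
print(tail)
```

The tail is summed directly rather than computed as 1 - P(X ≤ 399), because the answer is extremely small and subtracting from 1 in floating point would lose it.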
Exercise 4.3.8. It is known that 78 percent of microwave ovens last more than five years. An SQC inspector sampled 600 microwaves.
Exercise 4.3.9. It is known that a vaccine may cause fever as a side effect after one takes the shot. The producer of the vaccine claims that only 11 percent of those who take the shot experience such side effects. You sample 978 individuals who took the shot.
Remark. When the number of trials n in a binomial experiment is too large (say more than 1000), the TI-84 fails. We will provide a remedy to this problem in Lesson 5. The following variation of the above exercise demonstrates that the TI-84 fails.
Exercise 4.3.10. It is known that a vaccine may cause fever as a side effect after one takes the shot. The producer of the vaccine claims that only 11 percent of those who take the shot experience such side effects. You sample 1000 individuals who took the shot.
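Outside the calculator, a sample of this size need not be a problem. The Python sketch below computes a cumulative probability for a B(1000, .11) random variable by working with logarithms. The cutoff of 110 (which equals np) and the helper names are assumptions made here for illustration, since the exercise as stated does not specify which probability is wanted.

```python
from math import exp, lgamma, log

def log_binomial_pmf(n, p, r):
    """Natural log of P(X = r) for a B(n, p) random variable, 0 < p < 1."""
    log_ncr = lgamma(n + 1) - lgamma(r + 1) - lgamma(n - r + 1)
    return log_ncr + r * log(p) + (n - r) * log(1 - p)

def binomial_cdf(n, p, r):
    """P(X <= r), adding the terms for 0, 1, ..., r successes."""
    return sum(exp(log_binomial_pmf(n, p, k)) for k in range(r + 1))

n, p = 1000, 0.11
# Hypothetical question: the probability that at most 110 people report fever.
at_most_110 = binomial_cdf(n, p, 110)
print(at_most_110)
```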