Math 105, Topics in Mathematics
Lesson 3: Measure of Locations of Data
Introduction
Last lesson was mostly about graphical representations of data.
A measure of central tendency of a data set represents an "average value." The mean, median, and mode (which you may already know) are measures of central tendency. A measure of dispersion is a measure of how widely the data are scattered.
3.1 Measure of Central Tendency or Location
The most important measure of central tendency is the mean, or arithmetic mean.
Definition. The mean, or arithmetic mean, of a data set $x_1, x_2, \ldots, x_n$ is given by
$$\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n}$$
If the data is a sample, then the mean is also called the sample mean.
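As a minimal sketch, the arithmetic mean (the sum of the values divided by the number of values) can be checked in Python. The scores below are hypothetical and chosen only for illustration, and `arithmetic_mean` is an illustrative helper name, not something from the course.

```python
from statistics import mean


def arithmetic_mean(values):
    """Return the sum of the values divided by the number of values."""
    return sum(values) / len(values)


# Hypothetical sample (not from the lesson): five quiz scores.
scores = [4, 8, 6, 5, 7]

print(arithmetic_mean(scores))  # 30 / 5 = 6.0
print(mean(scores))             # the standard library computes the same value
```

The hand-written helper mirrors the definition step by step; the standard library's `statistics.mean` is the more convenient choice in practice.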
The Frequency Table and the Mean
When the frequency table of a data set is given, we can use it to compute the mean of the original data. Let us consider the following example:
Example 3.1.1. To estimate the mean time taken by a racecar to complete a three-mile drive, the racecar did several time trials, and the following sample of times taken (in seconds) to complete the laps was collected:
Now we want to compute the mean time.
To do so, we add all the data values and divide by the data size 35.
The frequency distribution tells us that, in the data, 46 was present one time, 47 was present one time, 48 was present three times, and so on. So, if we use the frequency distribution, we compute the mean as follows:
This example shows that we can use the frequency table to compute the mean of the original data. A new formula for the mean is
$$\bar{x} = \frac{f_1 x_1 + f_2 x_2 + \cdots}{n}$$
where $f_i$ is the frequency of $x_i$ and $n$ is the data size.
Properties of the Mean:
Example 3.1.2. Suppose the mean score in the class on a test is 78 points. Then the teacher announces that you can add 7 points to your score. Now what is the new mean score?
Example 3.1.3. Suppose it is known that the average income in Canada is 35,000 Canadian dollars. What is the average Canadian income in U.S. dollars?
The Median
The median represents the middle value of the data. Half the data is less than or equal to the median, and half the data is greater than or equal to the median. Your income is above the median American income if half the American population makes less than you make.
Definition. Suppose the data is arranged in increasing order (i.e., in an array). If the data size is odd, then the median is the middle value. If the data size is even, then the median is the mean of the middle two values.
The Percentiles:
Definition. For a number p between 0 and 100, the pth percentile $x_p$ of the data is a number such that at least p percent of the data members are less than or equal to $x_p$, and at least 100-p percent of the data members are greater than or equal to $x_p$.
There is one other measure of central tendency.
Definition. The mode of the data is the value that appears the greatest number of times in the data.
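The definitions of the median and the mode can be sketched in Python as follows. The data are hypothetical (not from the lesson), and `median_by_definition` is an illustrative name; it follows the odd/even rule stated in the definition of the median.

```python
from statistics import median, multimode


def median_by_definition(values):
    """Sort the data, then take the middle value (odd size) or the
    mean of the two middle values (even size)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


# Hypothetical data (not from the lesson).
odd_data = [7, 3, 9, 3, 5]       # sorted: 3, 3, 5, 7, 9
even_data = [7, 3, 9, 3, 5, 8]   # sorted: 3, 3, 5, 7, 8, 9

print(median_by_definition(odd_data))   # 5
print(median_by_definition(even_data))  # (5 + 7) / 2 = 6.0
print(multimode(odd_data))              # [3], the most frequent value
```

`statistics.multimode` returns a list because a data set can have more than one value tied for the highest frequency.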
Problems on 3.1: Mean and Median
Exercise 3.1.1. The following is the price (in dollars) of a stock (say CISCO SYSTEMS, if you like) checked by a trader several times on a particular day.
Find the median price and mean price observed by the trader.
Exercise 3.1.2. The following figures refer to GPA of six students:
Find the median and mean GPA.
Exercise 3.1.3. The following data give the lifetime (in days) of certain light bulbs.
Find the mean and median lifetime of these bulbs.
Exercise 3.1.4. An athlete ran an event 32 times. The following frequency table gives the time taken (in seconds) by the athlete to complete the events.
Compute the mean and median time taken by the athlete.
Exercise 3.1.5. The following is the weight (in ounces), at birth, of 96 babies born in Lawrence Memorial Hospital in May 2000.
Compute the mean and median weight, at birth, of the babies.
Exercise 3.1.6. The following is some data on the hourly wages (paid only in whole dollars) of 99 employees in some industry.
Compute the mean and median hourly wage.
Exercise 3.1.7. The following is the frequency table on the number of typos in a sample of 30 books published by a publisher.
Find the mean and median number of typos in a book.
Exercise 3.1.8. The following are the length (in inches), at birth, of 96 babies born in Lawrence Memorial Hospital in May 2000.
Compute the mean and median length, at birth, of these babies.
3.2 Measure of Dispersion or Variability
Clearly, the measures of central tendency (mean, median, mode) cannot tell us the "whole story" about the data. They only give us an idea of where the "central values" of the data lie.
Example 3.2.1. Suppose two sections of the statistics class have the following percentage score distribution at the end of the semester:
Section A: 81, 84, 83, 80, 82
Section B: 72, 93, 92, 82, 71
Both these sections have the same mean, 82. But in Section A, everybody will get a B grade. In Section B, we will have two Cs, one B, and two As. In which section would you like to be? The measure of dispersion will give you some idea.
The measure of dispersion (or variability) is a measure of how widely the data are scattered around. In Section A, the data have a very small dispersion or variability, while in Section B they have a large dispersion.
A simple (or the simplest) measure of dispersion is the range of the data, as we have defined before:
range = largest value - smallest value
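We discuss three more measures of dispersion: the mean deviation, the variance, and the standard deviation. For data $x_1, x_2, \ldots, x_n$ with mean $\bar{x}$, they are:
$$\text{mean deviation} = \frac{|x_1 - \bar{x}| + |x_2 - \bar{x}| + \cdots + |x_n - \bar{x}|}{n}$$
$$\text{sample variance} = s^2 = \frac{(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \cdots + (x_n - \bar{x})^2}{n-1}$$
$$\text{standard deviation} = s = \sqrt{s^2}$$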
Let us quickly do some computation with Example 3.2.1 above:
The mean deviation for Section A = (1+2+1+2+0)/5 = 6/5, and the mean deviation for Section B = (10+11+10+0+11)/5 = 42/5. Because the variability of Section B is much higher, its mean deviation is much higher.
Let us compute the sample variances:
For Section A, the sample variance is
$$s^2 = \frac{(81-82)^2+(84-82)^2+(83-82)^2+(80-82)^2+(82-82)^2}{5-1} = \frac{10}{4} = 2.5$$
For Section B, the sample variance is
$$s^2 = \frac{(72-82)^2+(93-82)^2+(92-82)^2+(82-82)^2+(71-82)^2}{5-1} = \frac{442}{4} = 110.5$$
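The computations for Example 3.2.1 can be reproduced with a short Python sketch. The scores are the ones that appear in the calculations above; the function names are illustrative, and the standard deviation is taken to be the square root of the sample variance.

```python
from math import sqrt
from statistics import variance

section_a = [81, 84, 83, 80, 82]
section_b = [72, 93, 92, 82, 71]


def data_range(values):
    """Largest value minus smallest value."""
    return max(values) - min(values)


def mean_deviation(values):
    """Average absolute distance of the values from their mean."""
    m = sum(values) / len(values)
    return sum(abs(x - m) for x in values) / len(values)


def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    m = sum(values) / len(values)
    return sum((x - m) ** 2 for x in values) / (len(values) - 1)


for name, scores in [("A", section_a), ("B", section_b)]:
    s2 = sample_variance(scores)
    print(name, data_range(scores), mean_deviation(scores), s2, sqrt(s2))
```

Section A gives a mean deviation of 1.2 and a sample variance of 2.5, while Section B gives 8.4 and 110.5, matching the hand computation. The library function `statistics.variance` also divides by n - 1, so it can be used as a cross-check.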
Problems on 3.2: Variance and Standard Deviation
Exercise 3.2.1. The following is the price (in dollars) of a stock (say CISCO SYSTEMS, if you like) checked by a trader several times on a particular day.
Find the variance and standard deviation of the price.
Exercise 3.2.2. The following figures refer to GPA of six students:
Find the variance and standard deviation of GPA.
Exercise 3.2.3. The following data give the lifetime (in days) of certain light bulbs.
Find the variance and standard deviation of lifetime of these bulbs.
Exercise 3.2.4. An athlete ran an event 32 times. The following frequency table gives the time taken (in seconds) by the athlete to complete the events.
Compute the variance and standard deviation of time taken by the athlete.
Exercise 3.2.5. The following is the weight (in ounces), at birth, of 96 babies born in Lawrence Memorial Hospital in May 2000.
Compute the variance and standard deviation of the weight, at birth, of these babies.
Exercise 3.2.6. The following is some data on the hourly wages (paid only in whole dollars) of 99 employees in some industry.
Compute the variance and standard deviation of the hourly wages.
Exercise 3.2.7. The following is the frequency table on the number of typos in a sample of 30 books published by a publisher.
Find the mean number, variance, and standard deviation of typos in a book.
Exercise 3.2.8. The following are the length (in inches), at birth, of 96 babies born in Lawrence Memorial Hospital in May 2000.
Compute the variance and standard deviation of the length, at birth, of these babies.
Exercise 3.2.9. What does it mean when the variance or mean deviation of some data is zero? The answer is that all the data members are EQUAL!
Exercise 3.2.10. The following represents the frequency table of weight (in pounds) of some salmon in a river.
Find the mean, variance, and the standard deviation.
Exercise 3.2.11. The following data represent time (in minutes) taken by students to drive to the campus.
Find the mean, variance, and the standard deviation of the data.
3.3 Use of the Frequency Table
When a frequency table is given, we can give new formulas to compute the mean and variance of the data.
Formulas. Suppose the data consisting of n observations are given in a frequency table (ungrouped). Let $x_i$ denote the values and $f_i$ be the frequency of $x_i$. Then
$$\text{the variance} = s^2 = \frac{f_1(x_1 - \bar{x})^2 + f_2(x_2 - \bar{x})^2 + \cdots}{n-1}$$
where $\bar{x}$ is the mean of the data.
We have already told you how to use the calculator to compute the variance and standard deviation when we have a frequency table.
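For readers who prefer a program to a calculator, here is a minimal Python sketch of the frequency-table computation. The table is hypothetical (not from the lesson), and `freq_mean` and `freq_variance` are illustrative names. The sketch weights each value by its frequency and divides the squared deviations by n - 1, then cross-checks the result against the expanded raw data.

```python
from math import isclose, sqrt
from statistics import mean, variance

# Hypothetical ungrouped frequency table (value -> frequency).
freq_table = {1: 1, 2: 2, 3: 1}


def freq_mean(table):
    """Mean from a frequency table: sum of f_i * x_i divided by n."""
    n = sum(table.values())
    return sum(f * x for x, f in table.items()) / n


def freq_variance(table):
    """Sample variance from a frequency table:
    sum of f_i * (x_i - mean)^2 divided by n - 1."""
    n = sum(table.values())
    m = freq_mean(table)
    return sum(f * (x - m) ** 2 for x, f in table.items()) / (n - 1)


# Expand the table back into the raw data to cross-check the formulas.
raw = [x for x, f in freq_table.items() for _ in range(f)]

print(freq_mean(freq_table), mean(raw))
print(freq_variance(freq_table), variance(raw))
print(sqrt(freq_variance(freq_table)))  # standard deviation
```

Working from the table avoids listing every observation, which matters when the same value occurs many times.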