Lesson 9: Population Size, Census and Sampling
9.1 Population Size
Given a population S, by the population size
we mean the total number of members in S. The population size is sometimes
denoted by N.
We will discuss the Capture-recapture Method of estimating the
population size N.
For example, to estimate the size N of the fish
population in Clinton Lake, we do the following:
- Step 1. (The capture) Capture a sample of m fish, tag them, and release them back into the water.
The idea is to estimate the proportion p = m/N of tagged fish.
- Step 2. (The recapture) After everything has settled down,
capture a new sample of n fish and count the number of tagged fish.
Suppose that k of them are tagged.
- It is reasonable to expect that
k/n is a good estimate of p = m/N.
- Accordingly, for an estimate of N, we solve the equation
m/N = k/n.
- So, the estimate of N is given by
N = mn/k.
- Obviously, the same method applies to any similar situation.
- Sections 7.3 and 8.3 apply here. However,
we will not get into any discussion of that.
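The two steps above can be sketched in a few lines of Python. This is only an illustration, not part of the lesson's method: the function names `estimate_population` and `simulate_recapture` are hypothetical, and the simulation assumes an idealized population in which every member is equally likely to be caught on both trips.

```python
import random


def estimate_population(m, n, k):
    """Capture-recapture estimate: solve m/N = k/n for N."""
    if k <= 0:
        raise ValueError("need at least one tagged member in the recapture")
    return m * n / k


def simulate_recapture(true_size, m, n, rng):
    """Tag m members of a population of known size, recapture n, count tagged."""
    members = range(true_size)
    tagged = set(rng.sample(members, m))   # Step 1: capture and tag
    recaptured = rng.sample(members, n)    # Step 2: recapture
    return sum(1 for member in recaptured if member in tagged)


# Data of Exercise 9.1.1: m = 325, n = 525, k = 125.
print(estimate_population(325, 525, 125))  # 1365.0

# Idealized check on a made-up population whose true size is known.
rng = random.Random(1)  # fixed seed, chosen arbitrarily
k = simulate_recapture(true_size=5000, m=500, n=500, rng=rng)
print(k, estimate_population(500, 500, k))
```

In the simulated run the estimate will usually land near the true size of 5000 but not exactly on it, since k changes from one recapture to the next.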
Problems on 9.1: Estimating population size N
Exercise 9.1.1. As part of
a project we made two trips to a local lake.
The first day we caught m=325 fish
and tagged them. On the second day we caught n=525 fish,
and of those
k=125 were tagged fish. Estimate the total number
of fish in the lake.
Solution:
Here m=325, n=525 and k=125
So, an estimate of N is
N = mn/k = 325*525/125 = 1365
Exercise 9.1.2. Last year you tagged m = 526 birds migrating through Lawrence. This year again you captured
n = 517 birds migrating through Lawrence, and of those k = 113 were
tagged last year. Estimate the total number of birds migrating through
Lawrence every year.
Solution:
Here m= 526, n= 517 and k= 113
So, an estimate of N is
N = mn/k = 526* 517/113 ≈ 2407
Exercise 9.1.3.
You want to estimate the number of homeless people in New York.
One night you identify 376 homeless people in New York.
After some time, on another night
you identify 497 homeless people. Of these 497, you find that 119 were identified the first time as well. Estimate the number of homeless people in New York.
Solution:
Here m= 376, n= 497 and k= 119
So, an estimate of N is
N = mn/k = 376* 497/119 ≈ 1570
Exercise 9.1.4.
To estimate the number of tigers in Sunderban you capture 194 tigers and
tag them. After some time you capture 212 tigers, and of those 87
were tagged. Estimate the number of tigers in Sunderban.
Solution:
Here m= 194, n= 212 and k= 87
So, an estimate of N is
N = mn/k = 194* 212/87 ≈ 473
Exercise 9.1.5
To estimate the population of butterflies, 1500 butterflies were tagged in Lawrence. After another two weeks, 1800 butterflies were recaptured. Out of them 300 were tagged. Estimate the population of the butterflies in Lawrence.
Exercise 9.1.6 To estimate the snake population in a forest area, 78 snakes were captured and tagged. After a few weeks, 112 snakes were caught again and out of them 26 were tagged last time. Estimate the snake population.
Exercise 9.1.7 To estimate the alligator population in a certain
area in Florida, 314 alligators were captured and tagged.
After a few months, 628 alligators were captured and out of them
157 were tagged last time. Estimate the alligator population.
Exercise 9.1.8 To estimate the bison population in
an area in California, 228 bison were tagged on a day. After a few months,
285 bison were captured. Out of them 57 were tagged last time.
Estimate the bison population in this area.
Exercise 9.1.9 To estimate the rhino population in an
area in Africa, 129 rhinos were tagged on a day. After a few months,
215 rhinos were captured. Out of them 43 were tagged last time.
Estimate the rhino population in this area.
9.2 Census, Surveys, Opinion Polls, Clinical Studies
(Reading Only)
Census
Article 1, Section 2 of the Constitution of the United States mandates
that a national census be conducted every ten years. By census we mean
an official enumeration of the population. Many other countries
also conduct a census, commonly every ten years.
The following are some comments about the census:
- Originally the intent of the census was to count heads for taxes
and representation. That is why it may also become a political
issue, as it did during the year 2000 census: methods of counting
are suggested or resisted because of concerns about undercounting
or overcounting of some segment of the population.
- The census is a major source of data about the population, and the
United Nations assumes a role in the worldwide census.
- A census often fails to count all members of the population.
It is believed that a complete count is not really possible.
- For the 2000 census, it was proposed that the U.S. population be counted
by using statistical techniques. Congress and the administration fought
over this proposal, and it was challenged in the courts.
Surveys
A more realistic and economical alternative to census is to collect
data only from a small subgroup and then use this data to make inferences
about the whole population. This approach is called a survey,
and the subgroup of the population from which the data is collected
is called a sample.
The basic idea behind a survey is that if we can find a "representative"
sample of the whole population (that means it is not biased) then anything
we need to know about the population can be derived from that sample.
Public Opinion Polls
We all know about public opinion polls.
During any election season, about a dozen polling organizations
publish poll numbers.
You can look at any standard textbook for a general discussion
on polls. Some of them would explain how and why the predictions
made by various opinion polls in the presidential elections
in 1936 (Franklin
Roosevelt vs. Alfred Landon) and 1948 (Harry Truman vs. Thomas Dewey)
went wrong. It happened because the contemporary sampling methods were
not sophisticated enough, and the samples the pollsters drew failed to
represent the whole population.
These days, it is fairly common
for about a dozen polls to predict opposite outcomes during an
election season.
This happens when a poll uses a flawed sample that is not representative of the whole population.
Traditionally, such polls used the listings in the telephone books.
In the recent past, the advent of cell phones has created confusion
in the polling industry, because cell phones are not listed and some people do
not own a landline.
Clinical Studies
When a vaccine or a new drug is tested, the statistical methods used
are interesting. The following are some of the main points
regarding the process:
- We pick two samples to be called the control group and the treatment
group. The two samples need not have the same size.
- The treatment group receives the treatment, and the control group
does not receive the treatment.
- Neither group knows who is receiving the treatment
and who is not.
- Finally, the two groups are compared. If the treatment group does
better than the control group, then it is accepted that the treatment
is working.
9.3 Sampling Methods 
Random Sampling
Developing a "representative sample" is a real challenge for
a statistician (the rest is mathematics, which is easy because it
has already been worked out).
If a statistician tries to pick a sample, his/her human
bias is essentially bound to result in a "biased sample." Whatever
method we use to select a sample, the selection of the sample members
must be random. That means that mathematics and methods of chance must
guide the selection of sample members. A sample picked in such a manner
is called a random sample, and the method is
called random sampling.
Another important concern regarding sampling is its cost.
We briefly describe two methods of random sampling here.
- In the method of simple random sampling
each member of the population has an equal chance of being selected
in the sample.
- The other method of sampling is called stratified
sampling: First, divide the population into categories, called
strata, and randomly select a sample from these strata. Then further
divide the chosen strata into categories, called substrata, and select
a random sample of substrata from each of those strata. The process
is repeated a number of times.
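The two methods described above might be sketched in Python as follows. This is a rough illustration under stated assumptions, not a standard implementation: the function names `simple_random_sample` and `staged_sample` are hypothetical, the population is made up, and the staged version stops after two levels (strata, then substrata) before drawing members.

```python
import random


def simple_random_sample(population, size, rng):
    """Every member has the same chance of being selected."""
    return rng.sample(population, size)


def staged_sample(strata, n_strata, n_substrata, n_members, rng):
    """Follow the staged description above on a nested dictionary.

    strata maps stratum name -> {substratum name -> list of members}.
    """
    sample = []
    # Stage 1: randomly select some of the strata.
    for stratum in rng.sample(sorted(strata), n_strata):
        substrata = strata[stratum]
        # Stage 2: randomly select substrata within each chosen stratum.
        for sub in rng.sample(sorted(substrata), n_substrata):
            # Final stage: randomly select members of each chosen substratum.
            sample.extend(rng.sample(substrata[sub], n_members))
    return sample


rng = random.Random(0)  # fixed seed, chosen arbitrarily

# Made-up population of 1000 numbered members.
population = list(range(1000))
print(simple_random_sample(population, 10, rng))

# Made-up strata: 5 "states", each split into 4 "counties" of 50 people.
strata = {
    f"state{s}": {
        f"county{s}.{c}": [f"person{s}.{c}.{i}" for i in range(50)]
        for c in range(4)
    }
    for s in range(5)
}
print(staged_sample(strata, n_strata=2, n_substrata=2, n_members=5, rng=rng))
```

In both functions the choice is left entirely to the random number generator, so no human judgment enters the selection.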
Sample Size
The sample size required for statistical studies need not be
too large,
even when the population is large. In practice, it is often less than
1500. If you follow CNN polls or others, they normally sample from 700
to 1200 people.
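A small simulation can illustrate why the sample need not grow with the population. The sketch below is only an illustration under assumed numbers: the function name `sample_proportion` is hypothetical, the populations are made up, and 40% of each population is assumed to hold some opinion. It draws a simple random sample of 1200 members from a population of ten thousand and from a population of one million, and prints the share of the sample holding the opinion.

```python
import random


def sample_proportion(population_size, true_share, sample_size, rng):
    """Draw a simple random sample and return the share with the trait."""
    # Members numbered 0 .. cutoff-1 are the ones with the trait.
    cutoff = int(population_size * true_share)
    sample = rng.sample(range(population_size), sample_size)
    return sum(1 for member in sample if member < cutoff) / sample_size


rng = random.Random(42)  # fixed seed, chosen arbitrarily
for population_size in (10_000, 1_000_000):
    estimate = sample_proportion(population_size, 0.40, 1200, rng)
    print(population_size, round(estimate, 3))
```

In runs like this, both estimates typically come out close to 0.40, even though the second population is a hundred times larger than the first.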