Ed 510
Applications of Educational Research
Here are
some terms that will help you understand this week's lesson.
Population
Introduction
Samples and populations are related concepts. A population is the
entire set of cases from which a universe of observations can be drawn. Samples
are drawn from populations, so they are smaller than the population
itself. Although this seems obvious, one can sometimes detect confusion
in the words used by researchers. For example, one might read in
a study about "the sample population." There is no such thing.
The researcher is probably trying to express that a sample was drawn from
a particular population of importance to a study. In this case the
population should be referred to as the population of interest. The
sample is a subset of that population.
Populations
Populations may be either theoretical or real. Theoretical populations
do not exist in real terms. They are instead large distributions
of numbers or scores that have been generated in order to model a particular
measurable phenomenon. For example, an economist may create a theoretical
population that represents the activity of stock market prices under specific
conditions. An educational researcher may create a theoretical population
that represents the hypothetical distribution of responses to items on
a test so that real distributions, based on actual scores, can be compared
to the hypothetical case. Theoretical populations always represent
score distributions under a hypothetical set of conditions. Real
distributions of scores are then compared to the hypothetical case.
It would be
interesting to see if members of the class could think of examples where
an educational researcher might find this approach useful.
Real populations
are based on real scores or numerical values that have been collected as
actual data. These populations are described in terms of the measured
characteristics of important variables. For example, the population
of IQ scores in Media refers to the scores as a population rather than
to the human beings who took the test. The study of populations is
fundamentally about the statistical characteristics of distributions.
Samples
Samples are
subsets of populations. They are described in terms of the percentage
of the population that is sampled.
One reads of a 10 percent sample, or a 25 percent sample. Samples are also described in terms of the number (N) of observations that any sample contains. Thus a sample is described as: a 25 percent sample (N=100). One can tell immediately that the population size was 400.
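The arithmetic behind that last statement can be written out as a small sketch. The function name `population_size` is only an illustration, not a standard routine.

```python
def population_size(sample_n, sample_percent):
    """Population size implied by a sample of sample_n observations
    that makes up sample_percent percent of the population."""
    return sample_n * 100 / sample_percent

# A 25 percent sample with N = 100 implies a population of 400.
print(population_size(100, 25))
```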
Samples are used because they are more efficient and less costly to study than entire populations. They are also more accessible in many cases. It is much easier to study a sample of Learning Disabled students than it is to hunt down every single individual in a category for the purposes of research.
Because samples are supposed to represent the population from which they have been drawn, they are proxies for the population itself. It is essential that the statistical description of a sample be closely matched to the statistical description of its parent population. When a population of 10,000 IQ scores has a mean of 115 and a standard deviation of 15, it is expected that the sample that represents that population also has a mean of approximately 115 and a standard deviation of approximately 15. When there is a close match, the sample is described as unbiased. When there are discrepancies between sample and population in terms of important descriptive statistics, then the sample is described as biased.
Sampling bias
is important to document. The reason? Anytime a sample deviates
statistically from its parent population it ceases to be a good proxy.
The greater the deviation, the greater the bias. Information that
is gathered from such a sample cannot be used to generalize to the population.
So sampling bias refers to the extent to which a sample and its statistical
characteristics do not correspond to the characteristics of the parent
population.
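One way to see the difference between an unbiased and a biased sample is to simulate it. The sketch below is only an illustration: the population of 10,000 IQ scores is generated artificially, and the seed, the sample size of 500, and the choice of "the 500 highest scores" as the biased sample are all assumptions made for the example.

```python
import random
import statistics

random.seed(0)  # assumed seed so the run is repeatable

# Hypothetical population: 10,000 IQ scores with mean near 115, SD near 15.
population = [random.gauss(115, 15) for _ in range(10_000)]

def describe(scores):
    """Return the mean and standard deviation of a list of scores."""
    return statistics.mean(scores), statistics.pstdev(scores)

# A random sample of 500 scores.
random_sample = random.sample(population, 500)
# A deliberately biased sample: only the 500 highest scores.
biased_sample = sorted(population)[-500:]

pop_mean, pop_sd = describe(population)
rand_mean, rand_sd = describe(random_sample)
bias_mean, bias_sd = describe(biased_sample)

print("population   ", round(pop_mean, 1), round(pop_sd, 1))
print("random sample", round(rand_mean, 1), round(rand_sd, 1))
print("biased sample", round(bias_mean, 1), round(bias_sd, 1))
```

The random sample should land close to the population on both statistics, while the biased sample should sit well above the population mean and show a much smaller spread.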
Have some
fun.
A party boat sailed off the coast of Cape May to fish for flounder. It was reported that a huge population of flounder could be found in a certain location, approximately 10,000 fish in all. It was further reported that on average fish in this population weighed about 5 pounds. As baskets of flounder were weighed by the party goers, they noticed that one basket contained flounder that weighed either 1 or 7 pounds, but averaged 5 pounds. Another basket contained flounder that weighed exactly 5 pounds per fish and, naturally, averaged 5 pounds. Yet a third basket contained flounder that weighed either 4 pounds or 6 pounds, and averaged 5 pounds. A fourth basket contained fish that weighed 6 to 8 pounds per fish. Thus the average weight of a flounder in that basket could not be 5 pounds. What does this final set of measurements represent? Answer: B gmvlf! Can you decode the cryptogram?
Sampling
error - Sampling error can be demonstrated statistically. It is
a value that measures the extent to which a sample deviates
from population expectations. Imagine an infinite number of samples
drawn from a population of numbers. Each sample consists of scores,
and a mean can be calculated for the sample. Each mean should approximate
the population mean. However, this will not always happen. Some
samples will have means that are larger than the population mean; others
will have means that are smaller than the population mean. This finding
represents sampling error. However, when all the samples are averaged
together, the grand mean of all sample means should approximate the mean
of the population. When this does not happen, one looks for individual
samples whose means are out of line. These will be the samples that
contribute to sampling errors.
The samples
should be understood as comparable to individual scores. One can
graph a set of individual scores as a frequency distribution and locate
the center (and the mean) of the distribution. One can also graph
a set of samples using sample means instead of scores. The grand
mean of all samples should define the center of the sampling distribution.
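The idea of a sampling distribution can be tried out with a short simulation. This is a sketch under stated assumptions: 1,000 samples stand in for the "infinite number" of samples, and the population, the seed, and the sample size of 50 are all made up for the illustration.

```python
import random
import statistics

random.seed(1)  # assumed seed so the run is repeatable

# Hypothetical population of 10,000 scores.
population = [random.gauss(115, 15) for _ in range(10_000)]
population_mean = statistics.mean(population)

# Draw 1,000 random samples of 50 scores and record each sample mean.
sample_means = [
    statistics.mean(random.sample(population, 50))
    for _ in range(1_000)
]

# The grand mean is the mean of all the sample means.
grand_mean = statistics.mean(sample_means)

# Count how many sample means fall above and below the population mean.
above = sum(m > population_mean for m in sample_means)
below = sum(m < population_mean for m in sample_means)

print("population mean:", round(population_mean, 2))
print("grand mean:     ", round(grand_mean, 2))
print("sample means above / below:", above, "/", below)
```

Individual sample means should scatter on both sides of the population mean, while the grand mean should sit very close to it.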
Scientific
or random sampling - This is an approach to sampling that seeks to minimize
bias. Random sampling
implies that all observations have been drawn from the population in such
a way that each observation has an equal chance of being observed
or drawn. This phenomenon is called the equal likelihood assumption
and it is the key characteristic of random or scientific sampling.
In order to
ensure that a sample is random, the selection of sample members is conducted
in ways that enhance randomization. Numbers picked out of a hat or
the use of a table of random numbers are frequently used approaches.
More modern
practices include the use of computer software packages that use random
number generators.
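As one illustration of the software approach, Python's standard `random` module can draw a simple random sample without replacement. The roster of 400 student labels and the seed below are hypothetical.

```python
import random

rng = random.Random(42)  # assumed seed so the draw can be repeated

# Hypothetical roster of 400 population members.
roster = [f"student_{i:03d}" for i in range(1, 401)]

# Draw a 25 percent sample (N = 100) without replacement.
# Every member has the same chance of being selected.
sample = rng.sample(roster, 100)

print(sample[:5])
```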
Systematic
sampling - If a population is very large and its members can be arranged
or listed in a sequential way, it is also possible to select a random number
as a starting point and then select all other members of the sample at
a fixed interval set by the sampling percentage (for example, every tenth
member for a 10 percent sample). When this procedure is used, it is referred to
as systematic sampling.
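A minimal sketch of systematic sampling follows. It assumes the sampling percentage divides evenly into 100, so that the interval is a whole number; the function name and the numbered list of 400 members are hypothetical.

```python
import random

def systematic_sample(members, percent, rng):
    """Pick a random starting point, then take every k-th member,
    where k is set by the sampling percentage."""
    interval = round(100 / percent)   # 25 percent -> every 4th member
    start = rng.randrange(interval)   # random start within the first interval
    return members[start::interval]

rng = random.Random(3)                # assumed seed so the draw can be repeated
members = list(range(1, 401))         # hypothetical ordered list of 400 members
sample = systematic_sample(members, 25, rng)

print(len(sample), sample[:5])
```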
Stratified
random sampling is another approach to random sampling. The population
is subdivided according to categorical or discrete variables that are important
to a research question. Think of the population from which Dr. Peoples
drew her research subjects. They are described by variables such
as sex, tenure status, school principal or not, or personality type (AE
or RPS), etc. Then individual members are drawn from the parent population
to represent membership in each of the categories. Any member that
is drawn must fall into one category of each variable. So, for example,
an individual drawn as part of a stratified sample must be either a male
or a female, tenured or non tenured, a principal or non principal, and
either an AE or an RPS. In a stratified random sample each category
of subjects must also be proportionally represented. Thus if 50 percent
of the population is male, then 50 percent of the sample must also be male.
If 40 percent of the population is AE, then 40 percent of the sample must
also be AE. The process of sampling subjects continues until all
categories are filled, and in filling those categories, the resulting size
of each group represents the percentages of each group in the population
at large.
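Proportional stratified sampling can be sketched as follows. The population here is invented for the illustration (500 members, 50 percent male, 40 percent AE), only two of the variables are used, and the function name is hypothetical.

```python
import random
from collections import defaultdict

def stratified_sample(population, key, fraction, rng):
    """Group members into strata by key, then draw the same fraction
    at random from every stratum."""
    strata = defaultdict(list)
    for member in population:
        strata[key(member)].append(member)
    sample = []
    for group in strata.values():
        k = round(len(group) * fraction)  # proportional share of this stratum
        sample.extend(rng.sample(group, k))
    return sample

rng = random.Random(7)  # assumed seed so the draw can be repeated

# Hypothetical population: 50 percent male, 40 percent AE.
population = [
    {"id": i,
     "sex": "male" if i % 2 == 0 else "female",
     "type": "AE" if i % 5 < 2 else "RPS"}
    for i in range(500)
]

# A 20 percent sample, stratified on sex and personality type together.
sample = stratified_sample(
    population, key=lambda m: (m["sex"], m["type"]), fraction=0.2, rng=rng
)

males = sum(m["sex"] == "male" for m in sample)
ae = sum(m["type"] == "AE" for m in sample)
print(len(sample), males, ae)
```

Because the same fraction is drawn from every stratum, the percentages of males and of AE members in the sample match those in the population.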
Weighted
or Tailored samples - There are times when researchers solve problems
using samples that come from populations that are difficult to describe.
A researcher may not know how large a population is and therefore is unable
to calculate with confidence the important descriptive statistics for the
population. Under these circumstances it is impossible to know if
a random sample represents the population.
Weighted samples
are therefore samples that consist of groups of observations that are studied
separately. The members are drawn at random according
to characteristics that are known to be important and descriptive of the
population. It is believed that as several samples are aggregated,
the overall effect of all samples will be to represent the population at
large. However, individual samples will be poor proxies. Only
when taken together do they represent the population at large.
Studies of
consumer behavior, publishers of textbook series and standardized tests,
studies of voter behavior, and US census studies of the country's population
frequently use weighted samples. The exact size of the population
of interest is unknown, as are the descriptive statistics. Therefore
knowledge of a population will be crafted from the study of samples that
when pooled begin to resemble the larger population statistically.
That is why sampling is frequently based on regions of the country as the
starting point, rather than the country as a whole.
Sample accuracy
- Stratified random samples are believed to be the most accurate,
with sampling errors estimated at about 1 percent. Simple random samples
have an error rate of approximately 5 percent. Weighted samples
vary in accuracy from 85 to 90 percent; that is, even when well designed they have error
rates from 10 to 15 percent. What principles might a researcher apply
to decide which sampling method to use? It may not be necessary
to have 99 percent accuracy, or even desirable.
How large
should a sample be? Your internet readings discuss this point
at length. It is important to remember that the smaller the population,
the greater the percentage of members that must be sampled in order to ensure
reasonable accuracy. Why would that be the case?
In addition,
as a population exceeds 10,000 members in size, the size of a sample needed
to guarantee a reasonable level of accuracy tends to level off. A
sample size of 1,000 will be about as accurate for a population of 10,000
as for a population of 100,000. Why would that be the case?
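One way to explore these questions is with the usual textbook formula for the margin of error of a sample proportion, including the finite population correction. This formula does not come from the lesson itself; it is brought in here as an assumption, along with the conventional worst-case proportion of 0.5 and the 95 percent multiplier of 1.96.

```python
import math

def margin_of_error(n, population_size, p=0.5, z=1.96):
    """Approximate 95 percent margin of error for a proportion,
    with the finite population correction applied."""
    standard_error = math.sqrt(p * (1 - p) / n)
    correction = math.sqrt((population_size - n) / (population_size - 1))
    return z * standard_error * correction

# The same sample of 1,000 drawn from populations of different sizes.
for size in (10_000, 100_000):
    print(size, round(margin_of_error(1_000, size), 4))

# The same 10 percent sampling rate applied to a small and a large population.
for size in (200, 10_000):
    print(size, round(margin_of_error(size // 10, size), 4))
```

Under these assumptions a sample of 1,000 gives a margin of error of roughly 3 percentage points for both large populations, while a 10 percent sample is far less accurate for a population of 200 than for a population of 10,000.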
Finally, consider
the guidelines that appear in the table below. They help us understand
that there is always a tradeoff between sample size and sampling error.
| | Sample size is smaller | Sample size is larger |
| Sampling is random | Generalizability is high; errors may be high | Least error; most confidence |
| Sampling is non random | Most error; least confidence | Errors may be low; generalizability is low |
How does a
researcher sort through this information?
Page created March 17, 2001. Copyright - Antonia D'Onofrio - 2001/2002/2003.