Suppose you randomly sampled 10 people from the population of women in Houston, Texas, between the ages of 21 and 35 years and computed the mean height of your sample. You would not expect your sample mean to be equal to the mean of all women in Houston. It might be somewhat lower or it might be somewhat higher, but it would not equal the population mean exactly. Similarly, if you took a second sample of 10 people from the same population, you would not expect the mean of this second sample to equal the mean of the first sample.
Recall that inferential statistics concerns generalizing from a sample to a population. A critical part of inferential statistics involves determining how far sample statistics are likely to vary from each other and from the population parameter. (In this example, the sample means are sample statistics and the population parameter is the population mean.) As the later portions of this chapter show (Sampling Distribution of the Mean and Sampling Distribution of Difference Between Means), these determinations are based on sampling distributions.
We will illustrate the concept of sampling distributions with a simple example. Consider three pool balls, each with a number on it (1, 2, and 3). Two of the balls are selected randomly (with replacement) and the average of their numbers is computed.
All possible outcomes are shown in the table below.
Outcome | Ball 1 | Ball 2 | Mean |
---|---|---|---|
1 | 1 | 1 | 1.0 |
2 | 1 | 2 | 1.5 |
3 | 1 | 3 | 2.0 |
4 | 2 | 1 | 1.5 |
5 | 2 | 2 | 2.0 |
6 | 2 | 3 | 2.5 |
7 | 3 | 1 | 2.0 |
8 | 3 | 2 | 2.5 |
9 | 3 | 3 | 3.0 |
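The table of outcomes can also be generated programmatically. The following is a minimal sketch in Python, assuming the three balls are represented by the list `balls` (a name chosen here for illustration); it enumerates every ordered pair that sampling with replacement can produce and computes the mean of each pair.

```python
from itertools import product

balls = [1, 2, 3]  # numbers on the three pool balls

# Sampling with replacement: every ordered pair (ball 1, ball 2) is possible,
# including pairs in which the same ball is drawn twice.
outcomes = [(b1, b2, (b1 + b2) / 2) for b1, b2 in product(balls, repeat=2)]

# Print one row per outcome: outcome number, ball 1, ball 2, mean.
for i, (b1, b2, mean) in enumerate(outcomes, start=1):
    print(f"{i} | {b1} | {b2} | {mean:.1f}")
```

Because the sampling is with replacement, there are 3 × 3 = 9 ordered outcomes rather than the 3 unordered pairs of distinct balls.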
Notice that all the means are either 1.0, 1.5, 2.0, 2.5, or 3.0. The frequencies of these means are shown in the table below. The relative frequencies are equal to the frequencies divided by nine because there are nine possible outcomes.
Mean | Frequency | Relative Frequency |
---|---|---|
1.0 | 1 | 0.111 |
1.5 | 2 | 0.222 |
2.0 | 3 | 0.333 |
2.5 | 2 | 0.222 |
3.0 | 1 | 0.111 |
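The frequency counts can be checked with a short script. This is a sketch rather than a prescribed method: it recomputes the nine sample means, tallies them with `collections.Counter`, and divides each frequency by the number of outcomes. The variable names (`means`, `frequencies`, `relative`) are illustrative choices.

```python
from collections import Counter
from itertools import product

balls = [1, 2, 3]

# Mean of each of the nine equally likely ordered outcomes.
means = [(b1 + b2) / 2 for b1, b2 in product(balls, repeat=2)]

# Frequency of each distinct mean, and relative frequency = frequency / 9.
frequencies = Counter(means)
relative = {m: f / len(means) for m, f in frequencies.items()}

for m in sorted(frequencies):
    print(f"{m:.1f} | {frequencies[m]} | {relative[m]:.3f}")
```

The relative frequencies sum to 1, as they must for the result to serve as a probability distribution.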
Plotting these relative frequencies against the means gives a relative frequency distribution of the means. This distribution is also a probability distribution, since the Y-axis is both the relative frequency and the probability of obtaining a given mean from a sample of two balls.
This distribution is called the sampling distribution of the mean. Specifically, it is the sampling distribution of the mean for a sample size of 2. For this simple example, the distribution of pool balls and the sampling distribution are both discrete distributions. The pool balls have only the numbers 1, 2, and 3, and a sample mean can have one of only five possible values.
There is an alternative way of conceptualizing a sampling distribution that will be useful for more complex distributions. Imagine that two balls are sampled (with replacement) and the mean of the two balls is computed and recorded. Then this process is repeated for a second sample, a third sample, and eventually thousands of samples. After thousands of samples are taken and the mean computed for each, a relative frequency distribution is drawn. The more samples, the closer the relative frequency distribution will come to the sampling distribution described above. As the number of samples approaches infinity, the relative frequency distribution will approach the sampling distribution. This means that you can conceive of a sampling distribution as being a relative frequency distribution based on a very large number of samples. To be strictly correct, the relative frequency distribution only equals the sampling distribution exactly when there is an infinite number of samples.
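The repeated-sampling view described above can be tried out with a small simulation. The sketch below is one possible way to do it, assuming Python's standard `random` module as the source of randomness; the function name `simulate_sampling_distribution` and the `seed` parameter are hypothetical names introduced here for illustration.

```python
import random
from collections import Counter

def simulate_sampling_distribution(num_samples, seed=0):
    """Estimate the sampling distribution of the mean of two balls
    drawn with replacement, using num_samples simulated samples."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    balls = [1, 2, 3]
    counts = Counter()
    for _ in range(num_samples):
        b1, b2 = rng.choice(balls), rng.choice(balls)  # with replacement
        counts[(b1 + b2) / 2] += 1
    # Convert counts to relative frequencies.
    return {mean: counts[mean] / num_samples for mean in sorted(counts)}

# Exact sampling distribution from the relative frequency table.
exact = {1.0: 1 / 9, 1.5: 2 / 9, 2.0: 3 / 9, 2.5: 2 / 9, 3.0: 1 / 9}

for n in (100, 10_000, 100_000):
    estimate = simulate_sampling_distribution(n)
    # Largest gap between simulated and exact relative frequencies.
    max_gap = max(abs(estimate.get(m, 0.0) - p) for m, p in exact.items())
    print(f"{n} samples: largest gap = {max_gap:.4f}")
```

With more samples the largest gap tends to shrink, although for any finite number of samples the simulated relative frequencies will generally differ slightly from the exact values.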