
For the sampling distribution of sample means, the Central Limit Theorem told us the expected value of the sample means and the standard deviation of the sampling distribution. The Central Limit Theorem provides the same information for the sampling distribution of proportions. The answers are:

  1. The expected value of the mean of the sampling distribution of sample proportions, $\mu_{p'}$, is the population proportion, p.
  2. The standard deviation of the sampling distribution of sample proportions, $\sigma_{p'}$, is the population standard deviation divided by the square root of the sample size, n.

Both of these conclusions are the same as those we found for the sampling distribution of sample means. However, in this case, because the mean and standard deviation of the binomial distribution both rely upon p, the formula for the standard deviation of the sampling distribution requires algebraic manipulation to be useful. We will take that up in the next chapter. The proof of these important conclusions from the Central Limit Theorem is provided below.

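$$E(p') = E\left(\frac{x}{n}\right) = \left(\frac{1}{n}\right)E(x) = \left(\frac{1}{n}\right)np = p$$

(The expected value of X, E(x), is simply the mean of the binomial distribution, which we know to be np.)

$$\sigma_{p'}^{2} = \mathrm{Var}(p') = \mathrm{Var}\left(\frac{x}{n}\right) = \frac{1}{n^{2}}\left(\mathrm{Var}(x)\right) = \frac{1}{n^{2}}\left(np(1-p)\right) = \frac{p(1-p)}{n}$$

$$\sigma_{p'} = \sqrt{\frac{p(1-p)}{n}}$$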
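The two results above can be checked numerically. The following is a minimal sketch, not part of the original text, that computes the mean and variance of p' = x/n exactly from the binomial probabilities. The function name `sample_proportion_moments` and the values n = 50 and p = 0.3 are illustrative assumptions.

```python
from math import comb, sqrt


def sample_proportion_moments(n, p):
    """Exact mean and variance of p' = x/n when x follows Binomial(n, p)."""
    # Probability of each possible count x = 0, 1, ..., n
    pmf = [comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(n + 1)]
    # E(p') is the probability-weighted average of x/n
    mean = sum((x / n) * pr for x, pr in enumerate(pmf))
    # Var(p') is the probability-weighted squared deviation from that mean
    var = sum((x / n - mean) ** 2 * pr for x, pr in enumerate(pmf))
    return mean, var


# Illustrative values (assumptions, not taken from the text)
n, p = 50, 0.3
mean, var = sample_proportion_moments(n, p)

print("E(p') computed:", mean, " formula p:", p)
print("sd(p') computed:", sqrt(var), " formula:", sqrt(p * (1 - p) / n))
```

Up to floating-point rounding, the computed values should agree with p and with the square root of p(1 − p)/n.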

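| Parameter | Population Distribution | Sample | Sampling Distribution of p's |
|---|---|---|---|
| Mean | $\mu = np$ | $p' = \frac{x}{n}$ | $p'$ and $E(p') = p$ |
| Standard Deviation | $\sigma = \sqrt{npq}$ | | $\sigma_{p'} = \sqrt{\frac{p(1-p)}{n}}$ |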

The table above summarizes these results and shows the relationship between the population, sample, and sampling distribution. Notice the parallel between this table and Table 7.1 for the case where the random variable is continuous and we were developing the sampling distribution for means.

Reviewing the formula for the standard deviation of the sampling distribution for proportions, we see that as n increases the standard deviation decreases. This is the same observation we made for the standard deviation of the sampling distribution for means. Again, as the sample size increases, the point estimate for either µ or p comes from a narrower and narrower distribution. We concluded that, for a given level of probability, the range from which the point estimate comes is smaller as the sample size, n, increases. Figure 7.7 on page 295 shows this result for the case of sample means. Simply substitute $p'$ for $\bar{x}$ and we can see the impact of the sample size on the estimate of the sample proportion.
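To make the effect of n concrete, the short sketch below evaluates the standard deviation of the sampling distribution of proportions for several sample sizes. The value p = 0.5, the list of sample sizes, and the helper name `standard_error_of_proportion` are assumptions chosen for illustration.

```python
from math import sqrt


def standard_error_of_proportion(p, n):
    """Standard deviation of the sampling distribution of p'."""
    return sqrt(p * (1 - p) / n)


p = 0.5  # assumed population proportion, for illustration only
sizes = [25, 100, 400, 1600]  # each sample size is 4 times the previous one
errors = [standard_error_of_proportion(p, n) for n in sizes]

for n, se in zip(sizes, errors):
    print(f"n = {n:5d}  standard deviation of p' = {se:.4f}")
```

Because n sits under a square root, quadrupling the sample size only halves the standard deviation.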

Finite population correction factor

We saw that the sample size has an important effect on the variance, and thus the standard deviation, of the sampling distribution. Also of interest is the proportion of the total population that has been sampled. We have assumed that the population is extremely large and that we have sampled a small part of it. As the population becomes smaller and we sample a larger number of observations, the sample observations are no longer independent of each other. To correct for this, the Finite Population Correction Factor can be used to adjust the variance of the sampling distribution. It is appropriate when more than 5% of the population is being sampled and the population size is known. There are cases when the population size is known, and therefore the correction factor must be applied. The issue arises for both the sampling distribution of the means and the sampling distribution of proportions. With the Finite Population Correction Factor, the standardizing formula for sample means becomes:

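$$Z = \frac{\bar{x} - \mu}{\frac{\sigma}{\sqrt{n}} \cdot \sqrt{\frac{N-n}{N-1}}}$$

and the standard deviation of the sampling distribution of proportions becomes:

$$\sigma_{p'} = \sqrt{\frac{p(1-p)}{n}} \times \sqrt{\frac{N-n}{N-1}}$$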
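The size of the correction depends on how much of the population is sampled. The sketch below evaluates the factor for a population of 4,000 at several sample sizes. The function name `fpc` and the chosen sample sizes are illustrative assumptions.

```python
from math import sqrt


def fpc(N, n):
    """Finite Population Correction Factor: sqrt((N - n) / (N - 1))."""
    return sqrt((N - n) / (N - 1))


N = 4000  # population size
sample_sizes = [40, 200, 1000, 4000]  # 1%, 5%, 25%, and 100% of N
factors = {n: fpc(N, n) for n in sample_sizes}

for n, factor in factors.items():
    print(f"n = {n:4d} ({n / N:.0%} of N)  correction factor = {factor:.4f}")
```

For a 1% sample the factor is very close to 1 and changes little. It shrinks as the sampled share grows and reaches 0 when the whole population is sampled, since a complete census leaves no sampling variability.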

The following examples show how to apply the factor. Sampling variances get adjusted using the above formula.

Professor Price learns that the population of White German Shepherds in the USA is 4,000 dogs, and the mean weight for German Shepherds is 75.45 pounds. He also learns that the population standard deviation is 10.37 pounds.

If the sample size is 100 dogs, find the probability that a sample will have a mean that differs from the true population mean by less than 2 pounds.

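$$N = 4000, \quad n = 100, \quad \sigma = 10.37, \quad \mu = 75.45, \quad (\bar{x} - \mu) = \pm 2$$

$$Z = \frac{\bar{x} - \mu}{\frac{\sigma}{\sqrt{n}} \cdot \sqrt{\frac{N-n}{N-1}}} = \frac{\pm 2}{\frac{10.37}{\sqrt{100}} \cdot \sqrt{\frac{4000-100}{4000-1}}} = \pm 1.95$$

$$f(Z) = 0.4744 \cdot 2 = 0.9488$$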
Note that "differs by less" references the area on both sides of the mean within 2 pounds right or left.
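The same calculation can be sketched in Python, using the standard library's `statistics.NormalDist` in place of a printed Z table. The variable names are illustrative. Because the code does not round Z to two decimals before looking up the area, its result may differ slightly from the table-based answer.

```python
from math import sqrt
from statistics import NormalDist

# Values from the example
N, n = 4000, 100  # population size and sample size
sigma = 10.37     # population standard deviation, in pounds
margin = 2.0      # allowed distance between sample mean and population mean

# Standard deviation of the sampling distribution with the finite population correction
se = sigma / sqrt(n) * sqrt((N - n) / (N - 1))

# Z value for a sample mean 2 pounds above the population mean
z = margin / se

# Area under the standard normal curve between -z and +z
standard_normal = NormalDist()
prob = standard_normal.cdf(z) - standard_normal.cdf(-z)

print(f"Z = +/-{z:.2f}, probability = {prob:.4f}")
```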

When a customer places an order with Rudy's On-Line Office Supplies, a computerized accounting information system (AIS) automatically checks to see if the customer has exceeded his or her credit limit. Past records indicate that the probability of customers exceeding their credit limit is .06.

Suppose that on a given day, 3,000 orders are placed in total. If we randomly select 360 orders, what is the probability that between 10 and 20 customers will exceed their credit limit?

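$$N = 3000, \quad n = 360, \quad p = 0.06$$

$$\sigma_{p'} = \sqrt{\frac{p(1-p)}{n}} \times \sqrt{\frac{N-n}{N-1}} = \sqrt{\frac{0.06(1-0.06)}{360}} \times \sqrt{\frac{3000-360}{3000-1}} = 0.0117436$$

$$p_1 = \frac{10}{360} = 0.0278, \quad p_2 = \frac{20}{360} = 0.0556$$

$$Z = \frac{p' - p}{\sqrt{\frac{p(1-p)}{n}} \cdot \sqrt{\frac{N-n}{N-1}}} = \frac{0.0278 - 0.06}{0.011744} = -2.74$$

$$Z = \frac{p' - p}{\sqrt{\frac{p(1-p)}{n}} \cdot \sqrt{\frac{N-n}{N-1}}} = \frac{0.0556 - 0.06}{0.011744} = -0.38$$

$$p\left(\frac{0.0278 - 0.06}{0.011744} \le z \le \frac{0.0556 - 0.06}{0.011744}\right) = p(-2.74 \le z \le -0.38) = 0.4969 - 0.1480 = 0.3489$$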
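As a cross-check, the steps of this example can be sketched in Python with `statistics.NormalDist` standing in for the Z table. The variable names are illustrative. The code keeps full precision in the proportions and Z values, so its probability may differ from the table-based figure in the third or fourth decimal place.

```python
from math import sqrt
from statistics import NormalDist

# Values from the example
N, n = 3000, 360  # orders placed that day and orders sampled
p = 0.06          # probability that a customer exceeds the credit limit

# Standard deviation of p' with the finite population correction
se = sqrt(p * (1 - p) / n) * sqrt((N - n) / (N - 1))

# Sample proportions corresponding to 10 and 20 customers out of 360
p_low, p_high = 10 / n, 20 / n

# Standardize both bounds
z_low = (p_low - p) / se
z_high = (p_high - p) / se

# Area under the standard normal curve between the two Z values
standard_normal = NormalDist()
prob = standard_normal.cdf(z_high) - standard_normal.cdf(z_low)

print(f"standard deviation of p' = {se:.7f}")
print(f"Z from {z_low:.2f} to {z_high:.2f}, probability = {prob:.4f}")
```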

Formula review

Standard deviation of the sampling distribution of sample proportions: $\sigma_{p'} = \sqrt{\frac{p(1-p)}{n}}$

Mean of the sampling distribution of sample proportions: $p'$ and $E(p') = p$

Source:  OpenStax, Introductory statistics. OpenStax CNX. Aug 09, 2016 Download for free at http://legacy.cnx.org/content/col11776/1.26