Just as we constructed confidence intervals for proportions, or more formally for the population parameter p of the binomial distribution, we can also test hypotheses concerning p.
The population parameter for the binomial is p. The estimated value (point estimate) for p is p′, where p′ = x/n, x is the number of successes in the sample, and n is the sample size.
When you perform a hypothesis test of a population proportion p, you take a simple random sample from the population. The conditions for a binomial distribution must be met: there is a fixed number n of independent trials, each trial has only two outcomes (success or failure), and each trial has the same probability of success p. The shape of the binomial distribution also needs to be similar to the shape of the normal distribution. To ensure this, the quantities np′ and nq′ must both be greater than five (np′ > 5 and nq′ > 5). In this case the binomial distribution of a sample (estimated) proportion can be approximated by the normal distribution with mean $\mu = p$ and standard deviation $\sigma = \sqrt{pq/n}$. Remember that $q = 1 - p$. The normal-approximation test described here has no correction for this small-sample bias, so if these conditions are not met we cannot test the hypothesis this way with the data available at that time. We met this condition when we first estimated confidence intervals for a binomial.
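The sample-size condition above can be checked mechanically before running a test. The following is a minimal sketch; the function name `normal_approx_ok` and its `threshold` parameter are illustrative choices, not part of the text.

```python
def normal_approx_ok(x: int, n: int, threshold: float = 5.0) -> bool:
    """Return True when both n*p' and n*q' exceed the threshold.

    x is the number of successes and n the sample size.
    Since p' = x/n and q' = 1 - p', n*p' is simply x and n*q' is n - x.
    """
    successes = x          # n * p'
    failures = n - x       # n * q'
    return successes > threshold and failures > threshold


# A sample of 100 with 53 successes satisfies the condition;
# a hypothetical sample of 40 with only 3 successes does not.
print(normal_approx_ok(53, 100))
print(normal_approx_ok(3, 40))
```

Working with the counts directly avoids any floating-point rounding in the comparison with five.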
Again, we begin with the standardizing formula, modified because this is the distribution of a binomial proportion:

$$Z = \frac{p' - p}{\sqrt{\dfrac{pq}{n}}}$$

Substituting $p_0$, the hypothesized value of p, and $q_0 = 1 - p_0$, we have:

$$Z_c = \frac{p' - p_0}{\sqrt{\dfrac{p_0 q_0}{n}}}$$

This is the test statistic for testing hypothesized values of p, where the null and alternative hypotheses take one of the following forms:
Two-Tailed Test | One-Tailed Test (Right Tail) | One-Tailed Test (Left Tail) |
---|---|---|
$H_0: p = p_0$ | $H_0: p \le p_0$ | $H_0: p \ge p_0$ |
$H_a: p \ne p_0$ | $H_a: p > p_0$ | $H_a: p < p_0$ |
The decision rule stated above applies here also: if the calculated value of $Z_c$ shows that the sample proportion is "too many" standard deviations from the hypothesized proportion, the null hypothesis cannot be accepted. What counts as "too many" is predetermined by the analyst, depending on the level of confidence required in the test.
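The test statistic and the decision rule for the three forms of the hypotheses can be sketched in a few lines of Python. This is an illustration under stated assumptions: the function names `z_statistic` and `reject_null` and the labels `"two-sided"`, `"greater"`, and `"less"` are hypothetical, and the critical values come from the standard library's `statistics.NormalDist`.

```python
from math import sqrt
from statistics import NormalDist


def z_statistic(x: int, n: int, p0: float) -> float:
    """Number of standard deviations between p' = x/n and the hypothesized p0."""
    p_hat = x / n
    return (p_hat - p0) / sqrt(p0 * (1 - p0) / n)


def reject_null(z: float, alpha: float, alternative: str) -> bool:
    """Apply the decision rule for one of the three forms of H_a."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if alternative == "two-sided":   # H_a: p != p0
        return abs(z) > std_normal.inv_cdf(1 - alpha / 2)
    if alternative == "greater":     # H_a: p > p0
        return z > std_normal.inv_cdf(1 - alpha)
    if alternative == "less":        # H_a: p < p0
        return z < std_normal.inv_cdf(alpha)
    raise ValueError("alternative must be 'two-sided', 'greater', or 'less'")


# Hypothetical data: 60 successes in 100 trials, tested against p0 = 0.50.
z = z_statistic(60, 100, 0.50)
print(z, reject_null(z, 0.05, "two-sided"))
```

In a two-tailed test the significance level is split between the two tails, which is why the critical value there uses `alpha / 2`.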
Joon believes that 50% of first-time brides in the United States are younger than their grooms. She performs a hypothesis test to determine if the percentage is the same or different from 50%. Joon samples 100 first-time brides and 53 reply that they are younger than their grooms. For the hypothesis test, she uses a 5% level of significance.
STEP 1 : Set the null and alternative hypotheses.
$H_0: p = 0.50 \qquad H_a: p \ne 0.50$
The words "is the same or different from" tell you this is a two-tailed test. The Type I and Type II errors are as follows. The Type I error is to conclude that the proportion of first-time brides who are younger than their grooms is different from 50% when, in fact, the proportion is actually 50%. (Reject the null hypothesis when the null hypothesis is true.) The Type II error is to conclude that there is not enough evidence that the proportion of first-time brides who are younger than their grooms differs from 50% when, in fact, the proportion does differ from 50%. (Do not reject the null hypothesis when the null hypothesis is false.)
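To make the two error types concrete, the probability of each can be computed directly from the binomial distribution for a sample of 100. The sketch below is illustrative only: the function name `rejection_probability` is hypothetical, the critical value of 1.96 corresponds to a two-tailed test at the 5% level, and the "true" proportion of 0.60 used for the Type II error is an assumed value chosen for the example, not something stated in the problem.

```python
from math import comb, sqrt


def rejection_probability(n: int, p0: float, p_true: float,
                          z_crit: float = 1.96) -> float:
    """Probability that the two-tailed test rejects H0: p = p0
    when the real population proportion is p_true."""
    se = sqrt(p0 * (1 - p0) / n)
    total = 0.0
    for x in range(n + 1):
        z = (x / n - p0) / se
        if abs(z) > z_crit:
            # Binomial probability of exactly x successes under p_true.
            total += comb(n, x) * p_true**x * (1 - p_true)**(n - x)
    return total


# Type I error: rejecting when the null hypothesis (p = 0.50) is true.
type_1 = rejection_probability(100, 0.50, 0.50)

# Type II error: failing to reject when p is really 0.60 (assumed value).
type_2 = 1 - rejection_probability(100, 0.50, 0.60)

print(f"P(Type I) = {type_1:.4f}, P(Type II | p = 0.60) = {type_2:.4f}")
```

Because the number of successes is a whole number, the Type I error probability will generally be close to the nominal 5% rather than exactly equal to it.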
STEP 2 : Decide the level of confidence and draw the graph showing the critical value
The level of confidence has been set by the problem at the 95% level. Because this is a two-tailed test, one-half of the alpha value will be in the upper tail and one-half in the lower tail, as shown on the graph. The critical value for the normal distribution at the 95% level of confidence is 1.96. This can easily be found on the Student's t-table at the very bottom, at infinite degrees of freedom, remembering that at infinity the t-distribution is the normal distribution. Of course the value can also be found on the normal table, but you have to look for one-half of 0.95 (0.475) inside the body of the table and then read out to the side and top for the number of standard deviations.
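The table lookup described above can also be reproduced in software. As a small sketch, Python's standard-library `statistics.NormalDist` gives the same critical value; the variable names are illustrative.

```python
from statistics import NormalDist

alpha = 0.05                      # 5% level of significance
std_normal = NormalDist()         # standard normal: mean 0, sd 1

# Two-tailed test: alpha/2 in each tail, so 0.975 of the area lies
# below the upper critical value.
z_crit = std_normal.inv_cdf(1 - alpha / 2)
print(round(z_crit, 2))           # 1.96

# Area between the mean and the critical value, which is the number
# looked up in the body of a printed normal table.
area_from_mean = std_normal.cdf(z_crit) - 0.5
print(round(area_from_mean, 3))   # 0.475
```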
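STEP 3 : Calculate the sample statistic and the value of the test statistic.
The test statistic for testing proportions follows the standard normal distribution, Z, and is:

$$Z_c = \frac{p' - p_0}{\sqrt{\dfrac{p_0 q_0}{n}}} = \frac{0.53 - 0.50}{\sqrt{\dfrac{0.5(0.5)}{100}}} = 0.60$$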
For this case, the sample of 100 found 53 first-time brides who were younger than their grooms. The sample proportion is p′ = 53/100 = 0.53. The test question, therefore, is: "Is 0.53 significantly different from 0.50?" Putting these values into the formula for the test statistic, we find that 0.53 is only 0.60 standard deviations away from 0.50. This is barely off the mean of the standard normal distribution, which is zero. There is virtually no difference between the sample proportion and the hypothesized proportion.
STEP 4 : Compare the test statistic and the critical value.
The calculated value is well within the critical values of ±1.96 standard deviations, and thus we cannot reject the null hypothesis. To reject the null hypothesis we need significant evidence of a difference between the hypothesized value and the sample value. In this case the sample value is very nearly the same as the hypothesized value, measured in terms of standard deviations.
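The arithmetic of Steps 3 and 4 can be verified with a short script. This is a sketch using only the numbers given in the example; the variable names are illustrative.

```python
from math import sqrt

n = 100        # first-time brides sampled
x = 53         # number who are younger than their grooms
p0 = 0.50      # hypothesized proportion
critical = 1.96  # two-tailed critical value at the 5% level

p_hat = x / n                                # sample proportion p'
standard_error = sqrt(p0 * (1 - p0) / n)     # sqrt(p0 * q0 / n)
z = (p_hat - p0) / standard_error            # test statistic

reject = abs(z) > critical
print(f"z = {z:.2f}, reject H0: {reject}")   # z = 0.60, reject H0: False
```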
STEP 5 : Reach a conclusion
The formal conclusion would be "At a 95% level of confidence we cannot reject the null hypothesis that 50% of first-time brides in the United States are younger than their grooms." Less formally we would say that "There is no evidence that the proportion of first-time brides in the United States who are younger than their grooms differs from one-half." Notice the length to which the conclusion goes to include all of the conditions that are attached to it. These are first-time brides; these are marriages in the United States. All this matters. Statisticians, for all the criticism they receive, are careful to be very specific even when this seems trivial. Statisticians cannot say more than they know, and the data constrain the conclusion to be within the metes and bounds of the data.