
As before, the Central Limit Theorem gives us the standard deviation of the sampling distribution. It also tells us that the mean of the sampling distribution of differences in sample means equals the difference in the population means. Mathematically, this can be stated:

$$E(\mu_{\bar{x}_1} - \mu_{\bar{x}_2}) = \mu_1 - \mu_2$$

Because we do not know the population standard deviations, we estimate them using the two sample standard deviations from our independent samples. For the hypothesis test, we calculate the estimated standard deviation, or standard error, of the difference in sample means, $\bar{X}_1 - \bar{X}_2$.

The standard error is:

$$\sqrt{\frac{(s_1)^2}{n_1} + \frac{(s_2)^2}{n_2}}$$

The test statistic ($t$-score) is calculated as follows:

$$t_c = \frac{(\bar{x}_1 - \bar{x}_2) - \delta_0}{\sqrt{\frac{(s_1)^2}{n_1} + \frac{(s_2)^2}{n_2}}}$$

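Where:

  • $s_1$ and $s_2$, the sample standard deviations, are estimates of $\sigma_1$ and $\sigma_2$, respectively.
  • $\sigma_1$ and $\sigma_2$ are the unknown population standard deviations.
  • $\bar{x}_1$ and $\bar{x}_2$ are the sample means.
  • $\mu_1$ and $\mu_2$ are the population means.
  • $n_1$ and $n_2$ are the sample sizes.
  • $\delta_0$ is the hypothesized difference between the two population means.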
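The standard error and test statistic defined above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names `standard_error` and `t_statistic` and the summary statistics passed to them are hypothetical, chosen only so the arithmetic is easy to follow.

```python
import math


def standard_error(s1, n1, s2, n2):
    """Estimated standard deviation of the difference in sample means."""
    return math.sqrt(s1**2 / n1 + s2**2 / n2)


def t_statistic(xbar1, xbar2, s1, n1, s2, n2, delta0=0.0):
    """Test statistic for H0: mu1 - mu2 = delta0, using unpooled variances."""
    return ((xbar1 - xbar2) - delta0) / standard_error(s1, n1, s2, n2)


# Hypothetical summary statistics (not taken from the text).
se = standard_error(s1=2.0, n1=16, s2=3.0, n2=36)
t_c = t_statistic(xbar1=10.0, xbar2=9.0, s1=2.0, n1=16, s2=3.0, n2=36)
print(f"standard error = {se:.4f}, t = {t_c:.4f}")
```

Each sample contributes its own variance term, $s^2/n$, so a small or noisy sample inflates the standard error and pulls the test statistic toward zero.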

The number of degrees of freedom ($df$) requires a somewhat complicated calculation, and the result is not always a whole number. The test statistic calculated previously is approximated by the Student's $t$-distribution with $df$ as follows:

Degrees of freedom

$$df = \frac{\left(\frac{(s_1)^2}{n_1} + \frac{(s_2)^2}{n_2}\right)^2}{\left(\frac{1}{n_1 - 1}\right)\left(\frac{(s_1)^2}{n_1}\right)^2 + \left(\frac{1}{n_2 - 1}\right)\left(\frac{(s_2)^2}{n_2}\right)^2}$$

When both sample sizes $n_1$ and $n_2$ are five or larger, the Student's $t$ approximation is very good. If each sample has more than 30 observations, the degrees of freedom can instead be calculated as $n_1 + n_2 - 2$.
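As a rough sketch, the degrees-of-freedom formula can be written as a small Python function. The name `welch_df` and the input values are hypothetical and serve only to show how the result compares with the simpler $n_1 + n_2 - 2$ count.

```python
def welch_df(s1, n1, s2, n2):
    """Approximate degrees of freedom for the two-sample t statistic."""
    v1 = s1**2 / n1  # variance term contributed by sample 1
    v2 = s2**2 / n2  # variance term contributed by sample 2
    return (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))


# Hypothetical inputs (not taken from the text).
df_unequal = welch_df(s1=2.0, n1=16, s2=3.0, n2=36)
df_equal = welch_df(s1=1.0, n1=40, s2=1.0, n2=40)
print(f"unequal samples: df = {df_unequal:.2f} versus n1 + n2 - 2 = {16 + 36 - 2}")
print(f"equal samples:   df = {df_equal:.2f} versus n1 + n2 - 2 = {40 + 40 - 2}")
```

The formula never gives more than $n_1 + n_2 - 2$ degrees of freedom, and it matches that value when the two samples have the same size and the same standard deviation.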

Because the sampling distribution is that of the difference in sample means, the null and alternative hypotheses take the form:

$$H_0: \mu_1 - \mu_2 = \delta_0$$
$$H_a: \mu_1 - \mu_2 \neq \delta_0$$

where $\delta_0$ is the hypothesized difference between the two means. If the question is simply “is there any difference between the means?” then $\delta_0 = 0$ and the null and alternative hypotheses become:

$$H_0: \mu_1 = \mu_2$$
$$H_a: \mu_1 \neq \mu_2$$

An example of when $\delta_0$ might not be zero is when the comparison of the two groups requires a specific difference for the decision to be meaningful. Imagine that you are making a capital investment and are considering replacing your current model of machine with another. You measure the productivity of your machines by the speed at which they produce the product. A contender to replace the old model may be faster in terms of product throughput but also more expensive, and it may carry higher maintenance costs, setup costs, and so on. The null hypothesis would be set up so that the new machine has to be faster than the old one by enough to cover these extra costs. This form of the null and alternative hypotheses shows how valuable this particular hypothesis test can be. For most of our work, however, we will be testing simple hypotheses that ask whether there is any difference between the two distribution means.
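To see how a nonzero hypothesized difference changes the test statistic, consider the following sketch of the machine comparison. Every number in it is hypothetical (throughput in units per hour, a required margin of 10 units per hour, and 36 timed runs per machine); none comes from the text.

```python
import math

# Hypothetical summary statistics for the machine comparison.
xbar_new, s_new, n_new = 112.0, 6.0, 36  # candidate replacement machine
xbar_old, s_old, n_old = 100.0, 8.0, 36  # current machine
delta0 = 10.0  # margin the new machine must exceed to cover its extra costs

# Standard error of the difference in sample means.
se = math.sqrt(s_new**2 / n_new + s_old**2 / n_old)

# Test statistic when asking "is there any difference?" (delta0 = 0).
t_any_difference = (xbar_new - xbar_old) / se

# Test statistic when the difference must exceed the required margin.
t_required_margin = ((xbar_new - xbar_old) - delta0) / se

print(f"t with delta0 = 0:  {t_any_difference:.2f}")
print(f"t with delta0 = 10: {t_required_margin:.2f}")
```

With these made-up figures the observed 12-unit advantage looks large against a hypothesized difference of zero, but much less so once the 10-unit margin is subtracted, which is the point of building the margin into the null hypothesis.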

Independent groups

The average amount of time boys and girls aged seven to 11 spend playing sports each day is believed to be the same. A study is done and data are collected, resulting in the data in the table below.

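| Group | Sample Size | Average Number of Hours Playing Sports Per Day | Sample Standard Deviation |
|-------|-------------|------------------------------------------------|---------------------------|
| Girls | 9           | 2                                              | 0.866                     |
| Boys  | 16          | 3.2                                            | 1.00                      |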
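The formulas from this section can be applied to the data above as follows. This is a sketch under stated assumptions: the hypothesized difference is taken to be zero because the averages are "believed to be the same", the alternative is assumed to be two-sided, and the p-value is obtained by numerically integrating the Student's $t$ density with Simpson's rule so that only the Python standard library is needed. The helper names `t_pdf` and `two_sided_p` are hypothetical.

```python
import math


def t_pdf(x, df):
    """Density of the Student's t-distribution; df may be fractional."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(t, df, steps=2000):
    """Two-sided p-value using Simpson's rule on [0, |t|]; steps must be even."""
    b = abs(t)
    h = b / steps
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = 2 * (h / 3) * total  # area between -|t| and |t|
    return 1 - central_area


# Summary statistics from the table (sample 1 = girls, sample 2 = boys).
n1, xbar1, s1 = 9, 2.0, 0.866
n2, xbar2, s2 = 16, 3.2, 1.00
delta0 = 0.0  # assumption: H0 states that the two means are equal

v1, v2 = s1**2 / n1, s2**2 / n2
se = math.sqrt(v1 + v2)
t_c = ((xbar1 - xbar2) - delta0) / se
df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
p_value = two_sided_p(t_c, df)

print(f"standard error = {se:.4f}")
print(f"t = {t_c:.4f}, df = {df:.4f}")
print(f"two-sided p-value = {p_value:.4f}")
```

The test statistic is negative because the girls' sample mean is below the boys' sample mean, and the degrees of freedom come out as a non-integer, as the text warns.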

Source:  OpenStax, Introductory statistics. OpenStax CNX. Aug 09, 2016 Download for free at http://legacy.cnx.org/content/col11776/1.26