This course is a short series of lectures on Introductory Statistics. Topics covered are listed in the Table of Contents. The notes were prepared by Ewa Paszek and Marek Kimmel. The development of this course has been supported by NSF grant 0203396.

Confidence intervals II

Confidence intervals for means

In the preceding module (Confidence Intervals I), the confidence interval for the mean $\mu$ of a normal distribution was found, assuming that the value of the standard deviation $\sigma$ is known. In most applications, however, the value of $\sigma$ is unknown, although in some cases one might have a very good idea about its value.

Suppose that the underlying distribution is normal and that $\sigma^2$ is unknown. It can be shown that, given a random sample $X_1, X_2, \ldots, X_n$ from a normal distribution, the statistic $T = \frac{\bar{X} - \mu}{S/\sqrt{n}}$ has a t distribution with $r = n - 1$ degrees of freedom, where $S^2$ is the usual unbiased estimator of $\sigma^2$ (see t distribution).

Select $t_{\alpha/2}(n-1)$ so that $P[T \ge t_{\alpha/2}(n-1)] = \alpha/2$. Then

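$$\begin{aligned}
1 - \alpha &= P\left[-t_{\alpha/2}(n-1) \le \frac{\bar{X} - \mu}{S/\sqrt{n}} \le t_{\alpha/2}(n-1)\right] \\
&= P\left[-t_{\alpha/2}(n-1)\frac{S}{\sqrt{n}} \le \bar{X} - \mu \le t_{\alpha/2}(n-1)\frac{S}{\sqrt{n}}\right] \\
&= P\left[-\bar{X} - t_{\alpha/2}(n-1)\frac{S}{\sqrt{n}} \le -\mu \le -\bar{X} + t_{\alpha/2}(n-1)\frac{S}{\sqrt{n}}\right] \\
&= P\left[\bar{X} - t_{\alpha/2}(n-1)\frac{S}{\sqrt{n}} \le \mu \le \bar{X} + t_{\alpha/2}(n-1)\frac{S}{\sqrt{n}}\right].
\end{aligned}$$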

Thus the observations of a random sample provide $\bar{x}$ and $s^2$, and $\left[\bar{x} - t_{\alpha/2}(n-1)\frac{s}{\sqrt{n}},\ \bar{x} + t_{\alpha/2}(n-1)\frac{s}{\sqrt{n}}\right]$ is a $100(1-\alpha)\%$ confidence interval for $\mu$.

Let $X$ equal the amount of butterfat in pounds produced by a typical cow during a 305-day milk production period between her first and second calves. Assume the distribution of $X$ is $N(\mu, \sigma^2)$. To estimate $\mu$, a farmer measures the butterfat production for $n = 20$ cows, yielding the following data:

481 537 513 583 453 510 570
500 457 555 618 327 350 643
499 421 505 637 599 392

For these data, $\bar{x} = 507.50$ and $s = 89.75$. Thus a point estimate of $\mu$ is $\bar{x} = 507.50$. Since $t_{0.05}(19) = 1.729$, a 90% confidence interval for $\mu$ is $507.50 \pm 1.729\left(\frac{89.75}{\sqrt{20}}\right)$, or equivalently, [472.80, 542.20].
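The arithmetic of this example can be checked with a short Python sketch. The function name `t_interval` is made up for illustration, and the critical value 1.729 is copied from the t table value quoted above rather than computed.

```python
import math

def t_interval(xbar, s, n, t_crit):
    """Return (lower, upper) for xbar +/- t_crit * s / sqrt(n)."""
    half_width = t_crit * s / math.sqrt(n)
    return xbar - half_width, xbar + half_width

# Summary statistics and t_{0.05}(19) = 1.729 as quoted in the example.
lower, upper = t_interval(xbar=507.50, s=89.75, n=20, t_crit=1.729)
print(f"[{lower:.2f}, {upper:.2f}]")  # prints "[472.80, 542.20]"
```

Passing the critical value in as an argument keeps the sketch free of any statistics library; a table lookup supplies it, as in the text.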


Let $T$ have a t distribution with $n - 1$ degrees of freedom. Then $t_{\alpha/2}(n-1) > z_{\alpha/2}$. Consequently, the interval $\bar{x} \pm z_{\alpha/2}\,\sigma/\sqrt{n}$ is expected to be shorter than the interval $\bar{x} \pm t_{\alpha/2}(n-1)\,s/\sqrt{n}$. After all, more information, namely the value of $\sigma$, is used in constructing the first interval. However, the length of the second interval depends heavily on the value of $s$. If the observed $s$ is smaller than $\sigma$, the second scheme could yield the shorter confidence interval. But on average, $\bar{x} \pm z_{\alpha/2}\,\sigma/\sqrt{n}$ is the shorter of the two confidence intervals.
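To make the comparison concrete, the following sketch contrasts the two half-widths at the 90% level for $n = 20$. It assumes, purely for illustration, that $\sigma$ is known and happens to equal the observed $s = 89.75$ from the butterfat example; the t critical value 1.729 is again the table value quoted earlier, while the z critical value comes from Python's standard library.

```python
import math
from statistics import NormalDist

alpha = 0.10
n = 20
z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # z_{0.05}, roughly 1.645
t_crit = 1.729  # t_{0.05}(19), table value quoted in the butterfat example

# Assumption for illustration only: sigma is known and equals the observed s.
sigma = s = 89.75

z_half = z_crit * sigma / math.sqrt(n)  # half-width when sigma is known
t_half = t_crit * s / math.sqrt(n)      # half-width when sigma is estimated by s
print(f"z half-width: {z_half:.2f}, t half-width: {t_half:.2f}")
```

With equal $\sigma$ and $s$, the only difference between the two intervals is the critical value, so the z interval is the shorter one.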

If it is not possible to assume that the underlying distribution is normal, and $\mu$ and $\sigma$ are both unknown, approximate confidence intervals for $\mu$ can still be constructed using $T = \frac{\bar{X} - \mu}{S/\sqrt{n}}$, which now has only an approximate t distribution.

Generally, this approximation is quite good for many nonnormal distributions, in particular if the underlying distribution is symmetric, unimodal, and of the continuous type. However, if the distribution is highly skewed, there is great danger in using this approximation. In such a situation, it would be safer to use a nonparametric method for finding a confidence interval for the median of the distribution.
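A small simulation can illustrate how the quality of the approximation depends on the shape of the distribution. This is only a sketch under stated assumptions: the helper `coverage`, the choice of a standard normal and an exponential distribution (as one example of a skewed distribution), the number of trials, and the seed are all illustrative choices, not part of the original notes. The critical value 1.729 is $t_{0.05}(19)$, so the nominal level is 90% with $n = 20$.

```python
import math
import random

def coverage(draw, true_mean, n=20, t_crit=1.729, trials=10_000, seed=0):
    """Fraction of simulated t intervals that contain true_mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        xs = [draw(rng) for _ in range(n)]
        xbar = sum(xs) / n
        # Sample standard deviation with the usual n - 1 divisor.
        s = math.sqrt(sum((x - xbar) ** 2 for x in xs) / (n - 1))
        half = t_crit * s / math.sqrt(n)
        if xbar - half <= true_mean <= xbar + half:
            hits += 1
    return hits / trials

# Normal data: the t interval is exact, so coverage should be near 0.90.
cov_normal = coverage(lambda rng: rng.gauss(0.0, 1.0), true_mean=0.0)
# Exponential data (mean 1): a skewed case where the t interval is only approximate.
cov_skewed = coverage(lambda rng: rng.expovariate(1.0), true_mean=1.0)
print(f"normal: {cov_normal:.3f}, exponential: {cov_skewed:.3f}")
```

For skewed data the estimated coverage typically falls somewhat short of the nominal 90%, which is the danger the text warns about.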

Confidence intervals for variances

The confidence interval for the variance $\sigma^2$ is based on the sample variance $S^2 = \frac{1}{n-1}\sum_{i=1}^{n}(X_i - \bar{X})^2$.

To find a confidence interval for $\sigma^2$, we use the fact that the distribution of $(n-1)S^2/\sigma^2$ is $\chi^2(n-1)$. The constants $a$ and $b$ should be selected from the tabulated Chi Squared Distribution with $n - 1$ degrees of freedom such that $P\left(a \le \frac{(n-1)S^2}{\sigma^2} \le b\right) = 1 - \alpha$.

That is, select $a$ and $b$ so that the probabilities in the two tails are equal: $a = \chi^2_{1-\alpha/2}(n-1)$ and $b = \chi^2_{\alpha/2}(n-1)$. Then, solving the inequalities, we have $1 - \alpha = P\left(\frac{a}{(n-1)S^2} \le \frac{1}{\sigma^2} \le \frac{b}{(n-1)S^2}\right) = P\left(\frac{(n-1)S^2}{b} \le \sigma^2 \le \frac{(n-1)S^2}{a}\right)$.

Thus the probability that the random interval $[(n-1)S^2/b,\ (n-1)S^2/a]$ contains the unknown $\sigma^2$ is $1 - \alpha$. Once the values of $X_1, X_2, \ldots, X_n$ are observed to be $x_1, x_2, \ldots, x_n$ and $s^2$ is computed, the interval $[(n-1)s^2/b,\ (n-1)s^2/a]$ is a $100(1-\alpha)\%$ confidence interval for $\sigma^2$.

It follows that $\left[\sqrt{(n-1)/b}\,s,\ \sqrt{(n-1)/a}\,s\right]$ is a $100(1-\alpha)\%$ confidence interval for $\sigma$, the standard deviation.

Assume that the time in days required for maturation of seeds of a species of a flowering plant found in Mexico is $N(\mu, \sigma^2)$. A random sample of $n = 13$ seeds, both parents having narrow leaves, yielded $\bar{x} = 18.97$ days and $12s^2 = \sum_{i=1}^{13}(x_i - \bar{x})^2 = 128.41$.

A 90% confidence interval for $\sigma^2$ is $\left[\frac{128.41}{21.03},\ \frac{128.41}{5.226}\right] = [6.11, 24.57]$, because $5.226 = \chi^2_{0.95}(12)$ and $21.03 = \chi^2_{0.05}(12)$, as can be read from the tabulated Chi Squared Distribution. The corresponding 90% confidence interval for $\sigma$ is $[\sqrt{6.11},\ \sqrt{24.57}] = [2.47, 4.96]$.
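The same computation can be written as a short Python sketch. The function name `variance_interval` is hypothetical, and the chi-square quantiles 5.226 and 21.03 are the table values quoted above rather than values computed by the code.

```python
import math

def variance_interval(sum_sq_dev, a, b):
    """Intervals for the variance and standard deviation.

    sum_sq_dev is (n-1)s^2; a < b are the lower and upper chi-square quantiles.
    """
    var_lo, var_hi = sum_sq_dev / b, sum_sq_dev / a
    return (var_lo, var_hi), (math.sqrt(var_lo), math.sqrt(var_hi))

# 12 s^2 = 128.41, with a and b read from the chi-square table for 12 df.
var_ci, sd_ci = variance_interval(sum_sq_dev=128.41, a=5.226, b=21.03)
print(f"variance: [{var_ci[0]:.2f}, {var_ci[1]:.2f}]")  # prints "variance: [6.11, 24.57]"
print(f"std dev:  [{sd_ci[0]:.2f}, {sd_ci[1]:.2f}]")    # prints "std dev:  [2.47, 4.96]"
```

Note that dividing by the larger quantile $b$ gives the lower endpoint, and dividing by the smaller quantile $a$ gives the upper endpoint.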


Although $a$ and $b$ are generally selected so that the probabilities in the two tails are equal, the resulting $100(1-\alpha)\%$ confidence interval is not the shortest that can be formed using the available data. The tables and appendixes give solutions for $a$ and $b$ that yield the confidence interval of minimum length for the standard deviation.

Source:  OpenStax, Introduction to statistics. OpenStax CNX. Oct 09, 2007 Download for free at http://cnx.org/content/col10343/1.3