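The random variable
$$W = \frac{1}{n} \sum_{i=1}^{n} (X_i - \mu)^2, \quad \text{where } \mu = E[X],$$
is not a statistic, since it uses the unknown parameter $\mu$. However, the following is a statistic, where $A_n = \frac{1}{n} \sum_{i=1}^{n} X_i$ is the sample average.
$$V_n^* = \frac{1}{n} \sum_{i=1}^{n} (X_i - A_n)^2 = \frac{1}{n} \sum_{i=1}^{n} X_i^2 - A_n^2$$
It would appear that $V_n^*$ might be a reasonable estimate of the population variance. However, the following result shows that a slight modification is desirable.
The statistic
$$V_n = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - A_n)^2$$
is an estimator for the population variance.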
VERIFICATION
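Consider the statistic
$$V_n^* = \frac{1}{n} \sum_{i=1}^{n} (X_i - A_n)^2 = \frac{1}{n} \sum_{i=1}^{n} X_i^2 - A_n^2$$
Noting that $E[X^2] = \sigma^2 + \mu^2$, and likewise $E[A_n^2] = \sigma^2/n + \mu^2$ for the sample average $A_n$, we use the last expression to show
$$E[V_n^*] = \frac{1}{n}\, n(\sigma^2 + \mu^2) - \left(\frac{\sigma^2}{n} + \mu^2\right) = \frac{n-1}{n}\, \sigma^2$$
The quantity $V_n^*$ is biased: on average it falls short of $\sigma^2$ by the factor $(n-1)/n$. If we consider
$$V_n = \frac{n}{n-1} V_n^* = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - A_n)^2, \quad \text{then} \quad E[V_n] = \frac{n}{n-1} \cdot \frac{n-1}{n}\, \sigma^2 = \sigma^2$$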
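The quantity $V_n$ with $1/(n-1)$ rather than $1/n$ is often called the sample variance to distinguish it from the population variance. If the set of numbers
$$(t_1, t_2, \ldots, t_N)$$
represents the complete set of values in a population of $N$ members, the variance for the population would be given by
$$\sigma^2 = \frac{1}{N} \sum_{i=1}^{N} t_i^2 - \left(\frac{1}{N} \sum_{i=1}^{N} t_i\right)^2$$
Here we use $1/N$ rather than $1/(N-1)$.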
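The effect of the two divisors can be checked exactly on a small case. The following is a minimal Python sketch, not part of the original MATLAB material: it assumes a hypothetical population distribution uniform on {0, 1, 2}, enumerates every equally likely sample of size 3, and computes the exact expected value of each statistic. The function names are illustrative only.

```python
from fractions import Fraction
from itertools import product


def divisor_n_variance(xs):
    """V_n^*: average squared deviation from the sample average (divisor n)."""
    n = len(xs)
    a = Fraction(sum(xs), n)  # sample average A_n
    return sum((x - a) ** 2 for x in xs) / n


def sample_variance(xs):
    """V_n: the same sum of squares with divisor n - 1."""
    n = len(xs)
    return divisor_n_variance(xs) * Fraction(n, n - 1)


def exact_expectation(stat, values, n):
    """Exact E[stat] over all equally likely iid samples of size n."""
    samples = list(product(values, repeat=n))
    return sum(stat(s) for s in samples) / len(samples)


# Assumption for illustration: X is uniform on {0, 1, 2}, sample size 3.
values = (0, 1, 2)
n = 3

mu = Fraction(sum(values), len(values))
sigma2 = sum((v - mu) ** 2 for v in values) / len(values)  # divisor N

e_biased = exact_expectation(divisor_n_variance, values, n)
e_unbiased = exact_expectation(sample_variance, values, n)

print("population variance:", sigma2)
print("E[V_n^*]:", e_biased)
print("E[V_n]:  ", e_unbiased)
```

Exact rational arithmetic is used so that the comparison with $(n-1)\sigma^2/n$ and $\sigma^2$ does not depend on floating-point rounding. Python's standard library makes the same distinction: `statistics.pvariance` divides by $N$ and `statistics.variance` divides by $n-1$.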
Since the statistic $V_n$ has mean value $\sigma^2$, it seems a reasonable candidate for an estimator of the population variance. To ask how good it is, we need to consider its variance, since $V_n$ is itself a random variable. An evaluation similar to that for the mean, but more complicated in detail, shows that
$$\mathrm{Var}[V_n] = \frac{1}{n} \left(\mu_4 - \frac{n-3}{n-1}\, \sigma^4\right), \quad \text{where } \mu_4 = E[(X - \mu)^4]$$
For large $n$, $\mathrm{Var}[V_n]$ is small, so that $V_n$ is a good large-sample estimator for $\sigma^2$.
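As a rough check of how the variance of the sample variance shrinks with $n$, the sketch below uses the standard closed form $\mathrm{Var}[V_n] = \frac{1}{n}\left(\mu_4 - \frac{n-3}{n-1}\sigma^4\right)$ with $\mu_4 = E[(X-\mu)^4]$. It first compares that formula with an exact enumeration for a hypothetical population uniform on {0, 1, 2} with $n = 4$, then evaluates it for a uniform $[-1, 1]$ population, for which $\sigma^2 = 1/3$ and $\mu_4 = 1/5$. The helper names are assumptions made for this illustration.

```python
from fractions import Fraction
from itertools import product


def sample_variance(xs):
    """V_n with divisor n - 1, in exact rational arithmetic."""
    n = len(xs)
    a = Fraction(sum(xs), n)
    return sum((x - a) ** 2 for x in xs) / (n - 1)


def var_of_sample_variance(mu4, sigma2, n):
    """Closed form (1/n) * (mu4 - (n-3)/(n-1) * sigma^4)."""
    return (mu4 - Fraction(n - 3, n - 1) * sigma2 ** 2) / n


# Assumption for illustration: X uniform on {0, 1, 2}, sample size 4.
values = (0, 1, 2)
n = 4
mu = Fraction(sum(values), len(values))
sigma2 = sum((v - mu) ** 2 for v in values) / len(values)
mu4 = sum((v - mu) ** 4 for v in values) / len(values)

# Exact distribution of V_n over all 3**4 equally likely samples.
vn = [sample_variance(s) for s in product(values, repeat=n)]
mean_vn = sum(vn) / len(vn)
var_vn = sum((v - mean_vn) ** 2 for v in vn) / len(vn)

formula = var_of_sample_variance(mu4, sigma2, n)
print("enumerated Var[V_n]:", var_vn, " formula:", formula)

# Uniform [-1, 1] population: sigma^2 = 1/3, mu_4 = E[X^4] = 1/5.
for size in (10, 100, 1000):
    v = var_of_sample_variance(Fraction(1, 5), Fraction(1, 3), size)
    print(size, float(v))
```

The printed values for the uniform $[-1, 1]$ case fall roughly in proportion to $1/n$, which is the sense in which $V_n$ is a good large-sample estimator.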
Consider a population random variable $X \sim$ uniform $[-1, 1]$. Then $E[X] = 0$ and $\mathrm{Var}[X] = 1/3$. We take 100 samples of size 100 and determine the sample sums. This gives a sample of size 100 of the sample sum random variable $S_{100}$, which has mean zero and variance 100/3. For each observed value of the sample sum random variable, we plot the fraction of observed sums less than or equal to that value. This yields an experimental distribution function for $S_{100}$, which is compared with the distribution function for a random variable $Y \sim N(0, 100/3)$.
rand('seed',0)   % Seeds random number generator for later comparison
tappr            % Approximation setup
Enter matrix [a b] of x-range endpoints  [-1 1]
Enter number of x approximation points  100
Enter density as a function of t  0.5*(t<=1)
Use row matrices X and PX as in the simple case
qsample          % Creates sample
Enter row matrix of VALUES  X
Enter row matrix of PROBABILITIES  PX
Sample size n =  10000             % Master sample size 10,000
Sample average ex = 0.003746
Approximate population mean E(X) = 1.561e-17
Sample variance vx = 0.3344
Approximate population variance V(X) = 0.3333
m = 100;
a = reshape(T,m,m);                % Forms 100 samples of size 100
A = sum(a);                        % Matrix A of sample sums
[t,f] = csort(A,ones(1,m));        % Sorts A and determines cumulative
p = cumsum(f)/m;                   % fraction of elements <= each value
pg = gaussian(0,100/3,t);          % Gaussian dbn for sample sum values
plot(t,p,'k-',t,pg,'k-.')          % Comparative plot
% Plotting details (see [link])
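For readers without the m-functions used in the session (`tappr`, `qsample`, `csort`, `gaussian`), the same experiment can be approximated with Python's standard library. This is only a sketch under stated assumptions: it samples the uniform $[-1, 1]$ distribution directly rather than through a discrete approximation, and Python's random number generator differs from MATLAB's, so the numbers will not match the session above. It reports the largest gap between the experimental and Gaussian distribution functions instead of plotting them.

```python
import random
from math import sqrt
from statistics import NormalDist

random.seed(0)  # fixed seed so the run is repeatable (not MATLAB's generator)

m = 100  # 100 samples, each of size 100

# One sample sum per sample of size m from uniform[-1, 1], sorted ascending.
sums = sorted(
    sum(random.uniform(-1.0, 1.0) for _ in range(m)) for _ in range(m)
)

# Experimental distribution function: fraction of sums <= each observed value.
ecdf = [(k + 1) / m for k in range(m)]

# Gaussian with mean 0 and variance m/3; NormalDist takes the standard
# deviation, whereas gaussian(0,100/3,t) in the session takes the variance.
normal = NormalDist(mu=0.0, sigma=sqrt(m / 3))
gauss = [normal.cdf(t) for t in sums]

max_gap = max(abs(p - g) for p, g in zip(ecdf, gauss))
print(f"largest gap between experimental and Gaussian dbn: {max_gap:.3f}")
```

With only 100 sample sums, some visible gap between the two distribution functions is expected; increasing the number of samples should tighten the agreement.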