This course is a short series of lectures on Introductory Statistics. Topics covered are listed in the Table of Contents. The notes were prepared by Ewa Paszek and Marek Kimmel. The development of this course has been supported by NSF grant 0203396.

Asymptotic distribution of maximum likelihood estimators

Let us consider a distribution with p.d.f. $f(x;\theta)$ such that the parameter $\theta$ is not involved in the support of the distribution. We want to be able to find the maximum likelihood estimator $\hat{\theta}$ by solving $\partial[\ln L(\theta)]/\partial\theta = 0$, where the partial derivative is used because $L(\theta)$ involves $x_1, x_2, \ldots, x_n$ as well as $\theta$.

That is, $\partial[\ln L(\hat{\theta})]/\partial\theta = 0$, where now, with $\hat{\theta}$ in this expression, $L(\hat{\theta}) = f(X_1;\hat{\theta})\,f(X_2;\hat{\theta})\cdots f(X_n;\hat{\theta})$.

We can approximate the left-hand member of this latter equation by a linear function found from the first two terms of a Taylor series expanded about $\theta$, namely
$$\frac{\partial[\ln L(\theta)]}{\partial\theta} + (\hat{\theta}-\theta)\,\frac{\partial^2[\ln L(\theta)]}{\partial\theta^2} \approx 0,$$
where $L(\theta) = f(X_1;\theta)\,f(X_2;\theta)\cdots f(X_n;\theta)$.

Obviously, this approximation is good only if $\hat{\theta}$ is close to $\theta$, and an adequate mathematical proof involves conditions that guarantee this. But a heuristic argument can be made by solving for $\hat{\theta}-\theta$ to obtain

$$\hat{\theta} - \theta = \frac{\partial[\ln L(\theta)]/\partial\theta}{-\,\partial^2[\ln L(\theta)]/\partial\theta^2} \qquad (1)$$
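Viewed computationally, equation (1) is a single Newton step on the likelihood equation. As an illustration (the exponential model, $\theta = 2$, and $n = 500$ below are arbitrary choices of ours, not part of the text), one step taken from the true $\theta$ lands essentially at the maximum likelihood estimate $\bar{x}$:

```python
import numpy as np

rng = np.random.default_rng(4)
theta, n = 2.0, 500
x = rng.exponential(scale=theta, size=n)

score = -n / theta + x.sum() / theta**2            # d[ln L]/d(theta) at the true theta
second = n / theta**2 - 2.0 * x.sum() / theta**3   # d^2[ln L]/d(theta)^2
theta_step = theta - score / second                # one step of (1): theta + score/(-second)
print(theta_step, x.mean())                        # nearly equal: the step lands near the MLE
```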

Recall that $\ln L(\theta) = \ln f(X_1;\theta) + \ln f(X_2;\theta) + \cdots + \ln f(X_n;\theta)$ and

$$\frac{\partial \ln L(\theta)}{\partial\theta} = \sum_{i=1}^{n} \frac{\partial[\ln f(X_i;\theta)]}{\partial\theta}. \qquad (2)$$

Expression (2) is the sum of the $n$ independent and identically distributed random variables $Y_i = \partial[\ln f(X_i;\theta)]/\partial\theta$, $i = 1, 2, \ldots, n$, and thus, by the Central Limit Theorem, this sum has an approximate normal distribution with mean (in the continuous case) equal to

$$\int_{-\infty}^{\infty} \frac{\partial[\ln f(x;\theta)]}{\partial\theta}\,f(x;\theta)\,dx = \int_{-\infty}^{\infty} \frac{\partial f(x;\theta)/\partial\theta}{f(x;\theta)}\,f(x;\theta)\,dx = \int_{-\infty}^{\infty} \frac{\partial f(x;\theta)}{\partial\theta}\,dx = \frac{d}{d\theta}\left[\int_{-\infty}^{\infty} f(x;\theta)\,dx\right] = \frac{d}{d\theta}[1] = 0.$$

Clearly, a mathematical condition is needed here, namely that it is permissible to interchange the operations of integration and differentiation in those last steps. Of course, the integral of $f(x;\theta)$ is equal to one because it is a p.d.f.
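This mean-zero property of the score is easy to check numerically. The following sketch (our illustration; the exponential p.d.f. and $\theta = 2$ are arbitrary choices, not part of the text) evaluates the integral above by numerical quadrature:

```python
import numpy as np
from scipy.integrate import quad

theta = 2.0  # arbitrary illustrative value

def score_times_pdf(x):
    # d[ln f(x; theta)]/d(theta) = -1/theta + x/theta**2 for the exponential p.d.f.
    pdf = np.exp(-x / theta) / theta
    return (-1.0 / theta + x / theta**2) * pdf

value, abserr = quad(score_times_pdf, 0.0, np.inf)
print(value)  # ~0: the score has mean zero, as the display above shows
```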

Since we know that the mean of each $Y_i$ is
$$\int_{-\infty}^{\infty} \frac{\partial[\ln f(x;\theta)]}{\partial\theta}\,f(x;\theta)\,dx = 0,$$
let us take derivatives of each member of this equation with respect to $\theta$, obtaining

$$\int_{-\infty}^{\infty} \left\{ \frac{\partial^2[\ln f(x;\theta)]}{\partial\theta^2}\,f(x;\theta) + \frac{\partial[\ln f(x;\theta)]}{\partial\theta}\,\frac{\partial f(x;\theta)}{\partial\theta} \right\} dx = 0.$$

However, $\dfrac{\partial f(x;\theta)}{\partial\theta} = \dfrac{\partial[\ln f(x;\theta)]}{\partial\theta}\,f(x;\theta)$, so
$$\int_{-\infty}^{\infty} \left\{\frac{\partial[\ln f(x;\theta)]}{\partial\theta}\right\}^2 f(x;\theta)\,dx = -\int_{-\infty}^{\infty} \frac{\partial^2[\ln f(x;\theta)]}{\partial\theta^2}\,f(x;\theta)\,dx.$$

Since $E(Y) = 0$, this last expression provides the variance of $Y = \partial[\ln f(X;\theta)]/\partial\theta$. Then the variance of expression (2) is $n$ times this value, namely

$$n\,E\!\left\{-\,\frac{\partial^2[\ln f(X;\theta)]}{\partial\theta^2}\right\}. \qquad (3)$$

Let us rewrite (1) as

$$\frac{\sqrt{n}\,(\hat{\theta}-\theta)}{1\Big/\sqrt{E\{-\partial^2[\ln f(X;\theta)]/\partial\theta^2\}}} = \frac{\Big[\partial\ln L(\theta)/\partial\theta\Big]\Big/\sqrt{n\,E\{-\partial^2[\ln f(X;\theta)]/\partial\theta^2\}}}{\Big[-\tfrac{1}{n}\,\partial^2[\ln L(\theta)]/\partial\theta^2\Big]\Big/E\{-\partial^2[\ln f(X;\theta)]/\partial\theta^2\}}. \qquad (4)$$

The numerator of (4) has an approximate $N(0,1)$ distribution, and those unstated mathematical conditions require, in some sense, that $-\tfrac{1}{n}\,\partial^2[\ln L(\theta)]/\partial\theta^2$ converge to $E\{-\partial^2[\ln f(X;\theta)]/\partial\theta^2\}$. Accordingly, the ratio given in equation (4) must be approximately $N(0,1)$. That is, $\hat{\theta}$ has an approximate normal distribution with mean $\theta$ and standard deviation
$$\frac{1}{\sqrt{n\,E\{-\partial^2[\ln f(X;\theta)]/\partial\theta^2\}}}.$$
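A small simulation can make this conclusion concrete. The sketch below is our illustration, using the exponential model that the first example following treats analytically; the values of $\theta$, $n$, and the replication count are arbitrary. It standardizes $\hat{\theta} = \bar{X}$ by its approximate standard deviation $\theta/\sqrt{n}$ and checks that the result behaves like $N(0,1)$:

```python
import numpy as np

rng = np.random.default_rng(0)
theta, n, reps = 2.0, 200, 10_000

samples = rng.exponential(scale=theta, size=(reps, n))
theta_hat = samples.mean(axis=1)                # the MLE of theta is the sample mean
z = np.sqrt(n) * (theta_hat - theta) / theta    # standardize; sd of theta_hat is theta/sqrt(n)
print(z.mean(), z.std())                        # ~0 and ~1, consistent with N(0, 1)
```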

With the underlying exponential p.d.f. $f(x;\theta) = \frac{1}{\theta}e^{-x/\theta}$, $0 < x < \infty$, $\theta \in \Omega = \{\theta : 0 < \theta < \infty\}$, $\bar{X}$ is the maximum likelihood estimator of $\theta$. Since $\ln f(x;\theta) = -\ln\theta - x/\theta$, $\partial[\ln f(x;\theta)]/\partial\theta = -1/\theta + x/\theta^2$, and $\partial^2[\ln f(x;\theta)]/\partial\theta^2 = 1/\theta^2 - 2x/\theta^3$, we have
$$E\left[-\frac{1}{\theta^2} + \frac{2X}{\theta^3}\right] = -\frac{1}{\theta^2} + \frac{2\theta}{\theta^3} = \frac{1}{\theta^2}$$
because $E(X) = \theta$. That is, $\bar{X}$ has an approximate normal distribution with mean $\theta$ and standard deviation $\theta/\sqrt{n}$. Thus the random interval $\bar{X} \pm 1.96(\theta/\sqrt{n})$ has an approximate probability of 0.95 of covering $\theta$. Substituting the observed $\bar{x}$ for $\theta$, as well as for $\bar{X}$, we say that $\bar{x} \pm 1.96\,\bar{x}/\sqrt{n}$ is an approximate 95% confidence interval for $\theta$.
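As a check on the stated 0.95 coverage (this simulation is our addition; $\theta = 2$ and $n = 100$ are arbitrary), one can count how often the interval $\bar{x} \pm 1.96\,\bar{x}/\sqrt{n}$ captures the true $\theta$:

```python
import numpy as np

rng = np.random.default_rng(1)
theta, n, reps = 2.0, 100, 10_000  # arbitrary illustrative values

samples = rng.exponential(scale=theta, size=(reps, n))
x_bar = samples.mean(axis=1)
half = 1.96 * x_bar / np.sqrt(n)                       # half-width with x_bar substituted for theta
covered = (x_bar - half <= theta) & (theta <= x_bar + half)
print(covered.mean())                                  # close to 0.95
```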


The maximum likelihood estimator for $\lambda$ in the Poisson p.m.f. $f(x;\lambda) = \frac{\lambda^x e^{-\lambda}}{x!}$, $x = 0, 1, 2, \ldots$; $\lambda \in \Omega = \{\lambda : 0 < \lambda < \infty\}$, is $\hat{\lambda} = \bar{X}$. Now $\ln f(x;\lambda) = x\ln\lambda - \lambda - \ln x!$, $\partial[\ln f(x;\lambda)]/\partial\lambda = x/\lambda - 1$, and $\partial^2[\ln f(x;\lambda)]/\partial\lambda^2 = -x/\lambda^2$. Thus
$$E\left(\frac{X}{\lambda^2}\right) = \frac{\lambda}{\lambda^2} = \frac{1}{\lambda},$$
and $\hat{\lambda} = \bar{X}$ has an approximate normal distribution with mean $\lambda$ and standard deviation $\sqrt{\lambda/n}$. Finally, $\bar{x} \pm 1.645\sqrt{\bar{x}/n}$ serves as an approximate 90% confidence interval for $\lambda$. With the data from example(…), $\bar{x} = 2.225$, and hence this interval runs from 1.887 to 2.563.
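The arithmetic of this interval is a one-liner. Note that the sample size of the referenced example is not restated in this text, so $n = 52$ below is a hypothetical placeholder chosen only to make the computation concrete:

```python
import math

x_bar = 2.225   # from the text
n = 52          # hypothetical: the referenced example's sample size is not restated here

half = 1.645 * math.sqrt(x_bar / n)
print(x_bar - half, x_bar + half)  # endpoints of the approximate 90% interval;
                                   # with the example's actual n these would match 1.887 and 2.563
```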


It is interesting that there is another theorem somewhat related to the preceding result, in that the approximate variance of $\hat{\theta}$ serves as a lower bound for the variance of every unbiased estimator of $\theta$. Thus we know that if a certain unbiased estimator has a variance equal to that lower bound, we cannot find a better one, and hence it is best in the sense of being the unbiased minimum variance estimator. The bound is given by the Rao-Cramér inequality.

Let $X_1, X_2, \ldots, X_n$ be a random sample from a distribution with p.d.f. $f(x;\theta)$, $\theta \in \Omega = \{\theta : c < \theta < d\}$, where the support of $X$ does not depend upon $\theta$, so that we can differentiate, with respect to $\theta$, under integral signs like that in the following integral:

$$\int_{-\infty}^{\infty} f(x;\theta)\,dx = 1.$$

If $Y = u(X_1, X_2, \ldots, X_n)$ is an unbiased estimator of $\theta$, then

$$\mathrm{Var}(Y) \ge \frac{1}{n\displaystyle\int_{-\infty}^{\infty} \left[\partial \ln f(x;\theta)/\partial\theta\right]^2 f(x;\theta)\,dx} = \frac{1}{-n\displaystyle\int_{-\infty}^{\infty} \left[\partial^2 \ln f(x;\theta)/\partial\theta^2\right] f(x;\theta)\,dx}.$$

Note that the two integrals in the respective denominators are the expectations
$$E\left\{\left[\frac{\partial \ln f(X;\theta)}{\partial\theta}\right]^2\right\} \quad\text{and}\quad E\left[-\frac{\partial^2 \ln f(X;\theta)}{\partial\theta^2}\right];$$
sometimes one is easier to compute than the other.
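As a quick numerical confirmation that the two expectations agree (our illustration, using the Poisson model from the earlier example with an arbitrary $\lambda$; the infinite sum is truncated where the tail mass is negligible):

```python
import numpy as np
from scipy.stats import poisson

lam = 2.225                 # arbitrary illustrative value
x = np.arange(0, 200)       # truncate the infinite sum; the tail mass is negligible
p = poisson.pmf(x, lam)

first = np.sum((x / lam - 1.0) ** 2 * p)  # E{[d ln f/d lambda]^2}
second = np.sum((x / lam**2) * p)         # E[-d^2 ln f/d lambda^2]
print(first, second, 1.0 / lam)           # all three agree (~1/lambda)
```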

Note that the lower bound was computed above for two distributions, the exponential and the Poisson: the respective lower bounds were $\theta^2/n$ and $\lambda/n$. Since in each case the variance of $\bar{X}$ equals the lower bound, $\bar{X}$ is the unbiased minimum variance estimator in each case.

The sample arises from a distribution with p.d.f. $f(x;\theta) = \theta x^{\theta-1}$, $0 < x < 1$, $\theta \in \Omega = \{\theta : 0 < \theta < \infty\}$.

We have $\ln f(x;\theta) = \ln\theta + (\theta-1)\ln x$, $\dfrac{\partial \ln f(x;\theta)}{\partial\theta} = \dfrac{1}{\theta} + \ln x$, and $\dfrac{\partial^2 \ln f(x;\theta)}{\partial\theta^2} = -\dfrac{1}{\theta^2}$.

Since $-\partial^2[\ln f(x;\theta)]/\partial\theta^2 = 1/\theta^2$ is constant, $E(1/\theta^2) = 1/\theta^2$, and the lower bound of the variance of every unbiased estimator of $\theta$ is $\theta^2/n$. Moreover, the maximum likelihood estimator $\hat{\theta} = -n\big/\sum_{i=1}^{n}\ln X_i = -n/\ln\prod_{i=1}^{n} X_i$ has an approximate normal distribution with mean $\theta$ and variance $\theta^2/n$. Thus, in a limiting sense, $\hat{\theta}$ is the unbiased minimum variance estimator of $\theta$.
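A simulation sketch (our addition; $\theta = 3$, $n = 400$, and the replication count are arbitrary) confirms that this $\hat{\theta}$ is approximately normal with variance near the lower bound $\theta^2/n$. It uses the fact that $f(x;\theta) = \theta x^{\theta-1}$ on $(0,1)$ is the Beta$(\theta,1)$ p.d.f.:

```python
import numpy as np

rng = np.random.default_rng(2)
theta, n, reps = 3.0, 400, 5_000   # arbitrary illustrative values

x = rng.beta(theta, 1.0, size=(reps, n))   # theta * x**(theta-1) on (0,1) is the Beta(theta,1) p.d.f.
theta_hat = -n / np.log(x).sum(axis=1)     # the MLE derived above
print(theta_hat.mean())                    # ~theta
print(theta_hat.var(), theta**2 / n)       # empirical variance ~ the lower bound theta^2/n
```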


To measure the value of estimators, their variances are compared to the Rao-Cramér lower bound. The ratio of the Rao-Cramér lower bound to the actual variance of any unbiased estimator is called the efficiency of that estimator. An estimator with an efficiency of 50% requires $1/0.5 = 2$ times as many sample observations to do as well in estimation as can be done with the unbiased minimum variance estimator (the 100% efficient estimator).
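For a concrete efficiency computation (this comparison is entirely our illustration, not from the text), consider the exponential model again: $Y = n\min(X_1,\ldots,X_n)$ is also an unbiased estimator of $\theta$, since the minimum of $n$ such exponentials is exponential with mean $\theta/n$; its variance is $\theta^2$, so its efficiency is $(\theta^2/n)/\theta^2 = 1/n$.

```python
import numpy as np

rng = np.random.default_rng(3)
theta, n, reps = 2.0, 25, 20_000   # arbitrary illustrative values

x = rng.exponential(scale=theta, size=(reps, n))
y = n * x.min(axis=1)              # competing unbiased estimator of theta: n * min(X_i)
print(y.mean())                    # ~theta, so Y is unbiased
print((theta**2 / n) / y.var())    # empirical efficiency, ~1/n = 0.04
```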






Source:  OpenStax, Introduction to statistics. OpenStax CNX. Oct 09, 2007 Download for free at http://cnx.org/content/col10343/1.3
