
Suppose a public health official conducts a survey to estimate $\theta \in [0, 1]$, the proportion of the population eating pizza at least once per week. In the survey, the official found nine people who had eaten pizza in the last week, and three who had not. If no additional information is available regarding how the survey was implemented, then there are at least two probability models we can adopt.

  • The official surveyed 12 people, and 9 of them had eaten pizza in the last week. In this case, we observe $x_1 = 9$, where $x_1 \sim \mathrm{Binomial}(12, \theta)$. The density for $x_1$ is $f(x_1 \mid \theta) = \binom{12}{x_1} \theta^{x_1} (1 - \theta)^{12 - x_1}$.
  • Another reasonable model is to assume that the official surveyed people until he found 3 non-pizza eaters. In this case, we observe $x_2 = 12$, where $x_2 \sim \mathrm{NegativeBinomial}(3, 1 - \theta)$. The density for $x_2$ is $g(x_2 \mid \theta) = \binom{x_2 - 1}{3 - 1} \theta^{x_2 - 3} (1 - \theta)^{3}$.
The likelihoods for these two models are proportional: $l(\theta \mid x_1) \propto l(\theta \mid x_2) \propto \theta^{9} (1 - \theta)^{3}$. Therefore, any estimator that adheres to the likelihood principle will produce the same estimate for $\theta$, regardless of which of the two data-generation models is assumed.
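As a quick numerical check (an illustration added here, not part of the original module), the sketch below evaluates both densities at the observed data over a grid of $\theta$ values and confirms that their ratio is constant in $\theta$. Note that SciPy's nbinom counts the pizza eaters observed before the third non-pizza eater, so the observation $x_2 = 12$ corresponds to $12 - 3 = 9$ in that parameterization.

    import numpy as np
    from scipy.stats import binom, nbinom

    thetas = np.linspace(0.01, 0.99, 99)  # grid of candidate values for theta

    # Model 1: 12 people surveyed, 9 pizza eaters observed.
    lik_binomial = binom.pmf(9, n=12, p=thetas)

    # Model 2: survey runs until the 3rd non-pizza eater; scipy's nbinom counts
    # the 9 pizza eaters seen before the 3rd non-pizza eater, whose probability
    # is 1 - theta.
    lik_negbinom = nbinom.pmf(9, n=3, p=1 - thetas)

    # The ratio is constant in theta, so the two likelihoods are proportional.
    ratio = lik_binomial / lik_negbinom
    print(ratio.min(), ratio.max())  # both equal C(12,9)/C(11,2) = 4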

The likelihood principle is widely accepted among statisticians. In the context of parameter estimation, any reasonable estimator should conform to the likelihood principle. As we will see, the maximum likelihood estimator does.

While the likelihood principle itself is a fairly reasonable assumption, it can also be derived from two somewhat more intuitive assumptions known as the sufficiency principle and the conditionality principle. See Casella and Berger, Chapter 6.

The maximum likelihood estimator

The maximum likelihood estimator $\hat{\theta}(x)$ is defined by $\hat{\theta}(x) = \arg\max_{\theta} \, l(\theta \mid x)$. Intuitively, we are choosing $\theta$ to maximize the probability of occurrence of the observation $x$.

It is possible that multiple parameter values maximize the likelihood for a given $x$. In that case, any of these maximizers can be selected as the MLE. It is also possible that the likelihood is unbounded, in which case the MLE does not exist; a classic example is a Gaussian mixture whose likelihood grows without bound as one component's variance shrinks to zero.

The MLE rule is an implementation of the likelihood principle. If we have two observations whose likelihoods are proportional (they differ by a constant that does not depend on $\theta$), then the value of $\theta$ that maximizes one likelihood will also maximize the other. In other words, both likelihood functions lead to the same inference about $\theta$, as required by the likelihood principle.
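To see this concretely in the pizza example (again an added illustration), maximizing either likelihood curve over the same grid of $\theta$ values picks out the same point:

    import numpy as np
    from scipy.stats import binom, nbinom

    thetas = np.linspace(0.01, 0.99, 99)
    lik_binomial = binom.pmf(9, n=12, p=thetas)
    lik_negbinom = nbinom.pmf(9, n=3, p=1 - thetas)

    # Proportional likelihoods share the same maximizer.
    print(thetas[np.argmax(lik_binomial)])  # 0.75
    print(thetas[np.argmax(lik_negbinom)])  # 0.75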

Understand that maximum likelihood is a procedure, not an optimality criterion. From the definition of the MLE, we have no idea how close it comes to the true parameter value relative to other estimators. In contrast, the MVUE is defined as the estimator that satisfies a certain optimality criterion. However, unlike the MLE, we have no clear procedure to follow to compute the MVUE.

Computing the MLE

If the likelihood function is differentiable, then $\hat{\theta}$ is found by differentiating the likelihood (or log-likelihood), equating it with zero, and solving: $\frac{\partial}{\partial \theta} \, l(\theta \mid x) = 0$. If multiple solutions exist, then the MLE is the solution that maximizes $l(\theta \mid x)$, that is, the global maximizer.
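For the pizza survey above, this calculation has a closed form. The short derivation below (added here for illustration) uses the log-likelihood, which both models share up to an additive constant:

    \log l(\theta \mid x_1) = 9 \log\theta + 3 \log(1 - \theta) + \text{const}

    \frac{\partial}{\partial\theta} \log l(\theta \mid x_1)
        = \frac{9}{\theta} - \frac{3}{1 - \theta} = 0
    \;\Longrightarrow\; 9(1 - \theta) = 3\theta
    \;\Longrightarrow\; \hat{\theta} = \frac{9}{12} = \frac{3}{4}

So the MLE under either data-generation model is $\hat{\theta} = 0.75$, matching the numerical checks above.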

In certain cases, such as pdfs or pmfs with an exponential form, the MLE can be easily solved for. That is, $\frac{\partial}{\partial \theta} \, l(\theta \mid x) = 0$ can be solved using calculus and standard linear algebra.
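When no closed-form solution exists, the same maximization can be carried out numerically. The sketch below (an added illustration, not from the original module) minimizes the negative log-likelihood of the binomial model with a standard scalar optimizer and recovers $\hat{\theta} \approx 0.75$:

    from scipy.optimize import minimize_scalar
    from scipy.stats import binom

    # Negative log-likelihood of the binomial model for the observed data.
    def neg_log_lik(theta):
        return -binom.logpmf(9, n=12, p=theta)

    # Maximize the likelihood by minimizing its negative over (0, 1).
    result = minimize_scalar(neg_log_lik, bounds=(1e-6, 1 - 1e-6), method="bounded")
    print(result.x)  # approximately 0.75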

Source: OpenStax, Statistical signal processing. OpenStax CNX, Jun 14, 2004. Download for free at http://cnx.org/content/col10232/1.1