Let's evaluate the Cramér-Rao bound for the example we have been discussing: the estimation of the mean and variance of a length-L sequence of statistically independent Gaussian random variables. Let the estimate of the mean m be the sample average $\hat{m} = \frac{1}{L}\sum_{l=0}^{L-1} X_l$; as shown in the last example, this estimate is unbiased. Let the estimate of the variance $\sigma^2$ be the unbiased estimate $\widehat{\sigma^2} = \frac{1}{L-1}\sum_{l=0}^{L-1}(X_l - \hat{m})^2$. Each term in the Fisher information matrix is given by the expected value of the paired products of derivatives of the logarithm of the likelihood function,
$$[F(\boldsymbol{\theta})]_{ij} = E\!\left[\frac{\partial \ln p_{\mathbf{X}}(\mathbf{X};\boldsymbol{\theta})}{\partial\theta_i}\,\frac{\partial \ln p_{\mathbf{X}}(\mathbf{X};\boldsymbol{\theta})}{\partial\theta_j}\right] = -E\!\left[\frac{\partial^2 \ln p_{\mathbf{X}}(\mathbf{X};\boldsymbol{\theta})}{\partial\theta_i\,\partial\theta_j}\right].$$
The logarithm of the likelihood function is
$$\ln p_{\mathbf{X}}(\mathbf{X};\boldsymbol{\theta}) = -\frac{L}{2}\ln(2\pi\sigma^2) - \frac{1}{2\sigma^2}\sum_{l=0}^{L-1}(X_l - m)^2,$$
its partial derivatives are
$$\frac{\partial \ln p_{\mathbf{X}}(\mathbf{X};\boldsymbol{\theta})}{\partial m} = \frac{1}{\sigma^2}\sum_{l=0}^{L-1}(X_l - m), \qquad
\frac{\partial \ln p_{\mathbf{X}}(\mathbf{X};\boldsymbol{\theta})}{\partial \sigma^2} = -\frac{L}{2\sigma^2} + \frac{1}{2\sigma^4}\sum_{l=0}^{L-1}(X_l - m)^2,$$
and its second partials are
$$\frac{\partial^2 \ln p_{\mathbf{X}}(\mathbf{X};\boldsymbol{\theta})}{\partial m^2} = -\frac{L}{\sigma^2}, \qquad
\frac{\partial^2 \ln p_{\mathbf{X}}(\mathbf{X};\boldsymbol{\theta})}{\partial m\,\partial\sigma^2} = -\frac{1}{\sigma^4}\sum_{l=0}^{L-1}(X_l - m), \qquad
\frac{\partial^2 \ln p_{\mathbf{X}}(\mathbf{X};\boldsymbol{\theta})}{\partial (\sigma^2)^2} = \frac{L}{2\sigma^4} - \frac{1}{\sigma^6}\sum_{l=0}^{L-1}(X_l - m)^2.$$
The Fisher information matrix has the surprisingly simple form
$$F(\boldsymbol{\theta}) = \begin{pmatrix} \dfrac{L}{\sigma^2} & 0 \\[0.75em] 0 & \dfrac{L}{2\sigma^4} \end{pmatrix};$$
its inverse is also a diagonal matrix, with the elements on the main diagonal equalling the reciprocals of those in the original matrix. Because of the zero-valued off-diagonal entries in the Fisher information matrix, the errors in the corresponding estimates are not inter-dependent. In this problem, the mean-squared estimation errors can be no smaller than
$$E\!\left[(\hat{m} - m)^2\right] \ge \frac{\sigma^2}{L}, \qquad E\!\left[(\widehat{\sigma^2} - \sigma^2)^2\right] \ge \frac{2\sigma^4}{L}.$$
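As a quick numerical sanity check on these bounds, the short simulation below is a sketch only; the particular values of L, the mean, the variance, and the trial count are arbitrary choices, not taken from the text. The sample average's mean-squared error sits essentially on its bound σ²/L, while the unbiased variance estimate's error comes out near 2σ⁴/(L−1), above its bound 2σ⁴/L, a point revisited later in this section.

```python
import numpy as np

rng = np.random.default_rng(0)
L, m, var = 10, 1.0, 2.0            # assumed example values, not from the text
trials = 200_000

x = rng.normal(m, np.sqrt(var), size=(trials, L))
m_hat = x.mean(axis=1)              # sample average
v_hat = x.var(axis=1, ddof=1)       # unbiased variance estimate

mse_mean = np.mean((m_hat - m) ** 2)   # empirically ~ var / L (bound achieved)
mse_var = np.mean((v_hat - var) ** 2)  # empirically ~ 2 var^2 / (L - 1) > 2 var^2 / L

print(mse_mean, var / L)
print(mse_var, 2 * var**2 / (L - 1), 2 * var**2 / L)
```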
Note that nowhere in the preceding example did the form of the estimator enter into the computation of the bound. The only quantity used in the computation of the Cramér-Rao bound is the logarithm of the likelihood function, which is a consequence of the problem statement, not how it is solved. Only in the case of unbiased estimators is the bound independent of the estimators used. That's why we assumed in the example that we used an unbiased estimator for the variance. Because of this property, the Cramér-Rao bound is frequently used to assess the performance limits that can be obtained with an unbiased estimator in a particular problem. When bias is present, the exact form of the estimator's bias explicitly enters the computation of the bound. All too frequently, the unbiased form is used in situations where the existence of an unbiased estimator can be questioned. As we shall see, one such problem is time delay estimation, presumably of some importance to the reader. This misapplication of the unbiased Cramér-Rao bound arises from desperation: the estimator is so complicated and nonlinear that computing the bias is nearly impossible. As shown in this problem, biased estimators can yield mean-squared errors smaller as well as larger than the unbiased version of the Cramér-Rao bound. Consequently, desperation can yield misinterpretation when a general result is misapplied.
In the single-parameter estimation problem, the Cramér-Rao bound incorporating bias has the well-known form
$$E\!\left[(\hat{\theta} - \theta)^2\right] \ge b^2(\theta) + \frac{\left(1 + \dfrac{d b(\theta)}{d\theta}\right)^2}{E\!\left[\left(\dfrac{\partial \ln p_{\mathbf{X}}(\mathbf{X};\theta)}{\partial\theta}\right)^2\right]},$$
where $b(\theta)$ denotes the estimator's bias. Note that this bound differs somewhat from that originally given by Cramér (1946, p. 480); his derivation ignores the additive bias term $b^2(\theta)$. Note that the sign of the bias's derivative determines whether this bound is larger or potentially smaller than the unbiased version, which is obtained by setting the bias term to zero.
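As a concrete illustration (chosen here for definiteness, not drawn from the text), apply this biased form, treating the variance as the single parameter of interest, to the maximum likelihood variance estimate $\widehat{\sigma^2}_{ML} = \frac{1}{L}\sum_{l=0}^{L-1}(X_l - \hat{m})^2$, whose bias is $b(\sigma^2) = -\sigma^2/L$ so that $db/d\sigma^2 = -1/L$. The bound becomes
$$E\!\left[(\widehat{\sigma^2}_{ML} - \sigma^2)^2\right] \ge \frac{\sigma^4}{L^2} + \left(\frac{L-1}{L}\right)^2 \frac{2\sigma^4}{L} = \frac{(2L^2 - 3L + 2)\,\sigma^4}{L^3},$$
which lies below the unbiased bound $2\sigma^4/L$ for every $L > 1$. This estimator's actual mean-squared error, $(2L-1)\sigma^4/L^2$, is also smaller than $2\sigma^4/L$, illustrating how a biased estimator can beat the unbiased version of the bound.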
Efficiency
An interesting question arises: when, if ever, is the bound satisfied with equality? Recalling the details of the derivation of the bound, equality results when the expected value of a squared quantity equals zero, which can happen only if that quantity itself is zero. Substituting in the forms of the column matrices appearing in the derivation, equality in the Cramér-Rao bound results whenever
$$\nabla_{\boldsymbol{\theta}}\!\left[\ln p_{\mathbf{X}}(\mathbf{X};\boldsymbol{\theta})\right] = F(\boldsymbol{\theta})\left[\hat{\boldsymbol{\theta}}(\mathbf{X}) - \boldsymbol{\theta} - \mathbf{b}(\boldsymbol{\theta})\right].$$
This complicated expression means that only if estimation problems (as expressed by the a priori density) have the form of the right side of this equation can the mean-squared error equal the Cramér-Rao bound. In particular, the gradient of the log likelihood function can only depend on the observations through the estimator. In all other problems, the Cramér-Rao bound is a lower bound but not a tight one: no estimator can have error characteristics that equal it. In such cases, we have limited insight into the ultimate limitations on estimation error size with the Cramér-Rao bound. However, consider the case where the estimator is unbiased ($\mathbf{b}(\boldsymbol{\theta}) = \mathbf{0}$). In addition, note that the maximum likelihood estimate occurs when the gradient of the logarithm of the likelihood function equals zero: $\nabla_{\boldsymbol{\theta}}\!\left[\ln p_{\mathbf{X}}(\mathbf{X};\boldsymbol{\theta})\right] = \mathbf{0}$ when $\boldsymbol{\theta} = \hat{\boldsymbol{\theta}}_{ML}$. In this case, the condition for equality in the Cramér-Rao bound becomes
$$F(\hat{\boldsymbol{\theta}}_{ML})\left[\hat{\boldsymbol{\theta}}(\mathbf{X}) - \hat{\boldsymbol{\theta}}_{ML}\right] = \mathbf{0}.$$
As the Fisher information matrix is positive-definite, we conclude that the estimator must then equal the maximum likelihood estimator: if the Cramér-Rao bound can be satisfied with equality, only the maximum likelihood estimate will achieve it. To use estimation theoretic terminology, if an efficient estimate exists, it is the maximum likelihood estimate. This result stresses the importance of maximum likelihood estimates, despite the seemingly ad hoc manner by which they are defined.
Consider the Gaussian example being examined so frequently in this section. The components of the gradient of the logarithm of the likelihood function were given earlier by
$$\frac{\partial \ln p_{\mathbf{X}}(\mathbf{X};\boldsymbol{\theta})}{\partial m} = \frac{1}{\sigma^2}\sum_{l=0}^{L-1}(X_l - m) \qquad\text{and}\qquad \frac{\partial \ln p_{\mathbf{X}}(\mathbf{X};\boldsymbol{\theta})}{\partial \sigma^2} = -\frac{L}{2\sigma^2} + \frac{1}{2\sigma^4}\sum_{l=0}^{L-1}(X_l - m)^2.$$
These expressions can be rearranged to reveal
$$\frac{\partial \ln p_{\mathbf{X}}(\mathbf{X};\boldsymbol{\theta})}{\partial m} = \frac{L}{\sigma^2}\left(\frac{1}{L}\sum_{l=0}^{L-1} X_l - m\right), \qquad
\frac{\partial \ln p_{\mathbf{X}}(\mathbf{X};\boldsymbol{\theta})}{\partial \sigma^2} = \frac{L}{2\sigma^4}\left(\frac{1}{L}\sum_{l=0}^{L-1}(X_l - m)^2 - \sigma^2\right).$$
The first component, which corresponds to the estimate of the mean, is expressed in the form required for the existence of an efficient estimate: the Fisher information for the mean times the difference between an estimator (the sample average) and the parameter. The second component, the partial with respect to the variance $\sigma^2$, cannot be rewritten in a similar fashion; the quantity $\frac{1}{L}\sum_{l}(X_l - m)^2$ depends on the unknown mean and is therefore not an estimator. No unbiased, efficient estimate of the variance exists in this problem. The mean-squared error of the variance's unbiased estimate, but not of the maximum likelihood estimate, equals $\frac{2\sigma^4}{L-1}$. This error is strictly greater than the Cramér-Rao bound of $\frac{2\sigma^4}{L}$. As no unbiased estimate of the variance can have a mean-squared error equal to the Cramér-Rao bound (no efficient estimate exists for the variance in the Gaussian problem), one might presume that the closeness of our unbiased estimator's error to the bound implies that it possesses the smallest squared error of any estimate. This presumption may, of course, be incorrect.
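The rearrangement of the mean component is purely algebraic and can be verified symbolically. The sketch below is only an illustration: the concrete record length L = 4 and the symbol names are arbitrary choices, not taken from the text.

```python
import sympy as sp

Lval = 4                                   # small, concrete sequence length
m, s2 = sp.symbols('m sigma2', positive=True)
X = sp.symbols(f'X0:{Lval}')               # observations X0, X1, X2, X3

# Log-likelihood of independent Gaussian observations with mean m, variance sigma2
log_like = -sp.Rational(Lval, 2) * sp.log(2 * sp.pi * s2) \
           - sp.Rational(1, 2) / s2 * sum((x - m) ** 2 for x in X)

sample_mean = sum(X) / Lval
lhs = sp.diff(log_like, m)                 # gradient component for the mean
rhs = (Lval / s2) * (sample_mean - m)      # F(m) * (sample average - m)

print(sp.simplify(lhs - rhs) == 0)         # True: the efficient form holds
```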
Properties of the maximum likelihood estimator
The maximum likelihood estimate is the most used estimation technique for nonrandom parameters, not only because of its close linkage to the Cramér-Rao bound, but also because it has desirable asymptotic properties in the context of any problem (Cramér, 1946, pp. 500-506).
- The maximum likelihood estimate is at least asymptotically unbiased. It may be unbiased for any number of observations (as in the estimation of the mean of a sequence of independent random variables) for some problems.
- The maximum likelihood estimate is consistent.
- The maximum likelihood estimate is asymptotically efficient. As more and more data are incorporated into an estimate, the Cramér-Rao bound accurately projects the best attainable error, and the maximum likelihood estimate has those optimal characteristics.
- Asymptotically, the maximum likelihood estimate is distributed as a Gaussian random variable. Because of the previous properties, the mean asymptotically equals the parameter and the covariance matrix is $F^{-1}(\boldsymbol{\theta})$.
Most would agree that a "good" estimator should have these properties. What these results do not provide is an assessment of how many observations are needed for the asymptotic results to apply to some specified degree of precision. Consequently, they should be used with caution; for instance, some other estimator may have a smaller mean-squared error than the maximum likelihood estimator for a modest number of observations.
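One practical way to gauge whether a given record length is "asymptotic enough" is simply to simulate. The sketch below is an illustration only; the variance, the record lengths, and the trial count are arbitrary choices. It compares the empirical mean-squared error of the maximum likelihood variance estimate with the Cramér-Rao bound 2σ⁴/L, whose ratio should approach one as L grows.

```python
import numpy as np

rng = np.random.default_rng(1)
sigma2, trials = 2.0, 50_000

for L in (5, 20, 100):
    x = rng.normal(0.0, np.sqrt(sigma2), size=(trials, L))
    m_hat = x.mean(axis=1)
    v_ml = np.mean((x - m_hat[:, None]) ** 2, axis=1)   # ML estimate of the variance
    mse = np.mean((v_ml - sigma2) ** 2)
    print(L, mse, 2 * sigma2**2 / L)                    # ratio approaches 1 as L grows
```

For Gaussian data this ratio works out to (2L-1)/(2L), so even modest record lengths come close to the bound; in less tractable problems the approach to asymptotic behavior can be far slower.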