The design of a hypothesis test/detector often involves constructing the solution to an optimization problem. The optimality criteria used fall into two classes: Bayesian and frequentist.
In the Bayesian setup, it is assumed that the a priori probability of each hypothesis occurring, $\pi_i = \Pr[\mathcal{H}_i]$, is known. A cost $C_{ij}$ is assigned to each possible outcome:
$$C_{ij} = \text{cost of deciding } \mathcal{H}_i \text{ when } \mathcal{H}_j \text{ is true.}$$
The optimal test/detector is the one that minimizes the Bayes risk, which is defined to be the expected cost of an experiment:
$$\bar{C} = \sum_{i,j} C_{ij} \, \Pr[\text{decide } \mathcal{H}_i \mid \mathcal{H}_j \text{ true}] \, \Pr[\mathcal{H}_j].$$
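As a small numeric sketch of this definition, the Bayes risk is just a weighted sum over outcomes; the cost matrix, priors, and conditional decision probabilities below are invented purely for illustration:

```python
# Hypothetical numbers for illustration only.
priors = [0.25, 0.75]                 # Pr[H0], Pr[H1]
costs = [[0.0, 1.0],                  # costs[i][j]: cost of deciding Hi when Hj holds
         [1.0, 0.0]]
# p_decide[i][j] = Pr[decide Hi | Hj true] for some fixed test
p_decide = [[0.9, 0.2],
            [0.1, 0.8]]

# Bayes risk: sum of cost * Pr[decide Hi | Hj] * Pr[Hj] over all (i, j)
risk = sum(costs[i][j] * p_decide[i][j] * priors[j]
           for i in range(2) for j in range(2))
print(risk)  # 0.175
```

With zero cost on correct decisions, only the two error terms contribute to the sum.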
In the event that we have a binary problem, and both hypotheses are simple, the decision rule that minimizes the Bayes risk can be constructed explicitly. Let us assume that the data is continuous (i.e., it has a density) under each hypothesis:
$$\mathcal{H}_0 : x \sim p(x \mid \mathcal{H}_0)$$
$$\mathcal{H}_1 : x \sim p(x \mid \mathcal{H}_1)$$
Let $R_0$ and $R_1$ denote the decision regions corresponding to the optimal test: observing $x \in R_i$ leads to the decision $\mathcal{H}_i$. Clearly, the optimal test is specified once we specify $R_0$ and $R_1$. The Bayes risk may be written
$$\bar{C} = \sum_{i=0}^{1} \int_{R_i} \left( C_{i0} \, \pi_0 \, p(x \mid \mathcal{H}_0) + C_{i1} \, \pi_1 \, p(x \mid \mathcal{H}_1) \right) dx$$
where $\pi_j = \Pr[\mathcal{H}_j]$.
Recall that $R_0$ and $R_1$ partition the input space: they are disjoint and their union is the full input space. Thus, every possible input $x$ belongs to precisely one of these regions. In order to minimize the Bayes risk, a measurement $x$ should belong to the decision region $R_i$ for which the corresponding integrand in the preceding equation is smaller. Therefore, the Bayes risk is minimized by assigning $x$ to $R_1$ whenever
$$C_{10} \, \pi_0 \, p(x \mid \mathcal{H}_0) + C_{11} \, \pi_1 \, p(x \mid \mathcal{H}_1) < C_{00} \, \pi_0 \, p(x \mid \mathcal{H}_0) + C_{01} \, \pi_1 \, p(x \mid \mathcal{H}_1)$$
and assigning $x$ to $R_0$ whenever this inequality is reversed. The resulting rule may be expressed concisely as
$$\Lambda(x) = \frac{p(x \mid \mathcal{H}_1)}{p(x \mid \mathcal{H}_0)} \;\underset{\mathcal{H}_0}{\overset{\mathcal{H}_1}{\gtrless}}\; \frac{\pi_0 (C_{10} - C_{00})}{\pi_1 (C_{01} - C_{11})} = \eta$$
Here, $\Lambda(x)$ is called the likelihood ratio, the overall decision rule is called the Likelihood Ratio Test (LRT), and the quantity $\eta$ on the right-hand side is called the threshold.
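As a concrete illustration, here is a minimal sketch of a likelihood ratio test in Python. The two Gaussian hypotheses, priors, and costs are invented for illustration and are not part of the development above:

```python
import math

# Hypothetical example: H0: x ~ N(0, 1), H1: x ~ N(1, 1).
def gauss_pdf(x, mean, var):
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def lrt_decide(x, pi0=0.5, pi1=0.5, c00=0.0, c11=0.0, c01=1.0, c10=1.0):
    """Return 1 if the LRT decides H1, else 0."""
    lam = gauss_pdf(x, 1.0, 1.0) / gauss_pdf(x, 0.0, 1.0)  # likelihood ratio
    eta = (pi0 * (c10 - c00)) / (pi1 * (c01 - c11))        # threshold
    return 1 if lam > eta else 0

# With equal priors and symmetric costs, eta = 1, and for these two
# Gaussians the rule reduces to deciding H1 exactly when x > 0.5.
print(lrt_decide(0.2))  # 0
print(lrt_decide(0.9))  # 1
```

Note that only the ratio of the densities matters; any monotone simplification of $\Lambda(x)$ (with the threshold transformed accordingly) gives the same decisions.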
An instructor in a course in detection theory wants to determine if a particular student studied for his last test. The observed quantity is the student's grade, which we denote by $G$. Failure may not indicate studiousness: conscientious students may fail the test. Define the models as
$\mathcal{H}_0$: did not study
$\mathcal{H}_1$: did study
The conditional densities of the grade under each model are shown in the accompanying figure.
Based on knowledge of student behavior, the instructor assigns a priori probabilities of $\pi_0 = 1/4$ and $\pi_1 = 3/4$. The costs are chosen to reflect the instructor's sensitivity to student feelings: $C_{01} = C_{10} = 1$ (an erroneous decision either way is given the same cost) and $C_{00} = C_{11} = 0$. The likelihood ratio $\Lambda(G)$ is plotted in the accompanying figure, and the threshold value $\eta$, which is computed from the a priori probabilities and the costs to be $\eta = 1/3$, is indicated. The calculations of this comparison can be simplified in an obvious way:
$$\frac{G}{50} \;\underset{\mathcal{H}_0}{\overset{\mathcal{H}_1}{\gtrless}}\; \frac{1}{3}$$
or
$$G \;\underset{\mathcal{H}_0}{\overset{\mathcal{H}_1}{\gtrless}}\; \frac{50}{3} \approx 16.7$$
The multiplication by the factor of 50 is a simple illustration of the reduction of the likelihood ratio to a sufficient statistic. Based on the assigned costs and a priori probabilities, the optimum decision rule says the instructor must assume that the student did not study if the student's grade is less than 16.7; if greater, the student is assumed to have studied, despite receiving an abysmally low grade such as 20. Note that the densities given by each model overlap entirely: the possibility of making the wrong interpretation always haunts the instructor. However, no other procedure will be better!
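The example can be checked numerically. The exact conditional grade densities appear only in the figure, so the sketch below assumes densities consistent with the factor of 50 above: a uniform density under "did not study" and a linearly increasing density under "did study", both supported on $[0, 100]$. Minimizing the Bayes risk over candidate grade thresholds then recovers the boundary near $50/3 \approx 16.7$:

```python
# Sketch of the instructor example. The exact conditional densities come
# from a figure not reproduced here; the densities below are an assumption
# chosen to yield a likelihood ratio of G/50:
#   p(G | did not study) = 1/100          (uniform on [0, 100])
#   p(G | did study)     = 2*G / 100**2   (linearly increasing on [0, 100])
# Costs: correct decisions cost 0, either error costs 1.

def bayes_risk(gamma, pi0=0.25, pi1=0.75):
    """Bayes risk of the rule: decide 'did study' when G > gamma."""
    p_false_alarm = 1 - gamma / 100       # Pr[G > gamma | did not study]
    p_miss = (gamma / 100) ** 2           # Pr[G < gamma | did study]
    return pi0 * p_false_alarm + pi1 * p_miss

# Scan candidate thresholds in steps of 0.1; the minimizer should land
# at the boundary 50/3 derived in the text.
best = min((g / 10 for g in range(0, 1001)), key=bayes_risk)
print(best)
```

Since the densities overlap on all of $[0, 100]$, the minimum risk is strictly positive: even the optimal rule is sometimes wrong, as the text observes.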
A special case of the minimum Bayes risk rule, the
minimum probability of error rule , is
used extensively in practice, and is discussed at length inanother module.
Problems
Denote $P_F = \int_{R_1} p(x \mid \mathcal{H}_0)\,dx$ and $P_M = \int_{R_0} p(x \mid \mathcal{H}_1)\,dx$, the false-alarm and miss probabilities. Express the Bayes risk $\bar{C}$ in terms of $P_F$ and $P_M$, the costs $C_{ij}$, and the prior probabilities $\pi_0$ and $\pi_1$. Argue that the optimal decision rule is not altered by setting $C_{00} = C_{11} = 0$.
Suppose we observe $x$ such that $\Lambda(x) = \eta$. Argue that it doesn't matter whether we assign $x$ to $R_0$ or $R_1$.