Detection theory concerns making decisions from data. Decisions are based on presumptive models that may have produced the data. Making a decision involves inspecting the data and determining which model was most likely to have produced them. In this way, we are detecting which model was correct. Decision problems pervade signal processing. In digital communications, for example, determining if the current bit received in the presence of channel disturbances was a zero or a one is a detection problem.
More concretely, we denote by $\mathcal{M}_i$ the $i$th model that could have generated the data $\mathbf{R}$. A "model" is captured by the conditional probability distribution of the data, which are collected in the vector $\mathbf{R}$. For example, model $i$ is described by $p_{\mathbf{R}|\mathcal{M}_i}(\mathbf{r})$. Given all the models that can describe the data, we need to choose which model best matches what was observed. The word "best" is key here: what is the optimality criterion, and do the detection processing and the decision rule depend heavily on the criterion used? Surprisingly, the answer to the second question is "No." All of detection theory revolves around the likelihood ratio test, which, as we shall see, emerges as the optimal detector under a wide variety of optimality criteria.
In a binary detection problem in which we have two models, four possible decision outcomes can result. Model $\mathcal{M}_0$ did in fact represent the best model for the data and the decision rule said it was (a correct decision) or said it wasn't (an erroneous decision). The other two outcomes arise when model $\mathcal{M}_1$ was in fact true with either a correct or incorrect decision made. The decision process operates by segmenting the range of observation values into two disjoint decision regions $\Re_0$ and $\Re_1$. All values of $\mathbf{R}$ fall into either $\Re_0$ or $\Re_1$. If a given $\mathbf{R}$ lies in $\Re_0$, we will announce our decision "model $\mathcal{M}_0$ was true"; if in $\Re_1$, model $\mathcal{M}_1$ would be proclaimed. To derive a rational method of deciding which model best describes the observations, we need a criterion to assess the quality of the decision process so that optimizing this criterion will specify the decision regions.
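The partition of the observation range into two decision regions, and the four outcomes it produces, can be illustrated with a minimal sketch. It assumes a scalar observation and a hypothetical rule that assigns every value above a cutoff to the second region; the function names, the cutoff value, and the sample trials are illustrative assumptions, not part of the original text.

```python
def decide(r, cutoff=0.5):
    """Return 1 if r lies in region R1 (r > cutoff), else 0 (region R0).

    The cutoff is a hypothetical choice; a real detector derives it
    from an optimality criterion.
    """
    return 1 if r > cutoff else 0


def outcome(true_model, r, cutoff=0.5):
    """Classify one trial as (true model, announced model, verdict)."""
    decision = decide(r, cutoff)
    verdict = "correct" if decision == true_model else "error"
    return (true_model, decision, verdict)


# One made-up trial for each of the four possible outcomes.
trials = [(0, 0.1), (0, 0.9), (1, 0.2), (1, 1.4)]
for true_model, r in trials:
    print(outcome(true_model, r))
```

Every observation falls into exactly one region, so each trial yields exactly one of the four outcomes. Changing the cutoff moves the boundary between the regions and therefore trades one kind of error for the other.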
The Bayes' decision criterion seeks to minimize a cost function associated with making a decision. Let $C_{ij}$ be the cost of mistaking model $j$ for model $i$ ($i \neq j$) and $C_{ii}$ the presumably smaller cost of correctly choosing model $i$: $C_{ij} > C_{ii}$, $i \neq j$. Let … be the …
The data processing operations are captured entirely by the likelihood ratio $\Lambda(\mathbf{R})$. However, the calculations required by the likelihood ratio can be simplified in many cases. Note that only the value of the likelihood ratio relative to the threshold $\eta$ matters. Consequently, we can perform any positively monotonic transformation simultaneously on the likelihood ratio and the threshold without affecting the result of the comparison. For example, we can multiply by a positive constant, add any constant, or apply a monotonically increasing function to reduce the complexity of the expressions. We single out one such function, the logarithm, because it often simplifies likelihood ratios that commonly occur in signal processing applications. The result, $\ln \Lambda(\mathbf{R})$, is known as the log-likelihood, and we explicitly express the likelihood ratio test with it as

$$\ln \Lambda(\mathbf{R}) \underset{\mathcal{M}_0}{\overset{\mathcal{M}_1}{\gtrless}} \ln \eta$$
The likelihood ratio is composed of the quantities $p_{\mathbf{R}|\mathcal{M}_i}(\mathbf{r})$, which are known as likelihood functions and play an important role in estimation theory. It is the likelihood function that portrays the probabilistic model describing data generation. The likelihood function completely characterizes the kind of "world" assumed by each model. For each model, we must specify the likelihood function so that we can solve the hypothesis testing problem.
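To make the role of the likelihood functions concrete, here is a minimal sketch comparing the likelihood ratio test with its logarithmic form. It assumes, purely for illustration, that both models describe independent Gaussian observations with a common standard deviation and different means; the means, the threshold value, and all function names are hypothetical choices, not taken from the original text.

```python
import math


def gaussian_pdf(r, mean, sigma):
    """Likelihood function of one observation under a Gaussian model."""
    z = (r - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))


def likelihood_ratio(obs, mean0=0.0, mean1=1.0, sigma=1.0):
    """Ratio of the model-1 likelihood to the model-0 likelihood."""
    ratio = 1.0
    for r in obs:
        ratio *= gaussian_pdf(r, mean1, sigma) / gaussian_pdf(r, mean0, sigma)
    return ratio


def log_likelihood_ratio(obs, mean0=0.0, mean1=1.0, sigma=1.0):
    """Logarithm of the ratio; the exponentials and constants cancel."""
    return sum(((r - mean0) ** 2 - (r - mean1) ** 2) / (2 * sigma ** 2)
               for r in obs)


def lrt(obs, eta=1.0):
    """Announce model 1 when the likelihood ratio exceeds the threshold."""
    return 1 if likelihood_ratio(obs) > eta else 0


def log_lrt(obs, eta=1.0):
    """Same test after taking logarithms of both sides."""
    return 1 if log_likelihood_ratio(obs) > math.log(eta) else 0


for obs in ([0.2, 1.1, 0.7], [0.1, 0.3]):
    print(obs, lrt(obs), log_lrt(obs))
```

Because the logarithm is monotonically increasing and is applied to both the ratio and the threshold, the two tests announce the same model. In the Gaussian case the logarithmic form also avoids evaluating any exponentials.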
A complication, which arises in some cases, is that the sufficient statistic may not be monotonic. If it is monotonic, the decision regions $\Re_0$ and $\Re_1$ are simply connected: all portions of a region can be reached without crossing into the other region. If not, the regions are not simply connected and decision region islands are created. Disconnected regions usually complicate calculations of decision performance. Monotonic or not, the decision rule proceeds as described: the sufficient statistic is computed for each observation vector and compared to a threshold.
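A small sketch shows how a non-monotonic statistic produces a disconnected decision region. It assumes, as an illustrative case not taken from the original text, a single observation that is zero-mean Gaussian under both models but with a larger standard deviation under the second model; the parameter values and function names are hypothetical.

```python
import math


def log_likelihood_ratio(r, sigma0=1.0, sigma1=2.0):
    """Log-likelihood ratio for N(0, sigma1^2) versus N(0, sigma0^2).

    It depends on r only through r*r, so it is not monotonic in r.
    """
    return (math.log(sigma0 / sigma1)
            + 0.5 * r * r * (1 / sigma0 ** 2 - 1 / sigma1 ** 2))


def decide(r, eta=1.0):
    """Announce model 1 when the log ratio exceeds the log threshold."""
    return 1 if log_likelihood_ratio(r) > math.log(eta) else 0


# Scan the observation axis from -4.0 to 4.0 in steps of 0.5.
grid = [x / 2 for x in range(-8, 9)]
decisions = [decide(r) for r in grid]
for r, d in zip(grid, decisions):
    print(f"{r:5.1f} -> model {d}")
```

The region assigned to the second model consists of two separate pieces, one for large negative values and one for large positive values, with the first model's region lying between them. The rule itself is unchanged: compute the statistic, then compare it to the threshold.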
The coach of a soccer team suspects his goalie has been less than attentive to his training regimen. The coach focuses on the kicks the goalie makes to send the ball down the field. The data he observes is the length of a kick. The coach defines the models as