
Lower performance bounds

In other modules, estimators/predictors are analyzed, in order to obtain upper bounds on their performance. These bounds are of the form:

$$\sup_{f \in \mathcal{F}} E\left[ d(\hat{f}_n, f) \right] \le C n^{-\gamma}$$

where $\gamma > 0$. We would like to know whether these bounds are tight, in the sense that no other estimator is significantly better. To answer this, we need lower bounds of the form

$$\inf_{\hat{f}_n} \sup_{f \in \mathcal{F}} E\left[ d(\hat{f}_n, f) \right] \ge c\, n^{-\gamma}$$

We assume we have the following ingredients:

  • A class of models, $\mathcal{F} \subseteq \mathcal{S}$. $\mathcal{F}$ is a class of models containing the "true" model and is a subset of some bigger class $\mathcal{S}$. E.g. $\mathcal{F}$ could be the class of Lipschitz density functions, or of distributions $P_{XY}$ satisfying the box-counting condition.
  • An observation model, $P_f$, indexed by $f \in \mathcal{F}$. $P_f$ denotes the distribution of the data under model $f$. E.g. in regression and classification, this is the distribution of $Z = (X_1, Y_1, \ldots, X_n, Y_n) \in \mathcal{Z}$. We will assume that $P_f$ is a probability measure on the measurable space $(\mathcal{Z}, \mathcal{B})$.
  • A performance metric $d(\cdot, \cdot) \ge 0$. If you have a model estimate $\hat{f}_n$, then the performance of that model estimate relative to the true model $f$ is $d(\hat{f}_n, f)$. E.g.
    Regression: $d(\hat{f}_n, f) = \|\hat{f}_n - f\|_2 = \left( \int (\hat{f}_n(x) - f(x))^2 \, dx \right)^{1/2}$
    Classification: $d(\hat{f}_n, f) = R(\hat{G}_n) - R^* = \int_{\hat{G}_n \Delta G^*} |2\eta(x) - 1| \, dP_X(x)$
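
The regression metric can be evaluated numerically. The following is a minimal sketch, assuming the functions are defined on $[0, 1]$ and approximating the integral with a midpoint Riemann sum. The helper `l2_distance` and the example functions `f` and `f_hat` are hypothetical names chosen only for illustration.

```python
import math

def l2_distance(f_hat, f, num_points=10_000):
    """Approximate ||f_hat - f||_2 on [0, 1] with a midpoint Riemann sum."""
    h = 1.0 / num_points
    total = 0.0
    for i in range(num_points):
        x = (i + 0.5) * h  # midpoint of the i-th cell
        total += (f_hat(x) - f(x)) ** 2
    return math.sqrt(total * h)

# Hypothetical example: true model f(x) = x and an estimate shifted by 0.1.
f = lambda x: x
f_hat = lambda x: x + 0.1

dist = l2_distance(f_hat, f)
print(dist)  # the shift is constant, so the distance is close to 0.1
```

A constant shift of 0.1 gives a squared difference of 0.01 everywhere, so the integral is 0.01 and its square root is 0.1, which is what the sum approximates.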

As before, we are interested in the risk of a learning rule, in particular the maximal risk given as:

$$\sup_{f \in \mathcal{F}} E_f\left[ d(\hat{f}_n, f) \right] = \sup_{f \in \mathcal{F}} \int d(\hat{f}_n(Z), f) \, dP_f(Z)$$

where $\hat{f}_n$ is a function of the observations $Z$ and $E_f$ denotes the expectation with respect to $P_f$.
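
To make the maximal risk concrete, here is a rough Monte Carlo sketch under assumptions that are not part of the text above: the data are $n$ i.i.d. draws from $N(\theta, 1)$, the model is the mean $\theta$, the class is replaced by three hypothetical values of $\theta$, the estimator is the sample mean, and $d(a, b) = |a - b|$.

```python
import math
import random

def sample_mean_estimator(z):
    """Hypothetical estimator: the average of the observations."""
    return sum(z) / len(z)

def monte_carlo_risk(theta, n, trials, rng):
    """Estimate E_f[d(f_hat_n, f)] with d(a, b) = |a - b| under N(theta, 1) data."""
    total = 0.0
    for _ in range(trials):
        z = [rng.gauss(theta, 1.0) for _ in range(n)]
        total += abs(sample_mean_estimator(z) - theta)
    return total / trials

rng = random.Random(0)  # fixed seed so the run is reproducible
n, trials = 25, 5000
model_class = [-1.0, 0.0, 1.0]  # small stand-in for the class F (an assumption)

risks = {theta: monte_carlo_risk(theta, n, trials, rng) for theta in model_class}
max_risk = max(risks.values())  # empirical version of the sup over F

# In this Gaussian setup the exact risk is sqrt(2 / (pi * n)) for every theta.
exact_risk = math.sqrt(2.0 / (math.pi * n))
print(risks, max_risk, exact_risk)
```

In this particular setup the risk does not depend on $\theta$, so the supremum is attained everywhere. In general the maximal risk is driven by the hardest models in the class.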

The main goal is to get results of the form

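$$R_n^* \triangleq \inf_{\hat{f}_n} \sup_{f \in \mathcal{F}} E\left[ d(\hat{f}_n, f) \right] \ge c\, s_n$$

where $c > 0$ and $s_n \to 0$ as $n \to \infty$. The inf is taken over all estimators, i.e. all measurable functions $\hat{f}_n : \mathcal{Z} \to \mathcal{S}$.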

Suppose we have shown that

$$\liminf_{n \to \infty} s_n^{-1} R_n^* \ge c > 0 \quad \text{(a lower bound)}$$

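and also that, for a particular estimator $\bar{f}_n$,

$$\limsup_{n \to \infty} s_n^{-1} \sup_{f \in \mathcal{F}} E_f\left[ d(\bar{f}_n, f) \right] \le C,$$

which, since $R_n^*$ is no larger than the maximal risk of any particular estimator, implies

$$\limsup_{n \to \infty} s_n^{-1} R_n^* \le C.$$

Then we say that $s_n$ is the optimal rate of convergence for this problem and that $\bar{f}_n$ attains that rate.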

Two rates of convergence $\Psi_n$ and $\Psi_n'$ are equivalent, i.e. $\Psi_n \asymp \Psi_n'$, iff

$$0 < \liminf_{n \to \infty} \frac{\Psi_n}{\Psi_n'} \le \limsup_{n \to \infty} \frac{\Psi_n}{\Psi_n'} < \infty$$
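
As a quick worked illustration of this definition, consider the following rates, chosen here only as examples. The first pair is equivalent because the ratio converges to a finite positive constant. The second pair is not, because the ratio tends to zero.

```latex
\begin{align*}
\Psi_n = n^{-1/2},\quad \Psi_n' = 3n^{-1/2} + n^{-1}
  &\;\Longrightarrow\;
  \frac{\Psi_n}{\Psi_n'} = \frac{1}{3 + n^{-1/2}} \xrightarrow[n \to \infty]{} \frac{1}{3}
  \quad\text{(equivalent)} \\[4pt]
\Psi_n = n^{-1/2},\quad \Psi_n'' = n^{-1/2}\log n
  &\;\Longrightarrow\;
  \frac{\Psi_n}{\Psi_n''} = \frac{1}{\log n} \xrightarrow[n \to \infty]{} 0
  \quad\text{(not equivalent)}
\end{align*}
```

Constants and lower-order terms therefore do not change the rate, whereas a logarithmic factor does.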

General reduction scheme

Instead of directly bounding the expected performance, we are going to prove stronger probability bounds of the form

$$\inf_{\hat{f}_n} \sup_{f \in \mathcal{F}} P_f\left( d(\hat{f}_n, f) \ge s_n \right) \ge c > 0$$

These bounds can be readily converted to expected performance bounds using Markov's inequality:

$$P_f\left( d(\hat{f}_n, f) \ge s_n \right) \le \frac{E_f\left[ d(\hat{f}_n, f) \right]}{s_n}$$

Therefore it follows:

$$\inf_{\hat{f}_n} \sup_{f \in \mathcal{F}} E_f\left[ d(\hat{f}_n, f) \right] \ge \inf_{\hat{f}_n} \sup_{f \in \mathcal{F}} s_n P_f\left( d(\hat{f}_n, f) \ge s_n \right) \ge c\, s_n$$
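
The Markov step can be checked numerically. The sketch below uses simulated nonnegative losses standing in for $d(\hat{f}_n, f)$. The loss distribution and the threshold value are assumptions made only for this illustration.

```python
import random

rng = random.Random(1)  # fixed seed so the run is reproducible

# Hypothetical nonnegative losses d(f_hat_n, f), one per simulated data set.
losses = [abs(rng.gauss(0.0, 0.2)) for _ in range(10_000)]

s_n = 0.1  # hypothetical threshold playing the role of s_n
expected_loss = sum(losses) / len(losses)
tail_prob = sum(1 for d in losses if d >= s_n) / len(losses)

# Markov: s_n * P(d >= s_n) <= E[d]. The inequality holds exactly for the
# empirical distribution, because every loss d satisfies d >= s_n * 1{d >= s_n}.
lower_bound = s_n * tail_prob
print(f"s_n * P(d >= s_n) = {lower_bound:.4f} <= E[d] = {expected_loss:.4f}")
```

A probability bound with constant $c$ at threshold $s_n$ therefore gives an expected-loss bound of $c\, s_n$, at the cost of whatever slack Markov's inequality introduces.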

First reduction step

Reduce the original problem to an easier one by replacing the larger class $\mathcal{F}$ with a smaller finite class $\{f_0, \ldots, f_M\} \subseteq \mathcal{F}$. Observe that

$$\inf_{\hat{f}_n} \sup_{f \in \mathcal{F}} P_f\left( d(\hat{f}_n, f) \ge s_n \right) \ge \inf_{\hat{f}_n} \sup_{f \in \{f_0, \ldots, f_M\}} P_f\left( d(\hat{f}_n, f) \ge s_n \right)$$

The key idea is to choose a finite collection of models such that the resulting problem is as hard as the original; otherwise the lower bound will not be tight.
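
The inequality behind this step holds for each fixed estimator, because a supremum over a subset can never exceed the supremum over the whole set; taking the infimum over estimators preserves it. The sketch below illustrates this for one hypothetical setup: $N(\theta, 1)$ data, a shrinkage estimator equal to 0.8 times the sample mean, a grid standing in for the class, and a two-element sub-collection. All of these choices are assumptions made for illustration.

```python
import math

def normal_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def tail_prob(theta, n, s, shrink=0.8):
    """P_theta(|shrink * mean - theta| >= s) for the mean of n draws from N(theta, 1)."""
    mu = (shrink - 1.0) * theta      # mean of the estimation error
    sigma = shrink / math.sqrt(n)    # standard deviation of the estimation error
    inside = normal_cdf((s - mu) / sigma) - normal_cdf((-s - mu) / sigma)
    return 1.0 - inside

n, s = 25, 0.2
full_class = [i / 10.0 for i in range(-10, 11)]  # grid stand-in for F = [-1, 1]
finite_subclass = [0.0, 1.0]                      # {f_0, f_1}, a subset of the grid

sup_full = max(tail_prob(t, n, s) for t in full_class)
sup_finite = max(tail_prob(t, n, s) for t in finite_subclass)
print(sup_full, sup_finite)
```

If the sub-collection leaves out the models that are hardest for the estimator, `sup_finite` can be strictly smaller than `sup_full`, which is the loss of tightness the paragraph above warns about.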

Second reduction step

Next, we reduce the problem to a hypothesis test. Ideally, we would like to have something like

$$\inf_{\hat{f}_n} \sup_{f \in \mathcal{F}} P_f\left( d(\hat{f}_n, f) \ge s_n \right) \ge \inf_{\hat{h}_n} \sup_{j \in \{0, \ldots, M\}} P_{f_j}\left( \hat{h}_n(Z) \ne j \right)$$

The inf is over all measurable test functions

$$\hat{h}_n : \mathcal{Z} \to \{0, \ldots, M\}$$

and $P_{f_j}(\hat{h}_n(Z) \ne j)$ denotes the probability that, after observing the data, the test infers the wrong hypothesis.
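
To illustrate what a test function and its worst-case error probability look like, here is a small simulation sketch. It assumes two hypothetical hypotheses about the mean of $N(\theta, 1)$ data and uses a nearest-mean rule as the test. Neither choice comes from the text above.

```python
import random

def nearest_mean_test(z, means):
    """Hypothetical test h_hat_n: index j whose mean is closest to the sample mean."""
    z_bar = sum(z) / len(z)
    return min(range(len(means)), key=lambda j: abs(z_bar - means[j]))

def error_probability(j, means, n, trials, rng):
    """Monte Carlo estimate of P_{f_j}(h_hat_n(Z) != j) with N(means[j], 1) data."""
    errors = 0
    for _ in range(trials):
        z = [rng.gauss(means[j], 1.0) for _ in range(n)]
        if nearest_mean_test(z, means) != j:
            errors += 1
    return errors / trials

rng = random.Random(2)  # fixed seed so the run is reproducible
means = [0.0, 0.4]      # hypothetical hypotheses f_0 and f_1 (so M = 1)
n, trials = 25, 4000

worst_error = max(error_probability(j, means, n, trials, rng)
                  for j in range(len(means)))
print(worst_error)
```

With these values the sample mean has standard deviation 0.2 and the decision boundary sits at 0.2, so each error probability is the standard normal tail beyond one standard deviation, roughly 0.16. The sketch only evaluates the right-hand side of the inequality above for one particular test; it says nothing about whether the inequality itself holds for a given problem.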

This might not always be true or easy to show, but in certain scenarios it can be done. Suppose $d(\cdot, \cdot)$ is a semi-distance, i.e. it satisfies

Source:  OpenStax, Statistical learning theory. OpenStax CNX. Apr 10, 2009 Download for free at http://cnx.org/content/col10532/1.3