Where $w_{k+1}$ is the new weight vector, $w_k$ is the old weight vector, and $\mu\, e_k x_k$ is a small step along the negative of the instantaneous error gradient (here $e_k = d_k - w_k^T x_k$ is the instantaneous error and $x_k$ is the input vector).
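As a concrete illustration (my own sketch, not from the notes), the update can be coded in a few lines of NumPy; the 4-tap system h, the signals x and d, and the step size mu below are assumptions made only for this example.

```python
import numpy as np

def lms(x, d, n_taps, mu):
    """Minimal LMS sketch: adapt w so that w^T x_k tracks d_k."""
    N = len(x)
    w = np.zeros(n_taps)                      # w_0: initial weight vector
    W = np.zeros((N, n_taps))                 # weight trajectory, for inspection
    e = np.zeros(N)                           # instantaneous errors
    for k in range(n_taps - 1, N):
        x_k = x[k - n_taps + 1:k + 1][::-1]   # regressor [x_k, x_{k-1}, ...]
        e[k] = d[k] - w @ x_k                 # e_k = d_k - w_k^T x_k
        w = w + mu * e[k] * x_k               # w_{k+1} = w_k + mu * e_k * x_k
        W[k] = w
    return W, e

# Example: identify an unknown 4-tap FIR system from noisy measurements.
rng = np.random.default_rng(0)
h = np.array([1.0, 0.5, -0.3, 0.1])           # "true" system, assumed for the demo
x = rng.standard_normal(5000)
d = np.convolve(x, h)[:len(x)] + 0.01 * rng.standard_normal(len(x))
W, e = lms(x, d, n_taps=4, mu=0.01)
print("final weights:", np.round(W[-1], 3))   # should settle close to h
```

With a small step size the final weights settle close to the true system; the rest of this section quantifies how close and how fast.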
Interpretation in terms of weight error vector
Define the weight error vector
$$v_k = w_k - w^o ,$$
Where $w^o$ is the optimal weight vector, and
$$e^o_k = d_k - x_k^T w^o ,$$
where $e^o_k$ is the minimum error. The stochastic difference equation is:
$$v_{k+1} = \left(I - \mu\, x_k x_k^T\right) v_k + \mu\, e^o_k x_k .$$
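For reference, this difference equation follows directly from the update and the definitions above (a short derivation added here; it uses only $e_k = d_k - x_k^T w_k$):
$$\begin{aligned}
v_{k+1} &= w_{k+1} - w^o = v_k + \mu\, e_k x_k, \qquad e_k = d_k - x_k^T w_k = e^o_k - x_k^T v_k,\\
v_{k+1} &= v_k + \mu\left(e^o_k - x_k^T v_k\right) x_k = \left(I - \mu\, x_k x_k^T\right) v_k + \mu\, e^o_k x_k .
\end{aligned}$$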
Convergence/stability analysis
Show that (tightness)
$$\lim_{B \to \infty}\; \sup_{k}\; \Pr\!\left[\|v_k\| \geq B\right] = 0 ;$$
that is, with probability arbitrarily close to 1, the weight error vector is bounded for all $k$.
Chebyshev's inequality is
$$\Pr\!\left[\|v_k\| \geq B\right] \leq \frac{E\!\left[\|v_k\|^2\right]}{B^2},$$
and
$$E\!\left[\|v_k\|^2\right] = \operatorname{tr}(C_k) + \left\|E[v_k]\right\|^2,$$
where $C_k = E\!\left[\left(v_k - E[v_k]\right)\left(v_k - E[v_k]\right)^T\right]$ is the weight error covariance and $\left\|E[v_k]\right\|^2$ is the squared bias. If $E\!\left[\|v_k\|^2\right]$ is finite for all $k$, then
$$\Pr\!\left[\|v_k\| \geq B\right] \leq \frac{E\!\left[\|v_k\|^2\right]}{B^2} \longrightarrow 0 \quad \text{as } B \to \infty$$
for all $k$. Also,
$$\operatorname{tr}(C_k) = \sum_i \left(C_k\right)_{i,i} .$$
Therefore $E\!\left[\|v_k\|^2\right]$ is finite if the diagonal elements of $C_k$ are bounded.
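A quick Monte-Carlo check of this argument (an illustrative sketch; the 2-tap optimal filter w_opt, the white input, the noise level, the step size mu, and the threshold B are all assumptions, not values from the notes): estimate $E\!\left[\|v_k\|^2\right]$ by averaging over independent runs and compare the empirical $\Pr[\|v_k\| \geq B]$ with the Chebyshev bound.

```python
import numpy as np

rng = np.random.default_rng(1)
n_taps, n_iter, n_runs, mu, B = 2, 1000, 500, 0.02, 0.1
w_opt = np.array([0.8, -0.4])            # assumed optimal weight vector

v_sq_final = np.zeros(n_runs)            # ||v_k||^2 at the last iteration, per run
for r in range(n_runs):
    w = np.zeros(n_taps)
    for k in range(n_iter):
        x_k = rng.standard_normal(n_taps)                 # white input vector
        d_k = w_opt @ x_k + 0.1 * rng.standard_normal()   # optimal error level ~0.1
        e_k = d_k - w @ x_k
        w = w + mu * e_k * x_k                            # LMS update
    v_sq_final[r] = np.sum((w - w_opt) ** 2)              # ||v_k||^2 at k = n_iter

mean_sq = v_sq_final.mean()                               # estimate of E[||v_k||^2]
frac_exceed = (np.sqrt(v_sq_final) >= B).mean()           # empirical P(||v_k|| >= B)
print("E||v_k||^2 ~", round(mean_sq, 5),
      "  P(||v_k|| >= B) ~", round(frac_exceed, 3),
      "  Chebyshev bound E||v_k||^2/B^2 =", round(mean_sq / B**2, 3))
```

If the estimate of $E\!\left[\|v_k\|^2\right]$ settles at a finite value, the Chebyshev bound caps the exceedance probability uniformly in $k$, which is all the tightness argument needs.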
Convergence in mean
Show that $E[v_k] \to 0$ as $k \to \infty$. Take the expectation of the stochastic difference equation, using the smoothing property of conditional expectation (together with the usual independence assumptions) to simplify the calculation. We have
$$E[v_{k+1}] = \left(I - \mu R\right) E[v_k], \qquad R = E\!\left[x_k x_k^T\right],$$
so we have convergence in mean if $R$ is positive definite (invertible) and
$$0 < \mu < \frac{2}{\lambda_{\max}},$$
where $\lambda_{\max}$ is the largest eigenvalue of $R$.
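To make the step-size condition explicit (a standard expansion of the recursion above, added here for clarity): rotating into the eigencoordinates of $R = U \Lambda U^T$ decouples the modes,
$$\tilde{v}_k = U^T E[v_k] \;\Longrightarrow\; \tilde{v}_{k+1,i} = \left(1 - \mu\lambda_i\right)\tilde{v}_{k,i} = \left(1 - \mu\lambda_i\right)^{k+1}\tilde{v}_{0,i},$$
so every mode decays to zero exactly when $\left|1 - \mu\lambda_i\right| < 1$ for all $i$, i.e., $0 < \mu < \frac{2}{\lambda_{\max}}$.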
Bounded variance
Show that $C_k$, the weight vector error covariance, is bounded for all $k$.
We could have $E[v_k] \to 0$ but $\operatorname{tr}(C_k) \to \infty$, in which case the algorithm would not be stable.
Recall that it is fairly straightforward to show that the diagonal elements of the transformed covariance $\tilde{C}_k = U^T C_k U$ tend to zero if $e^o_k = 0$ and $0 < \mu < \frac{2}{\lambda_{\max}}$ ($U$ is the eigenvector matrix of $R$; $R = U \Lambda U^T$ with $\Lambda = \operatorname{diag}(\lambda_1, \ldots, \lambda_N)$). The diagonal elements of $\tilde{C}_k$ were denoted by $\gamma_{k,i} = \left(\tilde{C}_k\right)_{i,i}$.
Thus, to guarantee boundedness of $\operatorname{tr}(C_k)$ we need to show that the "steady-state" values
$$\gamma_i \equiv \lim_{k \to \infty} \gamma_{k,i} < \infty .$$
We showed that
$$\gamma_i = \frac{\mu\, \xi_{\min}}{2 - \mu \lambda_i},$$
where $\xi_{\min} = E\!\left[\left(e^o_k\right)^2\right]$ is the minimum mean-squared error, $\lambda_i$ is the $i^{\text{th}}$ eigenvalue of $R$ ($\lambda_i > 0$ since $R$ is positive definite), and $0 < \mu < \frac{2}{\lambda_{\max}}$.
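One way to arrive at this expression (a sketch under the usual independence assumption, neglecting the coupling between modes; it may differ in detail from the derivation in the earlier notes): the rotated diagonal elements approximately obey
$$\gamma_{k+1,i} \approx \left(1 - \mu\lambda_i\right)^2 \gamma_{k,i} + \mu^2 \lambda_i\, \xi_{\min},$$
and setting $\gamma_{k+1,i} = \gamma_{k,i} = \gamma_i$ gives $\gamma_i = \dfrac{\mu^2 \lambda_i \xi_{\min}}{1 - \left(1 - \mu\lambda_i\right)^2} = \dfrac{\mu\, \xi_{\min}}{2 - \mu\lambda_i}$.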
We found a sufficient condition for $\mu$ that guaranteed that the steady-state $\gamma_i$'s (and hence $\operatorname{tr}(C_k)$) are bounded:
$$0 < \mu < \frac{2}{3 \operatorname{tr}(R)}$$
Where $\operatorname{tr}(R) = \sum_i \lambda_i = E\!\left[\|x_k\|^2\right]$ is the input vector energy.
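As a quick check that this suffices (my own verification, using the coupled-mode condition $\sum_i \frac{\mu\lambda_i}{2 - \mu\lambda_i} < 1$ that is commonly quoted for a bounded steady-state variance): if $0 < \mu < \frac{2}{3\operatorname{tr}(R)}$, then $\mu\lambda_i < \frac{2}{3}$ and $2 - \mu\lambda_i > \frac{4}{3}$, so
$$\sum_i \frac{\mu\lambda_i}{2 - \mu\lambda_i} < \frac{3}{4}\, \mu \sum_i \lambda_i = \frac{3}{4}\, \mu \operatorname{tr}(R) < \frac{1}{2} < 1 .$$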
With this choice of $\mu$ we have:
convergence in mean
bounded steady-state variance
This implies that $E\!\left[\|v_k\|^2\right]$ is bounded for all $k$, and hence, by the Chebyshev argument above, that $v_k$ is bounded with probability arbitrarily close to 1. In other words, the LMS algorithm is stable about the optimum weight vector $w^o$.
Learning curve
Recall that $e_k = e^o_k - x_k^T v_k$ and $E\!\left[e^o_k x_k\right] = 0$. These imply
$$\xi_k \equiv E\!\left[e_k^2\right] = \xi_{\min} + E\!\left[v_k^T x_k x_k^T v_k\right] = \xi_{\min} + \operatorname{tr}\!\left(R\, E\!\left[v_k v_k^T\right]\right),$$
where $\xi_{\min} = E\!\left[\left(e^o_k\right)^2\right]$. Once the bias $E[v_k]$ has decayed, the excess term is $\operatorname{tr}(R\, C_k)$, so the MSE is
$$\xi_k \approx \xi_{\min} + \sum_i \lambda_i\, \gamma_{k,i},$$
Where $\lambda_i$ and $\gamma_{k,i}$ are the eigenvalues of $R$ and the diagonal elements of the transformed covariance, as above. So the limiting MSE is
$$\xi_\infty = \lim_{k \to \infty} \xi_k = \xi_{\min} + \sum_i \frac{\mu\, \xi_{\min}\, \lambda_i}{2 - \mu\lambda_i} .$$
Since $\mu > 0$ was required for convergence, $\xi_\infty > \xi_{\min}$, so that we see noisy adaptation leads to an MSE larger than the optimal $\xi_{\min}$.
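A small simulation makes this excess MSE visible (an illustrative sketch; the 2-tap white-input setup, the noise level xi_min, and mu are assumptions chosen so that $R = I$ and every eigenvalue is 1):

```python
import numpy as np

# Empirical learning curve vs. the theoretical limiting MSE xi_infinity.
rng = np.random.default_rng(2)
n_taps, n_iter, n_runs, mu = 2, 3000, 300, 0.05
w_opt = np.array([1.0, -0.5])          # assumed optimal weights
xi_min = 0.01                          # variance of the optimal error e_k^o

mse = np.zeros(n_iter)
for _ in range(n_runs):
    w = np.zeros(n_taps)
    for k in range(n_iter):
        x_k = rng.standard_normal(n_taps)              # white input => R = I
        d_k = w_opt @ x_k + np.sqrt(xi_min) * rng.standard_normal()
        e_k = d_k - w @ x_k
        mse[k] += e_k ** 2                             # accumulate e_k^2 across runs
        w = w + mu * e_k * x_k                         # LMS update
mse /= n_runs                                          # learning curve estimate of xi_k

lam = np.ones(n_taps)                                  # eigenvalues of R = I
xi_inf = xi_min + np.sum(mu * xi_min * lam / (2 - mu * lam))
print("empirical steady-state MSE:", round(mse[-500:].mean(), 5))
print("theoretical xi_infinity   :", round(xi_inf, 5))
```

Both numbers should land a few percent above xi_min, which is exactly the excess quantified by the misadjustment below.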
To quantify the increase in the MSE, define the so-called misadjustment:
$$M = \frac{\xi_\infty - \xi_{\min}}{\xi_{\min}} .$$
We would of course like to keep $M$ as small as possible.
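For instance (an illustrative number, not taken from the notes): if the steady-state MSE sits 10% above the optimum, $\xi_\infty = 1.1\, \xi_{\min}$, then
$$M = \frac{1.1\,\xi_{\min} - \xi_{\min}}{\xi_{\min}} = 0.1,$$
i.e., a 10% misadjustment.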
Learning speed and misadjustment trade-off
Fast adaptation and quick convergence require that we take steps as large as possible. In other words, learning speed is proportional to $\mu$; a larger $\mu$ means faster convergence. How does $\mu$ affect the misadjustment?
To guarantee convergence/stability we require
$$0 < \mu < \frac{2}{3 \operatorname{tr}(R)} .$$
Let's assume that in fact $\mu \ll \frac{2}{3 \operatorname{tr}(R)}$, so that there is no problem with convergence. This condition implies $\mu \ll \frac{2}{3 \lambda_i}$, or $\mu \lambda_i \ll 1$, for every $i$. From here we see that
$$\gamma_i = \frac{\mu\, \xi_{\min}}{2 - \mu\lambda_i} \approx \frac{\mu\, \xi_{\min}}{2} .$$
This gives a misadjustment of approximately
$$M = \frac{\xi_\infty - \xi_{\min}}{\xi_{\min}} = \frac{\sum_i \lambda_i \gamma_i}{\xi_{\min}} \approx \frac{\mu}{2} \sum_i \lambda_i = \frac{\mu}{2} \operatorname{tr}(R) .$$
This shows that a larger step size $\mu$ leads to a larger misadjustment. Since we still have convergence in mean, this essentially means that with a larger step size we "converge" faster but have a larger variance (rattling) about $w^o$.
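To put rough numbers on this trade-off (my own back-of-the-envelope sketch; the eigenvalue spread of $R$ is an assumption, and it uses the approximate relations above, namely a slowest-mode time constant of about $1/(\mu \lambda_{\min})$ iterations from the modal decay $(1-\mu\lambda_i)^k$ and $M \approx \frac{\mu}{2}\operatorname{tr}(R)$):

```python
import numpy as np

# Speed vs. misadjustment for two step sizes, using the approximations above.
lam = np.array([0.5, 1.0, 2.0, 4.0])       # assumed eigenvalues of R
for mu in (0.005, 0.05):
    tau_slow = 1.0 / (mu * lam.min())      # iterations for the slowest mode to adapt
    M = 0.5 * mu * lam.sum()               # misadjustment ~ (mu/2) tr(R)
    print(f"mu = {mu:<6}  slowest time constant ~ {tau_slow:6.0f} iterations,"
          f"  misadjustment ~ {100 * M:.1f}%")
```

The tenfold larger step size adapts roughly ten times faster but pays roughly ten times the misadjustment, which is the rattling about $w^o$ described above.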