We can now write
$$\varepsilon(y^n_{i \to a}) - \varepsilon(y^n) = \Delta H_k(i,a) + \beta\,\Delta d(i,a) \triangleq \Delta\varepsilon(i,a),$$
where $\Delta H_k(i,a)$ is the change in the empirical conditional entropy $H_k(y^n)$ when $y_i$ is replaced by $a$, and $\Delta d(i,a)$ is the change in distortion.
Combining [link] and [link], the maximum change in the energy within an iteration of the MCMC algorithm is then bounded; we denote this bound by $\Delta$.
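One consequence worth spelling out (a sketch in our notation, where $T_t$ denotes the temperature in super-iteration $t$) is that the Gibbs sampler never needs the absolute energy, only its change:

```latex
P\left(y_i = a \,\middle|\, y^{n \setminus i}\right)
  = \frac{\exp\{-\varepsilon(y^n_{i \to a})/T_t\}}
         {\sum_{b \in \hat{\mathcal{Y}}} \exp\{-\varepsilon(y^n_{i \to b})/T_t\}}
  = \frac{\exp\{-\Delta\varepsilon(i,a)/T_t\}}
         {\sum_{b \in \hat{\mathcal{Y}}} \exp\{-\Delta\varepsilon(i,b)/T_t\}},
```

because the common factor $\exp\{-\varepsilon(y^n)/T_t\}$ cancels between numerator and denominator. This is why only $\Delta H_k$ and $\Delta d$ must be computed in the inner loop.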
We refer to resampling at a single location as an iteration, and group the $n$ possible locations into super-iterations.
During the simulated annealing, the inverse temperature $\beta_t = 1/T_t$ is gradually increased (equivalently, the temperature $T_t$ is decreased), where in super-iteration $t$ we use the schedule given by [link], [link].
In each iteration, the Gibbs sampler modifies $y_i$ in a random manner that resembles heat bath concepts in statistical physics. Although MCMC could sink into a local minimum, we decrease the temperature slowly enough that the randomness of Gibbs sampling eventually drives MCMC out of the local minimum toward the set of minimal-energy solutions, which includes the optimal reproduction sequence, because low temperature favors low-energy sequences $y^n$.
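As an illustration of one Gibbs update, here is a minimal Python sketch; the function and the `delta_energy` callback (returning $\Delta\varepsilon(i,a) = \Delta H_k(i,a) + \beta\,\Delta d(i,a)$) are hypothetical stand-ins, not the authors' implementation.

```python
import math
import random

def resample_location(y, i, alphabet, temperature, delta_energy):
    """Resample y[i] from the Boltzmann distribution at the given temperature.

    delta_energy(y, i, a) is a hypothetical callback returning the change in
    energy when y[i] is replaced by symbol a (Delta H_k + beta * Delta d).
    """
    deltas = [delta_energy(y, i, a) for a in alphabet]
    # Subtract the minimum before exponentiating: the common factor cancels
    # in the Boltzmann distribution, and this avoids numerical overflow.
    base = min(deltas)
    weights = [math.exp(-(d - base) / temperature) for d in deltas]
    # Draw a symbol with probability proportional to its weight.
    r = random.uniform(0.0, sum(weights))
    cumulative = 0.0
    for a, w in zip(alphabet, weights):
        cumulative += w
        if r <= cumulative:
            y[i] = a
            return
    y[i] = alphabet[-1]  # guard against floating-point round-off
```

At high temperature the weights are nearly uniform, so the sampler explores freely; as $T_t \to 0$ the draw concentrates on the minimal-energy symbol, which is the slow-cooling behavior just described.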
Pseudo-code for our encoder appears in Algorithm 1 below.
Algorithm 1: Lossy encoder with fixed reproduction alphabet
Input: $x^n$, $\beta$, $k$, $c$
Output: bit-stream
Procedure:
- Initialize $y^n$ by quantizing $x^n$ with the reproduction alphabet $\hat{\mathcal{Y}}$
- Initialize the symbol counts $n_k(\cdot,\cdot)$ using $y^n$
- for $t = 1$ to $c$ do // super-iteration
  - Set $T_t$ // temperature
  - Draw a permutation of the numbers $\{1, \ldots, n\}$ at random
  - for $j = 1$ to $n$ do
    - Let $i$ be component $j$ in the permutation
    - Generate new $y_i$ using the Boltzmann distribution
    - Update the symbol counts $n_k(\cdot,\cdot)$
- Apply CTW to $y^n$ // compress outcome
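For concreteness, the whole procedure can be sketched in Python as below, assuming the hypothetical `resample_location` helper above, plus assumed callbacks `quantize` (initial quantization), `temperature` (the annealing schedule), and `delta_energy`; the concluding CTW (context tree weighting) compression is omitted.

```python
import random

def lossy_encode(x, alphabet, c, quantize, temperature, delta_energy):
    """Sketch of Algorithm 1; quantize, temperature, and delta_energy are
    assumed callbacks, and the final CTW stage is omitted."""
    n = len(x)
    # Initialize y by quantizing x to the reproduction alphabet.
    y = [quantize(x_i) for x_i in x]
    # (The symbol counts n_k(., .) would be initialized from y here; the
    #  context order k enters only through that omitted bookkeeping.)
    for t in range(1, c + 1):                # super-iteration
        T_t = temperature(t)                 # annealing schedule
        order = list(range(n))
        random.shuffle(order)                # random order of locations
        for i in order:                      # one iteration per location
            # Resample y[i] from the Boltzmann distribution; the symbol
            # counts would be updated here in O(k) time.
            resample_location(y, i, alphabet, T_t, delta_energy)
    return y  # the full encoder would now compress y with CTW
```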
Computational issues
Looking at the pseudo-code, the following steps could be computational bottlenecks:
- (a) Initializing the symbol counts $n_k(\cdot,\cdot)$: a naive implementation needs to scan the sequence $y^n$ (complexity $O(n)$) and initialize a data structure with $|\hat{\mathcal{Y}}|^{k+1}$ elements. Unless $|\hat{\mathcal{Y}}|^{k+1} = O(n)$, this is super-linear in $n$. Therefore, we recall that $k = o(\log n)$, so that $|\hat{\mathcal{Y}}|^{k+1}$ grows sub-linearly in $n$, and initializing $n_k(\cdot,\cdot)$ requires linear complexity $O(n)$.
- (b) The inner loop is run $n$ times per super-iteration, and each time the change in energy must be computed for all $|\hat{\mathcal{Y}}|$ possible values of $y_i$, which might be challenging. In particular, let us consider the computation of $\Delta d$ and $\Delta H_k$.
  - Computation of $\Delta d$ requires constant time, and is not burdensome.
  - Computation of $\Delta H_k$ requires modifying the symbol counts for each context that was affected. A key contribution by Jalali and Weissman was to recognize that the array of symbol counts, $n_k(\cdot,\cdot)$, changes in only $O(k)$ locations, where $k$ is the context order (see the sketch after this list). Therefore, each computation of $\Delta H_k$ requires $O(k)$ time. Seeing that $|\hat{\mathcal{Y}}|$ such computations are needed per iteration, and there are $cn$ iterations in total, this step costs $O(cnk|\hat{\mathcal{Y}}|)$.
- (c) Updating $n_k(\cdot,\cdot)$ after $y_i$ is re-sampled from the Boltzmann distribution also requires $O(k)$ time. However, this step is performed only once per iteration, and not $|\hat{\mathcal{Y}}|$ times. Therefore, this step requires less computation than step (b).
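To illustrate the bookkeeping behind steps (b) and (c), here is a minimal Python sketch of the count update, under an assumed representation of $n_k(\cdot,\cdot)$ as a dictionary mapping (context, symbol) pairs to counts; the function names and layout are our own. Changing $y_i$ touches only positions $i, i+1, \ldots, i+k$, since position $i$ changes its symbol and positions $i+1, \ldots, i+k$ contain $y_i$ in their contexts, hence the $O(k)$ cost.

```python
from collections import defaultdict

def update_counts(counts, y, i, new_symbol, k):
    """Replace y[i] by new_symbol and adjust the order-k context counts.

    counts: defaultdict(int) mapping (context, symbol) -> count, where the
    context of position j is the tuple y[j-k:j] (shorter near the start).
    Changing y[i] affects only positions i..i+k: position i changes its
    symbol, positions i+1..i+k change their contexts -- O(k) work in total.
    """
    affected = range(i, min(i + k + 1, len(y)))
    # Remove the contributions of the affected positions under the old y[i].
    for j in affected:
        context = tuple(y[max(0, j - k):j])
        counts[(context, y[j])] -= 1
    y[i] = new_symbol
    # Re-insert their contributions under the new y[i].
    for j in affected:
        context = tuple(y[max(0, j - k):j])
        counts[(context, y[j])] += 1

def build_counts(y, k):
    """Linear-time initialization of the counts from y, as in step (a)."""
    counts = defaultdict(int)
    for j in range(len(y)):
        counts[(tuple(y[max(0, j - k):j]), y[j])] += 1
    return counts
```

An implementation of $\Delta H_k$ would inspect the same $O(k)$ affected entries for each candidate symbol, computing the entropy change from them without committing the update.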