We can now write
$$\varepsilon(y^n_{i \to a}) - \varepsilon(y^n) = \Delta H_k(i,a) + \beta\,\Delta d(i,a) \triangleq \Delta\varepsilon(i,a),$$
where $\Delta H_k(i,a)$ is the change in the empirical conditional entropy $H_k(y^n)$ when $y_i$ is replaced by $a$, and $\Delta d(i,a)$ is the change in distortion.
Combining [link] and [link], the maximum change in the energy within an iteration of the MCMC algorithm is then bounded; we denote this bound by $\Delta$.
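One consequence worth spelling out (a sketch in our notation, where $T_t$ denotes the temperature in super-iteration $t$) is that the Gibbs sampler never needs the absolute energy, only its change:

```latex
P\left(y_i = a \,\middle|\, y^{n \setminus i}\right)
  = \frac{\exp\{-\varepsilon(y^n_{i \to a})/T_t\}}
         {\sum_{b \in \hat{\mathcal{Y}}} \exp\{-\varepsilon(y^n_{i \to b})/T_t\}}
  = \frac{\exp\{-\Delta\varepsilon(i,a)/T_t\}}
         {\sum_{b \in \hat{\mathcal{Y}}} \exp\{-\Delta\varepsilon(i,b)/T_t\}},
```

because the common factor $\exp\{-\varepsilon(y^n)/T_t\}$ cancels between numerator and denominator. This is why only $\Delta H_k$ and $\Delta d$ must be computed in the inner loop.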
We refer to resampling at a single location as an iteration, and group the $n$ possible locations into super-iterations.
During the simulated annealing, the inverse temperature $\beta_t = 1/T_t$ is gradually increased (equivalently, the temperature $T_t$ is decreased), where in super-iteration $t$ we use the schedule given by [link], [link].
In each iteration, the Gibbs sampler modifies $y_i$ in a random manner that resembles heat bath concepts in statistical physics. Although MCMC could sink into a local minimum, we decrease the temperature slowly enough that the randomness of Gibbs sampling eventually drives MCMC out of the local minimum toward the set of minimal-energy solutions, which includes the optimal reproduction sequence, because low temperature favors low-energy sequences $y^n$.
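As an illustration of one Gibbs update, here is a minimal Python sketch; the function and the `delta_energy` callback (returning $\Delta\varepsilon(i,a) = \Delta H_k(i,a) + \beta\,\Delta d(i,a)$) are hypothetical stand-ins, not the authors' implementation.

```python
import math
import random

def resample_location(y, i, alphabet, temperature, delta_energy):
    """Resample y[i] from the Boltzmann distribution at the given temperature.

    delta_energy(y, i, a) is a hypothetical callback returning the change in
    energy when y[i] is replaced by symbol a (Delta H_k + beta * Delta d).
    """
    deltas = [delta_energy(y, i, a) for a in alphabet]
    # Subtract the minimum before exponentiating: the common factor cancels
    # in the Boltzmann distribution, and this avoids numerical overflow.
    base = min(deltas)
    weights = [math.exp(-(d - base) / temperature) for d in deltas]
    # Draw a symbol with probability proportional to its weight.
    r = random.uniform(0.0, sum(weights))
    cumulative = 0.0
    for a, w in zip(alphabet, weights):
        cumulative += w
        if r <= cumulative:
            y[i] = a
            return
    y[i] = alphabet[-1]  # guard against floating-point round-off
```

At high temperature the weights are nearly uniform, so the sampler explores freely; as $T_t \to 0$ the draw concentrates on the minimal-energy symbol, which is the slow-cooling behavior just described.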
Pseudo-code for our encoder appears in Algorithm 1 below.
Algorithm 1: Lossy encoder with fixed reproduction alphabet
Input: $x^n$, $\beta$, $k$, $c$
Output: bit-stream
Procedure:
- Initialize $y^n$ by quantizing $x^n$ with the reproduction alphabet $\hat{\mathcal{Y}}$
- Initialize the symbol counts $n_k(\cdot,\cdot)$ using $y^n$
- for $t = 1$ to $c$ do // super-iteration
  - Set $T_t$ // temperature
  - Draw a permutation of the numbers $\{1, \ldots, n\}$ at random
  - for $j = 1$ to $n$ do
    - Let $i$ be component $j$ in the permutation
    - Generate new $y_i$ using the Boltzmann distribution
    - Update the symbol counts $n_k(\cdot,\cdot)$
- Apply CTW to $y^n$ // compress outcome
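For concreteness, the whole procedure can be sketched in Python as below, assuming the hypothetical `resample_location` helper above, plus assumed callbacks `quantize` (initial quantization), `temperature` (the annealing schedule), and `delta_energy`; the concluding CTW (context tree weighting) compression is omitted.

```python
import random

def lossy_encode(x, alphabet, c, quantize, temperature, delta_energy):
    """Sketch of Algorithm 1; quantize, temperature, and delta_energy are
    assumed callbacks, and the final CTW stage is omitted."""
    n = len(x)
    # Initialize y by quantizing x to the reproduction alphabet.
    y = [quantize(x_i) for x_i in x]
    # (The symbol counts n_k(., .) would be initialized from y here; the
    #  context order k enters only through that omitted bookkeeping.)
    for t in range(1, c + 1):                # super-iteration
        T_t = temperature(t)                 # annealing schedule
        order = list(range(n))
        random.shuffle(order)                # random order of locations
        for i in order:                      # one iteration per location
            # Resample y[i] from the Boltzmann distribution; the symbol
            # counts would be updated here in O(k) time.
            resample_location(y, i, alphabet, T_t, delta_energy)
    return y  # the full encoder would now compress y with CTW
```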
Computational issues
Looking at the pseudo-code, the following steps could be computational bottlenecks:
- (a) Initializing the symbol counts $n_k(\cdot,\cdot)$: a naive implementation needs to scan the sequence $y^n$ (complexity $O(n)$) and initialize a data structure with $|\hat{\mathcal{Y}}|^{k+1}$ elements. Unless $|\hat{\mathcal{Y}}|^{k+1} = O(n)$, this is super-linear in $n$. Therefore, we recall that $k = o(\log n)$, so that $|\hat{\mathcal{Y}}|^{k+1}$ grows sub-linearly in $n$, and initializing $n_k(\cdot,\cdot)$ requires linear complexity $O(n)$.
- (b) The inner loop is run $n$ times per super-iteration, and each time the change in energy must be computed for all $|\hat{\mathcal{Y}}|$ possible values of $y_i$, which might be challenging. In particular, let us consider the computation of $\Delta d$ and $\Delta H_k$.
  - Computation of $\Delta d$ requires constant time, and is not burdensome.
  - Computation of $\Delta H_k$ requires modifying the symbol counts for each context that was affected. A key contribution by Jalali and Weissman was to recognize that the array of symbol counts, $n_k(\cdot,\cdot)$, changes in only $O(k)$ locations, where $k$ is the context order (see the sketch after this list). Therefore, each computation of $\Delta H_k$ requires $O(k)$ time. Seeing that $|\hat{\mathcal{Y}}|$ such computations are needed per iteration, and there are $cn$ iterations in total, this step costs $O(cnk|\hat{\mathcal{Y}}|)$.
- (c) Updating $n_k(\cdot,\cdot)$ after $y_i$ is re-sampled from the Boltzmann distribution also requires $O(k)$ time. However, this step is performed only once per iteration, and not $|\hat{\mathcal{Y}}|$ times. Therefore, this step requires less computation than step (b).
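To illustrate the bookkeeping behind steps (b) and (c), here is a minimal Python sketch of the count update, under an assumed representation of $n_k(\cdot,\cdot)$ as a dictionary mapping (context, symbol) pairs to counts; the function names and layout are our own. Changing $y_i$ touches only positions $i, i+1, \ldots, i+k$, since position $i$ changes its symbol and positions $i+1, \ldots, i+k$ contain $y_i$ in their contexts, hence the $O(k)$ cost.

```python
from collections import defaultdict

def update_counts(counts, y, i, new_symbol, k):
    """Replace y[i] by new_symbol and adjust the order-k context counts.

    counts: defaultdict(int) mapping (context, symbol) -> count, where the
    context of position j is the tuple y[j-k:j] (shorter near the start).
    Changing y[i] affects only positions i..i+k: position i changes its
    symbol, positions i+1..i+k change their contexts -- O(k) work in total.
    """
    affected = range(i, min(i + k + 1, len(y)))
    # Remove the contributions of the affected positions under the old y[i].
    for j in affected:
        context = tuple(y[max(0, j - k):j])
        counts[(context, y[j])] -= 1
    y[i] = new_symbol
    # Re-insert their contributions under the new y[i].
    for j in affected:
        context = tuple(y[max(0, j - k):j])
        counts[(context, y[j])] += 1

def build_counts(y, k):
    """Linear-time initialization of the counts from y, as in step (a)."""
    counts = defaultdict(int)
    for j in range(len(y)):
        counts[(tuple(y[max(0, j - k):j]), y[j])] += 1
    return counts
```

An implementation of $\Delta H_k$ would inspect the same $O(k)$ affected entries for each candidate symbol, computing the entropy change from them without committing the update.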