Adaptive rate control is one way of achieving error resilience.
For channel-matched source rate control, the channel bit error probability
pe needs to be estimated first. Transmitting pilot symbols to estimate
pe is a common solution, but this introduces delay and costs
bandwidth. We now discuss a way to overcome this problem
by jointly controlling the quantization rate
and estimating pe on-line using VSLA. We assume that pe can
take only one of three values: 10^-1, 10^-2, or 10^-3.
At the beginning of the transmission, the actual pe for the channel
is unknown.
Now, let the
set of actions of the automaton correspond to the three channel-matched, empirically optimal quantizers
for channel bit error rates 10^-1, 10^-2, and 10^-3, respectively.
Since there is a one-to-one correspondence between quantizers
and the channel bit error probability, by learning the optimal
quantization parameter we also estimate pe.
Let a favorable response (0) from the decoder for a chosen quantizer
imply that the
PSNR of the current received video frame is greater than or
equal to that of the previous frame,
and let 1 denote
an unfavorable response. We assume here that PSNR and the number of
blocks in error in the received frame are inversely proportional.
Therefore, a high PSNR implies fewer blocks in error.
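The response rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `psnr` and `response` are hypothetical, frames are flat sequences of 8-bit pixel values (peak 255), and the reference frame is whatever the decoder has available for comparison.

```python
import math

def psnr(reference, decoded, peak=255.0):
    """PSNR in dB between two equally sized sequences of pixel values.

    Assumption: 8-bit pixels, so the peak value defaults to 255.
    """
    if len(reference) != len(decoded):
        raise ValueError("frames must have the same number of pixels")
    mse = sum((r - d) ** 2 for r, d in zip(reference, decoded)) / len(reference)
    if mse == 0:
        return math.inf  # identical frames
    return 10.0 * math.log10(peak ** 2 / mse)

def response(psnr_current, psnr_previous):
    """Return 0 (favorable) if the PSNR did not drop, otherwise 1 (unfavorable)."""
    return 0 if psnr_current >= psnr_previous else 1
```

Equal PSNR values count as favorable here, matching the "greater than or equal to" condition in the text.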
The penalty probabilities for the choice of each action are unknown
and define the environment (the channel's conditions).
The goal is to maximize the PSNR of the received video
signal by learning the channel condition and choosing the corresponding
optimal rate for the source quantizer. This corresponds to learning
the action with the minimum penalty probability.
The probabilities of
choosing the quantizers are updated using the LRI (linear reward-inaction) scheme.
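For reference, the standard linear reward-inaction rule can be sketched as follows. This is a minimal illustration, not the authors' implementation; the function name `lri_update` and its argument names are hypothetical, `beta` is the decoder response (0 favorable, 1 unfavorable), and `a` is the reward parameter.

```python
def lri_update(probs, chosen, beta, a):
    """Return the action probabilities after one linear reward-inaction step.

    probs  : current probabilities, one per quantizer (assumed to sum to 1)
    chosen : index of the quantizer used for the current frame
    beta   : 0 for a favorable response, 1 for an unfavorable one
    a      : reward parameter, 0 < a < 1
    """
    if beta == 1:
        # Inaction: a penalty leaves the probabilities unchanged.
        return list(probs)
    # Reward: move the chosen action toward 1 and shrink the others.
    return [p + a * (1.0 - p) if j == chosen else (1.0 - a) * p
            for j, p in enumerate(probs)]
```

Because the chosen action gains exactly what the other actions lose, the updated probabilities still sum to 1.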
The typical steps involved in the proposed algorithm
are as follows:
1. The I-frame is quantized using a high-rate quantizer and transmitted
   with sufficient protection.
2. The decoder, after reconstructing this frame,
   quantizes it using the optimal source rates for the
   three channel bit error
   probabilities under consideration and stores the results
   in a buffer.
3. The VSLA chooses a quantizer randomly for the nth
   frame according to the current action probabilities p(n).
4. The quantized frame is transmitted along with
   the quantizer information as protected side information.
5. A response based on the PSNR of the decoded frame
   is transmitted back as feedback
   information.
6. Based on this feedback,
   the probabilities of choosing the quantizers for the
   (n+1)th frame are computed using LRI.
The algorithm learns
until the probability of choosing the
optimal action for the unknown pe converges to 1.
When the algorithm converges it has learnt the unknown pe of the channel
by the optimal choice of the quantizer factor.
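The per-frame learning loop can be simulated with a short sketch. It is an illustration only: the real decoder feedback is replaced by a random draw from assumed penalty probabilities, and the names `learn_quantizer`, `penalty_probs`, `threshold`, and `max_frames` are hypothetical.

```python
import random

def lri_update(probs, chosen, beta, a):
    """One linear reward-inaction step (beta: 0 = reward, 1 = penalty)."""
    if beta == 1:
        return list(probs)
    return [p + a * (1.0 - p) if j == chosen else (1.0 - a) * p
            for j, p in enumerate(probs)]

def learn_quantizer(penalty_probs, a, rng, threshold=0.99, max_frames=10000):
    """Run the loop until one action probability reaches `threshold`.

    penalty_probs : assumed penalty probability of each quantizer; in the
                    real system these are unknown and set by the channel
    Returns the final probabilities and the number of frames used.
    """
    k = len(penalty_probs)
    probs = [1.0 / k] * k  # pe is unknown at the start, so begin uniform
    for frame in range(1, max_frames + 1):
        # Choose a quantizer at random according to p(n).
        chosen = rng.choices(range(k), weights=probs)[0]
        # Stand-in for the PSNR-based feedback from the decoder.
        beta = 1 if rng.random() < penalty_probs[chosen] else 0
        # Compute the probabilities for frame n+1.
        probs = lri_update(probs, chosen, beta, a)
        if max(probs) >= threshold:
            return probs, frame
    return probs, max_frames
```

For example, `learn_quantizer([0.6, 0.1, 0.7], 0.3, random.Random(0))` simulates a channel for which the second quantizer is the best match. The index of the largest returned probability is then the automaton's estimate of both the optimal quantizer and, through the one-to-one correspondence, pe.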
Of course, this method could result in the first few frames being
sub-optimally quantized. This is the cost incurred in on-line channel
estimation.
However, the learning delay can be controlled through the value
of the reward parameter: a suitable reward parameter can be chosen
depending on the error tolerance.
A higher reward parameter results in faster
convergence of the LRI learning.
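To make the effect of the reward parameter concrete, the following sketch counts how many rewarded selections the optimal quantizer needs before its probability is within a tolerance of 1. It is a best-case figure under stated assumptions: the other quantizers are never rewarded, the initial probabilities are uniform, and the names `rewards_to_converge` and `eps` are hypothetical.

```python
def rewards_to_converge(a, num_actions=3, eps=0.01):
    """Count the rewards needed for one action to reach probability 1 - eps.

    Each reward applies the LRI step p <- p + a * (1 - p) to the chosen
    action. Rewards given to other actions would slow this down, so the
    result is a lower bound on the number of frames.
    """
    if not 0.0 < a <= 1.0:
        raise ValueError("reward parameter must satisfy 0 < a <= 1")
    p = 1.0 / num_actions
    count = 0
    while 1.0 - p > eps:
        p += a * (1.0 - p)
        count += 1
    return count
```

With three quantizers and a tolerance of 0.01, a = 0.3 needs 12 rewarded selections and a = 0.5 needs 7, which illustrates why a higher reward parameter shortens the learning delay.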
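Since p(n) converges with probability 1 (w.p. 1), the average rate of
convergence of LRI learning can be characterized as in [21]: after every
frame n, the distance of p(n) from its limit decreases, on average, by a
fixed factor. If T is the number of frames taken for this distance to
decrease to d times its value, then T is the ratio of log d to the
logarithm of that factor. For very slowly changing channels,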
If the channel bit error probability changes, the encoder will receive
a series of penalties. The learning process can then be started again,
which brings the encoder back in sync with the channel.
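One way to act on such a series of penalties is sketched below. The text does not specify how long a series must be before learning restarts, so the run length `max_penalty_run` is an assumption, as are the class name `ChangeDetector` and its methods.

```python
class ChangeDetector:
    """Signal a restart after a run of consecutive unfavorable responses."""

    def __init__(self, num_actions, max_penalty_run=5):
        self.num_actions = num_actions
        # Assumed trigger: this many penalties in a row means pe has changed.
        self.max_penalty_run = max_penalty_run
        self.run = 0

    def observe(self, beta):
        """Record a response (0 or 1); return True when learning should restart."""
        self.run = self.run + 1 if beta == 1 else 0
        if self.run >= self.max_penalty_run:
            self.run = 0
            return True
        return False

    def initial_probs(self):
        """Uniform action probabilities used when learning starts again."""
        return [1.0 / self.num_actions] * self.num_actions
```

When `observe` returns True, the encoder would replace its current action probabilities with `initial_probs()` and resume the LRI updates from there.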
Figure 10: Missamerica sequence, a=0.3
Figure 11: Susie sequence, a=0.3
Figure 12: Claire sequence, a=0.5
Figure 13: Reconstructed Missamerica frames for pe=10^-1
Figure 14: Reconstructed Missamerica frames for pe=10^-2