AC3 Encoder

The AC3 encoder is also block-based, so a buffer is again used to split the PCM input signal into blocks of samples. A high-pass filter with a cutoff at 3 Hz removes any DC offset. The subwoofer channel is low-pass filtered at 120 Hz.
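The input stage can be sketched as follows. This is a minimal illustration, not the filter used by a real AC3 encoder: the first-order DC blocker, the 48 kHz sample rate, the block size, and the function names `dc_block` and `split_into_blocks` are all assumptions made for the example.

```python
import math

def dc_block(samples, sample_rate=48000.0, cutoff_hz=3.0):
    """First-order high-pass (DC blocker); a stand-in for the 3 Hz filter."""
    # Pole radius close to 1 puts the corner frequency near cutoff_hz.
    r = 1.0 - 2.0 * math.pi * cutoff_hz / sample_rate
    out = []
    prev_x = 0.0
    prev_y = 0.0
    for x in samples:
        # y[n] = x[n] - x[n-1] + r * y[n-1]
        y = x - prev_x + r * prev_y
        out.append(y)
        prev_x, prev_y = x, y
    return out

def split_into_blocks(samples, block_size=256):
    """Cut the sample stream into consecutive blocks, zero-padding the last."""
    blocks = []
    for start in range(0, len(samples), block_size):
        block = list(samples[start:start + block_size])
        block.extend([0.0] * (block_size - len(block)))
        blocks.append(block)
    return blocks
```

Feeding a constant signal through `dc_block` gives an output that decays toward zero, which is the behaviour expected of a DC-removal filter.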

The filtered full-bandwidth input signals are analyzed with a high-frequency bandpass filter to detect transients. This information is used to adjust the block size of the TDAC (time-domain aliasing cancellation) filterbank. The adjustment confines the quantization noise associated with a transient to a small temporal region around it and so avoids temporal unmasking.
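A transient detector of this kind could look roughly like the sketch below. It is not the detector defined for AC3: the first difference used as a crude high-frequency filter, the four sub-segments, the energy ratio threshold, and the name `detect_transient` are assumptions chosen to keep the example short.

```python
def detect_transient(block, segments=4, ratio_threshold=10.0):
    """Return True if the high-frequency energy jumps sharply inside the block."""
    # Crude high-frequency emphasis: first difference of the samples.
    hf = [block[i] - block[i - 1] for i in range(1, len(block))]
    seg_len = len(hf) // segments
    # Energy of each sub-segment of the block.
    energies = [
        sum(v * v for v in hf[k * seg_len:(k + 1) * seg_len])
        for k in range(segments)
    ]
    floor = 1e-12  # keeps the comparison meaningful after digital silence
    for prev, cur in zip(energies, energies[1:]):
        if cur > ratio_threshold * (prev + floor):
            return True  # sudden rise in energy: treat as a transient
    return False
```

A block flagged in this way would be coded with the short block size, while all other blocks keep the long one.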

The filterbank that follows splits the signal into sub-bands. Before that, a transform converts the signal from the time domain to the frequency domain, using either an FFT (fast Fourier transform) or an MDCT (modified discrete cosine transform). Processing the signal in the frequency domain has the advantage that the spectrum changes slowly, so it can be transmitted less often than a time-domain signal.

The transform operates on input blocks of 512 or 256 samples. Long blocks of 512 samples are used when the time signal changes slowly. Short blocks of 256 samples are used when the signal changes quickly, that is, when transients occur. The block length actually used must be transmitted to the decoder as side information (transient flags). Consecutive blocks overlap by 50 % to avoid discontinuities at the block boundaries and to minimize coding errors.
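The effect of the 50 % overlap can be illustrated with a small MDCT round trip. The sketch below is a direct, slow implementation for demonstration only. It uses a sine window as an assumption (AC3 defines its own window shape), and the names `sine_window`, `mdct`, `imdct` and `roundtrip` are hypothetical.

```python
import math

def sine_window(length):
    """Sine window; satisfies w[i]^2 + w[i + length/2]^2 = 1."""
    return [math.sin(math.pi * (i + 0.5) / length) for i in range(length)]

def mdct(block):
    """Windowed MDCT: 2N time samples in, N coefficients out."""
    two_n = len(block)
    n = two_n // 2
    w = sine_window(two_n)
    return [
        sum(w[i] * block[i]
            * math.cos(math.pi / n * (i + 0.5 + n / 2) * (k + 0.5))
            for i in range(two_n))
        for k in range(n)
    ]

def imdct(coeffs):
    """Windowed inverse MDCT: N coefficients in, 2N aliased samples out."""
    n = len(coeffs)
    two_n = 2 * n
    w = sine_window(two_n)
    return [
        w[i] * (2.0 / n)
        * sum(coeffs[k]
              * math.cos(math.pi / n * (i + 0.5 + n / 2) * (k + 0.5))
              for k in range(n))
        for i in range(two_n)
    ]

def roundtrip(signal, n):
    """Transform overlapping blocks (hop n) and overlap-add the results."""
    pad = (-len(signal)) % n
    padded = [0.0] * n + list(signal) + [0.0] * (pad + n)
    out = [0.0] * len(padded)
    for start in range(0, len(padded) - 2 * n + 1, n):
        block = imdct(mdct(padded[start:start + 2 * n]))
        for i, v in enumerate(block):
            out[start + i] += v  # aliasing cancels between neighbours
    return out[n:n + len(signal)]
```

Each block of 2N samples yields only N coefficients, so the overlap does not increase the amount of data. As long as the coefficients are not quantized, the aliasing introduced by one block is cancelled by its neighbours and `roundtrip` returns the input signal.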

The transform coefficients are converted to a floating-point representation so that the mantissa and the exponent can be processed separately.

The advantage of the floating-point representation is that leading zeros are dropped: the exponent records how many there were, so the dynamic range of the signal is not limited during processing.
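The split into exponent and mantissa can be sketched as follows, assuming coefficients normalized to the range -1 to 1. The exponent limit of 24 and the function names are assumptions made for the illustration, not taken from the text above.

```python
def to_exponent_mantissa(coeff, max_exponent=24):
    """Split a coefficient in (-1, 1) into (exponent, mantissa).

    The exponent counts the leading zeros, i.e. how often the value can be
    doubled before its magnitude reaches at least 0.5.
    """
    if coeff == 0.0:
        return max_exponent, 0.0
    exponent = 0
    mantissa = coeff
    while abs(mantissa) < 0.5 and exponent < max_exponent:
        mantissa *= 2.0  # shift out one leading zero
        exponent += 1
    return exponent, mantissa

def from_exponent_mantissa(exponent, mantissa):
    """Rebuild the coefficient: mantissa scaled down by 2^exponent."""
    return mantissa * 2.0 ** -exponent
```

For example, a coefficient of 0.1 is doubled three times, which gives an exponent of 3 and a mantissa of 0.8. Small coefficients therefore keep a full-scale mantissa, and only the mantissa needs to be quantized later.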

As mentioned before, modern coding methods exploit the frequency-dependent sensitivity of the human ear to reduce the amount of data. High frequencies in particular can be reduced. One reason is that high frequencies are not perceived directly: the ear mainly registers the envelope, which arises from the differences between the frequencies. Using that knowledge, the high-frequency part of the signal is split into a carrier and an envelope. The perceptible envelope carries the more important information and is therefore quantized with many bits, whereas the carrier is quantized with few bits. Carriers can be selectively combined across the sound channels. This is called coupling. The coupling is described by coupling coefficients, which have to be transmitted to the decoder so that it can identify the combined carriers.
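The idea of coupling can be sketched on frequency-domain coefficients. The example below is a strong simplification and not the AC3 procedure: averaging the channels into one shared carrier, the band size, and the names `couple` and `decouple` are assumptions. The per-band factors play the role of the coupling coefficients mentioned above.

```python
import math

def couple(channels, start_bin, band_size=4):
    """Merge the bins above start_bin of all channels into one shared carrier.

    Returns the shared carrier and, per channel, one scale factor per band.
    """
    num_bins = len(channels[0])
    carrier = [
        sum(ch[b] for ch in channels) / len(channels)
        for b in range(start_bin, num_bins)
    ]
    coords = []
    for ch in channels:
        ch_coords = []
        for lo in range(0, len(carrier), band_size):
            hi = lo + band_size
            e_carrier = sum(v * v for v in carrier[lo:hi])
            e_channel = sum(v * v for v in ch[start_bin + lo:start_bin + hi])
            # Ratio of channel level to carrier level in this band.
            ch_coords.append(
                math.sqrt(e_channel / e_carrier) if e_carrier > 0 else 0.0
            )
        coords.append(ch_coords)
    return carrier, coords

def decouple(carrier, ch_coords, band_size=4):
    """Decoder side: rescale the shared carrier for one channel."""
    return [v * ch_coords[i // band_size] for i, v in enumerate(carrier)]
```

Only one carrier plus a few factors per channel are transmitted for the coupled frequency range, instead of a full set of coefficients for every channel.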

The global bit allocation optimizes how the available bits are spent when the mantissas of the filterbank coefficients are quantized. The exponents serve as scale factors for this optimization. The ideal allocation is calculated globally across all channels using a psychoacoustic model. Channels with a high information density are quantized finely, which costs many bits, while channels with a low information density are quantized coarsely with few bits. This makes it possible to keep the overall data rate almost constant. Information about the bit allocation strategy is then transmitted to the decoder.
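A shared bit pool of this kind can be illustrated with a simple greedy loop. This is not the allocation routine of AC3, only a sketch under stated assumptions: signal and masking levels are given in dB per band (for all channels in one list), each additional mantissa bit is taken to lower the quantization noise by about 6 dB, and the name `allocate_bits` is hypothetical.

```python
import heapq

def allocate_bits(signal_db, mask_db, bit_pool, max_bits=16):
    """Spend bits where the quantization noise exceeds the mask the most."""
    bits = [0] * len(signal_db)
    # With zero bits the noise is roughly at signal level, so the
    # noise-to-mask ratio (NMR) starts at signal minus mask.
    heap = [(-(s - m), i) for i, (s, m) in enumerate(zip(signal_db, mask_db))]
    heapq.heapify(heap)
    while bit_pool > 0 and heap:
        neg_nmr, band = heapq.heappop(heap)
        if -neg_nmr <= 0:
            break  # noise is already below the mask in every band
        bits[band] += 1
        bit_pool -= 1
        if bits[band] < max_bits:
            # One more bit lowers the noise by about 6.02 dB.
            heapq.heappush(heap, (neg_nmr + 6.02, band))
    return bits
```

Bands whose level is already below the masking threshold receive no bits at all, and the bits saved there are available to more demanding bands or channels.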

A multiplexer packs the coded audio data and the side information, together with synchronization and error detection data, into a defined AC3 bit stream.
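The multiplexing step can be sketched as packing and unpacking a frame. The layout below is invented for illustration and is much simpler than a real AC3 frame: apart from the sync word 0x0B77, the two length fields, the single trailing CRC, the generator polynomial, and the function names are assumptions.

```python
import struct

SYNC_WORD = 0x0B77  # 16-bit sync word that starts an AC3 frame

def crc16(data, poly=0x8005):
    """Bitwise CRC-16; the polynomial is an assumption for this sketch."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ poly) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc

def pack_frame(side_info, audio_data):
    """Sync word, two length fields, side info, audio data, CRC."""
    body = struct.pack(">HH", len(side_info), len(audio_data))
    body += side_info + audio_data
    return struct.pack(">H", SYNC_WORD) + body + struct.pack(">H", crc16(body))

def unpack_frame(frame):
    """Check sync word and CRC, then return (side_info, audio_data)."""
    (sync,) = struct.unpack(">H", frame[:2])
    if sync != SYNC_WORD:
        raise ValueError("sync word not found")
    body = frame[2:-2]
    (crc,) = struct.unpack(">H", frame[-2:])
    if crc16(body) != crc:
        raise ValueError("CRC mismatch")
    side_len, audio_len = struct.unpack(">HH", body[:4])
    side_info = body[4:4 + side_len]
    audio_data = body[4 + side_len:4 + side_len + audio_len]
    return side_info, audio_data
```

The decoder uses the sync word to find the start of a frame and the checksum to reject frames that were damaged in transmission.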
