Enhancing AMR-WB+ with a Conversational Mode
Abstract: With the arrival of 2.5G and 3G mobile networks, demands on the quality of mobile services have increased. The AMR-WB+ hybrid speech and generic audio codec standardized by the 3GPP has been shown to perform very well on both speech and generic audio signals, but it has a longer algorithmic delay than other low bit rate speech coders. This makes the codec in its current form unsuitable for conversational use.
This thesis investigates ways in which the algorithmic delay in the AMR-WB+ codec can be lowered, and specifically targets the elimination of large codec frame sizes. These large frame sizes stem from the variable frame size transform-based coding method used in the internal TCX coder. Two new low delay transform modes are therefore presented, implemented and evaluated in this thesis. These are based on perceptually warped filter banks that aim to rectify the usual shortcomings of shorter transforms. The new encoding modes are evaluated in simulations and in a small formal listening test. The listening test shows that, when the bit rate is not increased to compensate, the quality of the new encoding modes is noticeably inferior to that of the original codec on signals with a broad spectrum.