Publications – Details

Model-Based Speech Enhancement Exploiting Temporal and Spectral Dependencies

Author: Thomas Esch
Editor: Peter Vary
Type: Dissertation
Series: Aachener Beiträge zu Digitalen Nachrichtensystemen (ABDN)
Number: 32
School: IND, RWTH Aachen
Publisher: Verlag Mainz in Aachen
Date: Apr. 2012
ISBN: 3-861-30359-0
URL: http://darwin.bth.rwth-aachen.de/opus3/volltexte/2...
Language: English

Abstract

Mobile telephony has become an integral part of everyday life for billions of people around the world. Exchanging information via speech is now possible from almost anywhere at any time. However, even though the vision of permanent reachability and connectivity has by now been realized nearly worldwide, there is still room for improvement in the transmission of speech under noisy conditions. The performance of any speech communication system may deteriorate significantly when the speech signal is disturbed by ambient interference such as traffic noise or office noise, possibly leading to poor speech quality and intelligibility.

This thesis presents a novel model-based speech enhancement system that performs single-channel noise reduction on degraded speech signals. In contrast to state-of-the-art noise suppression techniques, the developed algorithms explicitly exploit temporal and spectral dependencies of speech and noise signals. To account for the temporal correlation, a modified Kalman filter is derived in the frequency domain. Its main novelties are complex-valued prediction of speech and noise DFT coefficients and SNR-dependent MMSE estimators that are adapted to measured statistics of the input signal. To incorporate the spectral dependencies of speech signals, a new wideband speech enhancement system is presented which utilizes techniques known from artificial bandwidth extension. The developed method re-uses the processed and enhanced signal from lower frequencies to improve the results of a conventional noise suppression technique at higher frequencies. In addition, this work proposes effective countermeasures to reduce the occurrence of musical noise and provides a novel solution for the suppression of rapidly time-varying harmonic noise.
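The idea of exploiting temporal correlation between successive DFT frames can be illustrated with a minimal sketch. It is not the modified Kalman filter derived in the thesis: it assumes a first-order complex autoregressive model per frequency bin, a known prediction coefficient, and known process and noise variances, and it omits noise prediction and the SNR-dependent MMSE estimators. All function and parameter names (`simulate_bin`, `kalman_track_bin`, `mean_sq_error`, `a`, `process_var`, `noise_var`) are hypothetical and chosen for illustration only.

```python
import cmath
import math
import random


def simulate_bin(num_frames, a, process_var, noise_var, seed=0):
    """Generate one synthetic DFT bin over time (illustrative assumption).

    Assumed model, not taken from the thesis:
        S[t] = a * S[t-1] + W[t]   (clean speech coefficient, complex AR(1))
        Y[t] = S[t] + N[t]         (noisy observation)
    """
    rng = random.Random(seed)

    def complex_gauss(var):
        # Circular complex Gaussian sample with total variance `var`.
        sigma = math.sqrt(var / 2.0)
        return complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))

    clean, noisy = [], []
    s = 0j
    for _ in range(num_frames):
        s = a * s + complex_gauss(process_var)
        clean.append(s)
        noisy.append(s + complex_gauss(noise_var))
    return clean, noisy


def kalman_track_bin(observations, a, process_var, noise_var):
    """Scalar complex Kalman filter for one DFT bin across frames."""
    s_est = 0j
    # Start from the stationary variance of the assumed AR(1) process.
    p_est = process_var / max(1.0 - abs(a) ** 2, 1e-12)
    estimates = []
    for y in observations:
        # Prediction step: complex-valued prediction from the previous frame.
        s_pred = a * s_est
        p_pred = abs(a) ** 2 * p_est + process_var
        # Update step: weight the prediction error by the Kalman gain.
        gain = p_pred / (p_pred + noise_var)
        s_est = s_pred + gain * (y - s_pred)
        p_est = (1.0 - gain) * p_pred
        estimates.append(s_est)
    return estimates


def mean_sq_error(estimates, reference):
    """Mean squared magnitude of the difference between two sequences."""
    return sum(abs(e - r) ** 2 for e, r in zip(estimates, reference)) / len(reference)


if __name__ == "__main__":
    # Hypothetical parameters: strong inter-frame correlation, 0 dB SNR.
    a = 0.95 * cmath.exp(0.3j)
    q = 1.0 - abs(a) ** 2
    clean, noisy = simulate_bin(2000, a, q, 1.0, seed=1)
    enhanced = kalman_track_bin(noisy, a, q, 1.0)
    print("MSE noisy:   ", mean_sq_error(noisy, clean))
    print("MSE enhanced:", mean_sq_error(enhanced, clean))
```

In a real system the prediction coefficient and the variances would have to be estimated from the noisy signal for every frame and frequency bin; here they are supplied directly so that the prediction and update steps stay visible.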

All speech enhancement techniques developed in this thesis are thoroughly evaluated by means of instrumental measurements and auditory judgments. The proposed algorithms achieve distinctly better results than state-of-the-art approaches with respect to noise attenuation and speech distortion. The novel model-based system is not restricted to mobile phones; it can also be used to improve the speech quality of hands-free devices, conferencing systems or digital hearing aids.

Download of Publication

Copyright Notice

This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. In most cases, these works may not be reposted without the explicit permission of the copyright holder.

The following notice applies to all IEEE publications:
© IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.

File

esch12a.pdf 5926 K