RWTH Aachen University
Institute for Communication Systems and Data Processing

Publications – Details

Reverberation-Based Post-Processing for Improving Speech Intelligibility

Authors:
Magnus Schäfer, Marco Jeub, Bastian Sauert, and Peter Vary
Book Title:
International Congress on Acoustics (ICA)
Venue:
Sydney, Australia
Event Date:
23-27 August 2010
Organization:
Australian Acoustical Society
Location:
Sydney
Date:
Aug. 2010
ISBN:
978-0-64654-052-8
URL:
http://www.acoustics.asn.au/conference_proceedings...
Language:
English

Abstract

When evaluating new algorithms for speech and audio coding or enhancement (e.g., noise reduction, echo control, or artificial bandwidth extension), one usually listens to audio examples on headphones rather than on any loudspeaker setup that might be available. The reasoning behind this choice is that headphone reproduction makes it easier to identify even small signal processing artifacts, which would be at least partly concealed by room reflections in a listening room.

Usually, these artifacts due to coding or signal enhancement cannot be completely removed, only minimized subject to the constraints of the application. Examples are a limited data rate in speech and audio coding, or the trade-off between noise attenuation and speech distortion in noise reduction algorithms.

Since headphones make these artifacts more noticeable precisely because they remove room reflections, this contribution presents a postfilter that mimics the properties of listening rooms to conceal residual errors and artifacts. The postfilter is a finite impulse response (FIR) filter designed from measured or simulated room impulse responses.
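
The idea of an FIR postfilter derived from a room impulse response can be illustrated with a minimal sketch. It is not the authors' design: the exponentially decaying noise model for the simulated impulse response, the function names `synthetic_rir` and `reverb_postfilter`, the 0.3 s reverberation time, and the random stand-in for decoded speech are all assumptions made for illustration. The 16 kHz sampling rate matches wideband speech as used by AMR-WB.

```python
import numpy as np

def synthetic_rir(rt60_s, fs_hz, length_s, seed=0):
    """Hypothetical simulated room impulse response (assumed model):
    white noise shaped by an exponential decay envelope."""
    rng = np.random.default_rng(seed)
    n = int(round(length_s * fs_hz))
    t = np.arange(n) / fs_hz
    # Amplitude envelope that falls by 60 dB after rt60_s seconds.
    envelope = 10.0 ** (-3.0 * t / rt60_s)
    h = rng.standard_normal(n) * envelope
    h[0] = 1.0  # direct sound path
    # Normalize to unit energy so the filter does not change the overall level much.
    return h / np.sqrt(np.sum(h ** 2))

def reverb_postfilter(decoded, h):
    """Apply the FIR postfilter by linear convolution,
    truncated to the length of the input signal."""
    return np.convolve(decoded, h)[: len(decoded)]

fs_hz = 16000  # wideband sampling rate
# Stand-in for one second of decoded speech (assumption: random noise placeholder).
decoded = np.random.default_rng(1).standard_normal(fs_hz)
h = synthetic_rir(rt60_s=0.3, fs_hz=fs_hz, length_s=0.3)
processed = reverb_postfilter(decoded, h)
```

In practice, `decoded` would be the output of the speech decoder and `h` could be replaced by a measured room impulse response; the filter coefficients are simply the samples of that response.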

The main focus of this contribution is the evaluation of different types of impulse responses for reverberation-based postfiltering of speech signals transmitted by speech codecs at low data rates. In an exemplary study based on the Adaptive Multi-Rate Wideband (AMR-WB) speech codec, the proposed post-processing increases the Speech Transmission Index (STI), which indicates better intelligibility. Impulse responses optimized to maximize the STI are given for the different data rates of AMR-WB.

Download Publication

Copyright Notice

This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. In most cases, these works may not be reposted without the explicit permission of the copyright holder.

The following notice applies to all IEEE publications:
© IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.

File

schaefer10.pdf 131 K