Publications – Details

Speech-Codebook Based Soft Voice Activity Detection

Authors: Florian Heese, Markus Niermann, and Peter Vary
Book Title: Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)
Venue: Brisbane, QLD, Australia
Event Date: 19-24 April 2015
Organization: IEEE
Location: Piscataway, NJ, USA
Date: Apr. 2015
Language: English

Abstract

A novel noise-robust soft Voice Activity Detector (VAD) operating in the short-time Fourier domain is presented. A speech energy gain is obtained by frame-wise processing of a noisy speech signal with a speech codebook algorithm. This gain can be used for robust voice detection. A speaker-independent speech codebook, consisting of spectral envelopes, is created in the training process. While applying the algorithm, the codebook is adapted in every frame to the current speaker by combining the harmonic pitch structure of the current noisy speech frame with the codebook entries. Soft VAD values ranging from zero to one are calculated by post-processing the speech gain, which is obtained using gain-shape vector quantization. A binary VAD decision is obtained by applying a threshold. The proposed method does not rely on a priori knowledge of the noise and is robust with respect to highly non-stationary noise and adverse SNR conditions. In addition, it is possible to trade off the detection rate against the false-alarm rate by varying a threshold without increasing the total number of misdetections. Compared to state-of-the-art VAD systems, the proposed method is characterized by better detection rates at significantly lower false-alarm rates.
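
The chain described in the abstract (gain-shape matching against a codebook, mapping the gain to a soft value between zero and one, then thresholding) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy codebook, the function names, the default threshold, and in particular the mapping from gain to soft value (the fraction of frame energy explained by the best-matching shape) are assumptions chosen for illustration. The per-frame pitch adaptation of the codebook and the paper's actual post-processing are not modeled.

```python
import math

# Hypothetical toy codebook of spectral-envelope shapes (not from the paper).
TOY_CODEBOOK = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 1.0, 0.0],
]


def gain_shape_match(frame, codebook):
    """Gain-shape VQ: pick the shape with the largest normalized
    correlation to the frame; that correlation is the optimal gain
    for the unit-norm version of the shape."""
    best_index, best_corr = 0, -math.inf
    for index, shape in enumerate(codebook):
        norm = math.sqrt(sum(s * s for s in shape))
        if norm == 0.0:
            continue  # skip degenerate all-zero shapes
        corr = sum(f * s for f, s in zip(frame, shape)) / norm
        if corr > best_corr:
            best_index, best_corr = index, corr
    gain = max(best_corr, 0.0)  # a spectral gain is taken as non-negative
    return best_index, gain


def soft_vad(frame, codebook):
    """Assumed post-processing: share of the frame energy captured by the
    best-matching shape. By the Cauchy-Schwarz inequality this lies in
    [0, 1]; the clamp only guards against floating-point rounding."""
    energy = sum(f * f for f in frame)
    if energy == 0.0:
        return 0.0  # silent frame: no speech evidence
    _, gain = gain_shape_match(frame, codebook)
    return min(1.0, gain * gain / energy)


def binary_vad(soft_value, threshold=0.5):
    """Hard decision from a soft VAD value; 0.5 is an arbitrary default."""
    return soft_value >= threshold
```

In this sketch, lowering `threshold` marks more frames as speech, which raises both the detection rate and the false-alarm rate; raising it does the opposite. Whether the total number of misdetections stays constant, as the abstract reports for the proposed method, depends on the actual soft values and is not something this toy mapping demonstrates.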

Download of Publication

Copyright Notice

This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. In most cases, these works may not be reposted without the explicit permission of the copyright holder.

The following notice applies to all IEEE publications:
© IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.

File

heese15a.pdf 332 K