Microphone Array Wiener Beamformer and Speaker Localization With emphasis on WOLA Filter Bank

University essay from Blekinge Tekniska Högskola/Sektionen för ingenjörsvetenskap

Author: Hemanth Yerramsetty; [2012]

Keywords: RIR; Beamforming; filterbank; srp-phat;

Abstract: This thesis describes the design and implementation of a speech enhancement system that applies 4-channel microphone array beamforming and speech enhancement algorithms to a speech signal in a multiple-source environment. To estimate the Direction of Arrival (DOA) of the source accurately, a suitable microphone array system with an efficient localization algorithm is required. The goal of the system is to improve the quality of the primary speech signal. A filter bank is a signal processing tool that facilitates manipulation of signals in the frequency domain. The Weighted Overlap-Add (WOLA) structure is an efficient way to implement a uniform multi-channel filter bank, and it is generally used in applications that demand high-quality filters in terms of stop-band rejection and filter shape. Beamformers steer an array of microphones towards a desired look direction by exploiting signal information rather than by physically moving the array. This research examines a Wiener beamformer; the input signals are first split into frequency bands so that Wiener beamforming techniques can be applied. Many algorithms have been developed for estimating the number of sources and locating the DOA, such as Bayesian algorithms, Kalman filtering, Generalized Cross Correlation (GCC) and Steered Response Power (SRP). Of these, the SRP algorithm, with its steered beamforming approach to speaker localization, is the more robust choice for a microphone array. The Phase Transform (PHAT) has gained considerable attention in recent research for its robust response in low-noise but reverberant environments. Combining the two as SRP-PHAT therefore yields a localizer that is robust in reverberant environments. Experiments were carried out on recordings of human talkers, and the algorithm gives an accurate DOA for the dominant speaker. In addition, listener opinion testing was performed.
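To make the PHAT weighting mentioned in the abstract concrete, the following is a minimal sketch of GCC-PHAT delay estimation between two channels, the pairwise quantity that SRP-PHAT sums over microphone pairs. It is not the thesis implementation: the function names (`dft`, `idft`, `gcc_phat_delay`), the frame length, and the synthetic circularly delayed test signal are all assumptions made for illustration, and a naive DFT stands in for the FFT so that the example needs only the standard library.

```python
import cmath
import random


def dft(x):
    """Naive O(N^2) discrete Fourier transform (adequate for short frames)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(X):
    """Naive inverse DFT matching dft() above."""
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


def gcc_phat_delay(x1, x2, max_lag):
    """Estimate the lag (in samples) by which x2 trails x1 (hypothetical helper)."""
    n = len(x1)
    X1, X2 = dft(x1), dft(x2)
    # Cross-power spectrum of the two channels.
    cross = [b * a.conjugate() for a, b in zip(X1, X2)]
    # PHAT weighting: discard magnitude, keep only the phase of each bin.
    phat = [c / (abs(c) + 1e-12) for c in cross]
    r = idft(phat)
    # The correlation is circular, so negative lags wrap to the end of r.
    lags = range(-max_lag, max_lag + 1)
    return max(lags, key=lambda lag: r[lag % n].real)


# Synthetic test signal (an assumption): white noise and a circularly delayed copy.
random.seed(0)
n, true_delay = 64, 5
x1 = [random.gauss(0.0, 1.0) for _ in range(n)]
x2 = [x1[(t - true_delay) % n] for t in range(n)]
estimated = gcc_phat_delay(x1, x2, max_lag=10)
```

Because the PHAT step keeps only phase, the correlation peak stays sharp regardless of the spectral shape of the source, which is one reason the weighting is favoured in reverberant rooms. With a known array geometry, the estimated lag divided by the sampling rate gives a time difference of arrival that can be mapped to a DOA.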
