Far-Field Wiener Beamforming and Source Localization in Frequency Domain

University essay from Blekinge Tekniska Högskola/Sektionen för ingenjörsvetenskap

Abstract: In present-day conference environments where video recording is required, a set of cameras operated by a person is needed to track the active speaker during the discussion. To automate this procedure, different acoustic and visual tracking methods have been developed. In this thesis work, a robust speaker tracking system is developed using two methods, Steered Response Power PHase Alignment Transform (SRP-PHAT) and Steered Response Kurtosis PHase Alignment Transform (SRK-PHAT), which compute the likelihood of each source position from the generalized cross-correlation estimates between each pair of microphones. In hands-free speech applications in a smart room environment, the speech source is located at a distance from the microphones, so noise and reverberation strongly affect the estimate of the source location. The accuracy of the SRP-PHAT and SRK-PHAT methods in estimating the source location is limited by the time resolution of the PHAT-weighted function. In this thesis work, SRP-PHAT and SRK-PHAT have been implemented using a 2-element microphone array and a 4-element microphone array. To compare the two methods in detail, their performance has been analyzed for 64, 128 and 256 subbands in a WOLA filter bank. The Time Differences of Arrival (TDOAs) and Directions of Arrival (DOAs) estimated by SRP-PHAT and SRK-PHAT are compared with the original values to determine the best method for estimating the speech source location. The mean estimation error and standard deviation are calculated to determine the accuracy of the estimated TDOAs. In this thesis work, Wiener beamforming is also implemented for removing noise and reverberation in a room environment using a 2-element microphone array. The performance of the method is analyzed using Signal-to-Noise Ratio (SNR) and Perceptual Evaluation of Speech Quality (PESQ).
To improve the results obtained, a dereverberation procedure is also included in the Wiener beamforming method, and the improvement in PESQ values is discussed in chapter 4. The performance of the Wiener beamforming method is tested for brown noise, babble noise, fan noise and white noise.
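
As a rough illustration of the generalized cross-correlation step that the abstract describes for a single microphone pair, the following is a minimal sketch, not the thesis implementation. It assumes NumPy, a full-band FFT rather than the WOLA filter bank used in the thesis, a far-field source, and a 2-element array. The function names gcc_phat and tdoa_to_doa, the microphone spacing, the sample rate and the speed of sound are all hypothetical choices made for this example.

```python
import numpy as np


def gcc_phat(x1, x2, fs, max_tau=None):
    """Estimate the delay of x2 relative to x1 (in seconds) using GCC-PHAT.

    A positive result means x2 lags x1. Illustrative sketch only.
    """
    n = len(x1) + len(x2)  # zero-pad so the circular correlation does not wrap
    X1 = np.fft.rfft(x1, n=n)
    X2 = np.fft.rfft(x2, n=n)
    cross = X2 * np.conj(X1)          # cross-power spectrum
    cross /= np.abs(cross) + 1e-12    # PHAT weighting: keep phase, drop magnitude
    cc = np.fft.irfft(cross, n=n)     # generalized cross-correlation

    max_shift = n // 2
    if max_tau is not None:
        # Restrict the search to physically possible lags (at least one sample).
        max_shift = max(1, min(int(fs * max_tau), max_shift))

    # Reorder so that index 0 corresponds to lag -max_shift.
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    lag = int(np.argmax(cc)) - max_shift
    return lag / fs


def tdoa_to_doa(tau, mic_distance, c=343.0):
    """Convert a TDOA (seconds) to a far-field DOA in degrees from broadside."""
    ratio = np.clip(c * tau / mic_distance, -1.0, 1.0)
    return float(np.degrees(np.arcsin(ratio)))


if __name__ == "__main__":
    fs = 16000            # assumed sample rate in Hz
    mic_distance = 0.2    # assumed microphone spacing in metres
    delay = 5             # simulated delay in samples

    rng = np.random.default_rng(0)
    s = rng.standard_normal(4096)
    x1 = s
    x2 = np.concatenate((np.zeros(delay), s[:-delay]))  # delayed copy of x1

    tau = gcc_phat(x1, x2, fs, max_tau=mic_distance / 343.0)
    print("estimated TDOA (s):", tau)
    print("estimated DOA (deg):", tdoa_to_doa(tau, mic_distance))
```

The lag search is limited to the largest delay the assumed spacing allows, and the delay is resolved only to whole samples, which is one way to see why the abstract says accuracy is limited by the time resolution of the PHAT-weighted function.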
