Room Impulse Response Interpolation

University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)

Author: Daníel Thor Wilcox; [2023]

Keywords: Virtual Acoustics; Machine Learning; Signal Processing; Room Impulse Response

Abstract: In Virtual Reality (VR) systems, the incorporation of acoustics allows for the generation of audio-visual stimuli, facilitating applications in engineering, architecture, and design. The goal of virtual acoustics is to create a realistic sound field in continuous space. Realistic virtual acoustic environments can be produced with wave-based acoustic simulations. However, rendering a sound field from a dense grid of room impulse responses (RIRs) in real time is slow and memory-intensive. Conventionally, a more sparsely spaced grid of RIRs is used, and as a workaround the nearest RIRs are linearly interpolated so that users can listen at an arbitrary location. However, linear interpolation reduces the quality of the sound field because it does not produce natural-sounding RIRs. The aim of this thesis is therefore to determine whether a machine learning approach can interpolate better than linear interpolation. We present a novel neural network-based method for interpolating between RIRs. The networks were trained using RIRs from a wave-based simulation of a single 3D room and developed through a series of experiments. The experimental process was performed in three distinct stages. Firstly, we explored various representations of the RIRs: unprocessed RIRs, the short-time Fourier transform (STFT) of the RIRs, and an encoding of the STFT of the RIRs produced by an autoencoder. Secondly, we examined several neural network architectures: multi-layer perceptron, residual neural network, autoencoder, and U-Net. Additionally, we experimented with training the networks in a Generative Adversarial Network (GAN) setting. Thirdly, we experimented with different sizes of the best-performing architecture. Results show that an STFT representation of the RIRs combined with a residual neural network architecture yielded the best results. Furthermore, we were able to outperform the established linear interpolation baseline.
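
The linear interpolation baseline mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify the exact scheme, so this assumes the simplest case: two equal-length RIRs at neighbouring grid points on a line, blended sample by sample with distance-based weights. The function name `interpolate_rir` and the toy data are hypothetical and are not taken from the thesis.

```python
def interpolate_rir(rir_a, rir_b, pos_a, pos_b, pos_listener):
    """Sample-wise linear blend of two RIRs at 1D grid positions pos_a and pos_b.

    Assumption: a simple two-point scheme; the thesis may use a different one.
    """
    if len(rir_a) != len(rir_b):
        raise ValueError("RIRs must have the same length")
    if pos_a == pos_b:
        raise ValueError("grid positions must differ")
    # Weight of rir_b grows linearly as the listener moves from pos_a to pos_b.
    w = (pos_listener - pos_a) / (pos_b - pos_a)
    if not 0.0 <= w <= 1.0:
        raise ValueError("listener must lie between the two grid positions")
    return [(1.0 - w) * a + w * b for a, b in zip(rir_a, rir_b)]


# Toy RIRs: a single direct-sound peak that arrives later at the far position.
rir_near = [0.0, 1.0, 0.0, 0.0, 0.0]
rir_far = [0.0, 0.0, 0.0, 0.8, 0.0]

midpoint = interpolate_rir(rir_near, rir_far, 0.0, 1.0, 0.5)
print(midpoint)
```

In this toy case the blended response contains two half-amplitude peaks at the original delays rather than one peak at an intermediate delay, which is one way to see why sample-wise blending can sound unnatural.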
