Data-Driven Procedural Audio: Procedural Engine Sounds Using Neural Audio Synthesis

University essay from KTH/Datavetenskap

Author: Anton Lundberg; [2020]


Abstract: The currently dominant approach for rendering audio content in interactive media, such as video games and virtual reality, involves playback of static audio files. This approach is inflexible and requires management of large quantities of audio data. An alternative approach is procedural audio, where sound models are used to generate audio in real time from live inputs. While providing many advantages, procedural audio has yet to find widespread use in commercial productions, partly because the audio produced by many of the proposed models does not meet industry standards. This thesis investigates how procedural audio can be performed using data-driven methods. We do this by specifically investigating how to generate the sound of car engines using neural audio synthesis. Building on a recently published method that integrates digital signal processing with deep learning, called Differentiable Digital Signal Processing (DDSP), our method obtains sound models by training deep neural networks to reconstruct recorded audio examples from interpretable latent features. We propose a method for incorporating engine cycle phase information, as well as a differentiable transient synthesizer. Our results illustrate that DDSP can be used for procedural engine sounds; however, further work is needed before our models can generate engine sounds without undesired artifacts and before they can be used in live real-time applications. We argue that our approach can be useful for procedural audio in more general contexts, and discuss how our method can be applied to other sound sources.
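To give a concrete picture of the kind of synthesis the abstract refers to, the sketch below shows the harmonic component of a DDSP-style synthesizer in plain NumPy: frame-level controls (a fundamental frequency, e.g. derived from engine RPM, and per-harmonic amplitudes, which in the thesis would be predicted by a neural network) are upsampled and rendered as a sum of sinusoids. This is a minimal illustration under assumed names and shapes, not the thesis implementation; the engine cycle phase conditioning and the differentiable transient synthesizer proposed in the work are not shown.

```python
# Minimal sketch of DDSP-style harmonic synthesis (illustrative only).
# The frame-level controls would normally come from a trained network.
import numpy as np

def harmonic_synth(f0_frames, harm_amp_frames, sample_rate=16000, hop=64):
    """Render audio by summing sinusoids at integer multiples of f0.

    f0_frames:       (n_frames,)             fundamental frequency in Hz
    harm_amp_frames: (n_frames, n_harmonics) amplitude of each harmonic
    """
    n_frames, n_harmonics = harm_amp_frames.shape
    n_samples = n_frames * hop

    # Upsample frame-level controls to audio rate by linear interpolation.
    t_frames = np.arange(n_frames) * hop
    t_audio = np.arange(n_samples)
    f0 = np.interp(t_audio, t_frames, f0_frames)
    amps = np.stack([np.interp(t_audio, t_frames, harm_amp_frames[:, k])
                     for k in range(n_harmonics)], axis=1)

    # Integrate instantaneous frequency to obtain the phase of the fundamental.
    phase = 2.0 * np.pi * np.cumsum(f0 / sample_rate)

    # Silence harmonics above the Nyquist frequency to avoid aliasing.
    harmonic_numbers = np.arange(1, n_harmonics + 1)
    above_nyquist = np.outer(f0, harmonic_numbers) >= sample_rate / 2
    amps = np.where(above_nyquist, 0.0, amps)

    # Sum the harmonic sinusoids into a single audio signal.
    return np.sum(amps * np.sin(np.outer(phase, harmonic_numbers)), axis=1)

# Toy usage: a hypothetical "engine" sweeping its firing frequency from 20 to 60 Hz.
n_frames, n_harmonics = 400, 30
f0_frames = np.linspace(20.0, 60.0, n_frames)          # e.g. derived from RPM
harm_amp_frames = np.full((n_frames, n_harmonics), 1.0 / n_harmonics)
audio = harmonic_synth(f0_frames, harm_amp_frames)
```

Because every step is differentiable, the same computation can be expressed in a deep learning framework and the network producing the controls can be trained end-to-end against a reconstruction loss on recorded audio, which is the core idea of DDSP that the thesis builds on.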
