Re-synthesis of instrumental sounds with Machine Learning and a Frequency Modulation synthesizer

University essay from KTH, School of Electrical Engineering and Computer Science (EECS)

Abstract: Frequency Modulation (FM) based re-synthesis, i.e. finding the parameter values that make an FM synthesizer produce an output sound as similar as possible to a given target sound, is a challenging problem. The search space of a commercial synthesizer is often non-linear and high-dimensional. Moreover, some crucial decisions need to be made, such as choosing the number of modulating oscillators or the algorithm by which they modulate each other. In this work we propose to use Machine Learning (ML) to learn a mapping from a target sound to the parameter space of an FM synthesizer. In order to investigate the ability of ML to implicitly learn to make these key decisions in FM, we design and compare two approaches: first a concurrent approach, where all parameter values are predicted at once by a single model, and second a sequential approach, where the prediction is done by a mix of classifiers and regressors. We evaluate both approaches on their ability to reproduce instrumental sound samples, using a dataset of 2255 samples from over 700 instruments at three different pitches, under four different distance metrics. The results indicate that both approaches perform similarly at predicting parameters which reconstruct the frequency magnitude spectrum and the envelope of a target sound. However, the results also suggest that the sequential model is better at predicting the parameters which reconstruct the temporal evolution of the frequency magnitude spectra. It is concluded that, although the sequential model outperforms the concurrent one, it is likely possible for a model to make these key decisions implicitly, without explicitly designed subproblems.
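To make the re-synthesis problem concrete, the sketch below shows a minimal two-operator FM voice and a spectral distance between a target and a candidate sound, with a toy grid search over one parameter (the modulation index). This is an illustrative assumption, not the thesis's method: the work uses a commercial synthesizer with many more parameters and ML models instead of grid search, and the function names (`fm_synth`, `spectral_distance`) are hypothetical.

```python
import numpy as np

def fm_synth(f_carrier, f_mod, mod_index, amp, duration=1.0, sr=16000):
    """Minimal two-operator FM voice: one modulator driving one carrier.

    Hypothetical simplification; a commercial FM synthesizer has several
    operators and multiple routing algorithms between them.
    """
    t = np.arange(int(duration * sr)) / sr
    return amp * np.sin(2 * np.pi * f_carrier * t
                        + mod_index * np.sin(2 * np.pi * f_mod * t))

def spectral_distance(x, y, n_fft=2048):
    """L2 distance between normalized magnitude spectra.

    One of many possible distance metrics for comparing a target sound
    with a re-synthesized candidate.
    """
    X = np.abs(np.fft.rfft(x, n_fft))
    Y = np.abs(np.fft.rfft(y, n_fft))
    X = X / (np.linalg.norm(X) + 1e-12)
    Y = Y / (np.linalg.norm(Y) + 1e-12)
    return float(np.linalg.norm(X - Y))

# Toy re-synthesis: recover the modulation index of a known "target" sound
# by brute-force search. ML replaces this search with a learned mapping
# from the target sound to the parameter values.
target = fm_synth(440.0, 220.0, mod_index=2.0, amp=0.8)
candidates = np.linspace(0.0, 4.0, 41)
best = min(candidates,
           key=lambda m: spectral_distance(target, fm_synth(440.0, 220.0, m, 0.8)))
print(best)
```

Even in this one-dimensional toy, the distance surface is non-convex in the modulation index, which hints at why the full parameter space of a real synthesizer is hard to search directly.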