The Gunnlod Dataset: Engineering a dataset for multi-modal music generation

University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)

Abstract: This report details the creation of a new dataset, named the Gunnlod dataset after the Norse giantess who guarded the mead of poetry, for use in machine learning research applied to music creation, particularly multi-modal music in the MIDI format of symbolic music representation. The dataset is based on a subset of approximately four fifths of the Lakh MIDI dataset. Each of the selected files has been processed to create an array representation of the file, intended for easy use with machine learning models, as well as a taggram: a matrix of values specifying the degree to which the music exhibits certain traits at different points in time. These traits include genres (such as "rock", "pop" and "country"), instrumentation (such as "flute", "vocals" and "guitar") and other more general descriptors (such as "catchy", "quiet" and "weird"). The trait values are generated by a preexisting machine learning model, circumventing the need for intense human labour. The dataset is intended to enable future researchers to create tools that aid in human creative tasks, such as a virtual "composer's assistant" capable of offering suggestions for melodies or drum beats based on the user's requests. The code used to create Gunnlod can be found at https://gits-15.sys.kth.se/joeli/midiPipe. The report also includes an ethical analysis of the dataset, rooted in the seven guidelines for ethical AI outlined in the framework Ethics guidelines for trustworthy AI commissioned by the European Commission. It concludes that the creation of Gunnlod raises concerns regarding Privacy and data governance and Diversity, non-discrimination and fairness, which are to some degree alleviated by its Transparency, and it suggests ways to conduct future research ethically.
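To make the taggram idea concrete, the following is a minimal sketch of how such a matrix could be held and queried. It is not taken from the Gunnlod code. The tag names are the examples given in the abstract, but the values, the row-per-time-segment layout, the [0, 1] value range, and the helper names `top_tags` and `mean_activation` are all assumptions made for illustration.

```python
# Hypothetical illustration only: tag names come from the abstract,
# every numeric value below is invented.
TAGS = ["rock", "pop", "country", "flute", "vocals",
        "guitar", "catchy", "quiet", "weird"]

# Assumed layout: one row per time segment, one column per tag.
# Assumed range: each value lies in [0, 1].
taggram = [
    [0.80, 0.10, 0.05, 0.00, 0.30, 0.90, 0.40, 0.05, 0.02],
    [0.75, 0.15, 0.05, 0.00, 0.60, 0.85, 0.55, 0.05, 0.02],
    [0.20, 0.10, 0.05, 0.70, 0.10, 0.20, 0.10, 0.80, 0.30],
]


def top_tags(taggram, tags, segment, k=3):
    """Return the k tags with the highest values in one time segment."""
    row = taggram[segment]
    ranked = sorted(zip(tags, row), key=lambda pair: pair[1], reverse=True)
    return [name for name, _ in ranked[:k]]


def mean_activation(taggram, tags, tag):
    """Average one tag's value over all time segments."""
    col = tags.index(tag)
    return sum(row[col] for row in taggram) / len(taggram)


if __name__ == "__main__":
    print(top_tags(taggram, TAGS, segment=0))
    print(top_tags(taggram, TAGS, segment=2))
    print(round(mean_activation(taggram, TAGS, "guitar"), 2))
```

Reading along a row shows which traits dominate at a given moment, while reading down a column shows how a single trait evolves over the piece. That second view is the kind of signal a tag-conditioned generation model could use.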
