OpenAI releases MuseNet, a deep neural network for music generation
OpenAI has released MuseNet, a deep neural network that can generate four-minute musical compositions with 10 different instruments, combining styles ranging from country to Mozart to Lady Gaga. “MuseNet was not explicitly programmed with our understanding of music, but instead discovered patterns of harmony, rhythm, and style by learning to predict the next token in hundreds of thousands of MIDI files.”
MuseNet uses the same general-purpose unsupervised technology as GPT-2: a large-scale transformer model trained to predict the next token in a sequence. OpenAI assembled MIDI material from many sources as training data for MuseNet, including files from the ClassicalArchives and BitMidi websites, the MAESTRO dataset, and pop, African, Indian, and Arabic-style collections found online.
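Conceptually, generation with such a model is an autoregressive loop: the model repeatedly predicts a probability distribution over the next event token and samples from it. Below is a minimal Python sketch of that loop; the tiny token vocabulary and the `toy_model` stand-in are illustrative assumptions, not MuseNet's actual model or vocabulary.

```python
import random

# Toy vocabulary of MIDI-style event tokens; the real vocabulary is
# far larger, this is only an assumed stand-in for illustration.
VOCAB = ["piano:C4:vel80", "piano:E4:vel80", "piano:G4:vel80", "wait:120ms"]

def toy_model(context):
    """Stand-in for a trained transformer: returns a probability
    distribution over the next token given the tokens so far."""
    # A real model would attend over `context`; uniform probabilities
    # keep this sketch self-contained and runnable.
    return [1.0 / len(VOCAB)] * len(VOCAB)

def generate(prompt, steps=16):
    tokens = list(prompt)
    for _ in range(steps):
        probs = toy_model(tokens)
        # Sample the next event token and append it; this loop is the
        # core of autoregressive next-token generation.
        tokens.append(random.choices(VOCAB, weights=probs, k=1)[0])
    return tokens

print(generate(["piano:C4:vel80"]))
```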
OpenAI experimented with several approaches and settled on an expressive, concise encoding that combines information such as pitch, volume, and instrument into a single token (see the sketch after the list below). During training, OpenAI:
- Transposed the notes, raising and lowering the pitches (later in training, the amount of transposition was reduced so that generations stay within the individual instrument ranges).
- Augmented the volumes, turning the overall volume of the various samples up or down.
- Augmented the timing (when using the absolute-time-in-seconds encoding), effectively slowing down or speeding up the pieces slightly.
- Used mixup on the token embedding space.
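To make the single-token encoding concrete, here is a hedged Python sketch of one way to pack instrument, pitch, and quantized volume into one integer token, along with a transposition augmentation over such tokens (mirroring the first bullet above). The field sizes and function names are assumptions for illustration, not OpenAI's published scheme.

```python
# Hypothetical single-token encoding: pack instrument, pitch, and a
# quantized volume bucket into one integer ID.
N_PITCHES = 128   # standard MIDI pitch range
N_VOLUMES = 32    # quantized velocity buckets (assumed bucket count)

def encode(instrument, pitch, volume_bucket):
    # Mixed-radix packing: each (instrument, pitch, volume) triple
    # maps to a unique token ID.
    return (instrument * N_PITCHES + pitch) * N_VOLUMES + volume_bucket

def decode(token):
    volume_bucket = token % N_VOLUMES
    pitch = (token // N_VOLUMES) % N_PITCHES
    instrument = token // (N_VOLUMES * N_PITCHES)
    return instrument, pitch, volume_bucket

def transpose(tokens, semitones):
    # Decode each event, shift the pitch, clamp to the MIDI range,
    # and re-encode: a token-level transposition augmentation.
    out = []
    for t in tokens:
        inst, pitch, vol = decode(t)
        out.append(encode(inst, min(127, max(0, pitch + semitones)), vol))
    return out

note = encode(instrument=0, pitch=60, volume_bucket=20)  # middle C
assert decode(note) == (0, 60, 20)
print(transpose([note], +2))  # raise by a whole tone
```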
OpenAI has made early results public: users can specify a composer or style in MuseNet and select a famous piece as a starting point, and the website will generate a new piece of music. The MuseNet prototype will be available until May 12th, after which it will be taken offline and adjusted based on user feedback.