Melody generator with RNNs


The goal of this project is to learn how to apply machine learning techniques to produce music. I trained and deployed two RNN models with different configurations on a dataset of pop and electronic melodies. The piano melodies were extracted from songs in MIDI format and converted into note sequences using one-hot encoding. The trained models can generate monophonic melodies from a primer melody. The coolest part of the project is interacting with the models through Magenta's MIDI interface in Ableton, a setup that lets you generate AI music from melodies played in real time.

conda environment:
# Create a new environment for Magenta with Python 3.6.x as interpreter
conda create --name magenta python=3.6

# Then activate it
conda activate magenta

# Then install Magenta 2.1.2 and the dependencies
pip install magenta==2.1.2 visual_midi tables

About the Data


In this project I'll use "The Lakh MIDI Dataset v0.1" and matched content from "The Million Song Dataset."

I'll fetch each song's genre using the Last.fm API.

  • LMD-matched - A subset of 45,129 files from LMD-full which have been matched to entries in the Million Song Dataset.

  • Match scores - A JSON file that lists the match confidence score for every match in LMD-matched and LMD-aligned.

  • The dataset is not provided in this repo.
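The match-scores file can be used to keep a single MIDI file per Million Song Dataset track. The following is a minimal sketch, assuming the file maps each MSD track ID to a dictionary of MIDI MD5 hashes and confidence scores. That layout, the IDs, the scores, and the helper name `best_matches` are all assumptions for illustration and are not taken from this repo.

```python
import json


def best_matches(scores):
    """Return {msd_id: (midi_md5, score)}, keeping the best MIDI per track.

    Assumes `scores` has the shape {msd_id: {midi_md5: confidence}}.
    """
    best = {}
    for msd_id, candidates in scores.items():
        if not candidates:
            continue  # skip tracks without any matched MIDI file
        midi_md5 = max(candidates, key=candidates.get)
        best[msd_id] = (midi_md5, candidates[midi_md5])
    return best


# Made-up IDs and scores standing in for the contents of the real JSON file.
example_scores = json.loads(
    '{"TRAAA": {"md5_a": 0.61, "md5_b": 0.78}, "TRBBB": {"md5_c": 0.55}}'
)
selected = best_matches(example_scores)
print(selected)
```

Keeping only the highest-scoring match is one possible policy; a confidence threshold could be applied instead if low-quality matches are a concern.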

Data Visualization

  • Instrument Class of the entire dataset


  • Only piano tracks from Pop and Electronic songs are extracted

  • Distribution of piano track lengths for Pop and Electronic songs


Class count for Pop and Electronic


Training Magenta's Melody RNN models

  • Melody RNN (basic configuration)

    • This configuration acts as a baseline for melody generation with an LSTM model. It uses basic one-hot encoding to represent extracted melodies as input to the LSTM. For training, all sequence examples are transposed to the MIDI pitch range [48,84] and outputs will also be in this range.
  • Melody RNN (attention configuration)

    • Attention allows the model to more easily access past information without having to store that information in the RNN cell's state. This allows the model to more easily learn longer term dependencies, and results in melodies that have longer arching themes.
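The one-hot melody representation described for the basic configuration can be sketched as follows. This is an illustration and not Magenta's own code: it assumes two special events ("no event" and "note off") alongside the pitches, and it treats the upper bound of the pitch range as exclusive, which gives 36 pitch classes plus 2 special classes. The constant and function names are hypothetical.

```python
MIN_NOTE = 48    # lowest MIDI pitch kept after transposition
MAX_NOTE = 84    # upper bound, treated as exclusive in this sketch (assumption)
NO_EVENT = -2    # assumed special event: nothing changes at this step
NOTE_OFF = -1    # assumed special event: stop the current note
NUM_SPECIAL = 2
NUM_CLASSES = MAX_NOTE - MIN_NOTE + NUM_SPECIAL  # 36 pitches + 2 special events


def encode_event(event):
    """Map a melody event to a class index in [0, NUM_CLASSES)."""
    if event in (NO_EVENT, NOTE_OFF):
        return event + NUM_SPECIAL  # NO_EVENT -> 0, NOTE_OFF -> 1
    if not MIN_NOTE <= event < MAX_NOTE:
        raise ValueError(f"pitch {event} is outside [{MIN_NOTE}, {MAX_NOTE})")
    return event - MIN_NOTE + NUM_SPECIAL


def one_hot(event):
    """Return a list of NUM_CLASSES zeros with a single 1 at the event's index."""
    vec = [0] * NUM_CLASSES
    vec[encode_event(event)] = 1
    return vec


melody = [60, NO_EVENT, 62, NOTE_OFF]  # middle C, hold, D, release
encoded = [one_hot(e) for e in melody]
print([v.index(1) for v in encoded])  # prints [14, 0, 16, 1]
```

Each time step of a melody becomes one such vector, and the LSTM is trained to predict the class index of the next step.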

Training and evaluation data

  • Melody RNN Baseline (Loss)

  • Melody RNN Baseline (Accuracy)


  • Melody RNN w/ Attention config (Loss)

  • Melody RNN w/ Attention config (Accuracy)


Generate melodies by priming the trained models

Primer MIDI "Uptown Funk"

  • I'll generate melodies by priming the baseline and attention models with 2.5 seconds of the main melody of Uptown Funk.
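Cutting a primer down to its first 2.5 seconds can be sketched as below. This is a minimal illustration, assuming notes are available as `(pitch, start, end)` tuples in seconds; the helper name `trim_primer` and the note values are hypothetical and are not the actual Uptown Funk melody or the code used in this repo.

```python
PRIMER_SECONDS = 2.5


def trim_primer(notes, max_seconds=PRIMER_SECONDS):
    """Keep notes that start before max_seconds and clip their ends to it.

    `notes` is a list of (pitch, start, end) tuples with times in seconds.
    """
    primer = []
    for pitch, start, end in notes:
        if start >= max_seconds:
            continue  # note begins after the primer window
        primer.append((pitch, start, min(end, max_seconds)))
    return primer


# Made-up notes for illustration only.
notes = [(62, 0.0, 0.5), (62, 0.5, 1.0), (65, 2.0, 3.0), (67, 3.0, 3.5)]
primer = trim_primer(notes)
print(primer)  # prints [(62, 0.0, 0.5), (62, 0.5, 1.0), (65, 2.0, 2.5)]
```

The trimmed notes would then be written back to a MIDI file and passed to the model as the primer.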


Base model generated melody

  • Here you can see the primer MIDI and the continued sequence.

Melody RNN w/ Attention config

  • Here you can see the primer MIDI and how the attention model generated a longer arching theme from it.


Check out the apps directory to see how I applied the two models.