Code for classifying music genres using a Bidirectional LSTM.
- Extracted 13 MFCCs per frame and saved the results as JSON
- Built the model with 32 hidden neurons and a 4-neuron output layer (one per genre)
- Optimized with Adam (default learning rate) and CrossEntropyLoss
- Trained the model for 50 epochs
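The pieces above (13 MFCC inputs, 32 hidden units, 4 output neurons, Adam with its default learning rate, CrossEntropyLoss) can be sketched in PyTorch roughly as follows. This is a minimal illustration, not the repository's actual code: the class name, the choice of `batch_first`, the number of MFCC frames, and using the last time step for classification are all assumptions.

```python
import torch
import torch.nn as nn

class GenreClassifier(nn.Module):
    """Bidirectional LSTM over a sequence of MFCC frames (sketch)."""
    def __init__(self, n_mfcc=13, hidden_size=32, n_genres=4):
        super().__init__()
        self.lstm = nn.LSTM(n_mfcc, hidden_size,
                            batch_first=True, bidirectional=True)
        # A bidirectional LSTM doubles the feature size fed to the head.
        self.fc = nn.Linear(hidden_size * 2, n_genres)

    def forward(self, x):
        out, _ = self.lstm(x)          # (batch, time, 2 * hidden_size)
        return self.fc(out[:, -1, :])  # logits taken from the last time step

model = GenreClassifier()
criterion = nn.CrossEntropyLoss()                  # as stated above
optimizer = torch.optim.Adam(model.parameters())   # default lr = 1e-3

# A dummy batch: 8 clips, 130 MFCC frames each, 13 coefficients per frame.
logits = model(torch.randn(8, 130, 13))
```

In a training loop, `criterion(logits, labels)` would be computed per batch, followed by `loss.backward()` and `optimizer.step()`, repeated for the 50 epochs mentioned above.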
The extracted features are stored as JSON rather than as PNG images, because converting spectrogram/MFCC values to image pixels quantizes them and can lose or distort information; spectrograms cannot be represented as images without such loss.
No suitable audio dataset was publicly available, so the data had to be created manually. There are about 400 music samples, each 30 seconds long. Songs were found on YouTube and the audio was trimmed with Audacity (open-source software). 75% of the data is used for training and the remaining 25% for testing.
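A 75/25 split of roughly 400 samples can be sketched with the standard library alone; how the repository actually performs the split (library, seed, stratification) is not stated, so treat this as an assumption.

```python
import random

random.seed(0)  # assumed: some fixed seed for a reproducible split

samples = list(range(400))  # stand-ins for the ~400 30-second clips
random.shuffle(samples)

split = int(0.75 * len(samples))     # 75% for training
train, test = samples[:split], samples[split:]
```

With 400 samples this yields 300 training and 100 test clips, with no clip appearing in both sets.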
git clone -b test https://github.com/xettrisomeman/Nepali-Music-Genre-Classification
cd Nepali-Music-Genre-Classification
pip install -r requirements.txt
cd genreclassify
python predict.py --help
- Mel-Frequency Cepstral Coefficients Explained
- Preparing a Dataset for Music Genre Classification
- Poorjam, Amir Hossein (2018). Re: "Why we take only 12-13 MFCC coefficients in feature extraction?" ResearchGate. Retrieved from https://www.researchgate.net/post/Why_we_take_only_12-13_MFCC_coefficients_in_feature_extraction/5b0fd2b7cbdfd4b7b60e9431/citation/download