MusicOSet - An Enhanced Music Dataset for Music Data Mining
This repository stores an open, enhanced dataset of musical elements (songs, albums, and artists) suitable for music data mining.
The attractive features of MusicOSet include:
- Integration and centralization of different musical data sources
- Calculation of popularity scores and classification of musical elements as hits or non-hits, covering 1962 to 2018
- Enriched metadata for songs, artists, and albums from the US popular music industry
- Availability of acoustic and lyrical resources
- Unrestricted access in two formats: SQL database and compressed .csv files
Data | # Records |
---|---|
Songs | 20,405 |
Artists | 11,518 |
Albums | 26,522 |
Lyrics | 19,664 |
Acoustic Features | 20,405 |
Genres | 1,561 |
MusicOSet is available in a public repository in two different formats:
- Relational Database
  - musicoset.sql: SQL file that creates the relational database and then loads all the data into its tables; intended for a MySQL installation (233 MB)
- .csv Tables
  - musicoset_metadata.zip: Contains textual and numeric information about songs, artists, and albums (5.73 MB)
  - musicoset_popularity.zip: Contains nine tables of musical popularity information (11.8 MB)
  - musicoset_songfeatures.zip: Contains lyrics and acoustic fingerprints of the songs collected (52 MB)
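
As a minimal sketch of how the compressed .csv tables might be read without unpacking them, the Python snippet below lists the .csv files inside one of the archives and loads one of them into memory. The helper names `list_csv_members` and `read_csv_member` are hypothetical, and the tab delimiter and UTF-8 encoding are assumptions rather than documented properties of the files; inspect the archive contents and adjust both if needed.

```python
import csv
import io
import zipfile


def list_csv_members(archive_path):
    """Return the names of all .csv files stored inside a zip archive."""
    with zipfile.ZipFile(archive_path) as archive:
        return [
            name for name in archive.namelist()
            if name.lower().endswith(".csv")
        ]


def read_csv_member(archive_path, member, delimiter="\t"):
    """Read one .csv file from the archive into a list of dicts.

    The default delimiter (tab) and the UTF-8 encoding are assumptions;
    pass delimiter="," if the files turn out to be comma-separated.
    """
    with zipfile.ZipFile(archive_path) as archive:
        with archive.open(member) as raw:
            # newline="" lets the csv module handle line endings itself.
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            return list(csv.DictReader(text, delimiter=delimiter))


# Example usage (assumes musicoset_metadata.zip was downloaded locally):
#   members = list_csv_members("musicoset_metadata.zip")
#   rows = read_csv_member("musicoset_metadata.zip", members[0])
#   print(len(rows), "rows; columns:", list(rows[0].keys()))
```

Reading directly from the archive avoids leaving extracted copies on disk; for heavier analysis, the same file object could instead be handed to a dataframe library.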
- Metadata Analysis: Collaboration profiles and their impact on musical success, ACM/SAC, Cyprus, 2019.
- Hit Song Science: Causality analysis between collaboration profiles and musical success, Technical Report, Brazil, 2019.
@InProceedings{silva2019musicoset,
title = {{MusicOSet: An Enhanced Open Dataset for Music Data Mining}},
author = {Silva, Mariana O. and Rocha, La\'{\i}s M. and Moro, Mirella M.},
booktitle = {{XXXIV} Simp{\'{o}}sio Brasileiro de Banco de Dados: Dataset Showcase Workshop, {SBBD} 2019 Companion},
address = {Fortaleza, CE, Brazil},
year = {2019}
}
- The dataset is meant for research purposes.
The work is supported by CNPq, Brazil.