SML – Music genre – François Coste

Music genre identification

Useful resources

A notebook showing how to preprocess a musical signal in Python
A lecture with a review on feature choice/preprocessing for music…

Datasets

GTZAN Genre Collection
Dataset used in the well-known genre classification paper “Musical genre classification of audio signals” by G. Tzanetakis and P. Cook, IEEE Transactions on Speech and Audio Processing, 2002.
The dataset consists of 10 genres: Blues, Classical, Country, Disco, Hiphop, Jazz, Metal, Pop, Reggae, Rock. Each genre contains 100 tracks. Tracks are 30 seconds long, and all are 22050 Hz mono 16-bit audio files in .wav format.
Useful links:
- The GTZAN dataset: Its contents, its faults, their effects on evaluation, and its future use, Bob L. Sturm, arXiv preprint arXiv:1306.1461, 2013 (reference [48] in https://arxiv.org/abs/1703.09179: “… The Gtzan genre dataset has been extremely popular, although some flaws have been found [48] …”).
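The format stated above (22050 Hz, mono, 16-bit, 30 seconds) can be checked on a downloaded file with Python's standard `wave` module. The following is a minimal sketch, not part of the dataset's tooling: the helper names, the `min_duration` tolerance, and the synthetic silent clip used for the demo are all assumptions made for illustration.

```python
import os
import tempfile
import wave

# Format stated for GTZAN tracks: 22050 Hz, mono, 16-bit.
EXPECTED_RATE = 22050
EXPECTED_CHANNELS = 1
EXPECTED_SAMPLE_WIDTH = 2  # bytes per sample, i.e. 16 bits


def describe_wav(path):
    """Return (sample_rate, channels, sample_width_bytes, duration_seconds)."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        channels = wav.getnchannels()
        width = wav.getsampwidth()
        duration = wav.getnframes() / rate
    return rate, channels, width, duration


def matches_gtzan_format(path, min_duration=29.0):
    """Check a file against the stated format.

    min_duration is an assumed tolerance, not a value from the dataset.
    """
    rate, channels, width, duration = describe_wav(path)
    return (
        rate == EXPECTED_RATE
        and channels == EXPECTED_CHANNELS
        and width == EXPECTED_SAMPLE_WIDTH
        and duration >= min_duration
    )


def write_silent_wav(path, seconds=30, rate=EXPECTED_RATE):
    """Write a silent clip in the expected format, only to exercise the checker."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(EXPECTED_CHANNELS)
        wav.setsampwidth(EXPECTED_SAMPLE_WIDTH)
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * (rate * seconds))


# Demo on a synthetic clip (hypothetical file name, no real GTZAN data needed).
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.wav")
    write_silent_wav(demo_path)
    demo_info = describe_wav(demo_path)
    demo_ok = matches_gtzan_format(demo_path)

print(demo_info, demo_ok)
```

To use it on real data, call `matches_gtzan_format` on each .wav path of the extracted archive and list the files that fail before extracting features.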
FMA: A Dataset For Music Analysis
Dataset: fma_small.zip: 8,000 tracks of 30s, 8 balanced genres (GTZAN-like) (7.2 GiB)
The dataset is a dump of the Free Music Archive (FMA), an interactive library of high-quality, legal audio downloads.
Note by authors: This is a pre-publication release. As such, this repository as well as the paper and data are subject to change. Stay tuned!
Used by: WWW 2018 Challenge: Learning to Recognize Musical Genre from Audio on the Web, by EPFL
Reference:
- FMA: A Dataset For Music Analysis, M. Defferrard, K. Benzi, P. Vandergheynst, X. Bresson
  Paper abstract: We introduce the Free Music Archive (FMA), an open and easily accessible dataset suitable for evaluating several tasks in MIR, a field concerned with browsing, searching, and organizing large music collections. The community’s growing interest in feature and end-to-end learning is however restrained by the limited availability of large audio datasets. The FMA aims to overcome this hurdle by providing 917 GiB and 343 days of Creative Commons-licensed audio from 106,574 tracks from 16,341 artists and 14,854 albums, arranged in a hierarchical taxonomy of 161 genres. It provides full-length and high-quality audio, pre-computed features, together with track- and user-level metadata, tags, and free-form text such as biographies. We here describe the dataset and how it was created, propose a train/validation/test split and three subsets, discuss some suitable MIR tasks, and evaluate some baselines for genre recognition. Code, data, and usage examples are available at https://github.com/mdeff/fma.

Audio Set
Especially its music genre labels. Cited by “Music Genre Classification using Machine Learning Techniques”, which used it with 7 labels:

1. Pop Music (8100)
2. Rock Music (7990)
3. Hip Hop Music (6958)
4. Techno (6885)
5. Rhythm Blues (4247)
6. Vocal (3363)
7. Reggae Music (2997)
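The 7 labels above are far from balanced, which matters when reading accuracy figures. The sketch below assumes the numbers in parentheses are clip counts per label. It computes each label's share, the accuracy of always predicting the largest class, and inverse-frequency class weights using the common "balanced" heuristic total / (n_labels * count). The variable names are illustrative and nothing here is taken from the cited paper's code.

```python
# Counts copied from the list above (assumed to be clips per label).
label_counts = {
    "Pop Music": 8100,
    "Rock Music": 7990,
    "Hip Hop Music": 6958,
    "Techno": 6885,
    "Rhythm Blues": 4247,
    "Vocal": 3363,
    "Reggae Music": 2997,
}

total = sum(label_counts.values())
n_labels = len(label_counts)

# Fraction of the data carried by each label.
shares = {label: count / total for label, count in label_counts.items()}

# Accuracy obtained by always predicting the most frequent label.
majority_baseline = max(label_counts.values()) / total

# "Balanced" inverse-frequency weights: rare labels get weights above 1.
class_weights = {
    label: total / (n_labels * count) for label, count in label_counts.items()
}

for label in label_counts:
    print(f"{label:15s} share={shares[label]:.3f} weight={class_weights[label]:.3f}")
print(f"total={total} majority baseline={majority_baseline:.3f}")
```

Any classifier trained on these labels should be compared against the majority baseline, and weights of this kind can be passed to a loss function or a sampler to reduce the bias toward Pop and Rock.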

Music Information Retrieval Evaluation eXchange (MIREX 2019 and earlier)
Many tasks at: https://www.music-ir.org/mirex/wiki/2019:Audio_Classification_(Train/Test)_Tasks
(22.05 kHz mono wav clips)
- Audio Classical Composer Identification
- Audio US Pop Music Genre Classification
- Audio Latin Music Genre Classification
- Audio Mood Classification
- Audio K-POP Mood Classification
- Audio K-POP Genre Classification