⭐ Table of supported datasets ⭐
This table is provided as a guide for users to select appropriate datasets. The list of annotations omits some metadata for brevity, documenting only each dataset’s primary annotations. The number of tracks indicates how many unique “tracks” a dataset contains, but it may not reflect the dataset’s actual size or diversity: tracks vary greatly in length (from a few seconds to a few minutes) and may be musically homogeneous. For specific information about the contents of each dataset, click the link provided in the “Module” column.
“Downloadable” possible values:
- ✅ : Freely downloadable
- 🔑 : Available upon request
- 📺 : YouTube links only
- ❌ : Not available
Module | Name | Downloadable? | Annotation Types | Tracks |
---|---|---|---|---|
beatles | The Beatles Dataset | | | 180 |
beatport_key | Beatport EDM key | | | 1486 |
dali | DALI | | | 5358 |
groove_midi | Groove MIDI Dataset | | | 1150 |
gtzan_genre | Gtzan-Genre | | | 1000 |
giantsteps_tempo | Giantsteps EDM tempo Dataset | | | 664 |
giantsteps_key | Giantsteps EDM key | | | 500 |
guitarset | GuitarSet | | | 360 |
ikala | iKala | | | 252 |
maestro | MAESTRO | | | 1282 |
medley_solos_db | Medley-solos-DB | | | 21571 |
medleydb_melody | MedleyDB Melody Subset | | | 108 |
medleydb_pitch | MedleyDB Pitch Tracking Subset | | | 103 |
orchset | Orchset | | | 64 |
rwc_classical | RWC Classical | | | 50 |
rwc_jazz | RWC Jazz | | | 50 |
rwc_popular | RWC Pop | | | 100 |
salami | Salami | | | 1359 |
tinysol | TinySOL | | | 2913 |
Annotation Type Descriptions
The table above lists annotation types as a guide for choosing appropriate datasets, but annotation types are difficult to categorize generically: definitions vary, and a type’s meaning can change depending on the kind of music it describes. Below is a rough guide to the types in this table, but we strongly recommend reading the dataset-specific documentation to ensure the data is what you expect.
Beats
Musical beats, typically encoded as a sequence of timestamps and corresponding beat positions. This implicitly includes downbeat information (the beginning of a musical measure).
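As a minimal sketch of this encoding, the snippet below represents beats as parallel lists of timestamps and in-measure positions, and recovers downbeats as the beats at position 1. The values are illustrative, not taken from any real dataset.

```python
# Illustrative beat annotation: parallel lists of timestamps (seconds)
# and beat positions within the measure (1 = downbeat), here in 4/4 time.
beat_times = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
beat_positions = [1, 2, 3, 4, 1, 2, 3, 4]

# Downbeats are the beats whose position within the measure is 1.
downbeat_times = [t for t, p in zip(beat_times, beat_positions) if p == 1]
print(downbeat_times)  # [0.5, 2.5]
```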
Chords
Musical chords, e.g. as might be played on a guitar. Typically encoded as a sequence of labeled events, where each event has a start time, an end time, and a label. The label taxonomy varies per dataset, but labels typically encode a chord’s root and its quality, e.g. A:m7 for “A minor 7”.
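A hypothetical parser for labels in this root:quality form is sketched below. The conventions assumed here (a colon separator, "N" for “no chord”, bare roots meaning major) are common in chord annotation formats but, as noted above, the actual taxonomy varies per dataset.

```python
# Hypothetical parser for chord labels of the form "root:quality",
# e.g. "A:m7". Assumes "N" means "no chord" and a bare root (e.g. "C")
# means a major triad -- real dataset taxonomies may differ.
def parse_chord(label):
    if label == "N":
        return None, None
    root, _, quality = label.partition(":")
    return root, quality or "maj"

print(parse_chord("A:m7"))  # ('A', 'm7')
print(parse_chord("C"))     # ('C', 'maj')
```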
Drums
Transcription of the drums, typically encoded as a sequence of labeled events, where the labels indicate which drum instrument (e.g. cymbal, snare drum) is played. These events often overlap with one another, as multiple drums can be played at the same time.
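Because several drums can sound at once, a drum transcription is often easiest to handle as events grouped by onset time. The sketch below uses illustrative (time, label) events, not data from any real dataset.

```python
from collections import defaultdict

# Illustrative drum events as (onset_time, instrument_label) pairs.
# Multiple drums can share an onset, so events may overlap in time.
events = [(0.0, "kick"), (0.0, "hihat"), (0.5, "hihat"),
          (1.0, "snare"), (1.0, "hihat")]

# Group labels by onset time to recover simultaneous hits.
by_onset = defaultdict(list)
for time, label in events:
    by_onset[time].append(label)

print(dict(by_onset))
# {0.0: ['kick', 'hihat'], 0.5: ['hihat'], 1.0: ['snare', 'hihat']}
```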
F0
Musical pitch contours, typically encoded as a time series indicating the musical pitch over time. The time series typically has evenly spaced timestamps, each with a corresponding pitch value, which may be encoded in a number of formats/granularities, including MIDI note numbers and Hertz.
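Converting between these two encodings follows the standard relation midi = 69 + 12·log2(f/440), with 440 Hz as A4. The sketch below also treats 0 Hz as an unvoiced frame, a common (but not universal) convention in F0 annotations.

```python
import math

def hz_to_midi(freq_hz, ref_hz=440.0):
    """Convert a frequency in Hertz to a (fractional) MIDI note number.

    A frequency of 0 (or below) is treated as an unvoiced frame,
    a common convention in F0 annotations.
    """
    if freq_hz <= 0:
        return None
    return 69 + 12 * math.log2(freq_hz / ref_hz)

print(hz_to_midi(440.0))  # 69.0 (A4)
print(hz_to_midi(0.0))    # None (unvoiced)
```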
Genre
A typically global “tag”, indicating the genre of a recording. Note that the concept of genre is highly subjective and we refer those new to this task to this article.
Instruments
Labels indicating which instrument is present in a musical recording. This may refer to recordings of solo instruments, or to recordings with multiple instruments. The labels may be global to a recording, or they may vary over time, indicating the presence/absence of a particular instrument as a time series.
Key
Musical key. This can be defined globally for an audio file or as a sequence of events.
Lyrics
Lyrics corresponding to the singing voice of the audio. These may be raw text with no time information, or they may be time-aligned events. They may have varying levels of granularity (paragraph, line, word, phoneme, character) depending on the dataset.
Melody
The musical melody of a song. Melody has no universal definition and is typically defined per dataset. It is typically encoded as F0 or as Notes. Other annotation types, such as Vocal F0 or Vocal Notes, can often be considered melody annotations as well.
Notes
Musical note events, typically encoded as sequences of (start time, end time, label) triples. The label typically indicates a musical pitch, which may be in a number of formats/granularities, including MIDI note numbers, Hertz, or pitch class.
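The sketch below shows one such note event and converts its MIDI note number to a pitch class, using the standard mapping in which MIDI note 60 is middle C. The event values are illustrative.

```python
# The twelve pitch classes, indexed so that C corresponds to
# MIDI note numbers that are multiples of 12 (e.g. 60 = middle C).
PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F",
                 "F#", "G", "G#", "A", "A#", "B"]

def midi_to_pitch_class(midi_number):
    return PITCH_CLASSES[midi_number % 12]

# Illustrative note event: (start time, end time, MIDI note number).
note_event = (0.25, 1.10, 60)
print(midi_to_pitch_class(note_event[2]))  # C
```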
Sections
Musical sections, which may be “flat” or “hierarchical”, typically encoded by a sequence of timestamps indicating musical section boundary times. Section annotations sometimes also include labels for sections, which may indicate repetitions and/or the section type (e.g. Chorus, Verse).
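A common first step when working with a flat section annotation is pairing consecutive boundary timestamps into (start, end) intervals, as sketched below with illustrative values.

```python
# Illustrative flat section annotation: boundary timestamps in seconds
# (the last boundary marks the end of the track) plus one label per span.
boundaries = [0.0, 15.2, 45.8, 78.1]
labels = ["Intro", "Verse", "Chorus"]

# Pair consecutive boundaries into (start, end) intervals.
intervals = list(zip(boundaries[:-1], boundaries[1:]))
for (start, end), label in zip(intervals, labels):
    print(f"{label}: {start:.1f}-{end:.1f} s")
```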
Technique
The playing technique used by a particular instrument, for example “Pizzicato”. This label may be global for a given recording or encoded as a sequence of labeled events.
Tempo
The tempo of a song, typically in units of beats per minute (bpm). Tempo is often indicated globally per track, but in practice tracks may have tempos that change, and some datasets encode tempo as a time-varying quantity. Additionally, there may be multiple reasonable tempos at any given time (for example, 2x or 0.5x a tempo value will often also be “correct”). For this reason, some datasets provide two or more different tempo values.
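One practical consequence of this ambiguity is that tempo estimates are often compared against a reference with octave tolerance. The sketch below illustrates this idea with a hypothetical 4% relative tolerance; it is not any dataset's official evaluation metric.

```python
def tempo_matches(estimate_bpm, reference_bpm, tol=0.04):
    """Octave-tolerant tempo comparison (illustrative, not an official metric).

    The estimate counts as a match if it is within a relative tolerance
    of the reference tempo, or of its double or half (which are often
    also considered "correct").
    """
    return any(
        abs(estimate_bpm - factor * reference_bpm) <= tol * factor * reference_bpm
        for factor in (0.5, 1.0, 2.0)
    )

print(tempo_matches(120.0, 60.0))  # True (double-tempo match)
print(tempo_matches(90.0, 60.0))   # False
```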