Initializing¶

mirdata.initialize(dataset_name, data_home=None)[source]¶

Load a mirdata dataset by name

Example

orchset = mirdata.initialize('orchset')  # get the orchset dataset
orchset.download()  # download orchset
orchset.validate()  # validate orchset
track = orchset.choice_track()  # load a random track
print(track)  # see what data a track contains
orchset.track_ids()  # load all track ids

Parameters

dataset_name (str) – the dataset’s name see mirdata.DATASETS for a complete list of possibilities
data_home (str or None) – path where the data lives. If None uses the default location.

Returns

Dataset – a mirdata.core.Dataset object

mirdata.list_datasets()[source]¶

Get a list of all mirdata dataset names

Returns: list – list of dataset names as strings

Dataset Loaders¶

acousticbrainz_genre¶

Acoustic Brainz Genre dataset

Dataset Info

The AcousticBrainz Genre Dataset consists of four datasets of genre annotations and music features extracted from audio suited for evaluation of hierarchical multi-label genre classification systems.

Description about the music features can be found here: https://essentia.upf.edu/streaming_extractor_music.html

The datasets are used within the MediaEval AcousticBrainz Genre Task. The task is focused on content-based music genre recognition using genre annotations from multiple sources and large-scale music features data available in the AcousticBrainz database. The goal of our task is to explore how the same music pieces can be annotated differently by different communities following different genre taxonomies, and how this should be addressed by content-based genre r ecognition systems.

We provide four datasets containing genre and subgenre annotations extracted from four different online metadata sources:

AllMusic and Discogs are based on editorial metadata databases maintained by music experts and enthusiasts. These sources contain explicit genre/subgenre annotations of music releases (albums) following a predefined genre namespace and taxonomy. We propagated release-level annotations to recordings (tracks) in AcousticBrainz to build the datasets.
Lastfm and Tagtraum are based on collaborative music tagging platforms with large amounts of genre labels provided by their users for music recordings (tracks). We have automatically inferred a genre/subgenre taxonomy and annotations from these labels.

For details on format and contents, please refer to the data webpage.

Note, that the AllMusic ground-truth annotations are distributed separately at https://zenodo.org/record/2554044.

If you use the MediaEval AcousticBrainz Genre dataset or part of it, please cite our ISMIR 2019 overview paper:

Bogdanov, D., Porter A., Schreiber H., Urbano J., & Oramas S. (2019).
The AcousticBrainz Genre Dataset: Multi-Source, Multi-Level, Multi-Label, and Large-Scale.
20th International Society for Music Information Retrieval Conference (ISMIR 2019).

This work is partially supported by the European Union’s Horizon 2020 research and innovation programme under grant agreement No 688382 AudioCommons.

class mirdata.datasets.acousticbrainz_genre.Dataset(data_home=None)[source]¶

The acousticbrainz genre dataset

Variables

data_home (str) – path where mirdata will look for the dataset
name (str) – the identifier of the dataset
bibtex (str or None) – dataset citation/s in bibtex format
remotes (dict or None) – data to be downloaded
readme (str) – information about the dataset
track (function) – a function mapping a track_id to a mirdata.core.Track

choice_track()[source]¶

Choose a random track

Returns: Track – a Track object instantiated by a random track_id

cite()[source]¶: Print the reference

property default_path¶

Get the default path for the dataset

Returns: str – Local path to the dataset

download(partial_download=None, force_overwrite=False, cleanup=False)[source]¶

Download data to save_dir and optionally print a message.

Parameters

partial_download (list or None) – A list of keys of remotes to partially download. If None, all data is downloaded
force_overwrite (bool) – If True, existing files are overwritten by the downloaded files.
cleanup (bool) – Whether to delete any zip/tar files after extracting.

Raises

ValueError – if invalid keys are passed to partial_download
IOError – if a downloaded file’s checksum is different from expected

filter_index(search_key)[source]¶

Load from AcousticBrainz genre dataset the indexes that match with search_key.

Parameters: search_key (str) – regex to match with folds, mbid or genres
Returns: dict – {track_id: track data}

license()[source]¶: Print the license

load_all_train()[source]¶

Load from AcousticBrainz genre dataset the tracks that are used for training across the four different datasets.

Returns: dict – {track_id: track data}

load_all_validation()[source]¶

Load from AcousticBrainz genre dataset the tracks that are used for validating across the four different datasets.

Returns: dict – {track_id: track data}

load_allmusic_train()[source]¶

Load from AcousticBrainz genre dataset the tracks that are used for validation in allmusic dataset.

Returns: dict – {track_id: track data}

load_allmusic_validation()[source]¶

Load from AcousticBrainz genre dataset the tracks that are used for validation in allmusic dataset.

Returns: dict – {track_id: track data}

load_discogs_train()[source]¶

Load from AcousticBrainz genre dataset the tracks that are used for training in discogs dataset.

Returns: dict – {track_id: track data}

load_discogs_validation()[source]¶

Load from AcousticBrainz genre dataset the tracks that are used for validation in tagtraum dataset.

Returns: dict – {track_id: track data}

load_extractor(*args, **kwargs)[source]¶

Load a AcousticBrainz Dataset json file with all the features and metadata.

Parameters

fhandle (str or file-like) – path or file-like object pointing to a json file

Returns

np.ndarray - the mono audio signal
float - The sample rate of the audio file

load_lastfm_train()[source]¶

Load from AcousticBrainz genre dataset the tracks that are used for training in lastfm dataset.

Returns: dict – {track_id: track data}

load_lastfm_validation()[source]¶

Load from AcousticBrainz genre dataset the tracks that are used for validation in lastfm dataset.

Returns: dict – {track_id: track data}

load_tagtraum_train()[source]¶

Load from AcousticBrainz genre dataset the tracks that are used for training in tagtraum dataset.

Returns: dict – {track_id: track data}

load_tagtraum_validation()[source]¶

Load from AcousticBrainz genre dataset the tracks that are used for validating in tagtraum dataset.

Returns: dict – {track_id: track data}

load_tracks()[source]¶

Load all tracks in the dataset

Returns: dict – {track_id: track data}
Raises: NotImplementedError – If the dataset does not support Tracks

track_ids[source]¶

Return track ids

Returns: list – A list of track ids

validate(verbose=True)[source]¶

Validate if the stored dataset is a valid version

Parameters

verbose (bool) – If False, don’t print output

Returns

list - files in the index but are missing locally
list - files which have an invalid checksum

class mirdata.datasets.acousticbrainz_genre.Track(track_id, data_home, dataset_name, index, metadata)[source]¶

AcousticBrainz Genre Dataset track class

Parameters

track_id (str) – track id of the track
data_home (str) – Local path where the dataset is stored. If None, looks for the data in the default directory, ~/mir_datasets

Variables

track_id (str) – track id
genre (list) – human-labeled genre and subgenres list
mbid (str) – musicbrainz id
mbid_group (str) – musicbrainz id group
artist (list) – the track’s artist/s
title (list) – the track’s title
date (list) – the track’s release date/s
filename (str) – the track’s filename
album (list) – the track’s album/s
track_number (list) – the track number/s
tonal (dict) – dictionary of acousticbrainz tonal features
low_level (dict) – dictionary of acousticbrainz low-level features
rhythm (dict) – dictionary of acousticbrainz rhythm features

Other Parameters

acousticbrainz_metadata (dict) – dictionary of metadata provided by AcousticBrainz

property album¶

metadata album annotation

Returns: list – album

property artist¶

metadata artist annotation

Returns: list – artist

property date¶

metadata date annotation

Returns: list – date

property file_name¶

metadata file_name annotation

Returns: str – file name

property low_level¶

low_level track descriptors.

Returns

dict –

‘average_loudness’: dynamic range descriptor. It rescales average loudness, computed on 2sec windows with 1 sec overlap, into the [0,1] interval. The value of 0 corresponds to signals with large dynamic range, 1 corresponds to signal with little dynamic range. Algorithms: Loudness
’dynamic_complexity’: dynamic complexity computed on 2sec windows with 1sec overlap. Algorithms: DynamicComplexity
’silence_rate_20dB’, ‘silence_rate_30dB’, ‘silence_rate_60dB’: rate of silent frames in a signal for thresholds of 20, 30, and 60 dBs. Algorithms: SilenceRate
’spectral_rms’: spectral RMS. Algorithms: RMS
’spectral_flux’: spectral flux of a signal computed using L2-norm. Algorithms: Flux
’spectral_centroid’, ‘spectral_kurtosis’, ‘spectral_spread’, ‘spectral_skewness’: centroid and central moments statistics describing the spectral shape. Algorithms: Centroid, CentralMoments
’spectral_rolloff’: the roll-off frequency of a spectrum. Algorithms: RollOff
’spectral_decrease’: spectral decrease. Algorithms: Decrease
’hfc’: high frequency content descriptor as proposed by Masri. Algorithms: HFC
’zerocrossingrate’ zero-crossing rate. Algorithms: ZeroCrossingRate
’spectral_energy’: spectral energy. Algorithms: Energy
’spectral_energyband_low’, ‘spectral_energyband_middle_low’, ‘spectral_energyband_middle_high’,
’spectral_energyband_high’: spectral energy in frequency bands [20Hz, 150Hz], [150Hz, 800Hz], [800Hz, 4kHz], and [4kHz, 20kHz]. Algorithms EnergyBand
’barkbands’: spectral energy in 27 Bark bands. Algorithms: BarkBands
’melbands’: spectral energy in 40 mel bands. Algorithms: MFCC
’erbbands’: spectral energy in 40 ERB bands. Algorithms: ERBBands
’mfcc’: the first 13 mel frequency cepstrum coefficients. See algorithm: MFCC
’gfcc’: the first 13 gammatone feature cepstrum coefficients. Algorithms: GFCC
’barkbands_crest’, ‘barkbands_flatness_db’: crest and flatness computed over energies in Bark bands. Algorithms: Crest, FlatnessDB
’barkbands_kurtosis’, ‘barkbands_skewness’, ‘barkbands_spread’: central moments statistics over energies in Bark bands. Algorithms: CentralMoments
’melbands_crest’, ‘melbands_flatness_db’: crest and flatness computed over energies in mel bands. Algorithms: Crest, FlatnessDB
’melbands_kurtosis’, ‘melbands_skewness’, ‘melbands_spread’: central moments statistics over energies in mel bands. Algorithms: CentralMoments
’erbbands_crest’, ‘erbbands_flatness_db’: crest and flatness computed over energies in ERB bands. Algorithms: Crest, FlatnessDB
’erbbands_kurtosis’, ‘erbbands_skewness’, ‘erbbands_spread’: central moments statistics over energies in ERB bands. Algorithms: CentralMoments
’dissonance’: sensory dissonance of a spectrum. Algorithms: Dissonance
’spectral_entropy’: Shannon entropy of a spectrum. Algorithms: Entropy
’pitch_salience’: pitch salience of a spectrum. Algorithms: PitchSalience
’spectral_complexity’: spectral complexity. Algorithms: SpectralComplexity
’spectral_contrast_coeffs’, ‘spectral_contrast_valleys’: spectral contrast features. Algorithms: SpectralContrast

property rhythm¶

rhythm essentia extractor descriptors

Returns

dict –

‘beats_position’: time positions [sec] of detected beats using beat tracking algorithm by Degara et al., 2012. Algorithms: RhythmExtractor2013, BeatTrackerDegara
’beats_count’: number of detected beats
’bpm’: BPM value according to detected beats
’bpm_histogram_first_peak_bpm’, ‘bpm_histogram_first_peak_spread’, ‘bpm_histogram_first_peak_weight’,
’bpm_histogram_second_peak_bpm’, ‘bpm_histogram_second_peak_spread’, ‘bpm_histogram_second_peak_weight’: descriptors characterizing highest and second highest peak of the BPM histogram. Algorithms: BpmHistogramDescriptors
’beats_loudness’, ‘beats_loudness_band_ratio’: spectral energy computed on beats segments of audio across the whole spectrum, and ratios of energy in 6 frequency bands. Algorithms: BeatsLoudness, SingleBeatLoudness
’onset_rate’: number of detected onsets per second. Algorithms: OnsetRate
’danceability’: danceability estimate. Algorithms: Danceability

property title¶

metadata title annotation

Returns: list – title

to_jams()[source]¶

the track’s data in jams format

Returns: jams.JAMS – return track data in jam format

property tonal¶

tonal features

Returns

dict –

‘tuning_frequency’: estimated tuning frequency [Hz]. Algorithms: TuningFrequency
’tuning_nontempered_energy_ratio’ and ‘tuning_equal_tempered_deviation’
’hpcp’, ‘thpcp’: 32-dimensional harmonic pitch class profile (HPCP) and its transposed version. Algorithms: HPCP
’hpcp_entropy’: Shannon entropy of a HPCP vector. Algorithms: Entropy
’key_key’, ‘key_scale’: Global key feature. Algorithms: Key
’chords_key’, ‘chords_scale’: Global key extracted from chords detection.
’chords_strength’, ‘chords_histogram’: : strength of estimated chords and normalized histogram of their progression; Algorithms: ChordsDetection, ChordsDescriptors
’chords_changes_rate’, ‘chords_number_rate’: chords change rate in the progression; ratio of different chords from the total number of chords in the progression; Algorithms: ChordsDetection, ChordsDescriptors

property tracknumber¶

metadata tracknumber annotation

Returns: list – tracknumber

mirdata.datasets.acousticbrainz_genre.load_extractor(fhandle)[source]¶

Load a AcousticBrainz Dataset json file with all the features and metadata.

Parameters

fhandle (str or file-like) – path or file-like object pointing to a json file

Returns

np.ndarray - the mono audio signal
float - The sample rate of the audio file

beatles¶

Beatles Dataset Loader

Dataset Info

The Beatles Dataset includes beat and metric position, chord, key, and segmentation annotations for 179 Beatles songs. Details can be found in http://matthiasmauch.net/_pdf/mauch_omp_2009.pdf and http://isophonics.net/content/reference-annotations-beatles.

class mirdata.datasets.beatles.Dataset(data_home=None)[source]¶

The beatles dataset

Variables

data_home (str) – path where mirdata will look for the dataset
name (str) – the identifier of the dataset
bibtex (str or None) – dataset citation/s in bibtex format
remotes (dict or None) – data to be downloaded
readme (str) – information about the dataset
track (function) – a function mapping a track_id to a mirdata.core.Track

choice_track()[source]¶

Choose a random track

Returns: Track – a Track object instantiated by a random track_id

cite()[source]¶: Print the reference

property default_path¶

Get the default path for the dataset

Returns: str – Local path to the dataset

download(partial_download=None, force_overwrite=False, cleanup=False)[source]¶

Download data to save_dir and optionally print a message.

Parameters

partial_download (list or None) – A list of keys of remotes to partially download. If None, all data is downloaded
force_overwrite (bool) – If True, existing files are overwritten by the downloaded files.
cleanup (bool) – Whether to delete any zip/tar files after extracting.

Raises

ValueError – if invalid keys are passed to partial_download
IOError – if a downloaded file’s checksum is different from expected

license()[source]¶: Print the license

load_audio(*args, **kwargs)[source]¶

Load a Beatles audio file.

Parameters

fhandle (str or file-like) – path or file-like object pointing to an audio file

Returns

np.ndarray - the mono audio signal
float - The sample rate of the audio file

load_beats(*args, **kwargs)[source]¶

Load Beatles format beat data from a file

Parameters: fhandle (str or file-like) – path or file-like object pointing to a beat annotation file
Returns: BeatData – loaded beat data

load_chords(*args, **kwargs)[source]¶

Load Beatles format chord data from a file

Parameters: fhandle (str or file-like) – path or file-like object pointing to a chord annotation file
Returns: ChordData – loaded chord data

load_sections(*args, **kwargs)[source]¶

Load Beatles format section data from a file

Parameters: fhandle (str or file-like) – path or file-like object pointing to a section annotation file
Returns: SectionData – loaded section data

load_tracks()[source]¶

Load all tracks in the dataset

Returns: dict – {track_id: track data}
Raises: NotImplementedError – If the dataset does not support Tracks

track_ids[source]¶

Return track ids

Returns: list – A list of track ids

validate(verbose=True)[source]¶

Validate if the stored dataset is a valid version

Parameters

verbose (bool) – If False, don’t print output

Returns

list - files in the index but are missing locally
list - files which have an invalid checksum

class mirdata.datasets.beatles.Track(track_id, data_home, dataset_name, index, metadata)[source]¶

Beatles track class

Parameters

track_id (str) – track id of the track
data_home (str) – path where the data lives

Variables

audio_path (str) – track audio path
beats_path (str) – beat annotation path
chords_path (str) – chord annotation path
keys_path (str) – key annotation path
sections_path (str) – sections annotation path
title (str) – title of the track
track_id (str) – track id

Other Parameters

beats (BeatData) – human-labeled beat annotations
chords (ChordData) – human-labeled chord annotations
key (KeyData) – local key annotations
sections (SectionData) – section annotations

property audio¶

The track’s audio

Returns

np.ndarray - audio signal
float - sample rate

to_jams()[source]¶

the track’s data in jams format

Returns: jams.JAMS – return track data in jam format

mirdata.datasets.beatles.load_audio(fhandle: BinaryIO) → Tuple[numpy.ndarray, float][source]¶

Load a Beatles audio file.

Parameters

fhandle (str or file-like) – path or file-like object pointing to an audio file

Returns

np.ndarray - the mono audio signal
float - The sample rate of the audio file

mirdata.datasets.beatles.load_beats(fhandle: TextIO) → mirdata.annotations.BeatData [source]¶

Load Beatles format beat data from a file

Parameters: fhandle (str or file-like) – path or file-like object pointing to a beat annotation file
Returns: BeatData – loaded beat data

mirdata.datasets.beatles.load_chords(fhandle: TextIO) → mirdata.annotations.ChordData [source]¶

Load Beatles format chord data from a file

Parameters: fhandle (str or file-like) – path or file-like object pointing to a chord annotation file
Returns: ChordData – loaded chord data

mirdata.datasets.beatles.load_key(fhandle: TextIO) → mirdata.annotations.KeyData [source]¶

Load Beatles format key data from a file

Parameters: fhandle (str or file-like) – path or file-like object pointing to a key annotation file
Returns: KeyData – loaded key data

mirdata.datasets.beatles.load_sections(fhandle: TextIO) → mirdata.annotations.SectionData [source]¶

Load Beatles format section data from a file

Parameters: fhandle (str or file-like) – path or file-like object pointing to a section annotation file
Returns: SectionData – loaded section data

beatport_key¶

beatport_key Dataset Loader

Dataset Info

The Beatport EDM Key Dataset includes 1486 two-minute sound excerpts from various EDM subgenres, annotated with single-key labels, comments and confidence levels generously provided by Eduard Mas Marín, and thoroughly revised and expanded by Ángel Faraldo.

The original audio samples belong to online audio snippets from Beatport, an online music store for DJ’s and Electronic Dance Music Producers (<http:www.beatport.com>). If this dataset were used in further research, we would appreciate the citation of the current DOI (10.5281/zenodo.1101082) and the following doctoral dissertation, where a detailed description of the properties of this dataset can be found:

Ángel Faraldo (2017). Tonality Estimation in Electronic Dance Music: A Computational and Musically Informed
Examination. PhD Thesis. Universitat Pompeu Fabra, Barcelona.

This dataset is mainly intended to assess the performance of computational key estimation algorithms in electronic dance music subgenres.

Data License: Creative Commons Attribution Share Alike 4.0 International

class mirdata.datasets.beatport_key.Dataset(data_home=None)[source]¶

The beatport_key dataset

Variables

data_home (str) – path where mirdata will look for the dataset
name (str) – the identifier of the dataset
bibtex (str or None) – dataset citation/s in bibtex format
remotes (dict or None) – data to be downloaded
readme (str) – information about the dataset
track (function) – a function mapping a track_id to a mirdata.core.Track

choice_track()[source]¶

Choose a random track

Returns: Track – a Track object instantiated by a random track_id

cite()[source]¶: Print the reference

property default_path¶

Get the default path for the dataset

Returns: str – Local path to the dataset

download(partial_download=None, force_overwrite=False, cleanup=False)[source]¶

Download the dataset

Parameters

partial_download (list or None) – A list of keys of remotes to partially download. If None, all data is downloaded
force_overwrite (bool) – If True, existing files are overwritten by the downloaded files.
cleanup (bool) – Whether to delete any zip/tar files after extracting.

Raises

ValueError – if invalid keys are passed to partial_download
IOError – if a downloaded file’s checksum is different from expected

license()[source]¶: Print the license

load_artist(*args, **kwargs)[source]¶

Load beatport_key tempo data from a file

Parameters: metadata_path (str) – path to metadata annotation file
Returns: list – list of artists involved in the track.

load_audio(*args, **kwargs)[source]¶

Load a beatport_key audio file.

Parameters

audio_path (str) – path to audio file

Returns

np.ndarray - the mono audio signal
float - The sample rate of the audio file

load_genre(*args, **kwargs)[source]¶

Load beatport_key genre data from a file

Parameters: metadata_path (str) – path to metadata annotation file
Returns: dict – with the list with genres [‘genres’] and list with sub-genres [‘sub_genres’]

load_key(*args, **kwargs)[source]¶

Load beatport_key format key data from a file

Parameters: keys_path (str) – path to key annotation file
Returns: list – list of annotated keys

load_tempo(*args, **kwargs)[source]¶

Load beatport_key tempo data from a file

Parameters: metadata_path (str) – path to metadata annotation file
Returns: str – tempo in beats per minute

load_tracks()[source]¶

Load all tracks in the dataset

Returns: dict – {track_id: track data}
Raises: NotImplementedError – If the dataset does not support Tracks

track_ids[source]¶

Return track ids

Returns: list – A list of track ids

validate(verbose=True)[source]¶

Validate if the stored dataset is a valid version

Parameters

verbose (bool) – If False, don’t print output

Returns

list - files in the index but are missing locally
list - files which have an invalid checksum

class mirdata.datasets.beatport_key.Track(track_id, data_home, dataset_name, index, metadata)[source]¶

beatport_key track class

Parameters

track_id (str) – track id of the track
data_home (str) – Local path where the dataset is stored.

Variables

audio_path (str) – track audio path
keys_path (str) – key annotation path
metadata_path (str) – sections annotation path
title (str) – title of the track
track_id (str) – track id

Other Parameters

key (list) – list of annotated musical keys
artists (list) – artists involved in the track
genre (dict) – genres and subgenres
tempo (int) – tempo in beats per minute

property audio¶

The track’s audio

Returns

np.ndarray - audio signal
float - sample rate

to_jams()[source]¶

Get the track’s data in jams format

Returns: jams.JAMS – the track’s data in jams format

mirdata.datasets.beatport_key.load_artist(metadata_path)[source]¶

Load beatport_key tempo data from a file

Parameters: metadata_path (str) – path to metadata annotation file
Returns: list – list of artists involved in the track.

mirdata.datasets.beatport_key.load_audio(audio_path)[source]¶

Load a beatport_key audio file.

Parameters

audio_path (str) – path to audio file

Returns

np.ndarray - the mono audio signal
float - The sample rate of the audio file

mirdata.datasets.beatport_key.load_genre(metadata_path)[source]¶

Load beatport_key genre data from a file

Parameters: metadata_path (str) – path to metadata annotation file
Returns: dict – with the list with genres [‘genres’] and list with sub-genres [‘sub_genres’]

mirdata.datasets.beatport_key.load_key(keys_path)[source]¶

Load beatport_key format key data from a file

Parameters: keys_path (str) – path to key annotation file
Returns: list – list of annotated keys

mirdata.datasets.beatport_key.load_tempo(metadata_path)[source]¶

Load beatport_key tempo data from a file

Parameters: metadata_path (str) – path to metadata annotation file
Returns: str – tempo in beats per minute

cante100¶

cante100 Loader

Dataset Info

The cante100 dataset contains 100 tracks taken from the COFLA corpus. We defined 10 style families of which 10 tracks each are included. Apart from the style family, we manually annotated the sections of the track in which the vocals are present. In addition, we provide a number of low-level descriptors and the fundamental frequency corresponding to the predominant melody for each track. The meta-information includes editoral meta-data and the musicBrainz ID.

Total tracks: 100

cante100 audio is only available upon request. To download the audio request access in this link: https://zenodo.org/record/1324183. Then unzip the audio into the cante100 general dataset folder for the rest of annotations and files.

Audio specifications:

Sampling frequency: 44.1 kHz
Bit-depth: 16 bit
Audio format: .mp3

cante100 dataset has spectrogram available, in csv format. spectrogram is available to download without request needed, so at first instance, cante100 loader uses the spectrogram of the tracks.

The available annotations are:

F0 (predominant melody)
Automatic transcription of notes (of singing voice)

CANTE100 LICENSE (COPIED FROM ZENODO PAGE)

The provided datasets are offered free of charge for internal non-commercial use.
We do not grant any rights for redistribution or modification. All data collections were gathered
by the COFLA team.
© COFLA 2015. All rights reserved.

For more details, please visit: http://www.cofla-project.com/?page_id=134

class mirdata.datasets.cante100.Dataset(data_home=None)[source]¶

The cante100 dataset

Variables

data_home (str) – path where mirdata will look for the dataset
name (str) – the identifier of the dataset
bibtex (str or None) – dataset citation/s in bibtex format
remotes (dict or None) – data to be downloaded
readme (str) – information about the dataset
track (function) – a function mapping a track_id to a mirdata.core.Track

choice_track()[source]¶

Choose a random track

Returns: Track – a Track object instantiated by a random track_id

cite()[source]¶: Print the reference

property default_path¶

Get the default path for the dataset

Returns: str – Local path to the dataset

download(partial_download=None, force_overwrite=False, cleanup=False)[source]¶

Download data to save_dir and optionally print a message.

Parameters

partial_download (list or None) – A list of keys of remotes to partially download. If None, all data is downloaded
force_overwrite (bool) – If True, existing files are overwritten by the downloaded files.
cleanup (bool) – Whether to delete any zip/tar files after extracting.

Raises

ValueError – if invalid keys are passed to partial_download
IOError – if a downloaded file’s checksum is different from expected

license()[source]¶: Print the license

load_audio(*args, **kwargs)[source]¶

Load a cante100 audio file.

Parameters

fhandle (str) – path to an audio file

Returns

np.ndarray - the mono audio signal
float - The sample rate of the audio file

load_melody(*args, **kwargs)[source]¶

Load cante100 f0 annotations

Parameters: fhandle (str or file-like) – path or file-like object pointing to melody annotation file
Returns: F0Data – predominant melody

load_notes(*args, **kwargs)[source]¶

Load note data from the annotation files

Parameters: fhandle (str or file-like) – path or file-like object pointing to a notes annotation file
Returns: NoteData – note annotations

load_spectrogram(*args, **kwargs)[source]¶

Load a cante100 dataset spectrogram file.

Parameters: fhandle (str or file-like) – path or file-like object pointing to an audio file
Returns: np.ndarray – spectrogram

load_tracks()[source]¶

Load all tracks in the dataset

Returns: dict – {track_id: track data}
Raises: NotImplementedError – If the dataset does not support Tracks

track_ids[source]¶

Return track ids

Returns: list – A list of track ids

validate(verbose=True)[source]¶

Validate if the stored dataset is a valid version

Parameters

verbose (bool) – If False, don’t print output

Returns

list - files in the index but are missing locally
list - files which have an invalid checksum

class mirdata.datasets.cante100.Track(track_id, data_home, dataset_name, index, metadata)[source]¶

cante100 track class

Parameters

track_id (str) – track id of the track
data_home (str) – Local path where the dataset is stored. If None, looks for the data in the default directory, ~/mir_datasets/cante100

Variables

track_id (str) – track id
identifier (str) – musicbrainz id of the track
artist (str) – performing artists
title (str) – title of the track song
release (str) – release where the track can be found
duration (str) – duration in seconds of the track

Other Parameters

melody (F0Data) – annotated melody
notes (NoteData) – annotated notes

property audio¶

The track’s audio

Returns

np.ndarray - audio signal
float - sample rate

property spectrogram¶

spectrogram of The track’s audio

Returns: np.ndarray – spectrogram

to_jams()[source]¶

Get the track’s data in jams format

Returns: jams.JAMS – the track’s data in jams format

mirdata.datasets.cante100.load_audio(fhandle: str) → Tuple[numpy.ndarray, float][source]¶

Load a cante100 audio file.

Parameters

fhandle (str) – path to an audio file

Returns

np.ndarray - the mono audio signal
float - The sample rate of the audio file

mirdata.datasets.cante100.load_melody(fhandle: TextIO) → Optional[mirdata.annotations.F0Data][source]¶

Load cante100 f0 annotations

Parameters: fhandle (str or file-like) – path or file-like object pointing to melody annotation file
Returns: F0Data – predominant melody

mirdata.datasets.cante100.load_notes(fhandle: TextIO) → mirdata.annotations.NoteData [source]¶

Load note data from the annotation files

Parameters: fhandle (str or file-like) – path or file-like object pointing to a notes annotation file
Returns: NoteData – note annotations

mirdata.datasets.cante100.load_spectrogram(fhandle: TextIO) → numpy.ndarray [source]¶

Load a cante100 dataset spectrogram file.

Parameters: fhandle (str or file-like) – path or file-like object pointing to an audio file
Returns: np.ndarray – spectrogram

dali¶

DALI Dataset Loader

Dataset Info

DALI contains 5358 audio files with their time-aligned vocal melody. It also contains time-aligned lyrics at four levels of granularity: notes, words, lines, and paragraphs.

For each song, DALI also provides additional metadata: genre, language, musician, album covers, or links to video clips.

For more details, please visit: https://github.com/gabolsgabs/DALI

class mirdata.datasets.dali.Dataset(data_home=None)[source]¶

The dali dataset

Variables

data_home (str) – path where mirdata will look for the dataset
name (str) – the identifier of the dataset
bibtex (str or None) – dataset citation/s in bibtex format
remotes (dict or None) – data to be downloaded
readme (str) – information about the dataset
track (function) – a function mapping a track_id to a mirdata.core.Track

choice_track()[source]¶

Choose a random track

Returns: Track – a Track object instantiated by a random track_id

cite()[source]¶: Print the reference

property default_path¶

Get the default path for the dataset

Returns: str – Local path to the dataset

download(partial_download=None, force_overwrite=False, cleanup=False)[source]¶

Download data to save_dir and optionally print a message.

Parameters

partial_download (list or None) – A list of keys of remotes to partially download. If None, all data is downloaded
force_overwrite (bool) – If True, existing files are overwritten by the downloaded files.
cleanup (bool) – Whether to delete any zip/tar files after extracting.

Raises

ValueError – if invalid keys are passed to partial_download
IOError – if a downloaded file’s checksum is different from expected

license()[source]¶: Print the license

load_annotations_class(*args, **kwargs)[source]¶

Load full annotations into the DALI class object

Parameters: annotations_path (str) – path to a DALI annotation file
Returns: DALI.annotations – DALI annotations object

load_annotations_granularity(*args, **kwargs)[source]¶

Load annotations at the specified level of granularity

Parameters

annotations_path (str) – path to a DALI annotation file
granularity (str) – one of ‘notes’, ‘words’, ‘lines’, ‘paragraphs’

Returns

NoteData for granularity=’notes’ or LyricData otherwise

load_audio(*args, **kwargs)[source]¶

Load a DALI audio file.

Parameters

fhandle (str or file-like) – path or file-like object pointing to an audio file

Returns

np.ndarray - the mono audio signal
float - The sample rate of the audio file

load_tracks()[source]¶

Load all tracks in the dataset

Returns: dict – {track_id: track data}
Raises: NotImplementedError – If the dataset does not support Tracks

track_ids[source]¶

Return track ids

Returns: list – A list of track ids

validate(verbose=True)[source]¶

Validate if the stored dataset is a valid version

Parameters

verbose (bool) – If False, don’t print output

Returns

list - files in the index but are missing locally
list - files which have an invalid checksum

class mirdata.datasets.dali.Track(track_id, data_home, dataset_name, index, metadata)[source]¶

DALI melody Track class

Parameters

track_id (str) – track id of the track

Variables

album (str) – the track’s album
annotation_path (str) – path to the track’s annotation file
artist (str) – the track’s artist
audio_path (str) – path to the track’s audio file
audio_url (str) – youtube ID
dataset_version (int) – dataset annotation version
ground_truth (bool) – True if the annotation is verified
language (str) – sung language
release_date (str) – year the track was released
scores_manual (int) – manual score annotations
scores_ncc (float) – ncc score annotations
title (str) – the track’s title
track_id (str) – the unique track id
url_working (bool) – True if the youtube url was valid

Other Parameters

notes (NoteData) – vocal notes
words (LyricData) – word-level lyrics
lines (LyricData) – line-level lyrics
paragraphs (LyricData) – paragraph-level lyrics
annotation-object (DALI.Annotations) – DALI annotation object

property audio¶

The track’s audio

Returns

np.ndarray - audio signal
float - sample rate

to_jams()[source]¶

Get the track’s data in jams format

Returns: jams.JAMS – the track’s data in jams format

mirdata.datasets.dali.load_annotations_class(annotations_path)[source]¶

Load full annotations into the DALI class object

Parameters: annotations_path (str) – path to a DALI annotation file
Returns: DALI.annotations – DALI annotations object

mirdata.datasets.dali.load_annotations_granularity(annotations_path, granularity)[source]¶

Load annotations at the specified level of granularity

Parameters

annotations_path (str) – path to a DALI annotation file
granularity (str) – one of ‘notes’, ‘words’, ‘lines’, ‘paragraphs’

Returns

NoteData for granularity=’notes’ or LyricData otherwise

mirdata.datasets.dali.load_audio(fhandle: BinaryIO) → Optional[Tuple[numpy.ndarray, float]][source]¶

Load a DALI audio file.

Parameters

fhandle (str or file-like) – path or file-like object pointing to an audio file

Returns

np.ndarray - the mono audio signal
float - The sample rate of the audio file

giantsteps_key¶

giantsteps_key Dataset Loader

Dataset Info

The GiantSteps+ EDM Key Dataset includes 600 two-minute sound excerpts from various EDM subgenres, annotated with single-key labels, comments and confidence levels by Daniel G. Camhi, and thoroughly revised and expanded by Ángel Faraldo at MTG UPF. Additionally, 500 tracks have been thoroughly analysed, containing pitch-class set descriptions, key changes, and additional modal changes. This dataset is a revision of the original GiantSteps Key Dataset, available in Github (<https://github.com/GiantSteps/giantsteps-key-dataset>) and initially described in:

Knees, P., Faraldo, Á., Herrera, P., Vogl, R., Böck, S., Hörschläger, F., Le Goff, M. (2015).
Two Datasets for Tempo Estimation and Key Detection in Electronic Dance Music Annotated from User Corrections.
In Proceedings of the 16th International Society for Music Information Retrieval Conference, 364–370. Málaga, Spain.

The original audio samples belong to online audio snippets from Beatport, an online music store for DJ’s and Electronic Dance Music Producers (<http:www.beatport.com>). If this dataset were used in further research, we would appreciate the citation of the current DOI (10.5281/zenodo.1101082) and the following doctoral dissertation, where a detailed description of the properties of this dataset can be found:

Ángel Faraldo (2017). Tonality Estimation in Electronic Dance Music: A Computational and Musically Informed Examination.
PhD Thesis. Universitat Pompeu Fabra, Barcelona.

This dataset is mainly intended to assess the performance of computational key estimation algorithms in electronic dance music subgenres.

All the data of this dataset is licensed with Creative Commons Attribution Share Alike 4.0 International.

class mirdata.datasets.giantsteps_key.Dataset(data_home=None)[source]¶

The giantsteps_key dataset

Variables

data_home (str) – path where mirdata will look for the dataset
name (str) – the identifier of the dataset
bibtex (str or None) – dataset citation/s in bibtex format
remotes (dict or None) – data to be downloaded
readme (str) – information about the dataset
track (function) – a function mapping a track_id to a mirdata.core.Track

choice_track()[source]¶

Choose a random track

Returns: Track – a Track object instantiated by a random track_id

cite()[source]¶: Print the reference

property default_path¶

Get the default path for the dataset

Returns: str – Local path to the dataset

download(partial_download=None, force_overwrite=False, cleanup=False)[source]¶

Download data to save_dir and optionally print a message.

Parameters

partial_download (list or None) – A list of keys of remotes to partially download. If None, all data is downloaded
force_overwrite (bool) – If True, existing files are overwritten by the downloaded files.
cleanup (bool) – Whether to delete any zip/tar files after extracting.

Raises

ValueError – if invalid keys are passed to partial_download
IOError – if a downloaded file’s checksum is different from expected

license()[source]¶: Print the license

load_artist(*args, **kwargs)[source]¶

Load giantsteps_key tempo data from a file

Parameters: fhandle (str or file-like) – File-like object or path pointing to metadata annotation file
Returns: list – list of artists involved in the track.

load_audio(*args, **kwargs)[source]¶

Load a giantsteps_key audio file.

Parameters

fhandle (str or file-like) – path pointing to an audio file

Returns

np.ndarray - the mono audio signal
float - The sample rate of the audio file

load_genre(*args, **kwargs)[source]¶

Load giantsteps_key genre data from a file

Parameters: fhandle (str or file-like) – File-like object or path pointing to metadata annotation file
Returns: dict – {‘genres’: […], ‘subgenres’: […]}

load_key(*args, **kwargs)[source]¶

Load giantsteps_key format key data from a file

Parameters: fhandle (str or file-like) – File like object or string pointing to key annotation file
Returns: str – loaded key data

load_tempo(*args, **kwargs)[source]¶

Load giantsteps_key tempo data from a file

Parameters: fhandle (str or file-like) – File-like object or string pointing to metadata annotation file
Returns: str – loaded tempo data

load_tracks()[source]¶

Load all tracks in the dataset

Returns: dict – {track_id: track data}
Raises: NotImplementedError – If the dataset does not support Tracks

track_ids[source]¶

Return track ids

Returns: list – A list of track ids

validate(verbose=True)[source]¶

Validate if the stored dataset is a valid version

Parameters

verbose (bool) – If False, don’t print output

Returns

list - files in the index but are missing locally
list - files which have an invalid checksum

class mirdata.datasets.giantsteps_key.Track(track_id, data_home, dataset_name, index, metadata)[source]¶

giantsteps_key track class

Parameters

track_id (str) – track id of the track

Variables

audio_path (str) – track audio path
keys_path (str) – key annotation path
metadata_path (str) – sections annotation path
title (str) – title of the track
track_id (str) – track id

Other Parameters

key (str) – musical key annotation
artists (list) – list of artists involved
genres (dict) – genres and subgenres
tempo (int) – crowdsourced tempo annotations in beats per minute

property audio¶

The track’s audio

Returns

np.ndarray - audio signal
float - sample rate

to_jams()[source]¶

Get the track’s data in jams format

Returns: jams.JAMS – the track’s data in jams format

mirdata.datasets.giantsteps_key.load_artist(fhandle: TextIO) → List[str][source]¶

Load giantsteps_key tempo data from a file

Parameters: fhandle (str or file-like) – File-like object or path pointing to metadata annotation file
Returns: list – list of artists involved in the track.

mirdata.datasets.giantsteps_key.load_audio(fhandle: str) → Tuple[numpy.ndarray, float][source]¶

Load a giantsteps_key audio file.

Parameters

fhandle (str or file-like) – path pointing to an audio file

Returns

np.ndarray - the mono audio signal
float - The sample rate of the audio file

mirdata.datasets.giantsteps_key.load_genre(fhandle: TextIO) → Dict[str, List[str]][source]¶

Load giantsteps_key genre data from a file

Parameters: fhandle (str or file-like) – File-like object or path pointing to metadata annotation file
Returns: dict – {‘genres’: […], ‘subgenres’: […]}

mirdata.datasets.giantsteps_key.load_key(fhandle: TextIO) → str[source]¶

Load giantsteps_key format key data from a file

Parameters: fhandle (str or file-like) – File like object or string pointing to key annotation file
Returns: str – loaded key data

mirdata.datasets.giantsteps_key.load_tempo(fhandle: TextIO) → str[source]¶

Load giantsteps_key tempo data from a file

Parameters: fhandle (str or file-like) – File-like object or string pointing to metadata annotation file
Returns: str – loaded tempo data

giantsteps_tempo¶

giantsteps_tempo Dataset Loader

Dataset Info

GiantSteps tempo + genre is a collection of annotations for 664 2min(1) audio previews from www.beatport.com, created by Richard Vogl <richard.vogl@tuwien.ac.at> and Peter Knees <peter.knees@tuwien.ac.at>

references:

giantsteps_tempo_cit_1: Peter Knees, Ángel Faraldo, Perfecto Herrera, Richard Vogl, Sebastian Böck, Florian Hörschläger, Mickael Le Goff: “Two data sets for tempo estimation and key detection in electronic dance music annotated from user corrections”, Proc. of the 16th Conference of the International Society for Music Information Retrieval (ISMIR’15), Oct. 2015, Malaga, Spain.
giantsteps_tempo_cit_2: Hendrik Schreiber, Meinard Müller: “A Crowdsourced Experiment for Tempo Estimation of Electronic Dance Music”, Proc. of the 19th Conference of the International Society for Music Information Retrieval (ISMIR’18), Sept. 2018, Paris, France.

The audio files (664 files, size ~1gb) can be downloaded from http://www.beatport.com/ using the bash script:

https://github.com/GiantSteps/giantsteps-tempo-dataset/blob/master/audio_dl.sh

To download the files manually use links of the following form: http://geo-samples.beatport.com/lofi/<name of mp3 file> e.g.: http://geo-samples.beatport.com/lofi/5377710.LOFI.mp3

To convert the audio files to .wav use the script found at https://github.com/GiantSteps/giantsteps-tempo-dataset/blob/master/convert_audio.sh and run:

./convert_audio.sh

To retrieve the genre information, the JSON contained within the website was parsed. The tempo annotation was extracted from forum entries of people correcting the bpm values (i.e. manual annotation of tempo). For more information please refer to the publication [giantsteps_tempo_cit_1].

[giantsteps_tempo_cit_2] found some files without tempo. There are:

3041381.LOFI.mp3
3041383.LOFI.mp3
1327052.LOFI.mp3

Their v2 tempo is denoted as 0.0 in tempo and mirex and has no annotation in the JAMS format.

Most of the audio files are 120 seconds long. Exceptions are:

name              length (sec)
906760.LOFI.mp3   62
1327052.LOFI.mp3  70
4416506.LOFI.mp3  80
1855660.LOFI.mp3  119
3419452.LOFI.mp3  119
3577631.LOFI.mp3  119

class mirdata.datasets.giantsteps_tempo.Dataset(data_home=None)[source]¶

The giantsteps_tempo dataset

Variables

data_home (str) – path where mirdata will look for the dataset
name (str) – the identifier of the dataset
bibtex (str or None) – dataset citation/s in bibtex format
remotes (dict or None) – data to be downloaded
readme (str) – information about the dataset
track (function) – a function mapping a track_id to a mirdata.core.Track

choice_track()[source]¶

Choose a random track

Returns: Track – a Track object instantiated by a random track_id

cite()[source]¶: Print the reference

property default_path¶

Get the default path for the dataset

Returns: str – Local path to the dataset

download(partial_download=None, force_overwrite=False, cleanup=False)[source]¶

Download data to save_dir and optionally print a message.

Parameters

partial_download (list or None) – A list of keys of remotes to partially download. If None, all data is downloaded
force_overwrite (bool) – If True, existing files are overwritten by the downloaded files.
cleanup (bool) – Whether to delete any zip/tar files after extracting.

Raises

ValueError – if invalid keys are passed to partial_download
IOError – if a downloaded file’s checksum is different from expected

license()[source]¶: Print the license

load_audio(*args, **kwargs)[source]¶

Load a giantsteps_tempo audio file.

Parameters

fhandle (str or file-like) – path to audio file

Returns

np.ndarray - the mono audio signal
float - The sample rate of the audio file

load_genre(*args, **kwargs)[source]¶

Load genre data from a file

Parameters: path (str) – path to metadata annotation file
Returns: str – loaded genre data

load_tempo(*args, **kwargs)[source]¶

Load giantsteps_tempo tempo data from a file ordered by confidence

Parameters: fhandle (str or file-like) – File-like object or path to tempo annotation file
Returns: annotations.TempoData – Tempo data

load_tracks()[source]¶

Load all tracks in the dataset

Returns: dict – {track_id: track data}
Raises: NotImplementedError – If the dataset does not support Tracks

track_ids[source]¶

Return track ids

Returns: list – A list of track ids

validate(verbose=True)[source]¶

Validate if the stored dataset is a valid version

Parameters

verbose (bool) – If False, don’t print output

Returns

list - files in the index but are missing locally
list - files which have an invalid checksum

class mirdata.datasets.giantsteps_tempo.Track(track_id, data_home, dataset_name, index, metadata)[source]¶

giantsteps_tempo track class

Parameters

track_id (str) – track id of the track

Variables

audio_path (str) – track audio path
title (str) – title of the track
track_id (str) – track id
annotation_v1_path (str) – track annotation v1 path
annotation_v2_path (str) – track annotation v2 path

Other Parameters

genre (dict) – Human-labeled metadata annotation
tempo (list) – List of annotations.TempoData, ordered by confidence
tempo_v2 (list) – List of annotations.TempoData for version 2, ordered by confidence

property audio¶

The track’s audio

Returns

np.ndarray - audio signal
float - sample rate

to_jams()[source]¶

Get the track’s data in jams format

Returns: jams.JAMS – the track’s data in jams format

to_jams_v2()[source]¶

Get the track’s data in jams format

Returns: jams.JAMS – the track’s data in jams format

mirdata.datasets.giantsteps_tempo.load_audio(fhandle: str) → Tuple[numpy.ndarray, float][source]¶

Load a giantsteps_tempo audio file.

Parameters

fhandle (str or file-like) – path to audio file

Returns

np.ndarray - the mono audio signal
float - The sample rate of the audio file

mirdata.datasets.giantsteps_tempo.load_genre(fhandle: TextIO) → str[source]¶

Load genre data from a file

Parameters: path (str) – path to metadata annotation file
Returns: str – loaded genre data

mirdata.datasets.giantsteps_tempo.load_tempo(fhandle: TextIO) → mirdata.annotations.TempoData [source]¶

Load giantsteps_tempo tempo data from a file ordered by confidence

Parameters: fhandle (str or file-like) – File-like object or path to tempo annotation file
Returns: annotations.TempoData – Tempo data

groove_midi¶

Groove MIDI Loader

Dataset Info

The Groove MIDI Dataset (GMD) is composed of 13.6 hours of aligned MIDI and synthesized audio of human-performed, tempo-aligned expressive drumming. The dataset contains 1,150 MIDI files and over 22,000 measures of drumming.

To enable a wide range of experiments and encourage comparisons between methods on the same data, Gillick et al. created a new dataset of drum performances recorded in MIDI format. They hired professional drummers and asked them to perform in multiple styles to a click track on a Roland TD-11 electronic drum kit. They also recorded the aligned, high-quality synthesized audio from the TD-11 and include it in the release.

The Groove MIDI Dataset (GMD), has several attributes that distinguish it from existing ones:

The dataset contains about 13.6 hours, 1,150 MIDI files, and over 22,000 measures of drumming.
Each performance was played along with a metronome set at a specific tempo by the drummer.
The data includes performances by a total of 10 drummers, with more than 80% of duration coming from hired professionals. The professionals were able to improvise in a wide range of styles, resulting in a diverse dataset.
The drummers were instructed to play a mix of long sequences (several minutes of continuous playing) and short beats and fills.
Each performance is annotated with a genre (provided by the drummer), tempo, and anonymized drummer ID.
Most of the performances are in 4/4 time, with a few examples from other time signatures.
Four drummers were asked to record the same set of 10 beats in their own style. These are included in the test set split, labeled eval-session/groove1-10.
In addition to the MIDI recordings that are the primary source of data for the experiments in this work, the authors captured the synthesized audio outputs of the drum set and aligned them to within 2ms of the corresponding MIDI files.

A train/validation/test split configuration is provided for easier comparison of model accuracy on various tasks.

The dataset is made available by Google LLC under a Creative Commons Attribution 4.0 International (CC BY 4.0) License.

For more details, please visit: http://magenta.tensorflow.org/datasets/groove

class mirdata.datasets.groove_midi.Dataset(data_home=None)[source]¶

The groove_midi dataset

Variables

data_home (str) – path where mirdata will look for the dataset
name (str) – the identifier of the dataset
bibtex (str or None) – dataset citation/s in bibtex format
remotes (dict or None) – data to be downloaded
readme (str) – information about the dataset
track (function) – a function mapping a track_id to a mirdata.core.Track

choice_track()[source]¶

Choose a random track

Returns: Track – a Track object instantiated by a random track_id

cite()[source]¶: Print the reference

property default_path¶

Get the default path for the dataset

Returns: str – Local path to the dataset

download(partial_download=None, force_overwrite=False, cleanup=False)[source]¶

Download data to save_dir and optionally print a message.

Parameters

partial_download (list or None) – A list of keys of remotes to partially download. If None, all data is downloaded
force_overwrite (bool) – If True, existing files are overwritten by the downloaded files.
cleanup (bool) – Whether to delete any zip/tar files after extracting.

Raises

ValueError – if invalid keys are passed to partial_download
IOError – if a downloaded file’s checksum is different from expected

license()[source]¶: Print the license

load_audio(*args, **kwargs)[source]¶

Load a Groove MIDI audio file.

Parameters

path – path to an audio file

Returns

np.ndarray - the mono audio signal
float - The sample rate of the audio file

load_beats(*args, **kwargs)[source]¶

Load beat data from the midi file.

Parameters

midi_path (str) – path to midi file
midi (pretty_midi.PrettyMIDI) – pre-loaded midi object or None if None, the midi object is loaded using midi_path

Returns

annotations.BeatData – machine generated beat data

load_drum_events(*args, **kwargs)[source]¶

Load drum events from the midi file.

Parameters

midi_path (str) – path to midi file
midi (pretty_midi.PrettyMIDI) – pre-loaded midi object or None if None, the midi object is loaded using midi_path

Returns

annotations.EventData – drum event data

load_midi(*args, **kwargs)[source]¶

Load a Groove MIDI midi file.

Parameters: fhandle (str or file-like) – File-like object or path to midi file
Returns: midi_data (pretty_midi.PrettyMIDI) – pretty_midi object

load_tracks()[source]¶

Load all tracks in the dataset

Returns: dict – {track_id: track data}
Raises: NotImplementedError – If the dataset does not support Tracks

track_ids[source]¶

Return track ids

Returns: list – A list of track ids

validate(verbose=True)[source]¶

Validate if the stored dataset is a valid version

Parameters

verbose (bool) – If False, don’t print output

Returns

list - files in the index but are missing locally
list - files which have an invalid checksum

class mirdata.datasets.groove_midi.Track(track_id, data_home, dataset_name, index, metadata)[source]¶

Groove MIDI Track class

Parameters

track_id (str) – track id of the track

Variables

drummer (str) – Drummer id of the track (ex. ‘drummer1’)
session (str) – Type of session (ex. ‘session1’, ‘eval_session’)
track_id (str) – track id of the track (ex. ‘drummer1/eval_session/1’)
style (str) – Style (genre, groove type) of the track (ex. ‘funk/groove1’)
tempo (int) – track tempo in beats per minute (ex. 138)
beat_type (str) – Whether the track is a beat or a fill (ex. ‘beat’)
time_signature (str) – Time signature of the track (ex. ‘4-4’, ‘6-8’)
midi_path (str) – Path to the midi file
audio_path (str) – Path to the audio file
duration (float) – Duration of the midi file in seconds
split (str) – Whether the track is for a train/valid/test set. One of ‘train’, ‘valid’ or ‘test’.

Other Parameters

beats (BeatData) – Machine-generated beat annotations
drum_events (EventData) – Annotated drum kit events
midi (pretty_midi.PrettyMIDI) – object containing MIDI information

property audio¶

The track’s audio

Returns

np.ndarray - audio signal
float - sample rate

to_jams()[source]¶

Get the track’s data in jams format

Returns: jams.JAMS – the track’s data in jams format

mirdata.datasets.groove_midi.load_audio(path: str) → Tuple[Optional[numpy.ndarray], Optional[float]][source]¶

Load a Groove MIDI audio file.

Parameters

path – path to an audio file

Returns

np.ndarray - the mono audio signal
float - The sample rate of the audio file

mirdata.datasets.groove_midi.load_beats(midi_path, midi=None)[source]¶

Load beat data from the midi file.

Parameters

midi_path (str) – path to midi file
midi (pretty_midi.PrettyMIDI) – pre-loaded midi object or None if None, the midi object is loaded using midi_path

Returns

annotations.BeatData – machine generated beat data

mirdata.datasets.groove_midi.load_drum_events(midi_path, midi=None)[source]¶

Load drum events from the midi file.

Parameters

midi_path (str) – path to midi file
midi (pretty_midi.PrettyMIDI) – pre-loaded midi object or None if None, the midi object is loaded using midi_path

Returns

annotations.EventData – drum event data

mirdata.datasets.groove_midi.load_midi(fhandle: BinaryIO) → Optional[pretty_midi.PrettyMIDI][source]¶

Load a Groove MIDI midi file.

Parameters: fhandle (str or file-like) – File-like object or path to midi file
Returns: midi_data (pretty_midi.PrettyMIDI) – pretty_midi object

gtzan_genre¶

GTZAN-Genre Dataset Loader

Dataset Info

This dataset was used for the well known genre classification paper:

"Musical genre classification of audio signals " by G. Tzanetakis and
P. Cook in IEEE Transactions on Audio and Speech Processing 2002.

The dataset consists of 1000 audio tracks each 30 seconds long. It contains 10 genres, each represented by 100 tracks. The tracks are all 22050 Hz mono 16-bit audio files in .wav format.

class mirdata.datasets.gtzan_genre.Dataset(data_home=None)[source]¶

The gtzan_genre dataset

Variables

data_home (str) – path where mirdata will look for the dataset
name (str) – the identifier of the dataset
bibtex (str or None) – dataset citation/s in bibtex format
remotes (dict or None) – data to be downloaded
readme (str) – information about the dataset
track (function) – a function mapping a track_id to a mirdata.core.Track

choice_track()[source]¶

Choose a random track

Returns: Track – a Track object instantiated by a random track_id

cite()[source]¶: Print the reference

property default_path¶

Get the default path for the dataset

Returns: str – Local path to the dataset

download(partial_download=None, force_overwrite=False, cleanup=False)[source]¶

Download data to save_dir and optionally print a message.

Parameters

partial_download (list or None) – A list of keys of remotes to partially download. If None, all data is downloaded
force_overwrite (bool) – If True, existing files are overwritten by the downloaded files.
cleanup (bool) – Whether to delete any zip/tar files after extracting.

Raises

ValueError – if invalid keys are passed to partial_download
IOError – if a downloaded file’s checksum is different from expected

license()[source]¶: Print the license

load_audio(*args, **kwargs)[source]¶

Load a GTZAN audio file.

Parameters

fhandle (str or file-like) – File-like object or path to audio file

Returns

np.ndarray - the mono audio signal
float - The sample rate of the audio file

load_tracks()[source]¶

Load all tracks in the dataset

Returns: dict – {track_id: track data}
Raises: NotImplementedError – If the dataset does not support Tracks

track_ids[source]¶

Return track ids

Returns: list – A list of track ids

validate(verbose=True)[source]¶

Validate if the stored dataset is a valid version

Parameters

verbose (bool) – If False, don’t print output

Returns

list - files in the index but are missing locally
list - files which have an invalid checksum

class mirdata.datasets.gtzan_genre.Track(track_id, data_home, dataset_name, index, metadata)[source]¶

gtzan_genre Track class

Parameters

track_id (str) – track id of the track

Variables

audio_path (str) – path to the audio file
genre (str) – annotated genre
track_id (str) – track id

property audio¶

The track’s audio

Returns

np.ndarray - audio signal
float - sample rate

to_jams()[source]¶

Get the track’s data in jams format

Returns: jams.JAMS – the track’s data in jams format

mirdata.datasets.gtzan_genre.load_audio(fhandle: BinaryIO) → Tuple[numpy.ndarray, float][source]¶

Load a GTZAN audio file.

Parameters

fhandle (str or file-like) – File-like object or path to audio file

Returns

np.ndarray - the mono audio signal
float - The sample rate of the audio file

guitarset¶

GuitarSet Loader

Dataset Info

GuitarSet provides audio recordings of a variety of musical excerpts played on an acoustic guitar, along with time-aligned annotations including pitch contours, string and fret positions, chords, beats, downbeats, and keys.

GuitarSet contains 360 excerpts that are close to 30 seconds in length. The 360 excerpts are the result of the following combinations:

6 players
2 versions: comping (harmonic accompaniment) and soloing (melodic improvisation)
5 styles: Rock, Singer-Songwriter, Bossa Nova, Jazz, and Funk
3 Progressions: 12 Bar Blues, Autumn Leaves, and Pachelbel Canon.
2 Tempi: slow and fast.

The tonality (key) of each excerpt is sampled uniformly at random.

GuitarSet was recorded with the help of a hexaphonic pickup, which outputs signals for each string separately, allowing automated note-level annotation. Excerpts are recorded with both the hexaphonic pickup and a Neumann U-87 condenser microphone as reference. 3 audio recordings are provided with each excerpt with the following suffix:

hex: original 6 channel wave file from hexaphonic pickup
hex_cln: hex wave files with interference removal applied
mic: monophonic recording from reference microphone
mix: monophonic mixture of original 6 channel file

Each of the 360 excerpts has an accompanying JAMS file which stores 16 annotations. Pitch:

6 pitch_contour annotations (1 per string)
6 midi_note annotations (1 per string)

Beat and Tempo:

1 beat_position annotation
1 tempo annotation

Chords:

2 chord annotations: instructed and performed. The instructed chord annotation is a digital version of the lead sheet that’s provided to the player, and the performed chord annotations are inferred from note annotations, using segmentation and root from the digital lead sheet annotation.

For more details, please visit: http://github.com/marl/guitarset/

class mirdata.datasets.guitarset.Dataset(data_home=None)[source]¶

The guitarset dataset

Variables

data_home (str) – path where mirdata will look for the dataset
name (str) – the identifier of the dataset
bibtex (str or None) – dataset citation/s in bibtex format
remotes (dict or None) – data to be downloaded
readme (str) – information about the dataset
track (function) – a function mapping a track_id to a mirdata.core.Track

choice_track()[source]¶

Choose a random track

Returns: Track – a Track object instantiated by a random track_id

cite()[source]¶: Print the reference

property default_path¶

Get the default path for the dataset

Returns: str – Local path to the dataset

download(partial_download=None, force_overwrite=False, cleanup=False)[source]¶

Download data to save_dir and optionally print a message.

Parameters

partial_download (list or None) – A list of keys of remotes to partially download. If None, all data is downloaded
force_overwrite (bool) – If True, existing files are overwritten by the downloaded files.
cleanup (bool) – Whether to delete any zip/tar files after extracting.

Raises

ValueError – if invalid keys are passed to partial_download
IOError – if a downloaded file’s checksum is different from expected

license()[source]¶: Print the license

load_audio(*args, **kwargs)[source]¶

Load a Guitarset audio file.

Parameters

fhandle (str or file-like) – File-like object or path to audio file

Returns

np.ndarray - the mono audio signal
float - The sample rate of the audio file

load_beats(*args, **kwargs)[source]¶

Load a Guitarset beats annotation.

Parameters: fhandle (str or file-like) – File-like object or path of the jams annotation file
Returns: BeatData – Beat data

load_chords(*args, **kwargs)[source]¶

Load a guitarset chord annotation.

Parameters

jams_path (str) – Path of the jams annotation file
leadsheet_version (Bool) – Whether or not to load the leadsheet version of the chord annotation If False, load the infered version.

Returns

ChordData – Chord data

load_key_mode(*args, **kwargs)[source]¶

Load a Guitarset key-mode annotation.

Parameters: fhandle (str or file-like) – File-like object or path of the jams annotation file
Returns: KeyData – Key data

load_multitrack_audio(*args, **kwargs)[source]¶

Load a Guitarset multitrack audio file.

Parameters

fhandle (str or file-like) – File-like object or path to audio file

Returns

np.ndarray - the mono audio signal
float - The sample rate of the audio file

load_notes(*args, **kwargs)[source]¶

Load a guitarset note annotation for a given string

Parameters

jams_path (str) – Path of the jams annotation file
string_num (int), in range(6) – Which string to load. 0 is the Low E string, 5 is the high e string.

Returns

NoteData – Note data for the given string

load_pitch_contour(*args, **kwargs)[source]¶

Load a guitarset pitch contour annotation for a given string

Parameters

jams_path (str) – Path of the jams annotation file
string_num (int), in range(6) – Which string to load. 0 is the Low E string, 5 is the high e string.

Returns

F0Data – Pitch contour data for the given string

load_tracks()[source]¶

Load all tracks in the dataset

Returns: dict – {track_id: track data}
Raises: NotImplementedError – If the dataset does not support Tracks

track_ids[source]¶

Return track ids

Returns: list – A list of track ids

validate(verbose=True)[source]¶

Validate if the stored dataset is a valid version

Parameters

verbose (bool) – If False, don’t print output

Returns

list - files in the index but are missing locally
list - files which have an invalid checksum

class mirdata.datasets.guitarset.Track(track_id, data_home, dataset_name, index, metadata)[source]¶

guitarset Track class

Parameters

track_id (str) – track id of the track

Variables

audio_hex_cln_path (str) – path to the debleeded hex wave file
audio_hex_path (str) – path to the original hex wave file
audio_mic_path (str) – path to the mono wave via microphone
audio_mix_path (str) – path to the mono wave via downmixing hex pickup
jams_path (str) – path to the jams file
mode (str) – one of [‘solo’, ‘comp’] For each excerpt, players are asked to first play in ‘comp’ mode and later play a ‘solo’ version on top of the already recorded comp.
player_id (str) – ID of the different players. one of [‘00’, ‘01’, … , ‘05’]
style (str) – one of [‘Jazz’, ‘Bossa Nova’, ‘Rock’, ‘Singer-Songwriter’, ‘Funk’]
tempo (float) – BPM of the track
track_id (str) – track id

Other Parameters

beats (BeatData) – beat positions
leadsheet_chords (ChordData) – chords as written in the leadsheet
inferred_chords (ChordData) – chords inferred from played transcription
key_mode (KeyData) – key and mode
pitch_contours (dict) – Pitch contours per string - ‘E’: F0Data(…) - ‘A’: F0Data(…) - ‘D’: F0Data(…) - ‘G’: F0Data(…) - ‘B’: F0Data(…) - ‘e’: F0Data(…)
notes (dict) – Notes per string - ‘E’: NoteData(…) - ‘A’: NoteData(…) - ‘D’: NoteData(…) - ‘G’: NoteData(…) - ‘B’: NoteData(…) - ‘e’: NoteData(…)

property audio_hex¶

Hexaphonic audio (6-channels) with one channel per string

Returns

np.ndarray - audio signal
float - sample rate

property audio_hex_cln¶

Hexaphonic audio (6-channels) with one channel per string: after bleed removal

Returns

np.ndarray - audio signal
float - sample rate

property audio_mic¶

The track’s audio

Returns

np.ndarray - audio signal
float - sample rate

property audio_mix¶

Mixture audio (mono)

Returns

np.ndarray - audio signal
float - sample rate

to_jams()[source]¶

Get the track’s data in jams format

Returns: jams.JAMS – the track’s data in jams format

mirdata.datasets.guitarset.load_audio(fhandle: BinaryIO) → Tuple[numpy.ndarray, float][source]¶

Load a Guitarset audio file.

Parameters

fhandle (str or file-like) – File-like object or path to audio file

Returns

np.ndarray - the mono audio signal
float - The sample rate of the audio file

mirdata.datasets.guitarset.load_beats(fhandle: TextIO) → mirdata.annotations.BeatData [source]¶

Load a Guitarset beats annotation.

Parameters: fhandle (str or file-like) – File-like object or path of the jams annotation file
Returns: BeatData – Beat data

mirdata.datasets.guitarset.load_chords(jams_path, leadsheet_version=True)[source]¶

Load a guitarset chord annotation.

Parameters

jams_path (str) – Path of the jams annotation file
leadsheet_version (Bool) – Whether or not to load the leadsheet version of the chord annotation If False, load the infered version.

Returns

ChordData – Chord data

mirdata.datasets.guitarset.load_key_mode(fhandle: TextIO) → mirdata.annotations.KeyData [source]¶

Load a Guitarset key-mode annotation.

Parameters: fhandle (str or file-like) – File-like object or path of the jams annotation file
Returns: KeyData – Key data

mirdata.datasets.guitarset.load_multitrack_audio(fhandle: BinaryIO) → Tuple[numpy.ndarray, float][source]¶

Load a Guitarset multitrack audio file.

Parameters

fhandle (str or file-like) – File-like object or path to audio file

Returns

np.ndarray - the mono audio signal
float - The sample rate of the audio file

mirdata.datasets.guitarset.load_notes(jams_path, string_num)[source]¶

Load a guitarset note annotation for a given string

Parameters

jams_path (str) – Path of the jams annotation file
string_num (int), in range(6) – Which string to load. 0 is the Low E string, 5 is the high e string.

Returns

NoteData – Note data for the given string

mirdata.datasets.guitarset.load_pitch_contour(jams_path, string_num)[source]¶

Load a guitarset pitch contour annotation for a given string

Parameters

jams_path (str) – Path of the jams annotation file
string_num (int), in range(6) – Which string to load. 0 is the Low E string, 5 is the high e string.

Returns

F0Data – Pitch contour data for the given string

ikala¶

iKala Dataset Loader

Dataset Info

The iKala dataset is comprised of 252 30-second excerpts sampled from 206 iKala songs (plus 100 hidden excerpts reserved for MIREX). The music accompaniment and the singing voice are recorded at the left and right channels respectively and can be found under the Wavfile directory. In addition, the human-labeled pitch contours and timestamped lyrics can be found under PitchLabel and Lyrics respectively.

For more details, please visit: http://mac.citi.sinica.edu.tw/ikala/

class mirdata.datasets.ikala.Dataset(data_home=None)[source]¶

The ikala dataset

Variables

data_home (str) – path where mirdata will look for the dataset
name (str) – the identifier of the dataset
bibtex (str or None) – dataset citation/s in bibtex format
remotes (dict or None) – data to be downloaded
readme (str) – information about the dataset
track (function) – a function mapping a track_id to a mirdata.core.Track

choice_track()[source]¶

Choose a random track

Returns: Track – a Track object instantiated by a random track_id

cite()[source]¶: Print the reference

property default_path¶

Get the default path for the dataset

Returns: str – Local path to the dataset

download(partial_download=None, force_overwrite=False, cleanup=False)[source]¶

Download data to save_dir and optionally print a message.

Parameters

partial_download (list or None) – A list of keys of remotes to partially download. If None, all data is downloaded
force_overwrite (bool) – If True, existing files are overwritten by the downloaded files.
cleanup (bool) – Whether to delete any zip/tar files after extracting.

Raises

ValueError – if invalid keys are passed to partial_download
IOError – if a downloaded file’s checksum is different from expected

license()[source]¶: Print the license

load_f0(*args, **kwargs)[source]¶

Load an ikala f0 annotation

Parameters: fhandle (str or file-like) – File-like object or path to f0 annotation file
Raises: IOError – If f0_path does not exist
Returns: F0Data – the f0 annotation data

load_instrumental_audio(*args, **kwargs)[source]¶

Load ikala instrumental audio

Parameters

fhandle (str or file-like) – File-like object or path to audio file

Returns

np.ndarray - audio signal
float - sample rate

load_lyrics(*args, **kwargs)[source]¶

Load an ikala lyrics annotation

Parameters: fhandle (str or file-like) – File-like object or path to lyric annotation file
Raises: IOError – if lyrics_path does not exist
Returns: LyricData – lyric annotation data

load_mix_audio(*args, **kwargs)[source]¶

Load an ikala mix.

Parameters

fhandle (str or file-like) – File-like object or path to audio file

Returns

np.ndarray - audio signal
float - sample rate

load_tracks()[source]¶

Load all tracks in the dataset

Returns: dict – {track_id: track data}
Raises: NotImplementedError – If the dataset does not support Tracks

load_vocal_audio(*args, **kwargs)[source]¶

Load ikala vocal audio

Parameters

fhandle (str or file-like) – File-like object or path to audio file

Returns

np.ndarray - audio signal
float - sample rate

track_ids[source]¶

Return track ids

Returns: list – A list of track ids

validate(verbose=True)[source]¶

Validate if the stored dataset is a valid version

Parameters

verbose (bool) – If False, don’t print output

Returns

list - files in the index but are missing locally
list - files which have an invalid checksum

class mirdata.datasets.ikala.Track(track_id, data_home, dataset_name, index, metadata)[source]¶

ikala Track class

Parameters

track_id (str) – track id of the track

Variables

audio_path (str) – path to the track’s audio file
f0_path (str) – path to the track’s f0 annotation file
lyrics_path (str) – path to the track’s lyric annotation file
section (str) – section. Either ‘verse’ or ‘chorus’
singer_id (str) – singer id
song_id (str) – song id of the track
track_id (str) – track id

Other Parameters

f0 (F0Data) – human-annotated singing voice pitch
lyrics (LyricsData) – human-annotated lyrics

property instrumental_audio¶

instrumental audio (mono)

Returns

np.ndarray - audio signal
float - sample rate

property mix_audio¶

mixture audio (mono)

Returns

np.ndarray - audio signal
float - sample rate

to_jams()[source]¶

Get the track’s data in jams format

Returns: jams.JAMS – the track’s data in jams format

property vocal_audio¶

solo vocal audio (mono)

Returns

np.ndarray - audio signal
float - sample rate

mirdata.datasets.ikala.load_f0(fhandle: TextIO) → mirdata.annotations.F0Data [source]¶

Load an ikala f0 annotation

Parameters: fhandle (str or file-like) – File-like object or path to f0 annotation file
Raises: IOError – If f0_path does not exist
Returns: F0Data – the f0 annotation data

mirdata.datasets.ikala.load_instrumental_audio(fhandle: BinaryIO) → Tuple[numpy.ndarray, float][source]¶

Load ikala instrumental audio

Parameters

fhandle (str or file-like) – File-like object or path to audio file

Returns

np.ndarray - audio signal
float - sample rate

mirdata.datasets.ikala.load_lyrics(fhandle: TextIO) → mirdata.annotations.LyricData [source]¶

Load an ikala lyrics annotation

Parameters: fhandle (str or file-like) – File-like object or path to lyric annotation file
Raises: IOError – if lyrics_path does not exist
Returns: LyricData – lyric annotation data

mirdata.datasets.ikala.load_mix_audio(fhandle: BinaryIO) → Tuple[numpy.ndarray, float][source]¶

Load an ikala mix.

Parameters

fhandle (str or file-like) – File-like object or path to audio file

Returns

np.ndarray - audio signal
float - sample rate

mirdata.datasets.ikala.load_vocal_audio(fhandle: BinaryIO) → Tuple[numpy.ndarray, float][source]¶

Load ikala vocal audio

Parameters

fhandle (str or file-like) – File-like object or path to audio file

Returns

np.ndarray - audio signal
float - sample rate

irmas¶

IRMAS Loader

Dataset Info

IRMAS: a dataset for instrument recognition in musical audio signals

This dataset includes musical audio excerpts with annotations of the predominant instrument(s) present. It was used for the evaluation in the following article:

Bosch, J. J., Janer, J., Fuhrmann, F., & Herrera, P. “A Comparison of Sound Segregation Techniques for
Predominant Instrument Recognition in Musical Audio Signals”, in Proc. ISMIR (pp. 559-564), 2012.

IRMAS is intended to be used for training and testing methods for the automatic recognition of predominant instruments in musical audio. The instruments considered are: cello, clarinet, flute, acoustic guitar, electric guitar, organ, piano, saxophone, trumpet, violin, and human singing voice. This dataset is derived from the one compiled by Ferdinand Fuhrmann in his PhD thesis, with the difference that we provide audio data in stereo format, the annotations in the testing dataset are limited to specific pitched instruments, and there is a different amount and lenght of excerpts from the original dataset.

The dataset is split into training and test data.

Training data

Total audio samples: 6705 They are excerpts of 3 seconds from more than 2000 distinct recordings.

Audio specifications

Sampling frequency: 44.1 kHz
Bit-depth: 16 bit
Audio format: .wav

IRMAS Dataset trainig samples are annotated by storing the information of each track in their filenames.

Predominant instrument:
- The annotation of the predominant instrument of each excerpt is both in the name of the containing folder, and in the file name: cello (cel), clarinet (cla), flute (flu), acoustic guitar (gac), electric guitar (gel), organ (org), piano (pia), saxophone (sax), trumpet (tru), violin (vio), and human singing voice (voi).
- The number of files per instrument are: cel(388), cla(505), flu(451), gac(637), gel(760), org(682), pia(721), sax(626), tru(577), vio(580), voi(778).
Drum presence
- Additionally, some of the files have annotation in the filename regarding the presence ([dru]) or non presence([nod]) of drums.
The annotation of the musical genre:
- country-folk ([cou_fol])
- classical ([cla]),
- pop-rock ([pop_roc])
- latin-soul ([lat_sou])
- jazz-blues ([jaz_blu]).

Testing data

Total audio samples: 2874

Audio specifications

Sampling frequency: 44.1 kHz
Bit-depth: 16 bit
Audio format: .wav

IRMAS Dataset testing samples are annotated by the following basis:

Predominant instrument:

The annotations for an excerpt named: “excerptName.wav” are given in “excerptName.txt”. More than one instrument may be annotated in each excerpt, one label per line. This part of the dataset contains excerpts from a diversity of western musical genres, with varied instrumentations, and it is derived from the original testing dataset from Fuhrmann (http://www.dtic.upf.edu/~ffuhrmann/PhD/). Instrument nomenclatures are the same as the training dataset.

Dataset compiled by Juan J. Bosch, Ferdinand Fuhrmann, Perfecto Herrera, Music Technology Group - Universitat Pompeu Fabra (Barcelona).

The IRMAS dataset is offered free of charge for non-commercial use only. You can not redistribute it nor modify it. This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License

For more details, please visit: https://www.upf.edu/web/mtg/irmas

class mirdata.datasets.irmas.Dataset(data_home=None)[source]¶

The irmas dataset

Variables

data_home (str) – path where mirdata will look for the dataset
name (str) – the identifier of the dataset
bibtex (str or None) – dataset citation/s in bibtex format
remotes (dict or None) – data to be downloaded
readme (str) – information about the dataset
track (function) – a function mapping a track_id to a mirdata.core.Track

choice_track()[source]¶

Choose a random track

Returns: Track – a Track object instantiated by a random track_id

cite()[source]¶: Print the reference

property default_path¶

Get the default path for the dataset

Returns: str – Local path to the dataset

download(partial_download=None, force_overwrite=False, cleanup=False)[source]¶

Download data to save_dir and optionally print a message.

Parameters

partial_download (list or None) – A list of keys of remotes to partially download. If None, all data is downloaded
force_overwrite (bool) – If True, existing files are overwritten by the downloaded files.
cleanup (bool) – Whether to delete any zip/tar files after extracting.

Raises

ValueError – if invalid keys are passed to partial_download
IOError – if a downloaded file’s checksum is different from expected

license()[source]¶: Print the license

load_audio(*args, **kwargs)[source]¶

Load a IRMAS dataset audio file.

Parameters

fhandle (str or file-like) – File-like object or path to audio file

Returns

np.ndarray - the mono audio signal
float - The sample rate of the audio file

load_pred_inst(*args, **kwargs)[source]¶

Load predominant instrument of track

Parameters: fhandle (str or file-like) – File-like object or path where the test annotations are stored.
Returns: list(str) – test track predominant instrument(s) annotations

load_tracks()[source]¶

Load all tracks in the dataset

Returns: dict – {track_id: track data}
Raises: NotImplementedError – If the dataset does not support Tracks

track_ids[source]¶

Return track ids

Returns: list – A list of track ids

validate(verbose=True)[source]¶

Validate if the stored dataset is a valid version

Parameters

verbose (bool) – If False, don’t print output

Returns

list - files in the index but are missing locally
list - files which have an invalid checksum

class mirdata.datasets.irmas.Track(track_id, data_home, dataset_name, index, metadata)[source]¶

IRMAS track class

Parameters

track_id (str) – track id of the track
data_home (str) – Local path where the dataset is stored. If None, looks for the data in the default directory, ~/mir_datasets/Mridangam-Stroke

Variables

track_id (str) – track id
predominant_instrument (list) – Training tracks predominant instrument
train (bool) – flag to identify if the track is from the training of the testing dataset
genre (str) – string containing the namecode of the genre of the track.
drum (bool) – flag to identify if the track contains drums or not.

Other Parameters

instrument (list) – list of predominant instruments as str

property audio¶

The track’s audio signal

Returns

np.ndarray - the mono audio signal
float - The sample rate of the audio file

to_jams()[source]¶

the track’s data in jams format

Returns: jams.JAMS – return track data in jam format

mirdata.datasets.irmas.load_audio(fhandle: BinaryIO) → Tuple[numpy.ndarray, float][source]¶

Load a IRMAS dataset audio file.

Parameters

fhandle (str or file-like) – File-like object or path to audio file

Returns

np.ndarray - the mono audio signal
float - The sample rate of the audio file

mirdata.datasets.irmas.load_pred_inst(fhandle: TextIO) → List[str][source]¶

Load predominant instrument of track

Parameters: fhandle (str or file-like) – File-like object or path where the test annotations are stored.
Returns: list(str) – test track predominant instrument(s) annotations

maestro¶

MAESTRO Dataset Loader

Dataset Info

MAESTRO (MIDI and Audio Edited for Synchronous TRacks and Organization) is a dataset composed of over 200 hours of virtuosic piano performances captured with fine alignment (~3 ms) between note labels and audio waveforms.

The dataset is created and released by Google’s Magenta team.

The dataset contains over 200 hours of paired audio and MIDI recordings from ten years of International Piano-e-Competition. The MIDI data includes key strike velocities and sustain/sostenuto/una corda pedal positions. Audio and MIDI files are aligned with ∼3 ms accuracy and sliced to individual musical pieces, which are annotated with composer, title, and year of performance. Uncompressed audio is of CD quality or higher (44.1–48 kHz 16-bit PCM stereo).

A train/validation/test split configuration is also proposed, so that the same composition, even if performed by multiple contestants, does not appear in multiple subsets. Repertoire is mostly classical, including composers from the 17th to early 20th century.

The dataset is made available by Google LLC under a Creative Commons Attribution Non-Commercial Share-Alike 4.0 (CC BY-NC-SA 4.0) license.

This loader supports MAESTRO version 2.

For more details, please visit: https://magenta.tensorflow.org/datasets/maestro

class mirdata.datasets.maestro.Dataset(data_home=None)[source]¶

The maestro dataset

Variables

data_home (str) – path where mirdata will look for the dataset
name (str) – the identifier of the dataset
bibtex (str or None) – dataset citation/s in bibtex format
remotes (dict or None) – data to be downloaded
readme (str) – information about the dataset
track (function) – a function mapping a track_id to a mirdata.core.Track

choice_track()[source]¶

Choose a random track

Returns: Track – a Track object instantiated by a random track_id

cite()[source]¶: Print the reference

property default_path¶

Get the default path for the dataset

Returns: str – Local path to the dataset

download(partial_download=None, force_overwrite=False, cleanup=False)[source]¶

Download the dataset

Parameters

partial_download (list or None) – A list of keys of remotes to partially download. If None, all data is downloaded
force_overwrite (bool) – If True, existing files are overwritten by the downloaded files.
cleanup (bool) – Whether to delete any zip/tar files after extracting.

Raises

ValueError – if invalid keys are passed to partial_download
IOError – if a downloaded file’s checksum is different from expected

license()[source]¶: Print the license

load_audio(*args, **kwargs)[source]¶

Load a MAESTRO audio file.

Parameters

fhandle (str or file-like) – File-like object or path to audio file

Returns

np.ndarray - the mono audio signal
float - The sample rate of the audio file

load_midi(*args, **kwargs)[source]¶

Load a MAESTRO midi file.

Parameters: fhandle (str or file-like) – File-like object or path to midi file
Returns: pretty_midi.PrettyMIDI – pretty_midi object

load_notes(*args, **kwargs)[source]¶

Load note data from the midi file.

Parameters

midi_path (str) – path to midi file
midi (pretty_midi.PrettyMIDI) – pre-loaded midi object or None if None, the midi object is loaded using midi_path

Returns

NoteData – note annotations

load_tracks()[source]¶

Load all tracks in the dataset

Returns: dict – {track_id: track data}
Raises: NotImplementedError – If the dataset does not support Tracks

track_ids[source]¶

Return track ids

Returns: list – A list of track ids

validate(verbose=True)[source]¶

Validate if the stored dataset is a valid version

Parameters

verbose (bool) – If False, don’t print output

Returns

list - files in the index but are missing locally
list - files which have an invalid checksum

class mirdata.datasets.maestro.Track(track_id, data_home, dataset_name, index, metadata)[source]¶

MAESTRO Track class

Parameters

track_id (str) – track id of the track

Variables

audio_path (str) – Path to the track’s audio file
canonical_composer (str) – Composer of the piece, standardized on a single spelling for a given name.
canonical_title (str) – Title of the piece. Not guaranteed to be standardized to a single representation.
duration (float) – Duration in seconds, based on the MIDI file.
midi_path (str) – Path to the track’s MIDI file
split (str) – Suggested train/validation/test split.
track_id (str) – track id
year (int) – Year of performance.

Cached Property:: midi (pretty_midi.PrettyMIDI): object containing MIDI annotations notes (NoteData): annotated piano notes

property audio¶

The track’s audio

Returns

np.ndarray - audio signal
float - sample rate

to_jams()[source]¶

Get the track’s data in jams format

Returns: jams.JAMS – the track’s data in jams format

mirdata.datasets.maestro.load_audio(fhandle: BinaryIO) → Tuple[numpy.ndarray, float][source]¶

Load a MAESTRO audio file.

Parameters

fhandle (str or file-like) – File-like object or path to audio file

Returns

np.ndarray - the mono audio signal
float - The sample rate of the audio file

mirdata.datasets.maestro.load_midi(fhandle: BinaryIO) → pretty_midi.PrettyMIDI [source]¶

Load a MAESTRO midi file.

Parameters: fhandle (str or file-like) – File-like object or path to midi file
Returns: pretty_midi.PrettyMIDI – pretty_midi object

mirdata.datasets.maestro.load_notes(midi_path, midi=None)[source]¶

Load note data from the midi file.

Parameters

midi_path (str) – path to midi file
midi (pretty_midi.PrettyMIDI) – pre-loaded midi object or None if None, the midi object is loaded using midi_path

Returns

NoteData – note annotations

medley_solos_db¶

Medley-solos-DB Dataset Loader.

Dataset Info

Medley-solos-DB is a cross-collection dataset for automatic musical instrument recognition in solo recordings. It consists of a training set of 3-second audio clips, which are extracted from the MedleyDB dataset (Bittner et al., ISMIR 2014) as well as a test set of 3-second clips, which are extracted from the solosDB dataset (Essid et al., IEEE TASLP 2009).

Each of these clips contains a single instrument among a taxonomy of eight:

clarinet,

distorted electric guitar,

female singer,

flute,

piano,

tenor saxophone,

trumpet, and

violin.

The Medley-solos-DB dataset is the dataset that is used in the benchmarks of musical instrument recognition in the publications of Lostanlen and Cella (ISMIR 2016) and Andén et al. (IEEE TSP 2019).

class mirdata.datasets.medley_solos_db.Dataset(data_home=None)[source]¶

The medley_solos_db dataset

Variables

data_home (str) – path where mirdata will look for the dataset
name (str) – the identifier of the dataset
bibtex (str or None) – dataset citation/s in bibtex format
remotes (dict or None) – data to be downloaded
readme (str) – information about the dataset
track (function) – a function mapping a track_id to a mirdata.core.Track

choice_track()[source]¶

Choose a random track

Returns: Track – a Track object instantiated by a random track_id

cite()[source]¶: Print the reference

property default_path¶

Get the default path for the dataset

Returns: str – Local path to the dataset

download(partial_download=None, force_overwrite=False, cleanup=False)[source]¶

Download data to save_dir and optionally print a message.

Parameters

partial_download (list or None) – A list of keys of remotes to partially download. If None, all data is downloaded
force_overwrite (bool) – If True, existing files are overwritten by the downloaded files.
cleanup (bool) – Whether to delete any zip/tar files after extracting.

Raises

ValueError – if invalid keys are passed to partial_download
IOError – if a downloaded file’s checksum is different from expected

license()[source]¶: Print the license

load_audio(*args, **kwargs)[source]¶

Load a Medley Solos DB audio file.

Parameters

fhandle (str or file-like) – File-like object or path to audio file

Returns

np.ndarray - the mono audio signal
float - The sample rate of the audio file

load_tracks()[source]¶

Load all tracks in the dataset

Returns: dict – {track_id: track data}
Raises: NotImplementedError – If the dataset does not support Tracks

track_ids[source]¶

Return track ids

Returns: list – A list of track ids

validate(verbose=True)[source]¶

Validate if the stored dataset is a valid version

Parameters

verbose (bool) – If False, don’t print output

Returns

list - files in the index but are missing locally
list - files which have an invalid checksum

class mirdata.datasets.medley_solos_db.Track(track_id, data_home, dataset_name, index, metadata)[source]¶

medley_solos_db Track class

Parameters

track_id (str) – track id of the track

Variables

audio_path (str) – path to the track’s audio file
instrument (str) – instrument encoded by its English name
instrument_id (int) – instrument encoded as an integer
song_id (int) – song encoded as an integer
subset (str) – either equal to ‘train’, ‘validation’, or ‘test’
track_id (str) – track id

property audio¶

The track’s audio

Returns

np.ndarray - audio signal
float - sample rate

to_jams()[source]¶

Get the track’s data in jams format

Returns: jams.JAMS – the track’s data in jams format

mirdata.datasets.medley_solos_db.load_audio(fhandle: BinaryIO) → Tuple[numpy.ndarray, float][source]¶

Load a Medley Solos DB audio file.

Parameters

fhandle (str or file-like) – File-like object or path to audio file

Returns

np.ndarray - the mono audio signal
float - The sample rate of the audio file

medleydb_melody¶

MedleyDB melody Dataset Loader

Dataset Info

MedleyDB melody is a subset of the MedleyDB dataset containing only the mixtures and melody annotations.

MedleyDB is a dataset of annotated, royalty-free multitrack recordings. MedleyDB was curated primarily to support research on melody extraction, addressing important shortcomings of existing collections. For each song we provide melody f0 annotations as well as instrument activations for evaluating automatic instrument recognition.

For more details, please visit: https://medleydb.weebly.com

class mirdata.datasets.medleydb_melody.Dataset(data_home=None)[source]¶

The medleydb_melody dataset

Variables

data_home (str) – path where mirdata will look for the dataset
name (str) – the identifier of the dataset
bibtex (str or None) – dataset citation/s in bibtex format
remotes (dict or None) – data to be downloaded
readme (str) – information about the dataset
track (function) – a function mapping a track_id to a mirdata.core.Track

choice_track()[source]¶

Choose a random track

Returns: Track – a Track object instantiated by a random track_id

cite()[source]¶: Print the reference

property default_path¶

Get the default path for the dataset

Returns: str – Local path to the dataset

download(partial_download=None, force_overwrite=False, cleanup=False)[source]¶

Download data to save_dir and optionally print a message.

Parameters

partial_download (list or None) – A list of keys of remotes to partially download. If None, all data is downloaded
force_overwrite (bool) – If True, existing files are overwritten by the downloaded files.
cleanup (bool) – Whether to delete any zip/tar files after extracting.

Raises

ValueError – if invalid keys are passed to partial_download
IOError – if a downloaded file’s checksum is different from expected

license()[source]¶: Print the license

load_audio(*args, **kwargs)[source]¶

Load a MedleyDB audio file.

Parameters

fhandle (str or file-like) – File-like object or path to audio file

Returns

np.ndarray - the mono audio signal
float - The sample rate of the audio file

load_melody(*args, **kwargs)[source]¶

Load a MedleyDB melody1 or melody2 annotation file

Parameters: fhandle (str or file-like) – File-like object or path to a melody annotation file
Raises: IOError – if melody_path does not exist
Returns: F0Data – melody data

load_melody3(*args, **kwargs)[source]¶

Load a MedleyDB melody3 annotation file

Parameters: fhandle (str or file-like) – File-like object or melody 3 melody annotation path
Raises: IOError – if melody_path does not exist
Returns: MultiF0Data – melody 3 annotation data

load_tracks()[source]¶

Load all tracks in the dataset

Returns: dict – {track_id: track data}
Raises: NotImplementedError – If the dataset does not support Tracks

track_ids[source]¶

Return track ids

Returns: list – A list of track ids

validate(verbose=True)[source]¶

Validate if the stored dataset is a valid version

Parameters

verbose (bool) – If False, don’t print output

Returns

list - files in the index but are missing locally
list - files which have an invalid checksum

class mirdata.datasets.medleydb_melody.Track(track_id, data_home, dataset_name, index, metadata)[source]¶

medleydb_melody Track class

Parameters

track_id (str) – track id of the track

Variables

artist (str) – artist
audio_path (str) – path to the audio file
genre (str) – genre
is_excerpt (bool) – True if the track is an excerpt
is_instrumental (bool) – True of the track does not contain vocals
melody1_path (str) – path to the melody1 annotation file
melody2_path (str) – path to the melody2 annotation file
melody3_path (str) – path to the melody3 annotation file
n_sources (int) – Number of instruments in the track
title (str) – title
track_id (str) – track id

Other Parameters

melody1 (F0Data) – the pitch of the single most predominant source (often the voice)
melody2 (F0Data) – the pitch of the predominant source for each point in time
melody3 (MultiF0Data) – the pitch of any melodic source. Allows for more than one f0 value at a time

property audio¶

The track’s audio

Returns

np.ndarray - audio signal
float - sample rate

to_jams()[source]¶

Get the track’s data in jams format

Returns: jams.JAMS – the track’s data in jams format

mirdata.datasets.medleydb_melody.load_audio(fhandle: BinaryIO) → Tuple[numpy.ndarray, float][source]¶

Load a MedleyDB audio file.

Parameters

fhandle (str or file-like) – File-like object or path to audio file

Returns

np.ndarray - the mono audio signal
float - The sample rate of the audio file

mirdata.datasets.medleydb_melody.load_melody(fhandle: TextIO) → mirdata.annotations.F0Data [source]¶

Load a MedleyDB melody1 or melody2 annotation file

Parameters: fhandle (str or file-like) – File-like object or path to a melody annotation file
Raises: IOError – if melody_path does not exist
Returns: F0Data – melody data

mirdata.datasets.medleydb_melody.load_melody3(fhandle: TextIO) → mirdata.annotations.MultiF0Data [source]¶

Load a MedleyDB melody3 annotation file

Parameters: fhandle (str or file-like) – File-like object or melody 3 melody annotation path
Raises: IOError – if melody_path does not exist
Returns: MultiF0Data – melody 3 annotation data

medleydb_pitch¶

MedleyDB pitch Dataset Loader

Dataset Info

MedleyDB Pitch is a pitch-tracking subset of the MedleyDB dataset containing only f0-annotated, monophonic stems.

MedleyDB is a dataset of annotated, royalty-free multitrack recordings. MedleyDB was curated primarily to support research on melody extraction, addressing important shortcomings of existing collections. For each song we provide melody f0 annotations as well as instrument activations for evaluating automatic instrument recognition.

For more details, please visit: https://medleydb.weebly.com

class mirdata.datasets.medleydb_pitch.Dataset(data_home=None)[source]¶

The medleydb_pitch dataset

Variables

data_home (str) – path where mirdata will look for the dataset
name (str) – the identifier of the dataset
bibtex (str or None) – dataset citation/s in bibtex format
remotes (dict or None) – data to be downloaded
readme (str) – information about the dataset
track (function) – a function mapping a track_id to a mirdata.core.Track

choice_track()[source]¶

Choose a random track

Returns: Track – a Track object instantiated by a random track_id

cite()[source]¶: Print the reference

property default_path¶

Get the default path for the dataset

Returns: str – Local path to the dataset

download(partial_download=None, force_overwrite=False, cleanup=False)[source]¶

Download data to save_dir and optionally print a message.

Parameters

partial_download (list or None) – A list of keys of remotes to partially download. If None, all data is downloaded
force_overwrite (bool) – If True, existing files are overwritten by the downloaded files.
cleanup (bool) – Whether to delete any zip/tar files after extracting.

Raises

ValueError – if invalid keys are passed to partial_download
IOError – if a downloaded file’s checksum is different from expected

license()[source]¶: Print the license

load_audio(*args, **kwargs)[source]¶

Load a MedleyDB audio file.

Parameters

fhandle (str or file-like) – File-like object or path to audio file

Returns

np.ndarray - the mono audio signal
float - The sample rate of the audio file

load_pitch(*args, **kwargs)[source]¶

load a MedleyDB pitch annotation file

Parameters: pitch_path (str) – path to pitch annotation file
Raises: IOError – if pitch_path doesn’t exist
Returns: F0Data – pitch annotation

load_tracks()[source]¶

Load all tracks in the dataset

Returns: dict – {track_id: track data}
Raises: NotImplementedError – If the dataset does not support Tracks

track_ids[source]¶

Return track ids

Returns: list – A list of track ids

validate(verbose=True)[source]¶

Validate if the stored dataset is a valid version

Parameters

verbose (bool) – If False, don’t print output

Returns

list - files in the index but are missing locally
list - files which have an invalid checksum

class mirdata.datasets.medleydb_pitch.Track(track_id, data_home, dataset_name, index, metadata)[source]¶

medleydb_pitch Track class

Parameters

track_id (str) – track id of the track

Variables

artist (str) – artist
audio_path (str) – path to the audio file
genre (str) – genre
instrument (str) – instrument of the track
pitch_path (str) – path to the pitch annotation file
title (str) – title
track_id (str) – track id

Other Parameters

pitch (F0Data) – human annotated pitch

property audio¶

The track’s audio

Returns

np.ndarray - audio signal
float - sample rate

to_jams()[source]¶

Get the track’s data in jams format

Returns: jams.JAMS – the track’s data in jams format

mirdata.datasets.medleydb_pitch.load_audio(fhandle: BinaryIO) → Tuple[numpy.ndarray, float][source]¶

Load a MedleyDB audio file.

Parameters

fhandle (str or file-like) – File-like object or path to audio file

Returns

np.ndarray - the mono audio signal
float - The sample rate of the audio file

mirdata.datasets.medleydb_pitch.load_pitch(fhandle: TextIO) → mirdata.annotations.F0Data [source]¶

load a MedleyDB pitch annotation file

Parameters: pitch_path (str) – path to pitch annotation file
Raises: IOError – if pitch_path doesn’t exist
Returns: F0Data – pitch annotation

mridangam_stroke¶

Mridangam Stroke Dataset Loader

Dataset Info

The Mridangam Stroke dataset is a collection of individual strokes of the Mridangam in various tonics. The dataset comprises of 10 different strokes played on Mridangams with 6 different tonic values. The audio examples were recorded from a professional Carnatic percussionist in a semi-anechoic studio conditions by Akshay Anantapadmanabhan.

Total audio samples: 6977

Used microphones:

SM-58 microphones
H4n ZOOM recorder.

Audio specifications:

Sampling frequency: 44.1 kHz
Bit-depth: 16 bit
Audio format: .wav

The dataset can be used for training models for each Mridangam stroke. The presentation of the dataset took place on the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2013) on May 2013. You can read the full publication here: https://repositori.upf.edu/handle/10230/25756

Mridangam Dataset is annotated by storing the informat of each track in their filenames. The structure of the filename is:

<TrackID>__<AuthorName>__<StrokeName>-<Tonic>-<InstanceNum>.wav

The dataset is made available by CompMusic under a Creative Commons Attribution 3.0 Unported (CC BY 3.0) License.

For more details, please visit: https://compmusic.upf.edu/mridangam-stroke-dataset

class mirdata.datasets.mridangam_stroke.Dataset(data_home=None)[source]¶

The mridangam_stroke dataset

Variables

data_home (str) – path where mirdata will look for the dataset
name (str) – the identifier of the dataset
bibtex (str or None) – dataset citation/s in bibtex format
remotes (dict or None) – data to be downloaded
readme (str) – information about the dataset
track (function) – a function mapping a track_id to a mirdata.core.Track

choice_track()[source]¶

Choose a random track

Returns: Track – a Track object instantiated by a random track_id

cite()[source]¶: Print the reference

property default_path¶

Get the default path for the dataset

Returns: str – Local path to the dataset

download(partial_download=None, force_overwrite=False, cleanup=False)[source]¶

Download data to save_dir and optionally print a message.

Parameters

partial_download (list or None) – A list of keys of remotes to partially download. If None, all data is downloaded
force_overwrite (bool) – If True, existing files are overwritten by the downloaded files.
cleanup (bool) – Whether to delete any zip/tar files after extracting.

Raises

ValueError – if invalid keys are passed to partial_download
IOError – if a downloaded file’s checksum is different from expected

license()[source]¶: Print the license

load_audio(*args, **kwargs)[source]¶

Load a Mridangam Stroke Dataset audio file.

Parameters

fhandle (str or file-like) – File-like object or path to audio file

Returns

np.ndarray - the mono audio signal
float - The sample rate of the audio file

load_tracks()[source]¶

Load all tracks in the dataset

Returns: dict – {track_id: track data}
Raises: NotImplementedError – If the dataset does not support Tracks

track_ids[source]¶

Return track ids

Returns: list – A list of track ids

validate(verbose=True)[source]¶

Validate if the stored dataset is a valid version

Parameters

verbose (bool) – If False, don’t print output

Returns

list - files in the index but are missing locally
list - files which have an invalid checksum

class mirdata.datasets.mridangam_stroke.Track(track_id, data_home, dataset_name, index, metadata)[source]¶

Mridangam Stroke track class

Parameters

track_id (str) – track id of the track
data_home (str) – Local path where the dataset is stored.

Variables

track_id (str) – track id
audio_path (str) – audio path
stroke_name (str) – name of the Mridangam stroke present in Track
tonic (str) – tonic of the stroke in the Track

property audio¶

The track’s audio

Returns

np.ndarray - audio signal
float - sample rate

to_jams()[source]¶

Get the track’s data in jams format

Returns: jams.JAMS – the track’s data in jams format

mirdata.datasets.mridangam_stroke.load_audio(fhandle: BinaryIO) → Tuple[numpy.ndarray, float][source]¶

Load a Mridangam Stroke Dataset audio file.

Parameters

fhandle (str or file-like) – File-like object or path to audio file

Returns

np.ndarray - the mono audio signal
float - The sample rate of the audio file

orchset¶

ORCHSET Dataset Loader

Dataset Info

Orchset is intended to be used as a dataset for the development and evaluation of melody extraction algorithms. This collection contains 64 audio excerpts focused on symphonic music with their corresponding annotation of the melody.

For more details, please visit: https://zenodo.org/record/1289786#.XREpzaeZPx6

class mirdata.datasets.orchset.Dataset(data_home=None)[source]¶

The orchset dataset

Variables

data_home (str) – path where mirdata will look for the dataset
name (str) – the identifier of the dataset
bibtex (str or None) – dataset citation/s in bibtex format
remotes (dict or None) – data to be downloaded
readme (str) – information about the dataset
track (function) – a function mapping a track_id to a mirdata.core.Track

choice_track()[source]¶

Choose a random track

Returns: Track – a Track object instantiated by a random track_id

cite()[source]¶: Print the reference

property default_path¶

Get the default path for the dataset

Returns: str – Local path to the dataset

download(partial_download=None, force_overwrite=False, cleanup=False)[source]¶

Download data to save_dir and optionally print a message.

Parameters

partial_download (list or None) – A list of keys of remotes to partially download. If None, all data is downloaded
force_overwrite (bool) – If True, existing files are overwritten by the downloaded files.
cleanup (bool) – Whether to delete any zip/tar files after extracting.

Raises

ValueError – if invalid keys are passed to partial_download
IOError – if a downloaded file’s checksum is different from expected

license()[source]¶: Print the license

load_audio_mono(*args, **kwargs)[source]¶

Load an Orchset audio file.

Parameters

fhandle (str or file-like) – File-like object or path to audio file

Returns

np.ndarray - the mono audio signal
float - The sample rate of the audio file

load_audio_stereo(*args, **kwargs)[source]¶

Load an Orchset audio file.

Parameters

fhandle (str or file-like) – File-like object or path to audio file

Returns

np.ndarray - the stereo audio signal
float - The sample rate of the audio file

load_melody(*args, **kwargs)[source]¶

Load an Orchset melody annotation file

Parameters: fhandle (str or file-like) – File-like object or path to melody annotation file
Raises: IOError – if melody_path doesn’t exist
Returns: F0Data – melody annotation data

load_tracks()[source]¶

Load all tracks in the dataset

Returns: dict – {track_id: track data}
Raises: NotImplementedError – If the dataset does not support Tracks

track_ids[source]¶

Return track ids

Returns: list – A list of track ids

validate(verbose=True)[source]¶

Validate if the stored dataset is a valid version

Parameters

verbose (bool) – If False, don’t print output

Returns

list - files in the index but are missing locally
list - files which have an invalid checksum

class mirdata.datasets.orchset.Track(track_id, data_home, dataset_name, index, metadata)[source]¶

orchset Track class

Parameters

track_id (str) – track id of the track

Variables

alternating_melody (bool) – True if the melody alternates between instruments
audio_path_mono (str) – path to the mono audio file
audio_path_stereo (str) – path to the stereo audio file
composer (str) – the work’s composer
contains_brass (bool) – True if the track contains any brass instrument
contains_strings (bool) – True if the track contains any string instrument
contains_winds (bool) – True if the track contains any wind instrument
excerpt (str) – True if the track is an excerpt
melody_path (str) – path to the melody annotation file
only_brass (bool) – True if the track contains brass instruments only
only_strings (bool) – True if the track contains string instruments only
only_winds (bool) – True if the track contains wind instruments only
predominant_melodic_instruments (list) – List of instruments which play the melody
track_id (str) – track id
work (str) – The musical work

Other Parameters

melody (F0Data) – melody annotation

property audio_mono¶

the track’s audio (mono)

Returns

np.ndarray - the mono audio signal
float - The sample rate of the audio file

property audio_stereo¶

the track’s audio (stereo)

Returns

np.ndarray - the mono audio signal
float - The sample rate of the audio file

to_jams()[source]¶

Get the track’s data in jams format

Returns: jams.JAMS – the track’s data in jams format

mirdata.datasets.orchset.load_audio_mono(fhandle: BinaryIO) → Tuple[numpy.ndarray, float][source]¶

Load an Orchset audio file.

Parameters

fhandle (str or file-like) – File-like object or path to audio file

Returns

np.ndarray - the mono audio signal
float - The sample rate of the audio file

mirdata.datasets.orchset.load_audio_stereo(fhandle: BinaryIO) → Tuple[numpy.ndarray, float][source]¶

Load an Orchset audio file.

Parameters

fhandle (str or file-like) – File-like object or path to audio file

Returns

np.ndarray - the stereo audio signal
float - The sample rate of the audio file

mirdata.datasets.orchset.load_melody(fhandle: TextIO) → mirdata.annotations.F0Data [source]¶

Load an Orchset melody annotation file

Parameters: fhandle (str or file-like) – File-like object or path to melody annotation file
Raises: IOError – if melody_path doesn’t exist
Returns: F0Data – melody annotation data

rwc_classical¶

RWC Classical Dataset Loader

Dataset Info

The Classical Music Database consists of 50 pieces

Symphonies: 4 pieces
Concerti: 2 pieces
Orchestral music: 4 pieces
Chamber music: 10 pieces
Solo performances: 24 pieces
Vocal performances: 6 pieces

A note about the Beat annotations:

48 corresponds to the duration of a quarter note (crotchet)
24 corresponds to the duration of an eighth note (quaver)
384 corresponds to the position of a downbeat

In 4/4 time signature, they correspond as follows:

1st beat in a measure (i.e., downbeat position)
2nd beat
3rd beat
4th beat

In 3/4 time signature, they correspond as follows:

1st beat in a measure (i.e., downbeat position)
2nd beat
3rd beat

In 6/8 time signature, they correspond as follows:

1st beat in a measure (i.e., downbeat position)
2nd beat
3rd beat
4th beat
5th beat
6th beat

For more details, please visit: https://staff.aist.go.jp/m.goto/RWC-MDB/rwc-mdb-c.html

class mirdata.datasets.rwc_classical.Dataset(data_home=None)[source]¶

The rwc_classical dataset

Variables

data_home (str) – path where mirdata will look for the dataset
name (str) – the identifier of the dataset
bibtex (str or None) – dataset citation/s in bibtex format
remotes (dict or None) – data to be downloaded
readme (str) – information about the dataset
track (function) – a function mapping a track_id to a mirdata.core.Track

choice_track()[source]¶

Choose a random track

Returns: Track – a Track object instantiated by a random track_id

cite()[source]¶: Print the reference

property default_path¶

Get the default path for the dataset

Returns: str – Local path to the dataset

download(partial_download=None, force_overwrite=False, cleanup=False)[source]¶

Download data to save_dir and optionally print a message.

Parameters

partial_download (list or None) – A list of keys of remotes to partially download. If None, all data is downloaded
force_overwrite (bool) – If True, existing files are overwritten by the downloaded files.
cleanup (bool) – Whether to delete any zip/tar files after extracting.

Raises

ValueError – if invalid keys are passed to partial_download
IOError – if a downloaded file’s checksum is different from expected

license()[source]¶: Print the license

load_audio(*args, **kwargs)[source]¶

Load a RWC audio file.

Parameters

fhandle (str or file-like) – File-like object or path to audio file

Returns

np.ndarray - the mono audio signal
float - The sample rate of the audio file

load_beats(*args, **kwargs)[source]¶

Load rwc beat data from a file

Parameters: fhandle (str or file-like) – File-like object or path to beats annotation file
Returns: BeatData – beat data

load_sections(*args, **kwargs)[source]¶

Load rwc section data from a file

Parameters: fhandle (str or file-like) – File-like object or path to sections annotation file
Returns: SectionData – section data

load_tracks()[source]¶

Load all tracks in the dataset

Returns: dict – {track_id: track data}
Raises: NotImplementedError – If the dataset does not support Tracks

track_ids[source]¶

Return track ids

Returns: list – A list of track ids

validate(verbose=True)[source]¶

Validate if the stored dataset is a valid version

Parameters

verbose (bool) – If False, don’t print output

Returns

list - files in the index but are missing locally
list - files which have an invalid checksum

class mirdata.datasets.rwc_classical.Track(track_id, data_home, dataset_name, index, metadata)[source]¶

rwc_classical Track class

Parameters

track_id (str) – track id of the track

Variables

artist (str) – the track’s artist
audio_path (str) – path of the audio file
beats_path (str) – path of the beat annotation file
category (str) – One of ‘Symphony’, ‘Concerto’, ‘Orchestral’, ‘Solo’, ‘Chamber’, ‘Vocal’, or blank.
composer (str) – Composer of this Track.
duration (float) – Duration of the track in seconds
piece_number (str) – Piece number of this Track, [1-50]
sections_path (str) – path of the section annotation file
suffix (str) – string within M01-M06
title (str) – Title of The track.
track_id (str) – track id
track_number (str) – CD track number of this Track

Other Parameters

sections (SectionData) – human-labeled section annotations
beats (BeatData) – human-labeled beat annotations

property audio¶

The track’s audio

Returns

np.ndarray - audio signal
float - sample rate

to_jams()[source]¶

Get the track’s data in jams format

Returns: jams.JAMS – the track’s data in jams format

mirdata.datasets.rwc_classical.load_audio(fhandle: BinaryIO) → Tuple[numpy.ndarray, float][source]¶

Load a RWC audio file.

Parameters

fhandle (str or file-like) – File-like object or path to audio file

Returns

np.ndarray - the mono audio signal
float - The sample rate of the audio file

mirdata.datasets.rwc_classical.load_beats(fhandle: TextIO) → mirdata.annotations.BeatData [source]¶

Load rwc beat data from a file

Parameters: fhandle (str or file-like) – File-like object or path to beats annotation file
Returns: BeatData – beat data

mirdata.datasets.rwc_classical.load_sections(fhandle: TextIO) → Optional[mirdata.annotations.SectionData][source]¶

Load rwc section data from a file

Parameters: fhandle (str or file-like) – File-like object or path to sections annotation file
Returns: SectionData – section data

rwc_jazz¶

RWC Jazz Dataset Loader.

Dataset Info

The Jazz Music Database consists of 50 pieces:

Instrumentation variations: 35 pieces (5 pieces × 7 instrumentations).
The instrumentation-variation pieces were recorded to obtain different versions of the same piece; i.e., different arrangements performed by different player instrumentations. Five standard-style jazz pieces were originally composed and then performed in modern-jazz style using the following seven instrumentations:
1. Piano solo
2. Guitar solo
3. Duo: Vibraphone + Piano, Flute + Piano, and Piano + Bass
4. Piano trio: Piano + Bass + Drums
5. Piano trio + Trumpet or Tenor saxophone
6. Octet: Piano trio + Guitar + Alto saxophone + Baritone saxophone + Tenor saxophone × 2
7. Piano trio + Vibraphone or Flute
Style variations: 9 pieces
The style-variation pieces were recorded to represent various styles of jazz. They include four well-known public-domain pieces and consist of
1. Vocal jazz: 2 pieces (including “Aura Lee”)
2. Big band jazz: 2 pieces (including “The Entertainer”)
3. Modal jazz: 2 pieces
4. Funky jazz: 2 pieces (including “Silent Night”)
5. Free jazz: 1 piece (including “Joyful, Joyful, We Adore Thee”)
Fusion (crossover): 6 pieces

The fusion pieces were recorded to obtain music that combines elements of jazz with other styles such as popular, rock, and latin. They include music with an eighth-note feel, music with a sixteenth-note feel, and Latin jazz music.

For more details, please visit: https://staff.aist.go.jp/m.goto/RWC-MDB/rwc-mdb-j.html

class mirdata.datasets.rwc_jazz.Dataset(data_home=None)[source]¶

The rwc_jazz dataset

Variables

data_home (str) – path where mirdata will look for the dataset
name (str) – the identifier of the dataset
bibtex (str or None) – dataset citation/s in bibtex format
remotes (dict or None) – data to be downloaded
readme (str) – information about the dataset
track (function) – a function mapping a track_id to a mirdata.core.Track

choice_track()[source]¶

Choose a random track

Returns: Track – a Track object instantiated by a random track_id

cite()[source]¶: Print the reference

property default_path¶

Get the default path for the dataset

Returns: str – Local path to the dataset

download(partial_download=None, force_overwrite=False, cleanup=False)[source]¶

Download data to save_dir and optionally print a message.

Parameters

partial_download (list or None) – A list of keys of remotes to partially download. If None, all data is downloaded
force_overwrite (bool) – If True, existing files are overwritten by the downloaded files.
cleanup (bool) – Whether to delete any zip/tar files after extracting.

Raises

ValueError – if invalid keys are passed to partial_download
IOError – if a downloaded file’s checksum is different from expected

license()[source]¶: Print the license

load_audio(*args, **kwargs)[source]¶

Load a RWC audio file.

Parameters

fhandle (str or file-like) – File-like object or path to audio file

Returns

np.ndarray - the mono audio signal
float - The sample rate of the audio file

load_beats(*args, **kwargs)[source]¶

Load rwc beat data from a file

Parameters: fhandle (str or file-like) – File-like object or path to beats annotation file
Returns: BeatData – beat data

load_sections(*args, **kwargs)[source]¶

Load rwc section data from a file

Parameters: fhandle (str or file-like) – File-like object or path to sections annotation file
Returns: SectionData – section data

load_tracks()[source]¶

Load all tracks in the dataset

Returns: dict – {track_id: track data}
Raises: NotImplementedError – If the dataset does not support Tracks

track_ids[source]¶

Return track ids

Returns: list – A list of track ids

validate(verbose=True)[source]¶

Validate if the stored dataset is a valid version

Parameters

verbose (bool) – If False, don’t print output

Returns

list - files in the index but are missing locally
list - files which have an invalid checksum

class mirdata.datasets.rwc_jazz.Track(track_id, data_home, dataset_name, index, metadata)[source]¶

rwc_jazz Track class

Parameters

track_id (str) – track id of the track

Variables

artist (str) – Artist name
audio_path (str) – path of the audio file
beats_path (str) – path of the beat annotation file
duration (float) – Duration of the track in seconds
instruments (str) – list of used instruments.
piece_number (str) – Piece number of this Track, [1-50]
sections_path (str) – path of the section annotation file
suffix (str) – M01-M04
title (str) – Title of The track.
track_id (str) – track id
track_number (str) – CD track number of this Track
variation (str) – style variations

Other Parameters

sections (SectionData) – human-labeled section data
beats (BeatData) – human-labeled beat data

property audio¶

The track’s audio

Returns

np.ndarray - audio signal
float - sample rate

to_jams()[source]¶

Get the track’s data in jams format

Returns: jams.JAMS – the track’s data in jams format

rwc_popular¶

RWC Popular Dataset Loader

Dataset Info

The Popular Music Database consists of 100 songs — 20 songs with English lyrics performed in the style of popular music typical of songs on the American hit charts in the 1980s, and 80 songs with Japanese lyrics performed in the style of modern Japanese popular music typical of songs on the Japanese hit charts in the 1990s.

For more details, please visit: https://staff.aist.go.jp/m.goto/RWC-MDB/rwc-mdb-p.html

class mirdata.datasets.rwc_popular.Dataset(data_home=None)[source]¶

The rwc_popular dataset

Variables

data_home (str) – path where mirdata will look for the dataset
name (str) – the identifier of the dataset
bibtex (str or None) – dataset citation/s in bibtex format
remotes (dict or None) – data to be downloaded
readme (str) – information about the dataset
track (function) – a function mapping a track_id to a mirdata.core.Track

choice_track()[source]¶

Choose a random track

Returns: Track – a Track object instantiated by a random track_id

cite()[source]¶: Print the reference

property default_path¶

Get the default path for the dataset

Returns: str – Local path to the dataset

download(partial_download=None, force_overwrite=False, cleanup=False)[source]¶

Download data to save_dir and optionally print a message.

Parameters

partial_download (list or None) – A list of keys of remotes to partially download. If None, all data is downloaded
force_overwrite (bool) – If True, existing files are overwritten by the downloaded files.
cleanup (bool) – Whether to delete any zip/tar files after extracting.

Raises

ValueError – if invalid keys are passed to partial_download
IOError – if a downloaded file’s checksum is different from expected

license()[source]¶: Print the license

load_audio(*args, **kwargs)[source]¶

Load a RWC audio file.

Parameters

fhandle (str or file-like) – File-like object or path to audio file

Returns

np.ndarray - the mono audio signal
float - The sample rate of the audio file

load_beats(*args, **kwargs)[source]¶

Load rwc beat data from a file

Parameters: fhandle (str or file-like) – File-like object or path to beats annotation file
Returns: BeatData – beat data

load_chords(*args, **kwargs)[source]¶

Load rwc chord data from a file

Parameters: fhandle (str or file-like) – File-like object or path to chord annotation file
Returns: ChordData – chord data

load_sections(*args, **kwargs)[source]¶

Load rwc section data from a file

Parameters: fhandle (str or file-like) – File-like object or path to sections annotation file
Returns: SectionData – section data

load_tracks()[source]¶

Load all tracks in the dataset

Returns: dict – {track_id: track data}
Raises: NotImplementedError – If the dataset does not support Tracks

load_vocal_activity(*args, **kwargs)[source]¶

Load rwc vocal activity data from a file

Parameters: fhandle (str or file-like) – File-like object or path to vocal activity annotation file
Returns: EventData – vocal activity data

track_ids[source]¶

Return track ids

Returns: list – A list of track ids

validate(verbose=True)[source]¶

Validate if the stored dataset is a valid version

Parameters

verbose (bool) – If False, don’t print output

Returns

list - files in the index but are missing locally
list - files which have an invalid checksum

class mirdata.datasets.rwc_popular.Track(track_id, data_home, dataset_name, index, metadata)[source]¶

rwc_popular Track class

Parameters

track_id (str) – track id of the track

Variables

artist (str) – artist
audio_path (str) – path of the audio file
beats_path (str) – path of the beat annotation file
chords_path (str) – path of the chord annotation file
drum_information (str) – If the drum is ‘Drum sequences’, ‘Live drums’, or ‘Drum loops’
duration (float) – Duration of the track in seconds
instruments (str) – List of used instruments
piece_number (str) – Piece number, [1-50]
sections_path (str) – path of the section annotation file
singer_information (str) – could be male, female or vocal group
suffix (str) – M01-M04
tempo (str) – Tempo of the track in BPM
title (str) – title
track_id (str) – track id
track_number (str) – CD track number
voca_inst_path (str) – path of the vocal/instrumental annotation file

Other Parameters

sections (SectionData) – human-labeled section annotation
beats (BeatData) – human-labeled beat annotation
chords (ChordData) – human-labeled chord annotation
vocal_instrument_activity (EventData) – human-labeled vocal/instrument activity

property audio¶

The track’s audio

Returns

np.ndarray - audio signal
float - sample rate

to_jams()[source]¶

Get the track’s data in jams format

Returns: jams.JAMS – the track’s data in jams format

mirdata.datasets.rwc_popular.load_chords(fhandle: TextIO) → mirdata.annotations.ChordData [source]¶

Load rwc chord data from a file

Parameters: fhandle (str or file-like) – File-like object or path to chord annotation file
Returns: ChordData – chord data

mirdata.datasets.rwc_popular.load_vocal_activity(fhandle: TextIO) → mirdata.annotations.EventData [source]¶

Load rwc vocal activity data from a file

Parameters: fhandle (str or file-like) – File-like object or path to vocal activity annotation file
Returns: EventData – vocal activity data

salami¶

SALAMI Dataset Loader

Dataset Info

The SALAMI dataset contains Structural Annotations of a Large Amount of Music Information: the public portion contains over 2200 annotations of over 1300 unique tracks.

NB: mirdata relies on the corrected version of the 2.0 annotations: Details can be found at https://github.com/bmcfee/salami-data-public/tree/hierarchy-corrections and https://github.com/DDMAL/salami-data-public/pull/15.

For more details, please visit: https://github.com/DDMAL/salami-data-public

class mirdata.datasets.salami.Dataset(data_home=None)[source]¶

The salami dataset

Variables

data_home (str) – path where mirdata will look for the dataset
name (str) – the identifier of the dataset
bibtex (str or None) – dataset citation/s in bibtex format
remotes (dict or None) – data to be downloaded
readme (str) – information about the dataset
track (function) – a function mapping a track_id to a mirdata.core.Track

choice_track()[source]¶

Choose a random track

Returns: Track – a Track object instantiated by a random track_id

cite()[source]¶: Print the reference

property default_path¶

Get the default path for the dataset

Returns: str – Local path to the dataset

download(partial_download=None, force_overwrite=False, cleanup=False)[source]¶

Download data to save_dir and optionally print a message.

Parameters

partial_download (list or None) – A list of keys of remotes to partially download. If None, all data is downloaded
force_overwrite (bool) – If True, existing files are overwritten by the downloaded files.
cleanup (bool) – Whether to delete any zip/tar files after extracting.

Raises

ValueError – if invalid keys are passed to partial_download
IOError – if a downloaded file’s checksum is different from expected

license()[source]¶: Print the license

load_audio(*args, **kwargs)[source]¶

Load a Salami audio file.

Parameters

fhandle (str or file-like) – path to audio file

Returns

np.ndarray - the mono audio signal
float - The sample rate of the audio file

load_sections(*args, **kwargs)[source]¶

Load salami sections data from a file

Parameters: fhandle (str or file-like) – File-like object or path to sectin annotation file
Returns: SectionData – section data

load_tracks()[source]¶

Load all tracks in the dataset

Returns: dict – {track_id: track data}
Raises: NotImplementedError – If the dataset does not support Tracks

track_ids[source]¶

Return track ids

Returns: list – A list of track ids

validate(verbose=True)[source]¶

Validate if the stored dataset is a valid version

Parameters

verbose (bool) – If False, don’t print output

Returns

list - files in the index but are missing locally
list - files which have an invalid checksum

class mirdata.datasets.salami.Track(track_id, data_home, dataset_name, index, metadata)[source]¶

salami Track class

Parameters

track_id (str) – track id of the track

Variables

annotator_1_id (str) – number that identifies annotator 1
annotator_1_time (str) – time that the annotator 1 took to complete the annotation
annotator_2_id (str) – number that identifies annotator 1
annotator_2_time (str) – time that the annotator 1 took to complete the annotation
artist (str) – song artist
audio_path (str) – path to the audio file
broad_genre (str) – broad genre of the song
duration (float) – duration of song in seconds
genre (str) – genre of the song
sections_annotator1_lowercase_path (str) – path to annotations in hierarchy level 1 from annotator 1
sections_annotator1_uppercase_path (str) – path to annotations in hierarchy level 0 from annotator 1
sections_annotator2_lowercase_path (str) – path to annotations in hierarchy level 1 from annotator 2
sections_annotator2_uppercase_path (str) – path to annotations in hierarchy level 0 from annotator 2
source (str) – dataset or source of song
title (str) – title of the song

Other Parameters

sections_annotator_1_uppercase (SectionData) – annotations in hierarchy level 0 from annotator 1
sections_annotator_1_lowercase (SectionData) – annotations in hierarchy level 1 from annotator 1
sections_annotator_2_uppercase (SectionData) – annotations in hierarchy level 0 from annotator 2
sections_annotator_2_lowercase (SectionData) – annotations in hierarchy level 1 from annotator 2

property audio¶

The track’s audio

Returns

np.ndarray - audio signal
float - sample rate

to_jams()[source]¶

Get the track’s data in jams format

Returns: jams.JAMS – the track’s data in jams format

mirdata.datasets.salami.load_audio(fhandle: str) → Tuple[numpy.ndarray, float][source]¶

Load a Salami audio file.

Parameters

fhandle (str or file-like) – path to audio file

Returns

np.ndarray - the mono audio signal
float - The sample rate of the audio file

mirdata.datasets.salami.load_sections(fhandle: TextIO) → mirdata.annotations.SectionData [source]¶

Load salami sections data from a file

Parameters: fhandle (str or file-like) – File-like object or path to sectin annotation file
Returns: SectionData – section data

saraga_carnatic¶

Saraga Dataset Loader

Dataset Info

This dataset contains time aligned melody, rhythm and structural annotations of Carnatic Music tracks, extracted from the large open Indian Art Music corpora of CompMusic.

The dataset contains the following manual annotations referring to audio files:

Section and tempo annotations stored as start and end timestamps together with the name of the section and tempo during the section (in a separate file)
Sama annotations referring to rhythmic cycle boundaries stored as timestamps.
Phrase annotations stored as timestamps and transcription of the phrases using solfège symbols ({S, r, R, g, G, m, M, P, d, D, n, N}).
Audio features automatically extracted and stored: pitch and tonic.
The annotations are stored in text files, named as the audio filename but with the respective extension at the end, for instance: “Bhuvini Dasudane.tempo-manual.txt”.

The dataset contains a total of 249 tracks. A total of 168 tracks have multitrack audio.

The files of this dataset are shared with the following license: Creative Commons Attribution Non Commercial Share Alike 4.0 International

Dataset compiled by: Bozkurt, B.; Srinivasamurthy, A.; Gulati, S. and Serra, X.

For more information about the dataset as well as IAM and annotations, please refer to: https://mtg.github.io/saraga/, where a really detailed explanation of the data and annotations is published.

class mirdata.datasets.saraga_carnatic.Dataset(data_home=None)[source]¶

The saraga_carnatic dataset

Variables

data_home (str) – path where mirdata will look for the dataset
name (str) – the identifier of the dataset
bibtex (str or None) – dataset citation/s in bibtex format
remotes (dict or None) – data to be downloaded
readme (str) – information about the dataset
track (function) – a function mapping a track_id to a mirdata.core.Track

choice_track()[source]¶

Choose a random track

Returns: Track – a Track object instantiated by a random track_id

cite()[source]¶: Print the reference

property default_path¶

Get the default path for the dataset

Returns: str – Local path to the dataset

download(partial_download=None, force_overwrite=False, cleanup=False)[source]¶

Download data to save_dir and optionally print a message.

Parameters

partial_download (list or None) – A list of keys of remotes to partially download. If None, all data is downloaded
force_overwrite (bool) – If True, existing files are overwritten by the downloaded files.
cleanup (bool) – Whether to delete any zip/tar files after extracting.

Raises

ValueError – if invalid keys are passed to partial_download
IOError – if a downloaded file’s checksum is different from expected

license()[source]¶: Print the license

load_audio(*args, **kwargs)[source]¶

Load a Saraga Carnatic audio file.

Parameters

audio_path (str) – path to audio file

Returns

np.ndarray - the mono audio signal
float - The sample rate of the audio file

load_metadata(*args, **kwargs)[source]¶

Load a Saraga Carnatic metadata file

Parameters

metadata_path (str) – path to metadata json file

Returns

dict –

metadata with the following fields

title (str): Title of the piece in the track

mbid (str): MusicBrainz ID of the track

album_artists (list, dicts): list of dicts containing the album artists present in the track and its mbid

artists (list, dicts): list of dicts containing information of the featuring artists in the track

raaga (list, dict): list of dicts containing information about the raagas present in the track

form (list, dict): list of dicts containing information about the forms present in the track

work (list, dicts): list of dicts containing the work present in the piece, and its mbid

taala (list, dicts): list of dicts containing the talas present in the track and its uuid

concert (list, dicts): list of dicts containing the concert where the track is present and its mbid

load_phrases(*args, **kwargs)[source]¶

Load phrases

Parameters: phrases_path (str) – Local path where the phrase annotation is stored. If None, returns None.
Returns: EventData – phrases annotation for track

load_pitch(*args, **kwargs)[source]¶

Load pitch

Parameters: pitch path (str) – Local path where the pitch annotation is stored. If None, returns None.
Returns: F0Data – pitch annotation

load_sama(*args, **kwargs)[source]¶

Load sama

Parameters: sama_path (str) – Local path where the sama annotation is stored. If None, returns None.
Returns: BeatData – sama annotations

load_sections(*args, **kwargs)[source]¶

Load sections from carnatic collection

Parameters: sections_path (str) – Local path where the section annotation is stored.
Returns: SectionData – section annotations for track

load_tempo(*args, **kwargs)[source]¶

Load tempo from carnatic collection

Parameters

tempo_path (str) – Local path where the tempo annotation is stored.

Returns

dict –

Dictionary of tempo information with the following keys:

tempo_apm: tempo in aksharas per minute (APM)

tempo_bpm: tempo in beats per minute (BPM)

sama_interval: median duration (in seconds) of one tāla cycle

beats_per_cycle: number of beats in one cycle of the tāla

subdivisions: number of aksharas per beat of the tāla

load_tonic(*args, **kwargs)[source]¶

Load track absolute tonic

Parameters: tonic_path (str) – Local path where the tonic path is stored. If None, returns None.
Returns: int – Tonic annotation in Hz

load_tracks()[source]¶

Load all tracks in the dataset

Returns: dict – {track_id: track data}
Raises: NotImplementedError – If the dataset does not support Tracks

track_ids[source]¶

Return track ids

Returns: list – A list of track ids

validate(verbose=True)[source]¶

Validate if the stored dataset is a valid version

Parameters

verbose (bool) – If False, don’t print output

Returns

list - files in the index but are missing locally
list - files which have an invalid checksum

class mirdata.datasets.saraga_carnatic.Track(track_id, data_home, dataset_name, index, metadata)[source]¶

Saraga Track Carnatic class

Parameters

track_id (str) – track id of the track
data_home (str) – Local path where the dataset is stored. default=None If None, looks for the data in the default directory, ~/mir_datasets

Variables

audio_path (str) – path to audio file
audio_ghatam_path (str) – path to ghatam audio file
audio_mridangam_left_path (str) – path to mridangam left audio file
audio_mridangam_right_path (str) – path to mridangam right audio file
audio_violin_path (str) – path to violin audio file
audio_vocal_s_path (str) – path to vocal s audio file
audio_vocal_pat (str) – path to vocal pat audio file
ctonic_path (srt) – path to ctonic annotation file
pitch_path (srt) – path to pitch annotation file
pitch_vocal_path (srt) – path to vocal pitch annotation file
tempo_path (srt) – path to tempo annotation file
sama_path (srt) – path to sama annotation file
sections_path (srt) – path to sections annotation file
phrases_path (srt) – path to phrases annotation file
metadata_path (srt) – path to metadata file

Other Parameters

tonic (float) – tonic annotation
pitch (F0Data) – pitch annotation
pitch_vocal (F0Data) – vocal pitch annotation
tempo (dict) – tempo annotations
sama (BeatData) – sama section annotations
sections (SectionData) – track section annotations
phrases (SectionData) – phrase annotations
metadata (dict) – track metadata with the following fields:
- title (str): Title of the piece in the track
- mbid (str): MusicBrainz ID of the track
- album_artists (list, dicts): list of dicts containing the album artists present in the track and its mbid
- artists (list, dicts): list of dicts containing information of the featuring artists in the track
- raaga (list, dict): list of dicts containing information about the raagas present in the track
- form (list, dict): list of dicts containing information about the forms present in the track
- work (list, dicts): list of dicts containing the work present in the piece, and its mbid
- taala (list, dicts): list of dicts containing the talas present in the track and its uuid
- concert (list, dicts): list of dicts containing the concert where the track is present and its mbid

property audio¶

The track’s audio

Returns

np.ndarray - audio signal
float - sample rate

to_jams()[source]¶

Get the track’s data in jams format

Returns: jams.JAMS – the track’s data in jams format

mirdata.datasets.saraga_carnatic.load_audio(audio_path)[source]¶

Load a Saraga Carnatic audio file.

Parameters

audio_path (str) – path to audio file

Returns

np.ndarray - the mono audio signal
float - The sample rate of the audio file

mirdata.datasets.saraga_carnatic.load_metadata(metadata_path)[source]¶

Load a Saraga Carnatic metadata file

Parameters

metadata_path (str) – path to metadata json file

Returns

dict –

metadata with the following fields

title (str): Title of the piece in the track

mbid (str): MusicBrainz ID of the track

album_artists (list, dicts): list of dicts containing the album artists present in the track and its mbid

artists (list, dicts): list of dicts containing information of the featuring artists in the track

raaga (list, dict): list of dicts containing information about the raagas present in the track

form (list, dict): list of dicts containing information about the forms present in the track

work (list, dicts): list of dicts containing the work present in the piece, and its mbid

taala (list, dicts): list of dicts containing the talas present in the track and its uuid

concert (list, dicts): list of dicts containing the concert where the track is present and its mbid

mirdata.datasets.saraga_carnatic.load_phrases(phrases_path)[source]¶

Load phrases

Parameters: phrases_path (str) – Local path where the phrase annotation is stored. If None, returns None.
Returns: EventData – phrases annotation for track

mirdata.datasets.saraga_carnatic.load_pitch(pitch_path)[source]¶

Load pitch

Parameters: pitch path (str) – Local path where the pitch annotation is stored. If None, returns None.
Returns: F0Data – pitch annotation

mirdata.datasets.saraga_carnatic.load_sama(sama_path)[source]¶

Load sama

Parameters: sama_path (str) – Local path where the sama annotation is stored. If None, returns None.
Returns: BeatData – sama annotations

mirdata.datasets.saraga_carnatic.load_sections(sections_path)[source]¶

Load sections from carnatic collection

Parameters: sections_path (str) – Local path where the section annotation is stored.
Returns: SectionData – section annotations for track

mirdata.datasets.saraga_carnatic.load_tempo(tempo_path)[source]¶

Load tempo from carnatic collection

Parameters

tempo_path (str) – Local path where the tempo annotation is stored.

Returns

dict –

Dictionary of tempo information with the following keys:

tempo_apm: tempo in aksharas per minute (APM)

tempo_bpm: tempo in beats per minute (BPM)

sama_interval: median duration (in seconds) of one tāla cycle

beats_per_cycle: number of beats in one cycle of the tāla

subdivisions: number of aksharas per beat of the tāla

mirdata.datasets.saraga_carnatic.load_tonic(tonic_path)[source]¶

Load track absolute tonic

Parameters: tonic_path (str) – Local path where the tonic path is stored. If None, returns None.
Returns: int – Tonic annotation in Hz

saraga_hindustani¶

Saraga Dataset Loader

Dataset Info

This dataset contains time aligned melody, rhythm and structural annotations of Hindustani Music tracks, extracted from the large open Indian Art Music corpora of CompMusic.

The dataset contains the following manual annotations referring to audio files:

Section and tempo annotations stored as start and end timestamps together with the name of the section and tempo during the section (in a separate file)
Sama annotations referring to rhythmic cycle boundaries stored as timestamps
Phrase annotations stored as timestamps and transcription of the phrases using solfège symbols ({S, r, R, g, G, m, M, P, d, D, n, N})
Audio features automatically extracted and stored: pitch and tonic.
The annotations are stored in text files, named as the audio filename but with the respective extension at the end, for instance: “Bhuvini Dasudane.tempo-manual.txt”.

The dataset contains a total of 108 tracks.

The files of this dataset are shared with the following license: Creative Commons Attribution Non Commercial Share Alike 4.0 International

Dataset compiled by: Bozkurt, B.; Srinivasamurthy, A.; Gulati, S. and Serra, X.

For more information about the dataset as well as IAM and annotations, please refer to: https://mtg.github.io/saraga/, where a really detailed explanation of the data and annotations is published.

class mirdata.datasets.saraga_hindustani.Dataset(data_home=None)[source]¶

The saraga_hindustani dataset

Variables

data_home (str) – path where mirdata will look for the dataset
name (str) – the identifier of the dataset
bibtex (str or None) – dataset citation/s in bibtex format
remotes (dict or None) – data to be downloaded
readme (str) – information about the dataset
track (function) – a function mapping a track_id to a mirdata.core.Track

choice_track()[source]¶

Choose a random track

Returns: Track – a Track object instantiated by a random track_id

cite()[source]¶: Print the reference

property default_path¶

Get the default path for the dataset

Returns: str – Local path to the dataset

download(partial_download=None, force_overwrite=False, cleanup=False)[source]¶

Download data to save_dir and optionally print a message.

Parameters

partial_download (list or None) – A list of keys of remotes to partially download. If None, all data is downloaded
force_overwrite (bool) – If True, existing files are overwritten by the downloaded files.
cleanup (bool) – Whether to delete any zip/tar files after extracting.

Raises

ValueError – if invalid keys are passed to partial_download
IOError – if a downloaded file’s checksum is different from expected

license()[source]¶: Print the license

load_audio(*args, **kwargs)[source]¶

Load a Saraga Hindustani audio file.

Parameters

audio_path (str) – path to audio file

Returns

np.ndarray - the mono audio signal
float - The sample rate of the audio file

load_phrases(*args, **kwargs)[source]¶

Load phrases

Parameters: phrases_path (str) – Local path where the phrase annotation is stored. If None, returns None.
Returns: EventData – phrases annotation for track

load_pitch(*args, **kwargs)[source]¶

Load automatic extracted pitch or melody

Parameters: pitch path (str) – Local path where the pitch annotation is stored. If None, returns None.
Returns: F0Data – pitch annotation

load_sama(*args, **kwargs)[source]¶

Load sama

Parameters: sama_path (str) – Local path where the sama annotation is stored. If None, returns None.
Returns: SectionData – sama annotations

load_sections(*args, **kwargs)[source]¶

Load tracks sections

Parameters: sections_path (str) – Local path where the section annotation is stored.
Returns: SectionData – section annotations for track

load_tempo(*args, **kwargs)[source]¶

Load tempo from hindustani collection

Parameters

tempo_path (str) – Local path where the tempo annotation is stored.

Returns

dict – Dictionary of tempo information with the following keys:

tempo: median tempo for the section in mātrās per minute (MPM)
matra_interval: tempo expressed as the duration of the mātra (essentially dividing 60 by tempo, expressed in seconds)
sama_interval: median duration of one tāl cycle in the section
matras_per_cycle: indicator of the structure of the tāl, showing the number of mātrā in a cycle of the tāl of the recording
start_time: start time of the section
duration: duration of the section

load_tonic(*args, **kwargs)[source]¶

Load track absolute tonic

Parameters: tonic_path (str) – Local path where the tonic path is stored. If None, returns None.
Returns: int – Tonic annotation in Hz

load_tracks()[source]¶

Load all tracks in the dataset

Returns: dict – {track_id: track data}
Raises: NotImplementedError – If the dataset does not support Tracks

track_ids[source]¶

Return track ids

Returns: list – A list of track ids

validate(verbose=True)[source]¶

Validate if the stored dataset is a valid version

Parameters

verbose (bool) – If False, don’t print output

Returns

list - files in the index but are missing locally
list - files which have an invalid checksum

class mirdata.datasets.saraga_hindustani.Track(track_id, data_home, dataset_name, index, metadata)[source]¶

Saraga Hindustani Track class

Parameters

track_id (str) – track id of the track
data_home (str) – Local path where the dataset is stored. default=None If None, looks for the data in the default directory, ~/mir_datasets

Variables

audio_path (str) – path to audio file
ctonic_path (str) – path to ctonic annotation file
pitch_path (str) – path to pitch annotation file
tempo_path (str) – path to tempo annotation file
sama_path (str) – path to sama annotation file
sections_path (str) – path to sections annotation file
phrases_path (str) – path to phrases annotation file
metadata_path (str) – path to metadata annotation file

Other Parameters

tonic (float) – tonic annotation
pitch (F0Data) – pitch annotation
tempo (dict) – tempo annotations
sama (BeatData) – Sama section annotations
sections (SectionData) – track section annotations
phrases (EventData) – phrase annotations
metadata (dict) – track metadata with the following fields
- title (str): Title of the piece in the track
- mbid (str): MusicBrainz ID of the track
- album_artists (list, dicts): list of dicts containing the album artists present in the track and its mbid
- artists (list, dicts): list of dicts containing information of the featuring artists in the track
- raags (list, dict): list of dicts containing information about the raags present in the track
- forms (list, dict): list of dicts containing information about the forms present in the track
- release (list, dicts): list of dicts containing information of the release where the track is found
- works (list, dicts): list of dicts containing the work present in the piece, and its mbid
- taals (list, dicts): list of dicts containing the taals present in the track and its uuid
- layas (list, dicts): list of dicts containing the layas present in the track and its uuid

property audio¶

The track’s audio

Returns

np.ndarray - audio signal
float - sample rate

to_jams()[source]¶

Get the track’s data in jams format

Returns: jams.JAMS – the track’s data in jams format

mirdata.datasets.saraga_hindustani.load_audio(audio_path)[source]¶

Load a Saraga Hindustani audio file.

Parameters

audio_path (str) – path to audio file

Returns

np.ndarray - the mono audio signal
float - The sample rate of the audio file

mirdata.datasets.saraga_hindustani.load_metadata(metadata_path)[source]¶

Load a Saraga Hindustani metadata file

Parameters

metadata_path (str) – path to metadata json file

Returns

dict –

metadata with the following fields

title (str): Title of the piece in the track

mbid (str): MusicBrainz ID of the track

album_artists (list, dicts): list of dicts containing the album artists present in the track and its mbid

artists (list, dicts): list of dicts containing information of the featuring artists in the track

raags (list, dict): list of dicts containing information about the raags present in the track

forms (list, dict): list of dicts containing information about the forms present in the track

release (list, dicts): list of dicts containing information of the release where the track is found

works (list, dicts): list of dicts containing the work present in the piece, and its mbid

taals (list, dicts): list of dicts containing the taals present in the track and its uuid

layas (list, dicts): list of dicts containing the layas present in the track and its uuid

mirdata.datasets.saraga_hindustani.load_phrases(phrases_path)[source]¶

Load phrases

Parameters: phrases_path (str) – Local path where the phrase annotation is stored. If None, returns None.
Returns: EventData – phrases annotation for track

mirdata.datasets.saraga_hindustani.load_pitch(pitch_path)[source]¶

Load automatic extracted pitch or melody

Parameters: pitch path (str) – Local path where the pitch annotation is stored. If None, returns None.
Returns: F0Data – pitch annotation

mirdata.datasets.saraga_hindustani.load_sama(sama_path)[source]¶

Load sama

Parameters: sama_path (str) – Local path where the sama annotation is stored. If None, returns None.
Returns: SectionData – sama annotations

mirdata.datasets.saraga_hindustani.load_sections(sections_path)[source]¶

Load tracks sections

Parameters: sections_path (str) – Local path where the section annotation is stored.
Returns: SectionData – section annotations for track

mirdata.datasets.saraga_hindustani.load_tempo(tempo_path)[source]¶

Load tempo from hindustani collection

Parameters

tempo_path (str) – Local path where the tempo annotation is stored.

Returns

dict – Dictionary of tempo information with the following keys:

tempo: median tempo for the section in mātrās per minute (MPM)
matra_interval: tempo expressed as the duration of the mātra (essentially dividing 60 by tempo, expressed in seconds)
sama_interval: median duration of one tāl cycle in the section
matras_per_cycle: indicator of the structure of the tāl, showing the number of mātrā in a cycle of the tāl of the recording
start_time: start time of the section
duration: duration of the section

mirdata.datasets.saraga_hindustani.load_tonic(tonic_path)[source]¶

Load track absolute tonic

Parameters: tonic_path (str) – Local path where the tonic path is stored. If None, returns None.
Returns: int – Tonic annotation in Hz

tinysol¶

TinySOL Dataset Loader.

Dataset Info

TinySOL is a dataset of 2913 samples, each containing a single musical note from one of 14 different instruments:

Bass Tuba
French Horn
Trombone
Trumpet in C
Accordion
Contrabass
Violin
Viola
Violoncello
Bassoon
Clarinet in B-flat
Flute
Oboe
Alto Saxophone

These sounds were originally recorded at Ircam in Paris (France) between 1996 and 1999, as part of a larger project named Studio On Line (SOL). Although SOL contains many combinations of mutes and extended playing techniques, TinySOL purely consists of sounds played in the so-called “ordinary” style, and in absence of mute.

TinySOL can be used for education and research purposes. In particular, it can be employed as a dataset for training and/or evaluating music information retrieval (MIR) systems, for tasks such as instrument recognition or fundamental frequency estimation. For this purpose, we provide an official 5-fold split of TinySOL as a metadata attribute. This split has been carefully balanced in terms of instrumentation, pitch range, and dynamics. For the sake of research reproducibility, we encourage users of TinySOL to adopt this split and report their results in terms of average performance across folds.

We encourage TinySOL users to subscribe to the Ircam Forum so that they can have access to larger versions of SOL.

For more details, please visit: https://www.orch-idea.org/

class mirdata.datasets.tinysol.Dataset(data_home=None)[source]¶

The tinysol dataset

Variables

data_home (str) – path where mirdata will look for the dataset
name (str) – the identifier of the dataset
bibtex (str or None) – dataset citation/s in bibtex format
remotes (dict or None) – data to be downloaded
readme (str) – information about the dataset
track (function) – a function mapping a track_id to a mirdata.core.Track

choice_track()[source]¶

Choose a random track

Returns: Track – a Track object instantiated by a random track_id

cite()[source]¶: Print the reference

property default_path¶

Get the default path for the dataset

Returns: str – Local path to the dataset

download(partial_download=None, force_overwrite=False, cleanup=False)[source]¶

Download data to save_dir and optionally print a message.

Parameters

partial_download (list or None) – A list of keys of remotes to partially download. If None, all data is downloaded
force_overwrite (bool) – If True, existing files are overwritten by the downloaded files.
cleanup (bool) – Whether to delete any zip/tar files after extracting.

Raises

ValueError – if invalid keys are passed to partial_download
IOError – if a downloaded file’s checksum is different from expected

license()[source]¶: Print the license

load_audio(*args, **kwargs)[source]¶

Load a TinySOL audio file.

Parameters

fhandle (str or file-like) – File-like object or path to audio file

Returns

np.ndarray - the mono audio signal
float - The sample rate of the audio file

load_tracks()[source]¶

Load all tracks in the dataset

Returns: dict – {track_id: track data}
Raises: NotImplementedError – If the dataset does not support Tracks

track_ids[source]¶

Return track ids

Returns: list – A list of track ids

validate(verbose=True)[source]¶

Validate if the stored dataset is a valid version

Parameters

verbose (bool) – If False, don’t print output

Returns

list - files in the index but are missing locally
list - files which have an invalid checksum

class mirdata.datasets.tinysol.Track(track_id, data_home, dataset_name, index, metadata)[source]¶

tinysol Track class

Parameters

track_id (str) – track id of the track

Variables

audio_path (str) – path of the audio file
dynamics (str) – dynamics abbreviation. Ex: pp, mf, ff, etc.
dynamics_id (int) – pp=0, p=1, mf=2, f=3, ff=4
family (str) – instrument family encoded by its English name
instance_id (int) – instance ID. Either equal to 0, 1, 2, or 3.
instrument_abbr (str) – instrument abbreviation
instrument_full (str) – instrument encoded by its English name
is_resampled (bool) – True if this sample was pitch-shifted from a neighbor; False if it was genuinely recorded.
pitch (str) – string containing English pitch class and octave number
pitch_id (int) – MIDI note index, where middle C (“C4”) corresponds to 60
string_id (NoneType) – string ID. By musical convention, the first string is the highest. On wind instruments, this is replaced by None.
technique_abbr (str) – playing technique abbreviation
technique_full (str) – playing technique encoded by its English name
track_id (str) – track id

property audio¶

The track’s audio

Returns

np.ndarray - audio signal
float - sample rate

to_jams()[source]¶

Get the track’s data in jams format

Returns: jams.JAMS – the track’s data in jams format

mirdata.datasets.tinysol.load_audio(fhandle: BinaryIO) → Tuple[numpy.ndarray, float][source]¶

Load a TinySOL audio file.

Parameters

fhandle (str or file-like) – File-like object or path to audio file

Returns

np.ndarray - the mono audio signal
float - The sample rate of the audio file

tonality_classicaldb¶

Tonality classicalDB Dataset Loader

Dataset Info

The Tonality classicalDB Dataset includes 881 classical musical pieces across different styles from s.XVII to s.XX annotated with single-key labels.

Tonality classicalDB Dataset was created as part of:

Gómez, E. (2006). PhD Thesis. Tonal description of music audio signals.
Department of Information and Communication Technologies.

This dataset is mainly intended to assess the performance of computational key estimation algorithms in classical music.

2020 note: The audio is privates. If you don’t have the original audio collection, you could create it from your private collection because most of the recordings are well known. To this end, we provide musicbrainz metadata. Moreover, we have added the spectrum and HPCP chromagram of each audio.

This dataset can be used with mirdata library: https://github.com/mir-dataset-loaders/mirdata

Spectrum features have been computed as is shown here: https://github.com/mir-dataset-loaders/mirdata-notebooks/blob/master/Tonality_classicalDB/ClassicalDB_spectrum_features.ipynb

HPCP chromagram has been computed as is shown here: https://github.com/mir-dataset-loaders/mirdata-notebooks/blob/master/Tonality_classicalDB/ClassicalDB_HPCP_features.ipynb

Musicbrainz metadata has been computed as is shown here: https://github.com/mir-dataset-loaders/mirdata-notebooks/blob/master/Tonality_classicalDB/ClassicalDB_musicbrainz_metadata.ipynb

class mirdata.datasets.tonality_classicaldb.Dataset(data_home=None)[source]¶

The tonality_classicaldb dataset

Variables

data_home (str) – path where mirdata will look for the dataset
name (str) – the identifier of the dataset
bibtex (str or None) – dataset citation/s in bibtex format
remotes (dict or None) – data to be downloaded
readme (str) – information about the dataset
track (function) – a function mapping a track_id to a mirdata.core.Track

choice_track()[source]¶

Choose a random track

Returns: Track – a Track object instantiated by a random track_id

cite()[source]¶: Print the reference

property default_path¶

Get the default path for the dataset

Returns: str – Local path to the dataset

download(partial_download=None, force_overwrite=False, cleanup=False)[source]¶

Download data to save_dir and optionally print a message.

Parameters

partial_download (list or None) – A list of keys of remotes to partially download. If None, all data is downloaded
force_overwrite (bool) – If True, existing files are overwritten by the downloaded files.
cleanup (bool) – Whether to delete any zip/tar files after extracting.

Raises

ValueError – if invalid keys are passed to partial_download
IOError – if a downloaded file’s checksum is different from expected

license()[source]¶: Print the license

load_audio(*args, **kwargs)[source]¶

Load a Tonality classicalDB audio file.

Parameters

fhandle (str or file-like) – File-like object or path to audio file

Returns

np.ndarray - the mono audio signal
float - The sample rate of the audio file

load_hpcp(*args, **kwargs)[source]¶

Load Tonality classicalDB HPCP feature from a file

Parameters: fhandle (str or file-like) – File-like object or path to HPCP file
Returns: np.ndarray – loaded HPCP data

load_key(*args, **kwargs)[source]¶

Load Tonality classicalDB format key data from a file

Parameters: fhandle (str or file-like) – File-like object or path to key annotation file
Returns: str – musical key data

load_musicbrainz(*args, **kwargs)[source]¶

Load Tonality classicalDB musicbraiz metadata from a file

Parameters: fhandle (str or file-like) – File-like object or path to musicbrainz metadata file
Returns: dict – musicbrainz metadata

load_spectrum(*args, **kwargs)[source]¶

Load Tonality classicalDB spectrum data from a file

Parameters: fhandle (str or file-like) – File-like object or path to spectrum file
Returns: np.ndarray – spectrum data

load_tracks()[source]¶

Load all tracks in the dataset

Returns: dict – {track_id: track data}
Raises: NotImplementedError – If the dataset does not support Tracks

track_ids[source]¶

Return track ids

Returns: list – A list of track ids

validate(verbose=True)[source]¶

Validate if the stored dataset is a valid version

Parameters

verbose (bool) – If False, don’t print output

Returns

list - files in the index but are missing locally
list - files which have an invalid checksum

class mirdata.datasets.tonality_classicaldb.Track(track_id, data_home, dataset_name, index, metadata)[source]¶

tonality_classicaldb track class

Parameters

track_id (str) – track id of the track

Variables

audio_path (str) – track audio path
key_path (str) – key annotation path
title (str) – title of the track
track_id (str) – track id

Other Parameters

key (str) – key annotation
spectrum (np.array) – computed audio spectrum
hpcp (np.array) – computed hpcp
musicbrainz_metadata (dict) – MusicBrainz metadata

property audio¶

The track’s audio

Returns

np.ndarray - audio signal
float - sample rate

to_jams()[source]¶

Get the track’s data in jams format

Returns: jams.JAMS – the track’s data in jams format

mirdata.datasets.tonality_classicaldb.load_audio(fhandle: BinaryIO) → Tuple[numpy.ndarray, float][source]¶

Load a Tonality classicalDB audio file.

Parameters

fhandle (str or file-like) – File-like object or path to audio file

Returns

np.ndarray - the mono audio signal
float - The sample rate of the audio file

mirdata.datasets.tonality_classicaldb.load_hpcp(fhandle: TextIO) → numpy.ndarray [source]¶

Load Tonality classicalDB HPCP feature from a file

Parameters: fhandle (str or file-like) – File-like object or path to HPCP file
Returns: np.ndarray – loaded HPCP data

mirdata.datasets.tonality_classicaldb.load_key(fhandle: TextIO) → str[source]¶

Load Tonality classicalDB format key data from a file

Parameters: fhandle (str or file-like) – File-like object or path to key annotation file
Returns: str – musical key data

mirdata.datasets.tonality_classicaldb.load_musicbrainz(fhandle: TextIO) → Dict[Any, Any][source]¶

Load Tonality classicalDB musicbraiz metadata from a file

Parameters: fhandle (str or file-like) – File-like object or path to musicbrainz metadata file
Returns: dict – musicbrainz metadata

mirdata.datasets.tonality_classicaldb.load_spectrum(fhandle: TextIO) → numpy.ndarray [source]¶

Load Tonality classicalDB spectrum data from a file

Parameters: fhandle (str or file-like) – File-like object or path to spectrum file
Returns: np.ndarray – spectrum data

Core¶

Core mirdata classes

class mirdata.core.Dataset(data_home=None, name=None, track_class=None, bibtex=None, remotes=None, download_info=None, license_info=None, custom_index_path=None)[source]¶

mirdata Dataset class

Variables

data_home (str) – path where mirdata will look for the dataset
name (str) – the identifier of the dataset
bibtex (str or None) – dataset citation/s in bibtex format
remotes (dict or None) – data to be downloaded
readme (str) – information about the dataset
track (function) – a function mapping a track_id to a mirdata.core.Track

__init__(data_home=None, name=None, track_class=None, bibtex=None, remotes=None, download_info=None, license_info=None, custom_index_path=None)[source]¶

Dataset init method

Parameters

data_home (str or None) – path where mirdata will look for the dataset
name (str or None) – the identifier of the dataset
track_class (mirdata.core.Track or None) – a Track class
bibtex (str or None) – dataset citation/s in bibtex format
remotes (dict or None) – data to be downloaded
download_info (str or None) – download instructions or caveats
license_info (str or None) – license of the dataset
custom_index_path (str or None) – overwrites the default index path for remote indexes

choice_track()[source]¶

Choose a random track

Returns: Track – a Track object instantiated by a random track_id

cite()[source]¶: Print the reference

property default_path¶

Get the default path for the dataset

Returns: str – Local path to the dataset

download(partial_download=None, force_overwrite=False, cleanup=False)[source]¶

Download data to save_dir and optionally print a message.

Parameters

partial_download (list or None) – A list of keys of remotes to partially download. If None, all data is downloaded
force_overwrite (bool) – If True, existing files are overwritten by the downloaded files.
cleanup (bool) – Whether to delete any zip/tar files after extracting.

Raises

ValueError – if invalid keys are passed to partial_download
IOError – if a downloaded file’s checksum is different from expected

license()[source]¶: Print the license

load_tracks()[source]¶

Load all tracks in the dataset

Returns: dict – {track_id: track data}
Raises: NotImplementedError – If the dataset does not support Tracks

track_ids[source]¶

Return track ids

Returns: list – A list of track ids

validate(verbose=True)[source]¶

Validate if the stored dataset is a valid version

Parameters

verbose (bool) – If False, don’t print output

Returns

list - files in the index but are missing locally
list - files which have an invalid checksum

class mirdata.core.MultiTrack(track_id, data_home, dataset_name, index, metadata=None)[source]¶

MultiTrack class.

A multitrack class is a collection of track objects and their associated audio that can be mixed together. A multitrack is iteslf a Track, and can have its own associated audio (such as a mastered mix), its own metadata and its own annotations.

get_mix()[source]¶

Create a linear mixture given a subset of tracks.

Parameters: track_keys (list) – list of track keys to mix together
Returns: np.ndarray – mixture audio with shape (n_samples, n_channels)

get_random_target(n_tracks=None, min_weight=0.3, max_weight=1.0)[source]¶

Get a random target by combining a random selection of tracks with random weights

Parameters

n_tracks (int or None) – number of tracks to randomly mix. If None, uses all tracks
min_weight (float) – minimum possible weight when mixing
max_weight (float) – maximum possible weight when mixing

Returns

np.ndarray - mixture audio with shape (n_samples, n_channels)
list - list of keys of included tracks
list - list of weights used to mix tracks

get_target(track_keys, weights=None, average=True, enforce_length=True)[source]¶

Get target which is a linear mixture of tracks

Parameters

track_keys (list) – list of track keys to mix together
weights (list or None) – list of positive scalars to be used in the average
average (bool) – if True, computes a weighted average of the tracks if False, computes a weighted sum of the tracks
enforce_length (bool) – If True, raises ValueError if the tracks are not the same length. If False, pads audio with zeros to match the length of the longest track

Returns

np.ndarray – target audio with shape (n_channels, n_samples)

Raises

ValueError – if sample rates of the tracks are not equal if enforce_length=True and lengths are not equal

class mirdata.core.Track(track_id, data_home, dataset_name, index, metadata=None)[source]¶

Track base class

See the docs for each dataset loader’s Track class for details

__init__(track_id, data_home, dataset_name, index, metadata=None)[source]¶

Track init method. Sets boilerplate attributes, including:

track_id
_dataset_name
_data_home
_track_paths
_track_metadata

Parameters

track_id (str) – track id
data_home (str) – path where mirdata will look for the dataset
dataset_name (str) – the identifier of the dataset
index (dict) – the dataset’s file index
metadata (dict or None) – a dictionary of metadata or None

class mirdata.core.cached_property(func)[source]¶

Cached propery decorator

A property that is only computed once per instance and then replaces itself with an ordinary attribute. Deleting the attribute resets the property. Source: https://github.com/bottlepy/bottle/commit/fa7733e075da0d790d809aa3d2f53071897e6f76

mirdata.core.copy_docs(original)[source]¶: Decorator function to copy docs from one function to another

mirdata.core.docstring_inherit(parent)[source]¶

Decorator function to inherit docstrings from the parent class.

Adds documented Attributes from the parent to the child docs.

mirdata.core.none_path_join(partial_path_list)[source]¶

Join a list of partial paths. If any part of the path is None, returns None.

Parameters: partial_path_list (list) – List of partial paths
Returns: str or None – joined path string or None

Annotations¶

mirdata annotation data types

class mirdata.annotations.Annotation[source]¶: Annotation base class

class mirdata.annotations.BeatData(times, positions=None)[source]¶

BeatData class

Variables

times (np.ndarray) – array of time stamps (as floats) in seconds with positive, strictly increasing values
positions (np.ndarray or None) – array of beat positions (as ints) e.g. 1, 2, 3, 4

class mirdata.annotations.ChordData(intervals, labels, confidence=None)[source]¶

ChordData class

Variables

intervals (np.ndarray or None) – (n x 2) array of intervals (as floats) in seconds in the form [start_time, end_time] with positive time stamps and end_time >= start_time.
labels (list) – list chord labels (as strings)
confidence (np.ndarray or None) – array of confidence values between 0 and 1

class mirdata.annotations.EventData(intervals, events)[source]¶

TempoData class

Variables

intervals (np.ndarray) – (n x 2) array of intervals (as floats) in seconds in the form [start_time, end_time] with positive time stamps and end_time >= start_time.
events (list) – list of event labels (as strings)

class mirdata.annotations.F0Data(times, frequencies, confidence=None)[source]¶

F0Data class

Variables

times (np.ndarray) – array of time stamps (as floats) in seconds with positive, strictly increasing values
frequencies (np.ndarray) – array of frequency values (as floats) in Hz
confidence (np.ndarray or None) – array of confidence values between 0 and 1

class mirdata.annotations.KeyData(intervals, keys)[source]¶

KeyData class

Variables

intervals (np.ndarray) – (n x 2) array of intervals (as floats) in seconds in the form [start_time, end_time] with positive time stamps and end_time >= start_time.
keys (list) – list key labels (as strings)

class mirdata.annotations.LyricData(intervals, lyrics, pronunciations=None)[source]¶

LyricData class

Variables

intervals (np.ndarray) – (n x 2) array of intervals (as floats) in seconds in the form [start_time, end_time] with positive time stamps and end_time >= start_time.
lyrics (list) – list of lyrics (as strings)
pronunciations (list or None) – list of pronunciations (as strings)

class mirdata.annotations.MultiF0Data(times, frequency_list, confidence_list=None)[source]¶

MultiF0Data class

Variables

times (np.ndarray) – array of time stamps (as floats) in seconds with positive, strictly increasing values
frequency_list (list) – list of lists of frequency values (as floats) in Hz
confidence_list (list or None) – list of lists of confidence values between 0 and 1

class mirdata.annotations.NoteData(intervals, notes, confidence=None)[source]¶

NoteData class

Variables

intervals (np.ndarray) – (n x 2) array of intervals (as floats) in seconds in the form [start_time, end_time] with positive time stamps and end_time >= start_time.
notes (np.ndarray) – array of notes (as floats) in Hz
confidence (np.ndarray or None) – array of confidence values between 0 and 1

class mirdata.annotations.SectionData(intervals, labels=None)[source]¶

SectionData class

Variables

intervals (np.ndarray) – (n x 2) array of intervals (as floats) in seconds in the form [start_time, end_time] times should be positive and intervals should have non-negative duration
labels (list or None) – list of labels (as strings)

class mirdata.annotations.TempoData(intervals, value, confidence=None)[source]¶

TempoData class

Variables

intervals (np.ndarray) – (n x 2) array of intervals (as floats) in seconds in the form [start_time, end_time] with positive time stamps and end_time >= start_time.
value (list) – array of tempo values (as floats)
confidence (np.ndarray or None) – array of confidence values between 0 and 1

mirdata.annotations.validate_array_like(array_like, expected_type, expected_dtype, none_allowed=False)[source]¶

Validate that array-like object is well formed

If array_like is None, validation passes automatically.

Parameters

array_like (array-like) – object to validate
expected_type (type) – expected type, either list or np.ndarray
expected_dtype (type) – expected dtype
none_allowed (bool) – if True, allows array to be None

Raises

TypeError – if type/dtype does not match expected_type/expected_dtype
ValueError – if array

mirdata.annotations.validate_confidence(confidence)[source]¶

Validate if confidence is well-formed.

If confidence is None, validation passes automatically

Parameters: confidence (np.ndarray) – an array of confidence values
Raises: ValueError – if confidence are not between 0 and 1

mirdata.annotations.validate_intervals(intervals)[source]¶

Validate if intervals are well-formed.

If intervals is None, validation passes automatically

Parameters

intervals (np.ndarray) – (n x 2) array

Raises

ValueError – if intervals have an invalid shape, have negative values
or if end times are smaller than start times. –

mirdata.annotations.validate_lengths_equal(array_list)[source]¶

Validate that arrays in list are equal in length

Some arrays may be None, and the validation for these are skipped.

Parameters: array_list (list) – list of array-like objects
Raises: ValueError – if arrays are not equal in length

mirdata.annotations.validate_times(times)[source]¶

Validate if times are well-formed.

If times is None, validation passes automatically

Parameters: times (np.ndarray) – an array of time stamps
Raises: ValueError – if times have negative values or are non-increasing

Advanced¶

mirdata.validate¶

Utility functions for mirdata

mirdata.validate.log_message(message, verbose=True)[source]¶

Helper function to log message

Parameters

message (str) – message to log
verbose (bool) – if false, the message is not logged

mirdata.validate.md5(file_path)[source]¶

Get md5 hash of a file.

Parameters: file_path (str) – File path
Returns: str – md5 hash of data in file_path

mirdata.validate.validate(local_path, checksum)[source]¶

Validate that a file exists and has the correct checksum

Parameters

local_path (str) – file path
checksum (str) – md5 checksum

Returns

bool - True if file exists
bool - True if checksum matches

mirdata.validate.validate_files(file_dict, data_home, verbose)[source]¶

Validate files

Parameters

file_dict (dict) – dictionary of file information
data_home (str) – path where the data lives
verbose (bool) – if True, show progress

Returns

dict - missing files
dict - files with invalid checksums

mirdata.validate.validate_index(dataset_index, data_home, verbose=True)[source]¶

Validate files in a dataset’s index

Parameters

dataset_index (list) – dataset indices
data_home (str) – Local home path that the dataset is being stored
verbose (bool) – if true, prints validation status while running

Returns

dict - file paths that are in the index but missing locally
dict - file paths with differing checksums

mirdata.validate.validate_metadata(file_dict, data_home, verbose)[source]¶

Validate files

Parameters

file_dict (dict) – dictionary of file information
data_home (str) – path where the data lives
verbose (bool) – if True, show progress

Returns

dict - missing files
dict - files with invalid checksums

mirdata.validate.validator(dataset_index, data_home, verbose=True)[source]¶

Checks the existence and validity of files stored locally with respect to the paths and file checksums stored in the reference index. Logs invalid checksums and missing files.

Parameters

dataset_index (list) – dataset indices
data_home (str) – Local home path that the dataset is being stored
verbose (bool) – if True (default), prints missing and invalid files to stdout. Otherwise, this function is equivalent to validate_index.

Returns

missing_files (list) –

List of file paths that are in the dataset index: but missing locally.
invalid_checksums (list): List of file paths that file exists in the: dataset index but has a different checksum compare to the reference checksum.

mirdata.download_utils¶

Utilities for downloading from the web.

class mirdata.download_utils.DownloadProgressBar(*_, **__)[source]¶: Wrap tqdm to show download progress

class mirdata.download_utils.RemoteFileMetadata(filename, url, checksum, destination_dir=None, unpack_directories=None)[source]¶

The metadata for a remote file

Variables

filename (str) – the remote file’s basename
url (str) – the remote file’s url
checksum (str) – the remote file’s md5 checksum
destination_dir (str or None) – the relative path for where to save the file
unpack_directories (list or None) – list of relative directories. For each directory the contents will be moved to destination_dir (or data_home if not provieds)

mirdata.download_utils.download_from_remote(remote, save_dir, force_overwrite)[source]¶

Download a remote dataset into path Fetch a dataset pointed by remote’s url, save into path using remote’s filename and ensure its integrity based on the MD5 Checksum of the downloaded file.

Adapted from scikit-learn’s sklearn.datasets.base._fetch_remote.

Parameters

remote (RemoteFileMetadata) – Named tuple containing remote dataset meta information: url, filename and checksum
save_dir (str) – Directory to save the file to. Usually data_home
force_overwrite (bool) – If True, overwrite existing file with the downloaded file. If False, does not overwrite, but checks that checksum is consistent.

Returns

str – Full path of the created file.

mirdata.download_utils.download_tar_file(tar_remote, save_dir, force_overwrite, cleanup)[source]¶

Download and untar a tar file.

Parameters

tar_remote (RemoteFileMetadata) – Object containing download information
save_dir (str) – Path to save downloaded file
force_overwrite (bool) – If True, overwrites existing files
cleanup (bool) – If True, remove tarfile after untarring

mirdata.download_utils.download_zip_file(zip_remote, save_dir, force_overwrite, cleanup)[source]¶

Download and unzip a zip file.

Parameters

zip_remote (RemoteFileMetadata) – Object containing download information
save_dir (str) – Path to save downloaded file
force_overwrite (bool) – If True, overwrites existing files
cleanup (bool) – If True, remove zipfile after unziping

mirdata.download_utils.downloader(save_dir, remotes=None, partial_download=None, info_message=None, force_overwrite=False, cleanup=False)[source]¶

Download data to save_dir and optionally log a message.

Parameters

save_dir (str) – The directory to download the data
remotes (dict or None) – A dictionary of RemoteFileMetadata tuples of data in zip format. If None, there is no data to download
partial_download (list or None) – A list of keys to partially download the remote objects of the download dict. If None, all data is downloaded
info_message (str or None) – A string of info to log when this function is called. If None, no string is logged.
force_overwrite (bool) – If True, existing files are overwritten by the downloaded files.
cleanup (bool) – Whether to delete the zip/tar file after extracting.

mirdata.download_utils.extractall_unicode(zfile, out_dir)[source]¶

Extract all files inside a zip archive to a output directory.

In comparison to the zipfile, it checks for correct file name encoding

Parameters

zfile (obj) – Zip file object created with zipfile.ZipFile
out_dir (str) – Output folder

mirdata.download_utils.move_directory_contents(source_dir, target_dir)[source]¶

Move the contents of source_dir into target_dir, and delete source_dir

Parameters

source_dir (str) – path to source directory
target_dir (str) – path to target directory

mirdata.download_utils.untar(tar_path, cleanup)[source]¶

Untar a tar file inside it’s current directory.

Parameters

tar_path (str) – Path to tar file
cleanup (bool) – If True, remove tarfile after untarring

mirdata.download_utils.unzip(zip_path, cleanup)[source]¶

Unzip a zip file inside it’s current directory.

Parameters

zip_path (str) – Path to zip file
cleanup (bool) – If True, remove zipfile after unzipping

mirdata.jams_utils¶

Utilities for converting mirdata Annotation classes to jams format.

mirdata.jams_utils.beats_to_jams(beat_data, description=None)[source]¶

Convert beat annotations into jams format.

Parameters

beat_data (annotations.BeatData) – beat data object
description (str) – annotation description

Returns

jams.Annotation – jams annotation object.

mirdata.jams_utils.chords_to_jams(chord_data, description=None)[source]¶

Convert chord annotations into jams format.

Parameters

chord_data (annotations.ChordData) – chord data object
description (str) – annotation description

Returns

jams.Annotation – jams annotation object.

mirdata.jams_utils.events_to_jams(event_data, description=None)[source]¶

Convert events annotations into jams format.

Parameters

event_data (annotations.EventData) – event data object
description (str) – annotation description

Returns

jams.Annotation – jams annotation object.

mirdata.jams_utils.f0s_to_jams(f0_data, description=None)[source]¶

Convert f0 annotations into jams format.

Parameters

f0_data (annotations.F0Data) – f0 annotation object
description (str) – annotation descriptoin

Returns

jams.Annotation – jams annotation object.

mirdata.jams_utils.jams_converter(audio_path=None, spectrogram_path=None, beat_data=None, chord_data=None, note_data=None, f0_data=None, section_data=None, multi_section_data=None, tempo_data=None, event_data=None, key_data=None, lyrics_data=None, tags_gtzan_data=None, tags_open_data=None, metadata=None)[source]¶

Convert annotations from a track to JAMS format.

Parameters

audio_path (str or None) – A path to the corresponding audio file, or None. If provided, the audio file will be read to compute the duration. If None, ‘duration’ must be a field in the metadata dictionary, or the resulting jam object will not validate.
spectrum_cante100_path (str or None) – A path to the corresponding spectrum file, or None.
beat_data (list or None) – A list of tuples of (annotations.BeatData, str), where str describes the annotation (e.g. ‘beats_1’).
chord_data (list or None) – A list of tuples of (annotations.ChordData, str), where str describes the annotation.
note_data (list or None) – A list of tuples of (annotations.NoteData, str), where str describes the annotation.
f0_data (list or None) – A list of tuples of (annotations.F0Data, str), where str describes the annotation.
section_data (list or None) – A list of tuples of (annotations.SectionData, str), where str describes the annotation.
multi_section_data (list or None) – A list of tuples. Tuples in multi_section_data should contain another list of tuples, indicating annotations in the different levels e.g. ([(segments0, level0), ‘(segments1, level1)], annotator) and a str indicating the annotator
tempo_data (list or None) – A list of tuples of (float, str), where float gives the tempo in bpm and str describes the annotation.
event_data (list or None) – A list of tuples of (annotations.EventData, str), where str describes the annotation.
key_data (list or None) – A list of tuples of (annotations.KeyData, str), where str describes the annotation.
lyrics_data (list or None) – A list of tuples of (annotations.LyricData, str), where str describes the annotation.
tags_gtzan_data (list or None) – A list of tuples of (str, str), where the first srt is the tag and the second is a descriptor of the annotation.
tags_open_data (list or None) – A list of tuples of (str, str), where the first srt is the tag and the second is a descriptor of the annotation.
metadata (dict or None) – A dictionary containing the track metadata.

Returns

jams.JAMS – A JAMS object containing the annotations.

mirdata.jams_utils.keys_to_jams(key_data, description)[source]¶

Convert key annotations into jams format.

Parameters

key_data (annotations.KeyData) – key data object
description (str) – annotation description

Returns

jams.Annotation – jams annotation object.

mirdata.jams_utils.lyrics_to_jams(lyric_data, description=None)[source]¶

Convert lyric annotations into jams format.

Parameters

lyric_data (annotations.LyricData) – lyric annotation object
description (str) – annotation descriptoin

Returns

jams.Annotation – jams annotation object.

mirdata.jams_utils.multi_sections_to_jams(multisection_data, description)[source]¶

Convert multi-section annotations into jams format.

Parameters

multisection_data (list) – list of tuples of the form [(SectionData, int)]
description (str) – annotation description

Returns

jams.Annotation – jams annotation object.

mirdata.jams_utils.notes_to_jams(note_data, description)[source]¶

Convert note annotations into jams format.

Parameters

note_data (annotations.NoteData) – note data object
description (str) – annotation description

Returns

jams.Annotation – jams annotation object.

mirdata.jams_utils.sections_to_jams(section_data, description=None)[source]¶

Convert section annotations into jams format.

Parameters

section_data (annotations.SectionData) – section data object
description (str) – annotation description

Returns

jams.Annotation – jams annotation object.

mirdata.jams_utils.tag_to_jams(tag_data, namespace='tag_open', description=None)[source]¶

Convert lyric annotations into jams format.

Parameters

lyric_data (annotations.LyricData) – lyric annotation object
namespace (str) – the jams-compatible tag namespace
description (str) – annotation descriptoin

Returns

jams.Annotation – jams annotation object.

mirdata.jams_utils.tempos_to_jams(tempo_data, description=None)[source]¶

Convert tempo annotations into jams format.

Parameters

tempo_data (annotations.TempoData) – tempo data object
description (str) – annotation description

Returns

jams.Annotation – jams annotation object.