Initializing¶
- mirdata.initialize(dataset_name, data_home=None)[source]¶ Load a mirdata dataset by name
Example
orchset = mirdata.initialize('orchset')  # get the orchset dataset
orchset.download()  # download orchset
orchset.validate()  # validate orchset
track = orchset.choice_track()  # load a random track
print(track)  # see what data a track contains
orchset.track_ids()  # load all track ids
Parameters: - dataset_name (str) – the dataset’s name. See mirdata.DATASETS for a complete list of possibilities
- data_home (str or None) – path where the data lives. If None uses the default location.
Returns: Dataset – a mirdata.core.Dataset object
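Under the hood, initialize resolves the dataset name against mirdata.DATASETS, the registry of available loaders, and raises a ValueError for unknown names. A minimal sketch of that lookup pattern, with a hypothetical two-entry registry and tuple placeholders standing in for Dataset objects (not mirdata's actual internals):

```python
# Illustrative name-to-constructor registry; the entries and the tuple
# return values are hypothetical stand-ins for mirdata's loader modules.
DATASETS = {
    "orchset": lambda data_home: ("orchset", data_home),
    "beatles": lambda data_home: ("beatles", data_home),
}

def initialize(dataset_name, data_home=None):
    """Instantiate the named dataset, or raise ValueError if unknown."""
    if dataset_name not in DATASETS:
        raise ValueError(f"Invalid dataset {dataset_name!r}")
    return DATASETS[dataset_name](data_home)
```

Passing data_home=None here mirrors the documented behavior of falling back to a default location, which the real loaders resolve per dataset.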
Dataset Loaders¶
acousticbrainz_genre¶
Acoustic Brainz Genre dataset
Dataset Info
The AcousticBrainz Genre Dataset consists of four datasets of genre annotations and music features extracted from audio suited for evaluation of hierarchical multi-label genre classification systems.
A description of the music features can be found here: https://essentia.upf.edu/streaming_extractor_music.html
The datasets are used within the MediaEval AcousticBrainz Genre Task. The task is focused on content-based music genre recognition using genre annotations from multiple sources and large-scale music features data available in the AcousticBrainz database. The goal of our task is to explore how the same music pieces can be annotated differently by different communities following different genre taxonomies, and how this should be addressed by content-based genre recognition systems.
We provide four datasets containing genre and subgenre annotations extracted from four different online metadata sources:
- AllMusic and Discogs are based on editorial metadata databases maintained by music experts and enthusiasts. These sources contain explicit genre/subgenre annotations of music releases (albums) following a predefined genre namespace and taxonomy. We propagated release-level annotations to recordings (tracks) in AcousticBrainz to build the datasets.
- Lastfm and Tagtraum are based on collaborative music tagging platforms with large amounts of genre labels provided by their users for music recordings (tracks). We have automatically inferred a genre/subgenre taxonomy and annotations from these labels.
For details on format and contents, please refer to the data webpage.
Note that the AllMusic ground-truth annotations are distributed separately at https://zenodo.org/record/2554044.
If you use the MediaEval AcousticBrainz Genre dataset or part of it, please cite our ISMIR 2019 overview paper:
Bogdanov, D., Porter A., Schreiber H., Urbano J., & Oramas S. (2019).
The AcousticBrainz Genre Dataset: Multi-Source, Multi-Level, Multi-Label, and Large-Scale.
20th International Society for Music Information Retrieval Conference (ISMIR 2019).
This work is partially supported by the European Union’s Horizon 2020 research and innovation programme under grant agreement No 688382 AudioCommons.
- class mirdata.datasets.acousticbrainz_genre.Dataset(data_home=None, index=None)[source]¶ The acousticbrainz genre dataset
Variables: - data_home (str) – path where mirdata will look for the dataset
- name (str) – the identifier of the dataset
- bibtex (str or None) – dataset citation/s in bibtex format
- remotes (dict or None) – data to be downloaded
- readme (str) – information about the dataset
- track (function) – a function which inputs a track_id (str) and returns a mirdata.core.Track or None
- choice_track()[source]¶ Choose a random track
Returns: Track – a Track object instantiated by a random track_id
- default_path¶ Get the default path for the dataset
Returns: str – Local path to the dataset
- download(partial_download=None, force_overwrite=False, cleanup=False)[source]¶ Download the dataset
Parameters: - partial_download (list or None) – A list of keys of remotes to partially download. If None, all data is downloaded
- force_overwrite (bool) – If True, existing files are overwritten by the downloaded files. By default False.
- cleanup (bool) – Whether to delete any zip/tar files after extracting.
Raises: - ValueError – if invalid keys are passed to partial_download
- IOError – if a downloaded file’s checksum is different from expected
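The IOError above comes from checksum validation of downloaded files. The comparison can be sketched with hashlib as a generic md5 check (illustrative, not mirdata's exact download code):

```python
import hashlib

def validate_checksum(content, expected_md5):
    """Raise IOError when the md5 of the downloaded bytes differs from
    the checksum recorded for the remote (e.g. in a dataset index)."""
    actual = hashlib.md5(content).hexdigest()
    if actual != expected_md5:
        raise IOError(f"checksum mismatch: got {actual}, expected {expected_md5}")

payload = b"downloaded archive bytes"
validate_checksum(payload, hashlib.md5(payload).hexdigest())  # passes silently
```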
- filter_index(search_key)[source]¶ Load the indexes from the AcousticBrainz genre dataset that match search_key.
Parameters: search_key (str) – regex to match against folds, mbids, or genres Returns: dict – {track_id: track data}
- load_all_train()[source]¶ Load the AcousticBrainz genre dataset tracks used for training across the four datasets.
Returns: dict – {track_id: track data}
- load_all_validation()[source]¶ Load the AcousticBrainz genre dataset tracks used for validation across the four datasets.
Returns: dict – {track_id: track data}
- load_allmusic_train()[source]¶ Load the AcousticBrainz genre dataset tracks used for training in the allmusic dataset.
Returns: dict – {track_id: track data}
- load_allmusic_validation()[source]¶ Load the AcousticBrainz genre dataset tracks used for validation in the allmusic dataset.
Returns: dict – {track_id: track data}
- load_discogs_train()[source]¶ Load the AcousticBrainz genre dataset tracks used for training in the discogs dataset.
Returns: dict – {track_id: track data}
- load_discogs_validation()[source]¶ Load the AcousticBrainz genre dataset tracks used for validation in the discogs dataset.
Returns: dict – {track_id: track data}
- load_extractor(*args, **kwargs)[source]¶ Load an AcousticBrainz Genre Dataset json file with all the features and metadata.
Parameters: path (str) – path to the features and metadata file Returns: dict – the track’s features and metadata
- load_lastfm_train()[source]¶ Load the AcousticBrainz genre dataset tracks used for training in the lastfm dataset.
Returns: dict – {track_id: track data}
- load_lastfm_validation()[source]¶ Load the AcousticBrainz genre dataset tracks used for validation in the lastfm dataset.
Returns: dict – {track_id: track data}
- load_tagtraum_train()[source]¶ Load the AcousticBrainz genre dataset tracks used for training in the tagtraum dataset.
Returns: dict – {track_id: track data}
- load_tagtraum_validation()[source]¶ Load the AcousticBrainz genre dataset tracks used for validation in the tagtraum dataset.
Returns: dict – {track_id: track data}
- class mirdata.datasets.acousticbrainz_genre.Track(track_id, data_home, remote_index=None, remote_index_name=None)[source]¶ AcousticBrainz Genre Dataset track class
Parameters: - track_id (str) – track id of the track
- data_home (str) – Local path where the dataset is stored. If None, looks for the data in the default directory, ~/mir_datasets
Variables: track_id (str) – track id
- album¶ metadata album annotation
Returns: str – album
- artist¶ metadata artist annotation
Returns: str – artist
- date¶ metadata date annotation
Returns: str – date
- file_name¶ metadata file_name annotation
Returns: str – file name
- genre¶ human-labeled genre and subgenres list
Returns: list – human-labeled genre and subgenres list
- low_level¶ low_level track descriptors.
Returns: dict – - ‘average_loudness’: dynamic range descriptor. It rescales average loudness, computed on 2sec windows with 1sec overlap, into the [0,1] interval. The value 0 corresponds to signals with large dynamic range, 1 to signals with little dynamic range. Algorithms: Loudness
- ’dynamic_complexity’: dynamic complexity computed on 2sec windows with 1sec overlap. Algorithms: DynamicComplexity
- ’silence_rate_20dB’, ‘silence_rate_30dB’, ‘silence_rate_60dB’: rate of silent frames in a signal for thresholds of 20, 30, and 60 dBs. Algorithms: SilenceRate
- ’spectral_rms’: spectral RMS. Algorithms: RMS
- ’spectral_flux’: spectral flux of a signal computed using L2-norm. Algorithms: Flux
- ’spectral_centroid’, ‘spectral_kurtosis’, ‘spectral_spread’, ‘spectral_skewness’: centroid and central moments statistics describing the spectral shape. Algorithms: Centroid, CentralMoments
- ’spectral_rolloff’: the roll-off frequency of a spectrum. Algorithms: RollOff
- ’spectral_decrease’: spectral decrease. Algorithms: Decrease
- ’hfc’: high frequency content descriptor as proposed by Masri. Algorithms: HFC
- ’zerocrossingrate’ zero-crossing rate. Algorithms: ZeroCrossingRate
- ’spectral_energy’: spectral energy. Algorithms: Energy
- ’spectral_energyband_low’, ‘spectral_energyband_middle_low’, ‘spectral_energyband_middle_high’, ‘spectral_energyband_high’: spectral energy in frequency bands [20Hz, 150Hz], [150Hz, 800Hz], [800Hz, 4kHz], and [4kHz, 20kHz]. Algorithms: EnergyBand
- ’barkbands’: spectral energy in 27 Bark bands. Algorithms: BarkBands
- ’melbands’: spectral energy in 40 mel bands. Algorithms: MFCC
- ’erbbands’: spectral energy in 40 ERB bands. Algorithms: ERBBands
- ’mfcc’: the first 13 mel frequency cepstrum coefficients. See algorithm: MFCC
- ’gfcc’: the first 13 gammatone feature cepstrum coefficients. Algorithms: GFCC
- ’barkbands_crest’, ‘barkbands_flatness_db’: crest and flatness computed over energies in Bark bands. Algorithms: Crest, FlatnessDB
- ’barkbands_kurtosis’, ‘barkbands_skewness’, ‘barkbands_spread’: central moments statistics over energies in Bark bands. Algorithms: CentralMoments
- ’melbands_crest’, ‘melbands_flatness_db’: crest and flatness computed over energies in mel bands. Algorithms: Crest, FlatnessDB
- ’melbands_kurtosis’, ‘melbands_skewness’, ‘melbands_spread’: central moments statistics over energies in mel bands. Algorithms: CentralMoments
- ’erbbands_crest’, ‘erbbands_flatness_db’: crest and flatness computed over energies in ERB bands. Algorithms: Crest, FlatnessDB
- ’erbbands_kurtosis’, ‘erbbands_skewness’, ‘erbbands_spread’: central moments statistics over energies in ERB bands. Algorithms: CentralMoments
- ’dissonance’: sensory dissonance of a spectrum. Algorithms: Dissonance
- ’spectral_entropy’: Shannon entropy of a spectrum. Algorithms: Entropy
- ’pitch_salience’: pitch salience of a spectrum. Algorithms: PitchSalience
- ’spectral_complexity’: spectral complexity. Algorithms: SpectralComplexity
- ’spectral_contrast_coeffs’, ‘spectral_contrast_valleys’: spectral contrast features. Algorithms: SpectralContrast
- mbid¶ musicbrainz id
Returns: str – mbid
- mbid_group¶ musicbrainz id group
Returns: str – mbid group
- rhythm¶ rhythm essentia extractor descriptors
Returns: dict – - ‘beats_position’: time positions [sec] of detected beats using the beat tracking algorithm by Degara et al., 2012. Algorithms: RhythmExtractor2013, BeatTrackerDegara
- ’beats_count’: number of detected beats
- ’bpm’: BPM value according to detected beats
- ’bpm_histogram_first_peak_bpm’, ‘bpm_histogram_first_peak_spread’, ‘bpm_histogram_first_peak_weight’, ‘bpm_histogram_second_peak_bpm’, ‘bpm_histogram_second_peak_spread’, ‘bpm_histogram_second_peak_weight’: descriptors characterizing the highest and second-highest peaks of the BPM histogram. Algorithms: BpmHistogramDescriptors
- ’beats_loudness’, ‘beats_loudness_band_ratio’: spectral energy computed on beats segments of audio across the whole spectrum, and ratios of energy in 6 frequency bands. Algorithms: BeatsLoudness, SingleBeatLoudness
- ’onset_rate’: number of detected onsets per second. Algorithms: OnsetRate
- ’danceability’: danceability estimate. Algorithms: Danceability
- title¶ metadata title annotation
Returns: str – title
- to_jams()[source]¶ Get the track’s data in jams format
Returns: jams.JAMS – the track data in jams format
- tonal¶ tonal features
Returns: dict – - ‘tuning_frequency’: estimated tuning frequency [Hz]. Algorithms: TuningFrequency
- ’tuning_nontempered_energy_ratio’ and ‘tuning_equal_tempered_deviation’
- ’hpcp’, ‘thpcp’: 32-dimensional harmonic pitch class profile (HPCP) and its transposed version. Algorithms: HPCP
- ’hpcp_entropy’: Shannon entropy of a HPCP vector. Algorithms: Entropy
- ’key_key’, ‘key_scale’: Global key feature. Algorithms: Key
- ’chords_key’, ‘chords_scale’: Global key extracted from chords detection.
- ’chords_strength’, ‘chords_histogram’: strength of estimated chords and normalized histogram of their progression. Algorithms: ChordsDetection, ChordsDescriptors
- ’chords_changes_rate’, ‘chords_number_rate’: rate of chord changes in the progression, and ratio of distinct chords to the total number of chords in the progression. Algorithms: ChordsDetection, ChordsDescriptors
- tracknumber¶ metadata tracknumber annotation
Returns: str – tracknumber
beatles¶
Beatles Dataset Loader
Dataset Info
The Beatles Dataset includes beat and metric position, chord, key, and segmentation annotations for 179 Beatles songs. Details can be found in http://matthiasmauch.net/_pdf/mauch_omp_2009.pdf and http://isophonics.net/content/reference-annotations-beatles.
- class mirdata.datasets.beatles.Dataset(data_home=None)[source]¶ The beatles dataset
Variables: - data_home (str) – path where mirdata will look for the dataset
- name (str) – the identifier of the dataset
- bibtex (str or None) – dataset citation/s in bibtex format
- remotes (dict or None) – data to be downloaded
- readme (str) – information about the dataset
- track (function) – a function which inputs a track_id (str) and returns a mirdata.core.Track or None
- choice_track()[source]¶ Choose a random track
Returns: Track – a Track object instantiated by a random track_id
- default_path¶ Get the default path for the dataset
Returns: str – Local path to the dataset
- download(partial_download=None, force_overwrite=False, cleanup=False)[source]¶ Download data to save_dir and optionally print a message.
Parameters: - partial_download (list or None) – A list of keys of remotes to partially download. If None, all data is downloaded
- force_overwrite (bool) – If True, existing files are overwritten by the downloaded files.
- cleanup (bool) – Whether to delete any zip/tar files after extracting.
Raises: - ValueError – if invalid keys are passed to partial_download
- IOError – if a downloaded file’s checksum is different from expected
- load_audio(*args, **kwargs)[source]¶ Load a Beatles audio file.
Parameters: audio_path (str) – path to audio file Returns: - np.ndarray - the mono audio signal
- float - The sample rate of the audio file
- load_beats(*args, **kwargs)[source]¶ Load Beatles format beat data from a file
Parameters: beats_path (str) – path to beat annotation file Returns: BeatData – loaded beat data
- load_chords(*args, **kwargs)[source]¶ Load Beatles format chord data from a file
Parameters: chords_path (str) – path to chord annotation file Returns: ChordData – loaded chord data
- load_sections(*args, **kwargs)[source]¶ Load Beatles format section data from a file
Parameters: sections_path (str) – path to section annotation file Returns: SectionData – loaded section data
- class mirdata.datasets.beatles.Track(track_id, data_home)[source]¶ Beatles track class
Parameters: - track_id (str) – track id of the track
- data_home (str) – path where the data lives
Variables: - audio_path (str) – track audio path
- beats_path (str) – beat annotation path
- chords_path (str) – chord annotation path
- keys_path (str) – key annotation path
- sections_path (str) – sections annotation path
- title (str) – title of the track
- track_id (str) – track id
Other Parameters: - beats (BeatData) – human-labeled beat annotations
- chords (ChordData) – human-labeled chord annotations
- key (KeyData) – local key annotations
- sections (SectionData) – section annotations
- audio¶ The track’s audio
Returns: - np.ndarray – audio signal
- float – sample rate
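Annotations such as beats, chords, key, and sections listed under Other Parameters are parsed lazily on first attribute access. The caching pattern can be sketched with functools.cached_property, using made-up beat times and a parse counter in place of real file parsing (a generic sketch, not the Track class's code):

```python
from functools import cached_property

class LazyTrack:
    """Parse an annotation file only when the attribute is first read."""

    def __init__(self, beats_path):
        self.beats_path = beats_path
        self.parse_count = 0  # how many times the "file" was parsed

    @cached_property
    def beats(self):
        self.parse_count += 1          # stands in for reading beats_path
        return [0.5, 1.0, 1.5]         # hypothetical beat times in seconds

track = LazyTrack("beats.txt")
_ = track.beats
_ = track.beats                         # second access hits the cache
```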
- mirdata.datasets.beatles.load_audio(audio_path)[source]¶ Load a Beatles audio file.
Parameters: audio_path (str) – path to audio file Returns: - np.ndarray - the mono audio signal
- float - The sample rate of the audio file
- mirdata.datasets.beatles.load_beats(beats_path)[source]¶ Load Beatles format beat data from a file
Parameters: beats_path (str) – path to beat annotation file Returns: BeatData – loaded beat data
- mirdata.datasets.beatles.load_chords(chords_path)[source]¶ Load Beatles format chord data from a file
Parameters: chords_path (str) – path to chord annotation file Returns: ChordData – loaded chord data
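The Isophonics-style annotation files behind load_beats and load_chords are plain text with one "start end label" triple per line. A minimal stdlib parsing sketch with an inline two-line example (the real loaders return BeatData/ChordData objects rather than plain lists):

```python
def parse_lab(text):
    """Split 'start end label' lines into (intervals, labels) lists."""
    intervals, labels = [], []
    for line in text.strip().splitlines():
        start, end, label = line.split(maxsplit=2)
        intervals.append((float(start), float(end)))
        labels.append(label)
    return intervals, labels

# Hypothetical chord annotation content for illustration.
intervals, labels = parse_lab("0.000 2.612 C\n2.612 5.224 G:7\n")
```

Using maxsplit=2 keeps the label intact even if it were to contain spaces.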
beatport_key¶
beatport_key Dataset Loader
Dataset Info
The Beatport EDM Key Dataset includes 1486 two-minute sound excerpts from various EDM subgenres, annotated with single-key labels, comments and confidence levels generously provided by Eduard Mas Marín, and thoroughly revised and expanded by Ángel Faraldo.
The original audio samples belong to online audio snippets from Beatport, an online music store for DJs and Electronic Dance Music producers (<http://www.beatport.com>). If this dataset is used in further research, we would appreciate a citation of the current DOI (10.5281/zenodo.1101082) and the following doctoral dissertation, where a detailed description of the properties of this dataset can be found:
Ángel Faraldo (2017). Tonality Estimation in Electronic Dance Music: A Computational and Musically Informed
Examination. PhD Thesis. Universitat Pompeu Fabra, Barcelona.
This dataset is mainly intended to assess the performance of computational key estimation algorithms in electronic dance music subgenres.
Data License: Creative Commons Attribution Share Alike 4.0 International
- class mirdata.datasets.beatport_key.Dataset(data_home=None)[source]¶ The beatport_key dataset
Variables: - data_home (str) – path where mirdata will look for the dataset
- name (str) – the identifier of the dataset
- bibtex (str or None) – dataset citation/s in bibtex format
- remotes (dict or None) – data to be downloaded
- readme (str) – information about the dataset
- track (function) – a function which inputs a track_id (str) and returns a mirdata.core.Track or None
- choice_track()[source]¶ Choose a random track
Returns: Track – a Track object instantiated by a random track_id
- default_path¶ Get the default path for the dataset
Returns: str – Local path to the dataset
- download(partial_download=None, force_overwrite=False, cleanup=False)[source]¶ Download the dataset
Parameters: - partial_download (list or None) – A list of keys of remotes to partially download. If None, all data is downloaded
- force_overwrite (bool) – If True, existing files are overwritten by the downloaded files.
- cleanup (bool) – Whether to delete any zip/tar files after extracting.
Raises: - ValueError – if invalid keys are passed to partial_download
- IOError – if a downloaded file’s checksum is different from expected
- load_artist(*args, **kwargs)[source]¶ Load beatport_key artist data from a file
Parameters: metadata_path (str) – path to metadata annotation file Returns: list – list of artists involved in the track.
- load_audio(*args, **kwargs)[source]¶ Load a beatport_key audio file.
Parameters: audio_path (str) – path to audio file Returns: - np.ndarray - the mono audio signal
- float - The sample rate of the audio file
- load_genre(*args, **kwargs)[source]¶ Load beatport_key genre data from a file
Parameters: metadata_path (str) – path to metadata annotation file Returns: dict – a dict with a list of genres [‘genres’] and a list of sub-genres [‘sub_genres’]
- load_key(*args, **kwargs)[source]¶ Load beatport_key format key data from a file
Parameters: keys_path (str) – path to key annotation file Returns: list – list of annotated keys
- load_tempo(*args, **kwargs)[source]¶ Load beatport_key tempo data from a file
Parameters: metadata_path (str) – path to metadata annotation file Returns: str – tempo in beats per minute
- class mirdata.datasets.beatport_key.Track(track_id, data_home)[source]¶ beatport_key track class
Parameters: - track_id (str) – track id of the track
- data_home (str) – Local path where the dataset is stored.
Variables: - audio_path (str) – track audio path
- keys_path (str) – key annotation path
- metadata_path (str) – metadata annotation path
- title (str) – title of the track
- track_id (str) – track id
Other Parameters: - key (list) – list of annotated musical keys
- artists (list) – artists involved in the track
- genre (dict) – genres and subgenres
- tempo (int) – tempo in beats per minute
- audio¶ The track’s audio
Returns: - np.ndarray - audio signal
- float - sample rate
- mirdata.datasets.beatport_key.load_artist(metadata_path)[source]¶ Load beatport_key artist data from a file
Parameters: metadata_path (str) – path to metadata annotation file Returns: list – list of artists involved in the track.
- mirdata.datasets.beatport_key.load_audio(audio_path)[source]¶ Load a beatport_key audio file.
Parameters: audio_path (str) – path to audio file Returns: - np.ndarray - the mono audio signal
- float - The sample rate of the audio file
- mirdata.datasets.beatport_key.load_genre(metadata_path)[source]¶ Load beatport_key genre data from a file
Parameters: metadata_path (str) – path to metadata annotation file Returns: dict – a dict with a list of genres [‘genres’] and a list of sub-genres [‘sub_genres’]
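A key annotation is a short string such as "c minor"; splitting it into tonic and scale can be sketched as follows (an illustrative helper with an assumed default of major when no scale is given, not part of the loader):

```python
def split_key(key_string):
    """Separate a 'tonic scale' key label into its two parts."""
    tonic, _, scale = key_string.partition(" ")
    return tonic, scale or "major"  # assumption: bare tonics imply major

tonic, scale = split_key("c minor")
```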
cante100¶
cante100 Loader
Dataset Info
The cante100 dataset contains 100 tracks taken from the COFLA corpus. We defined 10 style families, with 10 tracks included from each. Apart from the style family, we manually annotated the sections of the track in which the vocals are present. In addition, we provide a number of low-level descriptors and the fundamental frequency corresponding to the predominant melody for each track. The meta-information includes editorial metadata and the MusicBrainz ID.
Total tracks: 100
cante100 audio is only available upon request. To download the audio, request access at this link: https://zenodo.org/record/1324183. Then unzip the audio into the general cante100 dataset folder alongside the rest of the annotations and files.
Audio specifications:
- Sampling frequency: 44.1 kHz
- Bit-depth: 16 bit
- Audio format: .mp3
The cante100 dataset also provides spectrograms in CSV format. The spectrograms can be downloaded without requesting access, so in the first instance the cante100 loader uses the spectrograms of the tracks.
The available annotations are:
- F0 (predominant melody)
- Automatic transcription of notes (of singing voice)
CANTE100 LICENSE (COPIED FROM ZENODO PAGE)
The provided datasets are offered free of charge for internal non-commercial use.
We do not grant any rights for redistribution or modification. All data collections were gathered
by the COFLA team.
© COFLA 2015. All rights reserved.
For more details, please visit: http://www.cofla-project.com/?page_id=134
- class mirdata.datasets.cante100.Dataset(data_home=None)[source]¶ The cante100 dataset
Variables: - data_home (str) – path where mirdata will look for the dataset
- name (str) – the identifier of the dataset
- bibtex (str or None) – dataset citation/s in bibtex format
- remotes (dict or None) – data to be downloaded
- readme (str) – information about the dataset
- track (function) – a function which inputs a track_id (str) and returns a mirdata.core.Track or None
- choice_track()[source]¶ Choose a random track
Returns: Track – a Track object instantiated by a random track_id
- default_path¶ Get the default path for the dataset
Returns: str – Local path to the dataset
- download(partial_download=None, force_overwrite=False, cleanup=False)[source]¶ Download data to save_dir and optionally print a message.
Parameters: - partial_download (list or None) – A list of keys of remotes to partially download. If None, all data is downloaded
- force_overwrite (bool) – If True, existing files are overwritten by the downloaded files.
- cleanup (bool) – Whether to delete any zip/tar files after extracting.
Raises: - ValueError – if invalid keys are passed to partial_download
- IOError – if a downloaded file’s checksum is different from expected
- load_audio(*args, **kwargs)[source]¶ Load a cante100 audio file.
Parameters: audio_path (str) – path to audio file Returns: - np.ndarray - the mono audio signal
- float - The sample rate of the audio file
- load_melody(*args, **kwargs)[source]¶ Load cante100 f0 annotations
Parameters: f0_path (str) – path to f0 annotation file Returns: F0Data – predominant melody
- load_notes(*args, **kwargs)[source]¶ Load note data from the annotation files
Parameters: notes_path (str) – path to notes file Returns: NoteData – note annotations
- load_spectrogram(*args, **kwargs)[source]¶ Load a cante100 dataset spectrogram file.
Parameters: spectrogram_path (str) – path to spectrogram file Returns: np.ndarray – spectrogram
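A CSV spectrogram like the one load_spectrogram reads can be parsed with the csv module into a frames-by-bins matrix. A stdlib sketch with an inline two-frame example (the actual loader returns an np.ndarray, and the delimiter here is an assumption):

```python
import csv
import io

def read_spectrogram_csv(fileobj, delimiter=","):
    """Read one spectral frame per row, one frequency bin per column."""
    return [[float(value) for value in row]
            for row in csv.reader(fileobj, delimiter=delimiter)]

# Hypothetical two-frame, two-bin spectrogram for illustration.
frames = read_spectrogram_csv(io.StringIO("0.0,0.5\n1.0,0.25\n"))
```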
- class mirdata.datasets.cante100.Track(track_id, data_home)[source]¶ cante100 track class
Parameters: - track_id (str) – track id of the track
- data_home (str) – Local path where the dataset is stored. If None, looks for the data in the default directory, ~/mir_datasets/cante100
Other Parameters: - melody (F0Data) – annotated melody
- notes (NoteData) – annotated notes
- audio¶ The track’s audio
Returns: - np.ndarray - audio signal
- float - sample rate
- spectrogram¶ spectrogram of the track’s audio
Returns: np.ndarray – spectrogram
- mirdata.datasets.cante100.load_audio(audio_path)[source]¶ Load a cante100 audio file.
Parameters: audio_path (str) – path to audio file Returns: - np.ndarray - the mono audio signal
- float - The sample rate of the audio file
- mirdata.datasets.cante100.load_melody(f0_path)[source]¶ Load cante100 f0 annotations
Parameters: f0_path (str) – path to f0 annotation file Returns: F0Data – predominant melody
dali¶
DALI Dataset Loader
Dataset Info
DALI contains 5358 audio files with their time-aligned vocal melody. It also contains time-aligned lyrics at four levels of granularity: notes, words, lines, and paragraphs.
For each song, DALI also provides additional metadata: genre, language, musician, album covers, or links to video clips.
For more details, please visit: https://github.com/gabolsgabs/DALI
- class mirdata.datasets.dali.Dataset(data_home=None)[source]¶ The dali dataset
Variables: - data_home (str) – path where mirdata will look for the dataset
- name (str) – the identifier of the dataset
- bibtex (str or None) – dataset citation/s in bibtex format
- remotes (dict or None) – data to be downloaded
- readme (str) – information about the dataset
- track (function) – a function which inputs a track_id (str) and returns a mirdata.core.Track or None
- choice_track()[source]¶ Choose a random track
Returns: Track – a Track object instantiated by a random track_id
- default_path¶ Get the default path for the dataset
Returns: str – Local path to the dataset
- download(partial_download=None, force_overwrite=False, cleanup=False)[source]¶ Download data to save_dir and optionally print a message.
Parameters: - partial_download (list or None) – A list of keys of remotes to partially download. If None, all data is downloaded
- force_overwrite (bool) – If True, existing files are overwritten by the downloaded files.
- cleanup (bool) – Whether to delete any zip/tar files after extracting.
Raises: - ValueError – if invalid keys are passed to partial_download
- IOError – if a downloaded file’s checksum is different from expected
- load_annotations_class(*args, **kwargs)[source]¶ Load full annotations into the DALI class object
Parameters: annotations_path (str) – path to a DALI annotation file Returns: DALI.annotations – DALI annotations object
- load_annotations_granularity(*args, **kwargs)[source]¶ Load annotations at the specified level of granularity
Parameters: - annotations_path (str) – path to a DALI annotation file
- granularity (str) – one of ‘notes’, ‘words’, ‘lines’, ‘paragraphs’
Returns: NoteData for granularity=’notes’ or LyricData otherwise
- load_audio(*args, **kwargs)[source]¶ Load a DALI audio file.
Parameters: audio_path (str) – path to audio file Returns: - np.ndarray - the mono audio signal
- float - The sample rate of the audio file
- class mirdata.datasets.dali.Track(track_id, data_home)[source]¶ DALI melody Track class
Parameters: track_id (str) – track id of the track
Variables: - album (str) – the track’s album
- annotation_path (str) – path to the track’s annotation file
- artist (str) – the track’s artist
- audio_path (str) – path to the track’s audio file
- audio_url (str) – youtube ID
- dataset_version (int) – dataset annotation version
- ground_truth (bool) – True if the annotation is verified
- language (str) – sung language
- release_date (str) – year the track was released
- scores_manual (int) – manual score annotations
- scores_ncc (float) – ncc score annotations
- title (str) – the track’s title
- track_id (str) – the unique track id
- url_working (bool) – True if the youtube url was valid
Other Parameters: - notes (NoteData) – vocal notes
- words (LyricData) – word-level lyrics
- lines (LyricData) – line-level lyrics
- paragraphs (LyricData) – paragraph-level lyrics
- annotation-object (DALI.Annotations) – DALI annotation object
- audio¶ The track’s audio
Returns: - np.ndarray - audio signal
- float - sample rate
- mirdata.datasets.dali.load_annotations_class(annotations_path)[source]¶ Load full annotations into the DALI class object
Parameters: annotations_path (str) – path to a DALI annotation file Returns: DALI.annotations – DALI annotations object
- mirdata.datasets.dali.load_annotations_granularity(annotations_path, granularity)[source]¶ Load annotations at the specified level of granularity
Parameters: - annotations_path (str) – path to a DALI annotation file
- granularity (str) – one of ‘notes’, ‘words’, ‘lines’, ‘paragraphs’
Returns: NoteData for granularity=’notes’ or LyricData otherwise
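The granularity argument selects both which annotation level is read and which type is returned (NoteData for 'notes', LyricData otherwise). The dispatch can be sketched as follows, with simplified string stand-ins for the mirdata annotation classes and a plain dict standing in for the parsed annotation file:

```python
VALID_GRANULARITIES = ("notes", "words", "lines", "paragraphs")

def load_granularity(annotations, granularity):
    """Return ('NoteData', items) for notes, ('LyricData', items) otherwise."""
    if granularity not in VALID_GRANULARITIES:
        raise ValueError(f"granularity must be one of {VALID_GRANULARITIES}")
    kind = "NoteData" if granularity == "notes" else "LyricData"
    return kind, annotations.get(granularity, [])

# Hypothetical word-level annotations for illustration.
kind, items = load_granularity({"words": ["hello", "world"]}, "words")
```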
giantsteps_key¶
giantsteps_key Dataset Loader
Dataset Info
The GiantSteps+ EDM Key Dataset includes 600 two-minute sound excerpts from various EDM subgenres, annotated with single-key labels, comments and confidence levels by Daniel G. Camhi, and thoroughly revised and expanded by Ángel Faraldo at MTG UPF. Additionally, 500 tracks have been thoroughly analysed, containing pitch-class set descriptions, key changes, and additional modal changes. This dataset is a revision of the original GiantSteps Key Dataset, available on GitHub (<https://github.com/GiantSteps/giantsteps-key-dataset>) and initially described in:
Knees, P., Faraldo, Á., Herrera, P., Vogl, R., Böck, S., Hörschläger, F., Le Goff, M. (2015).
Two Datasets for Tempo Estimation and Key Detection in Electronic Dance Music Annotated from User Corrections.
In Proceedings of the 16th International Society for Music Information Retrieval Conference, 364–370. Málaga, Spain.
The original audio samples belong to online audio snippets from Beatport, an online music store for DJs and Electronic Dance Music producers (<http://www.beatport.com>). If this dataset is used in further research, we would appreciate a citation of the current DOI (10.5281/zenodo.1101082) and the following doctoral dissertation, where a detailed description of the properties of this dataset can be found:
Ángel Faraldo (2017). Tonality Estimation in Electronic Dance Music: A Computational and Musically Informed Examination.
PhD Thesis. Universitat Pompeu Fabra, Barcelona.
This dataset is mainly intended to assess the performance of computational key estimation algorithms in electronic dance music subgenres.
All the data of this dataset is licensed with Creative Commons Attribution Share Alike 4.0 International.
- class mirdata.datasets.giantsteps_key.Dataset(data_home=None)[source]¶ The giantsteps_key dataset
Variables: - data_home (str) – path where mirdata will look for the dataset
- name (str) – the identifier of the dataset
- bibtex (str or None) – dataset citation/s in bibtex format
- remotes (dict or None) – data to be downloaded
- readme (str) – information about the dataset
- track (function) – a function which inputs a track_id (str) and returns a mirdata.core.Track or None
-
choice_track
()[source]¶ Choose a random track
Returns: Track – a Track object instantiated by a random track_id
-
default_path
¶ Get the default path for the dataset
Returns: str – Local path to the dataset
-
download
(partial_download=None, force_overwrite=False, cleanup=False)[source]¶ Download data to save_dir and optionally print a message.
Parameters: - partial_download (list or None) – A list of keys of remotes to partially download. If None, all data is downloaded
- force_overwrite (bool) – If True, existing files are overwritten by the downloaded files.
- cleanup (bool) – Whether to delete any zip/tar files after extracting.
Raises: - ValueError – if invalid keys are passed to partial_download
- IOError – if a downloaded file’s checksum is different from expected
-
load_artist
(*args, **kwargs)[source]¶ Load giantsteps_key artist data from a file
Parameters: metadata_path (str) – path to metadata annotation file Returns: list – list of artists involved in the track.
-
load_audio
(*args, **kwargs)[source]¶ Load a giantsteps_key audio file.
Parameters: audio_path (str) – path to audio file Returns: - np.ndarray - the mono audio signal
- float - The sample rate of the audio file
-
load_genre
(*args, **kwargs)[source]¶ Load giantsteps_key genre data from a file
Parameters: metadata_path (str) – path to metadata annotation file Returns: dict – {‘genres’: […], ‘subgenres’: […]}
-
load_key
(*args, **kwargs)[source]¶ Load giantsteps_key format key data from a file
Parameters: keys_path (str) – path to key annotation file Returns: str – loaded key data
-
load_tempo
(*args, **kwargs)[source]¶ Load giantsteps_key tempo data from a file
Parameters: metadata_path (str) – path to metadata annotation file Returns: str – loaded tempo data
-
class
mirdata.datasets.giantsteps_key.
Track
(track_id, data_home)[source]¶ giantsteps_key track class
Parameters: track_id (str) – track id of the track
Variables: - audio_path (str) – track audio path
- keys_path (str) – key annotation path
- metadata_path (str) – sections annotation path
- title (str) – title of the track
- track_id (str) – track id
Other Parameters: - key (str) – musical key annotation
- artists (list) – list of artists involved
- genres (dict) – genres and subgenres
- tempo (int) – crowdsourced tempo annotations in beats per minute
-
audio
¶ The track’s audio
Returns: - np.ndarray - audio signal
- float - sample rate
-
mirdata.datasets.giantsteps_key.
load_artist
(metadata_path)[source]¶ Load giantsteps_key artist data from a file
Parameters: metadata_path (str) – path to metadata annotation file Returns: list – list of artists involved in the track.
-
mirdata.datasets.giantsteps_key.
load_audio
(audio_path)[source]¶ Load a giantsteps_key audio file.
Parameters: audio_path (str) – path to audio file Returns: - np.ndarray - the mono audio signal
- float - The sample rate of the audio file
-
mirdata.datasets.giantsteps_key.
load_genre
(metadata_path)[source]¶ Load giantsteps_key genre data from a file
Parameters: metadata_path (str) – path to metadata annotation file Returns: dict – {‘genres’: […], ‘subgenres’: […]}
giantsteps_tempo¶
giantsteps_tempo Dataset Loader
Dataset Info
GiantSteps tempo + genre is a collection of annotations for 664 two-minute audio previews from www.beatport.com, created by Richard Vogl <richard.vogl@tuwien.ac.at> and Peter Knees <peter.knees@tuwien.ac.at>.
references:
[giantsteps_tempo_cit_1] Peter Knees, Ángel Faraldo, Perfecto Herrera, Richard Vogl, Sebastian Böck, Florian Hörschläger, Mickael Le Goff: “Two data sets for tempo estimation and key detection in electronic dance music annotated from user corrections”, Proc. of the 16th Conference of the International Society for Music Information Retrieval (ISMIR’15), Oct. 2015, Málaga, Spain.
[giantsteps_tempo_cit_2] Hendrik Schreiber, Meinard Müller: “A Crowdsourced Experiment for Tempo Estimation of Electronic Dance Music”, Proc. of the 19th Conference of the International Society for Music Information Retrieval (ISMIR’18), Sept. 2018, Paris, France.
The audio files (664 files, ~1 GB) can be downloaded from http://www.beatport.com/ using the bash script:
https://github.com/GiantSteps/giantsteps-tempo-dataset/blob/master/audio_dl.sh
To download the files manually use links of the following form: http://geo-samples.beatport.com/lofi/<name of mp3 file> e.g.: http://geo-samples.beatport.com/lofi/5377710.LOFI.mp3
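The link pattern above can be wrapped in a tiny helper; `preview_url` is a hypothetical name used for illustration, not part of this loader:

```python
# Build a preview URL following the documented pattern:
# http://geo-samples.beatport.com/lofi/<name of mp3 file>
BASE = "http://geo-samples.beatport.com/lofi/"

def preview_url(mp3_name):
    # e.g. "5377710.LOFI.mp3" -> full preview URL
    return BASE + mp3_name
```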
To convert the audio files to .wav use the script found at https://github.com/GiantSteps/giantsteps-tempo-dataset/blob/master/convert_audio.sh and run:
./convert_audio.sh
To retrieve the genre information, the JSON contained within the website was parsed. The tempo annotation was extracted from forum entries of people correcting the bpm values (i.e. manual annotation of tempo). For more information please refer to the publication [giantsteps_tempo_cit_1].
[giantsteps_tempo_cit_2] found some files without tempo. These are:
3041381.LOFI.mp3
3041383.LOFI.mp3
1327052.LOFI.mp3
Their v2 tempo is denoted as 0.0 in the tempo and mirex fields, and they have no annotation in the JAMS format.
Most of the audio files are 120 seconds long. The exceptions are:

name              length (sec)
906760.LOFI.mp3   62
1327052.LOFI.mp3  70
4416506.LOFI.mp3  80
1855660.LOFI.mp3  119
3419452.LOFI.mp3  119
3577631.LOFI.mp3  119
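The duration list above can be expressed as a simple lookup that defaults to 120 seconds; `expected_duration` is an illustrative helper, not part of mirdata:

```python
# Expected clip lengths: files default to 120 seconds, with the six
# documented exceptions listed above.
EXCEPTIONAL_DURATIONS = {
    "906760.LOFI.mp3": 62,
    "1327052.LOFI.mp3": 70,
    "4416506.LOFI.mp3": 80,
    "1855660.LOFI.mp3": 119,
    "3419452.LOFI.mp3": 119,
    "3577631.LOFI.mp3": 119,
}

def expected_duration(filename):
    return EXCEPTIONAL_DURATIONS.get(filename, 120)
```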
-
class
mirdata.datasets.giantsteps_tempo.
Dataset
(data_home=None)[source]¶ The giantsteps_tempo dataset
Variables: - data_home (str) – path where mirdata will look for the dataset
- name (str) – the identifier of the dataset
- bibtex (str or None) – dataset citation/s in bibtex format
- remotes (dict or None) – data to be downloaded
- readme (str) – information about the dataset
- track (function) – a function which takes a track_id (str) and returns a mirdata.core.Track or None
-
choice_track
()[source]¶ Choose a random track
Returns: Track – a Track object instantiated by a random track_id
-
default_path
¶ Get the default path for the dataset
Returns: str – Local path to the dataset
-
download
(partial_download=None, force_overwrite=False, cleanup=False)[source]¶ Download data to save_dir and optionally print a message.
Parameters: - partial_download (list or None) – A list of keys of remotes to partially download. If None, all data is downloaded
- force_overwrite (bool) – If True, existing files are overwritten by the downloaded files.
- cleanup (bool) – Whether to delete any zip/tar files after extracting.
Raises: - ValueError – if invalid keys are passed to partial_download
- IOError – if a downloaded file’s checksum is different from expected
-
load_audio
(*args, **kwargs)[source]¶ Load a giantsteps_tempo audio file.
Parameters: audio_path (str) – path to audio file Returns: - np.ndarray - the mono audio signal
- float - The sample rate of the audio file
-
load_genre
(*args, **kwargs)[source]¶ Load genre data from a file
Parameters: path (str) – path to metadata annotation file Returns: str – loaded genre data
-
load_tempo
(*args, **kwargs)[source]¶ Load giantsteps_tempo tempo data from a file ordered by confidence
Parameters: tempo_path (str) – path to tempo annotation file Returns: list – list of annotations.TempoData
-
class
mirdata.datasets.giantsteps_tempo.
Track
(track_id, data_home)[source]¶ giantsteps_tempo track class
Parameters: track_id (str) – track id of the track
Variables: - audio_path (str) – track audio path
- title (str) – title of the track
- track_id (str) – track id
- annotation_v1_path (str) – track annotation v1 path
- annotation_v2_path (str) – track annotation v2 path
Other Parameters: - genre (dict) – Human-labeled metadata annotation
- tempo (list) – List of annotations.TempoData, ordered by confidence
- tempo_v2 (list) – List of annotations.TempoData for version 2, ordered by confidence
-
audio
¶ The track’s audio
Returns: - np.ndarray - audio signal
- float - sample rate
-
mirdata.datasets.giantsteps_tempo.
load_audio
(audio_path)[source]¶ Load a giantsteps_tempo audio file.
Parameters: audio_path (str) – path to audio file Returns: - np.ndarray - the mono audio signal
- float - The sample rate of the audio file
groove_midi¶
Groove MIDI Loader
Dataset Info
The Groove MIDI Dataset (GMD) is composed of 13.6 hours of aligned MIDI and synthesized audio of human-performed, tempo-aligned expressive drumming. The dataset contains 1,150 MIDI files and over 22,000 measures of drumming.
To enable a wide range of experiments and encourage comparisons between methods on the same data, Gillick et al. created a new dataset of drum performances recorded in MIDI format. They hired professional drummers and asked them to perform in multiple styles to a click track on a Roland TD-11 electronic drum kit. They also recorded the aligned, high-quality synthesized audio from the TD-11 and include it in the release.
The Groove MIDI Dataset (GMD) has several attributes that distinguish it from existing datasets:
- The dataset contains about 13.6 hours, 1,150 MIDI files, and over 22,000 measures of drumming.
- Each performance was played along to a metronome set at a tempo chosen by the drummer.
- The data includes performances by a total of 10 drummers, with more than 80% of duration coming from hired professionals. The professionals were able to improvise in a wide range of styles, resulting in a diverse dataset.
- The drummers were instructed to play a mix of long sequences (several minutes of continuous playing) and short beats and fills.
- Each performance is annotated with a genre (provided by the drummer), tempo, and anonymized drummer ID.
- Most of the performances are in 4/4 time, with a few examples from other time signatures.
- Four drummers were asked to record the same set of 10 beats in their own style. These are included in the test set split, labeled eval-session/groove1-10.
- In addition to the MIDI recordings that are the primary source of data for the experiments in this work, the authors captured the synthesized audio outputs of the drum set and aligned them to within 2ms of the corresponding MIDI files.
A train/validation/test split configuration is provided for easier comparison of model accuracy on various tasks.
The dataset is made available by Google LLC under a Creative Commons Attribution 4.0 International (CC BY 4.0) License.
For more details, please visit: http://magenta.tensorflow.org/datasets/groove
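Track ids in this dataset follow the three-part form 'drummer1/eval_session/1' (drummer, session, performance id), as shown in the Track attributes. A minimal sketch of splitting such an id; `parse_track_id` is a hypothetical helper, not part of the loader:

```python
# Parse a Groove MIDI track_id of the documented form
# 'drummer1/eval_session/1' into its drummer, session and id parts.
def parse_track_id(track_id):
    drummer, session, number = track_id.split("/")
    return {"drummer": drummer, "session": session, "id": number}
```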
-
class
mirdata.datasets.groove_midi.
Dataset
(data_home=None)[source]¶ The groove_midi dataset
Variables: - data_home (str) – path where mirdata will look for the dataset
- name (str) – the identifier of the dataset
- bibtex (str or None) – dataset citation/s in bibtex format
- remotes (dict or None) – data to be downloaded
- readme (str) – information about the dataset
- track (function) – a function which takes a track_id (str) and returns a mirdata.core.Track or None
-
choice_track
()[source]¶ Choose a random track
Returns: Track – a Track object instantiated by a random track_id
-
default_path
¶ Get the default path for the dataset
Returns: str – Local path to the dataset
-
download
(partial_download=None, force_overwrite=False, cleanup=False)[source]¶ Download the dataset
Parameters: - partial_download (list or None) – A list of keys of remotes to partially download. If None, all data is downloaded
- force_overwrite (bool) – If True, existing files are overwritten by the downloaded files.
- cleanup (bool) – Whether to delete any zip/tar files after extracting.
Raises: - ValueError – if invalid keys are passed to partial_download
- IOError – if a downloaded file’s checksum is different from expected
-
load_audio
(*args, **kwargs)[source]¶ Load a Groove MIDI audio file.
Parameters: audio_path (str) – path to audio file Returns: - np.ndarray - the mono audio signal
- float - The sample rate of the audio file
-
load_beats
(*args, **kwargs)[source]¶ Load beat data from the midi file.
Parameters: - midi_path (str) – path to midi file
- midi (pretty_midi.PrettyMIDI) – pre-loaded midi object or None if None, the midi object is loaded using midi_path
Returns: annotations.BeatData – machine generated beat data
-
load_drum_events
(*args, **kwargs)[source]¶ Load drum events from the midi file.
Parameters: - midi_path (str) – path to midi file
- midi (pretty_midi.PrettyMIDI) – pre-loaded midi object or None if None, the midi object is loaded using midi_path
Returns: annotations.EventData – drum event data
-
load_midi
(*args, **kwargs)[source]¶ Load a Groove MIDI midi file.
Parameters: midi_path (str) – path to midi file Returns: midi_data (pretty_midi.PrettyMIDI) – pretty_midi object
-
class
mirdata.datasets.groove_midi.
Track
(track_id, data_home)[source]¶ Groove MIDI Track class
Parameters: track_id (str) – track id of the track
Variables: - drummer (str) – Drummer id of the track (ex. ‘drummer1’)
- session (str) – Type of session (ex. ‘session1’, ‘eval_session’)
- track_id (str) – track id of the track (ex. ‘drummer1/eval_session/1’)
- style (str) – Style (genre, groove type) of the track (ex. ‘funk/groove1’)
- tempo (int) – track tempo in beats per minute (ex. 138)
- beat_type (str) – Whether the track is a beat or a fill (ex. ‘beat’)
- time_signature (str) – Time signature of the track (ex. ‘4-4’, ‘6-8’)
- midi_path (str) – Path to the midi file
- audio_path (str) – Path to the audio file
- duration (float) – Duration of the midi file in seconds
- split (str) – Whether the track is for a train/valid/test set. One of ‘train’, ‘valid’ or ‘test’.
Other Parameters: - beats (BeatData) – Machine-generated beat annotations
- drum_events (EventData) – Annotated drum kit events
- midi (pretty_midi.PrettyMIDI) – object containing MIDI information
-
audio
¶ The track’s audio
Returns: - np.ndarray - audio signal
- float - sample rate
-
mirdata.datasets.groove_midi.
load_audio
(audio_path)[source]¶ Load a Groove MIDI audio file.
Parameters: audio_path (str) – path to audio file Returns: - np.ndarray - the mono audio signal
- float - The sample rate of the audio file
-
mirdata.datasets.groove_midi.
load_beats
(midi_path, midi=None)[source]¶ Load beat data from the midi file.
Parameters: - midi_path (str) – path to midi file
- midi (pretty_midi.PrettyMIDI) – pre-loaded midi object or None if None, the midi object is loaded using midi_path
Returns: annotations.BeatData – machine generated beat data
-
mirdata.datasets.groove_midi.
load_drum_events
(midi_path, midi=None)[source]¶ Load drum events from the midi file.
Parameters: - midi_path (str) – path to midi file
- midi (pretty_midi.PrettyMIDI) – pre-loaded midi object or None if None, the midi object is loaded using midi_path
Returns: annotations.EventData – drum event data
gtzan_genre¶
GTZAN-Genre Dataset Loader
Dataset Info
This dataset was used for the well-known genre classification paper:
"Musical genre classification of audio signals" by G. Tzanetakis and
P. Cook, IEEE Transactions on Speech and Audio Processing, 2002.
The dataset consists of 1000 audio tracks each 30 seconds long. It contains 10 genres, each represented by 100 tracks. The tracks are all 22050 Hz mono 16-bit audio files in .wav format.
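The 10 × 100 structure above implies a fixed file layout. The sketch below enumerates it; the genre names and the 'genre.000NN.wav' filename pattern are assumptions from the published dataset, not stated in this loader's docs:

```python
# Enumerate the expected GTZAN layout: 10 genres x 100 tracks = 1000 files.
# Genre names and filename pattern are assumptions, not from this page.
GENRES = ["blues", "classical", "country", "disco", "hiphop",
          "jazz", "metal", "pop", "reggae", "rock"]

def expected_files():
    return ["{0}/{0}.{1:05d}.wav".format(g, i)
            for g in GENRES for i in range(100)]
```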
-
class
mirdata.datasets.gtzan_genre.
Dataset
(data_home=None)[source]¶ The gtzan_genre dataset
Variables: - data_home (str) – path where mirdata will look for the dataset
- name (str) – the identifier of the dataset
- bibtex (str or None) – dataset citation/s in bibtex format
- remotes (dict or None) – data to be downloaded
- readme (str) – information about the dataset
- track (function) – a function which takes a track_id (str) and returns a mirdata.core.Track or None
-
choice_track
()[source]¶ Choose a random track
Returns: Track – a Track object instantiated by a random track_id
-
default_path
¶ Get the default path for the dataset
Returns: str – Local path to the dataset
-
download
(partial_download=None, force_overwrite=False, cleanup=False)[source]¶ Download data to save_dir and optionally print a message.
Parameters: - partial_download (list or None) – A list of keys of remotes to partially download. If None, all data is downloaded
- force_overwrite (bool) – If True, existing files are overwritten by the downloaded files.
- cleanup (bool) – Whether to delete any zip/tar files after extracting.
Raises: - ValueError – if invalid keys are passed to partial_download
- IOError – if a downloaded file’s checksum is different from expected
-
load_audio
(*args, **kwargs)[source]¶ Load a GTZAN audio file.
Parameters: audio_path (str) – path to audio file Returns: - np.ndarray - the mono audio signal
- float - The sample rate of the audio file
-
class
mirdata.datasets.gtzan_genre.
Track
(track_id, data_home)[source]¶ gtzan_genre Track class
Parameters: track_id (str) – track id of the track
Variables: - audio_path (str) – path to the audio file
- genre (str) – annotated genre
- track_id (str) – track id
-
audio
¶ The track’s audio
Returns: - np.ndarray - audio signal
- float - sample rate
guitarset¶
GuitarSet Loader
Dataset Info
GuitarSet provides audio recordings of a variety of musical excerpts played on an acoustic guitar, along with time-aligned annotations including pitch contours, string and fret positions, chords, beats, downbeats, and keys.
GuitarSet contains 360 excerpts that are close to 30 seconds in length. The 360 excerpts are the result of the following combinations:
- 6 players
- 2 versions: comping (harmonic accompaniment) and soloing (melodic improvisation)
- 5 styles: Rock, Singer-Songwriter, Bossa Nova, Jazz, and Funk
- 3 Progressions: 12 Bar Blues, Autumn Leaves, and Pachelbel Canon.
- 2 Tempi: slow and fast.
The tonality (key) of each excerpt is sampled uniformly at random.
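The 360-excerpt count follows directly from the combinations listed above; a quick sketch verifying it (player ids '00'-'05' and the 'comp'/'solo' modes appear in the Track attributes; the exact label strings are otherwise illustrative):

```python
# Verify the excerpt count: 6 players x 2 versions x 5 styles x
# 3 progressions x 2 tempi = 360 combinations.
from itertools import product

players = ["{:02d}".format(i) for i in range(6)]
versions = ["comp", "solo"]
styles = ["Rock", "Singer-Songwriter", "Bossa Nova", "Jazz", "Funk"]
progressions = ["12 Bar Blues", "Autumn Leaves", "Pachelbel Canon"]
tempi = ["slow", "fast"]

combinations = list(product(players, versions, styles, progressions, tempi))
assert len(combinations) == 360
```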
GuitarSet was recorded with the help of a hexaphonic pickup, which outputs signals for each string separately, allowing automated note-level annotation. Excerpts are recorded with both the hexaphonic pickup and a Neumann U-87 condenser microphone as reference. Three audio recordings are provided for each excerpt, with the following suffixes:
- hex: original 6 channel wave file from hexaphonic pickup
- hex_cln: hex wave files with interference removal applied
- mic: monophonic recording from reference microphone
- mix: monophonic mixture of original 6 channel file
Each of the 360 excerpts has an accompanying JAMS file which stores 16 annotations.
Pitch:
- 6 pitch_contour annotations (1 per string)
- 6 midi_note annotations (1 per string)
Beat and Tempo:
- 1 beat_position annotation
- 1 tempo annotation
Chords:
- 2 chord annotations: instructed and performed. The instructed chord annotation is a digital version of the lead sheet that’s provided to the player, and the performed chord annotations are inferred from note annotations, using segmentation and root from the digital lead sheet annotation.
For more details, please visit: http://github.com/marl/guitarset/
-
class
mirdata.datasets.guitarset.
Dataset
(data_home=None)[source]¶ The guitarset dataset
Variables: - data_home (str) – path where mirdata will look for the dataset
- name (str) – the identifier of the dataset
- bibtex (str or None) – dataset citation/s in bibtex format
- remotes (dict or None) – data to be downloaded
- readme (str) – information about the dataset
- track (function) – a function which takes a track_id (str) and returns a mirdata.core.Track or None
-
choice_track
()[source]¶ Choose a random track
Returns: Track – a Track object instantiated by a random track_id
-
default_path
¶ Get the default path for the dataset
Returns: str – Local path to the dataset
-
download
(partial_download=None, force_overwrite=False, cleanup=False)[source]¶ Download data to save_dir and optionally print a message.
Parameters: - partial_download (list or None) – A list of keys of remotes to partially download. If None, all data is downloaded
- force_overwrite (bool) – If True, existing files are overwritten by the downloaded files.
- cleanup (bool) – Whether to delete any zip/tar files after extracting.
Raises: - ValueError – if invalid keys are passed to partial_download
- IOError – if a downloaded file’s checksum is different from expected
-
load_audio
(*args, **kwargs)[source]¶ Load a Guitarset audio file.
Parameters: audio_path (str) – path to audio file Returns: - np.ndarray - the mono audio signal
- float - The sample rate of the audio file
-
load_beats
(*args, **kwargs)[source]¶ Load a Guitarset beats annotation.
Parameters: jams_path (str) – Path of the jams annotation file Returns: BeatData – Beat data
-
load_chords
(*args, **kwargs)[source]¶ Load a guitarset chord annotation.
Parameters: - jams_path (str) – Path of the jams annotation file
- leadsheet_version (bool) – Whether to load the leadsheet version of the chord annotation. If False, load the inferred version.
Returns: ChordData – Chord data
-
load_key_mode
(*args, **kwargs)[source]¶ Load a Guitarset key-mode annotation.
Parameters: jams_path (str) – Path of the jams annotation file Returns: KeyData – Key data
-
load_multitrack_audio
(*args, **kwargs)[source]¶ Load a Guitarset multitrack audio file.
Parameters: audio_path (str) – path to audio file Returns: - np.ndarray - the mono audio signal
- float - The sample rate of the audio file
-
load_notes
(*args, **kwargs)[source]¶ Load a guitarset note annotation for a given string
Parameters: - jams_path (str) – Path of the jams annotation file
- string_num (int), in range(6) – Which string to load. 0 is the Low E string, 5 is the high e string.
Returns: NoteData – Note data for the given string
-
load_pitch_contour
(*args, **kwargs)[source]¶ Load a guitarset pitch contour annotation for a given string
Parameters: - jams_path (str) – Path of the jams annotation file
- string_num (int), in range(6) – Which string to load. 0 is the Low E string, 5 is the high e string.
Returns: F0Data – Pitch contour data for the given string
-
class
mirdata.datasets.guitarset.
Track
(track_id, data_home)[source]¶ guitarset Track class
Parameters: track_id (str) – track id of the track
Variables: - audio_hex_cln_path (str) – path to the hex wave file after bleed removal
- audio_hex_path (str) – path to the original hex wave file
- audio_mic_path (str) – path to the mono wave via microphone
- audio_mix_path (str) – path to the mono wave via downmixing hex pickup
- jams_path (str) – path to the jams file
- mode (str) – one of [‘solo’, ‘comp’]. For each excerpt, players are asked to first play in ‘comp’ mode and later play a ‘solo’ version on top of the already recorded comp.
- player_id (str) – ID of the different players. one of [‘00’, ‘01’, … , ‘05’]
- style (str) – one of [‘Jazz’, ‘Bossa Nova’, ‘Rock’, ‘Singer-Songwriter’, ‘Funk’]
- tempo (float) – BPM of the track
- track_id (str) – track id
Other Parameters: - beats (BeatData) – beat positions
- leadsheet_chords (ChordData) – chords as written in the leadsheet
- inferred_chords (ChordData) – chords inferred from played transcription
- key_mode (KeyData) – key and mode
- pitch_contours (dict) – Pitch contours per string - ‘E’: F0Data(…) - ‘A’: F0Data(…) - ‘D’: F0Data(…) - ‘G’: F0Data(…) - ‘B’: F0Data(…) - ‘e’: F0Data(…)
- notes (dict) – Notes per string - ‘E’: NoteData(…) - ‘A’: NoteData(…) - ‘D’: NoteData(…) - ‘G’: NoteData(…) - ‘B’: NoteData(…) - ‘e’: NoteData(…)
-
audio_hex
¶ Hexaphonic audio (6-channels) with one channel per string
Returns: - np.ndarray - audio signal
- float - sample rate
-
audio_hex_cln
¶ Hexaphonic audio (6-channels) with one channel per string, after bleed removal
Returns: - np.ndarray - audio signal
- float - sample rate
-
audio_mic
¶ The track’s audio
Returns: - np.ndarray - audio signal
- float - sample rate
-
audio_mix
¶ Mixture audio (mono)
Returns: - np.ndarray - audio signal
- float - sample rate
-
mirdata.datasets.guitarset.
load_audio
(audio_path)[source]¶ Load a Guitarset audio file.
Parameters: audio_path (str) – path to audio file Returns: - np.ndarray - the mono audio signal
- float - The sample rate of the audio file
-
mirdata.datasets.guitarset.
load_beats
(jams_path)[source]¶ Load a Guitarset beats annotation.
Parameters: jams_path (str) – Path of the jams annotation file Returns: BeatData – Beat data
-
mirdata.datasets.guitarset.
load_chords
(jams_path, leadsheet_version=True)[source]¶ Load a guitarset chord annotation.
Parameters: - jams_path (str) – Path of the jams annotation file
- leadsheet_version (bool) – Whether to load the leadsheet version of the chord annotation. If False, load the inferred version.
Returns: ChordData – Chord data
-
mirdata.datasets.guitarset.
load_key_mode
(jams_path)[source]¶ Load a Guitarset key-mode annotation.
Parameters: jams_path (str) – Path of the jams annotation file Returns: KeyData – Key data
-
mirdata.datasets.guitarset.
load_multitrack_audio
(audio_path)[source]¶ Load a Guitarset multitrack audio file.
Parameters: audio_path (str) – path to audio file Returns: - np.ndarray - the mono audio signal
- float - The sample rate of the audio file
-
mirdata.datasets.guitarset.
load_notes
(jams_path, string_num)[source]¶ Load a guitarset note annotation for a given string
Parameters: - jams_path (str) – Path of the jams annotation file
- string_num (int), in range(6) – Which string to load. 0 is the Low E string, 5 is the high e string.
Returns: NoteData – Note data for the given string
-
mirdata.datasets.guitarset.
load_pitch_contour
(jams_path, string_num)[source]¶ Load a guitarset pitch contour annotation for a given string
Parameters: - jams_path (str) – Path of the jams annotation file
- string_num (int), in range(6) – Which string to load. 0 is the Low E string, 5 is the high e string.
Returns: F0Data – Pitch contour data for the given string
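The string_num argument accepted by load_notes and load_pitch_contour indexes strings from the low E (0) to the high e (5), matching the keys of Track.notes and Track.pitch_contours. A small sketch of that mapping; `string_name` is a hypothetical helper, not part of mirdata:

```python
# Map a string_num (0-5) to the string-name key used in
# Track.notes / Track.pitch_contours, low E through high e.
STRING_NAMES = ["E", "A", "D", "G", "B", "e"]

def string_name(string_num):
    if string_num not in range(6):
        raise ValueError("string_num must be in range(6)")
    return STRING_NAMES[string_num]
```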
ikala¶
iKala Dataset Loader
Dataset Info
The iKala dataset comprises 252 30-second excerpts sampled from 206 iKala songs (plus 100 hidden excerpts reserved for MIREX). The music accompaniment and the singing voice are recorded in the left and right channels respectively and can be found under the Wavfile directory. In addition, the human-labeled pitch contours and timestamped lyrics can be found under PitchLabel and Lyrics respectively.
For more details, please visit: http://mac.citi.sinica.edu.tw/ikala/
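Because the accompaniment and vocals sit in the left and right channels respectively, a stereo recording can be split into the two sources by channel. A minimal sketch, modeling frames as (left, right) sample pairs for illustration; `split_ikala_channels` is not part of the loader:

```python
# Split iKala-style stereo frames into the two documented sources:
# left channel = accompaniment, right channel = singing voice.
def split_ikala_channels(stereo_frames):
    instrumental = [left for left, right in stereo_frames]
    vocals = [right for left, right in stereo_frames]
    return instrumental, vocals
```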
-
class
mirdata.datasets.ikala.
Dataset
(data_home=None)[source]¶ The ikala dataset
Variables: - data_home (str) – path where mirdata will look for the dataset
- name (str) – the identifier of the dataset
- bibtex (str or None) – dataset citation/s in bibtex format
- remotes (dict or None) – data to be downloaded
- readme (str) – information about the dataset
- track (function) – a function which takes a track_id (str) and returns a mirdata.core.Track or None
-
choice_track
()[source]¶ Choose a random track
Returns: Track – a Track object instantiated by a random track_id
-
default_path
¶ Get the default path for the dataset
Returns: str – Local path to the dataset
-
download
(partial_download=None, force_overwrite=False, cleanup=False)[source]¶ Download data to save_dir and optionally print a message.
Parameters: - partial_download (list or None) – A list of keys of remotes to partially download. If None, all data is downloaded
- force_overwrite (bool) – If True, existing files are overwritten by the downloaded files.
- cleanup (bool) – Whether to delete any zip/tar files after extracting.
Raises: ValueError
– if invalid keys are passed to partial_downloadIOError
– if a downloaded file’s checksum is different from expected
-
load_f0
(*args, **kwargs)[source]¶ Load an ikala f0 annotation
Parameters: f0_path (str) – path to f0 annotation file
Raises: IOError – if f0_path does not exist
Returns: F0Data – the f0 annotation data
-
load_instrumental_audio
(*args, **kwargs)[source]¶ Load ikala instrumental audio
Parameters: audio_path (str) – path to audio file Returns: - np.ndarray - the mono audio signal
- float - The sample rate of the audio file
-
load_lyrics
(*args, **kwargs)[source]¶ Load an ikala lyrics annotation
Parameters: lyrics_path (str) – path to lyric annotation file
Raises: IOError – if lyrics_path does not exist
Returns: LyricData – lyric annotation data
-
load_mix_audio
(*args, **kwargs)[source]¶ Load an ikala mix.
Parameters: audio_path (str) – path to audio file Returns: - np.ndarray - the mono audio signal
- float - The sample rate of the audio file
-
load_tracks
()[source]¶ Load all tracks in the dataset
Returns: dict – {track_id: track data}
Raises: NotImplementedError – if the dataset does not support Track objects
-
class
mirdata.datasets.ikala.
Track
(track_id, data_home)[source]¶ ikala Track class
Parameters: track_id (str) – track id of the track
Variables: - audio_path (str) – path to the track’s audio file
- f0_path (str) – path to the track’s f0 annotation file
- lyrics_path (str) – path to the track’s lyric annotation file
- section (str) – section. Either ‘verse’ or ‘chorus’
- singer_id (str) – singer id
- song_id (str) – song id of the track
- track_id (str) – track id
Other Parameters: - f0 (F0Data) – human-annotated singing voice pitch
- lyrics (LyricsData) – human-annotated lyrics
-
instrumental_audio
¶ instrumental audio (mono)
Returns: - np.ndarray - audio signal
- float - sample rate
-
mix_audio
¶ mixture audio (mono)
Returns: - np.ndarray - audio signal
- float - sample rate
-
to_jams
()[source]¶ Get the track’s data in jams format
Returns: jams.JAMS – the track’s data in jams format
-
vocal_audio
¶ solo vocal audio (mono)
Returns: - np.ndarray - audio signal
- float - sample rate
-
mirdata.datasets.ikala.
load_f0
(f0_path)[source]¶ Load an ikala f0 annotation
Parameters: f0_path (str) – path to f0 annotation file
Raises: IOError – if f0_path does not exist
Returns: F0Data – the f0 annotation data
-
mirdata.datasets.ikala.
load_instrumental_audio
(audio_path)[source]¶ Load ikala instrumental audio
Parameters: audio_path (str) – path to audio file Returns: - np.ndarray - the mono audio signal
- float - The sample rate of the audio file
-
mirdata.datasets.ikala.
load_lyrics
(lyrics_path)[source]¶ Load an ikala lyrics annotation
Parameters: lyrics_path (str) – path to lyric annotation file
Raises: IOError – if lyrics_path does not exist
Returns: LyricData – lyric annotation data
irmas¶
IRMAS Loader
Dataset Info
IRMAS: a dataset for instrument recognition in musical audio signals
This dataset includes musical audio excerpts with annotations of the predominant instrument(s) present. It was used for the evaluation in the following article:
Bosch, J. J., Janer, J., Fuhrmann, F., & Herrera, P. “A Comparison of Sound Segregation Techniques for
Predominant Instrument Recognition in Musical Audio Signals”, in Proc. ISMIR (pp. 559-564), 2012.
IRMAS is intended to be used for training and testing methods for the automatic recognition of predominant instruments in musical audio. The instruments considered are: cello, clarinet, flute, acoustic guitar, electric guitar, organ, piano, saxophone, trumpet, violin, and human singing voice. This dataset is derived from the one compiled by Ferdinand Fuhrmann in his PhD thesis, with the differences that we provide audio data in stereo format, the annotations in the testing dataset are limited to specific pitched instruments, and the number and length of excerpts differ from the original dataset.
The dataset is split into training and test data.
Training data
Total audio samples: 6705. They are 3-second excerpts from more than 2000 distinct recordings.
Audio specifications
- Sampling frequency: 44.1 kHz
- Bit-depth: 16 bit
- Audio format: .wav
IRMAS Dataset training samples are annotated by encoding the information for each track in its filename.
Predominant instrument:
- The annotation of the predominant instrument of each excerpt is both in the name of the containing folder, and in the file name: cello (cel), clarinet (cla), flute (flu), acoustic guitar (gac), electric guitar (gel), organ (org), piano (pia), saxophone (sax), trumpet (tru), violin (vio), and human singing voice (voi).
- The number of files per instrument are: cel(388), cla(505), flu(451), gac(637), gel(760), org(682), pia(721), sax(626), tru(577), vio(580), voi(778).
Drum presence
- Additionally, some of the files have an annotation in the filename regarding the presence ([dru]) or absence ([nod]) of drums.
The annotation of the musical genre:
- country-folk ([cou_fol])
- classical ([cla])
- pop-rock ([pop_roc])
- latin-soul ([lat_sou])
- jazz-blues ([jaz_blu])
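The bracketed filename tokens described above can be unpacked positionally: the first token is always the instrument, an optional middle token marks drum presence, and the remaining token is the genre. A minimal sketch, using a hypothetical filename that follows this convention (this is not part of the mirdata API):

```python
import re

DRUM_TAGS = {"dru", "nod"}

def parse_irmas_train_filename(filename):
    """Split an IRMAS training filename into its bracketed annotations.

    The first bracketed token is the predominant instrument, an optional
    middle token marks drum presence ([dru]) or absence ([nod]), and the
    remaining token (when present) is the genre code.
    """
    tokens = re.findall(r"\[([^\]]+)\]", filename)
    info = {"instrument": tokens[0], "drums": None, "genre": None}
    for token in tokens[1:]:
        if token in DRUM_TAGS:
            info["drums"] = token == "dru"
        else:
            info["genre"] = token
    return info

# Hypothetical filename following the documented convention:
info = parse_irmas_train_filename("[gel][dru][pop_roc]0456__1.wav")
# info == {'instrument': 'gel', 'drums': True, 'genre': 'pop_roc'}

# As a sanity check, the per-instrument counts listed above sum to the
# stated 6705 training samples:
counts = {"cel": 388, "cla": 505, "flu": 451, "gac": 637, "gel": 760,
          "org": 682, "pia": 721, "sax": 626, "tru": 577, "vio": 580,
          "voi": 778}
assert sum(counts.values()) == 6705
```

Note that parsing must be positional rather than by lookup, because the code [cla] is used both for clarinet (instrument) and classical (genre).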
Testing data
Total audio samples: 2874
Audio specifications
- Sampling frequency: 44.1 kHz
- Bit-depth: 16 bit
- Audio format: .wav
IRMAS Dataset testing samples are annotated on the following basis:
Predominant instrument:
The annotations for an excerpt named “excerptName.wav” are given in “excerptName.txt”. More than one instrument may be annotated in each excerpt, one label per line. This part of the dataset contains excerpts from a diversity of western musical genres, with varied instrumentations, and it is derived from the original testing dataset from Fuhrmann (http://www.dtic.upf.edu/~ffuhrmann/PhD/). Instrument nomenclature is the same as in the training dataset.
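Since each annotation file simply lists one instrument code per line, reading it reduces to splitting and stripping lines. A small illustrative sketch with hypothetical file contents (not the mirdata API):

```python
def read_irmas_test_labels(annotation_text):
    """Parse the contents of an IRMAS test annotation (.txt) file.

    Each line carries one instrument label; an excerpt may list several.
    """
    return [line.strip() for line in annotation_text.splitlines() if line.strip()]

# Hypothetical contents of "excerptName.txt" annotating two instruments:
labels = read_irmas_test_labels("sax\npia\n")
# labels == ['sax', 'pia']
```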
Dataset compiled by Juan J. Bosch, Ferdinand Fuhrmann, Perfecto Herrera, Music Technology Group - Universitat Pompeu Fabra (Barcelona).
The IRMAS dataset is offered free of charge for non-commercial use only. You may not redistribute or modify it. This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.
For more details, please visit: https://www.upf.edu/web/mtg/irmas
-
class
mirdata.datasets.irmas.
Dataset
(data_home=None)[source]¶ The irmas dataset
Variables: - data_home (str) – path where mirdata will look for the dataset
- name (str) – the identifier of the dataset
- bibtex (str or None) – dataset citation/s in bibtex format
- remotes (dict or None) – data to be downloaded
- readme (str) – information about the dataset
- track (function) – a function which inputs a track_id (str) and
- returns (mirdata.core.Track or None) –
-
choice_track
()[source]¶ Choose a random track
Returns: Track – a Track object instantiated by a random track_id
-
default_path
¶ Get the default path for the dataset
Returns: str – Local path to the dataset
-
download
(partial_download=None, force_overwrite=False, cleanup=False)[source]¶ Download data to save_dir and optionally print a message.
Parameters: - partial_download (list or None) – A list of keys of remotes to partially download. If None, all data is downloaded
- force_overwrite (bool) – If True, existing files are overwritten by the downloaded files.
- cleanup (bool) – Whether to delete any zip/tar files after extracting.
Raises: ValueError
– if invalid keys are passed to partial_downloadIOError
– if a downloaded file’s checksum is different from expected
-
load_audio
(*args, **kwargs)[source]¶ Load an IRMAS dataset audio file.
Parameters: audio_path (str) – path to audio file Returns: - np.ndarray - the mono audio signal
- float - The sample rate of the audio file
-
load_pred_inst
(*args, **kwargs)[source]¶ Load predominant instrument of track
Parameters: annotation_path (str) – Local path where the test annotations are stored. Returns: str – test track predominant instrument(s) annotations
-
class
mirdata.datasets.irmas.
Track
(track_id, data_home)[source]¶ IRMAS track class
Parameters: - track_id (str) – track id of the track
- data_home (str) – Local path where the dataset is stored. If None, looks for the data in the default directory, ~/mir_datasets/irmas
Variables: - track_id (str) – track id
- predominant_instrument (list) – Training tracks predominant instrument
- train (bool) – flag indicating whether the track is from the training or the testing dataset
- genre (str) – string containing the namecode of the genre of the track.
- drum (bool) – flag to identify if the track contains drums or not.
Other Parameters: instrument (list) – list of predominant instruments as str
-
audio
¶ The track’s audio signal
Returns: - np.ndarray - the mono audio signal
- float - The sample rate of the audio file
maestro¶
MAESTRO Dataset Loader
Dataset Info
MAESTRO (MIDI and Audio Edited for Synchronous TRacks and Organization) is a dataset composed of over 200 hours of virtuosic piano performances captured with fine alignment (~3 ms) between note labels and audio waveforms.
The dataset is created and released by Google’s Magenta team.
The dataset contains over 200 hours of paired audio and MIDI recordings from ten years of International Piano-e-Competition. The MIDI data includes key strike velocities and sustain/sostenuto/una corda pedal positions. Audio and MIDI files are aligned with ∼3 ms accuracy and sliced to individual musical pieces, which are annotated with composer, title, and year of performance. Uncompressed audio is of CD quality or higher (44.1–48 kHz 16-bit PCM stereo).
A train/validation/test split configuration is also proposed, so that the same composition, even if performed by multiple contestants, does not appear in multiple subsets. Repertoire is mostly classical, including composers from the 17th to early 20th century.
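The split constraint described above, that a composition never appears in more than one subset even when performed by multiple contestants, can be sketched as a check over hypothetical metadata entries (illustrative only; the field names mirror the Track attributes documented below):

```python
from collections import defaultdict

def splits_are_disjoint_by_composition(tracks):
    """Return True if no (composer, title) pair occurs in more than one split."""
    splits_per_piece = defaultdict(set)
    for t in tracks:
        key = (t["canonical_composer"], t["canonical_title"])
        splits_per_piece[key].add(t["split"])
    return all(len(splits) == 1 for splits in splits_per_piece.values())

# Hypothetical metadata: two performances of the same piece stay in one split.
tracks = [
    {"canonical_composer": "Frédéric Chopin", "canonical_title": "Ballade No. 1", "split": "train"},
    {"canonical_composer": "Frédéric Chopin", "canonical_title": "Ballade No. 1", "split": "train"},
    {"canonical_composer": "Franz Liszt", "canonical_title": "La Campanella", "split": "test"},
]
assert splits_are_disjoint_by_composition(tracks)
```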
The dataset is made available by Google LLC under a Creative Commons Attribution Non-Commercial Share-Alike 4.0 (CC BY-NC-SA 4.0) license.
This loader supports MAESTRO version 2.
For more details, please visit: https://magenta.tensorflow.org/datasets/maestro
-
class
mirdata.datasets.maestro.
Dataset
(data_home=None)[source]¶ The maestro dataset
Variables: - data_home (str) – path where mirdata will look for the dataset
- name (str) – the identifier of the dataset
- bibtex (str or None) – dataset citation/s in bibtex format
- remotes (dict or None) – data to be downloaded
- readme (str) – information about the dataset
- track (function) – a function which inputs a track_id (str) and
- returns (mirdata.core.Track or None) –
-
choice_track
()[source]¶ Choose a random track
Returns: Track – a Track object instantiated by a random track_id
-
default_path
¶ Get the default path for the dataset
Returns: str – Local path to the dataset
-
download
(partial_download=None, force_overwrite=False, cleanup=False)[source]¶ Download the dataset
Parameters: - partial_download (list or None) – A list of keys of remotes to partially download. If None, all data is downloaded
- force_overwrite (bool) – If True, existing files are overwritten by the downloaded files.
- cleanup (bool) – Whether to delete any zip/tar files after extracting.
Raises: ValueError
– if invalid keys are passed to partial_downloadIOError
– if a downloaded file’s checksum is different from expected
-
load_audio
(*args, **kwargs)[source]¶ Load a MAESTRO audio file.
Parameters: audio_path (str) – path to audio file Returns: - np.ndarray - the mono audio signal
- float - The sample rate of the audio file
-
load_midi
(*args, **kwargs)[source]¶ Load a MAESTRO midi file.
Parameters: midi_path (str) – path to midi file Returns: pretty_midi.PrettyMIDI – pretty_midi object
-
load_notes
(*args, **kwargs)[source]¶ Load note data from the midi file.
Parameters: - midi_path (str) – path to midi file
- midi (pretty_midi.PrettyMIDI) – pre-loaded midi object or None if None, the midi object is loaded using midi_path
Returns: NoteData – note annotations
-
class
mirdata.datasets.maestro.
Track
(track_id, data_home)[source]¶ MAESTRO Track class
Parameters: track_id (str) – track id of the track
Variables: - audio_path (str) – Path to the track’s audio file
- canonical_composer (str) – Composer of the piece, standardized on a single spelling for a given name.
- canonical_title (str) – Title of the piece. Not guaranteed to be standardized to a single representation.
- duration (float) – Duration in seconds, based on the MIDI file.
- midi_path (str) – Path to the track’s MIDI file
- split (str) – Suggested train/validation/test split.
- track_id (str) – track id
- year (int) – Year of performance.
- Cached Properties:
- midi (pretty_midi.PrettyMIDI) – object containing MIDI annotations
- notes (NoteData) – annotated piano notes
-
audio
¶ The track’s audio
Returns: - np.ndarray - audio signal
- float - sample rate
-
mirdata.datasets.maestro.
load_audio
(audio_path)[source]¶ Load a MAESTRO audio file.
Parameters: audio_path (str) – path to audio file Returns: - np.ndarray - the mono audio signal
- float - The sample rate of the audio file
medley_solos_db¶
Medley-solos-DB Dataset Loader.
Dataset Info
Medley-solos-DB is a cross-collection dataset for automatic musical instrument recognition in solo recordings. It consists of a training set of 3-second audio clips, which are extracted from the MedleyDB dataset (Bittner et al., ISMIR 2014) as well as a test set of 3-second clips, which are extracted from the solosDB dataset (Essid et al., IEEE TASLP 2009).
Each of these clips contains a single instrument among a taxonomy of eight:
- clarinet,
- distorted electric guitar,
- female singer,
- flute,
- piano,
- tenor saxophone,
- trumpet, and
- violin.
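Each clip's integer instrument_id (see the Track class below) indexes into this eight-class taxonomy. A hypothetical lookup, assuming the ids follow the order in which the instruments are listed above (verify against the dataset metadata before relying on it):

```python
# Assumed id-to-name ordering, matching the taxonomy as listed above.
MEDLEY_SOLOS_TAXONOMY = [
    "clarinet",
    "distorted electric guitar",
    "female singer",
    "flute",
    "piano",
    "tenor saxophone",
    "trumpet",
    "violin",
]

def instrument_name(instrument_id):
    """Map an integer instrument_id (0-7) to its English instrument name."""
    return MEDLEY_SOLOS_TAXONOMY[instrument_id]

# instrument_name(3) == "flute" under the assumed ordering
```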
Medley-solos-DB is the dataset used in the musical instrument recognition benchmarks of Lostanlen and Cella (ISMIR 2016) and Andén et al. (IEEE TSP 2019).
-
class
mirdata.datasets.medley_solos_db.
Dataset
(data_home=None)[source]¶ The medley_solos_db dataset
Variables: - data_home (str) – path where mirdata will look for the dataset
- name (str) – the identifier of the dataset
- bibtex (str or None) – dataset citation/s in bibtex format
- remotes (dict or None) – data to be downloaded
- readme (str) – information about the dataset
- track (function) – a function which inputs a track_id (str) and
- returns (mirdata.core.Track or None) –
-
choice_track
()[source]¶ Choose a random track
Returns: Track – a Track object instantiated by a random track_id
-
default_path
¶ Get the default path for the dataset
Returns: str – Local path to the dataset
-
download
(partial_download=None, force_overwrite=False, cleanup=False)[source]¶ Download data to save_dir and optionally print a message.
Parameters: - partial_download (list or None) – A list of keys of remotes to partially download. If None, all data is downloaded
- force_overwrite (bool) – If True, existing files are overwritten by the downloaded files.
- cleanup (bool) – Whether to delete any zip/tar files after extracting.
Raises: ValueError
– if invalid keys are passed to partial_downloadIOError
– if a downloaded file’s checksum is different from expected
-
load_audio
(*args, **kwargs)[source]¶ Load a Medley Solos DB audio file.
Parameters: audio_path (str) – path to audio file Returns: - np.ndarray - the mono audio signal
- float - The sample rate of the audio file
-
class
mirdata.datasets.medley_solos_db.
Track
(track_id, data_home)[source]¶ medley_solos_db Track class
Parameters: track_id (str) – track id of the track
Variables: - audio_path (str) – path to the track’s audio file
- instrument (str) – instrument encoded by its English name
- instrument_id (int) – instrument encoded as an integer
- song_id (int) – song encoded as an integer
- subset (str) – either equal to ‘train’, ‘validation’, or ‘test’
- track_id (str) – track id
-
audio
¶ The track’s audio
Returns: - np.ndarray - audio signal
- float - sample rate
medleydb_melody¶
MedleyDB melody Dataset Loader
Dataset Info
MedleyDB melody is a subset of the MedleyDB dataset containing only the mixtures and melody annotations.
MedleyDB is a dataset of annotated, royalty-free multitrack recordings. MedleyDB was curated primarily to support research on melody extraction, addressing important shortcomings of existing collections. For each song we provide melody f0 annotations as well as instrument activations for evaluating automatic instrument recognition.
For more details, please visit: https://medleydb.weebly.com
-
class
mirdata.datasets.medleydb_melody.
Dataset
(data_home=None)[source]¶ The medleydb_melody dataset
Variables: - data_home (str) – path where mirdata will look for the dataset
- name (str) – the identifier of the dataset
- bibtex (str or None) – dataset citation/s in bibtex format
- remotes (dict or None) – data to be downloaded
- readme (str) – information about the dataset
- track (function) – a function which inputs a track_id (str) and
- returns (mirdata.core.Track or None) –
-
choice_track
()[source]¶ Choose a random track
Returns: Track – a Track object instantiated by a random track_id
-
default_path
¶ Get the default path for the dataset
Returns: str – Local path to the dataset
-
download
(partial_download=None, force_overwrite=False, cleanup=False)[source]¶ Download data to save_dir and optionally print a message.
Parameters: - partial_download (list or None) – A list of keys of remotes to partially download. If None, all data is downloaded
- force_overwrite (bool) – If True, existing files are overwritten by the downloaded files.
- cleanup (bool) – Whether to delete any zip/tar files after extracting.
Raises: ValueError
– if invalid keys are passed to partial_downloadIOError
– if a downloaded file’s checksum is different from expected
-
load_audio
(*args, **kwargs)[source]¶ Load a MedleyDB audio file.
Parameters: audio_path (str) – path to audio file Returns: - np.ndarray - the mono audio signal
- float - The sample rate of the audio file
-
load_melody
(*args, **kwargs)[source]¶ Load a MedleyDB melody1 or melody2 annotation file
Parameters: melody_path (str) – path to a melody annotation file Raises: IOError
– if melody_path does not existReturns: F0Data – melody data
-
load_melody3
(*args, **kwargs)[source]¶ Load a MedleyDB melody3 annotation file
Parameters: melody_path (str) – melody 3 melody annotation path Raises: IOError
– if melody_path does not existReturns: MultiF0Data – melody 3 annotation data
-
class
mirdata.datasets.medleydb_melody.
Track
(track_id, data_home)[source]¶ medleydb_melody Track class
Parameters: track_id (str) – track id of the track
Variables: - artist (str) – artist
- audio_path (str) – path to the audio file
- genre (str) – genre
- is_excerpt (bool) – True if the track is an excerpt
- is_instrumental (bool) – True if the track does not contain vocals
- melody1_path (str) – path to the melody1 annotation file
- melody2_path (str) – path to the melody2 annotation file
- melody3_path (str) – path to the melody3 annotation file
- n_sources (int) – Number of instruments in the track
- title (str) – title
- track_id (str) – track id
Other Parameters: - melody1 (F0Data) – the pitch of the single most predominant source (often the voice)
- melody2 (F0Data) – the pitch of the predominant source for each point in time
- melody3 (MultiF0Data) – the pitch of any melodic source. Allows for more than one f0 value at a time
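The three melody definitions differ in how many f0 values may be active per time frame: melody1 and melody2 carry at most one, melody3 allows several. A minimal sketch with hypothetical frame data, using the common convention that 0 marks an unvoiced frame (this is not the mirdata F0Data API):

```python
def voiced_frames(f0s):
    """Return the indices of frames carrying a melody pitch (f0 > 0)."""
    return [i for i, f0 in enumerate(f0s) if f0 > 0]

# melody1/melody2 style: one f0 per frame, 0.0 where no melody is present.
single_f0 = [0.0, 220.0, 221.5, 0.0]
# melody3 style: possibly several simultaneous f0 values per frame.
multi_f0 = [[], [220.0], [220.0, 330.0], []]

assert voiced_frames(single_f0) == [1, 2]
assert [len(frame) for frame in multi_f0] == [0, 1, 2, 0]
```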
-
audio
¶ The track’s audio
Returns: - np.ndarray - audio signal
- float - sample rate
-
mirdata.datasets.medleydb_melody.
load_audio
(audio_path)[source]¶ Load a MedleyDB audio file.
Parameters: audio_path (str) – path to audio file Returns: - np.ndarray - the mono audio signal
- float - The sample rate of the audio file
medleydb_pitch¶
MedleyDB pitch Dataset Loader
Dataset Info
MedleyDB Pitch is a pitch-tracking subset of the MedleyDB dataset containing only f0-annotated, monophonic stems.
MedleyDB is a dataset of annotated, royalty-free multitrack recordings. MedleyDB was curated primarily to support research on melody extraction, addressing important shortcomings of existing collections. For each song we provide melody f0 annotations as well as instrument activations for evaluating automatic instrument recognition.
For more details, please visit: https://medleydb.weebly.com
-
class
mirdata.datasets.medleydb_pitch.
Dataset
(data_home=None)[source]¶ The medleydb_pitch dataset
Variables: - data_home (str) – path where mirdata will look for the dataset
- name (str) – the identifier of the dataset
- bibtex (str or None) – dataset citation/s in bibtex format
- remotes (dict or None) – data to be downloaded
- readme (str) – information about the dataset
- track (function) – a function which inputs a track_id (str) and
- returns (mirdata.core.Track or None) –
-
choice_track
()[source]¶ Choose a random track
Returns: Track – a Track object instantiated by a random track_id
-
default_path
¶ Get the default path for the dataset
Returns: str – Local path to the dataset
-
download
(partial_download=None, force_overwrite=False, cleanup=False)[source]¶ Download data to save_dir and optionally print a message.
Parameters: - partial_download (list or None) – A list of keys of remotes to partially download. If None, all data is downloaded
- force_overwrite (bool) – If True, existing files are overwritten by the downloaded files.
- cleanup (bool) – Whether to delete any zip/tar files after extracting.
Raises: ValueError
– if invalid keys are passed to partial_downloadIOError
– if a downloaded file’s checksum is different from expected
-
load_audio
(*args, **kwargs)[source]¶ Load a MedleyDB audio file.
Parameters: audio_path (str) – path to audio file Returns: - np.ndarray - the mono audio signal
- float - The sample rate of the audio file
-
load_pitch
(*args, **kwargs)[source]¶ load a MedleyDB pitch annotation file
Parameters: pitch_path (str) – path to pitch annotation file Raises: IOError
– if pitch_path doesn’t existReturns: F0Data – pitch annotation
-
class
mirdata.datasets.medleydb_pitch.
Track
(track_id, data_home)[source]¶ medleydb_pitch Track class
Parameters: track_id (str) – track id of the track
Variables: Other Parameters: pitch (F0Data) – human annotated pitch
-
audio
¶ The track’s audio
Returns: - np.ndarray - audio signal
- float - sample rate
mridangam_stroke¶
Mridangam Stroke Dataset Loader
Dataset Info
The Mridangam Stroke dataset is a collection of individual strokes of the Mridangam in various tonics. The dataset comprises 10 different strokes played on Mridangams with 6 different tonic values. The audio examples were recorded from a professional Carnatic percussionist under semi-anechoic studio conditions by Akshay Anantapadmanabhan.
Total audio samples: 6977
Used microphones:
- SM-58 microphones
- H4n ZOOM recorder.
Audio specifications:
- Sampling frequency: 44.1 kHz
- Bit-depth: 16 bit
- Audio format: .wav
The dataset can be used for training models for each Mridangam stroke. The dataset was presented at the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2013) in May 2013. You can read the full publication here: https://repositori.upf.edu/handle/10230/25756
The Mridangam Stroke Dataset is annotated by storing the information of each track in its filename. The structure of the filename is:
<TrackID>__<AuthorName>__<StrokeName>-<Tonic>-<InstanceNum>.wav
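The filename structure above can be unpacked with straightforward string splitting. A sketch using a hypothetical filename that follows the documented structure (not the mirdata API):

```python
def parse_mridangam_filename(filename):
    """Unpack <TrackID>__<AuthorName>__<StrokeName>-<Tonic>-<InstanceNum>.wav."""
    stem = filename.rsplit(".", 1)[0]          # drop the .wav extension
    track_id, author, stroke_part = stem.split("__")
    # Split from the right so a stroke name containing "-" would survive.
    stroke, tonic, instance = stroke_part.rsplit("-", 2)
    return {"track_id": track_id, "author": author,
            "stroke": stroke, "tonic": tonic, "instance": instance}

# Hypothetical filename following the documented structure:
info = parse_mridangam_filename("224030__akshaylaya__bheem-b-001.wav")
# info["stroke"] == "bheem", info["tonic"] == "b"
```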
The dataset is made available by CompMusic under a Creative Commons Attribution 3.0 Unported (CC BY 3.0) License.
For more details, please visit: https://compmusic.upf.edu/mridangam-stroke-dataset
-
class
mirdata.datasets.mridangam_stroke.
Dataset
(data_home=None)[source]¶ The mridangam_stroke dataset
Variables: - data_home (str) – path where mirdata will look for the dataset
- name (str) – the identifier of the dataset
- bibtex (str or None) – dataset citation/s in bibtex format
- remotes (dict or None) – data to be downloaded
- readme (str) – information about the dataset
- track (function) – a function which inputs a track_id (str) and
- returns (mirdata.core.Track or None) –
-
choice_track
()[source]¶ Choose a random track
Returns: Track – a Track object instantiated by a random track_id
-
default_path
¶ Get the default path for the dataset
Returns: str – Local path to the dataset
-
download
(partial_download=None, force_overwrite=False, cleanup=False)[source]¶ Download data to save_dir and optionally print a message.
Parameters: - partial_download (list or None) – A list of keys of remotes to partially download. If None, all data is downloaded
- force_overwrite (bool) – If True, existing files are overwritten by the downloaded files.
- cleanup (bool) – Whether to delete any zip/tar files after extracting.
Raises: ValueError
– if invalid keys are passed to partial_downloadIOError
– if a downloaded file’s checksum is different from expected
-
load_audio
(*args, **kwargs)[source]¶ Load a Mridangam Stroke Dataset audio file.
Parameters: audio_path (str) – path to audio file Returns: - np.ndarray - the mono audio signal
- float - The sample rate of the audio file
-
class
mirdata.datasets.mridangam_stroke.
Track
(track_id, data_home)[source]¶ Mridangam Stroke track class
Parameters: - track_id (str) – track id of the track
- data_home (str) – Local path where the dataset is stored.
Variables: - track_id (str) – track id
- audio_path (str) – audio path
- stroke_name (str) – name of the Mridangam stroke present in Track
- tonic (str) – tonic of the stroke in the Track
-
audio
¶ The track’s audio
Returns: - np.ndarray - audio signal
- float - sample rate
orchset¶
ORCHSET Dataset Loader
Dataset Info
Orchset is intended to be used as a dataset for the development and evaluation of melody extraction algorithms. This collection contains 64 audio excerpts focused on symphonic music with their corresponding annotation of the melody.
For more details, please visit: https://zenodo.org/record/1289786#.XREpzaeZPx6
-
class
mirdata.datasets.orchset.
Dataset
(data_home=None)[source]¶ The orchset dataset
Variables: - data_home (str) – path where mirdata will look for the dataset
- name (str) – the identifier of the dataset
- bibtex (str or None) – dataset citation/s in bibtex format
- remotes (dict or None) – data to be downloaded
- readme (str) – information about the dataset
- track (function) – a function which inputs a track_id (str) and
- returns (mirdata.core.Track or None) –
-
choice_track
()[source]¶ Choose a random track
Returns: Track – a Track object instantiated by a random track_id
-
default_path
¶ Get the default path for the dataset
Returns: str – Local path to the dataset
-
download
(partial_download=None, force_overwrite=False, cleanup=False)[source]¶ Download the dataset
Parameters: - partial_download (list or None) – A list of keys of remotes to partially download. If None, all data is downloaded
- force_overwrite (bool) – If True, existing files are overwritten by the downloaded files.
- cleanup (bool) – Whether to delete any zip/tar files after extracting.
Raises: ValueError
– if invalid keys are passed to partial_downloadIOError
– if a downloaded file’s checksum is different from expected
-
load_audio_mono
(*args, **kwargs)[source]¶ Load an Orchset audio file.
Parameters: audio_path (str) – path to audio file Returns: - np.ndarray - the mono audio signal
- float - The sample rate of the audio file
-
load_audio_stereo
(*args, **kwargs)[source]¶ Load an Orchset audio file.
Parameters: audio_path (str) – path to audio file Returns: - np.ndarray - the stereo audio signal
- float - The sample rate of the audio file
-
load_melody
(*args, **kwargs)[source]¶ Load an Orchset melody annotation file
Parameters: melody_path (str) – path to melody annotation file Raises: IOError
– if melody_path doesn’t existReturns: F0Data – melody annotation data
-
class
mirdata.datasets.orchset.
Track
(track_id, data_home)[source]¶ orchset Track class
Parameters: track_id (str) – track id of the track
Variables: - alternating_melody (bool) – True if the melody alternates between instruments
- audio_path_mono (str) – path to the mono audio file
- audio_path_stereo (str) – path to the stereo audio file
- composer (str) – the work’s composer
- contains_brass (bool) – True if the track contains any brass instrument
- contains_strings (bool) – True if the track contains any string instrument
- contains_winds (bool) – True if the track contains any wind instrument
- excerpt (str) – True if the track is an excerpt
- melody_path (str) – path to the melody annotation file
- only_brass (bool) – True if the track contains brass instruments only
- only_strings (bool) – True if the track contains string instruments only
- only_winds (bool) – True if the track contains wind instruments only
- predominant_melodic_instruments (list) – List of instruments which play the melody
- track_id (str) – track id
- work (str) – The musical work
Other Parameters: melody (F0Data) – melody annotation
-
audio_mono
¶ the track’s audio (mono)
Returns: - np.ndarray - the mono audio signal
- float - The sample rate of the audio file
-
audio_stereo
¶ the track’s audio (stereo)
Returns: - np.ndarray - the stereo audio signal
- float - The sample rate of the audio file
-
mirdata.datasets.orchset.
load_audio_mono
(audio_path)[source]¶ Load an Orchset audio file.
Parameters: audio_path (str) – path to audio file Returns: - np.ndarray - the mono audio signal
- float - The sample rate of the audio file
rwc_classical¶
RWC Classical Dataset Loader
Dataset Info
The Classical Music Database consists of 50 pieces
- Symphonies: 4 pieces
- Concerti: 2 pieces
- Orchestral music: 4 pieces
- Chamber music: 10 pieces
- Solo performances: 24 pieces
- Vocal performances: 6 pieces
A note about the Beat annotations:
- 48 corresponds to the duration of a quarter note (crotchet)
- 24 corresponds to the duration of an eighth note (quaver)
- 384 corresponds to the position of a downbeat
In 4/4 time signature, they correspond as follows:
384: 1st beat in a measure (i.e., downbeat position)
48: 2nd beat
96: 3rd beat
144: 4th beat
In 3/4 time signature, they correspond as follows:
384: 1st beat in a measure (i.e., downbeat position)
48: 2nd beat
96: 3rd beat
In 6/8 time signature, they correspond as follows:
384: 1st beat in a measure (i.e., downbeat position)
24: 2nd beat
48: 3rd beat
72: 4th beat
96: 5th beat
120: 6th beat
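The beat-number conventions above can be captured in a lookup per time signature; a small sketch, with the codes taken verbatim from the tables above:

```python
# RWC beat annotation codes mapped to 1-indexed beat positions per meter.
BEAT_POSITIONS = {
    "4/4": {384: 1, 48: 2, 96: 3, 144: 4},
    "3/4": {384: 1, 48: 2, 96: 3},
    "6/8": {384: 1, 24: 2, 48: 3, 72: 4, 96: 5, 120: 6},
}

def beat_position(code, meter):
    """Map an RWC beat annotation code to its beat within the measure."""
    return BEAT_POSITIONS[meter][code]

assert beat_position(384, "4/4") == 1   # 384 is always the downbeat
assert beat_position(72, "6/8") == 4
```

Note that the meter is required to disambiguate: code 48 means the 2nd beat in 4/4 and 3/4 but the 3rd beat in 6/8.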
For more details, please visit: https://staff.aist.go.jp/m.goto/RWC-MDB/rwc-mdb-c.html
-
class
mirdata.datasets.rwc_classical.
Dataset
(data_home=None)[source]¶ The rwc_classical dataset
Variables: - data_home (str) – path where mirdata will look for the dataset
- name (str) – the identifier of the dataset
- bibtex (str or None) – dataset citation/s in bibtex format
- remotes (dict or None) – data to be downloaded
- readme (str) – information about the dataset
- track (function) – a function which inputs a track_id (str) and
- returns (mirdata.core.Track or None) –
-
choice_track
()[source]¶ Choose a random track
Returns: Track – a Track object instantiated by a random track_id
-
default_path
¶ Get the default path for the dataset
Returns: str – Local path to the dataset
-
download
(partial_download=None, force_overwrite=False, cleanup=False)[source]¶ Download data to save_dir and optionally print a message.
Parameters: - partial_download (list or None) – A list of keys of remotes to partially download. If None, all data is downloaded
- force_overwrite (bool) – If True, existing files are overwritten by the downloaded files.
- cleanup (bool) – Whether to delete any zip/tar files after extracting.
Raises: ValueError
– if invalid keys are passed to partial_downloadIOError
– if a downloaded file’s checksum is different from expected
-
load_audio
(*args, **kwargs)[source]¶ Load a RWC audio file.
Parameters: audio_path (str) – path to audio file Returns: - np.ndarray - the mono audio signal
- float - The sample rate of the audio file
-
load_beats
(*args, **kwargs)[source]¶ Load rwc beat data from a file
Parameters: beats_path (str) – path to beats annotation file Returns: BeatData – beat data
-
load_sections
(*args, **kwargs)[source]¶ Load rwc section data from a file
Parameters: sections_path (str) – path to sections annotation file Returns: SectionData – section data
-
class
mirdata.datasets.rwc_classical.
Track
(track_id, data_home)[source]¶ rwc_classical Track class
Parameters: track_id (str) – track id of the track
Variables: - artist (str) – the track’s artist
- audio_path (str) – path of the audio file
- beats_path (str) – path of the beat annotation file
- category (str) – One of ‘Symphony’, ‘Concerto’, ‘Orchestral’, ‘Solo’, ‘Chamber’, ‘Vocal’, or blank.
- composer (str) – Composer of this Track.
- duration (float) – Duration of the track in seconds
- piece_number (str) – Piece number of this Track, [1-50]
- sections_path (str) – path of the section annotation file
- suffix (str) – string within M01-M06
- title (str) – Title of the track.
- track_id (str) – track id
- track_number (str) – CD track number of this Track
Other Parameters: - sections (SectionData) – human-labeled section annotations
- beats (BeatData) – human-labeled beat annotations
-
audio
¶ The track’s audio
Returns: - np.ndarray - audio signal
- float - sample rate
-
mirdata.datasets.rwc_classical.
load_audio
(audio_path)[source]¶ Load a RWC audio file.
Parameters: audio_path (str) – path to audio file Returns: - np.ndarray - the mono audio signal
- float - The sample rate of the audio file
rwc_jazz¶
RWC Jazz Dataset Loader.
Dataset Info
The Jazz Music Database consists of 50 pieces:
Instrumentation variations: 35 pieces (5 pieces × 7 instrumentations).
The instrumentation-variation pieces were recorded to obtain different versions of the same piece; i.e., different arrangements performed by different player instrumentations. Five standard-style jazz pieces were originally composed and then performed in modern-jazz style using the following seven instrumentations:
- Piano solo
- Guitar solo
- Duo: Vibraphone + Piano, Flute + Piano, and Piano + Bass
- Piano trio: Piano + Bass + Drums
- Piano trio + Trumpet or Tenor saxophone
- Octet: Piano trio + Guitar + Alto saxophone + Baritone saxophone + Tenor saxophone × 2
- Piano trio + Vibraphone or Flute
Style variations: 9 pieces
The style-variation pieces were recorded to represent various styles of jazz. They include four well-known public-domain pieces and consist of
- Vocal jazz: 2 pieces (including “Aura Lee”)
- Big band jazz: 2 pieces (including “The Entertainer”)
- Modal jazz: 2 pieces
- Funky jazz: 2 pieces (including “Silent Night”)
- Free jazz: 1 piece (including “Joyful, Joyful, We Adore Thee”)
Fusion (crossover): 6 pieces
The fusion pieces were recorded to obtain music that combines elements of jazz with other styles such as popular, rock, and latin. They include music with an eighth-note feel, music with a sixteenth-note feel, and Latin jazz music.
For more details, please visit: https://staff.aist.go.jp/m.goto/RWC-MDB/rwc-mdb-j.html
-
class
mirdata.datasets.rwc_jazz.
Dataset
(data_home=None)[source]¶ The rwc_jazz dataset
Variables: - data_home (str) – path where mirdata will look for the dataset
- name (str) – the identifier of the dataset
- bibtex (str or None) – dataset citation/s in bibtex format
- remotes (dict or None) – data to be downloaded
- readme (str) – information about the dataset
- track (function) – a function which inputs a track_id (str) and
- returns (mirdata.core.Track or None) –
-
choice_track
()[source]¶ Choose a random track
Returns: Track – a Track object instantiated by a random track_id
-
default_path
¶ Get the default path for the dataset
Returns: str – Local path to the dataset
-
download
(partial_download=None, force_overwrite=False, cleanup=False)[source]¶ Download data to save_dir and optionally print a message.
Parameters: - partial_download (list or None) – A list of keys of remotes to partially download. If None, all data is downloaded
- force_overwrite (bool) – If True, existing files are overwritten by the downloaded files.
- cleanup (bool) – Whether to delete any zip/tar files after extracting.
Raises: ValueError
– if invalid keys are passed to partial_downloadIOError
– if a downloaded file’s checksum is different from expected
-
load_audio
(*args, **kwargs)[source]¶ Load a RWC audio file.
Parameters: audio_path (str) – path to audio file Returns: - np.ndarray - the mono audio signal
- float - The sample rate of the audio file
-
load_beats
(*args, **kwargs)[source]¶ Load rwc beat data from a file
Parameters: beats_path (str) – path to beats annotation file Returns: BeatData – beat data
-
load_sections
(*args, **kwargs)[source]¶ Load rwc section data from a file
Parameters: sections_path (str) – path to sections annotation file Returns: SectionData – section data
-
class
mirdata.datasets.rwc_jazz.
Track
(track_id, data_home)[source]¶ rwc_jazz Track class
Parameters: track_id (str) – track id of the track
Variables: - artist (str) – Artist name
- audio_path (str) – path of the audio file
- beats_path (str) – path of the beat annotation file
- duration (float) – Duration of the track in seconds
- instruments (str) – list of instruments used
- piece_number (str) – Piece number of this Track, [1-50]
- sections_path (str) – path of the section annotation file
- suffix (str) – M01-M04
- title (str) – Title of the track
- track_id (str) – track id
- track_number (str) – CD track number of this Track
- variation (str) – style variations
Other Parameters: - sections (SectionData) – human-labeled section data
- beats (BeatData) – human-labeled beat data
-
audio
¶ The track’s audio
Returns: - np.ndarray - audio signal
- float - sample rate
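The beat and section annotations documented above can be consumed directly. The sketch below (assuming the rwc_jazz data has been downloaded locally, and that beat times are available as a float array per mirdata's BeatData convention) estimates a track's median inter-beat interval, a rough tempo period:

```python
import numpy as np

def median_ibi(beat_times):
    """Median inter-beat interval in seconds, a rough tempo-period estimate."""
    beat_times = np.asarray(beat_times, dtype=float)
    if beat_times.size < 2:
        raise ValueError("need at least two beat times")
    # Differences between consecutive beat timestamps, then their median.
    return float(np.median(np.diff(beat_times)))

# Hedged usage against the API documented above (requires local data):
# import mirdata
# rwc_jazz = mirdata.initialize('rwc_jazz')
# track = rwc_jazz.choice_track()
# print(median_ibi(track.beats.times))
```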
rwc_popular¶
RWC Popular Dataset Loader
Dataset Info
The Popular Music Database consists of 100 songs — 20 songs with English lyrics performed in the style of popular music typical of songs on the American hit charts in the 1980s, and 80 songs with Japanese lyrics performed in the style of modern Japanese popular music typical of songs on the Japanese hit charts in the 1990s.
For more details, please visit: https://staff.aist.go.jp/m.goto/RWC-MDB/rwc-mdb-p.html
-
class
mirdata.datasets.rwc_popular.
Dataset
(data_home=None)[source]¶ The rwc_popular dataset
Variables: - data_home (str) – path where mirdata will look for the dataset
- name (str) – the identifier of the dataset
- bibtex (str or None) – dataset citation/s in bibtex format
- remotes (dict or None) – data to be downloaded
- readme (str) – information about the dataset
- track (function) – a function which inputs a track_id (str) and returns (mirdata.core.Track or None)
-
choice_track
()[source]¶ Choose a random track
Returns: Track – a Track object instantiated by a random track_id
-
default_path
¶ Get the default path for the dataset
Returns: str – Local path to the dataset
-
download
(partial_download=None, force_overwrite=False, cleanup=False)[source]¶ Download data to save_dir and optionally print a message.
Parameters: - partial_download (list or None) – A list of keys of remotes to partially download. If None, all data is downloaded
- force_overwrite (bool) – If True, existing files are overwritten by the downloaded files.
- cleanup (bool) – Whether to delete any zip/tar files after extracting.
Raises: - ValueError – if invalid keys are passed to partial_download
- IOError – if a downloaded file’s checksum is different from expected
-
load_audio
(*args, **kwargs)[source]¶ Load an RWC audio file.
Parameters: audio_path (str) – path to audio file Returns: - np.ndarray - the mono audio signal
- float - The sample rate of the audio file
-
load_beats
(*args, **kwargs)[source]¶ Load rwc beat data from a file
Parameters: beats_path (str) – path to beats annotation file Returns: BeatData – beat data
-
load_chords
(*args, **kwargs)[source]¶ Load rwc chord data from a file
Parameters: chords_path (str) – path to chord annotation file Returns: ChordData – chord data
-
load_sections
(*args, **kwargs)[source]¶ Load rwc section data from a file
Parameters: sections_path (str) – path to sections annotation file Returns: SectionData – section data
-
load_tracks
()[source]¶ Load all tracks in the dataset
Returns: dict – {track_id: track data} Raises: NotImplementedError
– If the dataset does not support Track objects
-
class
mirdata.datasets.rwc_popular.
Track
(track_id, data_home)[source]¶ rwc_popular Track class
Parameters: track_id (str) – track id of the track
Variables: - artist (str) – artist
- audio_path (str) – path of the audio file
- beats_path (str) – path of the beat annotation file
- chords_path (str) – path of the chord annotation file
- drum_information (str) – whether the drums are ‘Drum sequences’, ‘Live drums’, or ‘Drum loops’
- duration (float) – Duration of the track in seconds
- instruments (str) – List of used instruments
- piece_number (str) – Piece number, [1-50]
- sections_path (str) – path of the section annotation file
- singer_information (str) – whether the singer is male, female, or a vocal group
- suffix (str) – M01-M04
- tempo (str) – Tempo of the track in BPM
- title (str) – title
- track_id (str) – track id
- track_number (str) – CD track number
- voca_inst_path (str) – path of the vocal/instrumental annotation file
Other Parameters: - sections (SectionData) – human-labeled section annotation
- beats (BeatData) – human-labeled beat annotation
- chords (ChordData) – human-labeled chord annotation
- vocal_instrument_activity (EventData) – human-labeled vocal/instrument activity
-
audio
¶ The track’s audio
Returns: - np.ndarray - audio signal
- float - sample rate
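As a sketch of how the section annotations above might be used (assuming sections expose an (n, 2) array of start/end times and a matching label list, per mirdata's SectionData convention), one can total the time spent in each section label:

```python
def section_durations(intervals, labels):
    """Total duration in seconds per section label.

    intervals: iterable of (start, end) pairs; labels: matching label list.
    """
    totals = {}
    for (start, end), label in zip(intervals, labels):
        # Accumulate the span of each interval under its label.
        totals[label] = totals.get(label, 0.0) + float(end - start)
    return totals

# Hedged usage (requires local data):
# import mirdata
# track = mirdata.initialize('rwc_popular').choice_track()
# print(section_durations(track.sections.intervals, track.sections.labels))
```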
salami¶
SALAMI Dataset Loader
Dataset Info
The SALAMI (Structural Analysis of Large Amounts of Music Information) dataset contains structural annotations of music: the public portion contains over 2200 annotations of over 1300 unique tracks.
NB: mirdata relies on the corrected version of the 2.0 annotations: Details can be found at https://github.com/bmcfee/salami-data-public/tree/hierarchy-corrections and https://github.com/DDMAL/salami-data-public/pull/15.
For more details, please visit: https://github.com/DDMAL/salami-data-public
-
class
mirdata.datasets.salami.
Dataset
(data_home=None)[source]¶ The salami dataset
Variables: - data_home (str) – path where mirdata will look for the dataset
- name (str) – the identifier of the dataset
- bibtex (str or None) – dataset citation/s in bibtex format
- remotes (dict or None) – data to be downloaded
- readme (str) – information about the dataset
- track (function) – a function which inputs a track_id (str) and returns (mirdata.core.Track or None)
-
choice_track
()[source]¶ Choose a random track
Returns: Track – a Track object instantiated by a random track_id
-
default_path
¶ Get the default path for the dataset
Returns: str – Local path to the dataset
-
download
(partial_download=None, force_overwrite=False, cleanup=False)[source]¶ Download data to save_dir and optionally print a message.
Parameters: - partial_download (list or None) – A list of keys of remotes to partially download. If None, all data is downloaded
- force_overwrite (bool) – If True, existing files are overwritten by the downloaded files.
- cleanup (bool) – Whether to delete any zip/tar files after extracting.
Raises: - ValueError – if invalid keys are passed to partial_download
- IOError – if a downloaded file’s checksum is different from expected
-
load_audio
(*args, **kwargs)[source]¶ Load a Salami audio file.
Parameters: audio_path (str) – path to audio file Returns: - np.ndarray - the mono audio signal
- float - The sample rate of the audio file
-
load_sections
(*args, **kwargs)[source]¶ Load salami sections data from a file
Parameters: sections_path (str) – path to section annotation file Returns: SectionData – section data
-
class
mirdata.datasets.salami.
Track
(track_id, data_home)[source]¶ salami Track class
Parameters: track_id (str) – track id of the track
Variables: - annotator_1_id (str) – number that identifies annotator 1
- annotator_1_time (str) – time that annotator 1 took to complete the annotation
- annotator_2_id (str) – number that identifies annotator 2
- annotator_2_time (str) – time that annotator 2 took to complete the annotation
- artist (str) – song artist
- audio_path (str) – path to the audio file
- broad_genre (str) – broad genre of the song
- duration (float) – duration of song in seconds
- genre (str) – genre of the song
- sections_annotator1_lowercase_path (str) – path to annotations in hierarchy level 1 from annotator 1
- sections_annotator1_uppercase_path (str) – path to annotations in hierarchy level 0 from annotator 1
- sections_annotator2_lowercase_path (str) – path to annotations in hierarchy level 1 from annotator 2
- sections_annotator2_uppercase_path (str) – path to annotations in hierarchy level 0 from annotator 2
- source (str) – dataset or source of song
- title (str) – title of the song
Other Parameters: - sections_annotator_1_uppercase (SectionData) – annotations in hierarchy level 0 from annotator 1
- sections_annotator_1_lowercase (SectionData) – annotations in hierarchy level 1 from annotator 1
- sections_annotator_2_uppercase (SectionData) – annotations in hierarchy level 0 from annotator 2
- sections_annotator_2_lowercase (SectionData) – annotations in hierarchy level 1 from annotator 2
-
audio
¶ The track’s audio
Returns: - np.ndarray - audio signal
- float - sample rate
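Since SALAMI provides section annotations from two annotators at two hierarchy levels, a common first step is comparing their boundary sets. A minimal sketch (assuming section data exposes (start, end) interval pairs; the tolerance-based matching here is an illustration, not part of the library):

```python
def boundary_times(intervals):
    """Sorted unique segment boundary times from (start, end) interval pairs."""
    times = {round(float(t), 6) for pair in intervals for t in pair}
    return sorted(times)

def shared_boundaries(intervals_a, intervals_b, tolerance=0.5):
    """Count boundaries of annotator A matched by annotator B within `tolerance` seconds."""
    b_times = boundary_times(intervals_b)
    return sum(
        1
        for t in boundary_times(intervals_a)
        if any(abs(t - u) <= tolerance for u in b_times)
    )
```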
saraga_carnatic¶
Saraga Dataset Loader
Dataset Info
This dataset contains time-aligned melody, rhythm, and structural annotations of Carnatic Music tracks, extracted from the large open Indian Art Music corpora of CompMusic.
The dataset contains the following manual annotations referring to audio files:
- Section and tempo annotations stored as start and end timestamps together with the name of the section and tempo during the section (in a separate file)
- Sama annotations referring to rhythmic cycle boundaries stored as timestamps.
- Phrase annotations stored as timestamps and transcription of the phrases using solfège symbols ({S, r, R, g, G, m, M, P, d, D, n, N}).
- Audio features automatically extracted and stored: pitch and tonic.
- The annotations are stored in text files named after the corresponding audio file, with the respective annotation extension appended, for instance: “Bhuvini Dasudane.tempo-manual.txt”.
The dataset contains a total of 249 tracks. A total of 168 tracks have multitrack audio.
The files of this dataset are shared with the following license: Creative Commons Attribution Non Commercial Share Alike 4.0 International
Dataset compiled by: Bozkurt, B.; Srinivasamurthy, A.; Gulati, S. and Serra, X.
For more information about the dataset, the Indian Art Music (IAM) corpora, and the annotations, please refer to https://mtg.github.io/saraga/, where a detailed description of the data and annotations is published.
-
class
mirdata.datasets.saraga_carnatic.
Dataset
(data_home=None)[source]¶ The saraga_carnatic dataset
Variables: - data_home (str) – path where mirdata will look for the dataset
- name (str) – the identifier of the dataset
- bibtex (str or None) – dataset citation/s in bibtex format
- remotes (dict or None) – data to be downloaded
- readme (str) – information about the dataset
- track (function) – a function which inputs a track_id (str) and returns (mirdata.core.Track or None)
-
choice_track
()[source]¶ Choose a random track
Returns: Track – a Track object instantiated by a random track_id
-
default_path
¶ Get the default path for the dataset
Returns: str – Local path to the dataset
-
download
(partial_download=None, force_overwrite=False, cleanup=False)[source]¶ Download data to save_dir and optionally print a message.
Parameters: - partial_download (list or None) – A list of keys of remotes to partially download. If None, all data is downloaded
- force_overwrite (bool) – If True, existing files are overwritten by the downloaded files.
- cleanup (bool) – Whether to delete any zip/tar files after extracting.
Raises: - ValueError – if invalid keys are passed to partial_download
- IOError – if a downloaded file’s checksum is different from expected
-
load_audio
(*args, **kwargs)[source]¶ Load a Saraga Carnatic audio file.
Parameters: audio_path (str) – path to audio file Returns: - np.ndarray - the mono audio signal
- float - The sample rate of the audio file
-
load_phrases
(*args, **kwargs)[source]¶ Load phrases
Parameters: phrases_path (str) – Local path where the phrase annotation is stored. If None, returns None. Returns: EventData – phrases annotation for track
-
load_pitch
(*args, **kwargs)[source]¶ Load pitch
Parameters: pitch_path (str) – Local path where the pitch annotation is stored. If None, returns None. Returns: F0Data – pitch annotation
-
load_sama
(*args, **kwargs)[source]¶ Load sama
Parameters: sama_path (str) – Local path where the sama annotation is stored. If None, returns None. Returns: BeatData – sama annotations
-
load_sections
(*args, **kwargs)[source]¶ Load sections from carnatic collection
Parameters: sections_path (str) – Local path where the section annotation is stored. Returns: SectionData – section annotations for track
-
load_tempo
(*args, **kwargs)[source]¶ Load tempo from carnatic collection
Parameters: tempo_path (str) – Local path where the tempo annotation is stored. Returns: dict – Dictionary of tempo information with the following keys:
- tempo_apm: tempo in aksharas per minute (APM)
- tempo_bpm: tempo in beats per minute (BPM)
- sama_interval: median duration (in seconds) of one tāla cycle
- beats_per_cycle: number of beats in one cycle of the tāla
- subdivisions: number of aksharas per beat of the tāla
-
load_tonic
(*args, **kwargs)[source]¶ Load track absolute tonic
Parameters: tonic_path (str) – Local path where the tonic path is stored. If None, returns None. Returns: int – Tonic annotation in Hz
-
class
mirdata.datasets.saraga_carnatic.
Track
(track_id, data_home)[source]¶ Saraga Carnatic Track class
Parameters: - track_id (str) – track id of the track
- data_home (str) – Local path where the dataset is stored. default=None. If None, looks for the data in the default directory, ~/mir_datasets
Variables: - title (str) – Title of the piece in the track
- mbid (str) – MusicBrainz ID of the track
- album_artists (list, dicts) – list of dicts containing the album artists present in the track and its mbid
- artists (list, dicts) – list of dicts containing information of the featuring artists in the track
- raaga (list, dict) – list of dicts containing information about the raagas present in the track
- form (list, dict) – list of dicts containing information about the forms present in the track
- work (list, dicts) – list of dicts containing the work present in the piece, and its mbid
- taala (list, dicts) – list of dicts containing the talas present in the track and its uuid
- concert (list, dicts) – list of dicts containing the concert where the track is present and its mbid
Other Parameters: - tonic (float) – tonic annotation
- pitch (F0Data) – pitch annotation
- pitch_vocal (F0Data) – vocal pitch annotation
- tempo (dict) – tempo annotations
- sama (BeatData) – sama section annotations
- sections (SectionData) – track section annotations
- phrases (EventData) – phrase annotations
-
audio
¶ The track’s audio
Returns: - np.ndarray - audio signal
- float - sample rate
-
mirdata.datasets.saraga_carnatic.
load_audio
(audio_path)[source]¶ Load a Saraga Carnatic audio file.
Parameters: audio_path (str) – path to audio file Returns: - np.ndarray - the mono audio signal
- float - The sample rate of the audio file
-
mirdata.datasets.saraga_carnatic.
load_phrases
(phrases_path)[source]¶ Load phrases
Parameters: phrases_path (str) – Local path where the phrase annotation is stored. If None, returns None. Returns: EventData – phrases annotation for track
-
mirdata.datasets.saraga_carnatic.
load_pitch
(pitch_path)[source]¶ Load pitch
Parameters: pitch_path (str) – Local path where the pitch annotation is stored. If None, returns None. Returns: F0Data – pitch annotation
-
mirdata.datasets.saraga_carnatic.
load_sama
(sama_path)[source]¶ Load sama
Parameters: sama_path (str) – Local path where the sama annotation is stored. If None, returns None. Returns: BeatData – sama annotations
-
mirdata.datasets.saraga_carnatic.
load_sections
(sections_path)[source]¶ Load sections from carnatic collection
Parameters: sections_path (str) – Local path where the section annotation is stored. Returns: SectionData – section annotations for track
-
mirdata.datasets.saraga_carnatic.
load_tempo
(tempo_path)[source]¶ Load tempo from carnatic collection
Parameters: tempo_path (str) – Local path where the tempo annotation is stored. Returns: dict – Dictionary of tempo information with the following keys:
- tempo_apm: tempo in aksharas per minute (APM)
- tempo_bpm: tempo in beats per minute (BPM)
- sama_interval: median duration (in seconds) of one tāla cycle
- beats_per_cycle: number of beats in one cycle of the tāla
- subdivisions: number of aksharas per beat of the tāla
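The tempo dictionary documented above relates its fields arithmetically: one tāla cycle spans beats_per_cycle beats at tempo_bpm beats per minute, so its expected duration can be checked against sama_interval. A small sketch (key names taken from the docs above; the consistency check itself is an illustration, not part of the library):

```python
def cycle_duration(tempo_bpm, beats_per_cycle):
    """Expected duration in seconds of one tala cycle at the given tempo."""
    return beats_per_cycle * 60.0 / tempo_bpm

def tempo_is_consistent(tempo, rel_tol=0.05):
    """Check sama_interval against the duration implied by tempo_bpm and beats_per_cycle."""
    expected = cycle_duration(tempo["tempo_bpm"], tempo["beats_per_cycle"])
    return abs(expected - tempo["sama_interval"]) <= rel_tol * expected
```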
saraga_hindustani¶
Saraga Dataset Loader
Dataset Info
This dataset contains time-aligned melody, rhythm, and structural annotations of Hindustani Music tracks, extracted from the large open Indian Art Music corpora of CompMusic.
The dataset contains the following manual annotations referring to audio files:
- Section and tempo annotations stored as start and end timestamps together with the name of the section and tempo during the section (in a separate file)
- Sama annotations referring to rhythmic cycle boundaries stored as timestamps
- Phrase annotations stored as timestamps and transcription of the phrases using solfège symbols ({S, r, R, g, G, m, M, P, d, D, n, N})
- Audio features automatically extracted and stored: pitch and tonic.
- The annotations are stored in text files named after the corresponding audio file, with the respective annotation extension appended, for instance: “Bhuvini Dasudane.tempo-manual.txt”.
The dataset contains a total of 108 tracks.
The files of this dataset are shared with the following license: Creative Commons Attribution Non Commercial Share Alike 4.0 International
Dataset compiled by: Bozkurt, B.; Srinivasamurthy, A.; Gulati, S. and Serra, X.
For more information about the dataset, the Indian Art Music (IAM) corpora, and the annotations, please refer to https://mtg.github.io/saraga/, where a detailed description of the data and annotations is published.
-
class
mirdata.datasets.saraga_hindustani.
Dataset
(data_home=None)[source]¶ The saraga_hindustani dataset
Variables: - data_home (str) – path where mirdata will look for the dataset
- name (str) – the identifier of the dataset
- bibtex (str or None) – dataset citation/s in bibtex format
- remotes (dict or None) – data to be downloaded
- readme (str) – information about the dataset
- track (function) – a function which inputs a track_id (str) and returns (mirdata.core.Track or None)
-
choice_track
()[source]¶ Choose a random track
Returns: Track – a Track object instantiated by a random track_id
-
default_path
¶ Get the default path for the dataset
Returns: str – Local path to the dataset
-
download
(partial_download=None, force_overwrite=False, cleanup=False)[source]¶ Download data to save_dir and optionally print a message.
Parameters: - partial_download (list or None) – A list of keys of remotes to partially download. If None, all data is downloaded
- force_overwrite (bool) – If True, existing files are overwritten by the downloaded files.
- cleanup (bool) – Whether to delete any zip/tar files after extracting.
Raises: - ValueError – if invalid keys are passed to partial_download
- IOError – if a downloaded file’s checksum is different from expected
-
load_audio
(*args, **kwargs)[source]¶ Load a Saraga Hindustani audio file.
Parameters: audio_path (str) – path to audio file Returns: - np.ndarray - the mono audio signal
- float - The sample rate of the audio file
-
load_phrases
(*args, **kwargs)[source]¶ Load phrases
Parameters: phrases_path (str) – Local path where the phrase annotation is stored. If None, returns None. Returns: EventData – phrases annotation for track
-
load_pitch
(*args, **kwargs)[source]¶ Load automatically extracted pitch or melody
Parameters: pitch_path (str) – Local path where the pitch annotation is stored. If None, returns None. Returns: F0Data – pitch annotation
-
load_sama
(*args, **kwargs)[source]¶ Load sama
Parameters: sama_path (str) – Local path where the sama annotation is stored. If None, returns None. Returns: BeatData – sama annotations
-
load_sections
(*args, **kwargs)[source]¶ Load track sections
Parameters: sections_path (str) – Local path where the section annotation is stored. Returns: SectionData – section annotations for track
-
load_tempo
(*args, **kwargs)[source]¶ Load tempo from hindustani collection
Parameters: tempo_path (str) – Local path where the tempo annotation is stored. Returns: dict – Dictionary of tempo information with the following keys: - tempo: median tempo for the section in mātrās per minute (MPM)
- matra_interval: tempo expressed as the duration of the mātra (essentially dividing 60 by tempo, expressed in seconds)
- sama_interval: median duration of one tāl cycle in the section
- matras_per_cycle: indicator of the structure of the tāl, showing the number of mātrā in a cycle of the tāl of the recording
- start_time: start time of the section
- duration: duration of the section
-
load_tonic
(*args, **kwargs)[source]¶ Load track absolute tonic
Parameters: tonic_path (str) – Local path where the tonic path is stored. If None, returns None. Returns: int – Tonic annotation in Hz
-
class
mirdata.datasets.saraga_hindustani.
Track
(track_id, data_home)[source]¶ Saraga Hindustani Track class
Parameters: - track_id (str) – track id of the track
- data_home (str) – Local path where the dataset is stored. default=None. If None, looks for the data in the default directory, ~/mir_datasets
Variables: - title (str) – Title of the piece in the track
- mbid (str) – MusicBrainz ID of the track
- album_artists (list, dicts) – list of dicts containing the album artists present in the track and its mbid
- artists (list, dicts) – list of dicts containing information of the featuring artists in the track
- raags (list, dict) – list of dicts containing information about the raags present in the track
- forms (list, dict) – list of dicts containing information about the forms present in the track
- release (list, dicts) – list of dicts containing information of the release where the track is found
- works (list, dicts) – list of dicts containing the work present in the piece, and its mbid
- taals (list, dicts) – list of dicts containing the taals present in the track and its uuid
- layas (list, dicts) – list of dicts containing the layas present in the track and its uuid
Other Parameters: - tonic (float) – tonic annotation
- pitch (F0Data) – pitch annotation
- tempo (dict) – tempo annotations
- sama (BeatData) – Sama section annotations
- sections (SectionData) – track section annotations
- phrases (EventData) – phrase annotations
-
audio
¶ The track’s audio
Returns: - np.ndarray - audio signal
- float - sample rate
-
mirdata.datasets.saraga_hindustani.
load_audio
(audio_path)[source]¶ Load a Saraga Hindustani audio file.
Parameters: audio_path (str) – path to audio file Returns: - np.ndarray - the mono audio signal
- float - The sample rate of the audio file
-
mirdata.datasets.saraga_hindustani.
load_phrases
(phrases_path)[source]¶ Load phrases
Parameters: phrases_path (str) – Local path where the phrase annotation is stored. If None, returns None. Returns: EventData – phrases annotation for track
-
mirdata.datasets.saraga_hindustani.
load_pitch
(pitch_path)[source]¶ Load automatically extracted pitch or melody
Parameters: pitch_path (str) – Local path where the pitch annotation is stored. If None, returns None. Returns: F0Data – pitch annotation
-
mirdata.datasets.saraga_hindustani.
load_sama
(sama_path)[source]¶ Load sama
Parameters: sama_path (str) – Local path where the sama annotation is stored. If None, returns None. Returns: BeatData – sama annotations
-
mirdata.datasets.saraga_hindustani.
load_sections
(sections_path)[source]¶ Load track sections
Parameters: sections_path (str) – Local path where the section annotation is stored. Returns: SectionData – section annotations for track
-
mirdata.datasets.saraga_hindustani.
load_tempo
(tempo_path)[source]¶ Load tempo from hindustani collection
Parameters: tempo_path (str) – Local path where the tempo annotation is stored. Returns: dict – Dictionary of tempo information with the following keys: - tempo: median tempo for the section in mātrās per minute (MPM)
- matra_interval: tempo expressed as the duration of the mātra (essentially dividing 60 by tempo, expressed in seconds)
- sama_interval: median duration of one tāl cycle in the section
- matras_per_cycle: indicator of the structure of the tāl, showing the number of mātrā in a cycle of the tāl of the recording
- start_time: start time of the section
- duration: duration of the section
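The Hindustani tempo dictionary defines matra_interval as "essentially dividing 60 by tempo". That relationship can be written down directly (a sketch using the documented key names; the derived matras-per-cycle estimate is an illustration):

```python
def matra_interval(tempo_mpm):
    """Duration of one matra in seconds, given tempo in matras per minute (MPM)."""
    return 60.0 / tempo_mpm

def implied_matras_per_cycle(tempo):
    """Estimate matras per cycle from the documented sama_interval and tempo keys."""
    return round(tempo["sama_interval"] / matra_interval(tempo["tempo"]))
```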
tinysol¶
TinySOL Dataset Loader.
Dataset Info
TinySOL is a dataset of 2913 samples, each containing a single musical note from one of 14 different instruments:
- Bass Tuba
- French Horn
- Trombone
- Trumpet in C
- Accordion
- Contrabass
- Violin
- Viola
- Violoncello
- Bassoon
- Clarinet in B-flat
- Flute
- Oboe
- Alto Saxophone
These sounds were originally recorded at Ircam in Paris (France) between 1996 and 1999, as part of a larger project named Studio On Line (SOL). Although SOL contains many combinations of mutes and extended playing techniques, TinySOL consists solely of sounds played in the so-called “ordinary” style, without mutes.
TinySOL can be used for education and research purposes. In particular, it can be employed as a dataset for training and/or evaluating music information retrieval (MIR) systems, for tasks such as instrument recognition or fundamental frequency estimation. For this purpose, we provide an official 5-fold split of TinySOL as a metadata attribute. This split has been carefully balanced in terms of instrumentation, pitch range, and dynamics. For the sake of research reproducibility, we encourage users of TinySOL to adopt this split and report their results in terms of average performance across folds.
We encourage TinySOL users to subscribe to the Ircam Forum to gain access to larger versions of SOL.
For more details, please visit: https://www.orch-idea.org/
-
class
mirdata.datasets.tinysol.
Dataset
(data_home=None)[source]¶ The tinysol dataset
Variables: - data_home (str) – path where mirdata will look for the dataset
- name (str) – the identifier of the dataset
- bibtex (str or None) – dataset citation/s in bibtex format
- remotes (dict or None) – data to be downloaded
- readme (str) – information about the dataset
- track (function) – a function which inputs a track_id (str) and returns (mirdata.core.Track or None)
-
choice_track
()[source]¶ Choose a random track
Returns: Track – a Track object instantiated by a random track_id
-
default_path
¶ Get the default path for the dataset
Returns: str – Local path to the dataset
-
download
(partial_download=None, force_overwrite=False, cleanup=False)[source]¶ Download data to save_dir and optionally print a message.
Parameters: - partial_download (list or None) – A list of keys of remotes to partially download. If None, all data is downloaded
- force_overwrite (bool) – If True, existing files are overwritten by the downloaded files.
- cleanup (bool) – Whether to delete any zip/tar files after extracting.
Raises: - ValueError – if invalid keys are passed to partial_download
- IOError – if a downloaded file’s checksum is different from expected
-
load_audio
(*args, **kwargs)[source]¶ Load a TinySOL audio file.
Parameters: audio_path (str) – path to audio file Returns: - np.ndarray - the mono audio signal
- float - The sample rate of the audio file
-
class
mirdata.datasets.tinysol.
Track
(track_id, data_home)[source]¶ tinysol Track class
Parameters: track_id (str) – track id of the track
Variables: - audio_path (str) – path of the audio file
- dynamics (str) – dynamics abbreviation. Ex: pp, mf, ff, etc.
- dynamics_id (int) – pp=0, p=1, mf=2, f=3, ff=4
- family (str) – instrument family encoded by its English name
- instance_id (int) – instance ID. Either equal to 0, 1, 2, or 3.
- instrument_abbr (str) – instrument abbreviation
- instrument_full (str) – instrument encoded by its English name
- is_resampled (bool) – True if this sample was pitch-shifted from a neighbor; False if it was genuinely recorded.
- pitch (str) – string containing English pitch class and octave number
- pitch_id (int) – MIDI note index, where middle C (“C4”) corresponds to 60
- string_id (NoneType) – string ID. By musical convention, the first string is the highest. On wind instruments, this is replaced by None.
- technique_abbr (str) – playing technique abbreviation
- technique_full (str) – playing technique encoded by its English name
- track_id (str) – track id
-
audio
¶ The track’s audio
Returns: - np.ndarray - audio signal
- float - sample rate
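Given a {track_id: track} mapping (as load_tracks returns for the other loaders documented above) and the Track attributes listed here, selecting all samples of one instrument is a short filter. A sketch against the documented instrument_full attribute:

```python
def tracks_by_instrument(tracks, instrument):
    """Filter a {track_id: track} dict by the documented instrument_full attribute."""
    return {
        tid: track
        for tid, track in tracks.items()
        if track.instrument_full == instrument
    }

# Hedged usage (requires local data):
# import mirdata
# tracks = mirdata.initialize('tinysol').load_tracks()
# violins = tracks_by_instrument(tracks, 'Violin')
```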
tonality_classicaldb¶
Tonality classicalDB Dataset Loader
Dataset Info
The Tonality classicalDB Dataset includes 881 classical pieces in different styles, spanning the 17th to the 20th centuries, annotated with single-key labels.
Tonality classicalDB Dataset was created as part of:
Gómez, E. (2006). PhD Thesis. Tonal description of music audio signals.
Department of Information and Communication Technologies.
This dataset is mainly intended to assess the performance of computational key estimation algorithms in classical music.
2020 note: The audio is private. If you don’t have the original audio collection, you can recreate it from your own collection, since most of the recordings are well known. To this end, we provide MusicBrainz metadata. Moreover, we have added the spectrum and HPCP chromagram of each audio file.
This dataset can be used with mirdata library: https://github.com/mir-dataset-loaders/mirdata
Spectrum features have been computed as shown here: https://github.com/mir-dataset-loaders/mirdata-notebooks/blob/master/Tonality_classicalDB/ClassicalDB_spectrum_features.ipynb
HPCP chromagrams have been computed as shown here: https://github.com/mir-dataset-loaders/mirdata-notebooks/blob/master/Tonality_classicalDB/ClassicalDB_HPCP_features.ipynb
MusicBrainz metadata has been gathered as shown here: https://github.com/mir-dataset-loaders/mirdata-notebooks/blob/master/Tonality_classicalDB/ClassicalDB_musicbrainz_metadata.ipynb
-
class
mirdata.datasets.tonality_classicaldb.
Dataset
(data_home=None)[source]¶ The tonality_classicaldb dataset
Variables: - data_home (str) – path where mirdata will look for the dataset
- name (str) – the identifier of the dataset
- bibtex (str or None) – dataset citation/s in bibtex format
- remotes (dict or None) – data to be downloaded
- readme (str) – information about the dataset
- track (function) – a function which inputs a track_id (str) and returns (mirdata.core.Track or None)
-
choice_track
()[source]¶ Choose a random track
Returns: Track – a Track object instantiated by a random track_id
-
default_path
¶ Get the default path for the dataset
Returns: str – Local path to the dataset
-
download
(partial_download=None, force_overwrite=False, cleanup=False)[source]¶ Download data to save_dir and optionally print a message.
Parameters: - partial_download (list or None) – A list of keys of remotes to partially download. If None, all data is downloaded
- force_overwrite (bool) – If True, existing files are overwritten by the downloaded files.
- cleanup (bool) – Whether to delete any zip/tar files after extracting.
Raises: - ValueError – if invalid keys are passed to partial_download
- IOError – if a downloaded file’s checksum is different from expected
-
load_audio
(*args, **kwargs)[source]¶ Load a Tonality classicalDB audio file.
Parameters: audio_path (str) – path to audio file Returns: - np.ndarray - the mono audio signal
- float - The sample rate of the audio file
-
load_hpcp
(*args, **kwargs)[source]¶ Load Tonality classicalDB HPCP feature from a file
Parameters: hpcp_path (str) – path to HPCP file Returns: np.array – loaded HPCP data
-
load_key
(*args, **kwargs)[source]¶ Load Tonality classicalDB format key data from a file
Parameters: keys_path (str) – path to key annotation file Returns: str – musical key data
-
load_musicbrainz
(*args, **kwargs)[source]¶ Load Tonality classicalDB MusicBrainz metadata from a file
Parameters: musicbrainz_path (str) – path to musicbrainz metadata file Returns: dict – musicbrainz metadata
-
load_spectrum
(*args, **kwargs)[source]¶ Load Tonality classicalDB spectrum data from a file
Parameters: spectrum_path (str) – path to spectrum file Returns: np.array – spectrum data
-
class
mirdata.datasets.tonality_classicaldb.
Track
(track_id, data_home)[source]¶ tonality_classicaldb track class
Parameters: track_id (str) – track id of the track
Variables: - audio_path (str) – track audio path
- key_path (str) – key annotation path
- title (str) – title of the track
- track_id (str) – track id
Other Parameters: - key (str) – key annotation
- spectrum (np.array) – computed audio spectrum
- hpcp (np.array) – computed hpcp
- musicbrainz_metadata (dict) – MusicBrainz metadata
-
audio
¶ The track’s audio
Returns: - np.ndarray - audio signal
- float - sample rate
-
mirdata.datasets.tonality_classicaldb.
load_audio
(audio_path)[source]¶ Load a Tonality classicalDB audio file.
Parameters: audio_path (str) – path to audio file Returns: - np.ndarray - the mono audio signal
- float - The sample rate of the audio file
-
mirdata.datasets.tonality_classicaldb.
load_hpcp
(hpcp_path)[source]¶ Load Tonality classicalDB HPCP feature from a file
Parameters: hpcp_path (str) – path to HPCP file Returns: np.array – loaded HPCP data
-
mirdata.datasets.tonality_classicaldb.
load_key
(keys_path)[source]¶ Load Tonality classicalDB format key data from a file
Parameters: keys_path (str) – path to key annotation file Returns: str – musical key data
Core¶
Core mirdata classes
-
class
mirdata.core.
Dataset
(data_home=None, index=None, name=None, track_object=None, bibtex=None, remotes=None, download_info=None, license_info=None)[source]¶ mirdata Dataset object
Variables: - data_home (str) – path where mirdata will look for the dataset
- name (str) – the identifier of the dataset
- bibtex (str or None) – dataset citation/s in bibtex format
- remotes (dict or None) – data to be downloaded
- readme (str) – information about the dataset
- track (function) – a function which inputs a track_id (str) and returns (mirdata.core.Track or None)
-
__init__
(data_home=None, index=None, name=None, track_object=None, bibtex=None, remotes=None, download_info=None, license_info=None)[source]¶ Dataset init method
Parameters: - data_home (str or None) – path where mirdata will look for the dataset
- index (dict or None) – the dataset’s file index
- name (str or None) – the identifier of the dataset
- track_object (mirdata.core.Track or None) – an uninstantiated Track object
- bibtex (str or None) – dataset citation/s in bibtex format
- remotes (dict or None) – data to be downloaded
- download_info (str or None) – download instructions or caveats
- license_info (str or None) – license of the dataset
-
choice_track
()[source]¶ Choose a random track
Returns: Track – a Track object instantiated by a random track_id
-
default_path
¶ Get the default path for the dataset
Returns: str – Local path to the dataset
-
download
(partial_download=None, force_overwrite=False, cleanup=False)[source]¶ Download data to save_dir and optionally print a message.
Parameters: - partial_download (list or None) – A list of keys of remotes to partially download. If None, all data is downloaded
- force_overwrite (bool) – If True, existing files are overwritten by the downloaded files.
- cleanup (bool) – Whether to delete any zip/tar files after extracting.
Raises: - ValueError – if invalid keys are passed to partial_download
- IOError – if a downloaded file’s checksum is different from the expected checksum
-
class
mirdata.core.
MultiTrack
[source]¶ MultiTrack class.
A multitrack class is a collection of track objects and their associated audio that can be mixed together. A multitrack is itself a Track, and can have its own associated audio (such as a mastered mix), its own metadata and its own annotations.
-
get_mix
()[source]¶ Create a linear mixture of all tracks in the multitrack.
Returns: np.ndarray – mixture audio with shape (n_samples, n_channels)
-
get_random_target
(n_tracks=None, min_weight=0.3, max_weight=1.0)[source]¶ Get a random target by combining a random selection of tracks with random weights
Parameters: - n_tracks (int or None) – number of tracks to randomly mix. If None, uses all tracks
- min_weight (float) – minimum possible weight when mixing
- max_weight (float) – maximum possible weight when mixing
Returns: - np.ndarray - mixture audio with shape (n_samples, n_channels)
- list - list of keys of included tracks
- list - list of weights used to mix tracks
-
get_target
(track_keys, weights=None, average=True, enforce_length=True)[source]¶ Get target which is a linear mixture of tracks
Parameters: - track_keys (list) – list of track keys to mix together
- weights (list or None) – list of positive scalars to be used in the average
- average (bool) – if True, computes a weighted average of the tracks if False, computes a weighted sum of the tracks
- enforce_length (bool) – If True, raises ValueError if the tracks are not the same length. If False, pads audio with zeros to match the length of the longest track
Returns: np.ndarray – target audio with shape (n_channels, n_samples)
Raises: ValueError – if the sample rates of the tracks are not equal, or if enforce_length=True and the track lengths are not equal
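The weighted mixing that get_target describes can be sketched with plain NumPy. This is an illustrative standalone function operating on hypothetical equal-length mono arrays, not the mirdata implementation:

```python
import numpy as np

def mix_tracks(audio_list, weights=None, average=True):
    """Linearly combine equal-length mono signals, as get_target does conceptually."""
    stacked = np.stack(audio_list)           # shape (n_tracks, n_samples)
    if weights is None:
        weights = np.ones(len(audio_list))
    weights = np.asarray(weights, dtype=float)
    target = (weights[:, None] * stacked).sum(axis=0)
    if average:
        target = target / weights.sum()      # weighted average rather than weighted sum
    return target

# two constant "tracks": an equal-weight average of 1.0 and 3.0 gives 2.0 everywhere
mix = mix_tracks([np.ones(4), 3 * np.ones(4)])
```

With average=False the same call returns the weighted sum instead, matching the documented behavior of the average flag.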
-
-
class
mirdata.core.
Track
[source]¶ Track base class
See the docs for each dataset loader’s Track class for details
-
class
mirdata.core.
cached_property
(func)[source]¶ Cached property decorator
A property that is only computed once per instance and then replaces itself with an ordinary attribute. Deleting the attribute resets the property. Source: https://github.com/bottlepy/bottle/commit/fa7733e075da0d790d809aa3d2f53071897e6f76
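The recipe the docstring cites (a non-data descriptor that caches into the instance dict) can be reimplemented in a few lines. This is a minimal sketch of the pattern, not mirdata's exact source:

```python
class cached_property:
    """Compute the value once per instance, then shadow the descriptor
    with an ordinary instance attribute. Deleting the attribute resets it."""

    def __init__(self, func):
        self.func = func
        self.__doc__ = func.__doc__

    def __get__(self, obj, cls):
        if obj is None:
            return self
        # storing into obj.__dict__ shadows this (non-data) descriptor
        value = obj.__dict__[self.func.__name__] = self.func(obj)
        return value

class Example:
    def __init__(self):
        self.calls = 0

    @cached_property
    def data(self):
        self.calls += 1
        return [1, 2, 3]

e = Example()
_ = e.data
_ = e.data  # second access hits the cached attribute, not the function
```

Because the descriptor defines only __get__, the cached instance attribute takes precedence on later lookups, which is what makes the caching work.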
-
mirdata.core.
copy_docs
(original)[source]¶ Decorator function to copy docs from one function to another
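A plausible minimal implementation of such a docstring-copying decorator (the actual mirdata helper may differ in detail):

```python
def copy_docs(original):
    """Decorator factory: copy the docstring of `original` onto the decorated function."""
    def wrapper(func):
        func.__doc__ = original.__doc__
        return func
    return wrapper

def load_audio(audio_path):
    """Load an audio file and return (signal, sample_rate)."""
    raise NotImplementedError

@copy_docs(load_audio)
def load_audio_wrapped(*args, **kwargs):
    raise NotImplementedError
```

This is how dataset modules can expose thin wrappers (e.g. the *args, **kwargs loaders above) while keeping the documentation of the underlying function.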
Annotations¶
mirdata annotation data types
-
class
mirdata.annotations.
BeatData
(times, positions=None)[source]¶ BeatData object
Variables: - times (np.ndarray) – array of time stamps (as floats) in seconds with positive, strictly increasing values
- positions (np.ndarray or None) – array of beat positions (as ints) e.g. 1, 2, 3, 4
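The "positive, strictly increasing" invariant on times can be sketched as a small checker. This is a hypothetical helper for illustration, not mirdata's internal validation:

```python
import numpy as np

def check_beat_times(times):
    """Check the BeatData invariant: positive, strictly increasing time stamps."""
    times = np.asarray(times, dtype=float)
    if np.any(times < 0):
        raise ValueError("beat times must be positive")
    if np.any(np.diff(times) <= 0):
        raise ValueError("beat times must be strictly increasing")
    return True

ok = check_beat_times([0.5, 1.0, 1.5, 2.0])
```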
-
class
mirdata.annotations.
ChordData
(intervals, labels, confidence=None)[source]¶ ChordData object
Variables: - intervals (np.ndarray or None) – (n x 2) array of intervals (as floats) in seconds in the form [start_time, end_time] with positive time stamps and end_time >= start_time.
- labels (list) – list of chord labels (as strings)
- confidence (np.ndarray or None) – array of confidence values between 0 and 1
-
class
mirdata.annotations.
EventData
(intervals, events)[source]¶ EventData object
Variables: - intervals (np.ndarray) – (n x 2) array of intervals (as floats) in seconds in the form [start_time, end_time] with positive time stamps and end_time >= start_time.
- events (list) – list of event labels (as strings)
-
class
mirdata.annotations.
F0Data
(times, frequencies, confidence=None)[source]¶ F0Data object
Variables: - times (np.ndarray) – array of time stamps (as floats) in seconds with positive, strictly increasing values
- frequencies (np.ndarray) – array of frequency values (as floats) in Hz
- confidence (np.ndarray or None) – array of confidence values between 0 and 1
-
class
mirdata.annotations.
KeyData
(intervals, keys)[source]¶ KeyData object
Variables: - intervals (np.ndarray) – (n x 2) array of intervals (as floats) in seconds in the form [start_time, end_time] with positive time stamps and end_time >= start_time.
- keys (list) – list of key labels (as strings)
-
class
mirdata.annotations.
LyricData
(intervals, lyrics, pronunciations=None)[source]¶ LyricData object
Variables: - intervals (np.ndarray) – (n x 2) array of intervals (as floats) in seconds in the form [start_time, end_time] with positive time stamps and end_time >= start_time.
- lyrics (list) – list of lyrics (as strings)
- pronunciations (list or None) – list of pronunciations (as strings)
-
class
mirdata.annotations.
MultiF0Data
(times, frequency_list, confidence_list=None)[source]¶ MultiF0Data object
Variables: - times (np.ndarray) – array of time stamps (as floats) in seconds with positive, strictly increasing values
- frequency_list (list) – list of lists of frequency values (as floats) in Hz
- confidence_list (list or None) – list of lists of confidence values between 0 and 1
-
class
mirdata.annotations.
NoteData
(intervals, notes, confidence=None)[source]¶ NoteData object
Variables: - intervals (np.ndarray) – (n x 2) array of intervals (as floats) in seconds in the form [start_time, end_time] with positive time stamps and end_time >= start_time.
- notes (np.ndarray) – array of notes (as floats) in Hz
- confidence (np.ndarray or None) – array of confidence values between 0 and 1
-
class
mirdata.annotations.
SectionData
(intervals, labels=None)[source]¶ SectionData object
Variables: - intervals (np.ndarray) – (n x 2) array of intervals (as floats) in seconds in the form [start_time, end_time] times should be positive and intervals should have non-negative duration
- labels (list or None) – list of labels (as strings)
-
class
mirdata.annotations.
TempoData
(intervals, value, confidence=None)[source]¶ TempoData object
Variables: - intervals (np.ndarray) – (n x 2) array of intervals (as floats) in seconds in the form [start_time, end_time] with positive time stamps and end_time >= start_time.
- value (list) – list of tempo values (as floats)
- confidence (np.ndarray or None) – array of confidence values between 0 and 1
-
mirdata.annotations.
validate_array_like
(array_like, expected_type, expected_dtype, none_allowed=False)[source]¶ Validate that array-like object is well formed
If array_like is None, validation passes automatically.
Parameters: - array_like (array-like) – object to validate
- expected_type (type) – expected type, either list or np.ndarray
- expected_dtype (type) – expected dtype
- none_allowed (bool) – if True, allows array to be None
Raises: - TypeError – if type/dtype does not match expected_type/expected_dtype
- ValueError – if array
-
mirdata.annotations.
validate_confidence
(confidence)[source]¶ Validate if confidence is well-formed.
If confidence is None, validation passes automatically
Parameters: confidence (np.ndarray) – an array of confidence values Raises: ValueError
– if confidence values are not between 0 and 1
-
mirdata.annotations.
validate_intervals
(intervals)[source]¶ Validate if intervals are well-formed.
If intervals is None, validation passes automatically
Parameters: intervals (np.ndarray) – (n x 2) array
Raises: ValueError – if intervals have an invalid shape, contain negative values, or have end times smaller than start times.
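The documented checks map directly onto NumPy operations. A sketch of the same logic (an illustrative stand-in, not mirdata's source):

```python
import numpy as np

def check_intervals(intervals):
    """Sketch of the documented checks: (n, 2) shape, non-negative times, end >= start."""
    intervals = np.asarray(intervals, dtype=float)
    if intervals.ndim != 2 or intervals.shape[1] != 2:
        raise ValueError("intervals must have shape (n, 2)")
    if np.any(intervals < 0):
        raise ValueError("interval times must be non-negative")
    if np.any(intervals[:, 1] < intervals[:, 0]):
        raise ValueError("end times must not be smaller than start times")
    return True

ok = check_intervals([[0.0, 1.0], [1.0, 2.5]])  # well-formed: no exception
```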
Advanced¶
mirdata.validate¶
Utility functions for mirdata
-
mirdata.validate.
log_message
(message, verbose=True)[source]¶ Helper function to log message
Parameters: - message (str) – message to log
- verbose (bool) – if false, the message is not logged
-
mirdata.validate.
md5
(file_path)[source]¶ Get md5 hash of a file.
Parameters: file_path (str) – File path Returns: str – md5 hash of data in file_path
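Hashing a file for checksum validation is a standard hashlib pattern; reading in chunks avoids loading large audio files into memory. A minimal sketch along the same lines (not necessarily mirdata's exact chunk size):

```python
import hashlib
import os
import tempfile

def md5(file_path):
    """Return the md5 hex digest of the file at file_path, read in chunks."""
    hasher = hashlib.md5()
    with open(file_path, "rb") as fhandle:
        for chunk in iter(lambda: fhandle.read(8192), b""):
            hasher.update(chunk)
    return hasher.hexdigest()

# demo on a throwaway file
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello")
    path = tmp.name
digest = md5(path)  # md5 of b"hello"
os.unlink(path)
```

The resulting hex digest is what validate compares against the checksum stored in the dataset index.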
-
mirdata.validate.
validate
(local_path, checksum)[source]¶ Validate that a file exists and has the correct checksum
Parameters: - local_path (str) – file path
- checksum (str) – md5 checksum
Returns: - bool - True if file exists
- bool - True if checksum matches
-
mirdata.validate.
validate_files
(file_dict, data_home, verbose)[source]¶ Validate files
Parameters: - file_dict (dict) – dictionary of file information
- data_home (str) – path where the data lives
- verbose (bool) – if True, show progress
Returns: - dict - missing files
- dict - files with invalid checksums
-
mirdata.validate.
validate_index
(dataset_index, data_home, verbose=True)[source]¶ Validate files in a dataset’s index
Parameters: - dataset_index (list) – dataset indices
- data_home (str) – Local home path that the dataset is being stored
- verbose (bool) – if true, prints validation status while running
Returns: - dict - file paths that are in the index but missing locally
- dict - file paths with differing checksums
-
mirdata.validate.
validate_metadata
(file_dict, data_home, verbose)[source]¶ Validate files
Parameters: - file_dict (dict) – dictionary of file information
- data_home (str) – path where the data lives
- verbose (bool) – if True, show progress
Returns: - dict - missing files
- dict - files with invalid checksums
-
mirdata.validate.
validator
(dataset_index, data_home, verbose=True)[source]¶ Checks the existence and validity of files stored locally with respect to the paths and file checksums stored in the reference index. Logs invalid checksums and missing files.
Parameters: - dataset_index (list) – dataset indices
- data_home (str) – Local home path that the dataset is being stored
- verbose (bool) – if True (default), prints missing and invalid files to stdout. Otherwise, this function is equivalent to validate_index.
Returns: - missing_files (list) – file paths that are in the dataset index but missing locally
- invalid_checksums (list) – file paths that exist in the dataset index but whose local checksum differs from the reference checksum
mirdata.download_utils¶
Utilities for downloading from the web.
-
class
mirdata.download_utils.
DownloadProgressBar
(iterable=None, desc=None, total=None, leave=True, file=None, ncols=None, mininterval=0.1, maxinterval=10.0, miniters=None, ascii=None, disable=False, unit='it', unit_scale=False, dynamic_ncols=False, smoothing=0.3, bar_format=None, initial=0, position=None, postfix=None, unit_divisor=1000, write_bytes=None, lock_args=None, nrows=None, colour=None, gui=False, **kwargs)[source]¶ Wrap tqdm to show download progress
-
class
mirdata.download_utils.
RemoteFileMetadata
(filename, url, checksum, destination_dir)[source]¶ The metadata for a remote file
Variables: - filename (str) – the remote file’s basename
- url (str) – the remote file’s url
- checksum (str) – the remote file’s md5 checksum
- destination_dir (str or None) – the relative path for where to save the file
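A dataset's REMOTES dict maps keys to these named tuples; the downloader consumes that dict. A sketch of how an entry is built, using a stand-in namedtuple and placeholder URL/checksum values (not a real dataset entry):

```python
from collections import namedtuple

# stand-in with the four documented fields; the real class lives in
# mirdata.download_utils
RemoteFileMetadata = namedtuple(
    "RemoteFileMetadata", ["filename", "url", "checksum", "destination_dir"]
)

# hypothetical remote: url and checksum here are illustrative placeholders
REMOTES = {
    "annotations": RemoteFileMetadata(
        filename="annotations.zip",
        url="https://example.com/annotations.zip",
        checksum="d41d8cd98f00b204e9800998ecf8427e",
        destination_dir="annotations",
    )
}
```

Keys of this dict ("annotations" above) are what partial_download accepts to fetch only a subset of the remotes.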
-
mirdata.download_utils.
download_from_remote
(remote, save_dir, force_overwrite)[source]¶ Download a remote dataset file into save_dir. Fetches the file at the remote’s url, saves it under the remote’s filename, and verifies its integrity against the remote’s MD5 checksum.
Adapted from scikit-learn’s sklearn.datasets.base._fetch_remote.
Parameters: - remote (RemoteFileMetadata) – Named tuple containing remote dataset meta information: url, filename and checksum
- save_dir (str) – Directory to save the file to. Usually data_home
- force_overwrite (bool) – If True, overwrite existing file with the downloaded file. If False, does not overwrite, but checks that checksum is consistent.
Returns: str – Full path of the created file.
-
mirdata.download_utils.
download_tar_file
(tar_remote, save_dir, force_overwrite, cleanup)[source]¶ Download and untar a tar file.
Parameters: - tar_remote (RemoteFileMetadata) – Object containing download information
- save_dir (str) – Path to save downloaded file
- force_overwrite (bool) – If True, overwrites existing files
- cleanup (bool) – If True, remove tarfile after untarring
-
mirdata.download_utils.
download_zip_file
(zip_remote, save_dir, force_overwrite, cleanup)[source]¶ Download and unzip a zip file.
Parameters: - zip_remote (RemoteFileMetadata) – Object containing download information
- save_dir (str) – Path to save downloaded file
- force_overwrite (bool) – If True, overwrites existing files
- cleanup (bool) – If True, remove zipfile after unzipping
-
mirdata.download_utils.
downloader
(save_dir, remotes=None, partial_download=None, info_message=None, force_overwrite=False, cleanup=False)[source]¶ Download data to save_dir and optionally log a message.
Parameters: - save_dir (str) – The directory to download the data
- remotes (dict or None) – A dictionary of RemoteFileMetadata tuples of data in zip format. If None, there is no data to download
- partial_download (list or None) – A list of keys of remotes to partially download. If None, all data is downloaded
- info_message (str or None) – A string of info to log when this function is called. If None, no string is logged.
- force_overwrite (bool) – If True, existing files are overwritten by the downloaded files.
- cleanup (bool) – Whether to delete the zip/tar file after extracting.
-
mirdata.download_utils.
extractall_unicode
(zfile, out_dir)[source]¶ Extract all files inside a zip archive to an output directory.
Unlike plain zipfile extraction, it checks for correct file name encoding
Parameters: - zfile (obj) – Zip file object created with zipfile.ZipFile
- out_dir (str) – Output folder
mirdata.jams_utils¶
Utilities for converting mirdata Annotation objects to jams format.
-
mirdata.jams_utils.
beats_to_jams
(beat_data, description=None)[source]¶ Convert beat annotations into jams format.
Parameters: - beat_data (annotations.BeatData) – beat data object
- description (str) – annotation description
Returns: jams.Annotation – jams annotation object.
-
mirdata.jams_utils.
chords_to_jams
(chord_data, description=None)[source]¶ Convert chord annotations into jams format.
Parameters: - chord_data (annotations.ChordData) – chord data object
- description (str) – annotation description
Returns: jams.Annotation – jams annotation object.
-
mirdata.jams_utils.
events_to_jams
(event_data, description=None)[source]¶ Convert events annotations into jams format.
Parameters: - event_data (annotations.EventData) – event data object
- description (str) – annotation description
Returns: jams.Annotation – jams annotation object.
-
mirdata.jams_utils.
f0s_to_jams
(f0_data, description=None)[source]¶ Convert f0 annotations into jams format.
Parameters: - f0_data (annotations.F0Data) – f0 annotation object
- description (str) – annotation description
Returns: jams.Annotation – jams annotation object.
-
mirdata.jams_utils.
jams_converter
(audio_path=None, spectrogram_path=None, beat_data=None, chord_data=None, note_data=None, f0_data=None, section_data=None, multi_section_data=None, tempo_data=None, event_data=None, key_data=None, lyrics_data=None, tags_gtzan_data=None, tags_open_data=None, metadata=None)[source]¶ Convert annotations from a track to JAMS format.
Parameters: - audio_path (str or None) – A path to the corresponding audio file, or None. If provided, the audio file will be read to compute the duration. If None, ‘duration’ must be a field in the metadata dictionary, or the resulting jam object will not validate.
- spectrogram_path (str or None) – A path to the corresponding spectrogram file, or None.
- beat_data (list or None) – A list of tuples of (annotations.BeatData, str), where str describes the annotation (e.g. ‘beats_1’).
- chord_data (list or None) – A list of tuples of (annotations.ChordData, str), where str describes the annotation.
- note_data (list or None) – A list of tuples of (annotations.NoteData, str), where str describes the annotation.
- f0_data (list or None) – A list of tuples of (annotations.F0Data, str), where str describes the annotation.
- section_data (list or None) – A list of tuples of (annotations.SectionData, str), where str describes the annotation.
- multi_section_data (list or None) – A list of tuples. Tuples in multi_section_data should contain another list of tuples, indicating annotations at the different levels, e.g. ([(segments0, level0), (segments1, level1)], annotator), and a str indicating the annotator
- tempo_data (list or None) – A list of tuples of (float, str), where float gives the tempo in bpm and str describes the annotation.
- event_data (list or None) – A list of tuples of (annotations.EventData, str), where str describes the annotation.
- key_data (list or None) – A list of tuples of (annotations.KeyData, str), where str describes the annotation.
- lyrics_data (list or None) – A list of tuples of (annotations.LyricData, str), where str describes the annotation.
- tags_gtzan_data (list or None) – A list of tuples of (str, str), where the first str is the tag and the second is a descriptor of the annotation.
- tags_open_data (list or None) – A list of tuples of (str, str), where the first str is the tag and the second is a descriptor of the annotation.
- metadata (dict or None) – A dictionary containing the track metadata.
Returns: jams.JAMS – A JAMS object containing the annotations.
-
mirdata.jams_utils.
keys_to_jams
(key_data, description)[source]¶ Convert key annotations into jams format.
Parameters: - key_data (annotations.KeyData) – key data object
- description (str) – annotation description
Returns: jams.Annotation – jams annotation object.
-
mirdata.jams_utils.
lyrics_to_jams
(lyric_data, description=None)[source]¶ Convert lyric annotations into jams format.
Parameters: - lyric_data (annotations.LyricData) – lyric annotation object
- description (str) – annotation description
Returns: jams.Annotation – jams annotation object.
-
mirdata.jams_utils.
multi_sections_to_jams
(multisection_data, description)[source]¶ Convert multi-section annotations into jams format.
Parameters: - multisection_data (list) – list of tuples of the form [(SectionData, int)]
- description (str) – annotation description
Returns: jams.Annotation – jams annotation object.
-
mirdata.jams_utils.
notes_to_jams
(note_data, description)[source]¶ Convert note annotations into jams format.
Parameters: - note_data (annotations.NoteData) – note data object
- description (str) – annotation description
Returns: jams.Annotation – jams annotation object.
-
mirdata.jams_utils.
sections_to_jams
(section_data, description=None)[source]¶ Convert section annotations into jams format.
Parameters: - section_data (annotations.SectionData) – section data object
- description (str) – annotation description
Returns: jams.Annotation – jams annotation object.
-
mirdata.jams_utils.
tag_to_jams
(tag_data, namespace='tag_open', description=None)[source]¶ Convert tag annotations into jams format.
Parameters: - tag_data (list) – tag annotation data
- namespace (str) – the jams-compatible tag namespace
- description (str) – annotation description
Returns: jams.Annotation – jams annotation object.