Last active
December 7, 2022 20:07
Free to use Audio DataSets - Music, Speech, MIDI
Here is a list of audio datasets; most of them are free to use under their respective licenses.
1. AudioSet by Google Research: 10-second sound clips drawn from 2,084,320 YouTube videos, annotated with 527 labels. The clips cover a wide variety of everyday sounds.
License: The dataset is under a Creative Commons Attribution 4.0 International (CC BY 4.0) license, while the ontology is available under a Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.
Link: https://research.google.com/audioset/index.html
2. MAPS Database: A piano database for multipitch estimation and automatic transcription of music. 31 GB of CD-quality recordings in .wav format.
License: CC BY-NC-SA 2.0 FR
Link: https://www.tsi.telecom-paristech.fr/aao/en/2010/07/08/maps-database-a-piano-database-for-multipitch-estimation-and-automatic-transcription-of-music/
3. Free Music Archive (FMA) Dataset: Full-length, high-quality audio from 106,574 tracks by 16,341 artists across 14,854 albums, arranged in a hierarchical taxonomy of 161 genres.
License: Creative Commons Attribution 4.0 International License (CC BY 4.0)
Link: https://github.com/mdeff/fma
4. Freesound: More than 445k sounds contributed by the community. It is well worth checking out, and you can contribute your own sounds.
License: Various Creative Commons licenses depending on the sound: Zero (CC0), Attribution (BY), Attribution-NonCommercial (BY-NC), etc.
Link: https://freesound.org
5. Urban Sound Dataset: 27 hours of audio with 18.5 hours of annotated sound event occurrences across 10 sound classes.
License: Creative Commons Attribution-NonCommercial License (BY-NC), version 3.0
Link: https://urbansounddataset.weebly.com
6. The NSynth Dataset: An audio dataset containing ~300k musical notes, each with a unique pitch, timbre, and envelope. Each note is annotated with three additional pieces of information based on a combination of human evaluation and heuristic algorithms: Source, Family, and Qualities.
License: Creative Commons Attribution 4.0 International (CC BY 4.0) license
Link: https://magenta.tensorflow.org/datasets/nsynth
7. Groove MIDI Dataset: The Groove MIDI Dataset (GMD) is composed of 13.6 hours of aligned MIDI and (synthesized) audio of human-performed, tempo-aligned expressive drumming captured on a Roland TD-11 V-Drum electronic drum kit.
License: Creative Commons Attribution 4.0 International (CC BY 4.0) license
Link: https://magenta.tensorflow.org/datasets/groove
8. Common Voice by Mozilla: A multi-language dataset of voice recordings in .mp3 format. It is already used to train many speech recognition AI systems.
License: CC0
Link: https://commonvoice.mozilla.org/en/datasets
9. TED-LIUM: A corpus of English-language TED talks with transcriptions, sampled at 16 kHz. It contains about 118 hours of speech.
License: Creative Commons BY-NC-ND 3.0
Link: https://www.tensorflow.org/datasets/catalog/tedlium
10. CREMA-D (Crowd-sourced Emotional Multimodal Actors Dataset): A dataset of 7,442 original clips from 91 actors. In the clips, actors speak from a selection of 12 sentences using one of six different emotions (Anger, Disgust, Fear, Happy, Neutral, and Sad) and four different emotion levels (Low, Medium, High, and Unspecified).
License: Open Database License
Link: https://github.com/CheyneyComputerScience/CREMA-D
Apart from the above, if you have video recordings and want to extract the audio from them, use the powerful open-source tool 'ffmpeg': https://stackoverflow.com/questions/9913032/how-can-i-extract-audio-from-video-with-ffmpeg
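
As a rough illustration of the extraction step, here is a small Python wrapper around the ffmpeg command line. It is only a sketch: the helper names `build_extract_command` and `extract_audio`, the file paths, and the defaults of mono audio at 16 kHz are assumptions chosen for this example, not anything prescribed by the datasets above. It uses the standard ffmpeg options `-i` (input), `-vn` (drop the video stream), `-ac` (audio channels), and `-ar` (audio sample rate).

```python
import shutil
import subprocess


def build_extract_command(video_path, audio_path, sample_rate=16000, channels=1):
    """Build an ffmpeg argument list that writes only the audio track."""
    return [
        "ffmpeg",
        "-i", video_path,         # input video file
        "-vn",                    # ignore the video stream
        "-ac", str(channels),     # number of audio channels
        "-ar", str(sample_rate),  # audio sample rate in Hz
        audio_path,               # output format is inferred from the extension
    ]


def extract_audio(video_path, audio_path, **options):
    """Run ffmpeg; raises if ffmpeg is missing or exits with an error."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg was not found on PATH")
    command = build_extract_command(video_path, audio_path, **options)
    subprocess.run(command, check=True)
```

For example, `extract_audio("talk.mp4", "talk.wav")` would write a mono 16 kHz WAV file next to a hypothetical `talk.mp4`. Passing the arguments as a list rather than a shell string avoids quoting problems with file names that contain spaces.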