FSD is a large-scale, general-purpose audio dataset

Thousands of audio samples from Freesound organised following the AudioSet Ontology


FSD: a dataset of everyday sounds

The AudioSet Ontology is a hierarchical collection of over 600 sound classes, which we have populated with 297,160 audio samples from Freesound. This process generated 678,731 candidate annotations, each expressing the potential presence of a sound source in an audio clip. FSD includes a variety of everyday sounds, from human and animal sounds to music and sounds made by things, all under Creative Commons licenses.



Cattle, bovinae

Cymbal

Bowed string instrument

Speech synthesizer

Crumpling, crinkling

Power windows, electric windows

Female speech, woman speaking

Rustle

Dental drill, dentist's drill

Brass instrument

190/396 categories have reached our first goal!

Crowdsourcing annotations

By creating this dataset, we seek to promote research that will enable machines to hear and interpret sound similarly to humans. But to make FSD reliable enough for research, we need to verify the generated annotations. So we are now crowdsourcing annotations to build the first FSD release, which will include waveforms, audio features, ground truth and additional metadata. Our first goal is to gather at least 100 verified samples per category (whenever available). Wanna contribute?
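As a rough illustration of the first goal, the sketch below counts verified samples per category and lists the categories that have reached 100. The data layout (a list of `(category, clip_id)` pairs), the function name `categories_at_goal` and the toy data are assumptions made for this example; they are not the platform's actual data model or API. The "whenever available" caveat (categories with fewer than 100 candidate samples) is not handled here.

```python
from collections import Counter

GOAL = 100  # verified samples per category, as stated in the first goal


def categories_at_goal(verified, goal=GOAL):
    """Return, sorted by name, the categories with at least `goal` verified clips.

    `verified` is a hypothetical list of (category, clip_id) pairs, one per
    verified annotation. Repeated pairs are counted only once.
    """
    counts = Counter(category for category, _clip_id in set(verified))
    return sorted(name for name, n in counts.items() if n >= goal)


# Toy data, invented for illustration only.
verified = [("Cymbal", i) for i in range(120)] + [("Rustle", i) for i in range(40)]
reached = categories_at_goal(verified)
print(f"{len(reached)} categories have reached the first goal: {reached}")
```

Deduplicating the pairs first means a clip verified by several contributors still counts as one sample for its category.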




Our long-term goals


+600k annotations
+260k audio samples
Improving quantity & quality

Currently: 12.1% of annotations verified

If you use this dataset in your work, please cite our paper, or check it out for more information:
Freesound Datasets: A Platform for the Creation of Open Audio Datasets
E. Fonseca, J. Pons, X. Favory, F. Font, D. Bogdanov, A. Ferraro, S. Oramas, A. Porter & X. Serra
In Proceedings of the 18th International Society for Music Information Retrieval Conference, Suzhou, China, 2017