A simple audio/speech dataset consisting of recordings of spoken
digits in WAV files sampled at 8 kHz. The recordings are trimmed so
that they have near-minimal silence at the beginning and end.
FSDD is an open dataset, which means it will grow over time
as data is contributed. To enable reproducibility and accurate
citation, the dataset is versioned using Zenodo DOIs as well as
git tags.
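
As a quick sanity check on the 8 kHz sampling rate mentioned above, the following sketch reads a WAV header with Python's standard `wave` module. It is illustrative only: `make_tone_wav` and `wav_info` are hypothetical helper names, not part of the dataset's tooling, and the in-memory file is a synthetic mono 16-bit tone standing in for a real recording (the bit depth and channel count are assumptions, not stated by the dataset description).

```python
import io
import math
import struct
import wave

EXPECTED_RATE_HZ = 8000  # sampling rate stated in the dataset description


def make_tone_wav(rate_hz=EXPECTED_RATE_HZ, seconds=0.5, freq_hz=440.0):
    """Build a synthetic mono 16-bit WAV in memory (stand-in for a recording)."""
    n = int(rate_hz * seconds)
    samples = [
        int(0.5 * 32767 * math.sin(2 * math.pi * freq_hz * i / rate_hz))
        for i in range(n)
    ]
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)        # mono (assumption)
        w.setsampwidth(2)        # 16-bit samples (assumption)
        w.setframerate(rate_hz)
        w.writeframes(struct.pack("<%dh" % n, *samples))
    buf.seek(0)
    return buf


def wav_info(source):
    """Return (sample_rate_hz, n_frames, duration_s) for a WAV path or file object."""
    with wave.open(source, "rb") as w:
        rate = w.getframerate()
        frames = w.getnframes()
    return rate, frames, frames / rate


if __name__ == "__main__":
    rate, frames, duration = wav_info(make_tone_wav())
    print(rate, frames, duration)
```

To check an actual recording, pass its file path to `wav_info` instead of the synthetic buffer and compare the returned rate against `EXPECTED_RATE_HZ`.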