Free Spoken Digit
c77d3213-8cd1-11eb-88ae-0e1f58d5e9a9
Commit 798ce68 · Jun 22, 2021 8:10 AM · 1 commit

Overview

A simple audio/speech dataset consisting of recordings of spoken digits in WAV files at 8 kHz. The recordings are trimmed so that they have near-minimal silence at the beginning and end.
FSDD is an open dataset, which means it will grow over time as data is contributed. To enable reproducibility and accurate citation, the dataset is versioned using Zenodo DOIs as well as git tags.

Data Collection

Please contribute your homemade recordings. All recordings should be mono 8 kHz WAV files, trimmed to have minimal silence. Don't forget to update metadata.py with the speaker meta-data.
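Before contributing, it may help to confirm that a recording meets the mono and 8 kHz requirements. The following is a minimal sketch using only Python's standard `wave` module; the function name `check_recording` is hypothetical and is not part of the FSDD repository. It does not check how much silence remains at the start or end of the file.

```python
import wave

EXPECTED_RATE_HZ = 8000  # 8 kHz, per the contribution guidelines
EXPECTED_CHANNELS = 1    # mono


def check_recording(path):
    """Return a list of problems found in the WAV file at `path`.

    An empty list means the file is mono and sampled at 8 kHz.
    Hypothetical helper, not part of the FSDD repository.
    """
    problems = []
    with wave.open(path, "rb") as wav:
        channels = wav.getnchannels()
        rate = wav.getframerate()
    if channels != EXPECTED_CHANNELS:
        problems.append(f"expected mono, got {channels} channels")
    if rate != EXPECTED_RATE_HZ:
        problems.append(f"expected {EXPECTED_RATE_HZ} Hz, got {rate} Hz")
    return problems
```

Calling `check_recording` on a conforming file returns an empty list; any other result lists what needs fixing before the file is submitted.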
To add your data, follow the recording instructions in acquire_data/say_numbers_prompt.py and then run split_and_label_numbers.py to make your files.

Data Format

Files are named in the following format: {digitLabel}_{speakerName}_{index}.wav Example: 7_jackson_32.wav
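Because the label, speaker, and index are all encoded in the file name, they can be recovered without opening the audio. The sketch below parses names shaped like the example 7_jackson_32.wav. It assumes a single-digit label and speaker names that contain no underscore; `Recording` and `parse_filename` are hypothetical names introduced for illustration.

```python
import re
from typing import NamedTuple


class Recording(NamedTuple):
    digit: int
    speaker: str
    index: int


# Assumption: one digit, then a speaker name without underscores, then an index.
_PATTERN = re.compile(r"^(?P<digit>\d)_(?P<speaker>[^_]+)_(?P<index>\d+)\.wav$")


def parse_filename(name):
    """Split a file name such as '7_jackson_32.wav' into its three fields."""
    match = _PATTERN.match(name)
    if match is None:
        raise ValueError(f"unexpected file name: {name!r}")
    return Recording(int(match["digit"]), match["speaker"], int(match["index"]))


print(parse_filename("7_jackson_32.wav"))
# Recording(digit=7, speaker='jackson', index=32)
```

Raising on names that do not match keeps stray files from being silently mislabeled.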
The dataset currently contains 3,000 recordings (50 of each digit per speaker) from 6 speakers, with English pronunciations.
metadata.py contains meta-data regarding the speakers' gender and accents.
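The total follows from the per-speaker figure: 6 speakers × 10 digits × 50 recordings = 3,000. A local copy can be checked against that expectation with a small tally. This is a sketch only: `find_incomplete` is a hypothetical helper, and it assumes every file name splits into exactly three underscore-separated fields as in 7_jackson_32.wav.

```python
from collections import Counter

RECORDINGS_PER_DIGIT = 50  # per speaker, as stated in the dataset description
DIGITS = range(10)


def find_incomplete(filenames, per_digit=RECORDINGS_PER_DIGIT):
    """Return {(speaker, digit): count} for every pair whose count differs
    from `per_digit`. Hypothetical helper, not part of the FSDD repository."""
    counts = Counter()
    for name in filenames:
        stem = name[: -len(".wav")] if name.endswith(".wav") else name
        digit, speaker, _index = stem.split("_")
        counts[(speaker, int(digit))] += 1
    speakers = {speaker for speaker, _digit in counts}
    return {
        (speaker, digit): counts[(speaker, digit)]
        for speaker in speakers
        for digit in DIGITS
        if counts[(speaker, digit)] != per_digit
    }
```

An empty result means each speaker found in the list has exactly 50 recordings of each digit; it does not by itself confirm that all 6 speakers are present.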

🎉Many thanks to Graviti Open Datasets for contributing the dataset
Basic Information

Application Scenarios: Not Available
Annotations: Not Available
Tasks: Not Available
License: CC BY-SA 4.0
Updated on: 2021-01-20 04:00:59

Metadata

Data Type: Not Available
Data Volume: 3K
Annotation Amount: 0
File Size: 0B
Copyright Owner: Unknown
Annotator: Unknown