Free Spoken Digit

Overview

A simple audio/speech dataset consisting of recordings of spoken digits in WAV files sampled at 8 kHz. The recordings are trimmed so that they have near-minimal silence at the beginning and end.
FSDD is an open dataset, which means it will grow over time as data is contributed. In order to enable reproducibility and accurate citation the dataset is versioned using Zenodo DOI as well as git tags.

Data Collection

Please contribute your homemade recordings. All recordings should be mono 8 kHz WAV files, trimmed to have minimal silence. Don't forget to update metadata.py with the speaker metadata.
To add your data, follow the recording instructions in acquire_data/say_numbers_prompt.py and then run split_and_label_numbers.py to make your files.
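
Before contributing, it may help to confirm that a recording meets the mono 8 kHz requirement above. The following is a minimal sketch using only Python's standard wave module; the function name check_recording and the constants are illustrative and are not part of the FSDD repository. It does not check whether silence has been trimmed.

```python
import wave

# Requirements taken from the contribution notes above.
EXPECTED_RATE_HZ = 8000  # 8 kHz sample rate
EXPECTED_CHANNELS = 1    # mono


def check_recording(path):
    """Return a list of problems found; an empty list means the header looks fine.

    Hypothetical helper: it only inspects the WAV header (channel count and
    sample rate) and does not measure leading or trailing silence.
    """
    problems = []
    with wave.open(path, "rb") as wav:
        channels = wav.getnchannels()
        rate = wav.getframerate()
    if channels != EXPECTED_CHANNELS:
        problems.append(f"expected mono, found {channels} channels")
    if rate != EXPECTED_RATE_HZ:
        problems.append(f"expected {EXPECTED_RATE_HZ} Hz, found {rate} Hz")
    return problems
```

Calling check_recording on a file and getting an empty list back suggests the header matches the stated format; any returned strings describe what differs.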

Data Format

Files are named in the following format: {digitLabel}_{speakerName}_{index}.wav Example: 7_jackson_32.wav
The dataset currently contains 3,000 recordings (50 of each digit per speaker) from 6 speakers, all with English pronunciation.
metadata.py contains metadata about the speakers' gender and accents.
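
The file naming scheme can be split back into its parts when building labels. The sketch below is one way to do that, based on the example name 7_jackson_32.wav. It assumes the digit label is a single digit 0-9 and that speaker names contain no underscores; neither assumption is confirmed by the dataset description. The names Recording and parse_filename are illustrative.

```python
import re
from collections import namedtuple

# Illustrative container for the three parts of a file name.
Recording = namedtuple("Recording", ["digit", "speaker", "index"])

# Assumptions: one-digit label, speaker name without underscores,
# numeric index, and a lowercase ".wav" extension.
_PATTERN = re.compile(r"^(\d)_([^_]+)_(\d+)\.wav$")


def parse_filename(name):
    """Split a name such as '7_jackson_32.wav' into digit, speaker, and index."""
    match = _PATTERN.match(name)
    if match is None:
        raise ValueError(f"unexpected file name: {name!r}")
    digit, speaker, index = match.groups()
    return Recording(int(digit), speaker, int(index))
```

For the example in the text, parse_filename("7_jackson_32.wav") yields digit 7, speaker "jackson", and index 32. Names that do not fit the pattern raise ValueError rather than being silently mislabeled.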

License

CC BY-SA 4.0

🎉Many thanks to Graviti Open Datasets for contributing the dataset
Basic Information

Application Scenarios: Not Available
Annotations: Classification
Tasks: Voice Print Recognition, ASR
License: CC BY-SA 4.0
Updated on: 2021-03-24 19:49:11

Metadata

Data Type: Audio
Data Volume: 3k
Annotation Amount: 0
File Size: 0B
Copyright Owner: Zohar Jackson
Annotator: Unknown