March 07, 09:58

Russian STT datasets

Does anyone know of any better datasets?

I found this paper (60 hours of speech), but I could not find a link to the dataset itself:

www.lrec-conf.org/proceedings/lrec2010/pdf/274_Paper.pdf

Anyway, here is the list I found:

- 20 hours of Bible readings: github.com/festvox/datasets-CMU_Wilderness

- www.kaggle.com/bryanpark/russian-single-speaker-speech-dataset (the page does not say how many hours)

- Of course, audiobook datasets: www.caito.de/data/Training/stt_tts/ plus some scraping scripts: github.com/ainy/shershe/tree/master/scripts

- And something of a disappointment here: voice.mozilla.org/ru/languages
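
Since some of these pages do not list a total duration, here is a minimal sketch for counting the hours yourself after downloading. It assumes the audio is unpacked as uncompressed PCM .wav files under a single folder; the function name `total_hours` and the folder name in the usage note are my own placeholders, not part of any dataset above.

```python
import wave
from pathlib import Path


def total_hours(root):
    """Sum the duration of every .wav file under `root`, in hours."""
    seconds = 0.0
    for path in Path(root).rglob("*.wav"):
        # Duration of one file = number of frames / frames per second.
        with wave.open(str(path), "rb") as wav:
            seconds += wav.getnframes() / wav.getframerate()
    return seconds / 3600
```

Usage would be something like `print(total_hours("russian-single-speaker"))`, where the folder is wherever you unpacked the archive. Compressed formats (mp3, ogg, flac) are not handled by the standard `wave` module and would need converting first.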

#deep_learning
