May 02, 06:02

Russian Open Speech To Text (STT/ASR) Dataset

4000 hours of STT data in Russian

Made by us. Yes, really. I am not joking.

It was a lot of work.

The dataset:

github.com/snakers4/open_stt/

Accompanying post:

spark-in.me/post/russian-open-stt-part1

TLDR:

- With the third release, we have ~4000 hours;

- Contributors and help wanted;

- Let's work together to bring the ImageNet moment in STT closer!

Please repost this as much as you can.

#stt

#asr

#data_science

#deep_learning
