I've updated the Google Colab notebook that I use for making datasets. There are quite a few changes: I re-added RNNoise, added Demucs, changed the audio normalization, added speaker diarization and segmentation, and added ffmpeg-based segmentation for long files. In this video I go over the changes to the dataset tools, then take some audio files, process them, and run them through Whisper.
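For a rough idea of what the ffmpeg segmentation step for long files can look like, here is a minimal sketch. It is not the notebook's actual code: the function names, the 600-second chunk length, and the mono 22050 Hz output format are all assumptions chosen for illustration. It relies only on ffmpeg's `segment` muxer and its `-segment_time` option.

```python
import subprocess
from pathlib import Path


def build_segment_command(src, out_dir, segment_seconds=600):
    """Build an ffmpeg command that cuts `src` into fixed-length WAV chunks.

    Hypothetical helper: the chunk length and output format are assumptions,
    not values taken from the notebook.
    """
    src = Path(src)
    # ffmpeg fills in %03d with the chunk index: name_000.wav, name_001.wav, ...
    out_pattern = Path(out_dir) / f"{src.stem}_%03d.wav"
    return [
        "ffmpeg", "-hide_banner", "-y",
        "-i", str(src),
        "-f", "segment",                        # use the segment muxer
        "-segment_time", str(segment_seconds),  # target chunk length in seconds
        "-ac", "1",                             # assumed: downmix to mono
        "-ar", "22050",                         # assumed: resample to 22050 Hz
        str(out_pattern),
    ]


def split_long_file(src, out_dir, segment_seconds=600):
    """Run ffmpeg (must be on PATH) and return the chunk paths in order."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(build_segment_command(src, out_dir, segment_seconds), check=True)
    return sorted(Path(out_dir).glob(f"{Path(src).stem}_*.wav"))
```

Calling `split_long_file("interview.mp3", "chunks")` would write `chunks/interview_000.wav`, `chunks/interview_001.wav`, and so on, which can then go through the later steps (denoising, diarization, Whisper). Keeping the command builder separate from the `subprocess.run` call makes it easy to inspect or tweak the arguments before running anything.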
Updated dataset tools notebook with Coqui VITS model training:
[ Link ]
I go over training in some of the other videos:
[ Link ]
![](https://s2.save4k.ru/pic/196h4JsqmZc/maxresdefault.jpg)