Audio Word2vec Guide - Quantizes phonemes, transforms, GAN trains on text and audio.
Wav2vec is fascinating in that it combines several neural network architectures and methods: CNN, transformer, quantization, and GAN training. I bet you’ll enjoy this guide through Wav2vec papers solving the problem of speech to text.
- Notes & links: [ Ссылка ]
- Wav2vec 2.0 - A Framework for Self-Supervised Learning of Speech Representations
- Wav2vec-U - Unsupervised Speech Recognition
![](https://i.ytimg.com/vi/PHIKbgMJq4c/mqdefault.jpg)