This is our machine learning project for CMPUT 466. We built a model that transfers the style of one audio file onto another.
The model works by first applying the short-time Fourier transform (STFT) to convert the input songs into spectrograms. Each spectrogram is then treated as an image and passed to a convolutional neural network, which transfers the style of one spectrogram onto the other. Finally, the style-transferred spectrogram is passed through the inverse STFT to produce the style-transferred audio file.
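The pipeline above can be sketched as follows. This is an illustration only, not the project's code: `scipy.signal`'s STFT is used for convenience, the sample rate and window size are assumed, `style_transfer` is a hypothetical stand-in (a simple magnitude blend) for the CNN, and reusing the content clip's phase is just one simple reconstruction choice.

```python
import numpy as np
from scipy.signal import stft, istft

FS, NPERSEG = 22050, 1024  # assumed sample rate and STFT window size


def style_transfer(content_mag, style_mag):
    """Hypothetical placeholder for the CNN: blends the two magnitude spectrograms."""
    return 0.5 * content_mag + 0.5 * style_mag


def transfer(content, style):
    # 1. STFT: waveform -> complex spectrogram
    _, _, Zc = stft(content, fs=FS, nperseg=NPERSEG)
    _, _, Zs = stft(style, fs=FS, nperseg=NPERSEG)

    # 2. Treat the magnitudes as images and run the style-transfer step
    mag = style_transfer(np.abs(Zc), np.abs(Zs))

    # 3. Inverse STFT: spectrogram -> waveform (reusing the content's phase)
    phase = np.angle(Zc)
    _, out = istft(mag * np.exp(1j * phase), fs=FS, nperseg=NPERSEG)
    return out
```

Feeding two equal-length clips through `transfer` yields a waveform of roughly the same length, with the blended spectral content of both inputs.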
At 2:00, it is mentioned that "the typical Fourier transform [...] is non-invertible." Although that precise statement is false, the sentiment that this method is lossy remains true for our application. Constructing the spectrogram from the magnitude of the Fourier transform discards the phase information of the input, which is essential for reconstructing an audio file after the style transfer.
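The effect of discarding phase can be seen in a toy round trip (all parameters assumed; not the project's code): reconstructing with the original phase is near-exact, while zeroing the phase keeps only the magnitude and fails to recover the waveform.

```python
import numpy as np
from scipy.signal import stft, istft

fs, nperseg = 8000, 256
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 440 * t)  # one second of a 440 Hz sine

_, _, Z = stft(x, fs=fs, nperseg=nperseg)
mag, phase = np.abs(Z), np.angle(Z)

# Reconstruction with the true phase is essentially exact...
_, x_good = istft(mag * np.exp(1j * phase), fs=fs, nperseg=nperseg)
# ...but with the phase discarded (set to zero), it is not.
_, x_bad = istft(mag.astype(complex), fs=fs, nperseg=nperseg)

err_good = np.max(np.abs(x_good[: x.size] - x))
err_bad = np.max(np.abs(x_bad[: x.size] - x))
```

Here `err_good` is at numerical-precision level, while `err_bad` is on the order of the signal's amplitude, which is why phase-aware reconstruction (or a phase-estimation scheme) matters for the final audio quality.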
This model was built by:
Adam Elamy - [ Link ]
Cameron Hildebrandt - [ Link ]
Fahad Ahammed -
Max Melendez - [ Link ]