Preprocessing Audio Datasets for Machine Learning