Deep Learning cleans ‘ahem’ sounds from podcast episodes

“3.5 mm audio jack… Ahem!!” Where did you hear that? ;) Well, this post is not about Google Pixel vs iPhone 7, but about how to remove the ugly “ahem” sound from speech using a deep convolutional neural network. I must say, it is a very interesting read.



By Francesco Gadaleta, Data Science at Home.

Do you know why you can’t hear the ugly ahem sounds on the podcast Data Science at Home? Because we remove them. Actually not us. A neural network does.

ahem detector explained

The ahem detector is a deep convolutional neural network trained on transformed audio signals to recognize ahem sounds. The network was trained to detect such sounds in the episodes of Data Science at Home, the podcast about data science at podcast.datascienceathome.com.
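The post does not spell out which transformation is applied to the audio. A common way to feed sound to a convolutional network is a magnitude spectrogram, so the sketch below assumes that. The helper `magnitude_spectrogram` and the values of `frame_size` and `hop` are hypothetical and are not taken from the project.

```python
import cmath
import math


def magnitude_spectrogram(samples, frame_size=64, hop=32):
    """Return a list of frames, each a list of DFT magnitudes.

    Hypothetical helper: frame_size and hop are illustrative values,
    not the settings used by the ahem detector.
    """
    # A Hann window reduces spectral leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_size - 1))
              for n in range(frame_size)]
    n_bins = frame_size // 2 + 1  # keep non-negative frequencies only
    frames = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = [samples[start + n] * window[n] for n in range(frame_size)]
        # Naive DFT, O(frame_size^2); a real pipeline would use an FFT library.
        mags = []
        for k in range(n_bins):
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_size)
                      for n in range(frame_size))
            mags.append(abs(acc))
        frames.append(mags)
    return frames


# Usage: a pure tone that sits exactly on DFT bin 8 should peak in that bin.
tone = [math.sin(2 * math.pi * 8 * n / 64) for n in range(256)]
spec = magnitude_spectrogram(tone)
peak_bins = [max(range(len(f)), key=f.__getitem__) for f in spec]
```

Each row of `spec` is one time slice and each column one frequency band, which is the image-like layout a convolutional network can work with.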

Slides and technical details are provided here.

But before proceeding, some concepts should be clarified.

Two sets of audio files are required, much as in a cohort study:

  • a negative sample with clean voice/sound and
  • a positive one with “ahem” sounds concatenated

Although the detector was built for the aforementioned audio files, it can be generalized to any other audio input, provided enough data are available. The minimum required is ~10 seconds for the positive samples and ~3 minutes for the negative cohort. The network will adapt to the training data and can perform detection on a different speaker's voice. A GPU is recommended for training because, under the conditions specific to this example, at least 5 epochs are required to reach ~81% accuracy.
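To make the two sets and their minimum durations concrete, here is a minimal sketch of how the recordings could be cut into labelled training windows. The sample rate, window length, overlap and the helper `make_windows` are assumptions chosen for illustration; the project may slice its audio differently.

```python
def make_windows(samples, label, window_len, hop):
    """Slice one recording into fixed-length windows that share one label."""
    return [(samples[s:s + window_len], label)
            for s in range(0, len(samples) - window_len + 1, hop)]


# Assumed values, for illustration only.
SAMPLE_RATE = 100          # samples per second, kept tiny so the example runs fast
WINDOW_LEN = SAMPLE_RATE   # 1-second windows
HOP = SAMPLE_RATE // 2     # 50% overlap between consecutive windows

# Stand-ins for the two recordings: zeros instead of real audio.
positive_audio = [0.0] * (10 * SAMPLE_RATE)    # ~10 s of concatenated "ahem" sounds
negative_audio = [0.0] * (180 * SAMPLE_RATE)   # ~3 min of clean speech

dataset = (make_windows(positive_audio, 1, WINDOW_LEN, HOP)
           + make_windows(negative_audio, 0, WINDOW_LEN, HOP))

n_pos = sum(label for _, label in dataset)
n_neg = len(dataset) - n_pos
print(n_pos, n_neg)  # prints: 19 359
```

With these assumed settings the negative class outnumbers the positive one by roughly 19 to 1, so a training script would probably want to account for the imbalance, for example with class weights or resampling.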

Once the artificial brain has learned what is good and what is not, a new audio file must be transformed in the same way as the training samples. This can easily be done with a utility provided together with the rest of the code.
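After the new file has been transformed and scored, the flagged stretches still have to be cut out of the audio. The sketch below shows one way this could be done, assuming the network returns one "ahem" probability per sliding window. The functions `flagged_intervals` and `cut`, the threshold and the scores are all hypothetical and do not describe the project's own utility.

```python
def flagged_intervals(probs, window_len, hop, threshold=0.5):
    """Merge consecutive flagged windows into (start, end) sample intervals."""
    intervals = []
    for i, p in enumerate(probs):
        if p < threshold:
            continue
        start, end = i * hop, i * hop + window_len
        if intervals and start <= intervals[-1][1]:
            # Overlaps or touches the previous interval: extend it.
            intervals[-1] = (intervals[-1][0], max(intervals[-1][1], end))
        else:
            intervals.append((start, end))
    return intervals


def cut(samples, intervals):
    """Return the samples with every [start, end) interval removed."""
    kept, cursor = [], 0
    for start, end in intervals:
        kept.extend(samples[cursor:start])
        cursor = end
    kept.extend(samples[cursor:])
    return kept


# Usage with made-up scores: windows 2 and 3 are flagged as "ahem".
samples = list(range(100))
probs = [0.1, 0.2, 0.9, 0.8, 0.1, 0.0, 0.3, 0.2, 0.1]  # hypothetical model output
intervals = flagged_intervals(probs, window_len=20, hop=10)
cleaned = cut(samples, intervals)
print(intervals, len(cleaned))  # prints: [(20, 50)] 70
```

Removing the full union of flagged windows is the simplest choice, but with overlapping windows it may also trim a little clean speech on either side; a stricter threshold or a shorter hop would be ways to tune that.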

The entire project is published on GitHub.

Enjoy!

Original post. Reposted with permission.

Bio: Francesco Gadaleta is a Data Scientist at Janssen Pharmaceutical Companies of Johnson & Johnson and a science writer. He is committed to the “A World Without Disease” paradigm shift in healthcare, leveraging Artificial Intelligence and Data Science to predict risk and intercept diseases. He is focused on putting machine learning at the service of human beings.
