Audio Representation for Machine Learning

Tim Anderton (08.Jun.2018 at 15:00, 50 min)
Talk at OpenWest Conference 2018 (English - US)


When training machine learning systems on audio data for tasks like speech recognition, it is useful to first transform the audio into a rich intermediate representation such as a spectrogram. Although, given enough data, effective models can be trained on raw audio inputs, models that begin with rich representations typically perform better. I will talk about several audio representation schemes, including spectrograms, mel filter banks, MFCCs, and wavelets. We will discuss how each of these representations works, the types of information each one preserves and destroys, and their strengths and weaknesses from a machine learning perspective.
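As a rough illustration of the first representation the abstract mentions, the sketch below computes a magnitude spectrogram with NumPy: the signal is cut into overlapping windowed frames and each frame is Fourier-transformed. The function name and the frame/hop parameters are illustrative choices, not anything prescribed by the talk.

```python
import numpy as np

def spectrogram(signal, frame_len=256, hop=128):
    """Magnitude spectrogram via a short-time Fourier transform.

    frame_len and hop are illustrative defaults; real systems tune
    them to the sample rate and task.
    """
    window = np.hanning(frame_len)  # taper each frame to reduce spectral leakage
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    # rfft keeps only the non-negative frequency bins (frame_len // 2 + 1 of them)
    return np.abs(np.fft.rfft(frames, axis=1))

# One second of a 440 Hz tone sampled at 8 kHz
sr = 8000
t = np.arange(sr) / sr
spec = spectrogram(np.sin(2 * np.pi * 440 * t))
print(spec.shape)  # (n_frames, frame_len // 2 + 1)
```

Mel filter banks and MFCCs build on exactly this output: a mel filter bank pools these linear-frequency bins into perceptually spaced bands, and MFCCs then take a discrete cosine transform of the log band energies.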
