Danchik Audio processing

Created time: Mar 5, 2023 09:18 AM

Summary:

Progress: Done

Category: Programming

URL:

Source: Tet-A-Tet

44100 Hz - common default sample rate (samples per second, CD quality)
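To make the sample rate concrete, here is a minimal sketch that converts between sample counts and seconds and computes the Nyquist limit (the highest frequency the rate can represent). The helper names are hypothetical and exist only for this illustration.

```python
SAMPLE_RATE = 44100  # samples per second (Hz)


def duration_seconds(num_samples, sample_rate=SAMPLE_RATE):
    """Length of a clip in seconds, given its number of samples."""
    return num_samples / sample_rate


def samples_for(seconds, sample_rate=SAMPLE_RATE):
    """Number of samples needed to hold `seconds` of audio."""
    return round(seconds * sample_rate)


def nyquist_hz(sample_rate=SAMPLE_RATE):
    """Highest representable frequency: half the sample rate."""
    return sample_rate / 2
```

For example, `samples_for(0.5)` gives the number of samples in half a second of audio, and `nyquist_hz()` shows why 44100 Hz comfortably covers the range of human hearing.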

How can we approach audio data?

Raw amplitudes - hard to extract features

Spectrograms - we already know some models that can deal with them
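A rough sketch of how raw amplitudes become a spectrogram: cut the signal into overlapping frames, apply a window, and take the magnitude spectrum of each frame. This uses only the standard library and a naive O(n^2) DFT for readability; the frame size, hop, and test tone are assumptions picked for the example. In practice a library such as torchaudio or librosa would do this far faster.

```python
import cmath
import math


def hann(n):
    """Periodic Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]


def dft_magnitudes(frame):
    """Magnitudes of the non-negative-frequency DFT bins (naive DFT)."""
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        mags.append(abs(acc))
    return mags


def spectrogram(signal, frame_size=64, hop=32):
    """Return a list of frames, each a list of frame_size // 2 + 1 magnitudes."""
    window = hann(frame_size)
    frames = []
    for start in range(0, len(signal) - frame_size + 1, hop):
        frame = [signal[start + i] * window[i] for i in range(frame_size)]
        frames.append(dft_magnitudes(frame))
    return frames


# Usage: a 1000 Hz tone sampled at 8000 Hz (both values are illustrative).
sample_rate = 8000
frame_size = 64
tone = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(512)]
spec = spectrogram(tone, frame_size=frame_size, hop=32)

# Strongest bin of the first frame, converted back to Hz.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * sample_rate / frame_size
```

The result is a 2D grid (time frames by frequency bins), which is why models built for grid-shaped inputs can be applied to it.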

Time domain vs Frequency domain

Fast Fourier Transform (FFT)
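To illustrate the move from the time domain to the frequency domain, here is a minimal sketch of a recursive radix-2 Cooley-Tukey FFT in pure Python. It assumes the input length is a power of two; the test signal (two sine waves at 50 Hz and 120 Hz) and the sample rate are made up for the example. Real code would normally call an optimized library routine instead.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 0 or (n & (n - 1)):
        raise ValueError("length must be a non-zero power of two")
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])  # FFT of even-indexed samples
    odd = fft(x[1::2])   # FFT of odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


# Time domain: a mix of a 50 Hz tone and a quieter 120 Hz tone.
sample_rate = 1024
n = 1024
signal = [
    math.sin(2 * math.pi * 50 * t / sample_rate)
    + 0.5 * math.sin(2 * math.pi * 120 * t / sample_rate)
    for t in range(n)
]

# Frequency domain: keep the non-negative-frequency half of the spectrum.
magnitudes = [abs(c) for c in fft(signal)[: n // 2]]

# The two strongest bins, converted to Hz, recover the two tones.
strongest = sorted(range(len(magnitudes)), key=lambda k: magnitudes[k], reverse=True)[:2]
freqs = sorted(k * sample_rate / n for k in strongest)
```

In the raw amplitudes the two tones are tangled together; in the spectrum they show up as two separate peaks, which is what makes frequency-domain features easier to work with.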

FFT is all you need!

MFCC (mel-frequency cepstral coefficients)
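MFCCs are built on the mel scale, which spaces frequency bands the way pitch is perceived: finely at low frequencies, coarsely at high ones. The sketch below shows only that mapping, using one common (HTK-style) formula; other variants exist, and the full MFCC pipeline (power spectrum, mel filterbank, log, DCT) is left to a library. The function names and the 10-band example are assumptions for illustration.

```python
import math


def hz_to_mel(hz):
    """Convert Hz to mels using the HTK-style formula."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_center_frequencies(num_filters, f_min, f_max):
    """Center frequencies (Hz) of bands evenly spaced on the mel scale."""
    mel_min, mel_max = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (mel_max - mel_min) / (num_filters + 1)
    return [mel_to_hz(mel_min + step * (i + 1)) for i in range(num_filters)]


# Usage: 10 band centers between 0 Hz and the Nyquist limit of 44100 Hz audio.
centers = mel_center_frequencies(10, 0.0, 22050.0)
```

The centers are evenly spaced in mels but increasingly far apart in Hz, so the low end of the spectrum gets more resolution than the high end.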

LSTM architecture

torchaudio or librosa

S3PRL - toolkit of speech/audio processing models (recognition and so on)

Squeezeformer