week 6.03 12.03.2017

Matthijs Van keirsbilck edited this page Mar 29, 2017 · 1 revision

database management

  • goal: convert .wav files to .mfcc files using the HTK toolbox. If you already have an MFCC file of the whole .wav, extract a specific part from it using
    HCopy -C config0 -s 10e7 -e 11e7 source.mfcc target.mfcc (cuts 00:10 .. 00:11 from source; HTK times are in units of 100 ns, so 10e7 = 10 s)

  • do this in batch using scripts, see audioSR/Preprocessing

    1. find a database (e.g. using this)
    2. search the folders for .wav files and save their paths to a file; this prepares the script file (.scp) for the HCopy batch command. See prepareWAV_HTK.py
    3. the WAV files had corrupted headers; fix them with fixWavs.py
    4. run prepareWAV_HTK on the fixed WAVs
    5. run HCopy using the output from prepareWAV_HTK
    6. copy the files in the 'mfc' folder back to the wav folder. All label files, wav files, and mfc files are now together.
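
The batch preparation in steps 2 and 5 could look roughly like the sketch below. It is not the actual prepareWAV_HTK.py; the function names, the `.mfc` extension for targets, and the flat output folder are assumptions made for illustration. It also includes a small helper for the 100 ns time units used by the `-s` and `-e` options of HCopy.

```python
import os

# HTK expresses times in units of 100 ns, so one second is 1e7 units.
HTK_UNITS_PER_SECOND = 10_000_000


def seconds_to_htk(seconds):
    """Convert seconds to HTK time units (100 ns), e.g. for HCopy -s / -e."""
    return int(round(seconds * HTK_UNITS_PER_SECOND))


def find_wavs(root_dir):
    """Recursively collect all .wav files (case-insensitive) under root_dir."""
    wavs = []
    for dirpath, _dirnames, filenames in os.walk(root_dir):
        for name in filenames:
            if name.lower().endswith(".wav"):
                wavs.append(os.path.join(dirpath, name))
    return sorted(wavs)


def write_scp(wav_paths, mfc_dir, scp_path):
    """Write one 'source target' pair per line, the layout HCopy -S expects.

    Assumptions (hypothetical, not taken from prepareWAV_HTK.py):
    - targets go into a single flat folder, so base names must be unique;
    - paths contain no spaces, since the pair is whitespace-separated.
    """
    os.makedirs(mfc_dir, exist_ok=True)
    with open(scp_path, "w") as scp:
        for wav in wav_paths:
            base = os.path.splitext(os.path.basename(wav))[0]
            target = os.path.join(mfc_dir, base + ".mfc")
            scp.write("{} {}\n".format(wav, target))
    return len(wav_paths)
```

With a script file written this way (the name `wavToMfc.scp` is only an example), the conversion would then be run as `HCopy -C config0 -S wavToMfc.scp`. `seconds_to_htk(10)` gives 100000000, which is the `10e7` used in the cutting example above.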
  • Using the data:

Visualizing Audio
Audio SR plan
TODO

Finish the lipreading software so that we can easily choose a different network/model to use when evaluating. Also add Top-1 and Top-5 accuracy.
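
For reference, Top-1 and Top-5 accuracy could be computed along these lines. This is a minimal, framework-independent sketch in plain Python, not code from the lipreading software; the function name and the input layout (one list of class scores per sample) are assumptions.

```python
def top_k_accuracy(scores, labels, k):
    """Fraction of samples whose true label is among the k highest-scoring classes.

    scores: list of per-sample score lists (one score per class).
    labels: list of true class indices, same length as scores.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        return 0.0
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices sorted by descending score; keep the first k.
        top_k = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        if label in top_k:
            hits += 1
    return hits / len(labels)
```

Top-1 accuracy is `top_k_accuracy(scores, labels, 1)` and Top-5 is `top_k_accuracy(scores, labels, 5)`; in practice the same computation would usually be done on the network's output array inside the evaluation loop.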