Voice activity detection using SoX

Install SoX. The vad effect is described in the SoX documentation (search for vad).

Let’s say your voice file is a.wav.

$ play ~/Downloads/a.wav vad norm
/home/dilawars/Downloads/a.wav:

 File Size: 52.3k     Bit Rate: 328k
  Encoding: Signed PCM
  Channels: 1 @ 16-bit
Samplerate: 20500Hz
Replaygain: off
  Duration: 00:00:01.27

In:100%  00:00:01.27 [00:00:00.00] Out:20.0k [      |      ] Hd:0.0 Clip:0
Done.

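It is decent! The Out:20.0k field shows that about 20,000 of the roughly 26,000 input samples (1.27 s at 20500 Hz) were kept, so the leading silence was trimmed.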

You can play the trimmed version. It won’t sound like a magically filtered voice-only file, but you can use this utility to decide whether there is any voice in the sample at all.
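To turn this into a yes/no decision, one option is to write the vad output to a temporary file and check how much audio survives. The sketch below assumes that sox and soxi are on your PATH and that vad leaves little or no audio when it never triggers. The names has_voice and is_longer and the 0.2 s threshold are illustrative choices, not part of SoX.

```shell
#!/bin/sh
# Sketch: decide whether a file contains voice by checking how much
# audio is left after SoX's vad effect has trimmed the leading silence.
# has_voice, is_longer and the 0.2 s default are hypothetical helpers.

# is_longer SECONDS THRESHOLD: succeed if SECONDS > THRESHOLD.
is_longer() {
    awk -v d="$1" -v t="$2" 'BEGIN { exit !(d + 0 > t + 0) }'
}

# has_voice FILE [MIN_SECONDS]: succeed if more than MIN_SECONDS of
# audio remain after trimming with vad.
has_voice() {
    in_file=$1
    min_seconds=${2:-0.2}
    tmp_dir=$(mktemp -d) || return 2
    out_file=$tmp_dir/trimmed.wav

    # Same vad effect as in the play command, written to a file instead.
    if sox "$in_file" "$out_file" vad 2>/dev/null; then
        # soxi -D prints the duration in seconds.
        kept=$(soxi -D "$out_file" 2>/dev/null)
    else
        kept=0
    fi
    rm -rf "$tmp_dir"

    is_longer "${kept:-0}" "$min_seconds"
}

# Example:
#   if has_voice ~/Downloads/a.wav; then echo "voice"; else echo "no voice"; fi
```

The threshold is worth tuning on your own recordings, since vad can trigger on short non-speech noises.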

Make sure to run it on your own samples before relying on it. SoX comes with many other utility functions. It’s like ffmpeg for sound files!
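One way to try vad on a batch of samples is to print, for each file, how many seconds it keeps. This is a minimal sketch, again assuming sox and soxi are on your PATH. The names vad_report and kept_percent and the samples/ directory are placeholders.

```shell
#!/bin/sh
# kept_percent TOTAL KEPT: print KEPT as a whole-number percentage of TOTAL
# (truncated, and 0 when TOTAL is not positive).
kept_percent() {
    awk -v total="$1" -v kept="$2" \
        'BEGIN { if (total + 0 > 0) printf "%d\n", 100 * kept / total; else print 0 }'
}

# vad_report FILE...: for each file, print its name, the original duration,
# the duration after vad, and the percentage kept (tab-separated).
vad_report() {
    tmp_dir=$(mktemp -d) || return 1
    for f in "$@"; do
        # Skip files that soxi cannot read.
        total=$(soxi -D "$f" 2>/dev/null) || continue
        if sox "$f" "$tmp_dir/out.wav" vad 2>/dev/null; then
            kept=$(soxi -D "$tmp_dir/out.wav" 2>/dev/null)
        else
            kept=0
        fi
        printf '%s\t%ss\t%ss\t%s%%\n' "$f" "$total" "${kept:-0}" \
            "$(kept_percent "$total" "${kept:-0}")"
    done
    rm -rf "$tmp_dir"
}

# Example (hypothetical directory):
#   vad_report samples/*.wav
```

Files where almost nothing is kept, or where everything is kept despite obvious leading silence, are the ones to listen to by hand.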
