~ How To: Generate an Audio Fingerprinting Data Set With Sox Audio Effects
» By Joren on Wednesday 07 December 2011
A small part of Tarsos has been turned into an audio fingerprinting application. The idea of audio fingerprinting is to create a condensed representation of an audio file. A perceptually similar audio file should generate similar fingerprints. To test how robust a fingerprinting technique is, it helps to have a data set of audio files that are similar to each other in known ways.
SoX (Sound eXchange) is a command-line utility for sound processing. It can apply audio effects to a sound file. Applying these effects to a set of unmodified songs yields an audio fingerprinting data set. To generate such a data set, SoX can be used to:
- Trim the first x seconds of a file
- Speed up or slow down the audio
- Change the pitch of a file without modifying the tempo
- Add background noise (white noise is used)
- Reverse the audio stream
# Trim the first 10 seconds
sox input.wav output.wav trim 10

# Speed up by 10% (this changes both tempo and pitch)
sox input.wav output.wav speed 1.10

# Change the pitch upwards by 100 cents (one semitone)
# without changing the tempo
sox input.wav output.wav pitch 100

# Generate white noise with the length of input.wav
sox input.wav noise.wav synth whitenoise

# Mix the white noise with the input to generate noisy output;
# -v defines how loud the white noise is
sox -m input.wav -v 0.1 noise.wav output.wav

# Reverse the audio
sox input.wav output.wav reverse
A Ruby script to generate a lot of these files is attached.
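The attached script is not reproduced here. As a rough illustration of the same idea, the shell sketch below applies each of the effects above to every WAV file in a directory. It is not the attached Ruby script: the function names (`run`, `make_variants`, `make_dataset`), the `DRY_RUN` switch, and the output file naming scheme are all assumptions made for this example. Only the SoX invocations themselves come from the commands listed above.

```shell
#!/bin/sh
# Sketch only: helper names, DRY_RUN and the file naming are assumptions.

# Run a command, or just print it when DRY_RUN=1 (handy for checking
# what would be generated before processing a large collection).
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Create all modified versions of one input file.
# $1: input WAV file, $2: output directory
make_variants() {
  in="$1"
  out_dir="$2"
  base=$(basename "$in" .wav)

  run sox "$in" "$out_dir/${base}_trim_10s.wav" trim 10
  run sox "$in" "$out_dir/${base}_speed_110.wav" speed 1.10
  run sox "$in" "$out_dir/${base}_pitch_100.wav" pitch 100
  # White noise with the length of the input, mixed in at volume 0.1
  run sox "$in" "$out_dir/${base}_whitenoise.wav" synth whitenoise
  run sox -m "$in" -v 0.1 "$out_dir/${base}_whitenoise.wav" "$out_dir/${base}_noisy.wav"
  run rm -f "$out_dir/${base}_whitenoise.wav"
  run sox "$in" "$out_dir/${base}_reversed.wav" reverse
}

# Process every .wav file in a directory.
# $1: directory with unmodified songs, $2: output directory
make_dataset() {
  run mkdir -p "$2"
  for f in "$1"/*.wav; do
    [ -e "$f" ] || continue   # skip the unexpanded pattern if nothing matches
    make_variants "$f" "$2"
  done
}

# Example use (requires SoX on the PATH):
#   make_dataset originals dataset
```

Setting `DRY_RUN=1` before calling `make_dataset` prints the SoX commands instead of executing them. The temporary white-noise file is removed after mixing so that only the modified songs remain in the output directory.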