How To: Generate an Audio Fingerprinting Data Set With Sox Audio Effects
By Joren on Wednesday 07 December 2011
A small part of Tarsos has been turned into an audio fingerprinting application. The idea of audio fingerprinting is to create a condensed representation of an audio file, such that perceptually similar audio files generate similar fingerprints. To test how robust a fingerprinting technique is, it is practical to have a data set of audio files that are alike in some known way.
SoX (Sound eXchange) is a command line utility for sound processing that can apply audio effects to a sound. By applying these effects to a set of unmodified songs, an audio fingerprinting data set can be created. To generate such a data set, SoX can be used to:
- Trim the first x seconds of a file
- Speed-up or slow-down the audio
- Change the pitch of a file without modifying the tempo
- Generate background noise (white noise is used)
- Reverse the audio stream
    #Trim the first 10 seconds
    sox input.wav output.wav trim 10

    #speed-up of 10%
    sox input.wav output.wav speed 1.10

    #change the pitch upwards 100 cents (one semitone)
    #without changing the tempo
    sox input.wav output.wav pitch 100

    #generate white noise with the length of input.wav
    sox input.wav noise.wav synth whitenoise

    #mix the white noise with the input to generate noisy output
    #-v defines how loud the white noise is
    sox -m input.wav -v 0.1 noise.wav output.wav

    #reverse the audio
    sox input.wav output.wav reverse
A Ruby script that generates a lot of these files can be found attached.
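For readers without the attachment, here is a minimal sketch of what such a batch script could look like. It is not the original script. It only wraps the SoX commands listed above, and the names `EFFECTS`, `sox_commands` and `generate_data_set`, the output file naming scheme, and the restriction to `*.wav` input are all assumptions made for illustration.

```ruby
require "fileutils"
require "shellwords"

# Hypothetical table of variants: output-name suffix => SoX effect arguments.
# The effects and parameter values are the ones used in the commands above.
EFFECTS = {
  "trim_10s"  => %w[trim 10],
  "speed_110" => %w[speed 1.10],
  "pitch_100" => %w[pitch 100],
  "reversed"  => %w[reverse]
}.freeze

# Build the SoX argument lists for one input file. Nothing is executed here.
def sox_commands(input, out_dir, noise_volume: 0.1)
  base = File.basename(input, ".*")
  ext  = File.extname(input)

  commands = EFFECTS.map do |suffix, effect|
    ["sox", input, File.join(out_dir, "#{base}_#{suffix}#{ext}"), *effect]
  end

  # White noise with the length of the input, then mixed into the input.
  noise = File.join(out_dir, "#{base}_whitenoise#{ext}")
  noisy = File.join(out_dir, "#{base}_noisy#{ext}")
  commands << ["sox", input, noise, "synth", "whitenoise"]
  commands << ["sox", "-m", input, "-v", noise_volume.to_s, noise, noisy]
  commands
end

# Print (dry run) or execute the commands for every .wav file in in_dir.
def generate_data_set(in_dir, out_dir, dry_run: true)
  FileUtils.mkdir_p(out_dir) unless dry_run
  Dir.glob(File.join(in_dir, "*.wav")).sort.each do |input|
    sox_commands(input, out_dir).each do |cmd|
      if dry_run
        puts cmd.shelljoin
      else
        # Passing an argument list avoids shell quoting problems in file names.
        system(*cmd) || warn("failed: #{cmd.shelljoin}")
      end
    end
  end
end

# Only runs when two directories are given on the command line.
if __FILE__ == $PROGRAM_NAME && ARGV.size == 2
  generate_data_set(ARGV[0], ARGV[1], dry_run: false)
end
```

Calling `generate_data_set("songs", "out")` prints the commands without running them, which is a cheap way to check the file naming before SoX processes a whole collection. Keeping the variants in one table makes it easy to add further parameter values, for example several speed factors or pitch shifts.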