~ How To: Generate an Audio Fingerprinting Data Set With Sox Audio Effects
» By Joren on Wednesday 07 December 2011

A small part of Tarsos has been turned into an audio fingerprinting application. The idea of audio fingerprinting is to create a condensed representation of an audio file, such that perceptually similar audio files generate similar fingerprints. To test how robust a fingerprinting technique is, it is practical to have a data set of audio files that resemble each other in a known way.
SoX (Sound eXchange) is a command-line utility for sound processing that can apply audio effects to a sound file. By applying these effects to a set of unmodified songs, an audio fingerprinting data set can be created. To generate such a data set, SoX can be used to:
- Trim the first x seconds of a file
- Speed up or slow down the audio
- Change the pitch of a file without modifying the tempo
- Generate background noise (white noise is used)
- Reverse the audio stream
```bash
# Trim (remove) the first 10 seconds
sox input.wav output.wav trim 10

# Speed up by 10%
sox input.wav output.wav speed 1.10

# Change the pitch upwards by 100 cents (one semitone)
# without changing the tempo
sox input.wav output.wav pitch 100

# Generate white noise with the length of input.wav
sox input.wav noise.wav synth whitenoise
# Mix the white noise with the input to generate noisy output;
# -v defines how loud the white noise is
sox -m input.wav -v 0.1 noise.wav output.wav

# Reverse the audio
sox input.wav output.wav reverse
```
A Ruby script to generate a lot of these files can be found [attached](audio_fingerprinting_dataset_generator.rb.txt).
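The attached script is not reproduced here. As a rough idea of how such batch generation could look, here is a minimal sketch that wraps the SoX commands from this post in Ruby. The function names (`variant_commands`, `generate_variants`), the variant names and the output file naming scheme are made up for this illustration and are not taken from the attached script.

```ruby
require "fileutils"

# Builds the SoX command lines for one input file.
# Returns a hash: variant name => list of argv arrays, to be run in order.
# The variant names and output file names are hypothetical.
def variant_commands(input, out_dir)
  base = File.basename(input, ".*")
  out = ->(suffix) { File.join(out_dir, "#{base}_#{suffix}.wav") }
  noise = out.call("whitenoise_only")
  {
    # Remove the first 10 seconds
    "trim_10s"  => [["sox", input, out.call("trim_10s"), "trim", "10"]],
    # Speed up by 10%
    "speed_110" => [["sox", input, out.call("speed_110"), "speed", "1.10"]],
    # Shift the pitch up by 100 cents, tempo unchanged
    "pitch_100" => [["sox", input, out.call("pitch_100"), "pitch", "100"]],
    # Generate white noise as long as the input, then mix it in at volume 0.1
    "noise"     => [
      ["sox", input, noise, "synth", "whitenoise"],
      ["sox", "-m", input, "-v", "0.1", noise, out.call("noise")]
    ],
    # Reverse the audio
    "reversed"  => [["sox", input, out.call("reversed"), "reverse"]]
  }
end

# Runs every command for one input file; requires sox on the PATH.
def generate_variants(input, out_dir)
  FileUtils.mkdir_p(out_dir)
  variant_commands(input, out_dir).each do |name, commands|
    commands.each do |argv|
      # Passing an argv list to system avoids shell quoting issues in file names
      ok = system(*argv)
      warn "#{name} failed for #{input}" unless ok
    end
  end
end
```

Assuming the unmodified songs live in a directory called `songs`, a call such as `Dir.glob("songs/*.wav").each { |f| generate_variants(f, "dataset") }` would write five modified versions of each song, plus the intermediate noise file, to `dataset`.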