
~ How To: Generate an Audio Fingerprinting Data Set With Sox Audio Effects

A small part of Tarsos has been turned into an audio fingerprinting application. The idea of audio fingerprinting is to create a condensed representation of an audio file, such that perceptually similar audio files generate similar fingerprints. To test how robust a fingerprinting technique is, it helps to have a data set of audio files that are known to be alike in some way.

SoX (Sound eXchange) is a command-line utility for sound processing that can apply audio effects to a sound. Applying these effects to a set of unmodified songs yields an audio fingerprinting data set. To generate such a data set, SoX can be used to:

```bash
# Trim the first 10 seconds
sox input.wav output.wav trim 10

# speed-up of 10%
sox input.wav output.wav speed 1.10

# change the pitch upwards 100 cents (one semitone)
# without changing the tempo
sox input.wav output.wav pitch 100

# generate white noise with the length of input.wav
sox input.wav noise.wav synth whitenoise
# mix the white noise with the input to generate noisy output
# -v defines how loud the white noise is
sox -m input.wav -v 0.1 noise.wav output.wav

# reverse the audio
sox input.wav output.wav reverse
```
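The `speed` and `pitch` commands above use different units: a plain factor and cents. To compare how strongly each modification shifts the pitch, the two can be converted into each other, since 1200 cents make up one octave and an octave doubles the frequency. The helper names below are hypothetical and only serve as a small worked example:

```ruby
# Convert a pitch shift in cents to a frequency ratio.
# 1200 cents are one octave, and one octave doubles the frequency.
def cents_to_ratio(cents)
  2.0**(cents / 1200.0)
end

# Convert a frequency (or playback speed) ratio to a pitch shift in cents.
def ratio_to_cents(ratio)
  1200.0 * Math.log2(ratio)
end

# "pitch 100" raises every frequency by about 5.95%.
puts format('pitch 100  -> frequency ratio %.4f', cents_to_ratio(100))

# "speed 1.10" changes tempo and pitch together; the pitch rises by about 165 cents.
puts format('speed 1.10 -> pitch shift of %.1f cents', ratio_to_cents(1.10))
```

So a 10% speed-up shifts the pitch by more than one and a half semitones, which is a considerably larger change than the `pitch 100` example.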

A Ruby script to generate a lot of these files is attached: audio_fingerprinting_dataset_generator.rb.txt.
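As an illustration of how such a generator might be organised, here is a minimal sketch. It is not the attached script. The variant names, the `dataset` output directory and the helper methods are assumptions made for this example; only the SoX arguments are taken from the commands above. The script prints the commands instead of running them, so it can be tried without SoX installed.

```ruby
require 'shellwords'

# SoX effects from the examples above; the keys are hypothetical variant names.
EFFECTS = {
  'trimmed'  => %w[trim 10],
  'faster'   => %w[speed 1.10],
  'pitched'  => %w[pitch 100],
  'reversed' => %w[reverse]
}.freeze

# Build the output path for one variant: songs/a.wav -> out_dir/a_faster.wav
def variant_path(input, name, out_dir)
  ext = File.extname(input)
  base = File.basename(input, ext)
  File.join(out_dir, "#{base}_#{name}#{ext}")
end

# Return the sox commands (as argument arrays) for every variant of one input file.
def sox_commands(input, out_dir)
  commands = EFFECTS.map do |name, effect|
    ['sox', input, variant_path(input, name, out_dir), *effect]
  end
  # The noisy variant needs two steps: generate white noise with the
  # length of the input, then mix it with the input at a low volume.
  noise = variant_path(input, 'noise_only', out_dir)
  commands << ['sox', input, noise, 'synth', 'whitenoise']
  commands << ['sox', '-m', input, '-v', '0.1', noise,
               variant_path(input, 'noisy', out_dir)]
  commands
end

if __FILE__ == $PROGRAM_NAME
  out_dir = 'dataset' # assumed output directory; it has to exist before SoX runs
  ARGV.each do |input|
    sox_commands(input, out_dir).each do |cmd|
      puts cmd.shelljoin # dry run: print the command line
      # Replace the line above with system(*cmd) to actually run SoX.
    end
  end
end
```

Saved under a name of your choice, for example `generate_variants.rb`, it can be called as `ruby generate_variants.rb songs/*.wav`. Keeping each command as an argument array means file names with spaces are passed to SoX unchanged.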