This blog post documents how to get Dan Ellis's Matlab implementation of Avery Wang's Industrial-Strength Audio Search Algorithm running with GNU Octave on Ubuntu (and similar Linux distributions).
The Dan Ellis implementation is nicely documented here: Robust Landmark-Based Audio Fingerprinting. To download, inspect, and decode MP3 files, some external binaries are needed:
# Install Octave if needed
sudo apt-get install octave3.2

# Install the required dependencies for the script
sudo apt-get install mp3info curl

# mpg123 is not present as a package, install from source:
wget http://www.mpg123.de/download/mpg123-1.13.5.tar.bz2
tar xvvf mpg123-1.13.5.tar.bz2
cd mpg123-1.13.5/
./configure
make
sudo make install
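Before running anything in Octave, it can help to confirm that the tools installed above are actually reachable on the PATH. The snippet below is a minimal sketch of such a check; `missing_tools` is a hypothetical helper name chosen for illustration and is not part of the fingerprinting code.

```shell
# Print every tool from the argument list that cannot be found on the PATH.
# missing_tools is a hypothetical helper, not part of Dan Ellis's code.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Check the binaries this post relies on.
missing=$(missing_tools octave mpg123 mp3info curl)
if [ -n "$missing" ]; then
    echo "Missing tools:" $missing
else
    echo "All required tools found"
fi
```

If something is reported as missing, repeat the matching install step before continuing.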
In mp3read.m the following code was changed (lines 111 and 112):
mpg123 = 'mpg123'; % was fullfile(path,['mpg123.',ext]);
mp3info = 'mp3info'; % was fullfile(path,['mp3info.',ext]);
After that, the demo program runs flawlessly when executing:
octave -q demo_fingerprint.m
Running the demo with the original code on GNU Octave 3.2.3 takes 152 seconds on a PC with a Q9650 @ 3GHz processor. A small tweak makes it run almost 8 times faster, which makes a big difference when working with larger data sets (10k audio files). I do not know why, but storing a hash in the large hash table was really slow (0.5s per hash, with 900 hashes per song…). Caching the hashes and adding them all at once makes it faster (at least in Octave, YMMV). The optimized version of record_hashes.m can be found attached. With this alteration the same demo ran in 20s. With the data cached locally, the run time drops from 141s to 11.5s, or about 12 times faster.

The code with all the changes can be found here: Robust Landmark-Based Audio Fingerprinting – optimized for Octave 3.2. Please note again that the implementation is by Dan Ellis (2009) (available on Robust Landmark-Based Audio Fingerprinting) and I only made some small tweaks.
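The speedup figures quoted above are simply the ratio of the original run time to the optimized run time. As a small sketch, the helper below (`speedup` is a hypothetical name used only for illustration) reproduces them from the timings given in this post; the same function can be reused with your own measurements.

```shell
# Speedup = original run time / optimized run time, rounded to one decimal.
# speedup is a hypothetical helper; the timings are the ones quoted in this post.
speedup() {
    LC_ALL=C awk -v before="$1" -v after="$2" \
        'BEGIN { printf "%.1f\n", before / after }'
}

speedup 152 20     # original demo vs. optimized demo: prints 7.6
speedup 141 11.5   # same comparison with the data cached locally: prints 12.3
```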