Articles Tagged 'UGent'

~ ESP32 Thing as xOSC alternative

The xOSC board by x-io technologies looks like a very nice solution for many interactive wireless setups. Judging from the specifications and documentation it offers a lot of value. It is basically a small WiFi transmitter with some sensors and a battery attached to it. The board also has some drawbacks. 1) It is expensive at about € 180, which is especially problematic if you need five or so for your application. 2) It seems hard to add extra sensors via SPI or I²C. 3) The battery needs to be removed to charge, which makes it harder to build into a fixed enclosure. This post describes an alternative based on the ESP32 platform that addresses these shortcomings.

The ESP32 is a micro-controller with a WiFi transmitter which can be programmed using the Arduino environment. SparkFun sells a board called the ESP32 Thing, which is built around the ESP32 chip. It can be used to build an xOSC alternative.

  1. It costs about $20. Add a battery ($5) and a sensor such as an IMU ($20) and you end up with a price tag of about $45. The price of course depends on which exact sensor and battery you need for your application. A 500 mAh battery lasts about two hours when sending 66 messages per second over WiFi (using UDP).
  2. The ESP32 Thing supports the Arduino environment, which potentially allows you to use all available Arduino libraries and supported sensors. However, some libraries contain hardware-specific instructions, and since the hardware is rather new (large scale production started only three months ago) many of those have not been ported yet. Fortunately a lot of libraries simply work without any changes. Hackaday has been testing a few: ESP32 and Arduino libraries. I had success with the BNO055 library, which did not need any changes. The OSC library did need some small changes to operate as expected.
  3. The Thing contains a battery charging circuit. Once embedded into an enclosure the battery can stay in place. The software running on the device even keeps running when changing power sources.

Attached to this post you can find modifications to the Arduino OSC library that enable it to run on the ESP32: ESP32-Arduino-OSC-library together with a patch that sends random data over OSC. This should enable you to build an xOSC alternative.
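
To test a receiver without hardware, or to check what arrives from the board, it helps to know what an OSC message looks like on the wire. The following is a minimal Java sketch, not the code from the attached library: it encodes a message with a single float argument according to the OSC 1.0 format and sends it over UDP. The address /imu/x, the loopback target and port 9000 are assumptions chosen for illustration.

```java
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

class OscSketch {
    // OSC strings are ASCII, null-terminated and padded with nulls to a multiple of four bytes.
    static byte[] paddedString(String s) {
        byte[] raw = s.getBytes(StandardCharsets.US_ASCII);
        int paddedLength = (raw.length / 4 + 1) * 4; // leaves room for at least one null byte
        byte[] padded = new byte[paddedLength];
        System.arraycopy(raw, 0, padded, 0, raw.length);
        return padded;
    }

    // Encodes an OSC message that carries one 32-bit float argument.
    static byte[] encodeFloatMessage(String address, float value) {
        byte[] addr = paddedString(address);
        byte[] tags = paddedString(",f"); // type tag string: one float
        ByteBuffer buffer = ByteBuffer.allocate(addr.length + tags.length + 4); // big-endian by default
        buffer.put(addr).put(tags).putFloat(value);
        return buffer.array();
    }

    public static void main(String[] args) throws Exception {
        byte[] message = encodeFloatMessage("/imu/x", 0.5f); // hypothetical address
        try (DatagramSocket socket = new DatagramSocket()) {
            // Hypothetical receiver on this machine, port 9000.
            InetAddress target = InetAddress.getLoopbackAddress();
            socket.send(new DatagramPacket(message, message.length, target, 9000));
        }
        System.out.println("Sent " + message.length + " bytes");
    }
}
```

Since UDP is connectionless, the send succeeds even if nothing is listening, which is convenient when the receiving software is not running yet.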

A drawback of the ESP32 is that the supporting software is still quite immature. There is a Bluetooth chip on the ESP32 which is currently not supported in the Arduino environment. The setup can be somewhat challenging and the documentation can be improved. Some ESP32 Things also seem unable to connect to WiFi, which is quite annoying.

  • A signal from the ESP32

  • Graphical datasheet

  • ESP32 with battery and sensor.

~ IPEM at Opening Event Digital Week

Last Saturday, October 8, 2016, IPEM was present at the opening event of the Digital Week. A small video report was made for VRT news; unfortunately our contribution did not make the cut.

From 8 to 16 October 2016 the eleventh edition of the Digital Week (De Digitale Week) takes place. During this week local organisations throughout Flanders and Brussels organise a range of accessible activities centred on the use of multimedia, always free or very cheap, and open to both beginners and people with more experience. In addition, a large publicity campaign runs during the Digital Week that draws attention to the themes of e-inclusion and media literacy.

  • Overview of our installation

~ IPEM at Parklife 2016

This weekend IPEM, the research institute in musicology of Ghent University, was present at Parklife 2016. Parklife is a music festival with a special focus on interactive music installations aimed at children. Two of those were provided by IPEM.

The first installation was a trampoline that triggered sounds. Two trampolines were each fitted with a pressure sensor, and an Axoloti provided the sonic feedback. A simple but fun experience, especially for younger children.

The second installation was more involved. It consisted of a bike – controlled by a first participant – that provided the speed of falling blocks that a second participant had to step on. When the second participant triggered the blocks on time a melody appeared. The video above makes it more clear.

~ Real-time signal synchronization with acoustic fingerprinting - A Master's Thesis By Ward Van Assche

During the last semester Ward wrote a Master's thesis titled Real-time signal synchronization with acoustic fingerprinting. Marleen Denert and I both served as promoters of the thesis.

The aim of the thesis was to design and develop a system to automatically synchronize streams of incoming sensor data in real-time. Ward followed up on an idea that was described in an article called Synchronizing Multimodal Recordings Using Audio-To-Audio Alignment. The extended abstract can be consulted. The remainder of the thesis is in Dutch.

For the thesis Ward developed a Max/MSP object to read data from sensors together with audio. Also provided by Ward is an object to synchronize audio and data in real-time. The objects are depicted above.

~ Connecting Musical Modules - Musical Hardware and Software Interfaces

I gave a presentation at the Newline conference, a yearly event organized by Hackerspace Ghent. It was about:

“In this talk I will give a practical overview on how to connect hard- and software components for musical applications. Next to an overview there will be demos! Do you want to make a musical instrument using a light sensor? Use your smartphone as an input device for a synth? Or are you simply interested in simple low-latency communication between devices? Come to this talk! More concretely the talk will feature the Axoloti audio board, Teensy micro-controller with audio board, MIDI and OSC protocols, Android MIDI features and some sensors.”

During the presentation the hardware and software components were demonstrated. More concretely an introduction was given to the following:

The presentation about DIY musical modules can be downloaded here.

~ Lecture on MIR - Tone Scale Extraction - Acoustic Fingerprinting

This morning, the 30th of October 2015, I gave a lecture on Music Information Retrieval in general and two MIR-tasks in particular. The two more detailed tasks were tone scale analysis and acoustic fingerprinting.

During the lecture some live demonstrations were done with Panako and Tarsos. Some examples from TarsosDSP were used as well. Excerpts of the music used are available here, which is especially interesting if you want to repeat the demos. Sonic Visualiser, Music21 and MuseScore were also mentioned during the lecture.

The presentation about Music Information Retrieval and the handouts can be found here as well.

~ TgForce Sensor on Android

Kelsec Systems developed a nice sensor for measuring running impact, the TgForce Running Impact Sensor. The sensor comes with an iOS application but has no counterpart on Android. To interface with the sensor on Android I needed to create some glue code. The people of Kelsec Systems were kind enough to mail some documentation about the protocol and with that information I got to work.

The TgForce Sensor Android code is available on GitHub, together with some documentation which is available below as well:

TgForce Impact Running Sensor Android API

The TgForceSensor repository contains Android code to interface with the TgForce Impact Running Sensor. The TgForce sensor is a Bluetooth LE device that measures tibial shock. It follows the Bluetooth LE standards and is relatively easy to interface with.

This repository contains Android code to interface with the device. The protocol is encoded in the source code and is documented in the readme.

  • TgForce Sensor

~ Spontaneous Entrainment of Running Cadence to Music Tempo

Last week my colleague Edith van Dyck sent out a press release about her research on music and sports. The UGent press release ‘Music influences step frequency in runners’:

Since many joggers train with music, researchers at IPEM (the research centre of the Musicology section, Department of Art, Music and Theatre Sciences at UGent) wanted to find out whether the tempo of music can influence step frequency while running. Earlier studies had already shown that music can have a motivating effect on sports performance and that a higher step frequency can help to prevent injuries.

A write-up of the research can be read in the article Spontaneous Entrainment of Running Cadence to Music Tempo. The press release was picked up well by the media, and the local television station AVS showed interest as well. A camera crew came by, which resulted in the following report. My girlfriend and I appear in it as extras. The leading role is reserved for Dieter.

~ Access Mi Band from Android - Notes on the Bluetooth LE Protocol

The Mi Band is a bracelet with some sensors, three RGB LEDs and a vibration motor. It is marketed as an activity tracker and notifier. It is a neat little device that communicates via Bluetooth LE and has a battery life of around 30 days. It would be nice if it could be used for whatever purpose you want but alas, its API is not very open. This blog post gives pointers to useful resources and tips to make it work with your own code.

There have been some efforts to reverse engineer the Bluetooth protocol. This blog post contains some info. There are even complete implementations of the protocol available: a Mi Band protocol implementation in Python and a Mi Band protocol implementation in Java. It is however not always clear which firmware version is targeted.

I would advise against installing the official Mi Band app if you want to use the band with custom code. The app upgrades the firmware to the latest version and it seems that Xiaomi is obfuscating the protocol more and more with each version. I was able to send vibrate and LED commands to a Mi Band running the firmware version listed in the notes below. With the previously mentioned sources and the following flow the device reacts to commands. I used an Android device. The flow:

  1. Pair with the Mi Band in the Android Bluetooth setting.
  2. In your code, connect to the paired device. Save the device address, you will need it later.
  3. Send a pair command to the device. This is part of the Mi Band protocol and has nothing to do with the previous Bluetooth pairing. If all goes well it reacts with a 2. See here
  4. Send user info. This step is crucial and not trivial. The user info needs to be encoded in a certain way and is CRC’d with the device address. The following is an example implementation of the Mi Band user info encoding
  5. Now you can send vibrate or other commands.

Some notes: the self-test command works without the set user step. For Android the Mi Band protocol implementation in Java works well. To check the firmware version of the device, read the device info characteristic. The last bytes, interpreted as an integer, define the version info. For my device it is:

Write to characteristic 0000ff05-0000-1000-8000-00805f9b34fb
onCharacteristicWrite status: 0 characteristic 0000ff05-0000-1000-8000-00805f9b34fb
Read firmware version
11 value: 2
12 value: 3
13 value: 9
14 value: 0
15 value: 1

Another note: the set user info command needs to be called with a 1 as type the first time the band is used. This is done with new UserInfo(20111111, 1, 32, 180, 55, "NM", 1) with the Android SDK by GitHub user pangliang. This sets and overwrites the user info. On subsequent calls you do not want to overwrite the info, so the type needs to be zero.

~ Synchronizing Multimodal Recordings Using Audio-To-Audio Alignment - In Journal on Multimodal User Interfaces

The article titled “Synchronizing Multimodal Recordings Using Audio-To-Audio Alignment” by Joren Six and Marc Leman has been accepted for publication in the Journal on Multimodal User Interfaces. The article will be published later this year. It describes and tests a method to synchronize data-streams. Below you can find the abstract, pointers to the software under discussion and an author version of the article itself.

Synchronizing Multimodal Recordings Using Audio-To-Audio Alignment
An Application of Acoustic Fingerprinting to Facilitate Music Interaction Research

Abstract: Research on the interaction between movement and music often involves analysis of multi-track audio, video streams and sensor data. To facilitate such research a framework is presented here that allows synchronization of multimodal data. A low cost approach is proposed to synchronize streams by embedding ambient audio into each data-stream. This effectively reduces the synchronization problem to audio-to-audio alignment. As a part of the framework a robust, computationally efficient audio-to-audio alignment algorithm is presented for reliable synchronization of embedded audio streams of varying quality. The algorithm uses audio fingerprinting techniques to measure offsets. It also identifies drift and dropped samples, which makes it possible to find a synchronization solution under such circumstances as well. The framework is evaluated with synthetic signals and a case study, showing millisecond accurate synchronization.

To read the article, consult the author version of Synchronizing Multimodal Recordings Using Audio-To-Audio Alignment. The data-set used in the case study is available here. It contains recordings of balance board data, accelerometers and two webcams that need to be synchronized. The final publication is available at Springer via 10.1007/s12193-015-0196-1

The algorithm under discussion is included in Panako, an audio fingerprinting system, but is also available for download here. The SyncSink application has been packaged separately for ease of use.
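
The core idea behind measuring an offset with fingerprints can be illustrated in a few lines: true matches between two streams all share the same time difference, while spurious matches are scattered, so the most frequent difference wins. The sketch below is a simplified illustration under that assumption and is not the Panako implementation; the class name, the millisecond resolution and the example data are made up.

```java
import java.util.HashMap;
import java.util.Map;

class OffsetVoting {
    // Returns the most frequent time difference (in ms) between matched fingerprint pairs.
    // Each row of matches is {timeInReferenceMs, timeInOtherMs}.
    static int mostCommonOffsetMs(int[][] matches) {
        Map<Integer, Integer> votes = new HashMap<>();
        int bestOffset = 0;
        int bestCount = 0;
        for (int[] match : matches) {
            int offset = match[1] - match[0];
            int count = votes.merge(offset, 1, Integer::sum); // add one vote for this offset
            if (count > bestCount) {
                bestCount = count;
                bestOffset = offset;
            }
        }
        return bestOffset;
    }

    public static void main(String[] args) {
        int[][] matches = {
            {1000, 3500}, {2200, 4700}, {4100, 6600}, // consistent matches, offset 2500 ms
            {1500, 900}, {3000, 7100}                 // spurious matches
        };
        System.out.println(mostCommonOffsetMs(matches) + " ms"); // prints "2500 ms"
    }
}
```

Drift and dropped samples, which the article also handles, would show up as an offset that changes over time and are outside the scope of this sketch.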

To use the application, start it by double-clicking the downloaded SyncSink JAR-file. Subsequently add various audio or video files using drag and drop. If the same audio is found in the various media files a time-box plot appears, as in the screenshot below. To add corresponding data-files click one of the boxes on the timeline and choose a data file that is synchronized with the audio. The data-file should be a CSV-file. The separator should be ‘,’ and the first column should contain a time-stamp in fractional seconds. After pressing Sync a new CSV-file is created with the first column containing correctly shifted time stamps. If this is done for multiple files, a synchronized sensor-stream is created. Also, ffmpeg commands to synchronize the media files themselves are printed to the command line.
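
What shifting the time stamps amounts to can be shown with a small example. This is a sketch of the idea, not the SyncSink source: it assumes a CSV without a header row, a ‘,’ separator, and an offset in seconds that has already been determined. The three-decimal output format is an arbitrary choice.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class CsvTimeShift {
    // Adds offsetSeconds to the time stamp in the first column of every CSV line.
    static List<String> shift(List<String> lines, double offsetSeconds) {
        List<String> shifted = new ArrayList<>();
        for (String line : lines) {
            int comma = line.indexOf(',');
            String first = comma < 0 ? line : line.substring(0, comma);
            String rest = comma < 0 ? "" : line.substring(comma); // keeps the other columns untouched
            double time = Double.parseDouble(first.trim()) + offsetSeconds;
            shifted.add(String.format(Locale.ROOT, "%.3f", time) + rest);
        }
        return shifted;
    }

    public static void main(String[] args) {
        List<String> data = List.of("0.000,512,498", "0.010,515,497");
        // Prints 2.500,512,498 and 2.510,515,497
        shift(data, 2.5).forEach(System.out::println);
    }
}
```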

This work was supported by a Methusalem grant from the Flemish Government, Belgium. Special thanks go to Ivan Schepers for building the balance boards used in the case study. If you want to cite the article, use the following BibTeX:

@article{six2015multimodal,
  author      = {Joren Six and Marc Leman},
  title       = {{Synchronizing Multimodal Recordings Using Audio-To-Audio Alignment}},
  issn        = {1783-7677},
  volume      = {9},
  number      = {3},
  pages       = {223--229},
  doi         = {10.1007/s12193-015-0196-1},
  journal     = {{Journal on Multimodal User Interfaces}},
  publisher   = {Springer Berlin Heidelberg},
  year        = 2015
}

  • The synchronized data from the two webcams, accelerometer and balance board in ELAN. From top to bottom the synchronized streams are two video-streams, balance-board data (red), accelerometer-data (green) and audio (black).

  • Conceptual drawing used as a basis for the SyncSink application. A reference stream (blue) can be synchronized with streams one and two. It allows a workflow where streams are started and stopped (red) or start before the reference stream (green).

  • A microcontroller fitted with an electret microphone and a microSD card slot. It can record audio in real-time together with sensor data.

  • SyncSink Synchronize media files. A user-friendly interface to synchronize media and data files. First a reference media-file is added using drag-and-drop. The audio stream of the reference is extracted and plotted on a timeline as the topmost box. Subsequently other media-files are added. The offsets with respect to the reference are calculated and plotted. CSV-files with timestamps and data recorded in sync with a stream can be attached to a respective audio stream. Finally, after pressing Sync!, the data and media files are modified to be exactly in sync with the reference.

  • Multimodal recording system diagram. Each webcam has a microphone and is connected to the pc via USB. The dashed arrows represent analog signals. The balance board has four analog sensors but these are simplified to one connection in the schematic. The analog output of the microphones is also recorded through the DAQ. An analog accelerometer is connected with a microcontroller which also records audio.

  • Two streams of audio with fingerprints marked. Some fingerprints are present in both streams (green, O) while others are not (red, x). Matching fingerprints have the same offset, indicated by the dotted lines.

  • Synchronized streams in Sonic Visualiser. Here you can see two channel audio synchronized with accelerometer data (top, green) and balance board data (bottom, purple).

~ Control Audio Time Stretching and Pitch Shifting from Java using Rubber Band And JNI

This post explains how to do real-time pitch-shifting and audio time-stretching in Java. It uses two components. The first component is a high quality C++ library for audio time-stretching and pitch-shifting called Rubber Band. The second component is a Java audio library called TarsosDSP. To bridge the gap between the two, JNI is used. Rubber Band provides a JNI interface and starting from the currently unreleased version 1.8.2, makefiles are provided that make compiling and subsequently using the JNI version of Rubber Band relatively straightforward.

However, it still requires some effort to control real-time pitch-shifting and audio time-stretching from Java. To make it easier some example code and documentation is available in a GitHub repository called RubberBandJNI. It documents some of the configuration steps needed to get things working. It also offers precompiled libraries and documents how to compile those for the following systems:

If the instructions are followed rather precisely you are able to control the tempo of a song in real-time with the following Java code:

// Tempo and pitch factors; a factor of 1.0 leaves the audio unchanged.
float tempoFactor = 0.8f;
float pitchFactor = 1.0f;
// Decode music.mp3 to 44.1 kHz audio in buffers of 4096 samples without overlap.
AudioDispatcher adp = AudioDispatcherFactory.fromPipe("music.mp3", 44100, 4096, 0);
TarsosDSPAudioFormat format = adp.getFormat();
// The Rubber Band processor has to be part of the processing chain to have any effect.
RubberBandAudioProcessor rbs = new RubberBandAudioProcessor(44100, tempoFactor, pitchFactor);
adp.addAudioProcessor(rbs);
adp.addAudioProcessor(new AudioPlayer(JVMAudioInputStream.toAudioFormat(format)));
new Thread(adp).start();

  • User interface to control tempo/pitch of audio in Java. It uses RubberBand, a high quality time-stretcher library implemented in C++, called via JNI.

~ Decode MP3s and other Audio formats the easy way on Android

This post describes how to decode MP3s using an already compiled ffmpeg binary on Android. Using ffmpeg to decode audio on Android has advantages:

  • It supports about every audio format known to man. Three channel flac, vorbis with 32 bit samples, … do not pose a problem.
  • Extracting audio from video container formats is supported. Accessing the first audio stream from mkv, avi, mov,… just works.
  • Decoding audio frames is more efficient using native code than often buggy Java decoders.
  • Resampling and downmixing is supported. If you want to resample incoming audio to e.g. 44.1kHz and only want single channel audio this is easily achievable.

The main disadvantage is that you need an ffmpeg build for your Android device. Luckily some poor soul already managed to compile ffmpeg for Android for several architectures. The precompiled ffmpeg binaries for Android are available for download and are mirrored here as well.

To bridge the ffmpeg binary and the Java world TarsosDSP contains some glue code. The AndroidFFMPEGLocator is responsible for finding and extracting the correct binary for your Android device. It expects these ffmpeg binaries in the assets folder of your Android application. When the correct ffmpeg binary has been extracted and made executable the PipeDecoder is able to call it. The PipeDecoder calls ffmpeg so that decoded, downmixed and resampled PCM samples are streamed into the Java application via a pipe, which explains its name.
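
The pipe mechanism can be illustrated outside of Android with plain Java. The following is a rough sketch of the technique and not the PipeDecoder source: it assumes an ffmpeg binary on the PATH, asks it for mono signed 16-bit little-endian PCM on standard output, and converts the bytes to floats. The class and method names are made up for this example.

```java
import java.io.InputStream;
import java.util.List;

class FfmpegPipeSketch {
    // Builds an ffmpeg command that writes mono, signed 16-bit little-endian PCM to stdout.
    static List<String> command(String ffmpegPath, String resource, int sampleRate) {
        return List.of(ffmpegPath, "-i", resource, "-vn",
                "-ar", String.valueOf(sampleRate), "-ac", "1",
                "-f", "s16le", "pipe:1");
    }

    // Converts little-endian 16-bit PCM bytes to floats in the range [-1, 1).
    static float[] toFloats(byte[] pcm, int byteCount) {
        float[] samples = new float[byteCount / 2];
        for (int i = 0; i < samples.length; i++) {
            int low = pcm[2 * i] & 0xFF;
            int high = pcm[2 * i + 1]; // sign-extended, so negative samples stay negative
            samples[i] = ((high << 8) | low) / 32768f;
        }
        return samples;
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 1) {
            System.err.println("Usage: java FfmpegPipeSketch.java <audio file>");
            return;
        }
        Process ffmpeg = new ProcessBuilder(command("ffmpeg", args[0], 44100))
                .redirectError(ProcessBuilder.Redirect.DISCARD) // ignore ffmpeg's log output
                .start();
        long total = 0;
        byte[] buffer = new byte[4096];
        try (InputStream pcm = ffmpeg.getInputStream()) {
            int read;
            // readNBytes fills the buffer completely unless the stream ends.
            while ((read = pcm.readNBytes(buffer, 0, buffer.length)) > 0) {
                total += toFloats(buffer, read).length;
            }
        }
        System.out.println("Decoded " + total + " samples");
    }
}
```

Letting ffmpeg do the resampling and downmixing keeps the Java side trivial: it only ever sees one sample format.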

With the TarsosDSP Android library the following code plays an MP3 from external storage:

// Extracts the ffmpeg binary matching this device from the assets folder.
new AndroidFFMPEGLocator(this);
new Thread(new Runnable() {
  public void run() {
    File externalStorage = Environment.getExternalStorageDirectory();
    File mp3 = new File(externalStorage.getAbsolutePath(), "/audio.mp3");
    AudioDispatcher adp;
    adp = AudioDispatcherFactory.fromPipe(mp3.getAbsolutePath(), 44100, 5000, 2500);
    adp.addAudioProcessor(new AndroidAudioPlayer(adp.getFormat(), 5000, AudioManager.STREAM_MUSIC));
    // Run the dispatcher on this background thread until the stream ends.
    adp.run();
  }
}).start();

This code just works if the application has the READ_EXTERNAL_STORAGE permission, includes a recent TarsosDSP-Android.jar, is run on one of the supported ffmpeg architectures and has these binaries available in the assets folder.

~ TarsosDSP featured in EFY Plus Magazine

TarsosDSP, a real-time audio processing library written in Java, is featured in EFY Plus Magazine of July 2015. It is a leading electronics magazine with a history going back more than 40 years and about 300 000 subscribers, mainly in India. The index mentions this:

TarsosDSP: A Real-Time Audio Analysis and Processing Framework
In last month’s EFY Plus, we discussed Essentia, a C++ library for audio analysis. In this issue we will discuss a Java based real-time audio analysis and processing framework known as TarsosDSP.

To read the full article, buy a (digital) copy of the magazine.

~ TeensyDAQ - Capture, Visualize and Record Analog Input Signals from Teensy

This post describes a tool to quickly visualize and record analog signals with a Teensy micro-controller and some custom software. It is mainly useful to quickly get an idea of how an analog sensor reacts to different stimuli. Since it is also able to capture and store analog input signals, it can generate test recordings which can then be used, for example, to test a peak detection algorithm. The tool is called TeensyDAQ, hinting at the Data AcQuisition features and the micro-controller used.

Some of the features of the TeensyDAQ:

  • Visualize up to five analog signals simultaneously in real-time.
  • Capture analog input signals with sampling rates up to 8000Hz.
  • Record analog input to a CSV-file and, using drag-and-drop, previously recorded CSV-files can be visualized.
  • Works on Linux, Mac OS X and Windows.
  • While a capture session is in progress you can go back in time and zoom, pan and drag to get a detailed view of your data.
  • Allows you to listen to your input signal, this is especially practical with analog microphone input.

The system consists of two parts: a hardware and a software part. The hardware is a Teensy micro-controller running an Arduino sketch that reads analog inputs A0 to A4 at the requested sampling rate. A Teensy is used instead of a regular Arduino for two reasons. First, the Teensy is capable of much higher data throughput: it is able to send five readings at 8000Hz, which is impossible on Arduino. The second reason is the 13 bit analog read resolution. A classic Arduino only provides 10 bits.
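
Some back-of-the-envelope arithmetic makes both reasons concrete. The sketch below rests on assumptions that are not taken from the TeensyDAQ source: each reading is sent as two bytes (13 bits do not fit in one), and the classic Arduino is compared at a 115200 baud serial link with 8N1 framing, which puts ten bits on the wire per byte.

```java
class SerialThroughput {
    // Bytes per second needed to stream all channels, given a fixed size per reading.
    static long requiredBytesPerSecond(int channels, int sampleRateHz, int bytesPerReading) {
        return (long) channels * sampleRateHz * bytesPerReading;
    }

    // Payload bytes per second on a UART with 8N1 framing: 10 bits on the wire per byte.
    static long uartBytesPerSecond(int baudRate) {
        return baudRate / 10;
    }

    // Number of distinct values an ADC with the given resolution can report.
    static int adcLevels(int bits) {
        return 1 << bits;
    }

    public static void main(String[] args) {
        // Five channels at 8000 Hz, two bytes each: 80000 bytes/s.
        System.out.println("Needed: " + requiredBytesPerSecond(5, 8000, 2) + " bytes/s");
        // A 115200 baud link carries at most 11520 bytes/s.
        System.out.println("115200 baud UART: " + uartBytesPerSecond(115200) + " bytes/s");
        // 10 bits give 1024 levels, 13 bits give 8192 levels.
        System.out.println("10 bit: " + adcLevels(10) + " levels, 13 bit: " + adcLevels(13) + " levels");
    }
}
```

Under these assumptions the stream needs roughly seven times what such a serial link can carry, and the 13 bit readings resolve eight times as many levels as 10 bit readings.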

The software part reads data from the serial port the Teensy is attached to. It interprets the data and stores it in an efficient data-structure. As quickly as possible the data is visualized. The software is written in Java. A recent Java runtime environment is needed to execute it.

Try out the latest version of TeensyDAQ or check out the source code in the TeensyDAQ source repository on GitHub.

  • The interface allows going back in time and zooming, panning, dragging.

  • The ports used by TeensyDAQ marked in green. Mainly A0 to A4.

  • The hardware: a Teensy and a simple light sensor.

  • The interface for live visualization.

~ Notifications from an RFduino over Bluetooth LE (4.0) on a Linux machine

This post describes how to get notifications from a Bluetooth LE or Bluetooth v4.0 device on a Linux machine. Since it took me a while to get it going it is perhaps of interest to others.

The hardware I used is an RFduino board and a Belkin mini Bluetooth v4.0 adapter. The RFduino was programmed to wait for an event with RFduino_pinWake(pin, HIGH). When the pin is HIGH a count is incremented and this number is sent to any device that is listening, in my case a Linux machine. The code is essentially the same as the button example included in the RFduino software distribution.

To install the Bluetooth stack on Debian, execute the following command: sudo apt-get install bluetooth bluez bluez-utils bluez-firmware. A blog post describes more about the Bluetooth tools. Some other interesting reads are Get started with Bluetooth Low Energy and this stackoverflow question. Once the stack is installed correctly the lescan utility should give an output like this:

$ sudo hcitool lescan
LE Scan ...
DC:87:CC:18:14:A5 RFduino
DC:87:CC:18:14:A5 (unknown)

Bluetooth LE works with the Generic Attribute Profile (GATT). A Bluetooth LE device can provide services by combining characteristics. These characteristics are the way to communicate with the device. Some characteristics are writable and are able to send notifications. To receive notifications one such characteristic (referred to with a hex handle) needs to be written. Write 0100 to get notifications, 0200 for indications (indications are notifications that are acknowledged), 0300 for both, or 0000 for nothing (default). With this in mind, the following command enables listening for notifications:

gatttool --device=DC:87:CC:18:14:A5  --char-write-req --handle=0x000f --value=0300 --listen

With those commands working, the process can be automated with a Ruby script to get Bluetooth LE notifications. The script essentially calls gatttool with the correct parameters and parses and reacts to its output. To make it work lescan needs to be called before starting the script:

$ sudo hcitool lescan && ruby bluetooth_notifications.rb 
LE Scan ...
DC:87:CC:18:14:A5 RFduino
DC:87:CC:18:14:A5 (unknown)
Characteristic value was written successfully
Notification handle = 0x000e value: 41 decimal value: 65
Notification handle = 0x000e value: 42 decimal value: 66
Notification handle = 0x000e value: 43 decimal value: 67
Notification handle = 0x000e value: 44 decimal value: 68
Notification handle = 0x000e value: 45 decimal value: 69
Notification handle = 0x000e value: 46 decimal value: 70
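
The parsing step of such a script is small. As an illustration in Java rather than Ruby, the sketch below extracts the first value byte from a notification line and converts it from hex to decimal. It is not the downloadable script, and the class and method names are made up; lines that are not notifications are ignored.

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GatttoolLineParser {
    // Matches lines such as "Notification handle = 0x000e value: 41".
    private static final Pattern NOTIFICATION =
            Pattern.compile("Notification handle = 0x([0-9a-fA-F]{4}) value: ([0-9a-fA-F]{2})\\b");

    // Returns the first value byte of a notification line as a decimal number, if present.
    static OptionalInt firstValueByte(String line) {
        Matcher m = NOTIFICATION.matcher(line);
        if (!m.find()) {
            return OptionalInt.empty();
        }
        return OptionalInt.of(Integer.parseInt(m.group(2), 16));
    }

    public static void main(String[] args) {
        // Prints OptionalInt[65]
        System.out.println(firstValueByte("Notification handle = 0x000e value: 41"));
        // Prints OptionalInt.empty
        System.out.println(firstValueByte("Characteristic value was written successfully"));
    }
}
```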

~ Access Features for Music Using AcoustID, Musicbrainz and AcousticBrainz

This post describes how to connect music in your library with precomputed features. Say, for example, you are developing a DJ application and you want to facilitate mixing tracks. To provide a seamless mix you perhaps want information about beats and about the key the music in your library is in. Since vast databases of features are already available you probably want to access those, instead of using your own feature extractors and database. The problems that need to be addressed are:

  1. Automatically identify the music in your library without relying on incomplete meta-data (tag information).
  2. Connect the music with a data-base of meta-data. Preferably a large and well curated database.
  3. Fetch pre-computed features for the music. The features should be extracted using algorithms that are currently state of the art or at least perform well. The features and the audio itself should be synchronized, otherwise beat information, for example, is not of much use.

To help with these tasks there are several open source tools and services available.

To identify music a condensed representation of musical audio is created. This process is known as acoustic fingerprinting. On the AcoustID website a tool is available to create such a fingerprint. The library is called Chromaprint and the command line client is called fpcalc. Currently the latest version is Chromaprint version 1.2 and static binaries for fpcalc are available on the AcoustID website. A package for Debian (and probably Ubuntu) can be installed by calling apt-get install libchromaprint-tools. Once this tool is correctly installed a fingerprint for a piece of music can be created:

fpcalc music.mp3


A fingerprint by itself is not of much use. The AcoustID webservice translates a fingerprint into one or more MusicBrainz identifiers. One fingerprint can result in multiple identifiers because the same audio can be released on several albums. There is documentation for AcoustID webservice available. To use the webservice an API key is needed. Confusingly, the AcoustID service has two types of API keys. One for end-users and one for developers. The last type is needed to translate ID’s. To request a developer API key, log in on the AcoustID website and “add an application”, there you can find the correct API key. Substitute dev_api_key in the following URL. Also change the fingerprint and duration to match the information provided by the fpcalc application. The webservice should reply with a set of MusicBrainz identifiers:
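
As an illustration of how the pieces fit together, the sketch below parses the KEY=VALUE lines printed by fpcalc and assembles a lookup request. The endpoint and the parameter names (client, meta, duration, fingerprint) follow the AcoustID v2 web service as I understand it; check them against the AcoustID documentation before relying on them. The fpcalc output in the example is shortened and made up, and dev_api_key is a placeholder.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

class AcoustIdLookupUrl {
    // Parses KEY=VALUE lines, as printed by fpcalc, into a map.
    static Map<String, String> parseFpcalc(String output) {
        Map<String, String> fields = new HashMap<>();
        for (String line : output.split("\\R")) {
            int eq = line.indexOf('=');
            if (eq > 0) {
                fields.put(line.substring(0, eq), line.substring(eq + 1));
            }
        }
        return fields;
    }

    // Builds a lookup URL that asks for recording (MusicBrainz) identifiers.
    static String lookupUrl(String apiKey, Map<String, String> fields) {
        return "https://api.acoustid.org/v2/lookup"
                + "?client=" + URLEncoder.encode(apiKey, StandardCharsets.UTF_8)
                + "&meta=recordingids"
                + "&duration=" + fields.get("DURATION")
                + "&fingerprint=" + URLEncoder.encode(fields.get("FINGERPRINT"), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Shortened, made-up fpcalc output for illustration only.
        String fpcalcOutput = "FILE=music.mp3\nDURATION=641\nFINGERPRINT=AQABz0qUkZK4";
        System.out.println(lookupUrl("dev_api_key", parseFpcalc(fpcalcOutput)));
    }
}
```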

AcousticBrainz provides features for a subset of music that has a MusicBrainz identifier. Currently about a million tracks are analyzed but more are added every day. The API for the webservice is straightforward:


The low-level features include beat positions and chroma information. For the hypothetical DJ-application this is the information that would be used.

If you find the services useful please consider contributing to MusicBrainz, AcoustID and AcousticBrainz.

A small Ruby script to automatically fetch features for audio can be downloaded here. It needs Ruby and a RubyGem to parse JSON. On Debian these can be installed with apt-get install ruby and gem install json. Once these dependencies are installed, the script can be run as follows:

ruby mbid_lookup.rb example.mp3 
Found 6 musicbrainz identifiers!
Not found in AcousticBrainz: 0afcd4a1-3709-499b-b76f-0d5491f839a5
Beat positions for 3d49fab8-fd08-42be-b0d2-9f1dc884d902: 0.522448956966,1.05650794506,1.57895684242,2.10140585899,2.61224484444,3.13469386101
Not found in AcousticBrainz: 448258f0-aa5a-4968-8efd-8c9348d5142e
Not found in AcousticBrainz: adcd7079-57d9-49bd-a36b-a20fa27b02b1
Beat positions for d1cd1321-0b66-4848-935e-f3afba6c7356: 0.441179126501,0.905578196049,1.369977355,1.83437633514,2.29877543449,2.76317453384
Not found in AcousticBrainz: e1f433be-af6b-4b5d-a969-4b53f014c395
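To give an idea of what the hypothetical DJ-application could do with these beat positions, here is a small Java sketch that turns a list of beat times into a tempo and into the playback-rate factor needed to match another track. The class and method names are made up for this illustration; the beat positions are the ones printed above.

```java
final class BeatTempo {
    /** Mean tempo in beats per minute from beat positions in seconds (at least two beats). */
    static double bpm(double[] beats) {
        if (beats.length < 2) {
            throw new IllegalArgumentException("need at least two beat positions");
        }
        double meanInterval = (beats[beats.length - 1] - beats[0]) / (beats.length - 1);
        return 60.0 / meanInterval;
    }

    /** Playback-rate factor that brings a track from its own tempo to a target tempo. */
    static double rateFactor(double sourceBpm, double targetBpm) {
        return targetBpm / sourceBpm;
    }

    public static void main(String[] args) {
        double[] first = {0.522448956966, 1.05650794506, 1.57895684242,
                          2.10140585899, 2.61224484444, 3.13469386101};
        double[] second = {0.441179126501, 0.905578196049, 1.369977355,
                           1.83437633514, 2.29877543449, 2.76317453384};
        double a = bpm(first);
        double b = bpm(second);
        System.out.printf("%.1f BPM and %.1f BPM, rate factor %.3f%n", a, b, rateFactor(a, b));
    }
}
```

For the two recordings above this gives roughly 115 and 129 beats per minute. Only six beats are used here, so the estimate is crude; a real application would use all beat positions of a track.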

~ SINGmaster Android App uses TarsosDSP

TarsosDSP is a real-time audio processing library written in Java. Since version 2.0 it is compatible with Android. Judging by the number of forks of the TarsosDSP GitHub repository, Android compatibility increased the popularity of the library. Now the first Android application which uses TarsosDSP has found its way to the Google Play store. Download and play with SINGmaster to see an application of the pitch tracking capabilities within TarsosDSP. The SINGmaster description:

“SING master is a smart phone app that helps you to learn how to sing. SING master presents a collection of practical exercises (on the most important building blocks of melodies). Colours and sounds guide you in the exercise. After recording, SING master gives visual feedback: you can see and hear your voice. This is important so that you can identify where your mistakes are.”

Another application in the Play Store that uses TarsosDSP is CuePitcher.

  • SINGMaster screenshot
  • SINGMaster in action

~ OSC in Matlab on Windows, Linux and Mac OS X using Java

This post explains how to receive OSC in a Matlab environment. It uses a platform independent Java library which should work on 64-bit and 32-bit versions of Windows, Unix and Mac OS X. Using Java makes installation relatively easy compared with other solutions.

The most used method to get OSC-messages in Matlab can be found here. This method uses a library called liblo which needs to be configured (compiled) correctly on your system. Especially on Windows this can be problematic. A brave soul documented his quest to get OSC working with Matlab on Windows here. Obviously not for the faint of heart.

An alternative way leverages the Matlab facilities to run Java. Since there is a Java OSC library available (JavaOSC on github) it is relatively easy to bridge the two. To make the connection, I have written some glue code and provide an easy to use Jar-library here. Using the bridge is done as follows:

How to make Matlab receive OSC-messages

  1. Download the JavaOSCtoMatlab Java library and store it in an easy to remember directory.
  2. Download the example Matlab OSC client Script and store it in the same directory. The client is included below as well.
  3. Start Matlab, modify the client script to fit your needs. You probably need to change the OSC method to listen to and the OSC port. Also make sure that the cd command points to the directory with the downloaded jar-file.
  4. Run the client script and receive your OSC messages.

Note that there are three ways to receive the payload of a message. The Java code returns it as either Object[], double[] or String[]. The last two are automatically understood by Matlab, so they are easier to work with. To get the message data, call osc_listener.getMessageArguments(), osc_listener.getMessageArgumentsAsDouble() or osc_listener.getMessageArgumentsAsString(), respectively.

I hope this is useful to some…


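% Check your Java version; 1.6+ should be ok
version -java
% Load the jar file. The directory and jar name below are placeholders:
% point them at the directory and the jar file you downloaded.
cd('/path/to/downloaded/jar');
javaaddpath('javaosctomatlab.jar');
% Import the needed Java packages
import com.illposed.osc.*;
import java.lang.String

% defines the OSC port to listen to
receiver = OSCPortIn(4000);
% defines the OSC method to listen to
osc_method = String('/ECG');
osc_listener = MatlabOSCListener();
% register the listener for the method and start receiving
receiver.addListener(osc_method, osc_listener);
receiver.startListening();

% infinite loop, receiving all non-empty messages
while true
    args = osc_listener.getMessageArgumentsAsDouble();
    if ~isempty(args)
        disp(args);
    end
end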

~ Measuring Audio Output Latency on Android Lollipop using an Arduino

This post explains how to measure audio output latency on Android devices. To measure audio latency, a USB-OTG cable and an Arduino are used. In the process it documents audio output latency on an LG Nexus 5 device running the most recent version of Android, which currently is Lollipop (5.0).

Audio latency is an important aspect of a system, especially if it is used for real-time sonification or for musical applications. Audio latency is the, preferably short, delay between audio entering a system and emerging from a system. Audio output latency is the time it takes between a signal (e.g. a button pressed) and when audio emerges. For sonification purposes audio output latency is more interesting than round-trip audio latency.

Android systems are often portable, generally available and relatively cheap. Android offers an attractive platform to develop sonifications or musical applications for. Unfortunately, audio latency on Android has not been a priority in the first versions. With Android 4.1 things started to change but due to hard- and software fragmentation it is still hard to find how much audio latency is expected. Even if the exact model (e.g. Nexus 5) and software version (stock Android 5.0) is known, exact numbers are, so it seems, nowhere to be found. For more information on the internal changes that make low latency audio on Android possible, watch the talk on High Performance Audio from the 2013 Google I/O conference. Also note the lack of exact latency numbers in that talk. It is a very enjoyable talk by two Google engineers going after the culprits of high latency in true Sherlock/dr. Watson style.

Since audio output latency is generally not documented and since it is an important factor to decide if Android is a viable platform for real-time sonification or musical applications it needs to be measured. One way of measuring audio output latency on Android is documented by the people of Google. Unfortunately, the approach is not easily reproducible since it needs a custom circuit board, an oscilloscope and there is no source code available. Below a reproducible way to measure audio output latency for Android is documented.

An Arduino, an Android device, a USB-OTG cable and a butchered mini-jack audio cable are needed together with the software provided here. Optionally, a data acquisition module can be used to visualize the signals. The measurement system works as follows:

  1. An Arduino sends a signal over USB. The time at which the signal is sent is stored for later use.
  2. An Android device, connected to the Arduino via a USB-OTG cable, receives the signal.
  3. The Android device responds as quickly as possible, with the lowest latency possible, by emitting a sound.
  4. The sound is captured on an analog input port of the Arduino, via the mini-jack cable. The time the sound appears on the Arduino is stored.
  5. By comparing the time when the signal was sent with the time when the sound arrived, the audio output latency is measured and reported.

The previous steps are repeated every second to gain insight into the variability of the measurements. To generate microsecond-accurate timing, interrupts are used on the Arduino. For visualisation, a digital pin is toggled every time the Arduino sends a signal. The Arduino sketch is attached to this post, as is the source code for the Android application. An already compiled APK is also available. With some luck – a recent Android version is needed, and your device should support USB-OTG – it might work on your device.
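The arithmetic behind the reported numbers is simple. The following Java sketch is not the Arduino sketch attached to this post; it only illustrates, with made-up timestamps, how paired send and arrival times in microseconds turn into an average latency and a spread in milliseconds.

```java
final class LatencyStats {
    /** Latency per measurement in milliseconds, from paired timestamps in microseconds. */
    static double[] latenciesMs(long[] sentMicros, long[] heardMicros) {
        if (sentMicros.length != heardMicros.length) {
            throw new IllegalArgumentException("every sent signal needs a matching arrival time");
        }
        double[] latencies = new double[sentMicros.length];
        for (int i = 0; i < sentMicros.length; i++) {
            latencies[i] = (heardMicros[i] - sentMicros[i]) / 1000.0;
        }
        return latencies;
    }

    static double mean(double[] values) {
        double sum = 0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.length;
    }

    /** Difference between the largest and the smallest value, a crude measure of variability. */
    static double range(double[] values) {
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;
        for (double v : values) {
            min = Math.min(min, v);
            max = Math.max(max, v);
        }
        return max - min;
    }

    public static void main(String[] args) {
        // Hypothetical timestamps: one signal per second, the sound arrives 45 to 50 ms later.
        long[] sent = {0L, 1000000L, 2000000L};
        long[] heard = {45000L, 1050000L, 2049000L};
        double[] latencies = latenciesMs(sent, heard);
        System.out.println("mean " + mean(latencies) + " ms, range " + range(latencies) + " ms");
    }
}
```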


Using the OpenSL ES native interface on a Nexus 5 with Lollipop installed the USB input to audio output latency is on average about 48 milliseconds. There is some variability but it is usually within 15 milliseconds. For music applications this latency is not great but, depending on the application, acceptable. For expert drummers latency should be in the range of 20ms but for many sonification tasks, 50ms suffices. It is clear that Android will never be able to compete with purpose built hardware running a real time operating system like Axoloti (Audio roundtrip latency 2ms, usb-audio 1.6ms) but for a general purpose device the measured latency is significantly better than what I expected (around 100ms).

The non-native audio interface is a lot slower. I have measured an average latency of about 85ms and a much larger variability (25ms).

With this post I hope others will report the latency for their devices as well, so that buyers who are interested in a low-latency Android device can make an informed decision.

  • Result on Android.
  • Onsets and audio visualized using a DAQ and a Java program.
  • Arduino and the DAQ
  • The latency visualized.
  • The DAQ used.
  • Arduino wiring.

~ Axoloti: a digital audio platform for makers

Currently, there is a crowd-funding campaign ongoing for Axoloti. Axoloti is a very cool project by Johannes Taelman. It is a stand-alone audio processing unit that can be used as a synthesizer, groovebox, guitar effect pedal, as a part of a sound installation, or for about any other audio application you can think of.

Axoloti is controlled by a patcher environment and once it is programmed it operates as a stand-alone unit. For more information, visit the Axoloti website, watch the video below and fund Axoloti.

Update: Good news everyone! Axoloti has been funded!

  • Axoloti Logo
  • Axoloti Board
  • Axoloti Patch
  • Axoloti Party!

~ TarsosLSH in a Photomosaic Web App

TarsosLSH is a Java library implementing Locality-sensitive Hashing (LSH), a practical nearest neighbor search algorithm for high dimensional vectors that operates in sublinear time. The open source software package is authored by me and is available on GitHub: TarsosLSH on GitHub.

With TarsosLSH, Joseph Hwang and Nicholas Kwon from Rice University created an Image Mosaic web application. The application chops an uploaded photo into small blocks. For each block, a color histogram is created and compared with an index of color histograms of reference images. Subsequently each block is replaced with one of the top three nearest neighbors, creating a mosaic. Since high dimensional nearest neighbor search is needed, this is an ideal application for TarsosLSH. The application somewhat proves that TarsosLSH can be used in practical applications, which is comforting.
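The idea behind the mosaic can be sketched in a few lines of Java. This is not the code of the web application and it does not use the TarsosLSH API: a brute-force linear scan stands in for the LSH index, and the histogram layout (a few bins per colour channel) is an assumption made for this illustration.

```java
final class MosaicSketch {
    /** Normalised colour histogram with `bins` bins per channel, from packed 0xRRGGBB pixels. */
    static double[] histogram(int[] rgbPixels, int bins) {
        if (rgbPixels.length == 0) {
            throw new IllegalArgumentException("a block needs at least one pixel");
        }
        double[] h = new double[3 * bins];
        for (int p : rgbPixels) {
            int r = (p >> 16) & 0xFF;
            int g = (p >> 8) & 0xFF;
            int b = p & 0xFF;
            h[r * bins / 256] += 1;
            h[bins + g * bins / 256] += 1;
            h[2 * bins + b * bins / 256] += 1;
        }
        for (int i = 0; i < h.length; i++) {
            h[i] /= rgbPixels.length;
        }
        return h;
    }

    /** Euclidean distance between two histograms of equal length. */
    static double distance(double[] a, double[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            sum += (a[i] - b[i]) * (a[i] - b[i]);
        }
        return Math.sqrt(sum);
    }

    /** Index of the reference histogram closest to the query (linear scan, not LSH). */
    static int nearest(double[] query, double[][] references) {
        int best = -1;
        double bestDistance = Double.POSITIVE_INFINITY;
        for (int i = 0; i < references.length; i++) {
            double d = distance(query, references[i]);
            if (d < bestDistance) {
                bestDistance = d;
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        double[][] references = {
            histogram(new int[]{0x0000FF}, 4), // a blue reference image
            histogram(new int[]{0xF00000}, 4), // a red reference image
            histogram(new int[]{0x00FF00}, 4)  // a green reference image
        };
        double[] block = histogram(new int[]{0xFF0000, 0xFE0000}, 4);
        System.out.println("best match: reference " + nearest(block, references));
    }
}
```

With many reference images the linear scan becomes the bottleneck, which is where an approximate nearest neighbor index such as LSH comes in.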

  • The Starry Night, by Van Gogh, as a mosaic created by the mosaic web application.
  • The Starry Night, by Van Gogh - Original

~ Using the Advantech USB-4716 Data Acquisition Module on a Linux System

Below are some notes on installing and using the drivers for the Advantech USB-4716 on Linux. Since I was unable to find these instructions elsewhere and it took me some time to figure things out, it is perhaps of use to someone else. A similar approach should work for the following devices as well: pci1715, pci1724, pci1734, pci1752, pci1758, pcigpdc, usb4711a, usb4750, pci1711, pci1716, pci1727, pci1747, pci1753_mic3753_pcm3753i, pci1761_pcm3761i, pcm3810i, usb4716, usb4761, pci1714_pcie1744, pci1721, pci1730_pcm3730i, pci1750, pci1756, pci1762, usb4702_usb4704, usb4718

Download the Linux driver for the Advantech USB-4716 DAQ. If you are on a system that can install either deb or rpm packages, use the driver_package. Unzip the package. The driver is split into two parts: a base driver, biokernbase, and a driver specific to the USB-4716 device, bio4716. The drivers are Linux kernel modules that need to be installed. The order is important: first install the base driver, then the device-specific deb kernel module. After a reboot, or perhaps immediately, this should be the result of executing lsmod | grep bio:

bio4716              23724  0 
biokernbase       17983  1 bio4716
usbcore              128741  9 ehci_hcd,uhci_hcd,usbhid,usb_storage,snd_usbmidi_lib,snd_usb_audio,biokernbase,bio4716

A library to interface with the hardware is provided as a deb package as well. Install this library on your system.

Next, download the examples for the Advantech USB-4716 DAQ. With the kernel modules installed, the system is ready to test the examples in the provided examples directory. If you are using the Java code, make sure to set the java.library.path correctly.

  • Signals acquired using the DAQ

~ Audio Fingerprinting - Opportunities for digital musicology

On the 27th of November, 2014 a lecture on audio fingerprinting and its applications for digital musicology will be given at IPEM. The lecture introduces audio fingerprinting, explains an audio fingerprinting technique and then goes on to explain how such an algorithm offers opportunities for large scale digital musicological applications. Here you can download the slides about audio fingerprinting and its opportunities for digital musicology.

With the explained audio fingerprinting technique a specific form of very reliable musical structure analysis can be done. Below, in the figure section, an example of repetitive structure in the song Ribs Out is shown. Another example is comparing edits or versions of songs. Below, also in the figure section, the radio edit of Daft Punk’s Get Lucky is compared with the original version. Audio synchronization using fingerprinting is another application that is actively used in the field of digital musicology to align audio with extracted features.

Since acoustic fingerprinting makes structure analysis very efficient, it can be applied on a large scale (20k songs). The figure below shows that identical repetition is something that has been used more and more since the mid 1970s. The trend probably aligns with the amount of technical knowledge needed to ‘copy and paste’ a snippet of music.

Fig: How much identical repetition is used in music, over the years.

The Panako audio fingerprinting system was used to generate data for these case studies. The lecture and this post are partly inspired by a blog post by Paul Brossier.

  • Spectral peak Acoustic fingerprinting system
  • Structure in Ribs Out
  • Radio edit vs. original of Daft Punk's Get Lucky
  • How much identical repetition is used in a set of 20k songs.

~ ISMIR 2014 - Panako - A Scalable Acoustic Fingerprinting System Handling Time-Scale and Pitch Modification

At ISMIR 2014 I will present a paper on a fingerprinting system. ISMIR, the annual conference of the International Society for Music Information Retrieval, is the world’s leading interdisciplinary forum on accessing, analyzing, and organizing digital music of all sorts. This year’s instalment takes place in Taipei, Taiwan. My contribution is a paper titled Panako – A Scalable Acoustic Fingerprinting System Handling Time-Scale and Pitch Modification; it will be presented during a poster session on the 27th of October.

This paper presents a scalable granular acoustic fingerprinting system. An acoustic fingerprinting system uses condensed representations of audio signals, acoustic fingerprints, to identify short audio fragments in large audio databases. A robust fingerprinting system generates similar fingerprints for perceptually similar audio signals. The system presented here is designed to handle time-scale and pitch modifications. The open source implementation of the system is called Panako and is evaluated on commodity hardware using a freely available reference database with fingerprints of over 30,000 songs. The results show that the system responds quickly and reliably to queries, while handling time-scale and pitch modifications of up to ten percent.

The system is also shown to handle GSM-compression, several audio effects and band-pass filtering. After a query, the system returns the start time in the reference audio and how much the query has been pitch-shifted or time-stretched with respect to the reference audio. The design of the system that offers this combination of features is the main contribution of this paper.

The system is available, together with documentation and information on how to reproduce the results from the ISMIR paper, on the Panako website. Also available for download are the Panako ISMIR paper and the Panako poster.

  • General fingerprinter
  • Fingerprint and modifications
  • Results after pitch shifting
  • Results after time scale modification
  • Results after time stretching

~ TarsosDSP PureData or MAX MSP external

It makes sense to connect TarsosDSP, a real-time audio processing library written in Java, with patcher environments such as Pure Data and Max/MSP. Both Pure Data and Max/MSP offer the capability to code objects, or externals, using Java. In Pure Data this is done using the pdj~ object, which should be compatible with the Max/MSP implementation. This post demonstrates a patch that connects an oscillator with a pitch tracking algorithm implemented in TarsosDSP.

To the left you can see the finished patch. When it is working, an audio stream is generated using an oscillator. The frequency of the oscillator can be controlled. Subsequently the stream is sent to the Java environment with the pdj bridge. The Java environment receives an array of floats, representing the audio. A pitch estimation algorithm tries to find the pitch of the audio represented by the buffer. The detected pitch is returned to the pd environment by means of an outlet. In pd, the detected pitch is shown and used for auditory feedback.

PitchDetectionResult result = yin.getPitch(audioBuffer);
pitch = result.getPitch();
outlet(0, Atom.newAtom(pitch));

Please note that the pitch detection algorithm can handle any audio stream, not only pure sines. The example here demonstrates the most straightforward case. Using this method all algorithms implemented in TarsosDSP can be used in Pure Data. These range from onset detection to filtering, from audio effects to wavelet compression. For a list of features, please see the TarsosDSP github page. Here, the source for this patch implementing pitch tracking in pd can be downloaded. To run it, extract it to a directory and simply run the pitch.pd patch. Pure Data should load pdj~ automatically together with the classes present in the classes directory.

~ TarsosDSP on Android - Audio Processing in Java on Android

This post explains how to get TarsosDSP running on Android. TarsosDSP is a Java library for audio processing. Its aim is to provide an easy-to-use interface to practical music processing algorithms implemented, as simply as possible, in pure Java and without any other external dependencies.

Since version 2.0 there are no more references to javax.sound.* in the TarsosDSP core codebase. This makes it easy to run TarsosDSP on Android. Audio Input/Output operations that depend on either the JVM or Dalvik runtime have been abstracted and removed from the core. For each runtime target a Jar file is provided in the TarsosDSP release directory.

The source code for the audio I/O on the JVM and the audio I/O on Android can be found on GitHub. To get an audio processing algorithm working on Android the only thing that is needed is to place TarsosDSP-Android-2.0.jar in the lib directory of your project.

The following example connects an AudioDispatcher to the microphone of an Android device. Subsequently, a real-time pitch detection algorithm is added to the processing chain. The detected pitch in Hertz is printed on a TextView element; if no pitch is present in the incoming sound, -1 is printed. To test the application, download and install the TarsosDSPAndroid.apk application on your Android device. The source code is available as well.

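AudioDispatcher dispatcher = AudioDispatcherFactory.fromDefaultMicrophone(22050,1024,0);

PitchDetectionHandler pdh = new PitchDetectionHandler() {
        @Override
        public void handlePitch(PitchDetectionResult result,AudioEvent e) {
                final float pitchInHz = result.getPitch();
                // UI updates have to happen on the UI thread
                runOnUiThread(new Runnable() {
                    @Override
                    public void run() {
                        // R.id.pitch_text is a placeholder; use the id of the TextView in your layout
                        TextView text = (TextView) findViewById(R.id.pitch_text);
                        text.setText("" + pitchInHz);
                    }
                });
        }
};
AudioProcessor p = new PitchProcessor(PitchEstimationAlgorithm.FFT_YIN, 22050, 1024, pdh);
// register the pitch processor with the dispatcher before starting it
dispatcher.addAudioProcessor(p);
new Thread(dispatcher,"Audio Dispatcher").start();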

Thanks to these changes, the fork of TarsosDSP kindly provided by GitHub user srubin, created for a programming assignment at UC Berkeley, is not needed any more.

Have fun hacking audio on Android!

~ Haar Wavelet Transform in TarsosDSP

The TarsosDSP Java library for audio processing now contains an implementation of the Haar Wavelet Transform, a discrete wavelet transform based on the Haar wavelet (depicted at the right). This reversible transform has some interesting properties and is practical for signal compression and for analyzing sudden transitions in a file. It can, for example, be used to detect edges in an image.

As an example use case of the Haar transform, a simple lossy audio compression algorithm is implemented in TarsosDSP. It compresses audio by dividing it into blocks of 32 samples, transforming them using the Haar Wavelet Transform and subsequently removing the samples with the least difference between them. The last step is to reverse the transform and play the audio. The number of compressed samples can be chosen between 0 (no compression) and 31 (no signal left). This crude lossy audio compression technique can save at least a tenth of the samples without any noticeable effect. A way to store the audio and read it from disk is included as well.
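To make the idea concrete, here is a minimal Java sketch of this kind of compression. It is not the TarsosDSP implementation: the averaging and differencing form of the Haar transform and the rule of zeroing the detail coefficients with the smallest absolute value are assumptions chosen for this illustration, and the class name is made up.

```java
import java.util.Arrays;
import java.util.Comparator;

final class HaarCompressionSketch {
    /** Forward Haar transform (pairwise averages and differences); length must be a power of two. */
    static float[] forward(float[] block) {
        if (Integer.bitCount(block.length) != 1) {
            throw new IllegalArgumentException("block length must be a power of two");
        }
        float[] out = block.clone();
        float[] tmp = new float[out.length];
        for (int len = out.length; len > 1; len /= 2) {
            int half = len / 2;
            for (int i = 0; i < half; i++) {
                tmp[i] = (out[2 * i] + out[2 * i + 1]) / 2f;        // average
                tmp[half + i] = (out[2 * i] - out[2 * i + 1]) / 2f; // difference
            }
            System.arraycopy(tmp, 0, out, 0, len);
        }
        return out;
    }

    /** Inverse of forward(): rebuilds the samples from averages and differences. */
    static float[] inverse(float[] coefficients) {
        float[] out = coefficients.clone();
        float[] tmp = new float[out.length];
        for (int len = 2; len <= out.length; len *= 2) {
            int half = len / 2;
            for (int i = 0; i < half; i++) {
                tmp[2 * i] = out[i] + out[half + i];
                tmp[2 * i + 1] = out[i] - out[half + i];
            }
            System.arraycopy(tmp, 0, out, 0, len);
        }
        return out;
    }

    /** Zeroes the `removed` smallest difference coefficients and transforms back. */
    static float[] compress(float[] block, int removed) {
        final float[] c = forward(block);
        Integer[] order = new Integer[c.length - 1];
        for (int i = 0; i < order.length; i++) {
            order[i] = i + 1; // index 0 holds the overall average and is always kept
        }
        Arrays.sort(order, Comparator.comparingDouble((Integer i) -> Math.abs(c[i])));
        for (int k = 0; k < removed && k < order.length; k++) {
            c[order[k]] = 0f;
        }
        return inverse(c);
    }

    public static void main(String[] args) {
        float[] block = {4f, 6f, 10f, 12f, 8f, 6f, 5f, 5f};
        System.out.println(Arrays.toString(compress(block, 3)));
    }
}
```

In this sketch, removing every difference coefficient of a block leaves only its average, so the block becomes a constant.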

The algorithm works in real time and an example application has been implemented which operates on an mp3 stream. To make this work immediately, the avconv tool needs to be on your system’s path. Also implemented is a bit depth compressor, which shows the effect of (extreme) bit depth compression.

The example is available at the TarsosDSP release directory, the code can be found on the TarsosDSP github page.

  • Haar Wavelet Audio Compression

~ TarsosDSP Spectral Peak extraction

The TarsosDSP Java library for audio processing now contains a module for spectral peak extraction. It calculates a short time Fourier transform and subsequently finds the frequency bins with the most energy present using a median filter. The frequency estimation for each identified bin is significantly improved by taking phase information into account. This method is described in “Sethares et al. 2009 – Spectral Tools for Dynamic Tonality and Audio Morphing”.

The noise floor, determined by the median filter, the spectral information itself and the estimated peak locations are returned for each FFT-frame. Below a visualization of a flute can be found. As expected, the peaks are harmonically spread over the complete spectrum up until the Nyquist frequency.
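The peak picking step can be sketched as follows. This is an illustration, not the TarsosDSP module: it assumes the magnitudes of one FFT frame are already available, takes the running median as the noise floor, and calls a bin a peak when it is a local maximum that exceeds the noise floor by a chosen factor. The phase-based refinement of the frequency estimate is left out, and the class and parameter names are made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class SpectralPeakSketch {
    /** Running median of the magnitudes, over a window of about `length` bins around each bin. */
    static float[] noiseFloor(float[] magnitudes, int length) {
        float[] floor = new float[magnitudes.length];
        for (int i = 0; i < magnitudes.length; i++) {
            int from = Math.max(0, i - length / 2);
            int to = Math.min(magnitudes.length, i + length / 2 + 1);
            float[] window = Arrays.copyOfRange(magnitudes, from, to);
            Arrays.sort(window);
            floor[i] = window[window.length / 2];
        }
        return floor;
    }

    /** Bins that are local maxima and rise above the noise floor by at least `factor`. */
    static List<Integer> peaks(float[] magnitudes, int medianLength, float factor) {
        float[] floor = noiseFloor(magnitudes, medianLength);
        List<Integer> peaks = new ArrayList<>();
        for (int i = 1; i < magnitudes.length - 1; i++) {
            boolean localMax = magnitudes[i] > magnitudes[i - 1] && magnitudes[i] >= magnitudes[i + 1];
            if (localMax && magnitudes[i] > floor[i] * factor) {
                peaks.add(i);
            }
        }
        return peaks;
    }

    public static void main(String[] args) {
        // A flat synthetic spectrum with two bins that stick out.
        float[] magnitudes = new float[16];
        Arrays.fill(magnitudes, 1f);
        magnitudes[4] = 10f;
        magnitudes[10] = 8f;
        System.out.println(peaks(magnitudes, 5, 2f));
    }
}
```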

  • Spectral peaks of a flute. The first 10 harmonics are detected up until the Nyquist frequency.

~ International School on Systematic Musicology and Sound and Music Computing (ISSSM) 2014, Genova

From 9 to 20 March 2014 I was a student at the International School on Systematic Musicology and Sound and Music Computing. The aim of the course was to:

Give students an intensive course in the most advanced and current topics in the research fields of systematic musicology and sound and music computing. Give students the opportunity to discuss their research proposals/project with an international staff of teachers representing a variety of expertise in different domains of systematic musicology and sound and music computing. Teach students the most recent knowledge and basic skills needed to start a PhD. Give students the opportunity to join the research communities on systematic musicology, on sound and music computing.

Next to the lectures, the informal meetings with the professors were very interesting. I got to add some things to my ‘to read’ list:

  • Rolf Bader, Calculation of Helmholtz frequency of a Renaissance vihuela string instrument with five tone hole
  • Schneider, A. & Frieler, K. (2009) Perception of harmonic and inharmonic sounds: Results from ear models. In S. Ystad, R. Kronland-Martinet & K. Jensen (Eds.), Computer music modeling and retrieval. Genesis of meaning in sound and music (pp. 18–44). Berlin: Springer.
  • Rolf Bader, Sound – Perception – Performance


~ TarsosDSP Paper and Presentation at AES 53rd International conference on Semantic Audio

TarsosDSP will be presented at the AES 53rd International conference on Semantic Audio in London. During the conference both a presentation and a demonstration will be given of the paper TarsosDSP, a Real-Time Audio Processing Framework in Java, by Joren Six, Olmo Cornelis and Marc Leman, in Proceedings of the 53rd AES Conference (AES 53rd), 2014. From their website:

Semantic Audio is concerned with content-based management of digital audio recordings. The rapid evolution of digital audio technologies, e.g. audio data compression and streaming, the availability of large audio libraries online and offline, and recent developments in content-based audio retrieval have significantly changed the way digital audio is created, processed, and consumed. New audio content can be produced at lower cost, while also large audio archives at libraries or record labels are opening to the public. Thus the sheer amount of available audio data grows more and more each day. Semantic analysis of audio resulting in high-level metadata descriptors such as musical chords and tempo, or the identification of speakers facilitate content-based management of audio recordings. Aside from audio retrieval and recommendation technologies, the semantics of audio signals are also becoming increasingly important, for instance, in object-based audio coding, as well as intelligent audio editing, and processing. Recent product releases already demonstrate this to a great extent, however, more innovative functionalities relying on semantic audio analysis and management are imminent. These functionalities may utilise, for instance, (informed) audio source separation, speaker segmentation and identification, structural music segmentation, or social and Semantic Web technologies, including ontologies and linked open data.

This conference will give a broad overview of the state of the art and address many of the new scientific disciplines involved in this still-emerging field. Our purpose is to continue fostering this line of interdisciplinary research. This is reflected by the wide variety of invited speakers presenting at the conference.

The paper presents TarsosDSP, a framework for real-time audio analysis and processing. Most libraries and frameworks offer either audio analysis and feature extraction or audio synthesis and processing. TarsosDSP is one of only a few frameworks that offer analysis, processing and feature extraction in real-time, a unique feature in the Java ecosystem. The framework contains practical audio processing algorithms, it can be extended easily, and has no external dependencies. Each algorithm is implemented as simply as possible thanks to a straightforward processing pipeline. TarsosDSP’s features include a resampling algorithm, onset detectors, a number of pitch estimation algorithms, a time stretch algorithm, a pitch shifting algorithm, and an algorithm to calculate the Constant-Q. The framework also allows simple audio synthesis, some audio effects, and several filters. The Open Source framework is a valuable contribution to the MIR-Community and an ideal fit for interactive MIR-applications on Android. The full paper can be downloaded: TarsosDSP, a Real-Time Audio Processing Framework in Java

A BibTeX entry for the paper can be found below.

@inproceedings{six2014tarsosdsp,
  author      = {Joren Six and Olmo Cornelis and Marc Leman},
  title       = {{TarsosDSP, a Real-Time Audio Processing Framework in Java}},
  booktitle   = {{Proceedings of the 53rd AES Conference (AES 53rd)}},
  year        =  2014
}

  • AES53
  • Constant-Q
  • Flanger
  • Pitch Shifting
  • Sampling

~ Doctoral defense Olmo Cornelis - Exploring the Symbiosis of Western and non-Western Music

On Wednesday 18 December 2013 Olmo Cornelis organised a concert as part of his doctorate. His defense followed the day after. Congratulations once more, Olmo, on the beautiful, ehm, ‘mbirapunt’. Below is a brief explanation of the project and the concert.

In his research project ‘Exploring the symbiosis of Western and non-Western Music’, Olmo Cornelis focused on the description of Central African music. This music was explored with computational techniques that approach sound as a signal. The information obtained influenced his artistic oeuvre, in which a mix of implicit and explicit ethnic influences is always at play.

To mark the completion of this doctoral research, the HERMESensemble, the Nadar Ensemble, Maja Jantar and Françoise Vanhecke perform, on 18 December, work by Olmo Cornelis that was written during this project. The research project Exploring the symbiosis of Western and non-Western Music was initiated in 2008 at the Conservatory / School of Arts of HoGent and was funded by the research fund of University College Ghent (Hogeschool Gent).

Image: Noel Cornelis, Reality of Possibilities, 2012

~ IPEM D-Jogger featured on RTBF

Yesterday, December 4th 2013, the RTBF was at IPEM to do a small feature on research currently going on at the institute. The RTBF is the public broadcasting organization of the French Community of Belgium, the southern, French-speaking part of Belgium. The clip shows the D-Jogger in action, with me using it. The fragment is available on the RTBF website and is embedded below.

~ Development and Application of MIR Techniques on Ethnic Music


The aim of this research project is to gain novel musicological insights into a large dataset of music from Central Africa. While practising ethnomusicological research on this dataset, we aim to develop and publish useful software and methodologies for the (ethno)musicological research community.

From November 2009 until November 2013 this research project was organised at the School of Arts, University College Ghent, under supervision by Olmo Cornelis. Later, from November 2013 onwards, the project turned into a 2 year doctoral research project hosted at IPEM, University Ghent under the supervision of Marc Leman.


Royal Museum For Central Africa University Ghent  Institute for Psychoacoustics and Electronic Music University College Ghent, Hogeschool Gent School of Arts, Ghent