Welcome

Hi, I'm Joren. Welcome to my website. I'm a researcher in the field of Music Informatics, Music Information Retrieval, and Computational Ethnomusicology. Here you can find a record of my research and other projects I have been working on. Learn more »

Contact

Joren Six
joren.six@ugent.be
University Ghent, IPEM

~ Decode MP3s and other Audio formats the easy way on Android

This post describes how to decode MP3’s using an already compiled ffmpeg binary on android. Using ffmpeg to decode audio on Android has advantages:

  • It supports about every audio format known to man. Three channel flac, vorbis with 32 bit samples, … do not pose a problem.
  • Extracting video container formats is supported. Accessing the first audio stream from mkv, avi, mov,… just works.
  • Decoding audio frames is more efficient using native code than often buggy Java decoders.
  • Resampling and downmixing is supported. If you want to resample incoming audio to e.g. 44.1kHz and only want single channel audio this is easily achievable.

The main disadvantage is that you need an ffmpeg build for your Android device. Luckily some poor soul already managed to compile ffmeg for Android for several architectures. The precompiled ffmpeg binaries for Android are available for download and are mirrored here as well.

To bridge the ffmpeg binary and the java world TarsosDSP contains some glue code. The AndroidFFMPEGLocator is responsible to find and extract the correct binary for your Android device. It expects these ffmpeg binaries in the assets folder of your Android application. When the correct ffmpeg binary has been extracted and made executable the PipeDecoder is able to call it. The PipeDecoder calls ffmpeg so that decoded, downmixed and resampled PCM samples are streamed into the Java application via a pipe, which explains its name.

With the TarsosDSP Android library the following code plays an MP3 from external storage:

1
2
3
4
5
6
7
8
9
10
11
12
new AndroidFFMPEGLocator(this);
new Thread(new Runnable() {
  @Override
  public void run() {
    File externalStorage = Environment.getExternalStorageDirectory();
    File mp3 = new File(externalStorage.getAbsolutePath() , "/audio.mp3");
    AudioDispatcher adp;
    adp = AudioDispatcherFactory.fromPipe(mp3.getAbsolutePath(),44100,5000,2500);
    adp.addAudioProcessor(new AndroidAudioPlayer(adp.getFormat(),5000, AudioManager.STREAM_MUSIC));
    adp.run();
  }
}).start();

This code just works if the application has the READ_EXTERNAL_STORAGE permission, includes a recent TarsosDSP-Android.jar, is ran on one of the supported ffmpeg architectures and has these binaries available in the assets folder.


~ TarsosDSP featured in EFY Plus Magazine

EFY Plus July 2015 CoverTarsosDSP, the is a real-time audio processing library written in Java, is featured in EFY Plus Magazine of July 2015. It is a leading electronics magazine with a history going back more than 40 years and about 300 000 subscribers mainly in India. The index mentions this:

TarsosDSP: A Real-Time Audio Analysis and Processing Framework
In last month’s EFY Plus, we discussed Essentia, a C++ library for audio analysis. In this issue we will discuss a Java based real-time audio analysis and processing framework known as TarsosDSP

To read the full article, buy a (digital) copy of the magazine.


~ TeensyDAQ - Capture, Visualize and Record Analog Input Signals from Teensy

This post describes a tool to quickly visualize and record analog signals with a Teensy micro-controller and some custom software. It is mainly useful to quickly get an idea of how an analog sensor reacts to different stimuli. Since it is also able to capture and store analog input siginals it is also useful to generate test data recordings which then can be used for example to test a peak detection algorithm on. The tool is called TeensyDAQ hinting at the Data AcQuisition features and the micro-controller used.

Some of the features of the TeensyDAQ:

  • Visualize up to five analog signals simultaneously in real-time.
  • Capture analog input signals with sampling rates up to 8000Hz.
  • Record analog input to a CSV-file and, using drag-and-drop, previously recorded CSV-files can be visualized.
  • Works on Linux, Mac OS X and Windows.
  • While a capture session is in progress you can going back in time and zoom, pan and drag to get a detailed view on your data.
  • Allows you to listen to your input signal, this is especially practical with analog microphone input.

The system consists of two parts. A hardware and a software part. The hardware is a Teensy micro-controller running an Arduino sketch that ready analog input A0 to A4 at the requested sampling rate. A Teensy is used instead of a regular Arduino for two reasons. First the Teensy is capable of much higher data throughput, it is able to send five reading at 8000Hz, which is impossible on Arduino. The second reason is the 13bit analog read resolution. Classic Arduino only provides 10 bits.

The software part reads data from the serial port the Teensy is attached to. It interprets the data and stores it in an efficient data-structure. As quickly as possible the data is visualized. The software is written in Java. A recent Java runtime environment is needed to execute it.

Try out the latest version of TeensyDAQ or check out the source code on the github TeensyDAQ source repository.

  • The interface for live visualization.

    The interface for live visualization.

  • The hardware: a Teensy and a simple light sensor.

    The hardware: a Teensy and a simple light sensor.

  • The ports used by TeensyDAQ marked in green. Mainly A0 to A4.

    The ports used by TeensyDAQ marked in green. Mainly A0 to A4.

  • The interface allows going back in time and zooming, panning, dragging.

    The interface allows going back in time and zooming, panning, dragging.


~ Notifications from an RFduino over Bluetooth LE (4.0) on a Linux machine

This post describes how to get notifications from a Bluetooth LE or Bluetooth v4.0 device on a Linux machine. Since it took me a while to get it going it is perhaps of interest to others.

The hardware I used is an RFduino board and a Belikin mini Bluethooth v4.0 adapter. The RFduino was programmed to wait for an event with RFduino_pinWake(pni, HIGH). When the pin is HIGH a count is incremented and this number is send to any device that is listening. In my case a Linux machine. The code is essentially the same as the button example included in the RDduino software distribution.

To install the Bluetooth stack on Debian the following command is executed sudo apt-get install bluetooth bluez bluez-utils bluez-firmware. A blog post describes more about the Bluetooth tools. Some other interesting reads are Get started with Bluetooth Low Energy and this stackoverflow question. Once the stack is installed correctly the lescan utility should give an output like this:

1
2
3
4
$ sudo hcitool lescan
LE Scan ...
DC:87:CC:18:14:A5 RFduino
DC:87:CC:18:14:A5 (unknown)

Bluetooth LE works with the Generic Attribute Profile (GATT). A Bluetooth LE device can provide services by combining characteristics. These characteristics are the way to communicate with the device. Some characteristics are writable and are able to send notifications. To receive notifications one such characteristic (referred to with a hex handle) needs to be written. Write 0100 to get notifications, 0200 for indications (indications are notifications that are acknowledged), 0300 for both, or 0000 for nothing (default). With this in mind, the following command enables listening for notifications:

1
gatttool --device=DC:87:CC:18:14:A5  --char-write-req --handle=0x000f --value=0300 --listen

With those commands working, the process can be automated with a Ruby script to get Bluetooth LE notifications. The script essentially calls gatttool with the correct parameters and parses and reacts to its output. To make it work lescan needs to be called before starting the script:

1
2
3
4
5
6
7
8
9
10
11
$ sudo hcitool lescan && ruby bluetooth_notifications.rb 
LE Scan ...
DC:87:CC:18:14:A5 RFduino
DC:87:CC:18:14:A5 (unknown)
Characteristic value was written successfully
Notification handle = 0x000e value: 41 decimal value: 65
Notification handle = 0x000e value: 42 decimal value: 66
Notification handle = 0x000e value: 43 decimal value: 67
Notification handle = 0x000e value: 44 decimal value: 68
Notification handle = 0x000e value: 45 decimal value: 69
Notification handle = 0x000e value: 46 decimal value: 70

~ Access Features for Music Using AcoustID, Musicbrainz and AcousticBrainz

MusicBrainz logoThis post describes how to connect music in your library with precomputed features. Say, for example, you are developing a DJ application and you want to facilitate mixing tracks. To provide a seamless mix you perhaps want information about beats and about the key the music in your library is in. Since vast databases of features are already available you probably want to access those, instead of using your own feature extractors and database. The problems that need to be addressed are:

  1. Automatically identify the music in your library without relying on incomplete meta-data (tag information).
  2. Connect the music with a data-base of meta-data. Preferably a large and well curated database.
  3. Fetch pre-computed features for the music. The features should be extracted using algorithms that are currently state of the art or at least perform well. The features and the audio itself should be synchronized, otherwise beat information, for example, is not of much use.

To help with these task there are several open source tools and services available.

To identify music a condensed representation of musical audio is created. This process is known as acoustic fingerprinting. On the website AcoustID a tool is available to create such fingerprint. The library is called Chromaprint and the command line client is called fpcalc. Currently the latest version is Chromaprint version 1.2 and static binaries for fpcalc are available on the AcoustID website. A packages for Debian (and probably Ubuntu) can be installed by calling apt-get install libchromaprint-tools. Once this tool is correctly installed a fingerprint for a piece of music can be created:

1
2
3
4
5
fpcalc music.mp3

FILE=music.mp3
DURATION=168
FINGERPRINT=AQADtEmi..hADAAOCGAQghZRgQByjAEAICSMWYME

A fingerprint by itself is not of much use. The AcoustID webservice translates a fingerprint into one or more MusicBrainz identifiers. One fingerprint can result in multiple identifiers because the same audio can be released on several albums. There is documentation for AcoustID webservice available. To use the webservice an API key is needed. Confusingly, the AcoustID service has two types of API keys. One for end-users and one for developers. The last type is needed to translate ID’s. To request a developer API key, log in on the AcoustID website and “add an application”, there you can find the correct API key. Substitute dev_api_key in the following URL. Also change the fingerprint and duration to match the information provided by the fpcalc application. The webservice should reply with a set of MusicBrainz identifiers:

http://api.acoustid.org/v2/lookup?client=dev_api_key&duration=x&fingerprint=ADORIF...LKJE6&meta=recordingids

AcousticBrainz provides features for a subset of music that has a MusicBrainz identifier. Currently about a million tracks are analyzed but more are added every day. The API for the webservice is straightforward:

GET http://acousticbrainz.org/96685213-a25c-4678-9a13-abd9ec81cf35/low-level
GET http://acousticbrainz.org/96685213-a25c-4678-9a13-abd9ec81cf35/high-level

The low-level features include beat positions and chroma information. For the hypothetical DJ-application this is the information that would be used.

If you find the services useful please consider contributing to MusicBrainz, AcoustID and AcousticBrainz.

A small Ruby script to automatically fetch features for audio can be downloaded here. It needs Ruby and a RubyGems to parse JSON. On Debian this can be installed with apt-get install ruby and rubygems install json. Once these dependencies are installed the script can be ran as follows:

1
2
3
4
5
6
7
8
ruby mbid_lookup.rb example.mp3 
Found 6 musicbrainz identifiers!
Not found in AcousticBrainz: 0afcd4a1-3709-499b-b76f-0d5491f839a5
Beat positions for 3d49fab8-fd08-42be-b0d2-9f1dc884d902: 0.522448956966,1.05650794506,1.57895684242,2.10140585899,2.61224484444,3.13469386101
Not found in AcousticBrainz: 448258f0-aa5a-4968-8efd-8c9348d5142e
Not found in AcousticBrainz: adcd7079-57d9-49bd-a36b-a20fa27b02b1
Beat positions for d1cd1321-0b66-4848-935e-f3afba6c7356: 0.441179126501,0.905578196049,1.369977355,1.83437633514,2.29877543449,2.76317453384
Not found in AcousticBrainz: e1f433be-af6b-4b5d-a969-4b53f014c395

~ SINGmaster Android App uses TarsosDSP

Singmaster logoTarsosDSP is a real-time audio processing library written in Java. Since version 2.0 it is compatible with Android. Judging by the number of forks of the TarsosDSP GitHub repository Android compatibility increased the popularity of the library. Now the first Android application which uses TarsosDSP has found its way to the Google Play store. Download and play with SINGmaster to see an application of the pitch tracking capabilities within TarsosDSP. The SINGmaster description:

SING master is a smart phone app that helps you to learn how to sing. SING master presents a collection of practical exercises (on the most important building blocks of melodies). Colours and sounds guide you in the exercise. After recording, SING master gives visual feedback : you can see and hear your voice. This is important so that you can identify where your mistakes are.”

Another application in the Play Store that uses TarsosDSP is CuePitcher.

  • SINGMaster in action

    SINGMaster in action

  • SINGMaster screenshot

    SINGMaster screenshot


~ OSC in Matlab on Windows, Linux and Mac OS X using Java

matlab logoThis post explains how to receive OSC in a MatLab environment. It uses a platform independent Java library which should work on 64 and 32 bit versions of Windows, Unix and Mac OS X. Using Java makes installation relatively easy compared with other solutions.

The most used method to get OSC-messages in Matlab can be found here. This method uses a library called liblo which needs to be configured (compiled) correctly on your system. Especially on Windows this can be problematic. A brave soul documented his quest to get OSC working with Matlab on Windows here. Obviously not for the faint of heart.

An alternative way leverages the Matlab facilities to run Java. Since there is a Java OSC library available (JavaOSC on github) it is relatively easy to bridge the two. To make the connection, I have written some glue code and provide an easy to use Jar-library here. Using the bridge is done as follows:

How to make Matlab receive OSC-messages

  1. Download the JavaOSCtoMatlab Java library and store it in an easy to remember directory.
  2. Download the example Matlab OSC client Script and store it in the same directory. The client is included below as well.
  3. Start Matlab, modify the client script to fit your needs. You probably need to change the OSC method to listen to and the OSC port. Also make sure that the cd command points to the directory with the downloaded jar-file.
  4. Run the client script and receive your OSC messages.

Note that there are three ways to receive the payload of a message. They are returned by the Java code as either Object[], double[] or String[]. The last two are automatically understood by Matlab, so they are more easy to work with. Respectively to get the message data you need to call either osc_listener.getMessageArguments(), osc_listener.getMessageArgumentsAsDouble(), osc_listener.getMessageArgumentsAsString().

I hope this is useful to some…

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
cd('C:/dir/with/jar/file/')

% Check your java version 1.6+ should be ok
version -java
% Load the jar file
javaaddpath('javaosctomatlab.jar');
% Import the needed java packages
import com.illposed.osc.*;
import java.lang.String

% defines the OSC port to listen to
receiver =  OSCPortIn(4000);
% defines the OSC method to listen to
osc_method = String('/ECG');
osc_listener = MatlabOSCListener();
receiver.addListener(osc_method,osc_listener);
receiver.startListening();

%infinite loop, receiving all non empty messages 
while(1)
    struct = osc_listener.getMessageArgumentsAsDouble();
     if ~isempty(struct)
         struct
     end
end


receiver.stopListening();
receiver=0;

~ Measuring Audio Output Latency on Android Lollipop using an Arduino

This post explains how to measure audio output latency on Android devices. To measure audio latency USB-OTG and an Arduino is used. In the process it documents audio output latency on an LG Nexus 5 device running the most recent version of Android, which currently is Lollipop (5.0).

Audio latency is an important aspect of a system, especially if it is used for real-time sonification or for musical applications. Audio latency is the, preferably short, delay between audio entering a system and emerging from a system. Audio output latency is the time it takes between a signal (e.g. a button pressed) and when audio emerges. For sonification purposes audio output latency is more interesting than round-trip audio latency.

Android systems are often portable, generally available and relatively cheap. Android offers an attractive platform to develop sonifications or musical applications for. Unfortunately, audio latency on Android has not been a priority in the first versions. With Android 4.1 things started to change but due to hard- and software fragmentation it is still hard to find how much audio latency is expected. Even if the exact model (e.g. Nexus 5) and software version (stock Android 5.0) is known, exact numbers are, so it seems, nowhere to be found. For more information on the internal changes that make low latency audio on Android possible, watch the talk on High Performance Audio from the 2013 Google I/O conference. Also note the lack of exact latency numbers in that talk. It is a very enjoyable talk by two Google engineers going after the culprits of high latency in true Sherlock/dr. Watson style.

Since audio output latency is generally not documented and since it is an important factor to decide if Android is a viable platform for real-time sonification or musical applications it needs to be measured. One way of measuring audio output latency on Android is documented by the people of Google. Unfortunately, the approach is not easily reproducible since it needs a custom circuit board, an oscilloscope and there is no source code available. Below a reproducible way to measure audio output latency for Android is documented.

An Arduino, an Android device, an USB-OTG cable and a butchered mini-jack audio cable are needed together with the software provided here. Optionally, a data acquisition module can be used to visualize the signals. The measurement system works as follows:

  1. An Arduino sends a signal over USB. The time at which the signal is send is stored for later use.
  2. An Android device, connected to the Arduino via an USB-OTG-cable, receives the signal.
  3. The Android device responds as quickly as possible, with the lowest latency as possible, by emitting a sound.
  4. The sound is captured on an analog input port of the Arduino, via the mini-jack cable. The time the sound appears on the Arduino is stored.
  5. By comparing the time when the signal was send with the time when the sound arrived, the audio output latency is measured and reported.

The previous steps are repeated every second to gain insights into the variability of the measurements. To generate microsecond accurate timing interrupts are used on the Arduino. For visualisation, a digital pin is toggled every time the Arduino sends a signal. The Arduino sketch is attached to this post, as is the source code for the Android application. An already compiled APK is also available. With some luck – a recent Android version is needed, your device should support USB-OTG – it might work on your device.

Results

Using the OpenSL ES native interface on a Nexus 5 with Lollipop installed the USB input to audio output latency is on average about 48 milliseconds. There is some variability but it is usually within 15 milliseconds. For music applications this latency is not great but, depending on the application, acceptable. For expert drummers latency should be in the range of 20ms but for many sonification tasks, 50ms suffices. It is clear that Android will never be able to compete with purpose built hardware running a real time operating system like Axoloti (Audio roundtrip latency 2ms, usb-audio 1.6ms) but for a general purpose device the measured latency is significantly better than what I expected (around 100ms).

The non-native audio interface is a lot slower. I have measured an average latency of about 85ms and a much larger variability (25ms).

With this post I hope others will report the latency for their devices as well, so that buyers that are interested in a low-latency Android devices can make an informed decision.

  • The latency visualized.

    The latency visualized.

  • Arduino wiring.

    Arduino wiring.

  • The DAC used.

    The DAC used.

  • Arduino and the DAC

    Arduino and the DAC

  • Result on Android.

    Result on Android.

  • Onsets and audio visualized using a DAC and a Java program.

    Onsets and audio visualized using a DAC and a Java program.


~ Axoloti: a digital audio platform for makers

Currently, there is a crowd-funding campaign ongoing about Axoloti . Axoloti is a very cool project by Johannes Taelman. It is a stand alone audio processing unit that can be used as a synthesizer, groovebox, guitar effect pedal, as a part of a sound installation, or for about any other audio application you can think of.

Axeloti is controlled by a patcher environment and once it is programmed it operates as a stand alone unit. For more information, visit the Axoloti Website, watch the video below and and fund Axoloti.

Update: Good news everyone! Axoloti has been funded!

  • Axoloti Board

    Axoloti Board

  • Axoloti Patch

    Axoloti Patch

  • Axoloty Party!

    Axoloty Party!

  • Axoloti Logo

    Axoloti Logo


~ TarsosLSH in a Photomosaic Web App

TarsosLSH is a Java library implementing Locality-sensitive Hashing (LSH), a practical nearest neighbor search algorithm for high dimensional vectors that operates in sublinear time. The open source software package is authored by me and is available on GitHub: TarsosLSH on GitHub.

With TarsosLSH, Joseph Hwang and Nicholas Kwon from Rice University created an Image Mosaic web application. The application chops an uploaded photo into small blocks. For each block, a color histogram is created and compared with an index of color histograms of reference images. Subsequently each block is replaced with one of the top three nearest neighbors, creating a mosaic. Since high dimensional nearest neighbor search is needed, this is an ideal application for TarsosLSH. The application somewhat proves that TarsosLSH can be used in practical applications, which is comforting.

  • The Starry Night, by Van Ghogh in Mosaic as created by the mosaic webapplication.

    The Starry Night, by Van Ghogh in Mosaic as created by the mosaic webapplication.

  • The Starry Night, by Van Ghogh - Original

    The Starry Night, by Van Ghogh - Original


Previous entries »