
~ Axoloti: a digital audio platform for makers

Currently, a crowd-funding campaign for Axoloti is ongoing. Axoloti is a very cool project by Johannes Taelman. It is a stand-alone audio processing unit that can be used as a synthesizer, groovebox, guitar effects pedal, as part of a sound installation, or for just about any other audio application you can think of.

Axoloti is controlled by a patcher environment and, once programmed, it operates as a stand-alone unit. For more information, visit the Axoloti website, watch the video below and fund Axoloti.

Update: Good news everyone! Axoloti has been funded!


~ Using the Advantech USB-4716 Data Acquisition Module on a Linux System

Below are some notes on installing and using the drivers for the Advantech USB-4716 on Linux. Since I was unable to find these instructions elsewhere and it took me some time to figure things out, they may be of use to someone else. A similar approach should work for the following devices as well: pci1715, pci1724, pci1734, pci1752, pci1758, pcigpdc, usb4711a, usb4750, pci1711, pci1716, pci1727, pci1747, pci1753_mic3753_pcm3753i, pci1761_pcm3761i, pcm3810i, usb4716, usb4761, pci1714_pcie1744, pci1721, pci1730_pcm3730i, pci1750, pci1756, pci1762, usb4702_usb4704, usb4718

Download the Linux driver for the Advantech USB-4716 DAQ. If you are on a system that can install either deb or rpm packages, use the driver_package. Unzip the package. The driver is split into two parts: a base driver, biokernbase, and a driver specific to the USB-4716 device, bio4716. The drivers are Linux kernel modules that need to be installed. The base driver needs to be installed first; the order is important. After the base driver, install the device-specific deb kernel module. After a reboot, or perhaps immediately, this should be the result of executing lsmod | grep bio:

bio4716 23724 0
biokernbase 17983 1 bio4716
usbcore 128741 9 ehci_hcd,uhci_hcd,usbhid,usb_storage,snd_usbmidi_lib,snd_usb_audio,biokernbase,bio4716

A library to interface with the hardware is provided as a deb package as well. Install this library on your system.

Next, download the examples for the Advantech USB-4716 DAQ. With the kernel modules installed, the system is ready to test the examples in the provided examples directory. If you are using the Java code, make sure to set the java.library.path correctly.
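
Setting it can be done with a standard JVM argument. The jar name below is hypothetical; point java.library.path to the directory containing the provided shared library:

java -Djava.library.path=/path/to/library/directory -jar Usb4716Example.jar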


~ Audio Fingerprinting - Opportunities for digital musicology

On the 27th of November, 2014 a lecture on audio fingerprinting and its applications for digital musicology will be given at IPEM. The lecture introduces audio fingerprinting, explains an audio fingerprinting technique and then goes on to explain how such an algorithm offers opportunities for large-scale digital musicological applications. Here you can download the slides about audio fingerprinting and its opportunities for digital musicology.

With the explained audio fingerprinting technique a specific form of very reliable musical structure analysis can be done. Below, in the figure section, an example of repetitive structure in the song Ribs Out is shown. Another example is comparing edits or versions of songs. Below, also in the figure section, the radio edit of Daft Punk’s Get Lucky is compared with the original version. Audio synchronization using fingerprinting is another application that is actively used in the field of digital musicology to align audio with extracted features.

Since acoustic fingerprinting makes structure analysis very efficient, it can be applied on a large scale (20k songs). The figure below shows that identical repetition is something that has been used more and more since the mid-1970s. The trend probably aligns with the amount of technical knowledge needed to ‘copy and paste’ a snippet of music.

Fig: How much identical repetition is used in music, over the years.

The Panako audio fingerprinting system was used to generate data for these case studies. The lecture and this post are partly inspired by a blog post by Paul Brossier.


~ ISMIR 2014 - Panako - A Scalable Acoustic Fingerprinting System Handling Time-Scale and Pitch Modification

At ISMIR 2014 I will present a paper on a fingerprinting system. ISMIR, the annual conference of the International Society for Music Information Retrieval, is the world’s leading interdisciplinary forum on accessing, analyzing, and organizing digital music of all sorts. This year’s instalment takes place in Taipei, Taiwan. My contribution is a paper titled Panako - A Scalable Acoustic Fingerprinting System Handling Time-Scale and Pitch Modification; it will be presented during a poster session on the 27th of October.

This paper presents a scalable granular acoustic fingerprinting system. An acoustic fingerprinting system uses condensed representation of audio signals, acoustic fingerprints, to identify short audio fragments in large audio databases. A robust fingerprinting system generates similar fingerprints for perceptually similar audio signals. The system presented here is designed to handle time-scale and pitch modifications. The open source implementation of the system is called Panako and is evaluated on commodity hardware using a freely available reference database with fingerprints of over 30,000 songs. The results show that the system responds quickly and reliably on queries, while handling time-scale and pitch modifications of up to ten percent.

The system is also shown to handle GSM-compression, several audio effects and band-pass filtering. After a query, the system returns the start time in the reference audio and how much the query has been pitch-shifted or time-stretched with respect to the reference audio. The design of the system that offers this combination of features is the main contribution of this paper.

The system is available, together with documentation and information on how to reproduce the results from the ISMIR paper, on the Panako website. Also available for download are the Panako poster (pdf), the Panako ISMIR paper and the Panako poster (Inkscape SVG).


~ TarsosDSP PureData or MAX MSP external

It makes sense to connect TarsosDSP, a real-time audio processing library written in Java, with patcher environments such as Pure Data and Max/MSP. Both Pure Data and Max/MSP offer the capability to code objects, or externals, using Java. In Pure Data this is done using the pdj~ object, which should be compatible with the Max/MSP implementation. This post demonstrates a patch that connects an oscillator with a pitch tracking algorithm implemented in TarsosDSP.

To the left you can see the finished patch. When it is working, an audio stream is generated using an oscillator. The frequency of the oscillator can be controlled. Subsequently the stream is sent to the Java environment over the pdj bridge. The Java environment receives an array of floats representing the audio. A pitch estimation algorithm tries to find the pitch of the audio represented by that buffer. The detected pitch is returned to the pd environment by means of an outlet. In pd, the detected pitch is shown and used for auditory feedback.

PitchDetectionResult result = yin.getPitch(audioBuffer);
pitch = result.getPitch();
outlet(0, Atom.newAtom(pitch));

Please note that the pitch detection algorithm can handle any audio stream, not only pure sines. The example here demonstrates the most straightforward case. Using this method, all algorithms implemented in TarsosDSP can be used in Pure Data. These range from onset detection to filtering, from audio effects to wavelet compression. For a list of features, please see the TarsosDSP github page. The source for this patch implementing pitch tracking in pd can be downloaded here. To run it, extract it to a directory and simply run the pitch.pd patch. Pure Data should load pdj~ automatically together with the classes present in the classes directory.


~ TarsosDSP on Android - Audio Processing in Java on Android

This post explains how to get TarsosDSP running on Android. TarsosDSP is a Java library for audio processing. Its aim is to provide an easy-to-use interface to practical music processing algorithms, implemented as simply as possible in pure Java, without any external dependencies.

Since version 2.0 there are no more references to javax.sound.* in the TarsosDSP core codebase. This makes it easy to run TarsosDSP on Android. Audio Input/Output operations that depend on either the JVM or Dalvik runtime have been abstracted and removed from the core. For each runtime target a Jar file is provided in the TarsosDSP release directory.

The source code for the audio I/O on the JVM and the audio I/O on Android can be found on GitHub. To get an audio processing algorithm working on Android the only thing that is needed is to place TarsosDSP-Android-2.0.jar in the lib directory of your project.

The following example connects an AudioDispatcher to the microphone of an Android device. Subsequently, a real-time pitch detection algorithm is added to the processing chain. The detected pitch in Hertz is printed on a TextView element; if no pitch is present in the incoming sound, -1 is printed. To test the application, download and install the TarsosDSPAndroid.apk application on your Android device. The source code is available as well.

AudioDispatcher dispatcher = AudioDispatcherFactory.fromDefaultMicrophone(22050, 1024, 0);
PitchDetectionHandler pdh = new PitchDetectionHandler() {
    @Override
    public void handlePitch(PitchDetectionResult result, AudioEvent e) {
        final float pitchInHz = result.getPitch();
        runOnUiThread(new Runnable() {
            @Override
            public void run() {
                TextView text = (TextView) findViewById(R.id.textView1);
                text.setText("" + pitchInHz);
            }
        });
    }
};
AudioProcessor p = new PitchProcessor(PitchEstimationAlgorithm.FFT_YIN, 22050, 1024, pdh);
dispatcher.addAudioProcessor(p);
new Thread(dispatcher, "Audio Dispatcher").start();

Thanks to these changes, the fork of TarsosDSP kindly provided by GitHub user srubin, created for a programming assignment at UC Berkeley, is no longer needed.

Have fun hacking audio on Android!


~ Haar Wavelet Transform in TarsosDSP

The TarsosDSP Java library for audio processing now contains an implementation of the Haar wavelet transform, a discrete wavelet transform based on the Haar wavelet (depicted at the right). This reversible transform has some interesting properties and is practical in signal compression and for analyzing sudden transitions in a signal. It can e.g. be used to detect edges in an image.

As an example use case of the Haar transform, a simple lossy audio compression algorithm is implemented in TarsosDSP. It compresses audio by dividing it into blocks of 32 samples, transforming each block using the Haar wavelet transform and subsequently removing the samples with the least difference between them. The last step is to reverse the transform and play the audio. The number of removed samples can be chosen between 0 (no compression) and 31 (no signal left). This crude lossy audio compression technique can discard at least a tenth of the samples without any noticeable effect. A way to store the audio and read it from disk is included as well.
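
To make the compression step concrete, below is a minimal sketch of the Haar transform itself: pairs of samples are repeatedly replaced by their average and their difference. Small differences are exactly the samples the compressor throws away. This is an illustration of the principle, not the TarsosDSP implementation; class and method names are made up for this post.

```java
// Minimal sketch of a Haar wavelet transform on one block of samples.
// Illustrative only; TarsosDSP's implementation may differ in
// normalization and structure.
public class HaarSketch {

    // In-place transform; block length is assumed a power of two, e.g. 32.
    public static void transform(float[] block) {
        int length = block.length;
        float[] tmp = new float[length];
        while (length > 1) {
            int half = length / 2;
            for (int i = 0; i < half; i++) {
                tmp[i] = (block[2 * i] + block[2 * i + 1]) / 2.0f;        // averages (low pass)
                tmp[half + i] = (block[2 * i] - block[2 * i + 1]) / 2.0f; // differences (detail)
            }
            System.arraycopy(tmp, 0, block, 0, length);
            length = half;
        }
    }
}
```

Since averages and differences fully determine each sample pair, the transform can be reversed exactly, which is what makes the compression scheme work.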

The algorithm works in real time and an example application has been implemented which operates on an mp3 stream. To make this work immediately, the avconv tool needs to be on your system’s path. Also implemented is a bit depth compressor, which shows the effect of (extreme) bit depth compression.

The example is available at the TarsosDSP release directory, the code can be found on the TarsosDSP github page.


~ TarsosDSP Spectral Peak extraction

The TarsosDSP Java library for audio processing now contains a module for spectral peak extraction. It calculates a short-time Fourier transform and subsequently finds the frequency bins with the most energy present, using a median filter. The frequency estimate for each identified bin is significantly improved by taking phase information into account, a method described in “Sethares et al. 2009 - Spectral Tools for Dynamic Tonality and Audio Morphing”.
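
As a rough sketch of how a median filter can serve as a noise floor for peak picking, consider the fragment below. It is illustrative only and not the TarsosDSP API; the actual module additionally refines each peak’s frequency estimate with phase information.

```java
import java.util.Arrays;

// Sketch: flag FFT bins as spectral peaks when they are local maxima that
// rise sufficiently far above a median-filtered noise floor.
public class PeakPickingSketch {
    public static boolean[] pickPeaks(float[] magnitudes, int halfWindow, float factor) {
        boolean[] isPeak = new boolean[magnitudes.length];
        for (int i = 1; i < magnitudes.length - 1; i++) {
            // the local noise floor is the median of the surrounding magnitudes
            int from = Math.max(0, i - halfWindow);
            int to = Math.min(magnitudes.length, i + halfWindow + 1);
            float[] window = Arrays.copyOfRange(magnitudes, from, to);
            Arrays.sort(window);
            float noiseFloor = window[window.length / 2];
            boolean localMax = magnitudes[i] > magnitudes[i - 1]
                    && magnitudes[i] >= magnitudes[i + 1];
            isPeak[i] = localMax && magnitudes[i] > factor * noiseFloor;
        }
        return isPeak;
    }
}
```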

The noise floor determined by the median filter, the spectral information itself and the estimated peak locations are returned for each FFT frame. Below, a visualization of a flute tone can be found. As expected, the peaks are spread harmonically over the complete spectrum, up to the Nyquist frequency.


~ International School on Systematic Musicology and Sound and Music Computing (ISSSM) 2014, Genova

From 9 to 20 March 2014 I was a student at the International School on Systematic Musicology and Sound and Music Computing. The aim of the course was to:

- Give students an intensive course in the most advanced and current topics in the research fields of systematic musicology and sound and music computing.
- Give students the opportunity to discuss their research proposals/projects with an international staff of teachers representing a variety of expertise in different domains of systematic musicology and sound and music computing.
- Teach students the most recent knowledge and basic skills needed to start a PhD.
- Give students the opportunity to join the research communities on systematic musicology and on sound and music computing.

Next to the lectures, the informal meetings with the professors were very interesting. I also got to add some things to my ‘to read’ list.



~ Power Socket Control with Arduino

This post contains some info on how to do some basic home automation: it shows how cheap remote-controlled power sockets can be managed using a computer. The aim is to power on or power off lights, a stereo or other devices remotely from a command shell.

The solution here uses an Arduino connected to a 433.33MHz transmitter. Via a Ruby script installed on the computer, a command is sent over serial to the Arduino. Subsequently the Arduino sends the command over the air to the power socket(s). If all goes well, the power socket reacts by switching the connected device on or off.

In the video below the process is shown. The command line interface controls the light via the Arduino. It should show the general idea.

The following Ruby script simply sends the binary control codes to the Arduino. For this type of power socket the code consists of a five-bit group code and a five-bit device code. The Arduino is connected to /dev/tty.usbmodem411.

```ruby
require 'rubygems'
require 'serialport'

group = "11111"

lamp      = "01000" #B
kerstboom = "00100" #C
stereo    = "00010" #D

port = "/dev/tty.usbmodem411"
baud_rate = 9600
data_bits = 8
stop_bits = 1
parity = SerialPort::NONE

command = ARGV[1] == "on"

device_string = ARGV[0]
device = if device_string == "kerstboom"
  kerstboom
elsif device_string == "lamp"
  lamp
elsif device_string == "stereo"
  stereo
end

def send(sp, group, device, deviceOn)
  command = deviceOn ? "1" : "0"
  command.each_char{|c| sp.write(c)}
  group.each_char{|c| sp.write(c)}
  device.each_char{|c| sp.write(c)}
  sp.flush
  read_response sp
  read_response sp
end

def read_response(sp)
  response = sp.readline
  puts response.chomp
end

SerialPort.open(port, baud_rate, data_bits, stop_bits, parity) do |sp|
  read_response sp
  send(sp, group, device, command)
end
```

The code below is the complete Arduino sketch. It uses the RCSwitch library, which makes the implementation very simple. Essentially it waits for a complete command and transmits it through the connected transmitter. The transmitter connected is a TX433N.

```c
#include <RCSwitch.h>

RCSwitch mySwitch = RCSwitch();

char command[12]; //2x5 for device and group + command
int index = 0;
char currentChar = -1;

//the led pin in use
int ledPin = 12;

void setup() {
  //start the serial communication
  Serial.begin(9600);
  // 433MHz transmitter is connected to Arduino pin #10
  mySwitch.enableTransmit(10);
  //Led connected to led pin
  pinMode(ledPin, OUTPUT);
  Serial.println("Started the power command center! Mwoehahaha!");
}

void readCommand(){
  //read a command
  while (Serial.available() > 0){
    if(index < 11){
      currentChar = Serial.read(); // Read a character
      command[index] = currentChar; // Store it
      index++; // Increment where to write next
      command[index] = '\0'; // append termination char
    }
  }
}

void loop() {
  //read a command
  readCommand();
  //if a command is complete
  if(index == 11){
    Serial.print("Received command: ");
    Serial.println(command);
    char operation = command[0];
    char* group = &command[1];
    //group is 5 bits, as is device
    char* device = &command[6];
    //execute the operation
    doSwitch(operation, group, device);
    //reset the index to read a new command
    index = 0;
  }
}

void doSwitch(char operation, char* group, char* device){
  digitalWrite(ledPin, HIGH);
  if(operation == '1'){
    mySwitch.switchOn(group, device);
    Serial.print("Switched on device ");
  } else {
    mySwitch.switchOff(group, device);
    Serial.print("Switched off device ");
  }
  Serial.println(device);
  digitalWrite(ledPin, LOW);
}
```


~ TarsosDSP Paper and Presentation at AES 53rd International conference on Semantic Audio

TarsosDSP will be presented at the AES 53rd International conference on Semantic Audio in London. During the conference both a presentation and a demonstration will be given of the paper “TarsosDSP, a Real-Time Audio Processing Framework in Java”:[aes53_tarsos_dsp.pdf], by Joren Six, Olmo Cornelis and Marc Leman, in Proceedings of the 53rd AES Conference (AES 53rd), 2014. From their website:

Semantic Audio is concerned with content-based management of digital audio recordings. The rapid evolution of digital audio technologies, e.g. audio data compression and streaming, the availability of large audio libraries online and offline, and recent developments in content-based audio retrieval have significantly changed the way digital audio is created, processed, and consumed. New audio content can be produced at lower cost, while also large audio archives at libraries or record labels are opening to the public. Thus the sheer amount of available audio data grows more and more each day. Semantic analysis of audio resulting in high-level metadata descriptors such as musical chords and tempo, or the identification of speakers facilitate content-based management of audio recordings. Aside from audio retrieval and recommendation technologies, the semantics of audio signals are also becoming increasingly important, for instance, in object-based audio coding, as well as intelligent audio editing, and processing. Recent product releases already demonstrate this to a great extent, however, more innovative functionalities relying on semantic audio analysis and management are imminent. These functionalities may utilise, for instance, (informed) audio source separation, speaker segmentation and identification, structural music segmentation, or social and Semantic Web technologies, including ontologies and linked open data.

This conference will give a broad overview of the state of the art and address many of the new scientific disciplines involved in this still-emerging field. Our purpose is to continue fostering this line of interdisciplinary research. This is reflected by the wide variety of invited speakers presenting at the conference.

The paper presents TarsosDSP, a framework for real-time audio analysis and processing. Most libraries and frameworks offer either audio analysis and feature extraction or audio synthesis and processing. TarsosDSP is one of only a few frameworks that offer analysis, processing and feature extraction in real time, a unique feature in the Java ecosystem. The framework contains practical audio processing algorithms, it can be extended easily, and has no external dependencies. Each algorithm is implemented as simply as possible thanks to a straightforward processing pipeline. TarsosDSP’s features include a resampling algorithm, onset detectors, a number of pitch estimation algorithms, a time stretching algorithm, a pitch shifting algorithm, and an algorithm to calculate the Constant-Q transform. The framework also allows simple audio synthesis, some audio effects, and several filters. The Open Source framework is a valuable contribution to the MIR community and an ideal fit for interactive MIR applications on Android. The full paper can be downloaded: “TarsosDSP, a Real-Time Audio Processing Framework in Java”:[aes53_tarsos_dsp.pdf].

A BibTeX entry for the paper can be found below.

```
@inproceedings{six2014tarsosdsp,
  author    = {Joren Six and Olmo Cornelis and Marc Leman},
  title     = {{TarsosDSP, a Real-Time Audio Processing Framework in Java}},
  booktitle = {{Proceedings of the 53rd AES Conference (AES 53rd)}},
  year      = 2014
}
```


~ Doctoral defense Olmo Cornelis - Exploring the Symbiosis of Western and non-Western Music

On Wednesday the 18th of December 2013, Olmo Cornelis organized a concert in the context of his doctorate. The day after, his defense followed. Once more: congratulations Olmo with the beautiful, eeh, mbira-point. Below is a short explanation of the project and the concert.

In his research project ‘Exploring the symbiosis of Western and non-Western Music’, Olmo Cornelis focused on the description of Central African music. This music was explored using computational techniques that approach the sound as a signal. The information obtained influenced his artistic oeuvre, in which a mixture of implicit and explicit ethnic influences is always at play.

To round off this doctoral research, the HERMESensemble, the Nadar Ensemble, Maja Jantar and Françoise Vanhecke performed works by Olmo Cornelis written during this project on the 18th of December. The research project Exploring the symbiosis of Western and non-Western Music was initiated in 2008 at the Conservatory / School of Arts of HoGent and was funded by the research fund of Hogeschool Gent.

Image: Noel Cornelis, Reality of Possibilities, 2012


~ IPEM D-Jogger featured on RTBF

Yesterday, December 4th 2013, the RTBF (Radio Télévision Belge Francophone) was at IPEM (Institute for Psychoacoustics and Electronic Music) to do a small feature on research currently going on at the institute. The RTBF is the public broadcasting organization of the French Community of Belgium, the southern, French-speaking part of Belgium. The clip shows the D-Jogger in action, with me using it. The fragment is available on the RTBF website and is embedded below.


~ Constant-Q Transform in Java with TarsosDSP

The DSP library for Tarsos, aptly named TarsosDSP, now includes an implementation of a Constant-Q transform (as of version 1.6). The Constant-Q transform does essentially the same thing as an FFT, but has the advantage that each octave contains the same number of bins. This makes the Constant-Q transform practical for applications that process music: if, for example, 12 bins per octave are chosen, these correspond with the Western musical scale.
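
What “the same number of bins per octave” means in practice is easiest to see from how the bin center frequencies are laid out: bin k sits at minFreq · 2^(k / binsPerOctave). A small sketch (the names are illustrative, not the TarsosDSP API):

```java
// The bins of a Constant-Q transform are spaced geometrically, so every
// octave gets exactly binsPerOctave bins. With binsPerOctave = 12 the bin
// centers line up with the equal-tempered Western scale.
public class ConstantQBinsSketch {
    public static double[] binFrequencies(double minFreq, double maxFreq, int binsPerOctave) {
        int bins = (int) Math.ceil(binsPerOctave * Math.log(maxFreq / minFreq) / Math.log(2));
        double[] frequencies = new double[bins];
        for (int k = 0; k < bins; k++) {
            frequencies[k] = minFreq * Math.pow(2, k / (double) binsPerOctave);
        }
        return frequencies;
    }
}
```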

Also included in the newest release (version 1.7) is a way to visualize the transform, or other musical features. The visualization implementation is done together with Thomas Stubbe.

The example application below shows the Constant-Q transform with an overlay of pitch estimations. The corresponding waveform is also shown.

Constant-Q transform in Java

Find your oven fresh baked binaries at the TarsosDSP Release Repository.
The source code can be found at the TarsosDSP GitHub repository.


~ Evaluation and Recommendation of Pulse and Tempo Annotation in Ethnic Music - In Journal Of New Music Research

The journal paper Evaluation and Recommendation of Pulse and Tempo Annotation in Ethnic Music by Cornelis, Six, Holzapfel and Leman was published in a special issue on Computational Ethnomusicology of the Journal of New Music Research (JNMR) on the 20th of August 2013. Below you can find the abstract of the article, and the full text author version of the article itself.

Abstract: Large digital archives of ethnic music require automatic tools to provide musical content descriptions. While various automatic approaches are available, they are to a wide extent developed for Western popular music. This paper aims to analyze how automated tempo estimation approaches perform in the context of Central-African music. To this end we collect human beat annotations for a set of musical fragments, and compare them with automatic beat tracking sequences. We first analyze the tempo estimations derived from annotations and beat tracking results. Then we examine an approach, based on mutual agreement between automatic and human annotations, to automate such analysis, which can serve to detect musical fragments with high tempo ambiguity.

To read the full text you can either download “Evaluation and Recommendation of Pulse and Tempo Annotation in Ethnic Music, Author version”:[2013.10.09.Tempo_annotation_in_ethnic_music-JNMR-Author_Version.pdf], or obtain the published version of Evaluation and Recommendation of Pulse and Tempo Annotation in Ethnic Music.

Below, the BibTeX entry for the article is embedded.

```
@article{cornelis2013tempo_jnmr,
  author  = {Olmo Cornelis and Joren Six and Andre Holzapfel and Marc Leman},
  title   = {{Evaluation and Recommendation of Pulse and Tempo Annotation in Ethnic Music}},
  journal = {{Journal of New Music Research}},
  volume  = {42},
  number  = {2},
  pages   = {131--149},
  year    = {2013},
  doi     = {10.1080/09298215.2013.812123}
}
```


~ Tarsos, a Modular Platform for Precise Pitch Analysis of Western and Non-Western Music - In Journal Of New Music Research

The journal paper Tarsos, a Modular Platform for Precise Pitch Analysis of Western and Non-Western Music by Six, Cornelis, and Leman was published in a special issue on Computational Ethnomusicology of the Journal of New Music Research (JNMR) on the 20th of August 2013. Below you can find the abstract of the article, and pointers to audio examples, the Tarsos software, and the author version of the article itself.

Abstract: This paper presents Tarsos, a modular software platform used to extract and analyze pitch organization in music. With Tarsos pitch estimations are generated from an audio signal and those estimations are processed in order to form musicologically meaningful representations. Tarsos aims to offer a flexible system for pitch analysis through the combination of an interactive user interface, several pitch estimation algorithms, filtering options, immediate auditory feedback and data output modalities for every step. To study the most frequently used pitches, a fine-grained histogram that allows up to 1200 values per octave is constructed. This allows Tarsos to analyze deviations in Western music, or to analyze specific tone scales that differ from the 12 tone equal temperament, common in many non-Western musics. Tarsos has a graphical user interface or can be launched using an API - as a batch script. Therefore, it is fit for both the analysis of individual songs and the analysis of large music corpora. The interface allows several visual representations, and can indicate the scale of the piece under analysis. The extracted scale can be used immediately to tune a MIDI keyboard that can be played in the discovered scale. These features make Tarsos an interesting tool that can be used for musicological analysis, teaching and even artistic productions.
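As a small illustration of the fine-grained histogram mentioned in the abstract: pitch estimates in Hertz can be converted to cents and folded into a single octave of 1200 bins. The sketch below illustrates the idea; the reference frequency is an assumption made for this example and is not prescribed by the paper.

```java
// Sketch: fold pitch estimates (in Hz) into a pitch class histogram with
// 1200 bins per octave. Illustrative only; Tarsos offers many more options.
public class PitchClassHistogramSketch {
    public static int[] histogram(double[] pitchesInHz) {
        double referenceFrequency = 8.176; // an assumed reference (MIDI note 0)
        int[] histogram = new int[1200];
        for (double hz : pitchesInHz) {
            // distance to the reference in cents, 1200 cents per octave
            double cents = 1200 * Math.log(hz / referenceFrequency) / Math.log(2);
            int pitchClass = ((int) Math.round(cents)) % 1200; // fold into one octave
            if (pitchClass < 0) pitchClass += 1200;
            histogram[pitchClass]++;
        }
        return histogram;
    }
}
```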

To read the full text you can either download “Tarsos, a Modular Platform for Precise Pitch Analysis of Western and Non-Western Music, Author version”:[2013.08.20.tarsos_jnmr_author_version.pdf], or obtain the published version of Tarsos, a Modular Platform for Precise Pitch Analysis of Western and Non-Western Music.

Ladrang Kandamanyura (slendro pathet manyura) is the name of the piece used in the article throughout section 2. The album on which the piece can be found is available at wergo. Below, a thirty-second fragment is embedded. You can also “download”:[08._Ladrang_Kandamanyura_10s-20s_up.wav] the thirty-second fragment to analyse it yourself.


Below, the BibTeX entry for the article is embedded.

```
@article{six2013tarsos_jnmr,
  author  = {Six, Joren and Cornelis, Olmo and Leman, Marc},
  title   = {Tarsos, a Modular Platform for Precise Pitch Analysis of Western and Non-Western Music},
  journal = {Journal of New Music Research},
  volume  = {42},
  number  = {2},
  pages   = {113--129},
  year    = {2013},
  doi     = {10.1080/09298215.2013.797999},
  url     = {http://www.tandfonline.com/doi/abs/10.1080/09298215.2013.797999}
}
```


~ FMA 2013 - Computer Assisted Transcription of Ethnic Music

At the third international workshop on Folk Music Analysis (FMA2013) we presented a poster titled “Computer Assisted Transcription of Ethnic Music”:[A0-Poster.png]. The workshop took place in Amsterdam, Netherlands, June 6 and 7, 2013.

In the extended abstract, also titled “Computer Assisted Transcription of Ethnic Music”:[FMA_2013.computer_assisted_transcription.pdf], it is described how the Tarsos software program now has features that aid transcription. Tarsos is especially practical for ethnic music of which the tone scale is not known beforehand. The proceedings of FMA 2013 are available as well.

"Computer Assited Transcription of Ethnic Music poster":\[A0-Poster.jpg\]

During the conference there also was an interesting panel on transcription. The following people participated: John Ashley Burgoyne, moderator (University of Amsterdam), Kofi Agawu (Princeton University), Dániel P. Biró (University of Victoria), Olmo Cornelis (University College Ghent, Belgium), Emilia Gómez (Universitat Pompeu Fabra, Barcelona), and Barbara Titus (Utrecht University). Some pictures can be found below.


~ TarsosLSH - Locality Sensitive Hashing (LSH) in Java

TarsosLSH is a Java library implementing locality-sensitive hashing (LSH), a practical nearest neighbour search algorithm for multidimensional vectors that operates in sublinear time. It supports several LSH families: the Euclidean hash family (L2), the city block hash family (L1) and the cosine hash family. The library tries to hit the sweet spot between being capable enough to get real tasks done, and compact enough to serve as a demonstration of how LSH works. It relates to the Tarsos project because it is a practical way to search for and compare musical features.
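
To give a feel for how such a hash family works, the sketch below shows the classic hash function of the Euclidean (L2) family: project the vector onto a random direction, add a random offset and quantize into buckets of width w. Vectors that are close in Euclidean space are likely to land in the same bucket. The class is illustrative, not the TarsosLSH API.

```java
import java.util.Random;

// Sketch of one hash function from the Euclidean (L2) LSH family.
// Several such functions are combined into hash tables to trade
// precision for recall.
public class L2HashSketch {
    private final double[] randomProjection;
    private final double offset;
    private final double w; // bucket width

    public L2HashSketch(int dimensions, double w, Random random) {
        this.w = w;
        this.offset = random.nextDouble() * w;
        this.randomProjection = new double[dimensions];
        for (int d = 0; d < dimensions; d++) {
            // a direction with Gaussian components is uniform on the sphere
            randomProjection[d] = random.nextGaussian();
        }
    }

    public int hash(double[] vector) {
        double dot = 0;
        for (int d = 0; d < vector.length; d++) {
            dot += vector[d] * randomProjection[d];
        }
        return (int) Math.floor((dot + offset) / w);
    }
}
```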

Quickly Getting Started with TarsosLSH

Head over to the TarsosLSH release repository and download the latest TarsosLSH library. Consult the TarsosLSH API documentation. If you, for some reason, want to build from source, you need Apache Ant and git installed on your system. The following commands fetch the source and build the library and example jars:

<code>git clone https://JorenSix@github.com/JorenSix/TarsosLSH.git
cd TarsosLSH/build
ant  #Builds the core TarsosLSH library
ant javadoc #build the API documentation
</code>

When everything runs correctly you should be able to run the command line application, and have the latest version of the TarsosLSH library for inclusion in your projects. Also, the Javadoc documentation for the API should be available in TarsosLSH/doc. Drop me a line if you use TarsosLSH in your project. It is always nice to hear how this software is used.

The fastest way to get something on your screen is executing this on your command line: java -jar TarsosLSH.jar. This lets LSH run on a random data set. The full reference of the command line application is included below:

Name
    TarsosLSH: finds the nearest neighbours in a data set quickly, using LSH.
Synopsis    
    java -jar TarsosLSH.jar [options] dataset.txt queries.txt
Description
    Tries to find nearest neighbours for each vector in the 
    query file, using Euclidean (L2) distance by default.

    Both dataset.txt and queries.txt have a similar format: 
    an optional identifier for the vector and a list of N 
    coordinates (which should be doubles).

    [Identifier] coord1 coord2 ... coordN
    [Identifier] coord1 coord2 ... coordN

    For an example data set with two elements and 4 dimensions:

    Hans 12 24 18.5 -45.6
    Jane 13 19 -12.0 49.8

    Options are:

    -f cos|l1|l2 
        Defines the hash family to use:
            l1  City block hash family (L1)
            l2  Euclidean hash family(L2)
            cos Cosine distance hash family
    -r radius 
        Defines the radius in which near neighbours should
        be found. Should be a double. By default a reasonable
        radius is determined automatically.
    -h n_hashes
        An integer that determines the number of hashes to 
        use. By default 4, 32 for the cosine hash family.
    -t n_tables
        An integer that determines the number of hash tables,
        each with n_hashes, to use. By default 4.
    -n n_neighbours
        Number of neighbours in the neighbourhood, defaults to 3.
    -b 
        Benchmark the settings. 
    --help 
        Prints this helpful message.
Examples
    Search for nearest neighbours using the l2 hash family with a radius of 500
    and utilizing 5 hash tables, each with 3 hashes.

    java -jar TarsosLSH.jar -f l2 -r 500 -h 3 -t 5 dataset.txt queries.txt

Source Code Organization

The source tree is divided into three directories:

Further Reading

This section includes links to resources used to implement this library.


~ Spotify Music Quiz - With a Big Red Button

This post documents how to implement a simple music quiz with the meta data provided by Spotify and a big red button.

During my last birthday party, organized at Hackerspace Ghent, there was an ongoing Spotify music quiz. The concept was rather simple: if the music that was playing was created in the same year I was born, the guests could press a big red button and win a prize! If they guessed incorrectly, they were publicly shamed with a sad trombone. Only one guess for each song was allowed.

Below you can find a small video which shows the whole thing in action. The music quiz is simple: press the button if the song was created in 1984. In the video, at first a wrong answer is given, then a couple of invalid answers follow. Finally a correct answer is given: the song “The Killing Moon” by Echo & the Bunnymen is from 1984! Woohoo!

Alright, this is what I used to implement the Spotify music quiz:

The Red Button

A nice big red button, whose main features are that it is red and big, serves as the main input device for the quiz. To be able to connect the salvaged safety button to a computer via USB, an Arduino Nano is used. The Arduino is loaded with the code from the debounce tutorial, basically unchanged. Unfortunately, I could not fit the FTDI chip, which provides the USB connection, in the original enclosure. An additional enclosure, the white one seen in the pictures below, was added.

When pressed, the button sends a signal over the serial line. See below how the Ruby quiz script waits for, and handles, such an event.

Get Meta Data from the Currently Playing Song in Spotify

Another requirement for the quiz to work is the ability to get information about the song currently playing in Spotify. On Linux this can be done via the D-Bus interface for Spotify. The script included below returns information about the artist, album and title of the song. It also detects whether Spotify is running or not. Find it, fork it, use it, on GitHub.

On Mac OS X, Spotify can be controlled using AppleScript. A script on github shows how to get meta data about the currently playing song in Spotify, on Mac OS X.

```python
#!/usr/bin/python
# now_playing.py
# Python script to fetch the meta data of the currently playing
# track in Spotify. This is tested on Ubuntu.

import dbus

bus = dbus.SessionBus()
try:
    spotify = bus.get_object('com.spotify.qt', '/')
    iface = dbus.Interface(spotify, 'org.freedesktop.MediaPlayer2')
    meta_data = iface.GetMetadata()
    artistname = ",".join(meta_data['xesam:artist'])
    trackname = meta_data['xesam:title']
    albumname = meta_data['xesam:album']
    # Other fields are:
    # 'xesam:trackNumber', 'xesam:discNumber', 'mpris:trackid',
    # 'mpris:length', 'mpris:artUrl', 'xesam:autoRating',
    # 'xesam:contentCreated', 'xesam:url'
    print str(trackname + " | " + artistname + " | " + albumname + " | Unknown")
except dbus.exceptions.DBusException:
    print "Spotify is not running."
```

The Quiz Ruby Script

With the individual parts in order, we now need some Ruby glue to paste it all together. The complete music quiz script can be found on GitHub. The main loop, below, waits for a button press. If it is pressed, the “sad trombone”:[sad_trombone.wav], “invalid answer”:[cheater.wav] or “winner”:[winner.wav] sound is played. The sounds are attached to this post.

```ruby
while true do
  # Wait for a button press
  data = sp.readline
  # Fetch meta data about the currently
  # playing song
  result = `#{now_playing_command}`
  # Parse the meta data
  title, artist, album = parse_result(result)
  # Title is hash key, should be unique within playlist
  key = title
  if responded_to.has_key? key
    puts "Already answered: you cheater"
    play cheater
  elsif correct_answers.has_key? key
    puts "Correct answers: woohoo"
    responded_to[key] = true
    play winner
  else
    puts "Incorrect answer: sad trombone"
    responded_to[key] = true
    play sad_trombone
  end
end
```


~ Flanger Audio Effect in Java

The DSP library for Tarsos, aptly named TarsosDSP, now includes an example demonstrating the flanging audio effect. Flanging, essentially mixing the signal with a varying delay of itself, produces an interesting interference pattern.
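
A bare-bones version of that idea looks like the sketch below: a low frequency oscillator modulates the length of a delay line, and the delayed signal is mixed with the dry signal. This only illustrates the principle and is not the TarsosDSP implementation; a real flanger adds a feedback path and wet/dry controls.

```java
// Sketch of a basic flanger: mix the dry signal with a copy of itself,
// delayed by an amount that oscillates slowly over time (the LFO).
public class FlangerSketch {
    public static float[] flange(float[] audio, float sampleRate) {
        float[] out = audio.clone();
        int maxDelay = (int) (0.003f * sampleRate); // delay varies up to 3 ms
        double lfoFrequency = 0.25;                 // delay oscillates at 0.25 Hz
        for (int i = 0; i < audio.length; i++) {
            double lfo = 0.5 * (1 + Math.sin(2 * Math.PI * lfoFrequency * i / sampleRate));
            int delay = (int) (lfo * maxDelay);
            if (i >= delay) {
                out[i] = 0.5f * audio[i] + 0.5f * audio[i - delay]; // dry + delayed
            }
        }
        return out;
    }
}
```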


The flanging example works on wav files or on input from the microphone. Try it yourself: download Flanging.jar, the executable jar file. Below you can check what flanging sounds like with various parameters.

The source code of the Java implementation can be found on the TarsosDSP github page.


~ TarsosDSP Christmas Edition: Jingle Cats

The DSP library for Tarsos, aptly named TarsosDSP, now includes an example showing how to synthesize cat sounds. The inspiration came from this YouTube video.

To hear what exactly it does, listen to the following audio example.

There is also a command line interface; the following command turns a MIDI file into cat sounds:

java -jar Catify-latest.jar in.mid

 _______                       _____   _____ _____  
|__   __|                     |  __ \ / ____|  __ \ 
   | | __ _ _ __ ___  ___  ___| |  | | (___ | |__) |
   | |/ _` | '__/ __|/ _ \/ __| |  | |\___ \|  ___/ 
   | | (_| | |  \__ \ (_) \__ \ |__| |____) | |     
   |_|\__,_|_|  |___/\___/|___/_____/|_____/|_|     

----------------------------------------------------
Name:
    TarsosDSP catify'er
----------------------------------------------------
Synopsis:
    java -jar Catify-latest.jar input.mid
----------------------------------------------------
Description:

The source code of the Java implementation of the catify’er can be found on the TarsosDSP github page.


~ TarsosDSP Pitch Estimation Synthesizer

The DSP library for Tarsos, aptly named TarsosDSP, now includes an example showing how to synthesize pitch estimations. The goal of the example is to show which errors are made by different pitch detectors.

Pitch estimation synthesizer

To test the application, download and execute the Resynthesizer.jar file and load an audio file. For the moment only 44.1kHz mono wav is allowed. To hear what exactly it does, compare the following two audio fragments:

There is also a command line interface. The following command does pitch tracking, follows the envelope of in.wav, and immediately plays the result on the default audio device. If you want to save the audio, see the command line options. The “flute example”:[flute.wav] is provided for your convenience.

java -jar Resynthesizer-latest.jar in.wav

 _______                       _____   _____ _____  
|__   __|                     |  __ \ / ____|  __ \ 
   | | __ _ _ __ ___  ___  ___| |  | | (___ | |__) |
   | |/ _` | '__/ __|/ _ \/ __| |  | |\___ \|  ___/ 
   | | (_| | |  \__ \ (_) \__ \ |__| |____) | |     
   |_|\__,_|_|  |___/\___/|___/_____/|_____/|_|     

----------------------------------------------------
Name:
    TarsosDSP resynthesizer
----------------------------------------------------
Synopsis:
    java -jar CommandLineResynthesizer.jar [--detector DETECTOR] [--output out.wav] [--combined combined.wav] input.wav
----------------------------------------------------
Description:
    Extracts pitch and loudness from audio and resynthesises the audio with that information.
    The result is either played back or written to an output file.
    There is also an option to combine source and synthesized material
    in the left and right channels of a stereo audio file.

    input.wav       a readable wav file.

    --output out.wav        a writable file.

    --combined combined.wav     a writable output file. One channel original, other synthesized.
    --detector DETECTOR defaults to FFT_YIN or one of these:
                YIN
                MPM
                FFT_YIN
                DYNAMIC_WAVELET
                AMDF

The source code of the Java implementation of the synthesizer can be found on the TarsosDSP github page.


~ Phase Vocoding: Time Stretching and Pitch Shifting with TarsosDSP Java

The DSP library for Tarsos, aptly named TarsosDSP, now includes an implementation of a pitch shifting algorithm (as of version 1.4) and a time stretching algorithm. Combined, the two can be used for something like phase vocoding. With a phase vocoder you can load an audio snippet, change its pitch and duration and e.g. create a library of snippets. By recording one piano key stroke, for example, it is possible to generate two octaves of samples of different lengths, and use those instead of synthesized samples. The following example application shows exactly that, implemented in the Java programming language.

The example application below shows how to pitch shift and time stretch a sample to create a sample library with the TarsosDSP library.

<a href="/releases/TarsosDSP/TarsosDSP-latest/TarsosDSP-latest-Examples/SampleExtractor-latest.jar">Pitch shifting in Java</a>

Find your oven fresh baked binaries at the TarsosDSP Release Repository.


~ Tarsos 1.0: Transcription Features

Today marks the release of Tarsos 1.0. The new Tarsos release contains practical transcription features. As can be seen in the screenshot below, a time stretching feature makes it easy to loop a certain audio fragment while it is playing at a slow tempo. The next loop can be played by pressing the n key, the one before by pressing b.

Since the pitch classes of a song can be found, and there is a feature that lets you play a MIDI keyboard in the tone scale of the song under analysis, transcription of ethnic music becomes a lot easier.

Fig: Tarsos 1.0

The new release of Tarsos can be found in the Tarsos release repository. From now on, nightly releases are uploaded there automatically.


~ Pitch Shifting - Implementation in Pure Java with Resampling and Time Stretching

The DSP library for Tarsos, aptly named TarsosDSP, now includes an implementation of a pitch shifting algorithm (as of version 1.4). The goal of pitch shifting is to change the pitch of a piece of audio without affecting its duration. The algorithm implemented is a combination of resampling and time stretching. Resampling changes the pitch of the audio, but also affects its total duration. Consecutively, the duration of the audio is stretched back to the original (without affecting pitch) with time stretching. The result is very similar to phase vocoding.
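
The sketch below shows how the two steps combine: the two duration changes cancel out while the pitch change remains. The resampler is a naive linear interpolator written for this post; timeStretch() is a placeholder for a proper time stretcher such as the WSOLA-style one in TarsosDSP, not an actual TarsosDSP call.

```java
// Sketch of the resample-then-stretch strategy described above.
public class PitchShiftSketch {

    // Naive linear-interpolation resampler: multiplies the pitch by
    // 'factor' and the duration by 1/factor.
    static float[] resample(float[] in, double factor) {
        float[] out = new float[(int) (in.length / factor)];
        for (int i = 0; i < out.length; i++) {
            double position = i * factor;
            int j = (int) position;
            double fraction = position - j;
            float a = in[Math.min(j, in.length - 1)];
            float b = in[Math.min(j + 1, in.length - 1)];
            out[i] = (float) ((1 - fraction) * a + fraction * b);
        }
        return out;
    }

    // Placeholder: multiplies the duration by 'factor' without changing pitch.
    static float[] timeStretch(float[] in, double factor) {
        throw new UnsupportedOperationException("plug in a real time stretcher");
    }

    // e.g. cents = -200 lowers the pitch two semitones, duration unchanged.
    static float[] pitchShift(float[] audio, double cents) {
        double factor = Math.pow(2, cents / 1200.0);  // frequency ratio
        float[] resampled = resample(audio, factor);  // right pitch, wrong duration
        return timeStretch(resampled, factor);        // duration restored
    }
}
```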

The example application below shows how to pitch shift input from the microphone in real-time, or pitch shift a recorded track with the TarsosDSP library.

Pitch shifting in Java

To test the application, download and execute the PitchShift.jar file and load an audio file. For the moment only 44.1kHz mono wav is allowed. To get started you can try “this piece of audio”:[08._Ladrang_Kandamanyura_10s-20s.wav].

There is also a command line interface. The following command lowers the pitch of in.wav by two semitones:

java -jar PitchShift.jar in.wav out.wav -200

----------------------------------------------------
 _______                       _____   _____ _____  
|__   __|                     |  __ \ / ____|  __ \ 
   | | __ _ _ __ ___  ___  ___| |  | | (___ | |__) |
   | |/ _` | '__/ __|/ _ \/ __| |  | |\___ \|  ___/ 
   | | (_| | |  \__ \ (_) \__ \ |__| |____) | |     
   |_|\__,_|_|  |___/\___/|___/_____/|_____/|_|     

----------------------------------------------------
Name:
    TarsosDSP Pitch shifting utility.
----------------------------------------------------
Synopsis:
    java -jar PitchShift.jar source.wav target.wav cents
----------------------------------------------------
Description:
    Change the pitch of audio without changing the play back speed.

        source.wav  A readable, mono wav file.
        target.wav  Target location for the pitch shifted file.
        cents       Pitch shifting in cents: 100 means one semitone up, 
                -100 one down, 0 is no change. 1200 is one octave up.

The resampling feature was implemented with libresample4j by Laszlo Systems. libresample4j is a Java port of Dominic Mazzoni’s libresample 0.1.3, which is in turn based on Julius Smith’s Resample 1.7 library.


~ ISMIR 2012 - Highlights

The 13th International Society for Music Information Retrieval Conference took place in Porto, Portugal, October 8th-12th, 2012. This text contains links to some papers, toolkits and software presented there which are interesting for my research. Basically it contains my personal highlights of the conference. ISMIR 2012 is described as follows:

The annual Conference of the International Society for Music Information Retrieval (ISMIR) is the world’s leading research forum on processing, searching, organizing and accessing music-related data. The revolution in music distribution and storage brought about by digital technology has fueled tremendous research activities and interests in academia as well as in industry. The ISMIR Conference reflects this rapid development by providing a meeting place for the discussion of MIR-related research, developments, methods, tools and experimental results. Its main goal is to foster multidisciplinary exchange by bringing together researchers and developers, educators and librarians, as well as students and professional users.

Tutorials

I saw an interesting tutorial on jazz music and one on source separation. After an introduction detailing the experimental basis of the system, the REPET source separator was introduced; it is a relatively simple system that yields reasonable results for splitting accompaniment from foreground melody.

Posters & Talks

The approach and the dataset used in N-gram Based Statistical Makam Detection on Makam Music in Turkey Using Symbolic Data are very interesting. More than 800 pieces of makam music were transcribed manually and analysed. Details about the dataset are available in the following paper: A Turkish Makam Music Symbolic Database for Music Information Retrieval: SymbTr.

Assigning a Confidence Threshold on Automatic Beat Annotation in Large Datasets by Zapata et al. shows a very interesting way to do exactly what the title says. Descriptive titles are descriptive.

A very practical tool to do melody extraction was presented by Justin Salamon. He created a Vamp plugin with the name Melodia. Unfortunately the plugin is currently only available for Windows, but Linux and OS X versions are in the pipeline. More about the algorithm implemented and background information can be found in the paper Justin presented: Statistical Characterisation of Melodic Pitch Contours and its Application for Melody Extraction. Another Vamp plugin for melody visualization was also presented: Pitch Content Visualization Tools for Music Performance Analysis.

The ongoing work by Ciril Bohak and Matija Marolt on segmentation of folk music could be very useful to apply to African musics. The paper is called Finding Repeating Stanzas in Folk Songs.


~ CIM 2012 - Revealing and Listening to Scales From the Past; Tone Scale Analysis of Archived Central-African Music Using Computational Means

What follows is about the Conference on Interdisciplinary Musicology (CIM2012) and the 15th International Conference of the Gesellschaft für Musikforschung. First this text gives information about our contribution to CIM2012, Revealing and Listening to Scales From the Past; Tone Scale Analysis of Archived Central-African Music Using Computational Means, and then a number of highlights of the conference follow. The joint conference took place from the 4th to the 8th of September 2012.

In 2012, CIM will tackle the subject of History. Hosted by the University of Göttingen, whose one time music director Johann Nikolaus Forkel is widely regarded as one of the founders of modern music historiography, CIM12 aims to promote collaborations that provoke and explore new methods and methodologies for establishing, evaluating, preserving and communicating knowledge of music and musical practices of past societies and the factors implicated in both the preservation and transformation of such practices over time.

### Revealing and Listening to Scales From the Past; Tone Scale Analysis of Archived Central-African Music Using Computational Means

Our contribution to CIM 2012 is titled “Revealing and Listening to Scales From the Past; Tone Scale Analysis of Archived Central-African Music Using Computational Means”:[CIM12_Submission.pdf]. The aim was to show how tone scales of the past, e.g. organ tuning, can be extracted and sonified. During the demo, special attention was given to historic Central African tuning systems. The presentation I gave is included below and available for “download”:[2012.09.05-Revealing_and_listening_to_scales_from_the_past__tone_scale_analysis_of_archived_Central-African_music_using_computational_means..ppt]

Highlights

What follows are some personal highlights of the joint conference of CIM2012 and the 15th International Conference of the Gesellschaft für Musikforschung.

The work presented by Rytis Ambrazevicius et al., Modal changes in traditional Lithuanian singing: Diachronic aspect, has a lot in common with our research; it was interesting to see their approach. Another highlight of the conference was the whole session organized by Klaus-Peter Brenner around mbira music.

Rainer Polak gave a talk titled ‘Swing, Groove and Metre. Asymmetric Feels, Metric Ambiguity and Metric Transformation in African Musics’. He showed how research about rhythm in jazz, music theory and empirical musicology (amongst others) could be bridged and applied to ethnic music.

The overview Eleanore Selfridge-Field gave during her talk Between an Analogue Past and a Digital Future: The Evolving Digital Present was refreshing. She had a really clear view on all the different ways musicology and digital media can benefit from each other.

From the concert programme I found two items especially interesting: the lecture-performance by Margarete Maierhofer-Lischka and Frauke Aulbert of Lotofagos, a piece by Beat Furrer, and Burdocks, composed and performed by Christian Wolff and a bunch of enthusiastic students.


~ ICMC 2012 - Sound to Scale to Sound, a Setup for Microtonal Exploration and Composition

At this year’s ICMC conference, ICMC 2012, we presented a paper describing a way to experiment with tone scales and how to use Tarsos as a compositional tool. What follows are some pointers to the presentation, the paper and other interesting talks presented there.

ICMC 2012 was organized in Ljubljana from the 9th to the 14th of September and had a very dense program of talks, posters, presentations, demos and concerts.

Since 1974 the International Computer Music Conference has been the major international forum for the presentation of the full range of outcomes from technical and musical research, both musical and theoretical, related to the use of computers in music. This annual conference regularly travels the globe, with recent conferences in the Americas, Europe and Asia. This year we welcome the conference to Slovenia for the first time.

### Sound to Scale to Sound, a Setup for Microtonal Exploration and Composition

Our contribution to the conference was a paper titled “Sound to Scale to Sound, a Setup for Microtonal Exploration and Composition”:[icmc2012_submission_45.pdf].

If you want to cite our work, this BibTeX entry is included for your convenience:

```
@inproceedings{cornelis2012sound_to_scale,
  author    = {Olmo Cornelis and Joren Six},
  title     = {{Sound to Scale to Sound, a Setup for Microtonal Exploration and Composition}},
  booktitle = {{Proceedings of the 2012 International Computer Music Conference, (ICMC 2012)}},
  year      = {2012},
  publisher = {The International Computer Music Association}
}
```

Program highlights

What follows are a number of pointers to my personal program highlights.

Verena Thomas presented two very well polished software tools: one to detect patterns in scores, called motifviewer, and a tool to search score databases in a multi-modal way. The Probado tool does score-to-audio alignment and much more.

Gibber is an impressive live-coding environment with an easy syntax. Since it is all done with JavaScript, you can start playing with it immediately. Overtone, another live-coding environment presented at the conference by Sam Aaron, was equally impressive. It is programmed using the Clojure language.

At ICMC there were a number of tools to assist in composition. One of those is The Bach Project, by Andrea Agostini. Together with CataRT by Diemo Schwarz it forms a very expressive platform to work with sound, which was demonstrated by Aaron Einbond and Christopher Trapani in their paper titled Precise Pitch Control In Real Time Corpus-Based Concatenative Synthesis. Diemo Schwarz presented work on audio mosaicing; it can be seen as a follow-up to AudioGuide by Ben Hackbarth.

I also got to know the work of Thomas Grill; on his website a nice piece of software can be found: a Python implementation of the Non-Stationary Gabor Transform (NSGT). Another software system I got to know is FAUST, a functional signal processing programming language.

My personal highlights of the concert programme include the works by Johannes Kreidler, Aura Pon, Daniel Mayer, Alexander Schubert and the remarkable performance by Dexter Ford. The concept behind Soundlog by Johannes Kretz was also interesting.


~ Analytical Approaches To World Music - Microtonal Scale Exploration in Central Africa

At the 2012 AAWM (Analytical Approaches To World Music) conference we presented a way to explore tone scales in the music of Central Africa. Since the audience consisted of (ethno)musicologists, the main focus of the presentation was on the application part; the technical aspects were only briefly mentioned.

The extended abstract can be consulted: Towards the tangible: microtonal scale exploration in Central-African music

The conference program itself was very diverse and interesting.


~ TarsosDSP Release 1.2

Today a new version of the TarsosDSP library was released. TarsosDSP is a small library to do audio processing in Java. It features two new pitch detectors: an AMDF (Average Magnitude Difference Function) pitch detector, contributed by Eder Souza of Brazil, and a faster implementation of YIN kindly provided by Matthias Mauch of Queen Mary University, London.
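
The idea behind an AMDF pitch detector is simple enough to sketch in a few lines: for every candidate period, compute the average magnitude difference between the signal and a shifted copy of itself; the lag with the smallest difference corresponds to the period. The sketch below illustrates the principle and is not the contributed implementation.

```java
// Sketch of AMDF pitch estimation: the lag with the smallest average
// magnitude difference between the buffer and its shifted self wins.
public class AmdfSketch {
    public static float pitch(float[] buffer, float sampleRate, float minFreq, float maxFreq) {
        int minLag = (int) (sampleRate / maxFreq);
        int maxLag = (int) (sampleRate / minFreq);
        int bestLag = minLag;
        double bestValue = Double.MAX_VALUE;
        for (int lag = minLag; lag <= maxLag && lag < buffer.length; lag++) {
            double sum = 0;
            for (int i = 0; i < buffer.length - lag; i++) {
                sum += Math.abs(buffer[i] - buffer[i + lag]);
            }
            double amdf = sum / (buffer.length - lag);
            if (amdf < bestValue) {
                bestValue = amdf;
                bestLag = lag;
            }
        }
        return sampleRate / bestLag; // period in samples -> frequency in Hz
    }
}
```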

[![Pitch Detector in Java](/files/attachments/336/PitchDetector.png "Pitch Detector in Java")](/releases/TarsosDSP/TarsosDSP-1.2/TarsosDSP-1.2-Examples/PitchDetector-1.2.jar)

Find your oven fresh baked binaries at the TarsosDSP Release Repository.