Welcome

Hi, I'm Joren. Welcome to my website. I'm a researcher in the field of Music Informatics, Music Information Retrieval, and Computational Ethnomusicology. Here you can find a record of my research and other projects I have been working on.

Contact

Joren Six
joren.six@ugent.be
Ghent University, IPEM

~ TarsosDSP on Android - Audio Processing in Java on Android

This post explains how to get TarsosDSP running on Android. TarsosDSP is a Java library for audio processing. Its aim is to provide an easy-to-use interface to practical music processing algorithms implemented, as simply as possible, in pure Java and without any other external dependencies.

Since version 2.0 there are no more references to javax.sound.* in the TarsosDSP core codebase. This makes it easy to run TarsosDSP on Android. Audio Input/Output operations that depend on either the JVM or Dalvik runtime have been abstracted and removed from the core. For each runtime target a Jar file is provided in the TarsosDSP release directory.

The source code for the audio I/O on the JVM and the audio I/O on Android can be found on GitHub. To get an audio processing algorithm working on Android the only thing that is needed is to place TarsosDSP-Android-2.0.jar in the lib directory of your project.

The following example connects an AudioDispatcher to the microphone of an Android device. Subsequently, a real-time pitch detection algorithm is added to the processing chain. The detected pitch in Hertz is printed on a TextView element. If no pitch is present in the incoming sound, -1 is printed. To test the application, download and install the TarsosDSPAndroid.apk application on your Android device. The source code is available as well.

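// Read from the default microphone: 22050 Hz sample rate, buffers of 1024 samples, no overlap.
AudioDispatcher dispatcher = AudioDispatcherFactory.fromDefaultMicrophone(22050, 1024, 0);

// Called for every analyzed buffer; the pitch is -1 when no pitch is detected.
PitchDetectionHandler pdh = new PitchDetectionHandler() {
    @Override
    public void handlePitch(PitchDetectionResult result, AudioEvent e) {
        final float pitchInHz = result.getPitch();
        // Views may only be updated from the UI thread.
        runOnUiThread(new Runnable() {
            @Override
            public void run() {
                TextView text = (TextView) findViewById(R.id.textView1);
                text.setText("" + pitchInHz);
            }
        });
    }
};

// The sample rate and buffer size match those of the dispatcher.
AudioProcessor p = new PitchProcessor(PitchEstimationAlgorithm.FFT_YIN, 22050, 1024, pdh);
dispatcher.addAudioProcessor(p);

// Run the dispatcher on its own thread so audio processing does not block the UI.
new Thread(dispatcher, "Audio Dispatcher").start();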
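
To get a feel for the numbers used in the example (22050 Hz, buffers of 1024 samples, an overlap of 0), the timing can be worked out with a few lines of plain Java. This is only a small sketch: the class and method names are made up for illustration and are not part of TarsosDSP.

```java
// Hypothetical helper, not part of TarsosDSP: timing of a buffered audio pipeline.
class BufferTimingSketch {

    /** Duration of one audio buffer in milliseconds. */
    static double bufferDurationMs(int sampleRate, int bufferSize) {
        return 1000.0 * bufferSize / sampleRate;
    }

    /** Number of buffers (and thus pitch estimates) delivered per second. */
    static double buffersPerSecond(int sampleRate, int bufferSize, int overlap) {
        // Each new buffer advances by (bufferSize - overlap) samples.
        return sampleRate / (double) (bufferSize - overlap);
    }

    public static void main(String[] args) {
        System.out.println(bufferDurationMs(22050, 1024) + " ms per buffer");
        System.out.println(buffersPerSecond(22050, 1024, 0) + " buffers per second");
    }
}
```

With these settings each buffer covers roughly 46 ms of audio, so the TextView is refreshed a little over twenty times per second.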

Thanks to these changes, the fork of TarsosDSP kindly provided by GitHub user srubin, created for a programming assignment at UC Berkeley, is no longer needed.

Have fun hacking audio on Android!


~ Haar Wavelet Transform in TarsosDSP

The TarsosDSP Java library for audio processing now contains an implementation of the Haar Wavelet Transform, a discrete wavelet transform based on the Haar wavelet (depicted at the right). This reversible transform has some interesting properties and is practical in signal compression and for analyzing sudden transitions in a file. It can, for example, be used to detect edges in an image.

As an example use case of the Haar transform, a simple lossy audio compression algorithm is implemented in TarsosDSP. It compresses audio by dividing it into blocks of 32 samples, transforming each block using the Haar wavelet transform, and subsequently removing the samples with the least difference between them. The last step is to reverse the transform and play the audio. The number of compressed samples can be chosen between 0 (no compression) and 31 (no signal left). This crude lossy audio compression technique can save at least a tenth of the samples without any noticeable effect. A way to store the audio and read it from disk is included as well.
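
The steps above can be sketched in plain Java as follows. This is a minimal illustration and not the TarsosDSP code: the class and method names are made up, the transform uses simple averages and differences, and picking the coefficients with the smallest magnitude is an assumption about which values to drop.

```java
import java.util.Arrays;

// Hypothetical sketch, not the TarsosDSP implementation.
class HaarCompressionSketch {

    /** In-place forward Haar transform; the length must be a power of two. */
    static void forward(float[] s) {
        float[] tmp = new float[s.length];
        for (int len = s.length; len > 1; len /= 2) {
            int half = len / 2;
            for (int i = 0; i < half; i++) {
                tmp[i] = (s[2 * i] + s[2 * i + 1]) / 2f;        // pairwise average
                tmp[half + i] = (s[2 * i] - s[2 * i + 1]) / 2f; // pairwise difference
            }
            System.arraycopy(tmp, 0, s, 0, len);
        }
    }

    /** In-place inverse of forward(). */
    static void inverse(float[] s) {
        float[] tmp = new float[s.length];
        for (int len = 2; len <= s.length; len *= 2) {
            int half = len / 2;
            for (int i = 0; i < half; i++) {
                tmp[2 * i] = s[i] + s[half + i];
                tmp[2 * i + 1] = s[i] - s[half + i];
            }
            System.arraycopy(tmp, 0, s, 0, len);
        }
    }

    /** Sets the n coefficients with the smallest magnitude to zero (assumed criterion). */
    static void dropSmallest(float[] coeffs, int n) {
        Integer[] order = new Integer[coeffs.length];
        for (int i = 0; i < order.length; i++) {
            order[i] = i;
        }
        Arrays.sort(order, (a, b) -> Float.compare(Math.abs(coeffs[a]), Math.abs(coeffs[b])));
        for (int i = 0; i < n && i < order.length; i++) {
            coeffs[order[i]] = 0f;
        }
    }

    /** Transform, drop n coefficients, transform back; the input block is not modified. */
    static float[] compress(float[] block, int n) {
        float[] out = block.clone();
        forward(out);
        dropSmallest(out, n);
        inverse(out);
        return out;
    }

    public static void main(String[] args) {
        float[] block = new float[32];
        for (int i = 0; i < block.length; i++) {
            block[i] = (float) Math.sin(2 * Math.PI * i / 32);
        }
        System.out.println(Arrays.toString(compress(block, 8)));
    }
}
```

Note that in this sketch dropping 31 coefficients still leaves the block average, so it behaves slightly differently from the description above at the extreme setting.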

The algorithm works in real time and an example application has been implemented which operates on an mp3 stream. To make this work immediately, the avconv tool needs to be on your system’s path. Also implemented is a bit depth compressor, which shows the effect of (extreme) bit depth compression.
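
The effect of bit depth compression is easy to try out on individual samples. The snippet below is a rough sketch of the idea and not the TarsosDSP processor: it assumes samples are floats in the range [-1, 1] and the names are invented for this example.

```java
// Hypothetical sketch, not the TarsosDSP bit depth processor.
class BitDepthSketch {

    /** Quantises a sample in [-1, 1] to the given bit depth (bits >= 1). */
    static float reduce(float sample, int bits) {
        // Number of quantisation steps per polarity, e.g. 128 for 8 bits.
        float levels = (float) (1 << (bits - 1));
        return Math.round(sample * levels) / levels;
    }

    public static void main(String[] args) {
        for (int bits = 8; bits >= 1; bits--) {
            System.out.println(bits + " bits: " + reduce(0.3f, bits));
        }
    }
}
```

The fewer bits are kept, the coarser the steps become, which is heard as an increasing amount of quantisation noise.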

The example is available at the TarsosDSP release directory; the code can be found on the TarsosDSP GitHub page.

  • Haar Wavelet Audio Compression


~ TarsosDSP Spectral Peak extraction

The TarsosDSP Java library for audio processing now contains a module for spectral peak extraction. It calculates a short time Fourier transform and subsequently finds the frequency bins with the most energy using a median filter. The frequency estimate for each identified bin is significantly improved by taking phase information into account, a method described in “Sethares et al. 2009 – Spectral Tools for Dynamic Tonality and Audio Morphing”.

The noise floor, determined by the median filter, the spectral information itself and the estimated peak locations are returned for each FFT-frame. Below a visualization of a flute can be found. As expected, the peaks are harmonically spread over the complete spectrum up until the Nyquist frequency.

  • Spectral peaks of a flute. The first 10 harmonics are detected up until the Nyquist frequency.
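
The noise floor and peak picking steps of this post can be sketched as follows, starting from the magnitudes of a single FFT frame. This is an illustration only: the names, the window size and the threshold factor are assumptions, it is not the TarsosDSP module, and the phase-based frequency refinement is left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch, not the TarsosDSP spectral peak module.
class SpectralPeakSketch {

    /** Median of the magnitudes in a window centred on each bin. */
    static float[] noiseFloor(float[] magnitudes, int windowSize) {
        float[] floor = new float[magnitudes.length];
        int half = windowSize / 2;
        for (int i = 0; i < magnitudes.length; i++) {
            int from = Math.max(0, i - half);
            int to = Math.min(magnitudes.length, i + half + 1);
            float[] window = Arrays.copyOfRange(magnitudes, from, to);
            Arrays.sort(window);
            floor[i] = window[window.length / 2];
        }
        return floor;
    }

    /** Bins that are local maxima and exceed the noise floor by the given factor. */
    static List<Integer> peaks(float[] magnitudes, float[] floor, float factor) {
        List<Integer> result = new ArrayList<>();
        for (int i = 1; i < magnitudes.length - 1; i++) {
            boolean localMax = magnitudes[i] > magnitudes[i - 1]
                    && magnitudes[i] >= magnitudes[i + 1];
            if (localMax && magnitudes[i] > floor[i] * factor) {
                result.add(i);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        float[] magnitudes = {1, 1, 1, 9, 1, 1, 1, 1, 7, 1, 1};
        float[] floor = noiseFloor(magnitudes, 5);
        System.out.println(peaks(magnitudes, floor, 2f));
    }
}
```

A bin index can be converted to a rough frequency with index * sampleRate / fftSize; the phase information mentioned above is what makes that estimate much more precise.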


~ International School on Systematic Musicology and Sound and Music Computing (ISSSM) 2014, Genova

From 9 to 20 March 2014 I was a student at the International School on Systematic Musicology and Sound and Music Computing. The aim of the course was to:

  • Give students an intensive course in the most advanced and current topics in the research fields of systematic musicology and sound and music computing.
  • Give students the opportunity to discuss their research proposals/project with an international staff of teachers representing a variety of expertise in different domains of systematic musicology and sound and music computing.
  • Teach students the most recent knowledge and basic skills needed to start a PhD.
  • Give students the opportunity to join the research communities on systematic musicology, on sound and music computing.

Next to the lectures, the informal meetings with the professors were very interesting. I got to add some things to my ‘to read’ list:

  • Rolf Bader, Calculation of Helmholtz frequency of a Renaissance vihuela string instrument with five tone hole
  • Schneider, A. & Frieler, K. (2009) Perception of harmonic and inharmonic sounds: Results from ear models. In S. Ystad, R. Kronland-Martinet & K. Jensen (Eds.), Computer music modeling and retrieval. Genesis of meaning in sound and music (pp. 18–44). Berlin: Springer.
  • Rolf Bader, Sound – Perception – Performance



~ TarsosDSP Paper and Presentation at AES 53rd International conference on Semantic Audio

TarsosDSP will be presented at the AES 53rd International conference on Semantic Audio in London. During the conference there will be both a presentation and a demonstration of the paper TarsosDSP, a Real-Time Audio Processing Framework in Java, by Joren Six, Olmo Cornelis and Marc Leman, in Proceedings of the 53rd AES Conference (AES 53rd), 2014. From their website:

Semantic Audio is concerned with content-based management of digital audio recordings. The rapid evolution of digital audio technologies, e.g. audio data compression and streaming, the availability of large audio libraries online and offline, and recent developments in content-based audio retrieval have significantly changed the way digital audio is created, processed, and consumed. New audio content can be produced at lower cost, while also large audio archives at libraries or record labels are opening to the public. Thus the sheer amount of available audio data grows more and more each day. Semantic analysis of audio resulting in high-level metadata descriptors such as musical chords and tempo, or the identification of speakers facilitate content-based management of audio recordings. Aside from audio retrieval and recommendation technologies, the semantics of audio signals are also becoming increasingly important, for instance, in object-based audio coding, as well as intelligent audio editing, and processing. Recent product releases already demonstrate this to a great extent, however, more innovative functionalities relying on semantic audio analysis and management are imminent. These functionalities may utilise, for instance, (informed) audio source separation, speaker segmentation and identification, structural music segmentation, or social and Semantic Web technologies, including ontologies and linked open data.

This conference will give a broad overview of the state of the art and address many of the new scientific disciplines involved in this still-emerging field. Our purpose is to continue fostering this line of interdisciplinary research. This is reflected by the wide variety of invited speakers presenting at the conference.

The paper presents TarsosDSP, a framework for real-time audio analysis and processing. Most libraries and frameworks offer either audio analysis and feature extraction or audio synthesis and processing. TarsosDSP is one of only a few frameworks that offer analysis, processing and feature extraction in real-time, a unique feature in the Java ecosystem. The framework contains practical audio processing algorithms, can be extended easily, and has no external dependencies. Each algorithm is implemented as simply as possible thanks to a straightforward processing pipeline. TarsosDSP’s features include a resampling algorithm, onset detectors, a number of pitch estimation algorithms, a time stretch algorithm, a pitch shifting algorithm, and an algorithm to calculate the Constant-Q. The framework also allows simple audio synthesis, some audio effects, and several filters. The Open Source framework is a valuable contribution to the MIR-Community and an ideal fit for interactive MIR-applications on Android. The full paper can be downloaded: TarsosDSP, a Real-Time Audio Processing Framework in Java.

A BibTeX entry for the paper can be found below.

@inproceedings{six2014tarsosdsp,
  author      = {Joren Six and Olmo Cornelis and Marc Leman},
  title       = {{TarsosDSP, a Real-Time Audio Processing Framework in Java}},
  booktitle   = {{Proceedings of the 53rd AES Conference (AES 53rd)}}, 
  year        =  2014
}

  • AES53
  • Constant-Q
  • Flanger
  • Pitch Shifting
  • Sampling


~ Doctoral defense Olmo Cornelis - Exploring the Symbiosis of Western and non-Western Music

On Wednesday 18 December 2013 Olmo Cornelis organized a concert as part of his doctorate. His defense followed the day after. Congratulations once again, Olmo, on the beautiful, ehm, mbira point. Below is a short explanation of the project and the concert.

In his research project ‘Exploring the symbiosis of Western and non-Western Music’, Olmo Cornelis focused on the description of Central African music. This music was explored with computational techniques that approach sound as a signal. The information obtained influenced his artistic oeuvre, in which a mix of implicit and explicit ethnic influences is always at play.

To mark the completion of this doctoral research, the HERMESensemble, the Nadar Ensemble, Maja Jantar and Françoise Vanhecke perform, on 18 December, work by Olmo Cornelis that was written during this project. The research project Exploring the symbiosis of Western and non-Western Music was initiated in 2008 at the Conservatorium / School of Arts of HoGent and was funded by the research fund of Hogeschool Gent.

Image: Noel Cornelis, Reality of Possibilities, 2012


~ IPEM D-Jogger featured on RTBF

Yesterday, December 4th 2013, the RTBF was at IPEM to do a small feature on research currently going on at the institute. The RTBF is the public broadcasting organization of the French Community of Belgium, the southern, French-speaking part of Belgium. The clip shows the D-Jogger in action, with me using it. The fragment is available on the RTBF website and is embedded below.


~ Evaluation and Recommendation of Pulse and Tempo Annotation in Ethnic Music - In Journal Of New Music Research

The journal paper Evaluation and Recommendation of Pulse and Tempo Annotation in Ethnic Music by Cornelis, Six, Holzapfel and Leman was published in a special issue about Computational Ethnomusicology of the Journal of New Music Research on the 20th of August 2013. Below you can find the abstract for the article, and the full text author version of the article itself.

Abstract: Large digital archives of ethnic music require automatic tools to provide musical content descriptions. While various automatic approaches are available, they are to a wide extent developed for Western popular music. This paper aims to analyze how automated tempo estimation approaches perform in the context of Central-African music. To this end we collect human beat annotations for a set of musical fragments, and compare them with automatic beat tracking sequences. We first analyze the tempo estimations derived from annotations and beat tracking results. Then we examine an approach, based on mutual agreement between automatic and human annotations, to automate such analysis, which can serve to detect musical fragments with high tempo ambiguity.
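
As a small illustration of how a tempo estimate can be derived from beat annotations, the sketch below converts a list of tapped beat times into beats per minute using the median inter-beat interval. The use of the median and all names are assumptions made for this example; the article describes the procedure that was actually followed.

```java
import java.util.Arrays;

// Hypothetical sketch; not necessarily the procedure used in the article.
class TempoFromBeatsSketch {

    /** Tempo in BPM from ascending beat times in seconds, via the median inter-beat interval. */
    static double bpm(double[] beatTimesSeconds) {
        if (beatTimesSeconds.length < 2) {
            throw new IllegalArgumentException("need at least two beats");
        }
        double[] intervals = new double[beatTimesSeconds.length - 1];
        for (int i = 0; i < intervals.length; i++) {
            intervals[i] = beatTimesSeconds[i + 1] - beatTimesSeconds[i];
        }
        Arrays.sort(intervals);
        int mid = intervals.length / 2;
        double median = (intervals.length % 2 == 1)
                ? intervals[mid]
                : (intervals[mid - 1] + intervals[mid]) / 2.0;
        return 60.0 / median;
    }

    public static void main(String[] args) {
        double[] taps = {0.0, 0.5, 1.0, 1.5, 2.0};
        System.out.println(bpm(taps) + " BPM");
    }
}
```

An annotator tapping at half or double this rate would end up at half or double the BPM value, which is one way tempo ambiguity shows up in the annotations.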

To read the full text you can either download Evaluation and Recommendation of Pulse and Tempo Annotation in Ethnic Music, author version, or obtain Evaluation and Recommendation of Pulse and Tempo Annotation in Ethnic Music, published version.

Below the BibTeX entry for the article is embedded.

@article{cornelis2013tempo_jnmr,
  author = {Olmo Cornelis and Joren Six and Andre Holzapfel and Marc Leman},
  title = {{Evaluation and Recommendation of Pulse and Tempo Annotation in Ethnic Music}},
  journal = {{Journal of New Music Research}},
  volume = {42},
  number = {2},
  pages = {131-149},
  year = {2013},
  doi = {10.1080/09298215.2013.812123}
}

~ Constant-Q Transform in Java with TarsosDSP

The DSP library for Tarsos, aptly named TarsosDSP, now includes an implementation of a Constant-Q Transform (as of version 1.6). The Constant-Q transform does essentially the same thing as an FFT, but has the advantage that each octave has the same number of bins. This makes the Constant-Q transform practical for applications processing music. If, for example, 12 bins per octave are chosen, these can correspond with the western musical scale.
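
To make the 'same number of bins per octave' idea concrete, the sketch below computes the geometrically spaced center frequencies of such a transform, together with the quality factor Q that stays constant over all bins. It is a small stand-alone illustration using the textbook formulas, with invented names, and is not the TarsosDSP ConstantQ class.

```java
// Hypothetical sketch of the bin layout, not the TarsosDSP implementation.
class ConstantQSketch {

    /** Center frequencies f_k = minFreq * 2^(k / binsPerOctave), up to maxFreq. */
    static double[] centerFrequencies(double minFreq, double maxFreq, int binsPerOctave) {
        double octaves = Math.log(maxFreq / minFreq) / Math.log(2);
        int bins = (int) Math.ceil(binsPerOctave * octaves);
        double[] frequencies = new double[bins];
        for (int k = 0; k < bins; k++) {
            frequencies[k] = minFreq * Math.pow(2, k / (double) binsPerOctave);
        }
        return frequencies;
    }

    /** Ratio of center frequency to bandwidth; identical for every bin. */
    static double q(int binsPerOctave) {
        return 1.0 / (Math.pow(2, 1.0 / binsPerOctave) - 1);
    }

    public static void main(String[] args) {
        double[] f = centerFrequencies(110, 880, 12);
        System.out.println("bins: " + f.length + ", Q: " + q(12));
        System.out.println("bin 12: " + f[12] + " Hz, bin 24: " + f[24] + " Hz");
    }
}
```

With 12 bins per octave and a lowest frequency of 110 Hz, every twelfth bin lands on an A (110 Hz, 220 Hz, 440 Hz), whereas the bins of an FFT are spaced linearly and do not line up with the scale.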

Also included in the newest release (version 1.7) is a way to visualize the transform, or other musical features. The visualization implementation was done together with Thomas Stubbe.

The example application below shows the Constant-Q transform with an overlay of pitch estimations. The corresponding waveform is also shown.

Constant-Q transform in Java

Find your oven fresh baked binaries at the TarsosDSP Release Repository.
The source code can be found at the TarsosDSP GitHub repository.


~ Tarsos, a Modular Platform for Precise Pitch Analysis of Western and Non-Western Music - In Journal Of New Music Research

The journal paper Tarsos, a Modular Platform for Precise Pitch Analysis of Western and Non-Western Music by Six, Cornelis, and Leman was published in a special issue about Computational Ethnomusicology of the Journal of New Music Research on the 20th of August 2013. Below you can find the abstract for the article, and pointers to audio examples, the Tarsos software, and the author version of the article itself.

Abstract: This paper presents Tarsos, a modular software platform used to extract and analyze pitch organization in music. With Tarsos pitch estimations are generated from an audio signal and those estimations are processed in order to form musicologically meaningful representations. Tarsos aims to offer a flexible system for pitch analysis through the combination of an interactive user interface, several pitch estimation algorithms, filtering options, immediate auditory feedback and data output modalities for every step. To study the most frequently used pitches, a fine-grained histogram that allows up to 1200 values per octave is constructed. This allows Tarsos to analyze deviations in Western music, or to analyze specific tone scales that differ from the 12 tone equal temperament, common in many non-Western musics. Tarsos has a graphical user interface or can be launched using an API – as a batch script. Therefore, it is fit for both the analysis of individual songs and the analysis of large music corpora. The interface allows several visual representations, and can indicate the scale of the piece under analysis. The extracted scale can be used immediately to tune a MIDI keyboard that can be played in the discovered scale. These features make Tarsos an interesting tool that can be used for musicological analysis, teaching and even artistic productions.
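
The fine-grained histogram with 1200 values per octave mentioned in the abstract can be illustrated with a few lines of Java: each pitch estimate in Hz is converted to cents, folded into a single octave and counted. This is a simplified sketch with invented names and an assumed reference frequency; it is not the Tarsos code, which does considerably more (filtering, smoothing, peak detection).

```java
// Hypothetical sketch, not the Tarsos implementation.
class PitchClassHistogramSketch {

    /** Assumed reference frequency for 0 cents (roughly MIDI note 0). */
    static final double REFERENCE_HZ = 8.1758;

    /** Distance in cents between the given frequency and the reference. */
    static double absoluteCents(double hz) {
        return 1200.0 * Math.log(hz / REFERENCE_HZ) / Math.log(2);
    }

    /** Counts pitch estimates per cent, folded into one octave of 1200 bins. */
    static int[] histogram(double[] pitchesHz) {
        int[] bins = new int[1200];
        for (double hz : pitchesHz) {
            if (hz <= 0) {
                continue; // skip frames without a pitch, e.g. -1
            }
            int cents = (int) Math.round(absoluteCents(hz));
            bins[Math.floorMod(cents, 1200)]++;
        }
        return bins;
    }

    public static void main(String[] args) {
        int[] bins = histogram(new double[] {220, 440, 880, -1, 330});
        for (int i = 0; i < bins.length; i++) {
            if (bins[i] > 0) {
                System.out.println(i + " cents: " + bins[i]);
            }
        }
    }
}
```

Because of the folding step, the same pitch class played in different octaves (220 Hz, 440 Hz, 880 Hz) ends up in one bin, and the peaks of such a histogram hint at the scale that is used.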

To read the full text you can either download Tarsos, a Modular Platform for Precise Pitch Analysis of Western and Non-Western Music, author version, or obtain Tarsos, a Modular Platform for Precise Pitch Analysis of Western and Non-Western Music, published version.

Ladrang Kandamanyura (slendro pathet manyura), is the name of the piece used in the article throughout section 2. The album on which the piece can be found is available at wergo. Below a thirty second fragment is embedded. You can also download the thirty second fragment to analyse it yourself.

Below the BibTeX entry for the article is embedded.

@article{six2013tarsos_jnmr,
  author = {Six, Joren and Cornelis, Olmo and Leman, Marc},
  title = {Tarsos, a Modular Platform for Precise Pitch Analysis 
            of Western and Non-Western Music},
  journal = {Journal of New Music Research},
  volume = {42},
  number = {2},
  pages = {113-129},
  year = {2013},
  doi = {10.1080/09298215.2013.797999},
 URL = {http://www.tandfonline.com/doi/abs/10.1080/09298215.2013.797999}
}
