
~ ISMIR 2018 Conference - Automatic Analysis Of Global Music Recordings suggests Scale Tuning Universals

Thanks to the support of a travel grant by the Faculty of Arts and Philosophy of Ghent University I was able to attend the ISMIR 2018 conference, a conference on Music Information Retrieval. I am co-author of a contribution to the Late-Breaking / Demos session:

The structure of musical scales has been proposed to reflect universal acoustic principles based on simple integer ratios. However, some studying tuning in small samples of non-Western cultures have argued that such ratios are not universal but specific to Western music. To address this debate, we applied an algorithm that could automatically analyze and cross-culturally compare scale tunings to a global sample of 50 music recordings, including both instrumental and vocal pieces. Although we found great cross-cultural diversity in most scale degrees, these preliminary results also suggest a strong tendency to include the simplest possible integer ratio within the octave (perfect fifth, 3:2 ratio, ~700 cents) in both Western and non-Western cultures. This suggests that cultural diversity in musical scales is not without limit, but is constrained by universal psycho-acoustic principles that may shed light on the evolution of human music.
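
To make the numbers in the abstract concrete: an interval size in cents follows from a frequency ratio as 1200 times the base-2 logarithm of that ratio. The small Python sketch below shows this standard conversion for the 3:2 ratio. It only illustrates the arithmetic and is not the analysis algorithm used for the contribution; the function name is made up for this example.

```python
import math


def ratio_to_cents(numerator, denominator):
    """Convert a frequency ratio to an interval size in cents.

    An octave (2:1) spans 1200 cents by definition.
    """
    return 1200.0 * math.log2(numerator / denominator)


# The perfect fifth (3:2) lands close to the ~700 cents quoted above.
print(f"perfect fifth (3:2): {ratio_to_cents(3, 2):.2f} cents")
print(f"octave (2:1): {ratio_to_cents(2, 1):.2f} cents")
```

The just fifth comes out slightly wider than the 700 cents of twelve-tone equal temperament, which is why the abstract writes ~700 cents.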


~ JGaborator - Fast Gabor spectral transforms in Java

Recently I published a small library on GitHub called JGaborator. The library calculates fine-grained constant-Q spectral representations of audio signals quickly from Java. The calculation of a Gabor transform is done by a C++ library named Gaborator. A Java native interface (JNI) bridge to the C++ Gaborator is provided. A combination of Gaborator and a fast FFT library (such as pffft) allows fine-grained constant-Q transforms at a rate of about 200 times real-time on moderate hardware. It can serve as a front-end for several audio processing or MIR applications.
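
As a rough illustration of what constant-Q means: the bands are spaced geometrically, so each octave holds the same number of bands and the ratio between a band's center frequency and the spacing to the next band (the Q factor) stays constant. The Python sketch below computes such center frequencies. It is not the Gaborator or JGaborator API; the function names and the parameter values (110 Hz, 900 Hz, 12 bands per octave) are assumptions chosen for the example.

```python
def constant_q_frequencies(min_frequency, max_frequency, bands_per_octave):
    """Return geometrically spaced center frequencies (in Hz).

    Band k is centered at min_frequency * 2^(k / bands_per_octave).
    """
    frequencies = []
    k = 0
    while True:
        frequency = min_frequency * 2 ** (k / bands_per_octave)
        if frequency > max_frequency:
            break
        frequencies.append(frequency)
        k += 1
    return frequencies


def quality_factor(bands_per_octave):
    """Q = center frequency / spacing to the next band; identical for every band."""
    return 1.0 / (2 ** (1.0 / bands_per_octave) - 1.0)


# Hypothetical settings: three octaves above 110 Hz, one band per semitone.
centers = constant_q_frequencies(110.0, 900.0, 12)
print(len(centers), "bands, Q =", round(quality_factor(12), 2))
```

A finer-grained transform simply uses more bands per octave, which raises Q and narrows each band.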

For more information on the Gaborator C++ library by Andreas Gustafsson, please see the gaborator.com website or a talk by the author on the library called Exploring time-frequency space with the Gaborator.

While the Gaborator allows reversible transforms, only a forward transform (from the time domain to the spectral domain) is currently supported from Java. A visualization tool for spectral information is part of this package. See below for a screenshot:

JGaborator screenshot


~ TISMIR journal article - A Case for Reproducibility in MIR: Replication of ‘A Highly Robust Audio Fingerprinting System’

As an extension of the ISMIR conferences, the International Society for Music Information Retrieval started a new journal: TISMIR. The first issue contains an article of mine:
A Case for Reproducibility in MIR: Replication of ‘A Highly Robust Audio Fingerprinting System’. The abstract can be read here:

Claims made in many Music Information Retrieval (MIR) publications are hard to verify due to the fact that (i) often only a textual description is made available and code remains unpublished – leaving many implementation issues uncovered; (ii) copyrights on music limit the sharing of datasets; and (iii) incentives to put effort into reproducible research – publishing and documenting code and specifics on data – is lacking. In this article the problems around reproducibility are illustrated by replicating an MIR work. The system and evaluation described in ‘A Highly Robust Audio Fingerprinting System’ is replicated as closely as possible. The replication is done with several goals in mind: to describe difficulties in replicating the work and subsequently reflect on guidelines around reproducible research. Added contributions are the verification of the reported work, a publicly available implementation and an evaluation method that is reproducible.


~ JNMR article - Beyond documentation – The digital philology of interaction heritage

Marc Leman and I have recently published an article in the Journal of New Music Research for a special issue on Digital Philology for Multimedia Cultural Heritage. Our contribution is titled Beyond documentation – The digital philology of interaction heritage:

A philologist’s approach to heritage is traditionally based on the curation of documents, such as text, audio and video. However, with the advent of interactive multimedia, heritage becomes floating and volatile, and not easily captured in documents. We propose an approach to heritage that goes beyond documents. We consider the crucial role of institutes for interactive multimedia (as motor of a living culture of interaction) and propose that the digital philologist’s task will be to promote the collective/shared responsibility of (interactive) documenting, engage engineering in developing interactive approaches to heritage, and keep interaction-heritage alive through the education of citizens.


~ MIR Meetup Berlin - Acoustic Fingerprinting in Research

I was kindly invited by SoundCloud to give a presentation on “Acoustic fingerprinting in research”. The presentation took place during one of the “MIR Meetups” in Berlin on Monday, April 23, 2018. Before my presentation there was a presentation by Derek and Josh (both SoundCloud engineers) detailing the state of the internal fingerprinting system of SoundCloud.

During my presentation I gave an overview of various applications of acoustic fingerprinting in a music research environment and detailed how these applications are handled and implemented in Panako, an open source fingerprinting system.

Below the slides used during the presentation can be found:


~ Engineering systematic musicology

On the 11th of January I successfully completed my PhD training under the mentorship of Marc Leman with a public defense at de Krook in Ghent.

I also handed in my dissertation titled Engineering systematic musicology: methods and services for computational and empirical music research (version of record). The dissertation bundles several of my publications, places them in a framework in the introduction and reflects upon them in the conclusion. The publications all contribute either directly to the field of systematic musicology (e.g. tone scale research) or indirectly by facilitating specific research tasks (e.g. synchronization of multi-modal research data).

The presentation during my defense was meant for a broader audience. During the presentation I gave examples of the research topics I have been working on and focused on how these are connected. The presentation titled Engineering systematic musicology can be seen by following the previous link and is included below. The slide with the live spectrogram and the slide with the map need to be started by double clicking; otherwise they remain empty.

The presentation is essentially an interactive HTML5 website built with the reveal.js framework. This has the advantage that multimedia is well supported and all kinds of interactions can be scripted. The presentation above, for example, uses the Web Audio API for live audio visualization and the Google Maps API for interactive maps. Video integration is also seamless. It would be a struggle to achieve similar multimedia-heavy presentations with other presentation software packages such as Impress, Keynote or PowerPoint.


~ HTML5 spectrogram on canvas with pitch estimation

To present my research in an accessible way I needed a reliable way to visualize audio, audio feature extraction and the processing of audio features into a higher level representation. Canvas, HTML5, JavaScript and the reveal.js presentation framework offered a solution.

I often need audio and video material embedded into presentations. I have had bad experiences with multimedia in PowerPoint/Keynote and especially the LaTeX beamer package: audio/video material does not start playing or starts at the wrong moment, players are finicky about codecs, compatibility is limited and the UX is clunky (whoever came up with the idea to show multimedia controls while hovering over e.g. an audio thumbnail should be reoriented towards back-end programming). All of this contributes to errors while handling audio/video. Moreover, the interactive capabilities are limited.

The component above is an interactive spectrogram which combines HTML5’s Web Audio API capabilities with the canvas element and some JavaScript to glue things together. Note that this has been tested on Chrome and Firefox only.

To experiment with the capabilities you can either drag and drop mp3 files or analyse live audio from your microphone.

This is based on the spectrogram implementation by GitHub user Boris Smus. The live pitch tracking is implemented by Peter Hayes, which in turn is based on my own Java code.


~ IRCDL 2018 - Applications of Duplicate Detection in Music Archives: from Metadata Comparison to Storage Optimisation

Together with Federica Bressan I have contributed to the Italian Research Conference on Digital Libraries 2018:

“Since 2005, the Italian Research Conference on Digital Libraries has served as an important national forum focused on digital libraries and associated technical, practical, and social issues. IRCDL encompasses the many meanings of the term ‘digital libraries’, including new forms of information institutions; operational information systems with all manner of digital content; new means of selecting, collecting, organizing, and distributing digital content…”

On the 26th of January Federica presented our joint contribution titled “Applications of Duplicate Detection in Music Archives: from Metadata Comparison to Storage Optimisation”. The work focuses on applications of duplicate detection for managing digital music archives. It aims to make this mature music information retrieval (MIR) technology better known to archivists and provide clear suggestions on how this technology can be used in practice. More specifically, applications are discussed to complement meta-data, to link or merge digital music archives, to improve listening experiences and to re-use segmentation data.

The version of record of the article and an author version are available. The presentation is available here as well.


~ International Symposium on Computational Ethnomusicological Archiving

This weekend the University of Hamburg – Institute for Systematic Musicology, and more specifically Christian D. Koehn, organized the International Symposium on Computational Ethnomusicological Archiving. The symposium featured a broad selection of research topics (physical modelling of instruments, MIR research, 3D scanning techniques, technology for (re)spatialisation of music, library sciences) which all had a relation with archiving musics of the world:

How could existing digital technologies in the field of music information retrieval, artificial intelligence, and data networking be efficiently implemented with regard to digital music archives? How might current and future developments in these fields benefit researchers in ethnomusicology? How can analytical data about musical sound and descriptive data about musical culture be more comprehensively integrated?

I was able to attend the symposium and contributed with a talk titled Challenges and opportunities for computational analysis of wax cylinders and by chairing a panel discussion. The symposium was kindly sponsored by the VolkswagenStiftung. The talk had the following abstract:

In this presentation we describe our experience of working with computational analysis on digitized wax cylinder recordings. The audio quality of these recordings is limited which poses challenges for standard MIR tools. Unclear recording and playback speeds further hinder some types of audio analysis. Moreover, due to a lack of systematical meta-data notation it is often uncertain where a single recording originates or when exactly it was recorded. However, being the oldest available sound recordings, they are invaluable witnesses of various musical practices and they are opportunities to improve the understanding of these practices. Next to sketching these general concerns, we present results of the analysis of pitch content of 400 wax cylinder recordings from Indiana University (USA) and from the Royal Museum from Central Africa (Belgium). The scales of the 400 recordings are mapped and analyzed as a set. It is found that the fifth is almost always present and that scales with four and five pitch classes are organized similarly and differ from those with six and seven pitch classes, latter center around intervals of 170 cents, and former around 240 cents.
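
One way to put the two interval sizes at the end of the abstract in perspective, offered as an illustration and not as a claim made in the talk, is to compare them with the step size obtained when a 1200 cent octave is divided into equal steps. The Python sketch below does that arithmetic; the function name is made up for this example.

```python
OCTAVE_CENTS = 1200.0


def equal_step_cents(pitch_classes):
    """Step size in cents when an octave is divided into equal steps."""
    return OCTAVE_CENTS / pitch_classes


# Compare with the reported centers of about 240 and 170 cents.
for pitch_classes in (4, 5, 6, 7):
    step = equal_step_cents(pitch_classes)
    print(f"{pitch_classes} pitch classes: {step:.1f} cents per step")
```

Five equal steps give exactly 240 cents and seven equal steps give a little over 171 cents, close to the two centers mentioned in the abstract. Whether the recorded scales are actually close to equidistant is not something this arithmetic can show.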


~ 4th International Digital Libraries for Musicology workshop (DLfM 2017)

I have contributed to the 4th International Digital Libraries for Musicology workshop (DLfM 2017) which was organized in Shanghai, China. It was a satellite event of the ISMIR 2017 conference. Unfortunately I did not manage to find funding to attend the workshop. I did, however, contribute as co-author to two proceedings papers. Both were presented by Reinier de Valk (thanks again).

MIRchiving: Challenges and opportunities of connecting MIR research and digital music archives

By Reinier de Valk (DANS), Anja Volk (Utrecht University), Andre Holzapfel (KTH Royal Institute of Technology), Aggelos Pikrakis (University of Piraeus), Nadine Kroher (University of Seville – IMUS) and Joren Six (Ghent University – IPEM). Next to the version of record there is also an author version available of the contribution titled MIRchiving: Challenges and opportunities of connecting MIR research and digital music archives.

This study is a call for action for the music information retrieval (MIR) community to pay more attention to collaboration with digital music archives. The study, which resulted from an interdisciplinary workshop and subsequent discussion, matches the demand for MIR technologies from various archives with what is already supplied by the MIR community. We conclude that the expressed demands can only be served sustainably through closer collaborations. Whereas MIR systems are described in scientific publications, usable implementations are often absent. If there is a runnable system, user documentation is often sparse, posing a huge hurdle for archivists to employ it. This study sheds light on the current limitations and opportunities of MIR research in the context of music archives by means of examples, and highlights available tools. As a basic guideline for collaboration, we propose to interpret MIR research as part of a value chain. We identify the following benefits of collaboration between MIR researchers and music archives: new perspectives for content access in archives, more diverse evaluation data and methods, and a more application-oriented MIR research workflow.

Applications of duplicate detection: linking meta-data and merging music archives: The experience of the IPEM historical archive of electronic music

By Federica Bressan, Joren Six and Marc Leman (Ghent University – IPEM). Next to the version of record there is also an author version available of the contribution titled Applications of duplicate detection: linking meta-data and merging music archives: The experience of the IPEM historical archive of electronic music.

This work focuses on applications of duplicate detection for managing digital music archives. It aims to make this mature music information retrieval (MIR) technology better known to archivists and provide clear suggestions on how this technology can be used in practice. More specifically applications are discussed to complement meta-data, to link or merge digital music archives, to improve listening experiences and to re-use segmentation data. The IPEM archive, a digitized music archive containing early electronic music, provides a case study.

The full DLfM 2017 proceedings are published by ACM.