
~ MIDImorphosis: recording audio and sensor data

During an experiment that monitors a music performance, it may be necessary to record audio, video and sensor data synchronously. Recording analog sensors (balance boards, accelerometers, light sensors, distance sensors) together with audio and video is often problematic. Ideally, standard DAW software can be used to record both audio and sensor data. The system presented here makes it relatively straightforward to record sensor data together with audio and video.

The basic idea is simple: a microcontroller is programmed to appear as a class-compliant MIDI device. Analog measurements on the microcontroller are translated into MIDI messages that follow a specific protocol. On the capturing side, the MIDI data can then be converted back into the original sensor data. This setup has several advantages:



Fig: Visualization in HTML of analog sensor data, captured as MIDI


While the concept is relatively simple, there are many details to get right. Please consult the MIDImorphosis GitHub page, which details the system: an analog sensor, a MIDI protocol and a clocking infrastructure.
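The round trip from sensor value to MIDI and back can be sketched as follows. This is only an illustration and not the actual MIDImorphosis protocol, which is documented on the project page: it assumes that a 10-bit ADC reading is carried in a standard 14-bit MIDI pitch bend message, and the function names are hypothetical.

```python
# Sketch only: NOT the MIDImorphosis protocol. Assumes a sensor reading
# is carried in the 14-bit payload of a standard MIDI pitch bend message.

def encode_pitch_bend(value, channel=0):
    """Pack a 14-bit value (0..16383) into a 3-byte pitch bend message."""
    if not 0 <= value <= 0x3FFF:
        raise ValueError("value must fit in 14 bits")
    if not 0 <= channel <= 15:
        raise ValueError("channel must be in 0..15")
    status = 0xE0 | channel      # pitch bend status byte plus channel
    lsb = value & 0x7F           # lower 7 bits
    msb = (value >> 7) & 0x7F    # upper 7 bits
    return bytes([status, lsb, msb])


def decode_pitch_bend(message):
    """Recover (channel, value) from a 3-byte pitch bend message."""
    status, lsb, msb = message
    if status & 0xF0 != 0xE0:
        raise ValueError("not a pitch bend message")
    return status & 0x0F, (msb << 7) | lsb


adc_reading = 777  # hypothetical 10-bit ADC value (0..1023)
message = encode_pitch_bend(adc_reading, channel=2)
channel, decoded = decode_pitch_bend(message)
print(message.hex(), channel, decoded)
```

Because the value fits in the 14 bits of a single message, the conversion is lossless. Using one MIDI channel per sensor is one possible way to record several sensors at once.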



~ LW Research Day 2019 on Digital Humanities

On the 9th of September 2019 the second research day organized by the faculty of Arts and Philosophy of Ghent University took place. The theme of the day was ‘Digital Humanities’ and the program gave an overview of the breadth of research at our faculty, with topics such as logic, history, archeology, chemistry and geography.

Together with Jeska, I presented an ongoing study on musical interaction. One of the measurements in the study was the body movement of two participants, captured with boards equipped with weight sensors. The resulting data can be inspected for synchronisation, quality and quantity of movement, and movement periodicities.


The hardware is the work of Ivan Schepers. The software used to capture and transmit messages is called “the MIDImorphosis” and was developed by me. The research is a collaboration with Jeska Buhman, Marc Leman and Alessandro Dell’Anna. An article with detailed findings is forthcoming.


~ AAWM/FMA 2019 - Birmingham

I am currently in Birmingham, UK, at the 2019 joint Analytical Approaches to World Music (AAWM) and Folk Music Analysis (FMA) conference. The opening concert by the RBC folk ensemble was probably the most lively and enthusiastic conference opening ever, especially considering the early morning hour (9.30). At the conference, two studies on which I collaborated will be presented:

Automatic comparison of human music, speech, and bird song suggests uniqueness of human scales

Automatic comparison of human music, speech, and bird song suggests uniqueness of human scales by Jiei Kuroyanagi, Shoichiro Sato, Meng-Jou Ho, Gakuto Chiba, Joren Six, Peter Pfordresher, Adam Tierney, Shinya Fujii and Patrick Savage

The uniqueness of human music relative to speech and animal song has been extensively debated, but rarely directly measured. We applied an automated scale analysis algorithm to a sample of 86 recordings of human music, human speech, and bird songs from around the world. We found that human music throughout the world uniquely emphasized scales with small-integer frequency ratios, particularly a perfect 5th (3:2 ratio), while human speech and bird song showed no clear evidence of consistent scale-like tunings. We speculate that the uniquely human tendency toward scales with small-integer ratios may relate to the evolution of synchronized group performance among humans.

Automatic comparison of global children’s and adult songs

Automatic comparison of global children’s and adult songs by Shoichiro Sato, Joren Six, Peter Pfordresher, Shinya Fujii and Patrick Savage

Music throughout the world varies greatly, yet some musical features like scale structure display striking cross-cultural similarities. Are there musical laws or biological constraints that underlie this diversity? The “vocal mistuning” hypothesis proposes that cross-cultural regularities in musical scales arise from imprecision in vocal tuning, while the integer-ratio hypothesis proposes that they arise from perceptual principles based on psychoacoustic consonance. In order to test these hypotheses, we conducted an automatic comparative analysis of 100 children’s and adult songs from throughout the world. We found that children’s songs tend to have narrower melodic range, fewer scale degrees, and less precise intonation than adult songs, consistent with motor limitations due to their earlier developmental stage. On the other hand, adult and children’s songs share some common tuning intervals at small-integer ratios, particularly the perfect 5th (~3:2 ratio). These results suggest that some widespread aspects of musical scales may be caused by motor constraints, but also suggest that perceptual preferences for simple integer ratios might contribute to cross-cultural regularities in scale structure. We propose a “sensorimotor hypothesis” to unify these competing theories.


~ trix: Realtime audio over IP

At work we have a really nice piano and I wanted to be able to broadcast a live performance over the internet with low latency to potential live listeners. In all honesty, only my significant other gets moderately lukewarm about the idea of hearing me play live. Anyhow:

I did not find any practical tool to easily pump audio over the internet. I did find something that came very close, called trx, by Mark Hills: trx is a simple toolset for broadcasting live audio from Linux. Unfortunately it only works with the ALSA audio system and is limited to Linux. I decided to extend it to support macOS and PulseAudio. I also extended its name to form trix.

Audio Transmitter/Receiver over Ip eXchange (trix) is a simple toolset for broadcasting live audio from Linux or macOS. It sends and receives encoded audio over IP networks, via an audio interface. If the audio interfaces are properly configured, a low-latency point-to-point or multicast broadband audio connection can be achieved. This could be used for networked music performances. The inclusion of the intermediate RtAudio library provides support for various audio inputs and outputs.

More information on trix can be found on the trix GitHub page.

Latency

The system can be configured for low-latency use. The whole chain depends on several components, each of which adds to the total latency: audio input latency, encoder (algorithmic) delay, network latency and finally audio output latency.

Thanks to the use of RtAudio it should be possible to use low-latency APIs to access audio devices (ASIO on Windows or JACK on Unix). This means that audio input and output latencies can be as low as the hardware allows. The Opus encoder/decoder that is used has a low algorithmic delay. By default its delay is roughly 25 ms, but it can be configured for frames as short as 2.5 ms (see here). The network latency (and jitter) depends very much on the distance to cover. On a local network it can be kept low. On wide area networks (the internet) control is lost and latencies can add up depending on the number of hops. Jitter can be problematic if the smallest possible buffers are used: dropouts might then occur and affect the audio in a noticeable way.
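A rough latency budget for such a chain is simply the sum of its parts. The sketch below is not part of trix, and every number in it is an assumption chosen for illustration: a 48 kHz sample rate, 128-frame audio buffers, the shortest encoder figure mentioned above, and 1 ms of network latency on a local network.

```python
# Sketch of a latency budget; not part of trix. All values are assumptions.

def buffer_latency_ms(frames, sample_rate):
    """Latency added by one audio buffer of `frames` samples."""
    return 1000.0 * frames / sample_rate


def total_latency_ms(input_ms, encoder_ms, network_ms, output_ms):
    """The components of the chain simply add up."""
    return input_ms + encoder_ms + network_ms + output_ms


SAMPLE_RATE = 48000   # Hz, assumed
BUFFER_FRAMES = 128   # frames per audio buffer, assumed

audio_in_ms = buffer_latency_ms(BUFFER_FRAMES, SAMPLE_RATE)
audio_out_ms = buffer_latency_ms(BUFFER_FRAMES, SAMPLE_RATE)
encoder_ms = 2.5      # shortest encoder setting mentioned in the text
network_ms = 1.0      # assumed local network latency

total_ms = total_latency_ms(audio_in_ms, encoder_ms, network_ms, audio_out_ms)
print(f"one-way latency: {total_ms:.2f} ms")
```

With these assumed values the audio buffers, and not the encoder, account for most of the budget. Over the internet the network term would typically dominate.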


~ Audio marker finder

I have uploaded a small piece of software which allows users to find a specific audio marker in audio streams. It is mainly practical for synchronising a camera (audio/video) recording with other audio that contains the same marker. The marker is a set of three beeps. These three beeps are located with millisecond precision within the audio streams under analysis. By comparing the timing of the marker in each stream, synchronization becomes possible. It can be regarded as an alternative to a movie clapperboard.
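The principle can be illustrated with a brute-force cross-correlation. This is a sketch under stated assumptions and not the algorithm used by the Audio marker finder: the marker shape (three 1 kHz beeps), the 8 kHz sample rate, the synthetic noisy streams and all function names are made up for the example.

```python
import math
import random

# Sketch only: NOT the Audio marker finder's algorithm. Marker shape,
# sample rate and test streams are assumptions for illustration.
SAMPLE_RATE = 8000  # Hz, kept low so the pure-Python search stays fast


def make_marker(beep_hz=1000, beep_len=40, gap_len=40, beeps=3):
    """Three short sine beeps separated by silence (lengths in samples)."""
    samples = []
    for i in range(beeps):
        samples += [math.sin(2 * math.pi * beep_hz * n / SAMPLE_RATE)
                    for n in range(beep_len)]
        if i < beeps - 1:
            samples += [0.0] * gap_len
    return samples


def make_stream(marker, marker_start, length, seed):
    """Low-level noise with the marker mixed in at `marker_start`."""
    rng = random.Random(seed)
    stream = [rng.uniform(-0.05, 0.05) for _ in range(length)]
    for i, value in enumerate(marker):
        stream[marker_start + i] += value
    return stream


def find_marker(stream, marker):
    """Return the offset (in samples) where the marker correlates best."""
    best_offset, best_score = 0, float("-inf")
    for offset in range(len(stream) - len(marker) + 1):
        window = stream[offset:offset + len(marker)]
        score = sum(s * m for s, m in zip(window, marker))
        if score > best_score:
            best_offset, best_score = offset, score
    return best_offset


marker = make_marker()
camera_audio = make_stream(marker, marker_start=300, length=2000, seed=1)
reference_audio = make_stream(marker, marker_start=1100, length=2000, seed=2)

camera_pos = find_marker(camera_audio, marker)
reference_pos = find_marker(reference_audio, marker)
shift_ms = 1000.0 * (reference_pos - camera_pos) / SAMPLE_RATE
print(f"marker is {shift_ms:.1f} ms later in the reference recording")
```

Once the marker position is known in both streams, the difference between the two positions is the time shift needed to align the recordings.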

Screenshot of the Audio marker finder

The source code for the audio marker finder is on GitHub. The software is used in the Art Science Interaction Lab of the Krook. Below you can download the Audio marker finder and the marker itself.


~ Validity and reliability of peak tibial accelerations as real-time measure of impact loading during over-ground rearfoot running at different speeds - Journal of Biomechanics

To reduce common running injuries, we first need to measure some characteristics of running style. We have therefore developed a sensor that measures how hard a runner's foot repeatedly hits the ground. This sensor has been compared with laboratory equipment, which shows that its measurements are valid and repeatable. The main advantage of our sensor is that it can be used ‘in the wild’, outside the lab, on the runner's regular routes. We want to use this sensor to provide real-time biofeedback in order to change running style and ultimately reduce injury risk.

We have published an article on this sensor in the Journal of Biomechanics:
Pieter Van den Berghe, Joren Six, Joeri Gerlo, Marc Leman, Dirk De Clercq,
Validity and reliability of peak tibial accelerations as real-time measure of impact loading during over-ground rearfoot running at different speeds (author version),
Journal of Biomechanics, 2019

Studies seeking to determine the effects of gait retraining through biofeedback on peak tibial acceleration (PTA) assume that this biometric trait is a valid measure of impact loading that is reliable both within and between sessions. However, reliability and validity data were lacking for axial and resultant PTAs along the speed range of over-ground endurance running. A wearable system was developed to continuously measure 3D tibial accelerations and to detect PTAs in real time. Thirteen rearfoot runners ran at 2.55, 3.20 and 5.10 m·s⁻¹ over an instrumented runway in two sessions with re-attachment of the system. Intraclass correlation coefficients (ICCs) were used to determine within-session reliability. Repeatability was evaluated by paired t-tests and ICCs. Concerning validity, axial and resultant PTAs were correlated to the peak vertical impact loading rate (LR) of the ground reaction force. Additionally, speed should affect impact loading magnitude. Hence, magnitudes were compared across speeds by RM-ANOVA. Within a session, ICCs were over 0.90 and reasonable for clinical measurements. Between sessions, the magnitudes remained statistically similar with ICCs ranging from 0.50 to 0.59 for axial PTA and from 0.53 to 0.81 for resultant PTA. Peak accelerations of the lower leg segment correlated to LR with larger coefficients for axial PTA (r range: 0.64 to 0.84) than for the resultant PTA per speed condition. The magnitude of each impact measure increased with speed. These data suggest that PTAs registered by a stand-alone system can be useful during level, over-ground rearfoot running to evaluate impact loading in the time domain when force platforms are unavailable in studies with repeated measurements.
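The difference between the axial and the resultant PTA can be made concrete with a few lines of code. The sketch below is not the detection algorithm from the article: the sample values (in g) are invented, and the choice of the third component as the axial direction of the tibia is an assumption.

```python
import math

# Sketch only: NOT the article's detection algorithm. Sample values
# and the choice of axial component are assumptions.

def resultant(sample):
    """Magnitude of a 3D acceleration sample (x, y, z)."""
    x, y, z = sample
    return math.sqrt(x * x + y * y + z * z)


def peak_accelerations(samples, axial_index=2):
    """Return (axial peak, resultant peak) within one window of samples."""
    axial_peak = max(s[axial_index] for s in samples)
    resultant_peak = max(resultant(s) for s in samples)
    return axial_peak, resultant_peak


# Hypothetical accelerations (in g) around a single foot strike.
window = [
    (0.1, 0.2, 1.0),
    (1.0, 2.0, 6.0),
    (3.0, 4.0, 12.0),
    (0.5, 1.0, 4.0),
]

axial_pta, resultant_pta = peak_accelerations(window)
print(axial_pta, resultant_pta)
```

The resultant peak is never smaller than the axial peak, since it also includes the two other components.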


~ Nano4Sports in Team Scheire

‘Team Scheire’ is a Flemish TV program with a concept similar to BBC Two’s ‘The Big Life Fix’. In the program, makers create ingenious, life-changing solutions to everyday problems for people in desperate need.

One of the cases is Ben. Ben loves to run but has a recurring running-related injury. To monitor Ben’s running and determine a maximum training length, a sensor was developed that measures the impact and the number of steps taken. The program makers were interested in the results of the Nano4Sports project at UGent. One of the aims of that project is to build this type of sensor and to develop know-how on the correct interpretation of the data and the use of such devices. Below a video with some background information can be found:

The solution built for the program is documented in a GitHub repository. One of the scientific results of the Nano4Sports project can be found in an article for the Journal of Biomechanics titled Validity and reliability of peak tibial accelerations as real-time measure of impact loading during over-ground rearfoot running at different speeds.


~ ISMIR 2018 Conference - Automatic Analysis Of Global Music Recordings suggests Scale Tuning Universals

Thanks to a travel grant from the faculty of Arts and Philosophy of Ghent University, I was able to attend ISMIR 2018, a conference on Music Information Retrieval. I am co-author of a contribution to the Late-Breaking / Demos session:

The structure of musical scales has been proposed to reflect universal acoustic principles based on simple integer ratios. However, some researchers studying tuning in small samples of non-Western cultures have argued that such ratios are not universal but specific to Western music. To address this debate, we applied an algorithm that could automatically analyze and cross-culturally compare scale tunings to a global sample of 50 music recordings, including both instrumental and vocal pieces. Although we found great cross-cultural diversity in most scale degrees, these preliminary results also suggest a strong tendency to include the simplest possible integer ratio within the octave (perfect fifth, 3:2 ratio, ~700 cents) in both Western and non-Western cultures. This suggests that cultural diversity in musical scales is not without limit, but is constrained by universal psycho-acoustic principles that may shed light on the evolution of human music.
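The relation between a frequency ratio and its size in cents is a standard formula: 1200 times the base-2 logarithm of the ratio. The short sketch below, with a hypothetical helper name, shows why the 3:2 perfect fifth is described as roughly 700 cents.

```python
import math


def ratio_to_cents(ratio):
    """Convert a frequency ratio to cents (1200 cents per octave)."""
    return 1200.0 * math.log2(ratio)


fifth = ratio_to_cents(3 / 2)   # just perfect fifth
octave = ratio_to_cents(2)      # octave
print(f"fifth: {fifth:.2f} cents, octave: {octave:.0f} cents")
```

The just fifth comes out slightly below 702 cents, about 2 cents wider than the 700 cents of the equal-tempered fifth.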


~ JGaborator - Fast Gabor spectral transforms in Java

Recently I published a small library on GitHub called JGaborator. The library calculates fine-grained constant-Q spectral representations of audio signals quickly from Java. The calculation of the Gabor transform is done by a C++ library named Gaborator. A Java Native Interface (JNI) bridge to the C++ Gaborator is provided. A combination of Gaborator and a fast FFT library (such as pffft) allows fine-grained constant-Q transforms at a rate of about 200 times real-time on moderate hardware. It can serve as a front-end for several audio processing or MIR applications.
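What "constant-Q" means can be shown with a small calculation. The sketch below does not use the JGaborator or Gaborator API; the function and parameter names are hypothetical. It only computes the geometrically spaced centre frequencies of such a transform, and the quality factor Q (centre frequency divided by the spacing to the next band), which is the same for every band.

```python
# Sketch only: does not use the JGaborator or Gaborator API.

def constant_q_centers(f_min, f_max, bands_per_octave):
    """Geometrically spaced centre frequencies between f_min and f_max."""
    centers = []
    k = 0
    while True:
        f = f_min * 2.0 ** (k / bands_per_octave)
        if f > f_max:
            break
        centers.append(f)
        k += 1
    return centers


def quality_factor(bands_per_octave):
    """Q = centre frequency / distance to the next centre frequency."""
    return 1.0 / (2.0 ** (1.0 / bands_per_octave) - 1.0)


centers = constant_q_centers(110.0, 880.0, bands_per_octave=12)
q = quality_factor(12)
print(len(centers), f"{centers[12]:.1f} Hz", f"Q = {q:.2f}")
```

With 12 bands per octave each band corresponds to one semitone. Finer resolutions follow from increasing the number of bands per octave, which also raises Q.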

For more information on the Gaborator C++ library by Andreas Gustafsson, please see the gaborator.com website or a talk by the author on the library, called Exploring time-frequency space with the Gaborator.

While the Gaborator allows reversible transforms, only a forward transform (from the time domain to the spectral domain) is currently supported from Java. A visualization tool for spectral information is part of this package. See below for a screenshot:

JGaborator screenshot


~ TISMIR journal article - A Case for Reproducibility in MIR: Replication of ‘A Highly Robust Audio Fingerprinting System’

As an extension of the ISMIR conferences, the International Society for Music Information Retrieval started a new journal: TISMIR. The first issue contains an article of mine:
A Case for Reproducibility in MIR: Replication of ‘A Highly Robust Audio Fingerprinting System’. The abstract can be read here:

Claims made in many Music Information Retrieval (MIR) publications are hard to verify due to the fact that (i) often only a textual description is made available and code remains unpublished, leaving many implementation issues uncovered; (ii) copyrights on music limit the sharing of datasets; and (iii) incentives to put effort into reproducible research (publishing and documenting code and specifics on data) are lacking. In this article the problems around reproducibility are illustrated by replicating an MIR work. The system and evaluation described in ‘A Highly Robust Audio Fingerprinting System’ are replicated as closely as possible. The replication is done with several goals in mind: to describe difficulties in replicating the work and subsequently reflect on guidelines around reproducible research. Added contributions are the verification of the reported work, a publicly available implementation and an evaluation method that is reproducible.


Previous blog posts

31-07-2018 ~ JNMR article - Beyond documentation – The digital philology of interaction heritage

26-04-2018 ~ MIR Meetup Berlin - Acoustic Fingerprinting in Research

02-02-2018 ~ Engineering systematic musicology

25-01-2018 ~ HTML5 spectrogram on canvas with pitch estimation

23-01-2018 ~ IRCDL 2018 - Applications of Duplicate Detection in Music Archives: from Metadata Comparison to Storage Optimisation

24-11-2017 ~ International Symposium on Computational Ethnomusicological Archiving

28-10-2017 ~ 4th International Digital Libraries for Musicology workshop (DLfM 2017)

31-07-2017 ~ ESCOM 2017 - Regularity and asynchrony when tapping to tactile, auditory and combined pulses

16-06-2017 ~ AES 2017 - A framework to provide fine-grained time-dependent context for active listening experiences

30-03-2017 ~ Computational Ethnomusicology: Methodologies for a New Field