Hi, I'm Joren. Welcome to my website. I'm a research software engineer in the field of Music Informatics and Digital Humanities. Here you can find a record of my research and projects I have been working on. Learn more »
I have asked ChatGPT to generate 3D models. ChatGPT can not generate 3D models directly but 3D models can generated via intermediary OpenSCAD scripts: OpenSCAD provides a scripting language to describe objects which can be combined to form 3D models. ChatGPT understands the syntax of this scripting language and generates perfectly cromulent scripts. I have asked two versions of ChatGPT to generate a 3D model of a house, a cat, a stick figure, a chair and a tree. The results are interesting…
The models immediately make the difference between ChatGPT 3.5 Turbo and ChatGPT 4.0 clear: 4.0 generates much better models with, at least, recognizable elements: a chair has four legs, a cat has a head and a tail. It is impressive that reasonable 3D models are generated but there is still room for improvement: proportions are not respected and elements are not always connected. Anyway, if the 3D-models can be seen as a way to visualize code quality, then 4.0 is a clear improvement and it makes me curious about future ChatGPT versions. It also made me reflect on a couple of aspects of LLMs in general.
Fig: a black box generating 3D models.
ephemerality The response of a LLM to a prompt is ephemeral: the same prompt causes a different response depending on context, randomness and the position of heavenly bodies - or so it seems. Traditional software systems follow a strict set of clear rules and provide deterministic, predictable and reliable results. The inverse is true for LLMs which takes some getting used to. As a user, an LLM is effectively a vantablack box - there is no way to know why a certain response was given instead of another.
Updates LLMs services - and SaaS in general - have an additional feature which makes them even more unpredictable: updates to systems can happen without notice. After a recent unannounced update, for example, ChatGPT 4 started to produce gibberish. This adds another layer to the already uncontrollable and ephemeral nature of responses to LLM prompts.
To counter the ephemeral quality of prompt responses, I have 3D printed the generated 3D models. Some pictures can be found below. I find that these physical, tangible, immutable objects provide a comforting counterbalance to the digital, ephemeral nature of LLM responses. Additionally, it highlights the absurdity of the generated models.
There are other ways to solidify ephemerality: crochet patterns, juggling patterns, guitar tablature, music notation all have some kind of structured text representation which LLMs can generate and which can have a physical representation. I would encourage people to bring prompt responses to the physical world: it really makes the - current - limitations of LLMs very clear.
At the Ghent Center for Digital Humanities (GhentCDH) we offer IT-services mainly for researchers in the Humanities at Ghent University. The services range from internal collaborative research tools to publicly facing science communication platforms. Technically, it is a mix of off the shelve software with or without modifications and custom solutions using several technical stacks. It is a challenge to keep these services running, secure and up-to-date for years with a limited budget.
In an attempt to make maintenance of these services more manageable we are in the process of containerizing our software. Running software in containers has advantages. One of the advantages is a guaranteed consistency across environments. Also, isolated software containers can be beneficial for security and stability. It also allows one to run different versions of a stack on the same server without running into compatibility problems.
Next to running software in containers, development in containers also has advantages. It allows you to switch projects easily without needing to install dependencies - e.g a specific database system version - directly on a development machine. The main advantage I see is that containerization promotes developer hygiene. Stereotypically, developers do not have the best hygiene and can use any available help. Containerization forces developers to think about separation of code and configuration, code and data and it forces to be explicit about dependencies and environmental assumptions.
The main disadvantage is that some configuration is needed to get the containers running and that there is a small performance penalty. The following might help with that first part.
Dockerized Python database development
To put the theory to the test my colleagues and I put together a GitHub repository with a dockerized Python development setup. It shows interaction between Python and a PostgreSQL database. The database system runs in a container and the development environment is also kept in a container. Both containers are started with docker compose and configured via a .env file.
The stack uses a recent Python version, PDM to resolve Python dependencies and SQLAlchemy to interact with the PostgreSQL database. The VS code editor allows developers to run and debug software in a container. The video below shows the startup procedure and setting a breakpoint in some Python code.
Vid: Starting a database server and development container. Running and debugging Python code in a container.
Note that this is just an example setup, your setup might look quite different. You might need a different stack, use a different container environment (e.g. podman) or IDE but the principle of container based development could stay the same.
I have put off using containers for quite a while and I am quite a late convert, but now that I am doing more technical work in a small team I do see the advantages of an easy-to-set up, controlled, containerized development with explicitly defined dependencies. If you have no experience with containers yet, I would encourage you to at least try container based development out and see where it could help you!
A couple of months ago, OnTracx, a Ghent University sports-tech spin-off launched with the ‘dream of a world where every runner can stay injury-free’. That dream is based on a firmly grounded interdisciplinary research project, which I was fortunate to contributed to. The research project - headed by the UGent sports science department - developed a music-based bio-feedback system to reduce footfall shock while running with the aim to lower common running-related injury risk. I fondly remember soldering and programing the first cluncky prototypes, now already eight years ago!
In my role, I contributed to several key papers that form the foundation of OnTracx. Notably, the ‘validity and reliability’ paper, which has become the most cited work in my academic portfolio, which at least indicates academic interest. The main author of the paper is now doing a post-doc in Harvard, so he must have been doing something right! Additionally, I am also recognized as co-inventor on a patent related to the system.
Van den Berghe, P., Lorenzoni, V., Derie, R., Six, J., Gerlo, J., Leman, M., & De Clercq, D. (2021). Music-based biofeedback to reduce tibial shock in over-ground running: A proof-of-concept study. Scientific reports
Fig: schema of the low impact runner research system. Foot-fall impact is measured with wearable sensors and music-based feedback is given to the runner with the aim to avoid high impact.
The journey from research to commercial realization is always thrilling. As OnTracx steps into the market, I am filled with hope and anticipation for its success, mirroring and potentially exceeding the fruitful research track.
Luckily, a couple of days ago a piece of software appeared to capture Google Earth tiles -cubes- into a single 3D file. There you can select an area of interest via google maps and download a GLTF file which captures the landscape in 3D. The software needs an API key which can be requested via the Google Developer tools.
After downloading a GLTF file, the 3D model needs to be made 3D-printable. There are online GLTF to STL converters but a bit of care needs to be taken to end up with an actually printable STL. My selected area of interest only has slight height differences in the landscape which are handled by placing the STL file on a base which compensates for these differences. Your 3D slicer can also generate structure to support inclinations in the landscape.
The 3D model generated by Google Earth is quite noisy and can contain floating parts and holes. It may be needed to edit the STL mesh directly. Selecting a slightly shifted area of interest may also solve problems with the edges of the print: take care to chop less buildings in two.
Elektor, a hobby electronics magazine, recently featured an article on acoustic fingerprinting using the ESP32. It is included in a special edition on Espressive products like the ESP32. This article includes content previously published on this blog and other writings about Olaf.
Since the article is based on my writings, there was an agreement to allow one of their writers to compose the magazine article under my name. This was my first experience with having a ghostwriter – quite convenient, I must say. Although it’s somewhat apparent that the article is compiled from various sources, I am overall pleased with the outcome. It even made the front page!
Elektor has a rich history, dating back to the early 1960s when it was first published in Dutch as ‘Elektuur’. I have fond memories of browsing Elektuur at my nerdy uncle’s place. If anything, this article has certainly earned me some nerd credibility points in my uncle’s eyes.
Fig: Hammer vs. screw. Not the right tool for the job.
Let’s look at a few examples. Take visiting news website. On a news site, a user expects to be able to read current news, reviews, opinions, .. and there is no expectation of interactivity. Basically, a news site could work equally well on physical paper, as was the case for the last century or more. Ideally, a news site is a static HTML page with an easy to follow layout and some images, perhaps some static ads, with information flowing in a single direction.
There is something about surprising interfaces. Having a switch to turn on a light gets quite boring after a while. Turning on a light by clapping twice, on the other hand, has some kind of magic feel to it. In a recent Mr Beast video he and his gang visit a number of expensive houses and in one of those mansions there is a light operated by clapping twice. I am not sure about the blatant materialism, but it got me thinking on how to build a similar clap-operated light yourself.
So, what are the elements needed: first a microphone to pick up sound. Second an algorithm is needed that detects claps. And finally, something that reacts to claps: a light or something else.
Many devices have microphones so sound input is relatively easy, and with some creativity there are many things waiting to be ‘clap triggered’: vacuum robots, sunscreens, lights, in-house ventilation, … The main difficulty is implementing a efficient clap-detection algorithm. Luckily there are already a few described in the literature. I have based my ANSI C implementation on ‘Duxbury, C., et al (2003). Complex domain onset detection for musical signals’.
My version of the clap-detection algorithm has two parameters which might need adapting to fit your environment. The silence threshold determines the minimum loudness for a clap to be triggered. The onset threshold determines more or less how ‘percussive’ the sound needs to be: the idea is to only react to things sounding like a clap and not to e.g. a loud whistle or other sounds. This is what the onset threshold tries to control. You can try it out below:
Demo: click the ‘start audio’ to capture your microphone and try to clap clearly twice. Lower the parameters if nothing happens.
Clap detection on a micro-controller
With this working we now can try to run this code on a micro-controller. Running it on a micro-controller makes it more practical in daily use to e.g. switch on lights. A low-cost ESP32 with a MEMS microphone is a good platform: these microcontrollers are easy to use and have WiFi connectivity which opens the possibility to trigger commands to smart sockets or other WiFi-enabled devices. The pector GitHub repository contains an Arduino project to run the clap-detection algorithm on an ESP32 or similar device (Teensy, RP2040,… ).
Clap detection in the command line
Next to the main clap detection software, there is a small script to trigger commands when a clap is detected. In this case, the script waits for a double clap and then pushes updates to a git repository. There are two reasons for this: the first is that it is fun, the second is for bragging rights. Not that many people can say they once pushed source code simply by clapping twice. It is, however, a challenge to find people who have the patience to listen to me explaining what I have done and who are impressed by this feat, so maybe there is only one reason: it is fun. Below a screen capture can be found pushing code to the pector repository.
Vid: pushing code by clapping
Have a look at the pector GitHub repository for more info on how you can make your websites/apps/command line tools/devices clap controlled!
I have been asked to give a guest lecture introducing Music Information Retrieval for the course ‘Foundations of Musical Acoustics and Sonology’ at Ghent University. The lecture slides include interactive demos with live sound visualization and can be found below.
As we delve into the intricacies of how machines can analyze and understand musical content, students will gain insights into the cutting-edge research field that underpins modern music technology. From the algorithms powering music recommendation systems to the challenges of extracting meaningful information from audio signals, the lecture aims to ignite curiosity and inspire the next generation of musicologists in both music and technology. Get ready for an engaging session that promises to unlock the doors to a world where the science of sound meets the art of music.
I will be demoing an early digital music workstation at the Flanders 2023 Science Day. During the Science Day there will be demonstrations of several of the electronic music heritage instruments of the collection of IPEM, which used to be an early electronic music production studio. In the collection is a vintage analog synthesizer (an EMS Synthi 100), a Yamaha DX7, an analog plate reverb audio effect processor and, finally, a NeXTcube with a unique sound-card and early digital music workstation software.
The NeXTcube is an influential machine in computing history. The NeXTcube, with an additional soundcard, was also one of the first off-the-shelf devices for high-quality, real-time music applications. I have restored a NeXTcube to run an early version of MAX, an environment for interactive music applications. This combination of software and hardware was developed at IRCAM and was known as the IRCAM Musical Workstation or IRCAM Signal Processing Workstation. See my previous blog posts on Electronic Music and the NeXTcube and USBMIDI interface for the NeXTCube
Fig: the NeXTcube’s design stood out compared to the contemporary beige box PCs.
The IPEM collection of electronic music instruments is unique with the aim to reintroduce the instruments into daily music practice an turn them into living heritage. For example in 2020, the Dewaele Brothers released the album made exclusively on the IPEM ‘EMS Synthi 100’ synthesizer. The NeXTcube demo will be hands-on as well. See you there!
I did a thing, and, similar to most stuff made here, it is quite a bit of effort and rather pointless. In that sense, it is a bit like life itself. Anyhow, it seems that the Halloween tradition of trick-or-treating has found a strong foothold in mainland Europe. Due to social embeddedness, I prepared Halloween themed projection that responds to my door-bell. I have a glass door, which is ideal for scary projections. The idea is to have a continuous door projection but with a twist: when kids press the doorbell a projected ghost reacts and rushes towards them along with a loud ghostly scream.
This blog post details the technical setup with the intention to inspire similar projects and serve as documentation for next year. First we need a way react to the doorbell.
Doorbell trigger setup
I sourced a couple of FSR‘s from a “sound book” that I had taken apart. Most of these sound books with e.g. animal sounds are meant for toddlers and have a some type of button and a small electronics circuit to make sound. Some of these books work with FSR ’buttons’ which are similar in size to a doorbell. I took a single FSR from such a book.
I attached the FSR to a “Teensy LC” micro-controller with an additional resistor and put it in a small 3D-printed case. The Teensy was programmed to emit a MIDI Note On event when the FSR/doorbell is pressed. A Note Off follows when the button is released. Once it is connected via USB to a computer it is essentially regarded as a digital piano with only a single key. Making a micro-controller pretend to be a standard MIDI device is very practical since the message passing protocol is standardized and well supported by many types of systems. MIDI is also optimized for low-latency communication. Via the Web MIDIAPI there is even support for MIDI in web browsers.
While software like Resolume allows for complex interactive video projections, my requirements are more modest: I need a continuous background video and I want the ‘scare’ video and audio to appear when the doorbell is triggered. I opted for a browser-based solution: multi-media capabilities, scripting and MIDI support are all present in modern browsers. Running things in a browser has advantages: there is no need for specialized software, it is easy to program, easy to run, relatively stable and future-proof. The proof-of-concept can be seen below. For the actual projection on a window or door you need to first cover the glass with a thin layer of white paper which lets most light through. A white paper tablecloth works well.
Demo: click the ‘start video’ to start the background video and click doorbell if you dare…