For a few months now, FFmpeg has supported audio transcription via OpenAI's Whisper models and whisper.cpp. This makes it possible to automatically transcribe interviews and podcasts, or to generate subtitles for videos. Most packaged versions of the ffmpeg command-line tool do not ship with this option enabled. Here we show how to enable it on macOS with the Homebrew package manager; on other platforms a similar configuration applies.
On macOS there is a Homebrew tap that lets you enable or disable ffmpeg's many build options. If you already have a plain ffmpeg installed, you need to uninstall it first and then install a version with the options you want. The steps below show how to do this:
# check if you already have ffmpeg with whisper enabled
ffmpeg --help filter=whisper
# uninstall current ffmpeg, it will be replaced with a version with whisper
brew uninstall ffmpeg
# add a brew tap which provides options to install ffmpeg from source
brew tap homebrew-ffmpeg/ffmpeg
# this command enables the most common options on top of the defaults
brew install homebrew-ffmpeg/ffmpeg/ffmpeg \
--with-fdk-aac \
--with-jpeg-xl \
--with-libgsm \
--with-libplacebo \
--with-librist \
--with-librsvg \
--with-libsoxr \
--with-libssh \
--with-libvidstab \
--with-libxml2 \
--with-openal-soft \
--with-openapv \
--with-openh264 \
--with-openjpeg \
--with-openssl \
--with-rav1e \
--with-rtmpdump \
--with-rubberband \
--with-speex \
--with-srt \
--with-webp \
--with-whisper-cpp
Installation will take a while, since the many options pull in many dependencies. Once the build is finished the whisper filter should be available in FFmpeg, and the help command from the first step should now print the filter's options.
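As a quick test, an invocation along the following lines should transcribe the audio track of a video to SRT subtitles. The model path is an assumption here: download a ggml model from the whisper.cpp repository first and point the filter to wherever you stored it.
# transcribe the audio track to subtitles
# (the model path is an example; fetch a ggml model from whisper.cpp first)
ffmpeg -i input.mp4 -vn \
  -af "whisper=model=models/ggml-base.en.bin:language=en:destination=output.srt:format=srt" \
  -f null -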
Last Friday, I had the pleasure of facilitating a hands-on workshop in Luxembourg as part of the MuTechLab workshop series, organized by Luc Nijs at the University of Luxembourg. Together with Bart Moens from XRHIL and IPEM, we presented a system to control musical parameters with body movement.
MuTechLab is a series of workshops for music teachers who wish to dive into the world of music technology. Funded by the Luxembourgish National Research Fund (FNR, PSP-Classic), the initiative brings together educators eager to explore how technology can enhance music education and creative practice.
What we built and presented
During the workshop, participants got hands-on experience with the EMI-Kit (Embodied Music Interface Kit) – an open-source, low-cost system that allows musicians to control Digital Audio Workstation (DAW) parameters through body movement.
The EMI-Kit consists of:
- A wearable sensor device (M5StickC Plus2) that captures body orientation and gestures
- A receiver unit (M5Stack STAMP S3A) that converts sensor data to MIDI messages
Unlike expensive commercial alternatives, EMI-Kit is fully open source, customizable, and designed specifically for creative music practice and for research on embodied music interaction.
The Experience
Teachers experimented with mapping natural body movements – pitch, yaw, roll, and tap gestures – to various musical parameters in their DAWs. The low-latency wireless system made it possible to move and control sound, opening up new possibilities for expressive musical performance and pedagogy.
Learn More
Interested in exploring embodied music interaction yourself? The EMI-Kit project as-is is a demonstrator, meant to inspire educators to embrace these tools and imagine new ways of teaching and creating music. As a platform, EMI-Kit can, with some additional programming, serve as a good basis for controlling musical parameters with various sensors. Have fun checking out the EMI-Kit.
Photos from the workshop: the workshop hardware set, the ESP32-S3 USB MIDI receivers, and a participant package with a sender and receiver pair.
I’ve just pushed some updates to mot — a command-line application for working with OSC and MIDI messages. My LLM tells me that these are exciting updates but I am not entirely sure that this is the case. Let me know if this ticks your boxes, and seek professional help.
1. Scriptable MIDI Processor via Lua
I have implemented a MIDI processor that lets you transform, filter, and generate MIDI messages using Lua scripts.
Why is this useful? MIDI processors act as middlemen between your input devices and output destinations. You can do the following with incoming MIDI messages:
Filter - Block unwanted messages or channels, or select specific ranges
Route - Send different notes to different channels
Generate - Create complex patterns from simple input
The processor reads incoming MIDI from a physical device, processes it through your Lua script, and outputs the modified messages to a virtual MIDI port that your DAW or synth can receive. Some examples:
# Generate chords from single notes
mot midi_processor --script scripts/chord_generator.lua

# Transpose notes up by one octave
mot midi_processor --script scripts/example_processor.lua
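To give an idea of what such a script can look like, below is a minimal transposition sketch. Note that the process() hook and the message fields used here are assumptions for illustration; the example scripts shipped with mot show the actual Lua API.
-- Minimal transposition sketch. The process() callback and the
-- message fields (type, note) are hypothetical; check the example
-- scripts bundled with mot for the real API.
function process(msg)
  if msg.type == "note_on" or msg.type == "note_off" then
    -- shift the note up one octave, clamped to the valid MIDI range
    msg.note = math.min(msg.note + 12, 127)
  end
  -- return the (possibly modified) message; returning nil drops it
  return msg
end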
2. Network Discovery via mDNS
OSC receivers now advertise themselves on the network using mDNS/Bonjour with the _osc._udp service type.
This makes mot compatible with the EMI-kit — the Embodied Music Interface Kit developed at IPEM, Ghent University. OSC-enabled devices can automatically discover mot receivers on your network, which eliminates manual configuration, provided the OSC sources implement this discovery as well.
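If you want to check that a running mot OSC receiver is actually visible on the network, you can browse for the _osc._udp service type with a standard mDNS tool:
# macOS: browse for advertised OSC services
dns-sd -B _osc._udp
# Linux, with Avahi installed
avahi-browse -r _osc._udp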