FFmpeg with Whisper support on macOS via Homebrew
» By Joren on Wednesday 22 October 2025

For a few months now, FFmpeg has supported audio transcription via OpenAI Whisper, using whisper.cpp. This makes it possible to automatically transcribe interviews and podcasts, or to generate subtitles for videos. However, most packaged versions of the ffmpeg command line tool do not ship with this option enabled. Here we show how to enable it on macOS with the Homebrew package manager; on other platforms, similar configuration applies.
On macOS there is a prepared Homebrew tap that makes it possible to enable or disable the many ffmpeg options. If you already have ffmpeg installed without these options, you may need to uninstall the current version and install a version with your chosen options. See below on how to do this:
# check if you already have ffmpeg with whisper enabled
ffmpeg --help filter=whisper
# uninstall current ffmpeg, it will be replaced with a version with whisper
brew uninstall ffmpeg
# add a brew tap which provides options to install ffmpeg from source
brew tap homebrew-ffmpeg/ffmpeg
# this command enables the most common options on top of the defaults
brew install homebrew-ffmpeg/ffmpeg/ffmpeg \
--with-fdk-aac \
--with-jpeg-xl \
--with-libgsm \
--with-libplacebo \
--with-librist \
--with-librsvg \
--with-libsoxr \
--with-libssh \
--with-libvidstab \
--with-libxml2 \
--with-openal-soft \
--with-openapv \
--with-openh264 \
--with-openjpeg \
--with-openssl \
--with-rav1e \
--with-rtmpdump \
--with-rubberband \
--with-speex \
--with-srt \
--with-webp \
--with-whisper-cpp
Installation will take a while, since many dependencies are required for the many options. Once the build is finished, the whisper filter should be available in FFmpeg. The output below shows how a correct installation looks:
ffmpeg version 8.0 Copyright (c) 2000-2025 the FFmpeg developers
built with Apple clang version
...
Filter whisper
Transcribe audio using whisper.cpp.
Inputs:
#0: default (audio)
Outputs:
#0: default (audio)
whisper AVOptions:
model <string> ..F.A...... Path to the whisper.cpp model file
language <string> ..F.A...... Language for transcription ('auto' for auto-detect) (default "auto")
queue <duration> ..F.A...... Audio queue size (default 3)
use_gpu <boolean> ..F.A...... Use GPU for processing (default true)
gpu_device <int> ..F.A...... GPU device to use (from 0 to INT_MAX) (default 0)
destination <string> ..F.A...... Output destination (default "")
format <string> ..F.A...... Output format (text|srt|json) (default "text")
vad_model <string> ..F.A...... Path to the VAD model file
vad_threshold <float> ..F.A...... VAD threshold (from 0 to 1) (default 0.5)
vad_min_speech_duration <duration> ..F.A...... Minimum speech duration for VAD (default 0.1)
vad_min_silence_duration <duration> ..F.A...... Minimum silence duration for VAD (default 0.5)
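With the filter available, a minimal way to try it is to point the `model` option at a whisper.cpp model file and write the transcript with `destination` and `format`, as listed in the AVOptions above. The model URL, filenames, and input file below are illustrative assumptions; check the whisper.cpp repository for the current list of models:

```shell
# download a whisper.cpp model (filename and URL are assumptions;
# see the whisper.cpp repository for available models)
curl -L -O https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-base.en.bin

# transcribe an audio file to SRT subtitles: 'model' points at the
# downloaded file, 'destination' names the output file, and
# 'format=srt' selects subtitle output; '-f null -' discards the
# audio itself, since we only want the filter's side effect
ffmpeg -i interview.mp3 \
  -af "whisper=model=ggml-base.en.bin:language=en:destination=interview.srt:format=srt" \
  -f null -
```

Setting `format=text` or `format=json` instead writes a plain transcript or structured output, and `language=auto` (the default) lets the model detect the spoken language.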