Sync your media files

1. Select media files
... or drop your audio files
2. Verify timeline
3. Verify synchronization output
4. Download results


More information

What is this for?
This page aims to synchronize media files which share common audio.
It is best explained with an example: say you have a high-quality microphone recording captured at the same time as your camera was recording a low-quality audio stream. Now you want to synchronize the two and use the high-quality audio for your video stream. This page allows you to synchronize these recordings automatically.
How do I use this?
Drop the media files you want to synchronize on the placeholder above. The first media file serves as the reference: each of the other files is matched against the reference file, and its signal is modified to fit the reference timeline.
What happens if I drop a video file?
The tool automatically uses the first audio stream in the video container for synchronisation.
What does the 'Download synchronized audio' button do?
It copies the reference audio stream to channel zero; the first synchronized stream is placed on channel one, and so forth. Wav files support up to 255 channels. For each synchronized stream, either silence is added at the start or the start is trimmed to match the reference timeline. Note that, for now, the output is reduced to mono, 16 kHz.
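As a rough illustration of the "add silence or trim the start" step, here is a minimal Python sketch. It is not the page's actual implementation: the function names and the convention that a positive offset means "starts after the reference" are assumptions made for the example; only the 16 kHz sample rate comes from the text above.

```python
SAMPLE_RATE = 16000  # the page states that output is reduced to 16 kHz


def seconds_to_samples(offset_seconds, sample_rate=SAMPLE_RATE):
    """Convert an offset in seconds to a whole number of samples."""
    return round(offset_seconds * sample_rate)


def shift_to_reference(samples, offset):
    """Shift a stream onto the reference timeline (hypothetical helper).

    Assumed convention: offset > 0 means the stream starts `offset` samples
    after the reference, so silence is prepended; offset < 0 means it starts
    before the reference, so the first `-offset` samples are trimmed.
    """
    if offset >= 0:
        return [0] * offset + list(samples)
    return list(samples)[-offset:]


late = shift_to_reference([1, 2, 3], 2)    # silence added at the start
early = shift_to_reference([1, 2, 3], -1)  # start trimmed
```

With this convention, a stream that starts half a second late would receive `seconds_to_samples(0.5)` samples of leading silence.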
I do not want to share any audiovisual material. Can I use this page?
No media is uploaded to the server or shared in any way. Transcoding and synchronization happen on the client side, thanks to the power of WebAssembly.
What is the meaning of the numbers in the table?
The first percentage, next to the file name, is the time-stretching factor. It says how much the file needs to be stretched (or compressed) in time to match the reference. Note that this percentage is rounded and only an indication. However, if it is not 100%, pay attention: the downloaded synchronized audio will then be in sync at only one point in the stream (the middle of the matches).
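To get a feeling for why a factor other than 100% matters, the following Python sketch estimates how far two streams drift apart as you move away from the single point where they are in sync. It assumes a constant speed difference, which the page does not state; the function name is made up for the example.

```python
def drift_seconds(stretch_percent, seconds_from_sync_point):
    """Estimate the drift between a file and the reference.

    Assumes a constant speed difference: a factor of 101% means timing
    differs by about 1% of the elapsed time. The sign is dropped, since
    only the size of the error is of interest here.
    """
    return abs(stretch_percent / 100.0 - 1.0) * seconds_from_sync_point


# 50 seconds away from the sync point with a 101% factor:
# the streams are roughly half a second apart.
example = drift_seconds(101, 50)
```

Half a second is clearly audible, so a file with a factor that is not 100% likely needs time-stretching in an audio editor before it can be used.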

The next number gives the absolute number of matching fingerprints, e.g. 398. Next to it, the duration of the match is given: over this many seconds, matching fingerprints are found in the reference. It indicates how much the audio fragments overlap.

The next percentage, in the bar graph, is the ratio between the number of matched fingerprints and the number of extracted fingerprints. If one in five extracted fingerprints matches the reference, the ratio is 20%.

filename.wav   101%   327 matches over 100s   29%

Example of a match: 327 fingerprints match over 100 seconds, which means that this is a robust match. The 101% means that the file is about one percent slower than the reference. The 29% again indicates a robust match, since about one in three fingerprints match.
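The arithmetic behind the numbers in the example row can be written out as a short Python sketch. The function names are made up for illustration, and the estimate of the extracted fingerprint count is derived from the rounded 29%, so it is approximate.

```python
def match_ratio_percent(matched, extracted):
    """Percentage of extracted fingerprints that match the reference."""
    if extracted == 0:
        return 0.0
    return 100.0 * matched / extracted


def matches_per_second(matched, match_duration_seconds):
    """Density of matching fingerprints over the matched duration."""
    return matched / match_duration_seconds


ratio = match_ratio_percent(1, 5)      # one in five matches
density = matches_per_second(327, 100)  # the example row above

# Working backwards from the example row: 327 matches at about 29%
# suggests roughly this many extracted fingerprints.
estimated_extracted = round(327 / 0.29)
```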

What is the difference between Download trimmed and Download extended?
The download button creates a wav file with as many channels as input files. The first file, the media file containing the reference audio, can be found on channel zero 🤟. Each channel is synchronized to the first channel. In the trimmed download, all channels have the same length as the reference audio. In the extended download, all audio is kept, even when it starts before the reference or ends after the reference.
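The difference between the two downloads can be sketched in Python as follows. This is an illustration under assumptions, not the page's code: offsets and lengths are expressed in samples relative to the start of the reference, and both function names are hypothetical.

```python
def output_span(reference_length, tracks, mode="trimmed"):
    """Return (start, end) of the output timeline, relative to the reference.

    `tracks` is a list of (offset, length) pairs in samples (assumed layout).
    Trimmed: exactly the reference span. Extended: wide enough to hold
    every sample of every track.
    """
    if mode == "trimmed":
        return 0, reference_length
    start = min([0] + [offset for offset, _ in tracks])
    end = max([reference_length] + [offset + length for offset, length in tracks])
    return start, end


def place_on_timeline(samples, offset, start, end):
    """Write one track into a silent channel covering [start, end)."""
    channel = [0] * (end - start)
    for i, sample in enumerate(samples):
        position = offset + i - start
        if 0 <= position < len(channel):  # samples outside the span are dropped
            channel[position] = sample
    return channel


tracks = [(-20, 50), (80, 60)]
trimmed_span = output_span(100, tracks, "trimmed")
extended_span = output_span(100, tracks, "extended")
```

In the trimmed case, a track that starts one sample early loses its first sample; in the extended case the timeline grows so that the sample is kept.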
What does the Download JSON button do?
It creates a JSON file containing the information on the matching audio. For the reference audio the duration is included. For all other audio the offsets and matching times are included for each matching fingerprint.
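As a sketch of how such a file could be used, the Python snippet below summarizes the per-fingerprint offsets for each file. The exact layout of the downloaded JSON is not documented here, so every field name in the sample ("reference", "others", "matches", "match_time", "offset") and both file names are assumptions made purely for illustration; adapt them to the actual file.

```python
import json
from statistics import median

# Hypothetical layout: the field names and file names are assumptions,
# not the tool's documented format.
SAMPLE = """
{
  "reference": {"name": "camera.mp4", "duration": 120.0},
  "others": [
    {"name": "mic.wav",
     "matches": [
       {"match_time": 10.0, "offset": 1.50},
       {"match_time": 42.5, "offset": 1.52},
       {"match_time": 90.0, "offset": 1.51}
     ]}
  ]
}
"""


def summarize(document):
    """Return the median offset and matched time range per file."""
    data = json.loads(document)
    summary = {}
    for item in data["others"]:
        if not item["matches"]:
            continue  # nothing matched for this file
        offsets = [m["offset"] for m in item["matches"]]
        times = [m["match_time"] for m in item["matches"]]
        summary[item["name"]] = {
            "median_offset": median(offsets),
            "first_match": min(times),
            "last_match": max(times),
        }
    return summary


result = summarize(SAMPLE)
```

The median is used here because a few spurious fingerprint matches would otherwise skew an average offset.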