Auphonic Multitrack Processor

Auphonic Multitrack takes multiple parallel input audio tracks, analyzes and processes them both individually and combined, and creates the final mixdown automatically.
Leveling, dynamic range compression, gating, noise and hum reduction, crosstalk removal, ducking and filtering can be applied automatically according to the analysis of each track.
Loudness normalization and true peak limiting are applied to the final mixdown.

Auphonic Multitrack no longer works on macOS Ventura and later!


Most of our new algorithms use custom hardware, so unfortunately we cannot update the current desktop apps. Please consider using our web service instead!


Audio Algorithms

[Figures: multitrack track algorithms; multitrack master algorithms]

All included algorithms were trained with data from our Web Service and they keep learning and adapting to new audio signals every day.
Auphonic Multitrack is most suitable for programs where dialog/speech is the most prominent sound: podcasts, radio, broadcast, lecture and conference recordings, film and videos, screencasts etc.
It is not built for music-only productions.
For more details about our multitrack algorithms please see Multitrack Post Production Algorithms!

The following algorithms are included in the Auphonic Multitrack Processor:

Multitrack Adaptive Leveler

The Multitrack Adaptive Leveler analyzes the content of all tracks using machine learning techniques and then balances any variations in loudness by:

  • Classifying between music, background and speech segments

  • Analysing which speaker is active in each track in order to produce a balanced loudness between each speaker

  • Applying dynamic range compression on speech tracks only, whilst music segments are kept as natural as possible

  • Correcting loudness variations within each track that may be caused by changing microphone distance, different songs in music tracks etc.

  • Isolating unwanted segments (noise, wind, breathing, silence etc.) and then excluding them from being amplified

For more details see our Multitrack Audio Example 1, Multitrack Audio Example 3 and our Singletrack Adaptive Leveler Examples.

Adaptive Noise Gate

If audio is recorded with multiple microphones and all signals are mixed, the noise of all tracks will add up as well. The Adaptive Noise Gate decreases the volume of segments where a speaker is inactive, but does not change segments where a speaker is active. All parameters of the gate (threshold, ratio, sustain, etc.) are set automatically according to the current context. This results in much less noise in the final mixdown.
For details and audio examples see Multitrack Audio Example 2 and the other Multitrack Audio Examples.
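
As a rough illustration of what gate parameters such as threshold and ratio mean, the following is a minimal Python sketch of a textbook downward-expander gain curve. It is not Auphonic's implementation: the Adaptive Noise Gate sets its parameters automatically and works from speaker activity, whereas the function name `gate_gain_db` and the fixed default values here are assumptions chosen only for the example.

```python
def gate_gain_db(level_db, threshold_db=-50.0, ratio=4.0):
    """Static gain (in dB) of a simple downward expander.

    Levels at or above the threshold pass unchanged (0 dB gain).
    Below the threshold, every dB under the threshold is stretched
    to `ratio` dB under it, which pushes quiet noise further down.
    The default threshold and ratio are hypothetical values.
    """
    if level_db >= threshold_db:
        return 0.0
    return (threshold_db - level_db) * (1.0 - ratio)


def db_to_linear(gain_db):
    """Convert a gain in dB to a linear amplitude factor."""
    return 10.0 ** (gain_db / 20.0)
```

With these example defaults, a segment at -60 dB is 10 dB below the threshold, so it is attenuated by a further 30 dB, while a segment at -20 dB is left untouched.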

Crossgate: Crosstalk (Spill) Removal

When recording multiple people with multiple microphones in one room, the voice of speaker 1 will also be picked up by the microphone of speaker 2, which creates crosstalk (spill) and a reverb or echo-like effect.
Our multitrack algorithms know exactly when and in which track a speaker is active and can therefore remove the same signal (crosstalk) from all other tracks. This results in a more direct signal and decreases ambience and reverb.
Listen to the details in Multitrack Audio Example 4 and the other Multitrack Audio Examples.

Automatic Ducking and Foreground/Background Classifier

Auphonic automatically decides which parts of your track should be foreground or background: Speech tracks will always be in foreground. In music tracks, all segments (e.g. intros, songs, background music, etc.) are classified as background or foreground segments and mixed to the production accordingly. If our classifiers do not work for your content, it is possible to force the track to be foreground or background.
We also support ducking to automatically reduce the level of a track if speakers in other tracks are active. This is useful for intros/outros, translated speech or for music segments, which should be softer if someone is speaking.
Listen to the Automatic Ducking Example and to the Fore/Background Audio Example.
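
The ducking idea can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not the algorithm used by Auphonic: speaker activity is assumed to be already known per block, the reduction is a fixed amount without any smoothing, and the names `ducking_gains` and `duck_db` are hypothetical.

```python
def ducking_gains(speech_activity, duck_db=-15.0):
    """Return one linear gain per block for the track to be ducked.

    speech_activity: list of per-track lists of booleans, one flag per
    block, True where a speaker in that track is active.
    duck_db: hypothetical fixed level reduction while anyone speaks.
    """
    ducked = 10.0 ** (duck_db / 20.0)
    gains = []
    # zip(*...) walks through the blocks of all speech tracks in parallel
    for flags in zip(*speech_activity):
        gains.append(ducked if any(flags) else 1.0)
    return gains
```

A real implementation would fade between the two gain values instead of switching abruptly at block boundaries.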

Multitrack Noise and Hum Reduction

Our Noise Reduction algorithms remove broadband background noise in audio signals with slowly varying backgrounds. First each track is segmented into regions with different background noise characteristics, then a noise print is extracted in each region and removed from the audio signal. In automatic mode, a classifier decides if and how much noise reduction is necessary.
The Hum Reduction algorithms identify power line hum and all its partials in each track. Afterwards the partials are removed as necessary with sharp filters.
For more details and audio examples see Noise Reduction and Multitrack Audio Examples.
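
To make the term "partials" concrete: power line hum has a base frequency of 50 Hz or 60 Hz, depending on the region, and its partials sit at integer multiples of that frequency. The small Python sketch below only lists these candidate frequencies; the function name `hum_partials` and the upper limit are assumptions for the example, and it does not show how Auphonic detects or filters them.

```python
def hum_partials(base_hz, max_hz):
    """List the hum base frequency and its integer multiples up to max_hz."""
    if base_hz <= 0:
        raise ValueError("base_hz must be positive")
    count = int(max_hz // base_hz)
    return [base_hz * k for k in range(1, count + 1)]
```

For a 50 Hz base frequency and a limit of 300 Hz, this yields 50, 100, 150, 200, 250 and 300 Hz as the frequencies where a sharp filter might be needed.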

Loudness Normalization with True Peak Limiter

Global Loudness Normalization calculates the loudness of the final mixdown and applies a constant gain to reach a defined target level. The loudness is calculated according to the latest broadcast standards (ITU-R BS.1770) and Auphonic supports loudness targets for television (EBU R128, ATSC A/85), radio, podcasts, mobile and more.
A True Peak Limiter, with 4x oversampling to avoid intersample peaks, is used to limit the final output signal to the selected maximum true peak level and ensures compliance with the selected loudness target.
For more details see Global Loudness Normalization.
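
The constant gain mentioned above follows directly from the difference between the target and the measured loudness. A minimal Python sketch, assuming the program loudness has already been measured according to ITU-R BS.1770 (the measurement itself is not shown, and the function name `normalization_gain` is hypothetical):

```python
def normalization_gain(measured_lufs, target_lufs):
    """Return the constant gain as (gain in dB, linear amplitude factor).

    Loudness differences in LU correspond one-to-one to gain in dB,
    so the required gain is simply target minus measured loudness.
    """
    gain_db = target_lufs - measured_lufs
    linear = 10.0 ** (gain_db / 20.0)
    return gain_db, linear
```

For example, a mixdown measured at -23 LUFS needs a gain of +7 dB to reach a target of -16 LUFS. Because a positive gain can push peaks above the allowed maximum, the true peak limiter is still required afterwards.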

High Pass Filter

An adaptive High Pass Filter cuts unwanted low frequencies, depending on the context (speech, music or noise) of each track.
For more details see Multitrack Adaptive Filtering.

Presets and Parallel Processing


The Auphonic Multitrack Processor includes a Parallel Task Queue (multi core processing) with configurable CPU, RAM and disk usage.
Just drag and drop your tracks into the program window and the multitrack production will be processed in parallel (with each track processed on its own core), using your current settings. During processing, the progress is shown in the application.
All your current settings (audio algorithm parameters, parameters of individual tracks, output file options, intros/outros, warnings, hardware settings, etc.) can be saved as Presets.

Processing Statistics and Warnings


Audio Processing Statistics of the master and individual tracks display details about what our algorithms are changing in your files.
They can be used to check compliance with Loudness Standards (Program Loudness, Maximum True Peak Level, LRA - see EBU TECH 3341, Section 1) and certain regulations for commercials (Max Momentary, Max Short-term Loudness - see Section 2.2). They also show how much our Adaptive Leveler changes your levels (Gain mean, min, max), statistics about your input tracks (SNR, Background and Signal Level) and much more.

Statistics can be exported as files (manually or automatically) in machine readable JSON, in YAML, or in a human readable text format. The exact file format is the same as in our Web Service and is documented here.
It is also possible to set up Warnings for quality control or as alerts to manually check problematic segments in your audio: for example, you don’t want a MaxMomentary loudness >= -19 LUFS or an Output Loudness Range >= 20 LU. Warnings will be displayed in the application and are also exported to processing statistics files.
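
A similar check can be run on exported statistics in your own scripts. The Python sketch below is only an illustration: it works on a plain dictionary of measured values, and the key names `max_momentary_lufs` and `loudness_range_lu` are hypothetical. They are not the field names of the exported statistics file, so please consult the documented file format before adapting it.

```python
def check_warnings(stats, thresholds):
    """Return a message for every measured value that reaches its threshold.

    stats: dict of measured values (hypothetical key names).
    thresholds: dict with the same keys and the limit for each value.
    """
    messages = []
    for name, limit in thresholds.items():
        value = stats.get(name)
        if value is not None and value >= limit:
            messages.append(f"{name}: {value} >= {limit}")
    return messages


# Thresholds taken from the example above: -19 LUFS and 20 LU
example_thresholds = {"max_momentary_lufs": -19.0, "loudness_range_lu": 20.0}
example_stats = {"max_momentary_lufs": -18.2, "loudness_range_lu": 12.0}
warnings_found = check_warnings(example_stats, example_thresholds)
```

In this example only the Max Momentary value triggers a warning, because -18.2 LUFS is above the -19 LUFS limit while the loudness range of 12 LU stays below 20 LU.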

Supported Input and Output File Formats


We support a wide range of input and a limited selection of output file formats. Sample rate and bit depth conversions use the high quality resampling and dithering algorithms of SoX.
Please use lossless audio formats whenever possible.
It is also possible to export all individual, processed tracks as separate files in WAV, WAV (float) or FLAC format for further editing.

Supported Input File Formats:

WAV, WAV (float), AIFF, FLAC, MP3, Ogg Vorbis, Opus
Mac OS X only: MP4/M4A/M4B, AAC, ALAC, CAF, AC3, MP2, 3GP

Supported Output File Formats:

WAV, WAV (float), AIFF, MP3 (via lame), FLAC, Ogg Vorbis, Opus
Mac OS X only: AAC (M4A)

Parameters and User Interface

Below is a list with descriptions of all available parameters and screenshots.

Start Screen


Empty start screen of the Auphonic Multitrack Processor.
The Master Audio Algorithms Box (bottom left) shows basic controls to enable/disable the Adaptive Leveler, Crossgate, and to select a Target Loudness level.
Drag and drop files or folders to load audio tracks.

Files Loaded


Now four tracks are added to the Auphonic Multitrack Processor and we are ready to run.
The Track Audio Algorithms Box (top right) shows controls to configure the algorithms for a specific track.
Our Noise and Hum Reduction algorithms can be disabled, set to Auto or set to a specific reduction amount in dB.
Possible values for the parameter Fore/Background are Auto, Foreground, Background or Duck this Track.



During processing, a progress bar is shown at the top right.
Audio Processing Statistics of a selected track are displayed in the right table: the format is exactly the same as in our web service and is documented here. Statistics can be exported as files in JSON, YAML or in a human readable text format.

Audio Processing Statistics


After processing, you can see the Audio Processing Statistics of the generated master output file.
Change the name in Output File Basename before processing to set the exact output filename.

Algorithm Details Preferences


The Algorithm Details Preferences might be used by audio experts to manually adjust details of the loudness measurement algorithms and to select a custom maximum true peak level.
If you don’t know what that means, please leave it at the default values!

Maximum peak level

Maximum true peak level of processed output files. Use Auto for a reasonable value according to the selected loudness target: -1dBTP for EBU R128 and higher, -2dBTP for ATSC A/85 and lower.
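
The Auto rule can be written down as a small Python sketch. It assumes the usual target values of -23 LUFS for EBU R128 and -24 LUFS for ATSC A/85; the function name `auto_max_true_peak` is hypothetical, and treating every target quieter than -23 LUFS like ATSC A/85 is an assumption of this example, not a documented behavior.

```python
def auto_max_true_peak(target_lufs):
    """Pick a maximum true peak level (dBTP) from the loudness target.

    -1 dBTP for targets of -23 LUFS (EBU R128) and louder,
    -2 dBTP for targets of -24 LUFS (ATSC A/85) and quieter.
    Values between -23 and -24 LUFS fall into the second group here,
    which is an assumption made for this sketch.
    """
    return -1.0 if target_lufs >= -23.0 else -2.0
```

For instance, a podcast target of -16 LUFS would get -1 dBTP, while a target of -24 LUFS would get -2 dBTP.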

Peak measurement algorithm

Select sample or true peak measurement for final limiter.

LUFS gate

Disable the loudness measurement gate as defined in ITU-R BS.1770-2. Use Auto to disable the gate only if ATSC A/85 is selected as loudness target.

Output Format Preferences


Preference tab to select the Output Audio Format. The final multitrack mixdown will be encoded into this format.

Output format

Set format of processed output file.

Sample rate

Set sample rate of the output file.

Bit depth

Set precision of the output file.

Mono mixdown

Always mix the output files down to mono.

Bitrate for lossy compression

Set target bitrate for lossy compressed output files.

Output Files Preferences


Details about the location of the final mixdown Output File.
It’s also possible to export the individual, processed tracks for further editing and to export processing statistics.

Output folder

Leave empty to put output files in input file folder.

If output file exists

Select what should be done if an output file already exists.

Export processed tracks

Export all processed tracks as separate files in the selected format. They will be placed in the output folder.

Export processing statistics file

Automatically export file with processing statistics for each audio file.

Intro/Outro Preferences


Select an Intro and/or Outro file, which will be added at the start/end of your multitrack production.

Intro audio file

Select an intro audio file which will be prepended to all processed audio files. Leave empty to disable the intro file.

Outro audio file

Select an outro audio file which will be appended to all processed audio files. Leave empty to disable the outro file.

Warnings Preferences


Warnings can be used for quality control or as alerts to manually check problematic segments. They will be displayed in the Auphonic Multitrack application and are also exported to processing statistics files.

Output Max Short-term Loudness (Max S) exceeds

Warn if output Maximum Short-term Loudness exceeds threshold.

Output Max Momentary Loudness (Max M) exceeds

Warn if output Maximum Momentary Loudness exceeds threshold.

Output Loudness Range (LRA) exceeds

Warn if output Loudness Range exceeds threshold.

Hardware Preferences


Adjust RAM, CPU usage and the temporary processing directory of our audio processing task queue.

Nr of CPUs to use

Max number of CPUs to use for parallel processing in our task queue.

Audio processing blocksize (RAM usage)

The higher the value, the more RAM is necessary. Please decrease if you get memory errors or system slowdowns!

Temporary processing directory (disk usage)

Temporary files will be created in this directory during processing (much faster on an SSD). Leave empty to use the default temporary directory of your operating system!