Multitrack Post Production Algorithms

An Auphonic Multitrack post production takes multiple parallel input audio tracks/files, analyzes and processes them individually as well as in combination, and creates the final mixdown automatically.
Leveling, dynamic range compression, gating, noise and hum reduction, crosstalk removal, ducking and filtering can be applied automatically according to the analysis of each track.
Loudness normalization and true peak limiting are applied to the final mixdown.

All algorithms were trained with data from our web service and they keep learning and adapting to new audio signals every day.
Auphonic Multitrack is most suitable for programs in which dialog/speech is the most prominent content: podcasts, radio, broadcast, lecture and conference recordings, film and video, screencasts, etc.
It is not built for music-only productions.

Audio examples with detailed explanations of what our algorithms are doing can be found at


Please read the Multitrack Best Practice before using our multitrack algorithms!

Multitrack Adaptive Leveler

Similar to our singletrack Adaptive Leveler, the Multitrack Adaptive Leveler corrects level differences between tracks, speakers, music and speech within one audio production and applies dynamic range compression on each track to achieve a balanced overall loudness.
Using knowledge from the signals of all tracks allows us to produce much better results than our singletrack version.

We analyze all input audio tracks to classify speech, music and background segments and process them individually:

  • We know exactly which speaker is active in which track and can therefore produce a balanced loudness between tracks.
  • Loudness variations within one track (changing microphone distance, quiet speakers, different songs in music tracks etc.) are corrected.
  • Dynamic range compression is applied to each speech track automatically (see also Loudness Normalization and Compression). Music segments are kept as natural as possible.
  • Background segments (noise, wind, breathing, silence etc.) are classified and won’t be amplified.
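
The per-segment leveling described above can be sketched roughly in Python. This is a loose illustration under assumed inputs (segment labels from a classifier, a simple RMS target instead of real loudness units, and an invented "gentler gain for music" rule), not Auphonic's actual implementation:

```python
import numpy as np

def level_segments(tracks, labels, target_rms=0.1):
    """Hypothetical sketch: bring speech/music segments of each track to a
    common target level while leaving background segments untouched.
    tracks : list of 1-D numpy arrays (one per input track)
    labels : per-track list of (start, end, kind) segments,
             kind in {"speech", "music", "background"}"""
    out = []
    for samples, segs in zip(tracks, labels):
        y = samples.copy()
        for start, end, kind in segs:
            seg = y[start:end]
            if kind == "background" or len(seg) == 0:
                continue  # background (noise, breathing, silence) is not amplified
            rms = np.sqrt(np.mean(seg ** 2))
            if rms > 0:
                gain = target_rms / rms
                if kind == "music":
                    # assumption: apply only part of the correction so music
                    # segments stay as natural as possible
                    gain = gain ** 0.5
                y[start:end] = seg * gain
        out.append(y)
    return out
```

Since the classifier labels are per track, a quiet speaker's track gets more gain than a loud speaker's, which is what balances loudness between tracks.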

Annotated Multitrack Adaptive Leveler Audio Examples:
Please listen to our Multitrack Audio Example 1, Multitrack Audio Example 3 and the Singletrack Adaptive Leveler Audio Examples.
They include detailed annotations to illustrate what our algorithms are doing!

Adaptive Noise Gate

If audio is recorded from multiple microphones and all signals are mixed, the noise of all tracks will add up as well. The Adaptive Noise Gate decreases the volume of segments where a speaker is inactive, but does not change segments where a speaker is active.
This results in much less noise in the final mixdown.

Our classifiers know exactly which speaker or music segment is active in which track. Therefore we can set all parameters of the Gate / Expander (threshold, ratio, sustain, etc.) automatically according to the current context.
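
A minimal sketch of the idea, assuming a per-sample speaker-activity signal from a classifier (the function name, gain floor and fade length are invented for illustration; a real gate/expander sets its parameters per context, as described above):

```python
import numpy as np

def adaptive_gate(track, active, floor_gain=0.1, fade=100):
    """Hypothetical adaptive noise gate sketch: attenuate a track to
    floor_gain wherever its speaker is inactive, leave active segments
    untouched, and smooth the gain curve to avoid audible clicks."""
    gain = np.where(active, 1.0, floor_gain).astype(float)
    if fade > 1:
        kernel = np.ones(fade) / fade  # short linear fade between states
        gain = np.convolve(gain, kernel, mode="same")
    return track * gain
```

Mixing the gated tracks instead of the raw tracks is what keeps the noise of inactive microphones out of the final mixdown.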

Annotated Adaptive Noise Gate Audio Examples:
Please listen to our Multitrack Audio Example 2 and the other Multitrack Audio Examples.
They include detailed annotations to illustrate what our algorithms are doing!

Crossgate: Crosstalk (Spill, Reverb) Removal

When recording multiple people with multiple microphones in one room, the voice of speaker 1 will also be recorded in the microphone of speaker 2. This Crosstalk / Spill results in a reverb or echo-like effect in the final audio mixdown.
If you try to correct this with a Noise Gate / Expander, it is very difficult to set suitable parameters, because the crosstalk might be very loud.

Our multitrack algorithms know exactly when and in which track a speaker is active and can therefore remove the same or correlated signals (the crosstalk) from all other tracks. This results in a more direct signal and decreases ambience and reverb.
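
As a rough illustration of the principle (not Auphonic's actual algorithm, which works on correlated signals rather than a plain time-domain gate), crosstalk suppression driven by per-track activity could look like this; all names and the spill gain are assumptions:

```python
import numpy as np

def crossgate(tracks, activity, spill_gain=0.2):
    """Hypothetical crossgate sketch: whenever a speaker is active in one
    track, the same time span in all *other* tracks (which then mostly
    contains crosstalk/spill) is attenuated.
    tracks   : list of equally long 1-D numpy arrays
    activity : boolean array per track, True where that speaker talks"""
    out = [t.copy() for t in tracks]
    for i, act in enumerate(activity):
        for j in range(len(tracks)):
            if j == i:
                continue
            # attenuate spill only where track j's own speaker is silent,
            # so overlapping speech is never damaged
            spill = act & ~activity[j]
            out[j][spill] *= spill_gain
    return out
```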

Annotated Crossgate Audio Examples:
Please listen to our Multitrack Audio Example 4 and the other Multitrack Audio Examples.
They include detailed annotations to illustrate what our algorithms are doing!

Multitrack Noise and Hum Reduction


Our Noise Reduction Algorithms remove broadband background noise and hiss in audio tracks with slowly varying backgrounds - see also singletrack Noise and Hiss Reduction.
First, the audio of each track is analyzed and segmented into regions with different background noise characteristics, and a Noise Print is extracted for each region.
Then a classifier decides how much noise reduction is necessary in each region (because too much noise reduction might result in artifacts) and removes the noise from the audio signal automatically.
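
The core reduction step can be illustrated with classic spectral subtraction on a single frame; this is a textbook sketch under assumed inputs (a magnitude-spectrum Noise Print and an `amount` that stands in for the classifier's per-region decision), not the production algorithm:

```python
import numpy as np

def spectral_subtract(frame, noise_print, amount=1.0):
    """Hypothetical sketch of one noise-reduction step: subtract a noise
    magnitude spectrum (the "Noise Print") from a signal frame.
    amount in [0, 1] models how much reduction the classifier allows,
    since too much reduction would cause artifacts."""
    spec = np.fft.rfft(frame)
    mag, phase = np.abs(spec), np.angle(spec)
    # subtract the noise estimate, never letting magnitudes go negative
    cleaned = np.maximum(mag - amount * noise_print, 0.0)
    return np.fft.irfft(cleaned * np.exp(1j * phase), n=len(frame))
```

A real implementation runs this frame by frame with overlap-add and a separate Noise Print per region.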

The Multitrack Hum Reduction algorithms (included in Noise and Hum Reduction) identify and remove power line hum:
First, each track is analyzed and segmented into regions with different hum characteristics, and the hum base frequency (50Hz or 60Hz) and the strength of all its partials (100Hz, 150Hz, 200Hz, 250Hz, etc.) are classified in each region.
Afterwards the base frequency and all partials are removed according to their strength with sharp filters and broadband noise reduction.
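
A simplified sketch of the filtering step, using brick-wall spectral notches for clarity (function name, notch width and partial count are assumptions; a real implementation would scale each notch by the classified strength of that partial and combine it with broadband noise reduction):

```python
import numpy as np

def remove_hum(signal, sr, base=50.0, n_partials=5, width=1.0):
    """Hypothetical hum-removal sketch: notch out the hum base frequency
    and its partials (2*base, 3*base, ...) with sharp spectral notches.
    width is the notch half-width in Hz."""
    spec = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), 1.0 / sr)
    for k in range(1, n_partials + 1):
        f = k * base  # base frequency and each partial
        spec[np.abs(freqs - f) <= width] = 0.0
    return np.fft.irfft(spec, n=len(signal))
```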

Because noise profiles can be extracted in individual tracks, the multitrack noise and hum reduction algorithms can produce much better results compared to our singletrack Noise and Hiss Reduction and Hum Reduction algorithms.
It is also possible to activate Noise and Hum Reduction in a few specific tracks only, e.g. those recorded in a bad environment.

See also our Noise Reduction Usage Tips!

Annotated Multitrack Noise Reduction Audio Examples:
Please listen to our Multitrack Audio Examples, Noise Reduction Audio Examples and Hum Reduction Audio Examples.
They include detailed annotations to illustrate what our algorithms are doing!

Automatic Ducking, Foreground and Background Tracks


The parameter Fore/Background controls whether a track should be in the foreground, in the background, or ducked.
Ducking automatically reduces the level of a track if speakers in other tracks are active. This is useful for intros/outros, translated speech, or for music that should be softer while someone is speaking.

If Fore/Background is set to Auto, Auphonic automatically decides which parts of your track should be in foreground or background:

  • Speech tracks will always be in foreground.
  • In music tracks, all segments (e.g. songs, intros, etc.) are classified as background or foreground segments and are mixed into the production accordingly.
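
The ducking behavior described above can be sketched like this; the gain value and fade length are invented placeholders (real ducking parameters depend on the production), and `speech_active` stands in for the activity signal derived from the speech tracks:

```python
import numpy as np

def duck(music, speech_active, duck_gain=0.3, fade=200):
    """Hypothetical ducking sketch: lower a music/background track while
    speakers in other tracks are active, with a short fade so the gain
    changes are not abrupt."""
    gain = np.where(speech_active, duck_gain, 1.0).astype(float)
    if fade > 1:
        gain = np.convolve(gain, np.ones(fade) / fade, mode="same")
    return music * gain
```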

The automatic fore/background classification should work most of the time.
However, in special or artistic productions (e.g. very complex background music or a background speech track), or if you do the ducking manually in your audio editor, you can force a track to be in the foreground or in the background.
See also How should I set the Parameter Fore/Background.

Annotated Multitrack Fore/Background/Ducking Audio Examples:
Please listen to the Automatic Ducking Audio Example and to the Fore/Background Example.
They include detailed annotations to illustrate what our algorithms are doing!

Multitrack Adaptive Filtering

Our adaptive High-Pass Filtering algorithm cuts disturbing low frequencies and interferences, depending on the context of each track.
First we classify the lowest wanted signal in every track segment: male/female speech base frequency, frequency range of music (e.g. lowest base frequency), noise, etc. Then all unnecessary low frequencies are removed adaptively in every audio segment, so that interferences are removed but the overall sound of the track is preserved.
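
To illustrate the adaptation, here is a toy per-segment high-pass driven by an assumed "lowest wanted frequency" from a classifier (roughly 85Hz for male and 165Hz for female speech fundamentals); the brick-wall FFT filter and the 0.8 headroom factor are simplifications for illustration only:

```python
import numpy as np

def adaptive_highpass(segment, sr, lowest_wanted_hz):
    """Hypothetical sketch: remove everything below a cutoff placed just
    under the classified lowest wanted frequency of this segment, so
    low-frequency interferences go but the signal itself is preserved."""
    cutoff = 0.8 * lowest_wanted_hz  # leave headroom below the fundamental
    spec = np.fft.rfft(segment)
    freqs = np.fft.rfftfreq(len(segment), 1.0 / sr)
    spec[freqs < cutoff] = 0.0
    return np.fft.irfft(spec, n=len(segment))
```

Because the cutoff is chosen per segment, a male speech segment keeps its low fundamental while a segment containing only rumble can be filtered much more aggressively.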

We use zero-phase (linear) filtering algorithms to avoid asymmetric waveforms: in asymmetric waveforms, the positive and negative amplitude values are disproportionate - please see Asymmetric Waveforms: Should You Be Concerned?
Asymmetrical waveforms are quite natural and not necessarily a problem. They are particularly common in recordings of speech and vocals, and can be caused by low-end filtering. However, they limit the amount of gain that can be safely applied without introducing distortion or clipping due to aggressive limiting.
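
The zero-phase trick itself is standard and easy to demonstrate: running the same filter forward and then backward cancels its phase shift (the scheme behind filtfilt-style filtering). The one-pole smoother below is a toy stand-in, not our actual filter:

```python
import numpy as np

def onepole(x, a=0.5):
    """A simple recursive low-pass; applied once, it delays and skews
    the waveform (non-zero phase)."""
    y = np.zeros_like(x, dtype=float)
    acc = 0.0
    for i, v in enumerate(x):
        acc = a * acc + (1 - a) * v
        y[i] = acc
    return y

def zero_phase(x, a=0.5):
    """Run the same filter forward, then backward: the phase shifts of
    the two passes cancel, so a symmetric input stays symmetric and no
    waveform asymmetry is introduced."""
    return onepole(onepole(x, a)[::-1], a)[::-1]
```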

Loudness Normalization and True Peak Limiter


Global Loudness Normalization and a True Peak Limiter can be applied after the final mixdown of all individual tracks.
They work in the same way as our singletrack Global Loudness Normalization and True Peak Limiter.
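
Conceptually, this final stage is a single global gain followed by peak control. The sketch below is heavily simplified: real loudness normalization measures LUFS per ITU-R BS.1770 (not plain RMS), and a real true peak limiter oversamples and applies gain reduction rather than hard clipping:

```python
import numpy as np

def normalize_and_limit(mixdown, target_rms=0.1, true_peak=0.98):
    """Hypothetical sketch of the final stage: one global gain brings the
    mixdown to a target level (RMS stands in for LUFS here), then a hard
    clip stands in for the true peak limiter."""
    rms = np.sqrt(np.mean(mixdown ** 2))
    gained = mixdown * (target_rms / rms) if rms > 0 else mixdown
    return np.clip(gained, -true_peak, true_peak)
```

The sketch also shows why this must be the last step: any later edit or re-mix would change the level again and invalidate the normalization.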

If you export the individual, processed tracks of a Multitrack Production (for further editing), they are not loudness normalized yet!
The loudness normalization must always be the last step in the processing chain, after the final mixdown and all other adjustments!