Production

A production represents one processed audio or video file.
To create a new production in our web system, go to https://auphonic.com/engine/upload/

Common settings and metadata for a group of productions (e.g. a podcast series) can be stored in a Preset. You can select your preset when creating a new production.

Note

If you want to process multiple parallel tracks/files in one production, you can use a Multitrack Production instead!

Audio or Video Source

../_images/UploadMethod.png

Select your main input audio or video source file.
You can upload the file directly in the browser, use an HTTP link or any supported External Service (FTP, Dropbox, S3, Google Drive, SFTP, WebDAV and many more - please register the service first).

Supported audio and video filetypes:
MP2, MP3, MP4, M4A, M4B, M4V, WAV, OGG, OGA, OPUS, FLAC, ALAC, MPG, MOV, AC3, EAC3, AIF, AIFC, AIFF, AIFFC, AU, GSM, CAF, IRCAM, AAC, SND, VOC, VORBIS, VOX, WAVPCM, WMA, ALAW, APE, MPC, MPC8, MULAW, OMA, RM, TTA, W64, SPX, 3PG, 3G2, 3GPP, 3GP, 3GA, TS, MUS, AVI, DV, FLV, IPOD, MATROSKA, WEBM, MPEG, OGV, VOB, MKV, MK3D, MKA, MKS, QT, MXF.
Please let us know if you need an additional format.

For lossy codecs like MP3, please use a bitrate of 192k or higher!

Intro and Outro

../_images/IntroOutro.png

Automatically add an Intro/Outro to your production.
IMPORTANT: this feature is audio-only and does not work with video productions!

As intros/outros are intended to be used multiple times, they are only Loudness Normalized to match the loudness of your production, without further Auphonic processing (no leveling, filtering, noise reduction, etc.). You should therefore edit/process your intro/outro beforehand.
For a detailed description of our intro/outro feature, please see the blog post Automatic Intros and Outros in Auphonic.

Select Intro File
Select your intro audio from a local file, HTTP or an External Service
(Dropbox, SFTP, S3, Google Drive, SoundCloud, etc. - please register the service first).
Intro Overlap
Set overlap time in seconds of intro end with main audio file start, for details see Overlapping Intros/Outros.
IMPORTANT: ducking must be added manually to intro audio file!
Select Outro File
Select your outro audio from a local file, HTTP or an External Service
(Dropbox, SFTP, S3, Google Drive, SoundCloud, etc. - please register the service first).
Outro Overlap
Set overlap time in seconds of outro start with main audio file end, for details see Overlapping Intros/Outros.
IMPORTANT: ducking must be added manually to outro audio file!
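
The effect of the two overlap settings on the final timeline can be sketched as follows. This is only an illustration: the function name and the simple "shift the start time by the overlap" model are assumptions, not Auphonic's implementation.

```python
def production_timeline(intro_len, main_len, outro_len,
                        intro_overlap=0.0, outro_overlap=0.0):
    """Return (main_start, outro_start, total_length) in seconds.

    Assumed model: the main audio starts `intro_overlap` seconds before
    the intro ends, and the outro starts `outro_overlap` seconds before
    the main audio ends.
    """
    if not 0 <= intro_overlap <= min(intro_len, main_len):
        raise ValueError("intro overlap must fit into intro and main audio")
    if not 0 <= outro_overlap <= min(outro_len, main_len):
        raise ValueError("outro overlap must fit into outro and main audio")

    main_start = intro_len - intro_overlap
    outro_start = main_start + main_len - outro_overlap
    total_length = outro_start + outro_len
    return main_start, outro_start, total_length


# 10 s intro overlapping 2 s, 600 s main audio, 15 s outro overlapping 3 s
print(production_timeline(10.0, 600.0, 15.0, 2.0, 3.0))
```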

Basic Metadata

../_images/BasicMetadata.png

Basic metadata (title, cover image, artist, album, track) for your production.
Metadata tags and cover images from input files will be imported automatically into empty fields!

We correctly map metadata to multiple Output Files.
For details see the following blog posts: ID3 Tags Metadata (used in MP3 output files), Vorbis Comment Metadata (used in FLAC, Opus and Ogg Vorbis output files) and MPEG-4 iTunes-style Metadata (used in AAC, M4A/M4B/MP4 and ALAC output files).

Cover Image
Add a cover image or leave empty to import the cover image from your input file.
If a Video Output File or YouTube export is selected, Auphonic generates a video with cover/chapter image(s) automatically!

Extended Metadata

../_images/ExtendedMetadata.png

Extended metadata (subtitle, summary, genre, etc.) for your production.
Metadata tags from input files will be imported automatically into empty fields!

We correctly map metadata to multiple Output Files.
For details see the following blog posts: ID3 Tags Metadata (used in MP3 output files), Vorbis Comment Metadata (used in FLAC, Opus and Ogg Vorbis output files) and MPEG-4 iTunes-style Metadata (used in AAC, M4A/M4B/MP4 and ALAC output files).

Subtitle
A subtitle for your production. It must not be longer than 255 characters!
Summary / Description
Here you can write an extended summary or description of your content.
Append Chapter Marks to Summary
Append possible Chapter Marks with time codes and URLs to your Summary.
This might be useful for audio players which don’t support chapters!
Create a Creative Commons License
Link to create your license at creativecommons.org.
Copy the license and its URL into the metadata fields License (Copyright) and License URL!
Tags, Keywords
Tags must be separated by commas!
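
If you prepare metadata in a script before entering it, the two constraints mentioned above (the subtitle length limit and comma-separated tags) can be checked with a small helper. This is a minimal sketch; the function names are hypothetical and not part of any Auphonic tool.

```python
MAX_SUBTITLE_LENGTH = 255  # limit stated for the Subtitle field


def check_subtitle(subtitle):
    """Return the subtitle unchanged, or raise if it is too long."""
    if len(subtitle) > MAX_SUBTITLE_LENGTH:
        raise ValueError(
            f"subtitle has {len(subtitle)} characters, "
            f"maximum is {MAX_SUBTITLE_LENGTH}"
        )
    return subtitle


def split_tags(tags_field):
    """Split a comma-separated tag string into a list of clean tags."""
    return [tag.strip() for tag in tags_field.split(",") if tag.strip()]


print(split_tags("podcast, audio ,, speech"))
```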

Chapter Marks

../_images/ChapterMarks.png

Chapter marks, also called Enhanced Podcasts, can be used for quick navigation within audio files. One chapter might contain a title, an additional URL and a chapter image.
Chapters are written to all supported output file formats (MP3, AAC/M4A, Opus, Ogg, FLAC, ALAC, etc.) and exported to SoundCloud, YouTube and Spreaker. If a video Output File or YouTube export is selected, Auphonic generates a video with chapter images automatically.
For more information about chapters and which players support them, please see Chapter Marks for MP3, MP4 Audio and Vorbis Comment (Enhanced Podcasts).

Chapter marks can be entered directly in our web interface or we automatically Import Chapter Marks from your input audio file.
It’s also possible to import a simple Text File Format with Chapters, upload markers from various audio editors (Audacity Labels, Reaper Markers, Adobe Audition Session, Hindenburg, Ultraschall, etc.), or use our API for Adding Chapter Marks programmatically.
For details, please see How to Import Chapter Marks in Auphonic.

Chapter Start Time
Enter chapter start time in hh:mm:ss.mmm format (examples: 00:02:35.500, 1:30, 3:25.5).
NOTE: You don’t have to add the length of an optional Intro File here!
Chapter Title
Optional title of the current chapter.
Audio players show chapter titles for quick navigation in audio files.
Chapter URL
Enter an (optional) URL with further information about the current chapter.
Chapter Image
Upload an (optional) image with visual information, e.g. slides or photos.
The image will be shown in podcast players while listening to the current chapter, or exported to video Output Files.
Import Chapter Marks from File
Select a Text File Format with a timepoint (hh:mm:ss[.mmm]) and a chapter title in each line or Import Chapter Marks from Audio Editors.
NOTE: We automatically import chapter marks from your input audio file!
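
To make the timepoint format concrete, here is a minimal sketch that reads a chapter text file with one timepoint and one title per line. It assumes that the timepoint and the title are separated by whitespace; the function names are hypothetical, and the exact rules of the importer may differ.

```python
def parse_timepoint(text):
    """Convert 'hh:mm:ss.mmm' or 'mm:ss[.mmm]' into seconds."""
    parts = text.strip().split(":")
    if not 2 <= len(parts) <= 3:
        raise ValueError(f"invalid timepoint: {text!r}")
    seconds = 0.0
    for part in parts:
        # each further field is 60 times finer than the previous one
        seconds = seconds * 60 + float(part)
    return seconds


def parse_chapters(text):
    """Parse lines like '00:02:35.500 Chapter title' into a list of dicts."""
    chapters = []
    for line in text.splitlines():
        fields = line.strip().split(None, 1)
        if not fields:
            continue  # skip empty lines
        title = fields[1].strip() if len(fields) > 1 else ""
        chapters.append({"start": parse_timepoint(fields[0]), "title": title})
    return chapters


example = "00:00:00 Intro\n00:02:35.500 Main topic\n1:30:00 Outlook\n"
for chapter in parse_chapters(example):
    print(chapter)
```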

Output Files

../_images/OutputFiles.png

Add one or multiple output file formats (MP3, MP4, Ogg, WAV, Video, ...) with bitrate, channel and filename settings to a production (see Audio File Formats and Bitrates for Podcasts). All Metadata Fields and Chapter Markers will be mapped to multiple output files. See below for a list of other, specialized output formats.
With Auphonic you can process video input files as well, or automatically generate a video output file from input audio using Cover and Chapter images - for details see Video Input and Output.

Supported audio output file formats:

Other output file formats:

Output File Basename
Set basename (without extension) for all output files or leave it empty to take the original basename of your input file.
Output File Format
For an overview of audio formats see Audio File Formats and Bitrates for Podcasts.
Audio Bitrate (all channels)
Set combined bitrate of all channels of your audio output file.
For details see Audio File Formats and Bitrates for Podcasts.
Filename Suffix (optional)
Suffix for filename generation of the current output file. Leave empty for an automatic suffix!
Filename Ending, Extension
Filename extension of the current output file.
Mono Mixdown
Click here to force a mono mixdown of the current output file.
Split on Chapters
If you have Chapter Marks, this option will split your audio into one file per chapter.
The chapter number is appended to each filename and all files are packed into one ZIP output file.

Speech Recognition

../_images/SpeechRecognition1.png

Auphonic built a layer on top of multiple engines to offer affordable speech recognition in over 80 languages:
1. First you have to connect to an external Speech Recognition Service at the External Services page.
2. Then you can select the speech recognition engine when creating a new Production or Preset.

We send small audio segments to the speech recognition engine and then combine the results, add punctuation and structuring to produce 3 Output Result Files: an HTML transcript, a WebVTT/SRT subtitle file and a JSON/XML speech data file.
If you use a Multitrack Production, we can automatically assign speaker names to all transcribed audio segments.
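
The "combine the results" step for the subtitle output can be illustrated with a small sketch that turns recognized segments into an SRT file. The segment structure (start, end, text) is an assumption made for this example; it is not the format returned by any particular engine, and the real pipeline also adds punctuation and structuring.

```python
def srt_timestamp(seconds):
    """Format seconds as an SRT timestamp: HH:MM:SS,mmm."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def segments_to_srt(segments):
    """Combine (start, end, text) segments into one SRT document."""
    blocks = []
    # sort by start time so that segments may arrive in any order
    for index, (start, end, text) in enumerate(sorted(segments), start=1):
        blocks.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"
        )
    return "\n".join(blocks)


print(segments_to_srt([(2.0, 4.5, "second segment"), (0.0, 2.0, "first segment")]))
```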

For more details about our speech recognition system, the available engines, the produced output files and for some complete examples in English and German, please see Speech Recognition.

Select Service
Select an external service for Automatic Speech Recognition. Please register a service first!
Select Language
Select a language/variant for speech recognition.

Google Speech API

Word and Phrase Hints
Add Word and Phrase Hints to improve speech recognition accuracy for specific keywords and phrases.
Metadata (chapters, tags, title, artist, album) will be added automatically!

Wit.ai Speech Recognition

Wit.ai Language
The language must be set directly in your Wit.ai App.
IMPORTANT: If you need multiple languages, you have to add an additional Wit.ai service for each language!

Amazon Transcribe

Custom Vocabulary
Add Custom Vocabularies to improve speech recognition accuracy for specific keywords and phrases.
Metadata (chapters, tags, title, artist, album) will be added automatically!

Speechmatics

Speechmatics Language
Select a language/variant for speech recognition with Speechmatics.

Publishing / External Services

../_images/ExternalServices.png

Copy one or multiple result files to any External Service (Dropbox, YouTube, (S)FTP, SoundCloud, GDrive, LibSyn, Archive.org, S3, etc.):
1. First you have to connect to an external service at the External Services page.
2. Then you can select the service when creating a new Production or Preset.

When exporting to Podcasting/Audio/Video Services (SoundCloud, YouTube, Libsyn, Podigee, Spreaker, Blubrry, Podlove, etc.), all metadata will be exported as well.
For a complete list and details about all supported services, see Auphonic External Services.

Select Service
Select an external service for outgoing file transfers. Please register your service first!
Output Files to copy
Select which Output File should be copied to the current external service.

YouTube Service

YouTube Privacy Settings
Set your video to Public (everyone can see it), Private (only your account can see it) or
Unlisted (everyone who knows the URL can see it, not indexed by YouTube).
YouTube Category
Select a YouTube category.

Facebook Service

Facebook Distribution Settings

Post to News Feed: The exported video is posted directly to your news feed / timeline.
Exclude from News Feed: The exported video is visible in the videos tab of your Facebook Page/User (see for example Auphonic’s video tab), but it is not posted to your news feed (you can do that later if you want).
Secret: Only you can see the exported video, it is not shown in the Facebook video tab and it is not posted to your news feed (you can do that later if you want).

For more details and examples please see the Facebook Export blog post.

Facebook Embeddable
Choose if the exported video should be embeddable in third-party websites.

SoundCloud Service

SoundCloud Sharing
Set your exported audio to Public or Private (not visible to other users).
SoundCloud Downloadable
Select if users should be able to download your audio on SoundCloud, otherwise only streaming is allowed.
SoundCloud Type
Select a SoundCloud type/category.
SoundCloud Audio File Export
Select an audio output file which should be exported to SoundCloud.
If set to Automatic, Auphonic will automatically choose a file.

Spreaker Service

Spreaker Collection / Show
Select your Spreaker Collection where this track should be published.
Each Collection has a separate RSS feed and can be created in your Spreaker Account.
Spreaker Sharing
Set your exported audio to Public or Private (not visible to other users).
Spreaker Downloadable
If disabled, listeners won’t be able to download this track and it won’t be included in your RSS feed.

Audio Algorithms

../_images/AudioAlgorithms.png

Enable/disable audio algorithms. For more details see Auphonic Post Production Algorithms!
Please don’t change our default values if you don’t know what these parameters mean - they should be a good starting point for most content!
For more control, please use our Advanced Audio Algorithm Parameters.

Adaptive Leveler
Corrects level differences within one file between speakers, music and speech, etc. to achieve a balanced overall loudness.
For more see Adaptive Leveler Details and Advanced Leveler Parameters.
Global Loudness Normalization with True Peak Limiter
Adjusts the global, overall loudness to the specified Loudness Target (using a True Peak Limiter), so that all processed files have a similar average loudness.
For more see Global Loudness Normalization with True Peak Limiter Detail and Advanced Loudness Normalization Parameters.
Loudness Target
Set a loudness target in LUFS for Loudness Normalization, higher values result in louder audio outputs.
The maximum true peak level will be set automatically to -1dBTP for loudness targets >= -23 LUFS (EBU R128) and to -2dBTP for loudness targets <= -24 LUFS (ATSC A/85).
For details and examples see Global Loudness Normalization and True Peak Limiter.
High-Pass Filtering
Classifies the lowest wanted signal (male/female speech, bass in music, etc.) and adaptively filters
unnecessary/disturbing low frequencies in each audio segment.
Automatic Noise and Hum Reduction
Classifies regions with different backgrounds and automatically removes noise and hum in each region.
For more see Noise Reduction, Hum Reduction Details and Advanced Noise&Hum Reduction Parameters.
Noise Reduction Amount
Maximum noise and hum reduction amount in dB, higher values remove more noise. In Auto mode, a classifier decides if and how much noise reduction is necessary (to avoid artifacts).
Set to a custom (non-Auto) value if you prefer more noise reduction or want to bypass our classifier, but be aware, this might result in artifacts or destroy music segments!

Advanced Audio Algorithms

../_images/AdvancedAudioAlgorithms.png

Enable/disable advanced parameters for our Audio Algorithms.

Warning

Please don’t use our advanced algorithm parameters if you don’t understand them!
Use our default settings instead, they are a good starting point for most content.

Note

Advanced algorithm parameters are currently in private beta mode: Join here!

Leveler Parameters

../_images/LevelerParameters.png

The following advanced parameters for our Adaptive Leveler allow you to customize which parts of the audio should be leveled, how much they should be leveled, and how much dynamic range compression should be applied.

Leveler Preset

Select a Leveler Preset to change the Adaptive Leveler algorithm.
This defines which parts of the audio should be leveled:

  • Default Leveler: Our classic, default leveling algorithm. Use it if you are unsure.
  • Foreground Only Leveler: This preset reacts slower and levels foreground parts only. Use it if you have background speech or background music, which should not be amplified.
  • Fast Leveler: A preset which reacts much faster. It is built for recordings with fast and extreme loudness differences, for example, to amplify very quiet questions from the audience in a lecture recording, to balance fast-changing soft and loud voices within one audio track, etc.
  • Amplify Everything: Amplify as much as possible. Similar to the Fast Leveler, but also amplifies non-speech background sounds like noise.

Leveler Dynamic Range

Our default Leveler tries to normalize all speakers to a similar loudness so that a consumer in a car or subway doesn’t feel the need to reach for the volume control. However, in other environments (living room, cinema, etc.) or in dynamic recordings, you might want more level differences (Dynamic Range, Loudness Range / LRA) between speakers and within music segments.

The parameter Dynamic Range controls how much leveling is applied: Higher values result in more dynamic output audio files (less leveling). If you want to increase the dynamic range by 3dB (or LU), just increase the Dynamic Range parameter by 3dB.
We also like to call this Loudness Comfort Zone: within a range between a minimum and a maximum level (the comfort zone), no leveling is applied. So if your input file already has a small dynamic range (is within the comfort zone), our leveler will simply be bypassed.

Example Use Cases:
Higher dynamic range values should be used if you want to keep more loudness differences in dynamic narration or dynamic music recordings (live concert/classical).
It is also possible to utilize this parameter to generate automatic mixdowns with different loudness range (LRA) values for different target environments (very compressed ones like mobile devices or Alexa, very dynamic ones like home cinema, etc.).
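
The comfort zone idea can be pictured with a toy model. The sketch below is NOT the Adaptive Leveler algorithm: it only assumes that each segment has one loudness value and that the zone is centered on a target loudness, which is a simplification made for illustration.

```python
def level_segments(loudness, target, dynamic_range):
    """Toy model: pull per-segment loudness values into a comfort zone.

    The zone spans `dynamic_range` dB (or LU) and is assumed to be
    centered on `target`. Values inside the zone are left untouched,
    values outside are moved to the nearest zone boundary.
    """
    half = dynamic_range / 2.0
    low, high = target - half, target + half
    return [min(max(value, low), high) for value in loudness]


segments = [-30.0, -20.0, -12.0]  # hypothetical segment loudness in LUFS
print(level_segments(segments, target=-18.0, dynamic_range=6.0))
print(level_segments(segments, target=-18.0, dynamic_range=9.0))
```

In this model, raising the Dynamic Range value by 3 dB widens the output range by 3 dB, and an input that already lies inside the zone is returned unchanged.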

Compressor

Select a preset for Micro-Dynamics Compression: The compressor reduces the volume of short and loud spikes like “p”, “t” or laughter (short-term dynamics) and also shapes the sound of your voice (it will sound more/less “processed” or “punchy”).
The Leveler, on the other hand, adjusts mid-term level differences, as done by a sound engineer, using the faders of an audio mixer, so that a listener doesn’t have to adjust the playback volume all the time.
For more details please see Loudness Normalization and Compression of Podcasts and Speech Audio.

Possible values are:

  • Auto: The compressor setting depends on the selected Leveler Preset. Medium compression is used in Foreground Only and Default Leveler presets, Hard compression in our Fast Leveler and Amplify Everything presets.
  • Soft: Uses less compression.
  • Medium: Our default setting.
  • Hard: More compression, especially tries to compress short and extreme level overshoots. Use this preset if you want your voice to sound very processed, or if you have extreme and fast-changing level differences.
  • Off: No short-term dynamics compression is used at all, only mid-term leveling. Switch off the compressor if you just want to adjust the loudness range without any additional micro-dynamics compression.
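
The behavior of the Auto setting described in the list above amounts to a simple lookup. A minimal sketch, using the preset names from this page as plain strings (the function name is hypothetical):

```python
# Compressor used in Auto mode, depending on the selected Leveler Preset
AUTO_COMPRESSOR = {
    "Default Leveler": "Medium",
    "Foreground Only Leveler": "Medium",
    "Fast Leveler": "Hard",
    "Amplify Everything": "Hard",
}


def effective_compressor(compressor, leveler_preset):
    """Resolve the compressor setting that is actually applied."""
    if compressor != "Auto":
        return compressor  # Soft, Medium, Hard or Off are used as selected
    return AUTO_COMPRESSOR[leveler_preset]


print(effective_compressor("Auto", "Fast Leveler"))
print(effective_compressor("Soft", "Fast Leveler"))
```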

Separate Music/Speech Parameters

Use the switch Separate Music/Speech Parameters (top right), to see separate Adaptive Leveler parameters for music and speech segments, to control all leveling details separately for speech and music parts.

For dialog intelligibility improvements in films and TV, it is important that the speech/dialog level and loudness range is not too soft compared to the overall programme level and loudness range. This parameter allows you to use more leveling in speech parts while keeping music and FX elements less processed.
Note: Speech, music and overall loudness and loudness range of your production are also displayed in our Audio Processing Statistics!

Example Use Case:
Music live recordings or dynamic music mixes, where you want to amplify all speakers (speech dynamic range should be small) but keep the dynamic range within and between music segments (music dynamic range should be high).
Dialog intelligibility improvements for films and TV, without affecting music and FX elements.

Leveler Preset for Music Segments
Select separate Leveler Presets for music and speech segments.
The default music preset is Same as Speech (combined settings for music and speech segments).
Leveler Dynamic Range for Music Segments
Select separate Dynamic Range targets for music and speech segments.
The default music preset is Same as Speech (combined settings for music and speech segments).
Compressor for Music Segments
Select separate Compressor Presets for music and speech segments.
The default music preset is Same as Speech (combined settings for music and speech segments).

Loudness Normalization and True Peak Limiter Parameters

../_images/LoudnormParameters.png

Advanced parameters for our Global Loudness Normalization and True Peak Limiter algorithms.

Loudness Target
Set a loudness target in LUFS for Loudness Normalization, higher values result in louder audio outputs.
Maximum Peak Level
Maximum True Peak Level of the processed output file. Use Auto for a reasonable value according to the selected loudness target: -1dBTP for EBU R128 and higher, -2dBTP for ATSC A/85 and lower.
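
The Auto rule for the Maximum Peak Level can be written down in a few lines. This is a sketch of the rule as stated on this page; the function name is hypothetical, and targets strictly between -24 and -23 LUFS are rejected because the text does not cover them.

```python
def auto_max_true_peak(loudness_target_lufs):
    """Return the automatic maximum true peak level in dBTP."""
    if loudness_target_lufs >= -23:
        return -1.0  # EBU R128 and louder targets
    if loudness_target_lufs <= -24:
        return -2.0  # ATSC A/85 and quieter targets
    raise ValueError("targets between -24 and -23 LUFS are not covered here")


for target in (-16, -23, -24, -31):
    print(target, "LUFS ->", auto_max_true_peak(target), "dBTP")
```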

Noise and Hum Reduction Parameters

../_images/DenoiseParameters.png

In addition to the parameter (Noise) Reduction Amount, we offer two more advanced parameters to control the combination of our Noise and Hum Reduction algorithms.
Behavior of our Noise and Hum Reduction parameter combinations:

Noise Reduction Amount   Hum Base Frequency   Hum Reduction Amount   Behavior
Auto                     Auto                 Auto                   Automatic hum and noise reduction
Auto or > 0                                   Disabled               No hum reduction, only denoise
Disabled                 50Hz                 Auto or > 0            Force 50Hz hum reduction, no denoise
Disabled                 Auto                 Auto or > 0            Automatic dehum, no denoise
12dB                     60Hz                 Auto or > 0            Always do dehum (60Hz) and denoise (12dB)

Noise Reduction Amount
Maximum noise and hum reduction amount in dB, higher values remove more noise. In Auto mode, a classifier decides if and how much noise reduction is necessary (to avoid artifacts).
Set to a custom (non-Auto) value if you prefer more noise reduction or want to bypass our classifier, but be aware, this might result in artifacts or destroy music segments!
Hum Reduction Base Frequency
Set the hum base frequency to 50Hz or 60Hz (if you know it), or use Auto to automatically detect the hum base frequency in each speech region.
Hum Reduction Amount
Maximum hum reduction in dB, higher values remove more hum.
In Auto mode, a classifier decides how much hum reduction is necessary for each speech region.
Set it to a custom value (> 0), if you prefer more hum reduction or want to bypass our classifier.
Use Disable Dehum to disable hum reduction and use our noise reduction algorithms only.
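
The parameter combinations described above can be summarized in a short decision sketch. The strings "Auto" and "Disabled" and the function name are hypothetical stand-ins for the interface values; the assumption that the base frequency does not matter when hum reduction is disabled is an inference, not a documented rule.

```python
def reduction_plan(noise_amount="Auto", hum_frequency="Auto", hum_amount="Auto"):
    """Return (denoise, dehum) descriptions for one parameter combination.

    noise_amount, hum_amount: "Auto", "Disabled" or a maximum amount in dB.
    hum_frequency: "Auto", 50 or 60 (Hz).
    """
    if noise_amount == "Disabled":
        denoise = "off"
    elif noise_amount == "Auto":
        denoise = "automatic"
    else:
        denoise = f"up to {noise_amount} dB"

    if hum_amount == "Disabled":
        dehum = "off"  # assumption: base frequency is irrelevant here
    elif hum_frequency == "Auto":
        dehum = "automatic base frequency"
    else:
        dehum = f"forced {hum_frequency} Hz"
    return denoise, dehum


print(reduction_plan())                      # everything automatic
print(reduction_plan("Disabled", 50, "Auto"))  # dehum only, forced 50 Hz
print(reduction_plan(12, 60, "Auto"))          # denoise and dehum
```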