Category archives: Development

We all know the problem: the content is perfectly prepared, and everything is in place, but the moment you hit the record button, your brain freezes, and what pops out of your mouth is a rain of “ums”, “uhs”, and “mhs” that no listener would enjoy.
Cleaning up a recording like that by manually cutting out every single filler word is a painstaking task.

So we heard your requests to automate filler word removal, started implementing it, and are now very happy to release our new Automatic Filler Cutter feature. See our Audio ...
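To sketch the underlying idea (Auphonic's actual algorithm is not public): if a speech recognition step provides word-level timestamps, the time regions occupied by filler words can be collected and then cut from the recording. The word list and the `filler_regions` helper below are purely illustrative assumptions, not Auphonic's API.

```python
# Hypothetical sketch: collect the time regions of filler words from
# word-level timestamps, so those regions can be cut from the audio.
FILLERS = {"um", "uh", "mh", "erm"}

def filler_regions(words):
    """words: list of (text, start_sec, end_sec); returns (start, end) regions to cut."""
    return [(start, end) for text, start, end in words
            if text.lower().strip(".,!?") in FILLERS]

words = [("so", 0.0, 0.2), ("um", 0.2, 0.6), ("hello", 0.6, 1.0)]
print(filler_regions(words))  # -> [(0.2, 0.6)]
```

The real feature has to handle much harder cases (partial words, breaths, crossfades at cut points), but the region-collection step looks conceptually like this.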

We're thrilled to introduce our Automatic Shownotes and Chapters feature. This AI-powered tool effortlessly generates concise summaries, intuitive chapter timestamps and relevant keywords for your podcasts, audio and video files.
See our Examples and the How To section below for details.

Why do I need Shownotes and Chapters?

In addition to links and other information, shownotes contain short summaries of the main topics of your episode, and inserted chapter marks allow you to timestamp sections with different topics of a podcast or video. This makes your content more accessible and user-friendly, enabling listeners to quickly navigate to specific sections ...
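As a rough illustration of what chapter marks look like as text, the sketch below formats start times into the common `HH:MM:SS Title` form used by many podcast apps. The helper names are our own, not part of Auphonic's API.

```python
def to_timestamp(seconds):
    """Format a start time in seconds as HH:MM:SS."""
    h, rem = divmod(int(seconds), 3600)
    m, s = divmod(rem, 60)
    return f"{h:02d}:{m:02d}:{s:02d}"

def chapter_lines(chapters):
    """chapters: list of (start_seconds, title) -> 'HH:MM:SS Title' lines."""
    return [f"{to_timestamp(start)} {title}" for start, title in chapters]

print("\n".join(chapter_lines([(0, "Intro"), (754, "Main Topic")])))
# 00:00:00 Intro
# 00:12:34 Main Topic
```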

In addition to our Leveler, Denoiser, and Adaptive 'Hi-Pass' Filter, we now release the missing equalization feature with the new Auphonic AutoEQ.
The AutoEQ automatically analyzes and optimizes the frequency spectrum of a voice recording, to remove sibilance (De-esser) and to create a clear, warm, and pleasant sound - listen to the audio examples below to get an idea about what it does.

Screenshot of manually ...

Today we release our first self-hosted Auphonic Speech Recognition Engine using the open-source Whisper model by OpenAI!
With Whisper, you can now integrate automatic speech recognition in 99 languages into your Auphonic audio post-production workflow, without creating an external account and without extra costs!

Whisper Speech Recognition in Auphonic

So far, Auphonic users had to choose one of our integrated external service providers (Wit.ai, Google Cloud Speech, Amazon Transcribe, Speechmatics) for speech recognition, so audio files were transferred to an external server, using external computing power that users had to pay for ...

Speechmatics released a new API with an enhanced transcription engine (2h free per month!), which we have now integrated into the Auphonic Web Service.
In this blog post, we also compare the accuracy of all our integrated speech recognition services and present our results.


Automatic speech recognition is most useful to make audio searchable: Even if automatically generated transcripts are not perfect and might be difficult to read (spoken text is very different from written text), they are very valuable if you try to find a specific topic within a one-hour audio file or if you need the exact ...
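A minimal sketch of that search use case, assuming the transcript is available as timestamped segments (the data layout here is an assumption for illustration, not a specific Auphonic format):

```python
def search_transcript(segments, query):
    """segments: list of (start_seconds, text); return start times of matching segments."""
    q = query.lower()
    return [start for start, text in segments if q in text.lower()]

segments = [(0, "welcome to the show"),
            (1800, "now about loudness normalization")]
print(search_transcript(segments, "loudness"))  # -> [1800]
```

Even with transcription errors elsewhere, a match like this gives you the jump point into a one-hour file immediately.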

Today we are thrilled to introduce revised parameters for the Adaptive Leveler to move our advanced algorithms out of beta.
The leveler can now run in three modes, which allow detailed Leveler Strength control and also the use of Broadcast Parameters (Max. Loudness Range, Max. Short-term Loudness, Max. Momentary Loudness) to limit the amount of leveling.

Photo by Gemma Evans.

When we first introduced our advanced parameters, we used the Maximum Loudness Range (MaxLRA) value to control the strength of our leveler. This gave good results, but it turned out that only pure speech programs give reliable and ...

Last weekend, at the Subscribe10 conference, we released Advanced Audio Algorithm Parameters for Multitrack Productions:

We launched our advanced audio algorithm parameters for Singletrack Productions last year. Now these settings (and more) are available for Multitrack Algorithms as well, which gives you detailed control for each track of your production.

The following new parameters are available:

Until recently, Amazon Transcribe supported speech recognition in English and Spanish only.
Now they have added French, Italian, and Portuguese as well - and a few other languages (including German) are in private beta.

Update March 2019:
Now Amazon Transcribe supports German and Korean as well.

Screenshot: https://auphonic.com/static/screenshots/inspector-mt-closed.png
The Auphonic Audio Inspector on the status page of a finished Multitrack Production including speech recognition.
Please click on the screenshot to see it in full resolution!


Amazon Transcribe is integrated as speech recognition engine within Auphonic and offers accurate transcriptions (compared to other services) at low costs, including keywords / custom ...

In late August, we launched the private beta program of our advanced audio algorithm parameters. After feedback by our users and many new experiments, we are proud to release a complete rework of the Adaptive Leveler parameters:

In the previous version, we based our Adaptive Leveler parameters on the Loudness Range descriptor (LRA), which is included in the EBU R128 specification.
Although it worked, it turned out that it is very difficult to set a loudness range target for diverse audio content, which includes speech, background sounds, music parts, etc. The results were not predictable and ...
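For intuition: the Loudness Range descriptor (EBU Tech 3342, part of the EBU R128 family) is essentially the spread between the 10th and 95th percentile of the short-term loudness distribution. The much simplified sketch below omits the absolute and relative gating stages of the real measurement.

```python
# Much simplified LRA sketch: spread between the 10th and 95th percentile
# of short-term loudness values (in LUFS). Real EBU Tech 3342 measurement
# adds absolute and relative gating, omitted here.
def loudness_range(short_term_lufs):
    values = sorted(short_term_lufs)
    def percentile(p):
        # nearest-rank percentile, sufficient for a sketch
        idx = min(len(values) - 1, int(p / 100 * len(values)))
        return values[idx]
    return percentile(95) - percentile(10)

print(loudness_range([-30, -25, -24, -23, -22, -21, -20, -19, -18, -10]))  # -> 15
```

One can see from this why a single LRA target is hard to pick for mixed content: music and background segments widen the distribution, so the same target means very different leveling strength for different programs.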

Large file uploads in a web browser are problematic, even in 2018. With a poor network connection, uploads can fail and have to be retried from the start.

At Auphonic, our users have to upload large audio and video files, or multiple media files when creating a multitrack production. To minimize any potential issues, we integrated various external services which are specialized for large file transfers, like FTP, SFTP, Dropbox, Google Drive, S3, etc.

To further minimize issues, as of today we have also released resumable and chunked direct file uploads in the web browser to ...
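The core idea of chunked, resumable uploads can be sketched as follows: the file is split into fixed-size chunks, and after a failure only the chunks past the last confirmed offset are re-sent. Auphonic's actual upload endpoint and protocol are not shown here; `send_chunk` is a hypothetical stand-in for the network call.

```python
import io

CHUNK_SIZE = 4  # tiny for illustration; real uploads use megabyte-sized chunks

def upload(stream, send_chunk, resume_offset=0):
    """Send the stream in chunks, starting at resume_offset; return total bytes sent."""
    stream.seek(resume_offset)
    sent = resume_offset
    while True:
        chunk = stream.read(CHUNK_SIZE)
        if not chunk:
            break
        send_chunk(sent, chunk)  # e.g. an HTTP request carrying the byte offset
        sent += len(chunk)
    return sent

# Resume after the first 4 bytes were already confirmed by the server:
received = []
upload(io.BytesIO(b"abcdefgh"),
       lambda off, data: received.append((off, data)),
       resume_offset=4)
print(received)  # -> [(4, b'efgh')]
```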