Speechmatics has released a new API with an
enhanced transcription engine (2h free per month!), which we have now integrated into the Auphonic Web Service.
In this blog post, we also compare the accuracy of all our integrated
speech recognition services and present our
results.
Automatic speech recognition is most useful for
making audio searchable:
even if automatically generated
transcripts are not perfect and can be difficult to read (spoken text is very different from written text),
they are very valuable if you try to find a specific topic within a one-hour audio file or if
you need the exact time of a quote in an audio archive.
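To illustrate the search use case, here is a minimal sketch that looks up a keyword in a word-level transcript and returns the matching timestamps. The transcript format (a list of start time and word pairs) and the function name `find_keyword` are assumptions made for this example, not the output format of any particular service.

```python
from typing import List, Tuple

# Hypothetical transcript format: (start time in seconds, recognized word).
Word = Tuple[float, str]


def find_keyword(words: List[Word], keyword: str) -> List[str]:
    """Return the timestamps (HH:MM:SS) at which `keyword` occurs."""
    needle = keyword.lower()
    hits = []
    for start, text in words:
        # Strip simple punctuation so "Transcription," still matches.
        if text.lower().strip(".,!?;:\"'") == needle:
            hours, rest = divmod(int(start), 3600)
            minutes, seconds = divmod(rest, 60)
            hits.append(f"{hours:02d}:{minutes:02d}:{seconds:02d}")
    return hits


# Made-up sample data for illustration only.
transcript = [
    (12.4, "Welcome"),
    (754.9, "transcription"),
    (3125.2, "Transcription,"),
]
print(find_keyword(transcript, "transcription"))  # ['00:12:34', '00:52:05']
```

Even with an imperfect transcript, a lookup like this is usually enough to jump to the right minute of a long recording.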
Currently, Auphonic supports the integration of the following four speech recognition services:
Wit.ai,
Google Cloud Speech,
Amazon Transcribe, and
Speechmatics.
All of these speech recognition services have been improving quickly and are getting closer and closer to perfection,
and we'll do our best to keep you updated.
Most recently, Speechmatics developed a new Enhanced Model, which we have added to our production services.
You can now choose between the Standard Model, with faster results and medium accuracy, and the
Enhanced Model, with slower results but very good accuracy.
For each transcription model, you can process
two hours of speech recognition per month for free (= 4h free per month combined). If you exceed two hours per month for a model, you will be
charged $1.25/h for the Standard Model and $1.90/h for the Enhanced Model. For high volumes, you may contact
Speechmatics support for a discount.
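As a rough worked example of this pricing, the following sketch estimates a monthly bill from the hours processed with each model. It uses only the free quota and hourly rates quoted above. The function name `monthly_cost` is hypothetical, and the sketch assumes that usage beyond the free quota is billed proportionally per hour and that no volume discount applies; actual Speechmatics billing may differ.

```python
# Rates and free quota as quoted in this post; billing granularity is an assumption.
FREE_HOURS_PER_MODEL = 2.0
PRICE_PER_HOUR = {"standard": 1.25, "enhanced": 1.90}


def monthly_cost(hours_by_model: dict) -> float:
    """Estimate the monthly charge in USD for the given hours per model."""
    total = 0.0
    for model, hours in hours_by_model.items():
        # Only the hours above the free quota of each model are billable.
        billable = max(0.0, hours - FREE_HOURS_PER_MODEL)
        total += billable * PRICE_PER_HOUR[model]
    return round(total, 2)


# 5h Standard -> 3 billable hours, 3h Enhanced -> 1 billable hour.
print(monthly_cost({"standard": 5.0, "enhanced": 3.0}))  # 5.65
```

Because the free quota is counted per model, two hours on each model cost nothing, while four hours on a single model do not.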
How do other Speech Recognition Services compare to Speechmatics?
We compared the relative ASR (Automatic Speech Recognition) quality of all services in English and German.
Note that 'best' only means the best of our integrated services.
As speech recognition services are evolving
very quickly, this is just a snapshot and may change again in the near future.
Feature | Wit.ai | Google Speech API | Amazon Transcribe | Speechmatics Standard | Speechmatics Enhanced |
---|---|---|---|---|---|
Price | free, also for commercial use | 1+1h free per month (Enhanced + Default Model), then ~$0.96-$2.16/h (depending on user settings) | 1h free per month (first 12 months), then ~$1.44/h | 2h free per month, then ~$1.25/h, much cheaper for high volumes | 2h free per month, then ~$1.90/h, much cheaper for high volumes |
ASR Quality English | basic | good (Enhanced Model) | very good | very good | best |
ASR Quality German | basic | basic (Default Model) | very good | very good | best |
Keyword Support | No | Yes | Yes | Yes | Yes |
Word Timestamps and Confidence | No | No | Yes | Yes | Yes |
Speed | fast | fast | much slower | medium | slower |
Supported Languages | ar, bn, my, zh, nl, en, fi, fr, de, hi, id, it, ja, ca, ko, ms, ml, mr, pl, pt, ru, si, es, sv, tl, ta, th, tr, ur, vi | most languages supported! 138 languages and dialects (see: Google Language Support) | af-ZA, ar-AE, ar-SA, zh-CN, zh-TW, da-DK, nl-NL, en-AU, en-GB, en-IN, en-IE, en-NZ, en-AB, en-ZA, en-US, en-WL, fr-FR, fr-CA, fa-IR, de-DE, de-CH, he-IL, hi-IN, id-ID, it-IT, ja-JP, ko-KR, ms-MY, pt-PT, pt-BR, ru-RU, es-ES, es-US, ta-IN, te-IN, th-TH, tr-TR | ar, bg, yue, ca, hr, cs, da, nl, en, fi, fr, de, el, hi, hu, it, id, ja, ko, lv, lt, ms, cmn, no, pl, pt, ro, ru, sk, sl, es, sv, tr, uk | ar, bg, yue, ca, hr, cs, da, nl, en, fi, fr, de, el, hi, hu, it, id, ja, ko, lv, lt, ms, cmn, no, pl, pt, ro, ru, sk, sl, es, sv, tr, uk |
Try out Speechmatics in Auphonic
1. Connect Speechmatics to your Auphonic Account
Sign in to your Auphonic account, go to Services, and add Speechmatics as an External Service:
- Enter a display name for the Speechmatics service in your Auphonic account.
- Sign up for a Speechmatics Account.
- On the Speechmatics page, go to “Manage Access” on the left, choose a name for your key and click “Generate API Key”. This API Key will only be shown once, so make sure you keep it safe! Copy your generated API Key to your Auphonic account into the form field “API Key”.
- For “Model Accuracy” please select “Standard Model” or “Enhanced Model”.
If you want to use both the Standard and the Enhanced Model of Speechmatics, you need to create two separate services (one service for each model) in your Auphonic account!
2. Add Speechmatics to your Auphonic Production
Once your Speechmatics and Auphonic Accounts are connected, you can either create a
preset or directly start your
production just like you are used to.
In the “Speech Recognition” section, set “Service” to “Speechmatics”,
select the language of your audio, add “Keywords” if you want, and you are ready to “Start Production”!
The “Track Usage” menu of your Speechmatics account
shows a detailed list of your usage for the current month. For more information, you can also watch the
Video Tutorial by Speechmatics about usage, limits, and billing.
Auphonic also includes a
Transcript Editor
directly in our HTML output file.
If you use Speechmatics
or Amazon Transcribe,
the editor displays word confidence values, so you can instantly see which sections
should be checked manually.
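The idea behind this confidence display can be sketched in a few lines: group consecutive low-confidence words into spans that deserve a manual check. The word format (dictionaries with `word`, `start`, `end`, and `confidence`), the function name `low_confidence_spans`, and the threshold of 0.6 are assumptions chosen for illustration; this is not how the Auphonic Transcript Editor is implemented.

```python
def low_confidence_spans(words, threshold=0.6):
    """Group consecutive low-confidence words into (start, end, text) spans."""
    spans = []
    current = []  # run of adjacent words below the threshold

    def flush():
        # Close the current run and record it as one span.
        if current:
            text = " ".join(w["word"] for w in current)
            spans.append((current[0]["start"], current[-1]["end"], text))
            current.clear()

    for w in words:
        if w["confidence"] < threshold:
            current.append(w)
        else:
            flush()
    flush()  # a run may extend to the very last word
    return spans


# Made-up sample data; real services report their own fields and scales.
words = [
    {"word": "the", "start": 0.0, "end": 0.2, "confidence": 0.98},
    {"word": "our", "start": 0.2, "end": 0.4, "confidence": 0.41},
    {"word": "phonic", "start": 0.4, "end": 0.9, "confidence": 0.35},
    {"word": "service", "start": 0.9, "end": 1.4, "confidence": 0.95},
]
print(low_confidence_spans(words))  # [(0.2, 0.9, 'our phonic')]
```

Grouping adjacent words keeps the list of sections to review short, since misrecognitions tend to affect several neighboring words at once.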
Conclusion
Automatic Speech Recognition Services are evolving very quickly, and we've seen great improvements since our last comparisons in 2018, especially in recognizing sloppy language, accents, and dialects.
With the new Enhanced Transcription Model by Speechmatics, we can now pass on further optimizations to you at a very reasonable price (4h free per month), and we expect more improvements to come soon.
Also, please let us know if you get different results comparing ASR services or if you compare services in other languages!