Closed Domain improvements

We've improved transcription accuracy for closed_domain Corpora, and reduced the potential for overfitting transcripts to the vocabulary found in the Corpus.

If you send in speech that doesn't match the vocabulary in your closed_domain Corpus, you'll get a JSON response like this:

{
    "transcript": {
        "alternates": [], 
        "confidence": 0.83, 
        "corpus_id": "489", 
        "id": 359, 
        "status": "completed", 
        "text": "", 
        "warning": "Speech doesn't seem to match closed domain Corpus vocabulary."
    }
}

This is helpful for knowing when your audio contains speech you don't intend to recognize.
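For example, here's a minimal shell check for that warning. This sketch assumes jq is installed and that the JSON response above has been saved to response.json:

# Prints the warning when the audio didn't match the Corpus
# vocabulary, and prints nothing otherwise.
jq -r '.transcript.warning // empty' response.json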

You can read more about closed domain models in the docs here: https://docs.assemblyai.com/http/#http:postcorpus

Transcribe without a Corpus

You can now make transcripts without a Corpus using both the /stream and /transcript endpoints!

For example, you can now run the following curl command to stream audio to the API for a realtime transcript without creating a Corpus:

curl -v -X POST \
    --header 'authorization: your-secret-api-token' \
    --header "Transfer-Encoding: chunked" \
    --header "Content-Type: audio/wav" \
    --data-binary @/path/to/some/speech.wav \
    "https://api.assemblyai.com/v1/stream

We'll use our generic models to generate the transcript when you don't supply a corpus_id. This can be helpful for comparing accuracy with and without the Corpus you're creating, to see if your Corpus is helping or hurting accuracy (see the sketch below).
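For instance, you could stream the same file twice and diff the resulting transcripts. In this sketch, the corpus_id query parameter placement and the transcript.text response shape are assumptions; check the docs linked above for the exact request format:

# Transcribe the same file against your Corpus (passing corpus_id as
# a query parameter is an assumption here, for illustration)...
curl -s -X POST \
    --header 'authorization: your-secret-api-token' \
    --header "Transfer-Encoding: chunked" \
    --header "Content-Type: audio/wav" \
    --data-binary @/path/to/some/speech.wav \
    "https://api.assemblyai.com/v1/stream?corpus_id=489" > with_corpus.json

# ...and again with the generic models.
curl -s -X POST \
    --header 'authorization: your-secret-api-token' \
    --header "Transfer-Encoding: chunked" \
    --header "Content-Type: audio/wav" \
    --data-binary @/path/to/some/speech.wav \
    "https://api.assemblyai.com/v1/stream" > without_corpus.json

# Compare the two transcripts (requires bash and jq).
diff <(jq -r '.transcript.text' with_corpus.json) \
     <(jq -r '.transcript.text' without_corpus.json)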

Just keep in mind that transcripts without a Corpus will be up to 20% less accurate than those with a Corpus!

Holiday grab bag

We've been working to improve overall usability on many fronts. Here are areas where we've made recent changes:

  • Added transcript confidence scores to the trends charts
  • Removed corpus size limits
  • You can now stream wav files in any format
  • Added an updated field to the corpus objects
  • Updated the Quick start and Concepts sections of the docs

Improved Audio Processing Stack

Last week we discovered that part of our audio processing stack was slightly distorting the audio signal before it was sent into our neural networks for transcription. This distortion caused a decrease in accuracy.

We've fixed this and accuracy has improved as a result.

New Neural Network Released

Today we released a new neural network that is showing significantly better accuracy and robustness compared to prior networks. This is due to architecture changes we've researched and deployed.

On internal test sets, we've brought the Word Error Rate (WER) down by 46% relative.
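To put that in concrete terms: a 46% relative reduction means a test set that previously scored 15% WER would now score roughly 8.1% (15% × 0.54).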

This is now live for all developers and customers.
