Accuracy and Language

Transcription: Batch, Real-Time | Deployments: All

For more information on the full range of language packs offered by Speechmatics, please refer to the guide on supported language packs.

Accuracy

Use the operating_point setting to choose between the standard and enhanced models for transcription. The enhanced model is more accurate, but takes a little longer to process and affects pricing. The standard model is used by default.

{
  "type": "transcription",
  "transcription_config": {
    "language": "en",
    "operating_point": "enhanced"
  }
}
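
For comparison, the default model can also be requested explicitly. This is a minimal sketch, assuming that "standard" is the literal operating_point value for the standard model (the text above names the model but does not spell out the value):

```json
{
  "type": "transcription",
  "transcription_config": {
    "language": "en",
    "operating_point": "standard"
  }
}
```

Under that assumption, leaving operating_point out altogether should give the same result, since the standard model is the default.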

Output locale

The output_locale configuration setting specifies the spelling rules used when generating the transcript. It is available for the English and Mandarin language packs only.

Three English locales are available:

  • British English (en-GB)
  • US English (en-US)
  • Australian English (en-AU)

When transcribing English, specify the locale. If no locale is specified, spelling may be inconsistent within a transcript.

The following locales are supported for Chinese Mandarin:

  • Simplified Mandarin (cmn-Hans)
  • Traditional Mandarin (cmn-Hant)

The default is Simplified Mandarin.

An example configuration requesting British English spelling:

{
  "type": "transcription",
  "transcription_config": {
    "language": "en",
    "output_locale": "en-GB"
  }
}
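
The same setting selects the script for Mandarin. The sketch below requests Traditional Mandarin output. It assumes that "cmn" is the language code for the Mandarin language pack, which this section does not state, so check it against the guide on supported language packs:

```json
{
  "type": "transcription",
  "transcription_config": {
    "language": "cmn",
    "output_locale": "cmn-Hant"
  }
}
```

Omitting output_locale here would fall back to Simplified Mandarin, the default noted above.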

Domain language optimization

Some Speechmatics language packs are optimized for specific domains where high accuracy on specialized vocabulary and terminology is required. Setting the domain parameter improves transcription accuracy for that domain; it must be used together with a standard language pack. Currently the only supported domain is "finance", and it is only available with the "en" language pack. For example:

{
  "type": "transcription",
  "transcription_config": {
    "language": "en",
    "domain": "finance"
  }
}

Accuracy is expected to improve for content within the chosen domain, but it may degrade for content outside that domain.