Speechmatics ASR REST API (2.0.0)

The Speechmatics Automatic Speech Recognition REST API is used to submit ASR jobs and receive the results.

Jobs

Create a new job.

header Parameters
Authorization
required
string

Customer API token

X-SM-EAR-Tag
string

Early Access Release Tag

Request Body schema: multipart/form-data
config
required
string

JSON containing a JobConfig model indicating the type and parameters for the recognition job.

data_file
string <binary>

The data file to be processed. Alternatively, the data file can be fetched from a URL specified in JobConfig.

text_file
string <binary>

For alignment jobs, the text file that the data file should be aligned to.

Responses

Response Schema:
id
required
string

The unique ID assigned to the job. Keep a record of this for later retrieval of your completed job.

Response samples

{
  "id": "a1b2c3d4e5"
}
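
As a rough sketch of the multipart request described above, the following Python builds the two form parts (config as a JSON string, data_file as binary) without sending anything. The base URL, the /jobs path and the Bearer scheme for the Authorization header are assumptions that this reference does not state, and build_create_job_request is a hypothetical helper name.

```python
import json

# Assumed values, not stated in this reference: the base URL, the "/jobs"
# path, and the "Bearer" scheme for the Authorization header.
BASE_URL = "https://asr.api.speechmatics.com/v2"


def build_create_job_request(api_token, audio_name, audio_bytes, language="en"):
    """Return keyword arguments suitable for an HTTP client's POST call."""
    config = {
        "type": "transcription",
        "transcription_config": {"language": language},
    }
    return {
        "url": f"{BASE_URL}/jobs",
        "headers": {"Authorization": f"Bearer {api_token}"},
        # multipart/form-data parts: a JSON string plus the binary file.
        "data": {"config": json.dumps(config)},
        "files": {"data_file": (audio_name, audio_bytes)},
    }


request_kwargs = build_create_job_request("MY_TOKEN", "call.wav", b"RIFF...")
```

With the third-party requests library, these arguments could be passed as requests.post(**request_kwargs), and the id field of the decoded response kept for later retrieval.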

List all jobs.

query Parameters
created_before
string <date-time>

UTC timestamp cursor for paginating the response. Jobs are filtered by creation time to the nearest millisecond; values with up to nanosecond precision are accepted and truncated to milliseconds. By default, the response starts with the most recent job.

limit
integer [ 1 .. 100 ]

Maximum number of jobs returned per page when paginating the response. Defaults to 100.

include_deleted
boolean

Specifies whether deleted jobs should be included in the response. Defaults to false.

header Parameters
Authorization
required
string

Customer API token

X-SM-EAR-Tag
string

Early Access Release Tag

Responses

Response Schema:
jobs
required
Array of objects (JobDetails)
created_at
required
string <date-time>
Example: "2018-01-09T12:29:01.853047Z"

The UTC date time the job was created.

data_name
required
string

Name of the data file submitted for the job.

text_name
string

Name of the text file submitted to be aligned to audio.

duration
integer >= 0

The file duration (in seconds). May be missing for fetch URL jobs.

id
required
string
Example: "a1b2c3d4e5"

The unique id assigned to the job.

status
required
string
Enum: "running" "done" "rejected" "deleted" "expired"

The status of the job.

  • running - The job is actively running.
  • done - The job completed successfully.
  • rejected - The job was accepted at first, but later could not be processed by the transcriber.
  • deleted - The user deleted the job.
  • expired - The system deleted the job, usually because it had been in the done state for a very long time.

config
object (JobConfig)

JSON object that contains various groups of job configuration parameters. Based on the value of type, a type-specific object such as transcription_config is required to be present to specify all configuration settings or parameters needed to process the job inputs as expected.

If the results of the job are to be forwarded on completion, notification_config can be provided with a list of callbacks to be made; no assumptions should be made about the order in which they will occur.

Customer specific job details or metadata can be supplied in tracking, and this information will be available where possible in the job results and in callbacks.

lang
string

Optional parameter used for backwards compatibility with the v1 API.

errors
Array of objects (JobDetailError)

Optional list of errors that occurred in user interaction, for example when audio could not be fetched or a notification could not be sent.

Response samples

{
  "jobs": []
}
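
The created_before cursor and limit parameter can be combined to walk through all jobs. The sketch below assumes that pages are ordered newest first and that created_before excludes the job whose timestamp is passed; fetch_page is a hypothetical callable standing in for the actual HTTP request.

```python
def iter_jobs(fetch_page, limit=100):
    """Yield every job, newest first, one page at a time.

    fetch_page is a hypothetical callable that takes a dict of query
    parameters and returns the decoded JSON body ({"jobs": [...]}).
    """
    params = {"limit": limit}
    while True:
        jobs = fetch_page(dict(params)).get("jobs", [])
        yield from jobs
        if len(jobs) < limit:
            return  # a short page means there is nothing older
        # Use the oldest job on this page as the cursor for the next one.
        params["created_before"] = jobs[-1]["created_at"]
```

Stopping on a short page avoids one extra request in most cases; if the number of jobs is an exact multiple of limit, the final request simply returns an empty page.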

Get job details, including progress and any error reports.

path Parameters
jobid
required
string
Example: a1b2c3d4e5

ID of the job.

header Parameters
Authorization
required
string

Customer API token

X-SM-EAR-Tag
string

Early Access Release Tag

Responses

Response Schema:
job
required
object (JobDetails)

Document describing a job, including the status and config used. This model will be returned when you get job details or list all jobs.

created_at
required
string <date-time>
Example: "2018-01-09T12:29:01.853047Z"

The UTC date time the job was created.

data_name
required
string

Name of the data file submitted for the job.

text_name
string

Name of the text file submitted to be aligned to audio.

duration
integer >= 0

The file duration (in seconds). May be missing for fetch URL jobs.

id
required
string
Example: "a1b2c3d4e5"

The unique id assigned to the job.

status
required
string
Enum: "running" "done" "rejected" "deleted" "expired"

The status of the job.

  • running - The job is actively running.
  • done - The job completed successfully.
  • rejected - The job was accepted at first, but later could not be processed by the transcriber.
  • deleted - The user deleted the job.
  • expired - The system deleted the job, usually because it had been in the done state for a very long time.

config
object (JobConfig)

JSON object that contains various groups of job configuration parameters. Based on the value of type, a type-specific object such as transcription_config is required to be present to specify all configuration settings or parameters needed to process the job inputs as expected.

If the results of the job are to be forwarded on completion, notification_config can be provided with a list of callbacks to be made; no assumptions should be made about the order in which they will occur.

Customer specific job details or metadata can be supplied in tracking, and this information will be available where possible in the job results and in callbacks.

lang
string

Optional parameter used for backwards compatibility with the v1 API.

errors
Array of objects (JobDetailError)

Optional list of errors that occurred in user interaction, for example when audio could not be fetched or a notification could not be sent.

Response samples

{
  "job": {}
}
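
Because status stays at running until the job finishes, a client typically polls this endpoint. The following is a minimal sketch of such a loop; get_details is a hypothetical callable that performs the request and returns the decoded body, and the five-second interval is an arbitrary choice rather than a documented recommendation.

```python
import time


def wait_for_job(get_details, jobid, poll_seconds=5.0, sleep=time.sleep):
    """Poll until the job leaves the "running" state; return its details.

    get_details is a hypothetical callable that performs the request and
    returns the decoded JSON body ({"job": {...}}).
    """
    while True:
        job = get_details(jobid)["job"]
        if job["status"] != "running":
            return job
        sleep(poll_seconds)
```

The caller still needs to check whether the returned status is done or rejected, and to inspect errors if present.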

Delete a job and remove all associated resources.

path Parameters
jobid
required
string
Example: a1b2c3d4e5

ID of the job to delete.

query Parameters
force
boolean

When set, a running job is forcibly terminated. When unset (default), a running job is not terminated and the request returns HTTP 423 Locked.

header Parameters
Authorization
required
string

Customer API token

X-SM-EAR-Tag
string

Early Access Release Tag

Responses

Response Schema:
job
required
object (JobDetails)

Document describing a job, including the status and config used. This model will be returned when you get job details or list all jobs.

created_at
required
string <date-time>
Example: "2018-01-09T12:29:01.853047Z"

The UTC date time the job was created.

data_name
required
string

Name of the data file submitted for the job.

text_name
string

Name of the text file submitted to be aligned to audio.

duration
integer >= 0

The file duration (in seconds). May be missing for fetch URL jobs.

id
required
string
Example: "a1b2c3d4e5"

The unique id assigned to the job.

status
required
string
Enum: "running" "done" "rejected" "deleted" "expired"

The status of the job.

  • running - The job is actively running.
  • done - The job completed successfully.
  • rejected - The job was accepted at first, but later could not be processed by the transcriber.
  • deleted - The user deleted the job.
  • expired - The system deleted the job, usually because it had been in the done state for a very long time.

config
object (JobConfig)

JSON object that contains various groups of job configuration parameters. Based on the value of type, a type-specific object such as transcription_config is required to be present to specify all configuration settings or parameters needed to process the job inputs as expected.

If the results of the job are to be forwarded on completion, notification_config can be provided with a list of callbacks to be made; no assumptions should be made about the order in which they will occur.

Customer specific job details or metadata can be supplied in tracking, and this information will be available where possible in the job results and in callbacks.

lang
string

Optional parameter used for backwards compatibility with the v1 API.

errors
Array of objects (JobDetailError)

Optional list of errors that occurred in user interaction, for example when audio could not be fetched or a notification could not be sent.

Response samples

{
  "job": {}
}
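
The interaction between force and the HTTP 423 Locked response can be handled along these lines. This is only a sketch: send_delete is a hypothetical callable that issues the request and returns the status code and decoded body, and encoding the boolean as the string "true" is an assumption.

```python
def delete_job(send_delete, jobid, force=False):
    """Delete a job; return its details, or None if it is still running.

    send_delete is a hypothetical callable (jobid, params) -> (status, body).
    """
    # Assumption: the boolean query parameter is sent as the string "true".
    params = {"force": "true"} if force else {}
    status, body = send_delete(jobid, params)
    if status == 423:  # Locked: the job is running and force was not set
        return None
    if status >= 400:
        raise RuntimeError(f"delete failed with HTTP {status}")
    return body["job"]
```

Returning None for a locked job lets the caller decide whether to retry later or repeat the call with force=True.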

Get the transcript for a transcription job.

path Parameters
jobid
required
string
Example: a1b2c3d4e5

ID of the job.

query Parameters
format
string
Enum: "json-v2" "txt" "srt"

The transcription format (by default the json-v2 format is returned).

header Parameters
Authorization
required
string

Customer API token

X-SM-EAR-Tag
string

Early Access Release Tag

Responses

Response Schema:
format
required
string
Example: "2.1"

Speechmatics JSON transcript format version number.

job
required
object (JobInfo)

Summary information about an ASR job, to support identification and tracking.

metadata
required
object (RecognitionMetadata)

Summary information about the output from an ASR job, comprising the job type and configuration parameters used when generating the output.

results
required
Array of objects (RecognitionResult)
Example: [[{"channel":"channel_1","start_time":0.55,"end_time":1.2,"type":"word","alternatives":[{"confidence":0.95,"content":"Hello","language":"en","speaker":"S1","display":{"direction":"ltr"}}]}]]
translations
object
Example: {"de":[{"start_time":0.5,"end_time":1.3,"content":"Guten Tag, wie geht es dir?","speaker":"UU"}],"fr":[{"start_time":0.5,"end_time":1.3,"content":"Bonjour, comment ça va?","speaker":"UU"}]}

Translations of the transcript into other languages. It is a map of ISO language codes to arrays of translated sentences.

Response samples

{
  "format": "2.1",
  "job": {},
  "metadata": {},
  "results": [],
  "translations": {}
}
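
To illustrate how the json-v2 results array might be consumed, the sketch below collects the content of each word result. It relies only on the fields visible in the example above (type, alternatives, content) and assumes that the first entry in alternatives is the preferred one; transcript_words is a hypothetical helper.

```python
def transcript_words(transcript):
    """Return the first alternative's content for each word result, in order."""
    words = []
    for result in transcript.get("results", []):
        alternatives = result.get("alternatives") or []
        if result.get("type") == "word" and alternatives:
            words.append(alternatives[0]["content"])
    return words


sample = {
    "format": "2.1",
    "results": [
        {
            "channel": "channel_1",
            "start_time": 0.55,
            "end_time": 1.2,
            "type": "word",
            "alternatives": [
                {"confidence": 0.95, "content": "Hello", "language": "en", "speaker": "S1"}
            ],
        }
    ],
}
print(" ".join(transcript_words(sample)))
```

Results of any other type are skipped here, so joining the words with spaces gives a rough plain-text rendering rather than a faithful one; requesting format=txt is the simpler route when only text is needed.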

Get the aligned text file for an alignment job.

path Parameters
jobid
required
string
Example: a1b2c3d4e5

ID of the job.

query Parameters
tags
string
Enum: "word_start_and_end" "one_per_line"

Controls how timing information is added to the text file provided as input to the alignment job. If set to word_start_and_end, SGML tags are inserted at the start and end of each word, for example <time=0.41>. If set to one_per_line, square bracket tags are inserted at the start of each line, for example [00:00:00.4]. The default is word_start_and_end.

header Parameters
Authorization
required
string

Customer API token

X-SM-EAR-Tag
string

Early Access Release Tag

Responses

Response Schema:
string <binary>
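
For the one_per_line tag style, each line of the returned text can be split into a start time and its text. The sketch below assumes that the bracketed tag reads as hours:minutes:seconds, which is inferred from the single example [00:00:00.4] and is not stated explicitly; parse_aligned_line is a hypothetical helper.

```python
import re

# Matches a leading tag such as "[00:00:00.4]" (assumed hours:minutes:seconds).
LINE_TAG = re.compile(r"^\[(\d+):(\d{2}):(\d{2}(?:\.\d+)?)\]\s*(.*)$")


def parse_aligned_line(line):
    """Return (start_seconds, text), or (None, line) if there is no tag."""
    match = LINE_TAG.match(line)
    if match is None:
        return None, line
    hours, minutes, seconds, text = match.groups()
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds), text
```

Lines without a recognisable tag are passed through unchanged so that unexpected input is not silently dropped.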


Get the usage statistics.

query Parameters
since
string <date>

Include usage after the given date (inclusive). The value is an ISO 8601 calendar date in the format YYYY-MM-DD.

until
string <date>

Include usage before the given date (inclusive). The value is an ISO 8601 calendar date in the format YYYY-MM-DD.

header Parameters
Authorization
required
string

Customer API token

X-SM-EAR-Tag
string

Early Access Release Tag

Responses

Response Schema: application/json
since
required
string <date-time>
Example: "2021-10-14T00:55:00Z"
until
required
string <date-time>
Example: "2022-12-01T00:00:00Z"
summary
required
Array of objects (UsageDetails)
details
required
Array of objects (UsageDetails)

Response samples

Content type: application/json
{
  "since": "2021-09-12T00:00:00Z",
  "until": "2022-01-01T23:59:59Z",
  "summary": [],
  "details": []
}
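
The since and until query parameters take calendar dates, so they can be produced directly from Python date objects. In this sketch, usage_params is a hypothetical helper, and the check that since is not later than until is a client-side sanity check rather than documented server behaviour.

```python
from datetime import date


def usage_params(since=None, until=None):
    """Build the query parameters for a usage request (both ends inclusive)."""
    if since and until and since > until:
        raise ValueError("since must not be later than until")
    params = {}
    if since:
        params["since"] = since.isoformat()  # YYYY-MM-DD
    if until:
        params["until"] = until.isoformat()  # YYYY-MM-DD
    return params
```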

Job Config

Use this model when you create a new job. It is also returned as part of the response to several requests, including getting job details and getting the transcript for a transcription job.

Based on the value of type, a type-specific object such as transcription_config is required to be present to specify all configuration settings or parameters needed to process the job inputs as expected.

If the results of the job are to be forwarded on completion, notification_config can be provided with a list of callbacks to be made; no assumptions should be made about the order in which they will occur. For more details, please refer to Notifications in the documentation.

Customer specific job details or metadata can be supplied in tracking, and this information will be available where possible in the job results and in callbacks.

type
required
string (JobType)
Enum: "alignment" "transcription"
fetch_data
object (DataFetchConfig)
fetch_text
object (DataFetchConfig)
alignment_config
object (AlignmentConfig)
Example: {"language":"en"}
transcription_config
object (TranscriptionConfig)
Example: {"language":"en","output_locale":"en-GB","additional_vocab":[{"content":"Speechmatics","sounds_like":["speechmatics"]},{"content":"gnocchi","sounds_like":["nyohki","nokey","nochi"]},{"content":"CEO","sounds_like":["C.E.O."]},{"content":"financial crisis"}],"diarization":"channel","channel_diarization_labels":["Caller","Agent"]}
notification_config
Array of objects (NotificationConfig)
Example: [[{"url":"https://collector.example.org/callback","contents":["transcript:json-v2"],"auth_headers":["Authorization: Bearer eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJ1c2VySWQiOiJiMDhmODZhZi0zNWRhLTQ4ZjItOGZhYi1jZWYzOTA0NjYwYmQifQ.-xN_h82PHVTCMA9vdoHrcZxH-x5mb11y1537t3rGzcM"]}]]
tracking
object (TrackingData)
Example: {"title":"ACME Q12018 Earnings Call","reference":"/data/clients/ACME/statements/segs/2018Q1-seg8","tags":["quick-review","segment"],"details":{"client":"ACME Corp","segment":8,"seg_start":963.201,"seg_end":1091.481}}
output_config
object (OutputConfig)
translation_config
object (TranslationConfig)
{
  "type": "alignment",
  "fetch_data": {},
  "fetch_text": {},
  "alignment_config": {},
  "transcription_config": {},
  "notification_config": [],
  "tracking": {},
  "output_config": {},
  "translation_config": {}
}
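
The rule that the value of type determines which type-specific object must be present can be checked on the client before a job is submitted. The following is a sketch only: check_job_config is a hypothetical helper, and pairing alignment with alignment_config is inferred from the field names rather than stated outright.

```python
# Maps each JobType value to the type-specific object it is assumed to need.
REQUIRED_SECTION = {
    "alignment": "alignment_config",
    "transcription": "transcription_config",
}


def check_job_config(config):
    """Return a list of problems found in a JobConfig dict (empty if none)."""
    job_type = config.get("type")
    if job_type not in REQUIRED_SECTION:
        return [f"type must be one of {sorted(REQUIRED_SECTION)}"]
    section = REQUIRED_SECTION[job_type]
    if not isinstance(config.get(section), dict):
        return [f"{section} is required when type is {job_type!r}"]
    return []
```

For example, check_job_config({"type": "transcription", "transcription_config": {"language": "en"}}) returns an empty list, while a config with type set to alignment and no alignment_config yields one problem message.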

Job Details

Returned when you get job details, list all jobs or delete a job. This model includes the status and config that was used.

created_at
required
string <date-time>
Example: "2018-01-09T12:29:01.853047Z"

The UTC date time the job was created.

data_name
required
string

Name of the data file submitted for the job.

text_name
string

Name of the text file submitted to be aligned to audio.

duration
integer >= 0

The file duration (in seconds). May be missing for fetch URL jobs.

id
required
string
Example: "a1b2c3d4e5"

The unique id assigned to the job.

status
required
string
Enum: "running" "done" "rejected" "deleted" "expired"

The status of the job.

  • running - The job is actively running.
  • done - The job completed successfully.
  • rejected - The job was accepted at first, but later could not be processed by the transcriber.
  • deleted - The user deleted the job.
  • expired - The system deleted the job, usually because it had been in the done state for a very long time.

config
object (JobConfig)

JSON object that contains various groups of job configuration parameters. Based on the value of type, a type-specific object such as transcription_config is required to be present to specify all configuration settings or parameters needed to process the job inputs as expected.

If the results of the job are to be forwarded on completion, notification_config can be provided with a list of callbacks to be made; no assumptions should be made about the order in which they will occur.

Customer specific job details or metadata can be supplied in tracking, and this information will be available where possible in the job results and in callbacks.

lang
string

Optional parameter used for backwards compatibility with the v1 API.

errors
Array of objects (JobDetailError)

Optional list of errors that occurred in user interaction, for example when audio could not be fetched or a notification could not be sent.

{
  "created_at": "2018-01-09T12:29:01.853047Z",
  "data_name": "string",
  "text_name": "string",
  "duration": 0,
  "id": "a1b2c3d4e5",
  "status": "running",
  "config": {},
  "lang": "string",
  "errors": []
}
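
A client may want to turn the required fields of this model into a typed object. The sketch below does that for a subset of fields and parses created_at into a timezone-aware datetime; JobSummary is a hypothetical class name, not part of the API.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class JobSummary:
    id: str
    status: str
    data_name: str
    created_at: datetime
    duration: Optional[int] = None  # may be missing for fetch URL jobs

    @classmethod
    def from_dict(cls, job):
        # fromisoformat() rejects a trailing "Z" before Python 3.11,
        # so it is rewritten as an explicit UTC offset first.
        created = datetime.fromisoformat(job["created_at"].replace("Z", "+00:00"))
        return cls(job["id"], job["status"], job["data_name"], created, job.get("duration"))
```

Reading duration with get() reflects the note that it may be absent, whereas the required fields are indexed directly so that a malformed response fails loudly.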

Transcript

Returned when you get the transcript for a transcription job. It includes metadata about the job, such as the transcription config that was used.

format
required
string
Example: "2.1"

Speechmatics JSON transcript format version number.

job
required
object (JobInfo)

Summary information about an ASR job, to support identification and tracking.

metadata
required
object (RecognitionMetadata)

Summary information about the output from an ASR job, comprising the job type and configuration parameters used when generating the output.

results
required
Array of objects (RecognitionResult)
Example: [[{"channel":"channel_1","start_time":0.55,"end_time":1.2,"type":"word","alternatives":[{"confidence":0.95,"content":"Hello","language":"en","speaker":"S1","display":{"direction":"ltr"}}]}]]
translations
object
Example: {"de":[{"start_time":0.5,"end_time":1.3,"content":"Guten Tag, wie geht es dir?","speaker":"UU"}],"fr":[{"start_time":0.5,"end_time":1.3,"content":"Bonjour, comment ça va?","speaker":"UU"}]}

Translations of the transcript into other languages. It is a map of ISO language codes to arrays of translated sentences.

{
  "format": "2.1",
  "job": {},
  "metadata": {},
  "results": [],
  "translations": {}
}
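
Since translations is a map from language code to an array of translated sentences, the text for one language can be assembled as in the sketch below. The sample data is taken from the example above; translated_text is a hypothetical helper.

```python
def translated_text(transcript, language):
    """Join the translated sentences for one language code."""
    sentences = transcript.get("translations", {}).get(language, [])
    return " ".join(sentence["content"] for sentence in sentences)


sample = {
    "translations": {
        "de": [{"start_time": 0.5, "end_time": 1.3,
                "content": "Guten Tag, wie geht es dir?", "speaker": "UU"}],
        "fr": [{"start_time": 0.5, "end_time": 1.3,
                "content": "Bonjour, comment ça va?", "speaker": "UU"}],
    }
}
print(translated_text(sample, "de"))
```

A language that is absent from the map yields an empty string rather than an error, which suits transcripts where translation was not requested.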