Natural Language Processing

API that defines how to turn recorded speech into text

This API is in proposal status and is not yet available for general use. If you have questions or suggestions for improving this API, please reach out to the Dynepic team.

The Natural Language Processing API allows MOTAR applications to submit recorded speech as an audio file in MP3, MP4, FLAC, or WAV format, which MOTAR converts to text. To use this API, start a job to obtain a speech-to-text job ID and an upload URL. After uploading the audio file, pass the job ID to the get-text endpoint to obtain the transcription.

Obtain an Upload URL for an audio file that will be converted to text

GET https://api.motar.io/nlp/v1/start-speech-to-text

This endpoint starts a transcription job and returns the UUID of the job together with a pre-signed upload URL for the audio file. After uploading the file, use the job ID to retrieve the transcript through the get-text endpoint below.

Headers

Name            Type    Description
Authorization*  String  Bearer Token

{
    "speech_to_text_job_id": "UUID",
    "status": "AWAITING_UPLOAD",
    "creation_time": "2022-11-01T09:08:07-05:00",
    "start_time": "2022-11-01T09:08:08-05:00",
    "upload_url": "https://veryLongURL/"
}
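
As an illustration, the request and response above could be handled as in the following minimal sketch. It uses only the Python standard library. The function and class names are hypothetical, and the `Bearer <token>` header format is an assumption based on the "Bearer Token" description above rather than something this page spells out.

```python
import json
import urllib.request
from dataclasses import dataclass

START_URL = "https://api.motar.io/nlp/v1/start-speech-to-text"


@dataclass
class SpeechToTextJob:
    """The fields of the start response that later steps need."""
    job_id: str
    status: str
    upload_url: str


def build_start_request(bearer_token: str) -> urllib.request.Request:
    """Build the GET request that starts a speech-to-text job."""
    return urllib.request.Request(
        START_URL,
        # Assumption: the token is sent using the standard "Bearer <token>" scheme.
        headers={"Authorization": f"Bearer {bearer_token}"},
        method="GET",
    )


def parse_start_response(body: str) -> SpeechToTextJob:
    """Extract the job ID, status, and upload URL from the JSON response."""
    data = json.loads(body)
    return SpeechToTextJob(
        job_id=data["speech_to_text_job_id"],
        status=data["status"],
        upload_url=data["upload_url"],
    )


def start_job(bearer_token: str) -> SpeechToTextJob:
    """Send the request and parse the reply (performs a network call)."""
    with urllib.request.urlopen(build_start_request(bearer_token)) as resp:
        return parse_start_response(resp.read().decode("utf-8"))
```

Keeping request building and response parsing separate from the network call makes both easy to check without access to the service.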

After receiving the pre-signed URL, upload the audio file to it with a multipart/form-data POST. Then call the get-text endpoint to obtain the status and results of the transcription.
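
The upload step might look like the sketch below, which builds a single-file multipart/form-data body by hand with the Python standard library. Several details are assumptions, because this page does not specify them: the form field name `file`, the default `audio/wav` content type, and the absence of extra form fields (some pre-signed URLs require additional fields). The function names are hypothetical.

```python
import urllib.request
import uuid


def encode_multipart_file(field_name, filename, content, content_type):
    """Return (body, Content-Type header value) for a one-file multipart POST."""
    boundary = uuid.uuid4().hex  # random delimiter unlikely to occur in the audio
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; '
        f'filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n"
        "\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + content + tail, f"multipart/form-data; boundary={boundary}"


def build_upload_request(upload_url, filename, content, content_type="audio/wav"):
    """Build the POST request that sends the audio bytes to the upload URL."""
    # Assumption: the service expects the audio in a form field named "file".
    body, header = encode_multipart_file("file", filename, content, content_type)
    return urllib.request.Request(
        upload_url,
        data=body,
        headers={"Content-Type": header},
        method="POST",
    )
```

To send the file, read it as bytes and pass the resulting request to `urllib.request.urlopen`.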

Retrieves information about a transcription job.

GET https://api.motar.io/nlp/v1/get-text

If the transcription is not finished, the API returns a status of "IN_PROCESS". Once the job completes, the API returns the transcription and metadata about it.

Query Parameters

Name                   Type    Description
speech_to_text_job_id  String  UUID of the transcription job

Headers

Name            Type    Description
Authorization*  String  Bearer Token
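
A request that combines the query parameter and header above could be assembled as in this short sketch. The function name is hypothetical, and the `Bearer <token>` header format is again an assumption.

```python
import urllib.parse
import urllib.request

GET_TEXT_URL = "https://api.motar.io/nlp/v1/get-text"


def build_get_text_request(job_id: str, bearer_token: str) -> urllib.request.Request:
    """Build the GET request that asks for a job's status and transcript."""
    # urlencode escapes the job ID so it is safe to place in the query string.
    query = urllib.parse.urlencode({"speech_to_text_job_id": job_id})
    return urllib.request.Request(
        f"{GET_TEXT_URL}?{query}",
        # Assumption: the token is sent using the standard "Bearer <token>" scheme.
        headers={"Authorization": f"Bearer {bearer_token}"},
        method="GET",
    )
```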

{
    "speech_to_text_job_id": "UUID",
    "status": "IN_PROCESS"
}
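
A response like the one above means the job is still running, so a client would typically poll until the status changes. The following is one possible sketch. The polling interval and attempt limit are arbitrary choices, not values recommended by this API, and `fetch_status` is a hypothetical callable that performs the get-text request and returns the decoded JSON.

```python
import time
from typing import Callable, Dict


def wait_for_transcription(
    fetch_status: Callable[[], Dict],
    interval_seconds: float = 5.0,
    max_attempts: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict:
    """Call fetch_status until the job is no longer "IN_PROCESS".

    Returns the first response whose status differs from "IN_PROCESS".
    The caller should still check that the status is "COMPLETED", since
    other statuses are not documented on this page.
    """
    for attempt in range(max_attempts):
        response = fetch_status()
        if response.get("status") != "IN_PROCESS":
            return response
        if attempt < max_attempts - 1:
            sleep(interval_seconds)  # wait before asking again
    raise TimeoutError(f"job still IN_PROCESS after {max_attempts} attempts")
```

Passing `sleep` as a parameter lets the loop be exercised without real delays.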

{
    "speech_to_text_job_id": "UUID",
    "results": {
        "transcripts": [
            {"transcript": "This is a test recording."}
        ],
        "items": [
            {
                "start_time": "0.0",
                "end_time": "0.38",
                "alternatives": [ { "confidence": "1.0", "content": "This" } ],
                "type": "pronunciation"
            },
            {
                "start_time": "0.38",
                "end_time": "0.52",
                "alternatives": [ { "confidence": "1.0", "content": "is" } ],
                "type": "pronunciation"
            },
            {
                "start_time": "0.52",
                "end_time": "0.63",
                "alternatives": [ { "confidence": "1.0", "content": "a" } ],
                "type": "pronunciation"
            },
            {
                "start_time": "0.63",
                "end_time": "1.17",
                "alternatives": [ { "confidence": "1.0", "content": "test"} ],
                "type": "pronunciation"
            },
            {
                "start_time": "1.17",
                "end_time": "2.09",
                "alternatives": [ { "confidence": "0.9998", "content": "recording" } ],
                "type": "pronunciation"
            },
            {
                "alternatives": [ { "confidence": "0.0", "content": "." } ],
                "type": "punctuation"
            }
        ]
    },
    "status": "COMPLETED"
}
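
A completed response shaped like the example above could be unpacked as in this sketch. The helper names are hypothetical. It assumes the first entry in `alternatives` is the preferred one, which the example suggests but this page does not state.

```python
from typing import Dict, List, Tuple


def full_transcript(response: Dict) -> str:
    """Join every entry in results.transcripts into one string."""
    return " ".join(t["transcript"] for t in response["results"]["transcripts"])


def word_timings(response: Dict) -> List[Tuple[str, float, float, float]]:
    """Return (word, start_seconds, end_seconds, confidence) per spoken word."""
    words = []
    for item in response["results"]["items"]:
        if item["type"] != "pronunciation":
            continue  # punctuation items in the example have no timestamps
        best = item["alternatives"][0]  # assumption: first alternative is preferred
        words.append((
            best["content"],
            float(item["start_time"]),  # numeric values arrive as strings
            float(item["end_time"]),
            float(best["confidence"]),
        ))
    return words
```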

