Retrieving results

Use correct API token

It is important to use the correct API token when retrieving results. Only the token that was used for creating the session and uploading can be used to retrieve its results.

URL: GET https://api.scriptix.io/api/v2/speech-to-text/session/${sessionId}

Request headers

The following headers need to be present

Parameter Value Description
x-zoom-s2t-key Scriptix Batch API Token API key of type real-time, needed for authorization. It must be the same key that was used for session initialization.
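
As a rough illustration, the request could be assembled in Python with the standard library alone. The URL and header name are taken from this page; the function name `build_results_request`, the placeholder token, and the sample session ID are assumptions made for the example. The request is only constructed here, not sent.

```python
import urllib.request

# Base URL as documented above; the session ID is appended as the last path segment.
API_BASE = "https://api.scriptix.io/api/v2/speech-to-text/session"


def build_results_request(session_id: str, api_token: str) -> urllib.request.Request:
    """Build (but do not send) the GET request for a session's results.

    `build_results_request` is a hypothetical helper, not part of any SDK.
    """
    return urllib.request.Request(
        url=f"{API_BASE}/{session_id}",
        headers={"x-zoom-s2t-key": api_token},  # must be the token used to create the session
        method="GET",
    )


# Placeholder values for illustration only.
request = build_results_request("49b5b257000004000006ccdfc8", "YOUR_API_TOKEN")
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen(request)` would perform the actual call. Note that `urllib` normalizes the capitalization of header names, which is harmless because HTTP header names are case-insensitive.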

URL query parameters

Key Type Description
break_on_silence integer [0 .. 20000]

Default: 1500

Controls the maximum amount of silence permitted between two words in a segment. When the maximum is reached, a new subtitle segment is created. Setting the value to 0 removes this limitation. This parameter only affects subtitle output and is ignored for application/json.

format string

Default: application/json

Enum: application/json, application/ttml+xml, text/sbv, text/srt, text/vtt

Controls the output format of the result. Works the same as the Accept header; when both are set, the query parameter takes precedence.

Output formats
  • application/json - Zoom Media JSON format
  • application/ttml+xml - TTML subtitle format
  • text/sbv - SBV subtitle format
  • text/srt - SRT subtitle format
  • text/vtt - WebVTT subtitle format
Value must be URL encoded.
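
To illustrate the encoding requirement, here is a small Python sketch that builds the query string with `urllib.parse.urlencode`. The helper name `results_query` is an assumption made for this example; the parameter names come from this page.

```python
from urllib.parse import urlencode


def results_query(fmt: str, **subtitle_options: int) -> str:
    """Return a URL-encoded query string for the results endpoint.

    `results_query` is a hypothetical helper; only the parameter names
    (format, max_line_length, ...) come from the documentation.
    """
    params = {"format": fmt, **subtitle_options}
    # urlencode percent-encodes "/" and "+" so the media type survives transport.
    return urlencode(params)


query = results_query("application/ttml+xml", max_line_length=42)
print(query)  # format=application%2Fttml%2Bxml&max_line_length=42
```

Encoding matters most for application/ttml+xml: a literal `+` in a query string is commonly decoded as a space, which would change the requested format.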
max_line_length integer [0 .. 200]

Default: 37

Controls the maximum length of a line in a segment block. When the maximum is reached, a new line is added to the segment. Setting the value to 0 removes this limitation. This parameter only affects subtitle output and is ignored for application/json.

max_line_words integer [0 .. 100]

Default: 12

Controls the maximum number of words on a line in a segment block. When the maximum is reached, a new line is added to the segment. Setting the value to 0 removes this limitation. This parameter only affects subtitle output and is ignored for application/json.

max_segment_duration integer [0 .. 200000]

Default: 7000

Controls the maximum number of milliseconds a segment will be visible. When the maximum is reached, a new subtitle segment is created. Setting the value to 0 removes this limitation. This parameter only affects subtitle output and is ignored for application/json.

max_segment_lines integer [0 .. 4]

Default: 2

Controls the maximum number of lines in a segment. When the maximum is reached, a new subtitle segment is created. Setting the value to 0 removes this limitation. This parameter only affects subtitle output and is ignored for application/json.

max_segment_words integer [0 .. 100]

Default: 24

Controls the maximum number of words in a segment. When the maximum is reached, a new subtitle segment is created. Setting the value to 0 removes this limitation. This parameter only affects subtitle output and is ignored for application/json.

min_segment_gap integer [0 .. 1000]

Default: 120

Controls the minimum amount of time, in milliseconds, between two segments. If the gap between two segments is not large enough, it is extended by shortening the first segment by 2/3 of the minimal gap and the second segment by 1/3 of the minimal gap. This parameter only affects subtitle output and is ignored for application/json.
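
The gap adjustment can be worked through with a short Python sketch. This is an interpretation, not the service's implementation: the description does not say whether the 2/3 and 1/3 shares apply to the full minimal gap or only to the missing part, so the sketch takes the wording literally and uses the full minimal gap. The function name `enforce_min_gap` is hypothetical.

```python
def enforce_min_gap(first_end_ms: int, second_start_ms: int,
                    min_gap_ms: int = 120) -> tuple[int, int]:
    """Return the adjusted (first_end, second_start) times in milliseconds.

    Assumption: when the gap is too small, the end of the first segment moves
    earlier by 2/3 of min_gap_ms and the start of the second segment moves
    later by the remaining 1/3.
    """
    if second_start_ms - first_end_ms >= min_gap_ms:
        return first_end_ms, second_start_ms  # gap is already large enough
    trim_first = (2 * min_gap_ms) // 3       # two thirds taken from segment one
    trim_second = min_gap_ms - trim_first    # remaining third taken from segment two
    return first_end_ms - trim_first, second_start_ms + trim_second


# Two back-to-back segments meeting at 2850 ms, with the default gap of 120 ms.
print(enforce_min_gap(2850, 2850))  # (2770, 2890)
```

With the default of 120 ms, the first segment loses 80 ms at its end and the second loses 40 ms at its start.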

Response codes

Status code Description Payload
200 New session initialized
400 Bad parameter
401 Unauthorized
415 Content Invalid
422 Body Invalid
500 Server Error

Responses

application/json

Key Type Description
language string The configured language for this session
sessionId string Unique identifier for this session.
done boolean Indicator whether processing is done
results ResultObject[] Array of transcription results
JSON Response
{
    "language": "en-us",
    "sessionId": "49b5b257000004000006ccdfc8",
    "zoom_id": "49b5b257000004000006ccdfc8",
    "done": true,
    "results": [
        {
            "result": [
                "word",
                time_start,
                time_end,
                confidence
            ],
            "text": "string",
            "speaker": "unk"
        }
    ]
}
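
Since the response carries a `done` flag, a client may need to repeat the request until processing has finished. The sketch below shows one way to do that in Python; the polling strategy, the interval, and the helper names `wait_for_results` and `transcript_text` are assumptions for illustration, and the HTTP call itself is injected as a callable so the example runs without network access.

```python
import time
from typing import Any, Callable


def wait_for_results(fetch: Callable[[], dict[str, Any]],
                     interval_s: float = 5.0,
                     max_attempts: int = 60) -> dict[str, Any]:
    """Call `fetch` until the payload reports done=True, then return it.

    `fetch` is any callable returning the decoded JSON response as a dict.
    """
    for attempt in range(max_attempts):
        payload = fetch()
        if payload.get("done"):
            return payload
        if attempt + 1 < max_attempts:
            time.sleep(interval_s)  # wait before asking again
    raise TimeoutError("session was not done after the allowed number of attempts")


def transcript_text(payload: dict[str, Any]) -> str:
    """Join the `text` field of every entry in `results` into one string."""
    return " ".join(item["text"] for item in payload.get("results", []))


# Simulated responses standing in for two consecutive API calls.
_responses = iter([
    {"done": False, "results": []},
    {"done": True, "results": [
        {"text": "the quick brown fox", "speaker": "unk"},
        {"text": "jumps over the lazy dog", "speaker": "unk"},
    ]},
])
final = wait_for_results(lambda: next(_responses), interval_s=0.0)
print(transcript_text(final))
```

The sketch reads only the `done`, `results`, and `text` fields shown in the response above and leaves the word-level `result` array untouched.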

application/ttml+xml

TTML Response
<?xml version="1.0" encoding="UTF-8"?>
<tt xmlns="http://www.w3.org/ns/ttml" xml:lang="en-us">
<head>
    <metadata xmlns:ttm="http://www.w3.org/ns/ttml#metadata">
        <ttm:title>Scriptix TTML</ttm:title>
    </metadata>
    <styling xmlns:tts="http://www.w3.org/ns/ttml#styling">
        <style xml:id="s1" tts:textAlign="center" tts:fontFamily="Arial" tts:fontSize="100%"/>
    </styling>
    <layout xmlns:tts="http://www.w3.org/ns/ttml#layout">
        <region xml:id="bottom" tts:displayAlign="after" tts:extent="80% 40%" tts:origin="10% 50%"/>
    </layout>
</head>
<body region="bottom" style="s1">
    <div>
        <p begin="00:00:00.000" end="00:00:02.850" style="s1" region="bottom">the quick brown fox jumps over the</p>
        <p begin="00:00:02.850" end="00:00:03.570" style="s1" region="bottom">lazy dog</p>
    </div>
</body>
</tt>

text/sbv

SBV Response
00:00:00.000 --> 00:00:02.850
the quick brown fox jumps over the

00:00:02.850 --> 00:00:03.570
lazy dog

text/srt

SRT Response
1
00:00:00.000 --> 00:00:02.850
the quick brown fox jumps over the

2
00:00:02.850 --> 00:00:03.570
lazy dog

text/vtt

WebVTT Response
WEBVTT

NOTE This file has been generated by Zoom Media

00:00.000 --> 00:02.850
the quick brown fox jumps over the

00:02.850 --> 00:03.570
lazy dog
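
When post-processing the subtitle responses, the cue timing lines can be converted to milliseconds with a few lines of Python. The sketch below handles both timestamp shapes that appear in the examples on this page (`HH:MM:SS.mmm` and `MM:SS.mmm`); the helper names `timestamp_to_ms` and `cue_bounds` are hypothetical.

```python
def timestamp_to_ms(stamp: str) -> int:
    """Convert 'HH:MM:SS.mmm' or 'MM:SS.mmm' to whole milliseconds."""
    clock, _, millis = stamp.strip().partition(".")
    seconds = 0
    for part in clock.split(":"):
        seconds = seconds * 60 + int(part)  # fold hours/minutes/seconds into seconds
    # Pad or truncate the fractional part to exactly three digits.
    return seconds * 1000 + int(millis.ljust(3, "0")[:3])


def cue_bounds(timing_line: str) -> tuple[int, int]:
    """Split a 'start --> end' timing line into (start_ms, end_ms)."""
    start, _, end = timing_line.partition(" --> ")
    return timestamp_to_ms(start), timestamp_to_ms(end)


print(cue_bounds("00:00:00.000 --> 00:00:02.850"))  # (0, 2850)
print(cue_bounds("00:02.850 --> 00:03.570"))        # (2850, 3570)
```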