WSS wss://api.deepl.com/v3/voice/realtime/connect?token={token}
Messages
token
type:string
required

The ephemeral authentication token obtained from the Get Streaming URL endpoint. The token can be used only once and must be passed as a query parameter when establishing the WebSocket connection.
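As a minimal sketch of passing the token as a query parameter, the helper below builds the connection URL from the address shown at the top of this page. The function name `build_connect_url` is hypothetical, and percent-encoding the token is a defensive assumption rather than a documented requirement.

```python
from urllib.parse import urlencode

# Endpoint address as shown at the top of this page.
BASE_URL = "wss://api.deepl.com/v3/voice/realtime/connect"


def build_connect_url(token: str) -> str:
    """Return the WebSocket URL with the one-time token as the `token` query parameter."""
    if not token:
        raise ValueError("token must be a non-empty string")
    # urlencode percent-encodes characters that are not safe in a query string.
    return f"{BASE_URL}?{urlencode({'token': token})}"
```

Because the token is single-use, a client would typically request a fresh token before every connection attempt instead of caching the resulting URL.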

Source Media Chunk
type:object

The message contains a base64-encoded binary chunk of audio data. The audio encoding must match the encoding specified in the Get Streaming URL request.

The chunk size must not exceed 100 kilobytes or one second in duration. The recommended duration is 50 to 250 milliseconds, which gives the best tradeoff between latency and quality. The interval between chunks must be at least half the duration of the preceding chunk and must not exceed 30 seconds. Otherwise, you will hit rate limits, or the session will time out and the stream will be forcibly closed.

For PCM data, the chunk size must be a multiple of the frame size (also known as the encoding unit).
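The size, duration, alignment, and pacing rules above can be sketched as follows. This is an illustration under stated assumptions, not the official client: the names `pcm_chunks` and `send_interval_bounds` are hypothetical, "100 kilobytes" is read as 100,000 bytes of raw audio before base64 encoding, and a trailing partial frame is dropped. The JSON envelope around each chunk is left out because this section does not show the wire format.

```python
import base64

MAX_CHUNK_BYTES = 100_000  # assumption: limit applies to raw bytes, 1 kB = 1000 bytes
MAX_CHUNK_MS = 1000        # a chunk may not be longer than one second
MAX_GAP_SECONDS = 30.0     # longest allowed pause between chunks


def pcm_chunks(pcm: bytes, sample_rate: int, frame_size: int, chunk_ms: int = 100):
    """Yield base64 strings of about chunk_ms of audio each, aligned to frame_size bytes."""
    if frame_size <= 0 or frame_size > MAX_CHUNK_BYTES:
        raise ValueError("frame_size must be between 1 and MAX_CHUNK_BYTES")
    if not 0 < chunk_ms <= MAX_CHUNK_MS:
        raise ValueError("chunk_ms must be greater than 0 and at most 1000")
    # Number of whole frames that fit into the requested duration.
    frames_per_chunk = max(1, sample_rate * chunk_ms // 1000)
    # Cap by the size limit while staying a multiple of the frame size.
    max_frames = MAX_CHUNK_BYTES // frame_size
    chunk_bytes = min(frames_per_chunk, max_frames) * frame_size
    # Ignore a trailing partial frame so every chunk stays frame-aligned.
    usable = len(pcm) - len(pcm) % frame_size
    for start in range(0, usable, chunk_bytes):
        end = min(start + chunk_bytes, usable)
        yield base64.b64encode(pcm[start:end]).decode("ascii")


def send_interval_bounds(chunk_seconds: float) -> tuple[float, float]:
    """Return (minimum, maximum) wait in seconds before sending the next chunk."""
    return chunk_seconds / 2, MAX_GAP_SECONDS
```

For 16 kHz, 16-bit mono PCM (a frame size of 2 bytes), the default of 100 ms gives 3,200-byte chunks, which sits inside the recommended 50 to 250 millisecond range.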

End of Source Media
type:object

The message indicates the end of source media data and marks the end of the stream input. It finalizes tentative transcript segments and triggers the emission of the final transcript updates, the End of Source Transcript and End of Target Transcript messages, and the End of Stream message. No more data chunks can be sent afterwards.

Source Transcript Update
type:object

The message contains an update to the transcription of the supplied media in the source language.

Each message adds incrementally to the source transcript updates already received and contains concluded and tentative text segments. Concluded segments are fixed and appear only once, while tentative segments may be updated in subsequent messages as more audio is processed.

Clients should merge the concluded segments into a final transcript and update the tentative segments as new updates arrive.
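One way to implement this merge is sketched below. The payload shape is an assumption made for illustration: the keys `concluded`, `tentative`, and `text` are hypothetical and not taken from the message schema, and the sketch assumes each update carries the complete current set of tentative segments, so the previous tentative list is replaced wholesale. The same approach would apply to target transcript updates.

```python
from dataclasses import dataclass, field


@dataclass
class TranscriptState:
    """Accumulates concluded segments and tracks the latest tentative ones."""

    concluded: list[str] = field(default_factory=list)
    tentative: list[str] = field(default_factory=list)

    def apply(self, update: dict) -> None:
        # Concluded segments appear only once, so they are appended permanently.
        self.concluded.extend(seg["text"] for seg in update.get("concluded", []))
        # Assumption: tentative segments in an update supersede the earlier ones.
        self.tentative = [seg["text"] for seg in update.get("tentative", [])]

    def text(self) -> str:
        # Joining with a space is a display choice, not part of the protocol.
        return " ".join(self.concluded + self.tentative)


state = TranscriptState()
state.apply({"concluded": [{"text": "Hello"}], "tentative": [{"text": "wor"}]})
state.apply({"concluded": [{"text": "world."}], "tentative": []})
```

After the second update the tentative list is empty and `state.text()` contains only concluded text, which is the state a client should reach once the transcript has been finalized.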

Target Transcript Update
type:object

The message contains an update to the transcription of the supplied media in the target language.

Each message adds incrementally to the target transcript updates already received and contains concluded and tentative text segments. Concluded segments are fixed and appear only once, while tentative segments may be updated in subsequent messages as more audio is processed.

Clients should merge the concluded segments into a final transcript and update the tentative segments as new updates arrive.

End of Source Transcript
type:object

The message indicates that the source transcript is complete and no further updates will be sent. It is emitted after the client sends End of Source Media.

End of Target Transcript
type:object

This message indicates that the target transcript is complete and no further updates will be sent. It is emitted after the client sends End of Source Media.

End of Stream
type:object

This message indicates that all outputs are complete and the stream has ended. It is the very last message the client receives after it sends End of Source Media. You can safely close the connection after you have received this message.

Error
type:object

This message reports errors encountered during audio processing or streaming. It includes an error code, reason code, and a human-readable message. You should close and reopen the stream after receiving an error message.
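The ordering rules for the closing messages and for errors can be tracked with a small state object, sketched here. All names (`Kind`, `Action`, `StreamLifecycle`) are hypothetical, and mapping an incoming JSON payload to a `Kind` is deliberately left out because this section does not show the wire format.

```python
from enum import Enum, auto


class Kind(Enum):
    """Server message kinds described on this page (names are illustrative)."""
    SOURCE_TRANSCRIPT_UPDATE = auto()
    TARGET_TRANSCRIPT_UPDATE = auto()
    END_OF_SOURCE_TRANSCRIPT = auto()
    END_OF_TARGET_TRANSCRIPT = auto()
    END_OF_STREAM = auto()
    ERROR = auto()


class Action(Enum):
    CONTINUE = auto()  # keep reading messages
    CLOSE = auto()     # all outputs complete, safe to close the connection
    REOPEN = auto()    # error received, close and open a new stream


class StreamLifecycle:
    def __init__(self) -> None:
        self.input_closed = False
        self.source_done = False
        self.target_done = False
        self.ended = False
        self.failed = False

    def end_of_source_media_sent(self) -> None:
        # After End of Source Media, no more data chunks may be sent.
        self.input_closed = True

    def may_send_chunk(self) -> bool:
        return not (self.input_closed or self.ended or self.failed)

    def on_message(self, kind: Kind) -> Action:
        if kind is Kind.ERROR:
            self.failed = True
            return Action.REOPEN
        if kind is Kind.END_OF_SOURCE_TRANSCRIPT:
            self.source_done = True
        elif kind is Kind.END_OF_TARGET_TRANSCRIPT:
            self.target_done = True
        elif kind is Kind.END_OF_STREAM:
            self.ended = True
            return Action.CLOSE
        return Action.CONTINUE
```

Returning an action instead of closing the socket directly keeps the sketch independent of any particular WebSocket library. Reopening after an error would require a new token, since each token is valid for one connection only.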