POST /v3/voice/realtime
curl --request POST \
  --url https://api.deepl.com/v3/voice/realtime \
  --header 'Authorization: DeepL-Auth-Key <api-key>' \
  --header 'Content-Type: application/json' \
  --data '{
    "source_media_content_type": "audio/ogg;codecs=opus",
    "source_language": "en",
    "source_language_mode": "auto",
    "target_languages": [
      "de",
      "fr",
      "es"
    ]
  }'
{
  "streaming_url": "wss://api.deepl.com/v3/voice/realtime/connect",
  "token": "VGhpcyBpcyBhIGZha2UgdG9rZW4K",
  "session_id": "4f911080-cfe2-41d4-8269-0e6ec15a0354"
}
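
The request body from the example above can also be assembled programmatically. The following is a minimal Python sketch, not an official client: the SessionConfig class and its to_body method are hypothetical names introduced here for illustration, and only the field names, defaults, and option lists are taken from this page.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

# Option lists copied from the field descriptions on this page.
SOURCE_LANGUAGE_MODES = ("auto", "fixed")
FORMALITY_OPTIONS = ("default", "formal", "more", "informal", "less")


@dataclass
class SessionConfig:
    """Hypothetical container for the documented request-body fields."""

    source_media_content_type: str          # the only field marked required
    source_language: Optional[str] = None   # may be left empty
    source_language_mode: str = "auto"      # documented default
    target_languages: Sequence[str] = ()
    glossary_id: Optional[str] = None
    formality: str = "default"              # documented default

    def to_body(self) -> dict:
        """Return a JSON-serializable body, omitting optional fields that are unset."""
        if self.source_language_mode not in SOURCE_LANGUAGE_MODES:
            raise ValueError(f"unknown source_language_mode: {self.source_language_mode!r}")
        if self.formality not in FORMALITY_OPTIONS:
            raise ValueError(f"unknown formality: {self.formality!r}")
        body = {
            "source_media_content_type": self.source_media_content_type,
            "source_language_mode": self.source_language_mode,
            "formality": self.formality,
        }
        if self.source_language is not None:
            body["source_language"] = self.source_language
        if self.target_languages:
            body["target_languages"] = list(self.target_languages)
        if self.glossary_id is not None:
            body["glossary_id"] = self.glossary_id
        return body


config = SessionConfig(
    "audio/ogg;codecs=opus",
    source_language="en",
    target_languages=["de", "fr", "es"],
)
print(config.to_body())
```

Sending the two defaults explicitly is a choice made in this sketch; leaving them out should be equivalent, since the page documents them as defaults.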

Authorizations

Authorization
string
header
required

Authenticate with the Authorization header using the DeepL-Auth-Key scheme. Example: Authorization: DeepL-Auth-Key <api-key>
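
As an illustration of the header format, here is a minimal Python sketch that prepares the POST request with the standard library. The function name build_session_request is an assumption made for this example; the URL, header scheme, and content type come from this page. The request is only constructed here, not sent.

```python
import json
import urllib.request

SESSION_URL = "https://api.deepl.com/v3/voice/realtime"


def build_session_request(api_key: str, body: dict) -> urllib.request.Request:
    """Prepare (but do not send) the POST request that asks for a streaming session."""
    return urllib.request.Request(
        SESSION_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            # Scheme name first, then the key, separated by a single space.
            "Authorization": f"DeepL-Auth-Key {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_session_request(
    "<api-key>",
    {"source_media_content_type": "audio/ogg;codecs=opus", "target_languages": ["de"]},
)
print(request.get_method(), request.full_url)
print(request.get_header("Authorization"))
```

To actually send it, pass the prepared object to urllib.request.urlopen and decode the JSON response.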

Body

application/json
source_media_content_type
enum<string>
required

The audio format of the stream, which specifies the container, codec, and encoding parameters. See the table below for supported formats. With audio/auto, the server auto-detects the container and codec across all supported combinations except PCM, which always requires explicit encoding parameters. All formats must be single-channel (mono) audio.

Content Type | Container | Codec
audio/auto | Auto-detect: FLAC / Matroska / MPEG / Ogg / WebM | Auto-detect: AAC / FLAC / MP3 / OPUS
audio/flac | FLAC (flac) | FLAC, interleaved
audio/mpeg | MPEG (mp3/m4a) | MP3
audio/ogg | Ogg (ogg/oga) | Auto-detect: FLAC / OPUS
audio/webm | WebM (webm) | OPUS
audio/x-matroska | Matroska (mkv/mka) | Auto-detect: AAC / FLAC / MP3 / OPUS
audio/ogg;codecs=flac | Ogg (ogg/oga) | FLAC, interleaved
audio/ogg;codecs=opus | Ogg (ogg/oga) | OPUS
audio/pcm;encoding=s16le;rate=8000 | - | PCM signed 16-bit little-endian, 8000 Hz, interleaved
audio/pcm;encoding=s16le;rate=16000 | - | PCM signed 16-bit little-endian, 16000 Hz, interleaved
audio/pcm;encoding=s16le;rate=44100 | - | PCM signed 16-bit little-endian, 44100 Hz, interleaved
audio/pcm;encoding=s16le;rate=48000 | - | PCM signed 16-bit little-endian, 48000 Hz, interleaved
audio/webm;codecs=opus | WebM (webm) | OPUS
audio/x-matroska;codecs=aac | Matroska (mkv/mka) | AAC
audio/x-matroska;codecs=flac | Matroska (mkv/mka) | FLAC, interleaved
audio/x-matroska;codecs=mp3 | Matroska (mkv/mka) | MP3
audio/x-matroska;codecs=opus | Matroska (mkv/mka) | OPUS
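
Because the content type is an enum, a typo is only discovered when the request is rejected. A client could mirror the table in a lookup and check the value locally first. This is a minimal Python sketch under that assumption; the helper names check_content_type and pcm_content_type are hypothetical, and the comparison is a strict string match, since this page does not say whether the server tolerates variants such as extra whitespace.

```python
# Values copied from the table of supported formats on this page.
SUPPORTED_CONTENT_TYPES = frozenset({
    "audio/auto",
    "audio/flac",
    "audio/mpeg",
    "audio/ogg",
    "audio/webm",
    "audio/x-matroska",
    "audio/ogg;codecs=flac",
    "audio/ogg;codecs=opus",
    "audio/pcm;encoding=s16le;rate=8000",
    "audio/pcm;encoding=s16le;rate=16000",
    "audio/pcm;encoding=s16le;rate=44100",
    "audio/pcm;encoding=s16le;rate=48000",
    "audio/webm;codecs=opus",
    "audio/x-matroska;codecs=aac",
    "audio/x-matroska;codecs=flac",
    "audio/x-matroska;codecs=mp3",
    "audio/x-matroska;codecs=opus",
})


def check_content_type(value: str) -> str:
    """Return value unchanged if it is a documented content type, else raise."""
    if value not in SUPPORTED_CONTENT_TYPES:
        raise ValueError(f"unsupported source_media_content_type: {value!r}")
    return value


def pcm_content_type(rate_hz: int) -> str:
    """Build the PCM content type for a sample rate; PCM is never auto-detected."""
    return check_content_type(f"audio/pcm;encoding=s16le;rate={rate_hz}")


print(pcm_content_type(16000))
```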

We recommend the following bitrates as a good tradeoff between quality and bandwidth:

  • AAC: 96 kbps
  • FLAC: 256 kbps (16000 Hz)
  • MP3: 128 kbps
  • OPUS: 32 kbps (recommendation for low bandwidth scenarios)
  • PCM: 256 kbps (16000 Hz, default recommendation)
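
The PCM figure follows directly from the format parameters: uncompressed bitrate is sample rate times bits per sample times channel count. The short Python sketch below works this out for the four documented PCM sample rates, assuming 1 kbps = 1000 bit/s; the function name pcm_bitrate_kbps is introduced here for illustration only.

```python
def pcm_bitrate_kbps(sample_rate_hz: int, bits_per_sample: int = 16, channels: int = 1) -> float:
    """Uncompressed PCM bitrate in kbps (1 kbps = 1000 bit/s)."""
    return sample_rate_hz * bits_per_sample * channels / 1000


# 16-bit mono, as required by the s16le single-channel formats.
for rate in (8000, 16000, 44100, 48000):
    print(f"{rate} Hz -> {pcm_bitrate_kbps(rate)} kbps")
# 8000 Hz -> 128.0 kbps
# 16000 Hz -> 256.0 kbps
# 44100 Hz -> 705.6 kbps
# 48000 Hz -> 768.0 kbps
```

The 16000 Hz row reproduces the 256 kbps recommendation in the list above.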
Available options:
audio/auto,
audio/flac,
audio/mpeg,
audio/ogg,
audio/webm,
audio/x-matroska,
audio/ogg;codecs=flac,
audio/ogg;codecs=opus,
audio/pcm;encoding=s16le;rate=8000,
audio/pcm;encoding=s16le;rate=16000,
audio/pcm;encoding=s16le;rate=44100,
audio/pcm;encoding=s16le;rate=48000,
audio/webm;codecs=opus,
audio/x-matroska;codecs=aac,
audio/x-matroska;codecs=flac,
audio/x-matroska;codecs=mp3,
audio/x-matroska;codecs=opus
Example:

"audio/ogg;codecs=opus"

source_language
enum<string>

The source language of the audio stream. It may be left empty; otherwise it must be one of the supported Voice API source languages, given as an IETF BCP 47 language tag.

Available options:
de,
cs,
en,
es,
fr,
id,
it,
ja,
ko,
nl,
pl,
pt,
ro,
ru,
sv,
tr,
uk,
zh
Example:

"en"

source_language_mode
enum<string>
default:auto

Controls how the source_language value is used.

  • auto: Treats source language as a hint; server can override
  • fixed: Treats source language as mandatory; server must use this language
Available options:
auto,
fixed
Example:

"fixed"

target_languages
enum<string>[]

List of target languages for translation. The stream emits translations for each listed language. At most 5 target languages are allowed per stream. Language identifiers must comply with IETF BCP 47.

Maximum length: 5
Example:
["de", "fr", "es"]
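
The limit of 5 can be enforced on the client before the request is made. A minimal Python sketch, in which check_target_languages is a hypothetical helper name; it checks only the documented length limit and does not validate the individual language tags, since the supported target languages are not listed on this page.

```python
from typing import Iterable, List

MAX_TARGET_LANGUAGES = 5  # documented maximum per stream


def check_target_languages(languages: Iterable[str]) -> List[str]:
    """Return the languages as a list, or raise if there are more than the maximum."""
    result = list(languages)
    if len(result) > MAX_TARGET_LANGUAGES:
        raise ValueError(
            f"{len(result)} target languages given, at most {MAX_TARGET_LANGUAGES} allowed"
        )
    return result


print(check_target_languages(["de", "fr", "es"]))
```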
glossary_id
string

A unique ID assigned to a glossary.

Example:

"def3a26b-3e84-45b3-84ae-0c0aaf3525f7"

formality
enum<string>
default:default

Sets whether the translated text should lean towards formal or informal language. Possible options are:

  • default - use the default formality for the target language
  • formal/more - use more formal language
  • informal/less - use more informal language
Available options:
default,
formal,
more,
informal,
less
Example:

"formal"

Response

Successfully obtained streaming URL and token.

streaming_url
string
required

The WebSocket URL to use for establishing the stream connection.

Example:

"wss://api.deepl.com/v3/voice/realtime/connect"

token
string
required

An ephemeral token for authenticating with the streaming endpoint. Pass it as a query parameter when connecting to the streaming URL. The token is valid only for a short time and for a single use.
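
Combining the two required response fields into a connection URL could look like the Python sketch below. Note the assumption: this page does not name the query parameter, so the default name "token" used here is a guess and is exposed as an argument; the function name connection_url is likewise hypothetical.

```python
import json
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def connection_url(response_text: str, param_name: str = "token") -> str:
    """Build the WebSocket URL from the JSON response of the session request.

    param_name is an ASSUMPTION: the query parameter name is not stated on this page.
    """
    data = json.loads(response_text)
    for field in ("streaming_url", "token"):  # both are marked required
        if field not in data:
            raise ValueError(f"response is missing required field {field!r}")
    parts = urlsplit(data["streaming_url"])
    # Keep any query string already present and append the URL-encoded token.
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append((param_name, data["token"]))
    return urlunsplit(parts._replace(query=urlencode(query)))


sample = """{
  "streaming_url": "wss://api.deepl.com/v3/voice/realtime/connect",
  "token": "VGhpcyBpcyBhIGZha2UgdG9rZW4K",
  "session_id": "4f911080-cfe2-41d4-8269-0e6ec15a0354"
}"""
print(connection_url(sample))
# wss://api.deepl.com/v3/voice/realtime/connect?token=VGhpcyBpcyBhIGZha2UgdG9rZW4K
```

Because the token is single-use and short-lived, a client would typically build this URL immediately before connecting and request a new session rather than reuse an old token.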

Example:

"VGhpcyBpcyBhIGZha2UgdG9rZW4K"

session_id
string

Internal use only. A unique identifier for the requested stream.

Example:

"4f911080-cfe2-41d4-8269-0e6ec15a0354"