Pick any voice ID from the tables below and set it as session.output.voice in a session.update message sent before session.ready. session.output is immutable once the session is established, so the voice can’t be changed mid-conversation.
{
  "type": "session.update",
  "session": {
    "output": { "voice": "ivy" }
  }
}
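
As a minimal sketch, the message above could be built in Python as follows. The helper name `build_voice_update` is hypothetical, and the transport (opening the connection and sending the payload) is deliberately left out because it is not described in this section.

```python
import json


def build_voice_update(voice_id: str) -> str:
    """Return a session.update message that sets session.output.voice.

    `build_voice_update` is an illustrative helper, not part of any SDK.
    """
    if not voice_id:
        raise ValueError("voice_id must be a non-empty string")
    message = {
        "type": "session.update",
        "session": {"output": {"voice": voice_id}},
    }
    return json.dumps(message)


# Send this payload before session.ready; the voice is fixed afterwards.
payload = build_voice_update("ivy")
print(payload)
```

Building the message in one place makes it easier to guarantee that the voice is chosen once, up front, rather than in code paths that may run after the session is established.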

Language support

The voice agent’s input (speech recognition) and output (speech synthesis) cover different sets of languages:
  • Input (understood): 🇺🇸 English, 🇫🇷 French, 🇩🇪 German, 🇮🇹 Italian, 🇵🇹 Portuguese, and 🇪🇸 Spanish.
  • Output (spoken): those six, plus 🇮🇳 Hindi, 🇯🇵 Japanese, 🇰🇷 Korean, 🇨🇳 Mandarin, and 🇷🇺 Russian.
The agent can speak a language it can’t transcribe from user audio. This is useful for translation-style flows where the user speaks one of the recognized languages and the agent replies in another.
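
The input/output asymmetry can be expressed as a small lookup. This is only a sketch of the language lists above: the set names and the `flow_is_supported` helper are hypothetical, and languages are identified by their plain English names purely for illustration.

```python
# Languages the agent can transcribe from user audio (from the list above).
INPUT_LANGUAGES = {"English", "French", "German", "Italian", "Portuguese", "Spanish"}

# Languages the agent can speak but not transcribe.
OUTPUT_ONLY_LANGUAGES = {"Hindi", "Japanese", "Korean", "Mandarin", "Russian"}

# Every language the agent can speak.
OUTPUT_LANGUAGES = INPUT_LANGUAGES | OUTPUT_ONLY_LANGUAGES


def flow_is_supported(user_language: str, reply_language: str) -> bool:
    """True if the user can be understood and the reply can be spoken."""
    return user_language in INPUT_LANGUAGES and reply_language in OUTPUT_LANGUAGES


# A translation-style flow: the user speaks Spanish, the agent replies in Japanese.
print(flow_is_supported("Spanish", "Japanese"))
# The reverse fails because Japanese is output-only.
print(flow_is_supported("Japanese", "Spanish"))
```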

Choose a voice by language

Every voice supports every output language. The difference between the two tables is the voice’s primary accent:
  • For an English accent (American or British) carried into other languages, pick from Voices.
  • For a native accent in a specific non-English language, pick the matching language-specific voice.

Voices

These voices have an American or British English accent. They speak 🇺🇸 English, 🇫🇷 French, 🇩🇪 German, 🇮🇹 Italian, 🇵🇹 Portuguese, 🇪🇸 Spanish, 🇮🇳 Hindi, 🇨🇳 Mandarin, 🇷🇺 Russian, 🇰🇷 Korean, and 🇯🇵 Japanese. Their English accent carries over into the other languages.
| Voice | Accent | Description |
| --- | --- | --- |
| ivy | 🇺🇸 | Professional, deliberate, smooth |
| james | 🇺🇸 | Conversational, professional, male |
| tyler | 🇺🇸 | Theatrical, energetic, chatty, jagged |
| winter | 🇺🇸 | Empathetic, aesthetic, conversational |
| sam | 🇺🇸 | Soft, conversational, young |
| mia | 🇺🇸 | Smooth, conversational, young |
| bella | 🇺🇸 | High-pitched, chatty |
| david | 🇺🇸 | Deep, calming, conversational |
| jack | 🇺🇸 | Smooth, direct, clear, fast-paced |
| kyle | 🇺🇸 | Chatty, nasal, expressive |
| helen | 🇺🇸 | Soft, older, calming |
| martha | 🇺🇸 | Southern, older, warm |
| river | 🇺🇸 | Slow, calming, ASMR |
| emma | 🇺🇸 | Lively, young, conversational |
| victor | 🇺🇸 | Deep, older |
| eleanor | 🇺🇸 | Deeper, older, calming |
| sophie | 🇬🇧 | Clear, smooth, instructive, simple |
| oliver | 🇬🇧 | Narrative, conversational |

Language-specific voices

These voices have a native accent in a specific non-English language. They also speak 🇺🇸 English, 🇫🇷 French, 🇩🇪 German, 🇮🇹 Italian, 🇵🇹 Portuguese, 🇪🇸 Spanish, 🇮🇳 Hindi, 🇨🇳 Mandarin, 🇷🇺 Russian, 🇰🇷 Korean, and 🇯🇵 Japanese, and they code-switch naturally between their primary language and English.
| Voice | Native accent | Description |
| --- | --- | --- |
| arjun | 🇮🇳 Hindi/Hinglish | Conversational |
| ethan | 🇨🇳 Mandarin | Conversational |
| dmitri | 🇷🇺 Russian | Conversational |
| lukas | 🇩🇪 German | British English accent, conversational, smooth |
| lena | 🇩🇪 German | Conversational, soft |
| pierre | 🇫🇷 French | Conversational |
| mina | 🇰🇷 Korean | |
| ren | 🇯🇵 Japanese | |
| mei | 🇨🇳 Mandarin | |
| joon | 🇰🇷 Korean | |
| giulia | 🇮🇹 Italian | |
| luca | 🇮🇹 Italian | |
| lucia | 🇪🇸 Spanish | |
| hana | 🇯🇵 Japanese | |
| mateo | 🇪🇸 Spanish | |
| diego | 🇨🇴 Spanish (Latin American) | Colombian |
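
The selection rule from "Choose a voice by language" can be sketched as a lookup over the two tables. The data structures and the `candidate_voices` helper are hypothetical; only the voice IDs and their accents come from the tables. Note that no language-specific voice is listed for Portuguese, so this sketch assumes a fallback to the English-accent voices in that case.

```python
# Voice IDs from the "Voices" table (American first, then British).
ENGLISH_ACCENT_VOICES = [
    "ivy", "james", "tyler", "winter", "sam", "mia", "bella", "david",
    "jack", "kyle", "helen", "martha", "river", "emma", "victor", "eleanor",
    "sophie", "oliver",
]

# Voice IDs from the "Language-specific voices" table, keyed by native accent.
NATIVE_ACCENT_VOICES = {
    "Hindi": ["arjun"],
    "Mandarin": ["ethan", "mei"],
    "Russian": ["dmitri"],
    "German": ["lukas", "lena"],
    "French": ["pierre"],
    "Korean": ["mina", "joon"],
    "Japanese": ["ren", "hana"],
    "Italian": ["giulia", "luca"],
    "Spanish": ["lucia", "mateo", "diego"],
}


def candidate_voices(language: str, native_accent: bool = True) -> list[str]:
    """Return voice IDs to consider for speaking `language`.

    With native_accent=True, prefer voices whose primary accent matches the
    language; otherwise (or if none is listed) return the English-accent voices.
    """
    if native_accent and language in NATIVE_ACCENT_VOICES:
        return list(NATIVE_ACCENT_VOICES[language])
    return list(ENGLISH_ACCENT_VOICES)


print(candidate_voices("Korean"))
print(candidate_voices("Korean", native_accent=False)[:3])
```

Because every voice supports every output language, the fallback never produces an unusable voice; it only changes the accent the listener hears.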