Introducing Universal-3 Pro

Overview

Universal-3 Pro is our most powerful Voice AI model, designed to capture the “hard stuff” that traditional ASR models struggle with. It delivers state-of-the-art accuracy for entities, rare words, and domain-specific terminology out of the box, and it supports code switching and optional prompting for more control. It’s also our fastest model, so you get the best accuracy without sacrificing speed.

Universal-3 Pro is available for both pre-recorded (async) and streaming use cases. Configuration and settings differ between the two because streaming is optimized for real-time audio utterances, typically under 10 seconds, with special efficiencies built into the model for low-latency turn detection and voice agent workflows. Choose the guide below that matches your use case:

Universal-3 Pro Async

For pre-recorded audio files. Supports long-form audio, prompting, keyterms prompting, and full language detection.

Universal-3 Pro Streaming

For real-time audio streams. Optimized for low-latency turn detection, voice agents, and live transcription.
