Medical Mode
Supported models
u3-rt-pro, universal-streaming-english, universal-streaming-multilingual
Supported languages
en, es, de, fr
Supported regions
US & EU
Medical Mode is an add-on that enhances streaming transcription accuracy for medical terminology — including medication names, procedures, conditions, and dosages. It is optimized for medical entity recognition to correct terms that other models frequently get wrong.
Medical Mode can be used with all of our Streaming STT models.
Enable Medical Mode by setting the domain connection parameter to "medical-v1". No other changes to your existing pipeline are required.
Medical Mode is billed as a separate add-on. See the pricing page for details.
Quickstart
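As a minimal sketch, enabling Medical Mode comes down to adding domain=medical-v1 to the connection parameters of an existing streaming session. The endpoint URL and the sample_rate and speech_model parameter names below are assumptions for illustration, not taken from this page; check them against the streaming API reference before use.

```python
from urllib.parse import urlencode

# Assumed streaming endpoint; confirm against the streaming API reference.
STREAMING_URL = "wss://streaming.assemblyai.com/v3/ws"


def build_streaming_url(base_url: str = STREAMING_URL, **params) -> str:
    """Return a WebSocket URL whose query string enables Medical Mode."""
    # "domain": "medical-v1" is the only setting this page requires.
    # Everything passed in **params is whatever the pipeline already sends.
    query = {"domain": "medical-v1", **params}
    return f"{base_url}?{urlencode(query)}"


# sample_rate and speech_model are assumed parameter names for illustration.
url = build_streaming_url(sample_rate=16000, speech_model="u3-rt-pro")
print(url)
```

The rest of the connection (authentication, audio streaming, message handling) stays as it is in your current pipeline.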
Example output
Without Medical Mode, the transcript contains lisprohumalog.
With Medical Mode, this is updated to Lispro (Humalog), following the standard medical convention of writing the generic name first, with the brand name in parentheses.
Use cases
Medical Mode is designed for healthcare AI applications where accurate medical terminology is critical:
- Ambient clinical documentation — Capture medication names, dosages, and clinical terms correctly during live patient encounters.
- Real-time medical scribes — Deliver accurate transcripts to clinicians during or immediately after a consult.
- Front-office voice agents — Handle drug names, provider names, and clinic-specific terminology in scheduling calls and insurance verification.
- Medical contact centers — Transcribe calls with correct medical vocabulary for downstream processing and quality assurance.
Combine with other features
Medical Mode works alongside other streaming features. You can combine it with:
- Streaming Diarization to identify who said what in clinical conversations
- Keyterms Prompting to further boost accuracy for specific medical terms unique to your use case
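The combination above can be sketched as a single set of connection parameters. Only domain=medical-v1 comes from this page. The speaker_labels and keyterms_prompt parameter names, and the JSON encoding of the key terms, are assumptions for illustration; confirm them on the Streaming Diarization and Keyterms Prompting pages.

```python
import json
from urllib.parse import urlencode


def medical_params(keyterms: list[str], diarize: bool = True) -> dict[str, str]:
    """Build connection parameters combining Medical Mode with other features."""
    params = {"domain": "medical-v1"}  # from this page
    if diarize:
        # Hypothetical parameter name for streaming diarization.
        params["speaker_labels"] = "true"
    if keyterms:
        # Assumed parameter name and encoding (JSON array in the query string).
        params["keyterms_prompt"] = json.dumps(keyterms)
    return params


query = urlencode(medical_params(["Lispro", "Humalog"]))
print(query)
```

Keeping the parameters in one dictionary makes it easy to switch individual features on or off without touching the Medical Mode setting.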
Configuration for medical audio
Medical conversations — such as clinical dictation, patient encounters, and ambient scribes — have different speech patterns than typical voice agent interactions. Clinicians often pause mid-sentence to think, review a chart, or formulate a diagnosis. The default turn detection settings are optimized for fast-paced voice agent dialogues and can incorrectly fragment these natural pauses into separate turns.
To prevent premature turn boundaries in medical audio, increase the silence thresholds.
The recommended values match the Conservative quick start configuration on the turn detection page. You can further adjust them based on your specific workflow: for example, a real-time medical scribe may benefit from a lower max_turn_silence (around 2000 ms) than a dictation application.
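A rough sketch of choosing max_turn_silence per workflow follows. The 2000 ms figure for a scribe comes from the text above. The dictation figure is a placeholder, not a documented value, and should be replaced with the Conservative settings from the turn detection page. The workflow names are hypothetical labels.

```python
# max_turn_silence in milliseconds, keyed by a hypothetical workflow label.
MAX_TURN_SILENCE_MS = {
    "scribe": 2000,     # "around 2000 ms", per the text above
    "dictation": 4000,  # placeholder only; use the documented Conservative value
}


def turn_params(workflow: str) -> dict[str, object]:
    """Return Medical Mode connection parameters with a workflow-specific silence limit."""
    if workflow not in MAX_TURN_SILENCE_MS:
        raise ValueError(f"unknown workflow: {workflow!r}")
    return {
        "domain": "medical-v1",
        "max_turn_silence": MAX_TURN_SILENCE_MS[workflow],
    }


print(turn_params("scribe"))
```

A longer limit tolerates more mid-sentence thinking time at the cost of slower turn finalization, which is why the dictation entry is set higher than the scribe entry.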
Avoid setting end_of_turn_confidence_threshold to 0
If you are using a Universal Streaming model (not U3 Pro), do not set end_of_turn_confidence_threshold to 0. This completely disables semantic turn detection and forces a turn boundary at every silence, which is especially harmful for medical audio where mid-sentence pauses are common. See Turn detection for details.
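One way to catch this misconfiguration early is a small client-side check run before connecting. This is an illustrative guard, not part of any SDK. The model names come from the supported models list on this page, and the function name is hypothetical.

```python
# Universal Streaming models listed on this page (U3 Pro is excluded).
UNIVERSAL_STREAMING_MODELS = {
    "universal-streaming-english",
    "universal-streaming-multilingual",
}


def check_turn_threshold(speech_model: str, params: dict) -> None:
    """Raise if semantic turn detection would be disabled on a Universal Streaming model."""
    threshold = params.get("end_of_turn_confidence_threshold")
    if threshold is None:
        return  # parameter not set, so the service default applies
    if speech_model in UNIVERSAL_STREAMING_MODELS and float(threshold) == 0:
        raise ValueError(
            "end_of_turn_confidence_threshold=0 disables semantic turn detection "
            "and forces a turn boundary at every silence"
        )


# Passes silently: a non-zero threshold on a Universal Streaming model.
check_turn_threshold(
    "universal-streaming-english",
    {"end_of_turn_confidence_threshold": 0.7},
)
```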
HIPAA compliance
AssemblyAI offers a Business Associate Agreement (BAA) for customers who need to process Protected Health Information (PHI). AssemblyAI is SOC 2 Type 2, ISO 27001:2022, and PCI DSS v4.0 certified. Medical Mode does not change existing data handling or retention policies.
For BAA setup or enterprise pricing, contact our sales team.