What to log for support

Every LLM Gateway response includes a request_id — a unique identifier for that specific request. Log this ID for every call, not just when something goes wrong. When you reach out to support@assemblyai.com, including the request_id lets us find the exact request in our logs in seconds. At minimum, capture the following for every request:
  • request_id from the response body
  • The model parameter used
  • The API region (US: llm-gateway.assemblyai.com, EU: llm-gateway.eu.assemblyai.com)
  • A timestamp for when the request was sent
  • The HTTP status code and the full response body when a non-2xx response is returned
A minimal logging example:
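import time

import requests

MODEL = "claude-sonnet-4-6"

# Record the time before the call so the log reflects when the request was sent.
sent_at = time.time()
response = requests.post(
    "https://llm-gateway.assemblyai.com/v1/chat/completions",
    headers={"authorization": "<YOUR_API_KEY>"},
    json={
        "model": MODEL,
        "messages": [{"role": "user", "content": "What is the capital of France?"}],
        "max_tokens": 1000,
    },
)

# An error response may not contain valid JSON, so fall back to an empty dict.
try:
    result = response.json()
except ValueError:
    result = {}

log_entry = {
    "timestamp": sent_at,
    "region": "us",
    "model": MODEL,
    "status_code": response.status_code,
    "request_id": result.get("request_id"),
    "error": result.get("error"),
    # Keep the full body whenever the response is not a success.
    "response_body": None if response.ok else response.text,
}
print(log_entry)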

Authentication errors (401 / 403)

Symptom: The API responds with 401 Unauthorized or 403 Forbidden.
{
  "error": {
    "code": 401,
    "message": "Unauthorized - Invalid or missing API key"
  }
}
Causes:
  • API key is missing, malformed, or expired.
  • API key is from a different account or region.
  • The Authorization header is misspelled (e.g. Authorisation) or missing entirely.
Fixes:
  • Confirm your API key on the API Keys page.
  • Pass the key in the Authorization header — not as a query parameter and not prefixed with Bearer.
  • If you’re using EU data residency, make sure the key was generated for the EU region. See Cloud endpoints and data residency.
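The header and region checks above can be caught before a request is sent. The following is a minimal sketch, not an official client: build_request_target is a hypothetical helper, and it assumes the EU host uses the same /v1/chat/completions path as the US host shown earlier.

```python
US_BASE_URL = "https://llm-gateway.assemblyai.com"
EU_BASE_URL = "https://llm-gateway.eu.assemblyai.com"


def build_request_target(api_key, region="us"):
    """Return (url, headers) for a chat completions call (hypothetical helper)."""
    key = (api_key or "").strip()
    if not key:
        raise ValueError("API key is missing or empty")
    # The key goes in the Authorization header as-is, without a Bearer prefix.
    if key.lower().startswith("bearer "):
        raise ValueError("Pass the raw API key; do not prefix it with 'Bearer'")
    bases = {"us": US_BASE_URL, "eu": EU_BASE_URL}
    if region not in bases:
        raise ValueError(f"Unknown region: {region!r}")
    # Assumption: both regions expose the same path.
    url = f"{bases[region]}/v1/chat/completions"
    return url, {"Authorization": key}
```

A check like this cannot tell whether a key is expired or belongs to another region; it only rules out the formatting mistakes listed under Causes.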

Bad request (400)

Symptom: The API responds with 400 Bad Request.
{
  "error": {
    "code": 400,
    "message": "Invalid request: missing required field 'model'"
  }
}
Causes:
  • A required field is missing (model, plus either messages or prompt).
  • The model value is not a supported model parameter — see Available models.
  • max_tokens is outside the valid range or exceeds the model’s context window.
  • A field is the wrong type (e.g. messages sent as a string instead of an array).
Fixes:
  • Validate your request payload against the Basic chat completions reference.
  • Log the full error message; it names the specific field that failed validation.
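Some of the causes listed above can be checked locally before the request leaves your application. This sketch covers only the structural checks named in this section; validate_payload is a hypothetical helper, and it does not know each model's max_tokens range or context window, which only the API can enforce.

```python
def validate_payload(payload):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []

    model = payload.get("model")
    if not isinstance(model, str) or not model:
        problems.append("'model' is required and must be a non-empty string")

    has_messages = "messages" in payload
    has_prompt = "prompt" in payload
    if not (has_messages or has_prompt):
        problems.append("either 'messages' or 'prompt' is required")

    if has_messages and not isinstance(payload["messages"], list):
        found = type(payload["messages"]).__name__
        problems.append(f"'messages' must be an array (list), got {found}")

    max_tokens = payload.get("max_tokens")
    if max_tokens is not None:
        # bool is a subclass of int in Python, so exclude it explicitly.
        is_int = isinstance(max_tokens, int) and not isinstance(max_tokens, bool)
        if not is_int or max_tokens <= 0:
            problems.append("'max_tokens' must be a positive integer")

    return problems
```

Calling validate_payload({"messages": "hi"}) reports two problems: the missing model and the string-typed messages field.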

Rate limit exceeded (429)

Symptom: The API responds with 429 Too Many Requests.
Cause: You exceeded the per-model rate limit within a 60-second window. Each model has its own limit.
Fixes:
  • Read the rate limit headers on every response (X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset) to back off gracefully. See Rate limits for the full header reference.
  • Implement exponential backoff with jitter when you receive a 429.
  • Consider specifying fallback models so traffic spills over to a different model when the primary is rate-limited.
  • If you need a higher rate limit, contact support.
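Exponential backoff with jitter can be sketched as follows. This is one common variant ("full jitter"), not a prescribed algorithm: the helper names, the 1-second base, and the 60-second cap (chosen to match the rate limit window) are assumptions. The send argument is any zero-argument callable that performs the request and returns an object with a status_code attribute, such as a requests response.

```python
import random
import time


def backoff_delay(attempt, base=1.0, cap=60.0, rng=random.random):
    """Full-jitter delay: a random value in [0, min(cap, base * 2**attempt))."""
    ceiling = min(cap, base * (2 ** attempt))
    return rng() * ceiling


def call_with_backoff(send, max_attempts=5, sleep=time.sleep):
    """Call send() until it returns a non-429 response or attempts run out."""
    response = None
    for attempt in range(max_attempts):
        response = send()
        if response.status_code != 429:
            return response
        # Do not sleep after the final attempt; just return the last 429.
        if attempt < max_attempts - 1:
            sleep(backoff_delay(attempt))
    return response
```

The sketch deliberately ignores the X-RateLimit-Reset header because its format is not described here; if the Rate limits reference specifies it, waiting until the reset time is usually a better choice than a random delay.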

Model not found (404)

Symptom: The API responds with 404 Not Found and an error mentioning the model.
Causes:
  • The model value is misspelled or has been deprecated.
  • The model isn’t available in the region you’re calling. For example, OpenAI models are only available in the US region — see Cloud endpoints and data residency.
Fixes:
  • Double-check the exact model parameter against Available models.
  • If you need EU data residency, switch to an EU-supported model (most Anthropic Claude and Google Gemini models).

Server errors (5xx)

Symptom: The API responds with 500, 502, 503, or 504.
Causes:
  • Transient issues on AssemblyAI’s side or with the upstream model provider.
  • The upstream provider returned a timeout or unavailable response.
Fixes:
  • Retry with exponential backoff and jitter. Most 5xx errors are transient.
  • Check the AssemblyAI Status page for ongoing incidents.
  • If the error persists, contact support with the request_id, the model used, the timestamp, and the full error response body.
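When retrying server errors, it helps to keep a record of each failed attempt so the details are on hand if the error persists. The sketch below is one possible shape for that, with assumed names and delay values (a 30-second cap, four attempts); send is any zero-argument callable returning an object with status_code and text attributes.

```python
import random
import time

RETRYABLE_STATUSES = {500, 502, 503, 504}


def retry_server_errors(send, max_attempts=4, sleep=time.sleep):
    """Retry 5xx responses; return (last_response, list_of_failed_attempts)."""
    failures = []
    response = None
    for attempt in range(max_attempts):
        sent_at = time.time()
        response = send()
        if response.status_code not in RETRYABLE_STATUSES:
            return response, failures
        # Keep what support will ask for: timestamp, status, and full body.
        failures.append({
            "timestamp": sent_at,
            "status_code": response.status_code,
            "body": response.text,
        })
        if attempt < max_attempts - 1:
            # Exponential backoff with jitter, capped at 30 seconds.
            sleep(random.uniform(0, min(30.0, 2 ** attempt)))
    return response, failures
```

If the final response is still a 5xx, the failures list already holds the timestamps and bodies to attach to a support request, alongside each request_id.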

Streamed responses don’t appear

Symptom: You set stream: true but receive a single non-streamed response, or no response at all.
Causes:
  • Streaming is currently supported on OpenAI models only. Other providers ignore the stream flag and return a regular response.
  • The HTTP client isn’t reading the response body as a stream of server-sent events (SSE).
Fixes:
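For the first cause, the text above implies switching to an OpenAI model. For the second, the client has to read the body incrementally instead of waiting for the whole response. The sketch below parses SSE lines as they arrive; it assumes OpenAI-style framing, where each event is a `data:` line carrying JSON and the stream ends with `data: [DONE]`. That framing is an assumption, not something this page confirms, and iter_sse_data is a hypothetical helper.

```python
import json


def iter_sse_data(lines):
    """Yield the decoded JSON payload of each `data:` line in an SSE stream."""
    for raw in lines:
        line = raw.decode("utf-8") if isinstance(raw, bytes) else raw
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":  # assumed end-of-stream sentinel
            break
        yield json.loads(data)
```

With the requests library, pass stream=True to requests.post and feed response.iter_lines() to iter_sse_data; without stream=True, requests buffers the entire body before returning.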

Unexpected output or quality issues

Symptom: The model returns content you didn’t expect: wrong format, wrong language, hallucinations, or refusals.
Fixes:
  • Capture the full request payload (model, messages, parameters), the full response, and the request_id. Send all three to support@assemblyai.com — quality issues are difficult to diagnose without the exact prompt.
  • For structured output, use Structured outputs with a JSON schema rather than prompting for JSON in free text.
  • For malformed JSON, enable Post-processing to automatically repair responses.
  • Try a different model — quality varies. See the LMArena scores for a comparison.

Contacting support

If you’ve worked through the steps above and still need help, email support@assemblyai.com with:
  • The request_id from the failing response (or several, for intermittent issues)
  • The model parameter used
  • The API region (US or EU)
  • A timestamp for when the request was sent
  • The HTTP status code and full error response body
  • A minimal reproducible example of the request payload (with your API key redacted)
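The checklist above can be assembled from the log entries you already capture. This is a rough sketch with a hypothetical helper name and report layout; the only behavior it guarantees is that the Authorization header value is replaced before the report is serialized.

```python
import json


def build_support_report(log_entries, payload, headers):
    """Bundle log entries and an example request, with the API key redacted."""
    safe_headers = {
        name: ("<REDACTED>" if name.lower() == "authorization" else value)
        for name, value in headers.items()
    }
    report = {
        "requests": log_entries,  # request_id, model, region, timestamp, status
        "example_request": {"headers": safe_headers, "json": payload},
    }
    return json.dumps(report, indent=2, sort_keys=True)
```

Review the output before sending it: the helper redacts the header only, so a key pasted into the payload or a log entry would still appear.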