Register tools in session.tools to let your agent take actions. The agent emits a tool.call; you run the tool and reply with a tool.result when reply.done is the latest event you’ve received.
    import asyncio, json, websockets

    URL = "wss://agents.assemblyai.com/v1/ws"

    TOOLS = [{
        "type": "function",
        "name": "get_weather",
        "description": "Get current weather for any city. Use this whenever the user asks about weather, temperature, or conditions. Prefer calling this over guessing.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string", "description": "City name (e.g. London)"}},
            "required": ["city"],
        },
    }]

    async def main():
        async with websockets.connect(URL, extra_headers={"Authorization": "Bearer YOUR_KEY"}) as ws:
            await ws.send(json.dumps({
                "type": "session.update",
                "session": {
                    "system_prompt": "You are a weather assistant. Call get_weather for weather questions. When in doubt, call the tool.",
                    "greeting": "Hi! Ask me about the weather.",
                    "tools": TOOLS,
                    "output": {"type": "audio", "voice": "ivy"},
                },
            }))

            last_event, pending = None, []

            async def flush_if_idle():
                if last_event != "reply.done" or not pending:
                    return
                for t in pending:
                    await ws.send(json.dumps({"type": "tool.result", "call_id": t["call_id"], "result": json.dumps(t["result"])}))
                pending.clear()

            async for raw in ws:
                event = json.loads(raw)
                t = event.get("type")
                if t == "tool.call" and event["name"] == "get_weather":
                    pending.append({"call_id": event["call_id"], "result": {"temp_c": 22, "description": "Sunny"}})
                    await flush_if_idle()
                elif t in ("reply.started", "input.speech.started"):
                    last_event = t
                elif t == "reply.done":
                    last_event = t
                    if event.get("status") == "interrupted":
                        pending.clear()
                    else:
                        await flush_if_idle()

    asyncio.run(main())
Treat description as “when should I reach for this?”, not “what does this do?”.
{ "description": "Get current weather for any city. Use this whenever the user asks about weather, temperature, conditions, what to wear, or anything weather-dependent. Prefer calling this over guessing."}
Name the trigger: “Call this when the user asks about X.”
Lead with format (“E.164”, “ISO-8601 date”, “lowercase”).
Always include an example.
Use enum for fixed sets.
Mark a field as required only if the tool truly can’t function without it; otherwise the model interrogates the user for it.
parameters is not validated at session.update time. Malformed schemas (missing type: "object", broken enum) are accepted silently and break tool calling at runtime. Validate locally.
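Because malformed schemas are accepted silently, a small local check before sending session.update can catch the failure modes named above. The following is a minimal sketch, not an official validator: check_tool_schema is a hypothetical helper, and it only covers the missing type: "object", broken enum, and required fields that are not declared in properties.

```python
def check_tool_schema(tool: dict) -> list[str]:
    """Return a list of problems found in one tool definition (empty if none)."""
    problems = []
    if not isinstance(tool.get("name"), str) or not tool.get("name"):
        problems.append("tool is missing a string 'name'")
    if not tool.get("description"):
        problems.append("tool is missing a 'description'")

    params = tool.get("parameters")
    if not isinstance(params, dict):
        problems.append("'parameters' must be an object")
        return problems
    if params.get("type") != "object":
        problems.append("parameters.type must be 'object'")

    props = params.get("properties", {})
    if not isinstance(props, dict):
        problems.append("parameters.properties must be an object")
        props = {}
    for field, spec in props.items():
        if not isinstance(spec, dict) or "type" not in spec:
            problems.append(f"property '{field}' is missing 'type'")
            continue
        # An enum must be a non-empty list to constrain anything.
        if "enum" in spec and (not isinstance(spec["enum"], list) or not spec["enum"]):
            problems.append(f"property '{field}' has an empty or non-list enum")

    required = params.get("required", [])
    if not isinstance(required, list):
        problems.append("parameters.required must be a list")
        required = []
    for field in required:
        if field not in props:
            problems.append(f"required field '{field}' is not in properties")
    return problems
```

Running this over every entry in your tools list at startup, and refusing to connect if any list comes back non-empty, turns a silent runtime failure into an immediate one.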
Strong tool descriptions: see above. Most “tool never fires” failures trace here.
Strong parameter descriptions: same idea applied per-field. Vague params produce missing or invented argument values, which the validator then rejects (or worse, your tool runs on garbage). Lead with format, include an example, use enum for fixed sets. See Parameters.
Default-to-call wording in system_prompt: “When in doubt, call the tool. A wasted call is fine. Answering wrong from memory is not.” Don’t stack exceptions.
Few-shot examples in system_prompt are the strongest behavioural signal:
User: “Where’s my order?”
You: [call search_orders] “Looks like it’s out for delivery today.”
Keep tool sets small (≤10 per phase). Past that, selection accuracy drops. See Progressive tool reveal.
Send tool.result when reply.done is the latest event you’ve received. Not earlier (agent is still mid-transition-phrase), not later (a new turn has started).
    last_event: str | None = None
    pending_tools: list[dict] = []

    async def flush_if_idle():
        if last_event != "reply.done" or not pending_tools:
            return
        for tool in pending_tools:
            await ws.send(json.dumps({
                "type": "tool.result",
                "call_id": tool["call_id"],
                "result": json.dumps(tool["result"]),  # JSON string
            }))
        pending_tools.clear()

    # In your event loop:
    if t == "tool.call":
        result = run_tool(event["name"], event["arguments"])
        pending_tools.append({"call_id": event["call_id"], "result": result})
        await flush_if_idle()  # may already be idle if reply.done fired first
    elif t in ("reply.started", "input.speech.started"):
        last_event = t  # turn in flight, hold results
    elif t == "reply.done":
        last_event = t
        if event.get("status") == "interrupted":
            pending_tools.clear()  # agent moved on, drop stale results
        else:
            await flush_if_idle()
Two non-obvious bits:
Call flush_if_idle() from the tool.call handler. Your tool may return after reply.done has already fired.
Update last_event on reply.started / input.speech.started so results that become available mid-turn are held until that turn ends.
The error field is read verbatim by the model. Weak errors cause guessing loops; specific errors get clean recoveries.
Weak (agent re-asks for everything):
{ "error": "Lookup failed." }
Strong (agent re-asks only for the field that failed):
{ "error": "Could not resolve DROPOFF 'Central train station'. Pickup resolved ('SW1A 1AA'). Ask the user for a UK postcode for the dropoff." }
Patterns: name the failing field, say what did work so the agent doesn’t re-ask for it, tell the agent what to ask for next.
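One way to keep error strings consistent across tools is to build them from the three parts listed above. This is a rough sketch under assumptions: build_tool_error and its parameters are hypothetical names, and the returned dict is meant to be used as the tool's result (serialized to a JSON string when you send tool.result).

```python
def build_tool_error(failed_field: str, failed_value: str,
                     resolved: dict[str, str], ask_for: str) -> dict:
    """Compose a specific error: what failed, what worked, what to ask next."""
    # 1. Name the failing field and the value that could not be used.
    parts = [f"Could not resolve {failed_field.upper()} '{failed_value}'."]
    # 2. Say what did work, so the agent doesn't re-ask for it.
    for field, value in resolved.items():
        parts.append(f"{field.capitalize()} resolved ('{value}').")
    # 3. Tell the agent what to ask the user for next.
    parts.append(f"Ask the user for {ask_for}.")
    return {"error": " ".join(parts)}


result = build_tool_error(
    failed_field="dropoff",
    failed_value="Central train station",
    resolved={"pickup": "SW1A 1AA"},
    ask_for="a UK postcode for the dropoff",
)
```

With these inputs the helper reproduces the strong error shown above.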
During hold, the server does not emit transcript.user.delta or transcript.user in real time. Transcripts flush once the hold ends (tool.result or reply.create). Live captioning pauses during the hold; nothing is dropped.
For multi-step workflows (lookup → estimate → commit), don’t register all tools upfront. After each successful tool.result, send session.update adding the next phase’s tools, and update system_prompt to match.
Why: a tool that isn’t in the current list can’t be called, so the model can’t fabricate a commit before the prerequisite step has run. Smaller per-phase tool sets also raise selection accuracy.
    TIER_1_TOOLS = [lookup_postcode]
    TIER_2_TOOLS = [lookup_postcode, estimate_fare]
    TIER_3_TOOLS = [lookup_postcode, estimate_fare, book_ride, get_booking, track_driver, cancel_ride]

    tier_2_unlocked = tier_3_unlocked = False

    async def maybe_unlock_next_tier(tool_name, result):
        global tier_2_unlocked, tier_3_unlocked
        if result.get("error"):
            return
        if not tier_2_unlocked and tool_name == "lookup_postcode" and result.get("postcode"):
            tier_2_unlocked = True
            await ws.send(json.dumps({"type": "session.update", "session": {"tools": TIER_2_TOOLS, "system_prompt": TIER_2_PROMPT}}))
        elif not tier_3_unlocked and tool_name == "estimate_fare" and result.get("estimated_fare"):
            tier_3_unlocked = True
            await ws.send(json.dumps({"type": "session.update", "session": {"tools": TIER_3_TOOLS, "system_prompt": TIER_3_PROMPT}}))
Update tools AND system_prompt together. Tool-only gating where the prompt still references a now-hidden tool can underperform not gating at all. The model hunts for a tool the prompt promised and stalls or improvises when it can’t find it. Strip or rewrite every prompt sentence that names a tool whose visibility changed.
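A simple lint can flag prompts that still name a hidden tool before you send the update. The sketch below is illustrative only: hidden_tools_in_prompt is a hypothetical helper that works on tool names, and it catches literal name mentions, not paraphrases such as "book the ride".

```python
import re


def hidden_tools_in_prompt(prompt: str, visible: list[str],
                           all_tools: list[str]) -> list[str]:
    """Return names of tools the prompt mentions that are not currently visible."""
    hidden = [name for name in all_tools if name not in visible]
    # Whole-word match so 'book_ride' does not match inside 'rebook_ride_now'.
    return [name for name in hidden
            if re.search(rf"\b{re.escape(name)}\b", prompt)]


ALL_TOOLS = ["lookup_postcode", "estimate_fare", "book_ride"]
stale = hidden_tools_in_prompt(
    "Call lookup_postcode first, then book_ride once the user confirms.",
    visible=["lookup_postcode"],
    all_tools=ALL_TOOLS,
)
```

If the returned list is non-empty, rewrite the prompt for that phase before sending session.update rather than shipping the mismatch.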
Real users go off-script. Two patterns, used together:
Transition tools: revise_pickup, revise_dropoff, restart, end_call exposed in every state. Model picks the right escape; orchestrator rolls state back.
respond_freely: a no-op tool in every state for tangential questions (“are you a real person?”). Model calls it instead of leaving the state.
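The two patterns can be combined in the orchestrator roughly as follows. This is a sketch under assumptions: the state names, the STATE_TOOLS mapping, and the ROLLBACK targets are hypothetical, and tool names stand in for the full tool definitions you would send in session.tools.

```python
# Tools exposed in every state: the escapes plus the no-op.
ESCAPE_TOOLS = ["revise_pickup", "revise_dropoff", "restart", "end_call"]
ALWAYS_ON = ESCAPE_TOOLS + ["respond_freely"]

# Hypothetical per-state tools for a ride-booking flow.
STATE_TOOLS = {
    "pickup": ["lookup_postcode"],
    "dropoff": ["lookup_postcode"],
    "quote": ["estimate_fare"],
    "confirm": ["book_ride"],
}

# Hypothetical rollback targets for each transition tool.
# end_call is left to the caller, which would close the session instead.
ROLLBACK = {"revise_pickup": "pickup", "revise_dropoff": "dropoff", "restart": "pickup"}


def tool_names_for(state: str) -> list[str]:
    """State-specific tools plus the tools that are always available."""
    return STATE_TOOLS[state] + ALWAYS_ON


def next_state(current: str, tool_name: str) -> str:
    """Roll state back on a transition tool; stay put on respond_freely."""
    if tool_name == "respond_freely":
        return current
    return ROLLBACK.get(tool_name, current)
```

Keeping the always-on list in one place makes it harder to forget an escape when a new state is added.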
Gating makes hallucinations harmless (no real booking happens) but doesn’t suppress the spoken claim. Pair with prompt wording:
    NEVER quote a fare, distance, time, confirmation number, name, or ETA unless
    those exact values came from a tool result in this conversation. If you
    haven't seen a tool result, you do NOT have these values. Don't estimate
    them. Don't guess. Don't say "around" a number.
    s0:  [check_availability, cancel_reservation]                      ← always
    s1a: [check_availability, create_reservation, cancel_reservation]  ← if available
    s1b: [check_availability, add_to_waitlist, cancel_reservation]     ← if full
Prompt focus: “Confirm party size, date, time. Call check_availability. If open, offer it and book. If not, offer the next two times or the waitlist.”
The next tool depends on the prior result. Don’t expose both create_reservation and add_to_waitlist simultaneously. The model picks the wrong one ~30% of the time.
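The branch between the two follow-up states can be driven directly by the check_availability result. A minimal sketch, assuming the result carries a boolean available field (a hypothetical name; use whatever your tool actually returns) and using tool names in place of full definitions:

```python
def tools_after_availability(result: dict) -> list[str]:
    """Pick the next tool set from a check_availability result."""
    if result.get("error"):
        # Lookup failed: stay in s0 so the agent can retry.
        return ["check_availability", "cancel_reservation"]
    if result.get("available"):  # 'available' is an assumed field name
        # s1a: a slot is open, so only booking is exposed.
        return ["check_availability", "create_reservation", "cancel_reservation"]
    # s1b: fully booked, so only the waitlist is exposed.
    return ["check_availability", "add_to_waitlist", "cancel_reservation"]
```

Because exactly one of create_reservation and add_to_waitlist is ever in the list, the model never has to choose between them.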
Prompt focus: “Before sharing any account info, call verify_identity. Never quote a balance or transaction you haven’t fetched. Never promise a dispute outcome; only the system can.”
The anti-fabrication clause matters most here. A bank agent inventing a balance is a P0.