This guide will demonstrate how to use AssemblyAI’s LLM Gateway to process an audio file and find the best quotes via the chat completions endpoint.
Quickstart
- Python
- JavaScript
import requests
import json
import time
API_KEY = "YOUR_API_KEY"
headers = {"authorization": API_KEY}
# Transcribe the audio file
print("Submitting audio for transcription...")
transcript_response = requests.post(
    "https://api.assemblyai.com/v2/transcript",
    headers=headers,
    json={
        "audio_url": "https://assembly.ai/wildfires.mp3",
        "speaker_labels": True,
        "speech_models": ["universal-3-pro", "universal-2"],
        "language_detection": True
    }
)
transcript_id = transcript_response.json()["id"]
# Poll for transcription completion
while True:
    transcript_result = requests.get(
        f"https://api.assemblyai.com/v2/transcript/{transcript_id}",
        headers=headers
    ).json()
    if transcript_result["status"] == "completed":
        break
    elif transcript_result["status"] == "error":
        raise Exception(f"Transcription failed: {transcript_result['error']}")
    time.sleep(3)
# Extract utterances with timestamps
utterances_data = [
    {"text": u["text"], "start": u["start"], "end": u["end"], "speaker": u["speaker"]}
    for u in transcript_result["utterances"]
]
# Create prompt with timestamped utterances
prompt = f"""You are analyzing a transcript with timestamped utterances. Each utterance includes the text content, speaker label, and start/end timestamps in milliseconds.
Here is the transcript data:
{json.dumps(utterances_data, indent=2)}
Task: Identify the 3-5 most engaging, impactful, or quotable utterances from this transcript.
Return your response as a JSON object with the following structure:
{{
  "quotes": [
    {{
      "text": "exact quote text",
      "start": start_timestamp_in_milliseconds,
      "end": end_timestamp_in_milliseconds,
      "speaker": "speaker_label",
      "reason": "brief explanation of why this quote is engaging"
    }}
  ]
}}
Return ONLY valid JSON, no additional text."""
# Use LLM Gateway to extract quotes
print("Submitting transcript to LLM Gateway for quote extraction...")
gateway_response = requests.post(
    "https://llm-gateway.assemblyai.com/v1/chat/completions",
    headers=headers,
    json={
        "model": "claude-sonnet-4-5-20250929",
        "messages": [
            {"role": "user", "content": prompt}
        ]
    }
)
result = gateway_response.json()
content = result["choices"][0]["message"]["content"].strip().strip("```").removeprefix("json").strip()
quotes_json = json.loads(content)
print(json.dumps(quotes_json, indent=2))
const API_KEY = "YOUR_API_KEY";
const headers = {
  "authorization": API_KEY,
  "Content-Type": "application/json"
};
// Transcribe the audio file
console.log("Submitting audio for transcription...");
let res = await fetch(
  "https://api.assemblyai.com/v2/transcript",
  {
    method: "POST",
    headers,
    body: JSON.stringify({
      audio_url: "https://assembly.ai/wildfires.mp3",
      speaker_labels: true,
      speech_models: ["universal-3-pro", "universal-2"],
      language_detection: true
    })
  }
);
if (!res.ok) throw new Error(`Error: ${res.status}`);
const transcriptResponse = await res.json();
const transcriptId = transcriptResponse.id;
// Poll for transcription completion
let transcriptResult;
while (true) {
  res = await fetch(
    `https://api.assemblyai.com/v2/transcript/${transcriptId}`,
    { headers }
  );
  if (!res.ok) throw new Error(`Error: ${res.status}`);
  transcriptResult = await res.json();
  if (transcriptResult.status === "completed") {
    break;
  } else if (transcriptResult.status === "error") {
    throw new Error(`Transcription failed: ${transcriptResult.error}`);
  }
  await new Promise(resolve => setTimeout(resolve, 3000));
}
// Extract utterances with timestamps
const utterancesData = transcriptResult.utterances.map(u => ({
  text: u.text,
  start: u.start,
  end: u.end,
  speaker: u.speaker
}));
// Create prompt with timestamped utterances
const prompt = `You are analyzing a transcript with timestamped utterances. Each utterance includes the text content, speaker label, and start/end timestamps in milliseconds.
Here is the transcript data:
${JSON.stringify(utterancesData, null, 2)}
Task: Identify the 3-5 most engaging, impactful, or quotable utterances from this transcript.
Return your response as a JSON object with the following structure:
{
  "quotes": [
    {
      "text": "exact quote text",
      "start": start_timestamp_in_milliseconds,
      "end": end_timestamp_in_milliseconds,
      "speaker": "speaker_label",
      "reason": "brief explanation of why this quote is engaging"
    }
  ]
}
Return ONLY valid JSON, no additional text.`;
// Use LLM Gateway to extract quotes
console.log("Submitting transcript to LLM Gateway for quote extraction...");
res = await fetch(
  "https://llm-gateway.assemblyai.com/v1/chat/completions",
  {
    method: "POST",
    headers,
    body: JSON.stringify({
      model: "claude-sonnet-4-5-20250929",
      messages: [
        { role: "user", content: prompt }
      ]
    })
  }
);
if (!res.ok) throw new Error(`Error: ${res.status}`);
const result = await res.json();
const quotesJson = JSON.parse(result.choices[0].message.content.replace(/^```json\n?/, '').replace(/```$/, '').trim());
console.log(JSON.stringify(quotesJson, null, 2));
Getting Started
Before we begin, make sure you have an AssemblyAI account and an API key. You can sign up for an AssemblyAI account and get your API key from your dashboard.
Step-by-Step Instructions
Step 1: Install dependencies
Install the required library:
- Python
pip install requests
Step 2: Set up your API key and headers
- Python
- JavaScript
import requests
import json
import time
API_KEY = "YOUR_API_KEY"
headers = {"authorization": API_KEY}
const API_KEY = "YOUR_API_KEY";
const headers = {
  "authorization": API_KEY,
  "Content-Type": "application/json"
};
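Hardcoding the key is fine for a quick test. As a minimal sketch, the headers could instead be built from an environment variable so the key stays out of source control. The variable name `ASSEMBLYAI_API_KEY` and the helper `build_headers` are assumptions made for this example, not names defined by AssemblyAI.

```python
import os


def build_headers(env_var: str = "ASSEMBLYAI_API_KEY") -> dict:
    """Build the request headers from an environment variable.

    The variable name is a hypothetical choice for this sketch.
    """
    api_key = os.environ.get(env_var)
    if not api_key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    # Same header shape as the snippet above
    return {"authorization": api_key}


# Usage (after exporting the variable in your shell):
# headers = build_headers()
```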
Step 3: Transcribe the audio file
Next, we’ll use AssemblyAI to transcribe a file and save our transcript for later use. We’ll enable speaker_labels to get utterances grouped by speaker.
- Python
- JavaScript
# Transcribe the audio file
print("Submitting audio for transcription...")
transcript_response = requests.post(
    "https://api.assemblyai.com/v2/transcript",
    headers=headers,
    json={
        "audio_url": "https://assembly.ai/wildfires.mp3",
        "speaker_labels": True,
        "speech_models": ["universal-3-pro", "universal-2"],
        "language_detection": True
    }
)
transcript_id = transcript_response.json()["id"]
# Poll for transcription completion
while True:
    transcript_result = requests.get(
        f"https://api.assemblyai.com/v2/transcript/{transcript_id}",
        headers=headers
    ).json()
    if transcript_result["status"] == "completed":
        break
    elif transcript_result["status"] == "error":
        raise Exception(f"Transcription failed: {transcript_result['error']}")
    time.sleep(3)
// Transcribe the audio file
console.log("Submitting audio for transcription...");
let res = await fetch(
  "https://api.assemblyai.com/v2/transcript",
  {
    method: "POST",
    headers,
    body: JSON.stringify({
      audio_url: "https://assembly.ai/wildfires.mp3",
      speaker_labels: true,
      speech_models: ["universal-3-pro", "universal-2"],
      language_detection: true
    })
  }
);
if (!res.ok) throw new Error(`Error: ${res.status}`);
const transcriptResponse = await res.json();
const transcriptId = transcriptResponse.id;
// Poll for transcription completion
let transcriptResult;
while (true) {
  res = await fetch(
    `https://api.assemblyai.com/v2/transcript/${transcriptId}`,
    { headers }
  );
  if (!res.ok) throw new Error(`Error: ${res.status}`);
  transcriptResult = await res.json();
  if (transcriptResult.status === "completed") {
    break;
  } else if (transcriptResult.status === "error") {
    throw new Error(`Transcription failed: ${transcriptResult.error}`);
  }
  await new Promise(resolve => setTimeout(resolve, 3000));
}
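The polling loops above run until the transcript reaches `completed` or `error`, with no upper bound on the wait. One possible variation, sketched here under assumptions, adds a timeout. The helper `wait_for_transcript`, its parameters, and the ten-minute default are hypothetical choices for illustration. The status fetch is passed in as a callable so the loop can be exercised without network access.

```python
import time
from typing import Callable


def wait_for_transcript(
    fetch: Callable[[], dict],
    interval: float = 3.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch() until the transcript completes, fails, or times out.

    fetch is any zero-argument callable returning the transcript JSON as a dict.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = fetch()
        status = result.get("status")
        if status == "completed":
            return result
        if status == "error":
            raise RuntimeError(f"Transcription failed: {result.get('error')}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Transcript not ready after {timeout} seconds")
        sleep(interval)


# Usage with the variables defined in this step:
# transcript_result = wait_for_transcript(
#     lambda: requests.get(
#         f"https://api.assemblyai.com/v2/transcript/{transcript_id}",
#         headers=headers,
#     ).json()
# )
```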
Step 4: Extract utterances with timestamps
Then we’ll take the timestamped utterances array from our transcript and format it as structured data. Each utterance is a continuous segment of speech from a single speaker, with start and end timestamps in milliseconds.
- Python
- JavaScript
utterances_data = [
    {"text": u["text"], "start": u["start"], "end": u["end"], "speaker": u["speaker"]}
    for u in transcript_result["utterances"]
]
const utterancesData = transcriptResult.utterances.map(u => ({
  text: u.text,
  start: u.start,
  end: u.end,
  speaker: u.speaker
}));
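Because the timestamps are in milliseconds, it can help to print the extracted utterances in a readable form before sending them to the model. The following is a small sketch, not part of the AssemblyAI API. The helpers `format_ms` and `describe_utterance` are hypothetical names introduced for this example.

```python
def format_ms(ms: int) -> str:
    """Convert a millisecond offset to an MM:SS string (fractions are dropped)."""
    minutes, seconds = divmod(ms // 1000, 60)
    return f"{minutes:02d}:{seconds:02d}"


def describe_utterance(u: dict) -> str:
    """Render one utterance dict (text, start, end, speaker) as a single line."""
    return f"[{format_ms(u['start'])}-{format_ms(u['end'])}] Speaker {u['speaker']}: {u['text']}"


# Usage with the list built in this step:
# for u in utterances_data:
#     print(describe_utterance(u))
```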
Step 5: Use LLM Gateway to extract engaging quotes
Finally, we’ll provide the timestamped utterances to the LLM Gateway chat completions endpoint to extract the most engaging quotes from this transcript with their associated timestamps in a structured JSON format.
- Python
- JavaScript
# Create prompt with timestamped utterances
prompt = f"""You are analyzing a transcript with timestamped utterances. Each utterance includes the text content, speaker label, and start/end timestamps in milliseconds.
Here is the transcript data:
{json.dumps(utterances_data, indent=2)}
Task: Identify the 3-5 most engaging, impactful, or quotable utterances from this transcript.
Return your response as a JSON object with the following structure:
{{
  "quotes": [
    {{
      "text": "exact quote text",
      "start": start_timestamp_in_milliseconds,
      "end": end_timestamp_in_milliseconds,
      "speaker": "speaker_label",
      "reason": "brief explanation of why this quote is engaging"
    }}
  ]
}}
Return ONLY valid JSON, no additional text."""
# Use LLM Gateway to extract quotes
print("Submitting transcript to LLM Gateway for quote extraction...")
gateway_response = requests.post(
    "https://llm-gateway.assemblyai.com/v1/chat/completions",
    headers=headers,
    json={
        "model": "claude-sonnet-4-5-20250929",
        "messages": [
            {"role": "user", "content": prompt}
        ]
    }
)
result = gateway_response.json()
content = result["choices"][0]["message"]["content"].strip().strip("```").removeprefix("json").strip()
quotes_json = json.loads(content)
print(json.dumps(quotes_json, indent=2))
// Create prompt with timestamped utterances
const prompt = `You are analyzing a transcript with timestamped utterances. Each utterance includes the text content, speaker label, and start/end timestamps in milliseconds.
Here is the transcript data:
${JSON.stringify(utterancesData, null, 2)}
Task: Identify the 3-5 most engaging, impactful, or quotable utterances from this transcript.
Return your response as a JSON object with the following structure:
{
  "quotes": [
    {
      "text": "exact quote text",
      "start": start_timestamp_in_milliseconds,
      "end": end_timestamp_in_milliseconds,
      "speaker": "speaker_label",
      "reason": "brief explanation of why this quote is engaging"
    }
  ]
}
Return ONLY valid JSON, no additional text.`;
// Use LLM Gateway to extract quotes
console.log("Submitting transcript to LLM Gateway for quote extraction...");
// `res` was already declared with `let` in Step 3, so reassign it here
res = await fetch(
  "https://llm-gateway.assemblyai.com/v1/chat/completions",
  {
    method: "POST",
    headers,
    body: JSON.stringify({
      model: "claude-sonnet-4-5-20250929",
      messages: [
        { role: "user", content: prompt }
      ]
    })
  }
);
if (!res.ok) throw new Error(`Error: ${res.status}`);
const result = await res.json();
const quotesJson = JSON.parse(result.choices[0].message.content.replace(/^```json\n?/, '').replace(/```$/, '').trim());
console.log(JSON.stringify(quotesJson, null, 2));
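The snippets above strip a Markdown code fence from the model output before parsing, since a model may wrap its JSON in one despite the prompt. A slightly more defensive variant might look like the sketch below. It handles fenced and unfenced replies and checks that a `quotes` list is present. The helper name `parse_llm_json` is an assumption made for this example.

```python
import json
import re


def parse_llm_json(content: str) -> dict:
    """Parse the model reply as JSON, tolerating an optional Markdown code fence."""
    text = content.strip()
    # Unwrap a leading/trailing fence such as ```json ... ``` if one is present
    match = re.match(r"^```(?:json)?\s*(.*?)\s*```$", text, flags=re.DOTALL)
    if match:
        text = match.group(1)
    data = json.loads(text)
    if not isinstance(data, dict) or not isinstance(data.get("quotes"), list):
        raise ValueError("Expected a JSON object with a 'quotes' list")
    return data


# Usage with the gateway response from this step:
# quotes_json = parse_llm_json(result["choices"][0]["message"]["content"])
```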
Example Response
{
"quotes": [
{
"text": "It is, it is. The levels outside right now in Baltimore are considered unhealthy. And most of that is due to what's called particulate matter, which are tiny particles, microscopic, smaller than the width of your hair, that can get into your lungs and impact your respiratory system, your cardiovascular system, and even your neurological, your brain.",
"start": 62350,
"end": 82590,
"speaker": "B",
"reason": "Defines particulate matter and explains how it harms health."
},
{
"text": "Yeah. So the concentration of particulate matter, I was looking at some of the monitors that we have was reaching levels of what are, in science speak, 150 micrograms per meter cubed, which is more than 10 times what the annual average should be in about four times higher than what you're supposed to have on a 24 hour average. And so the concentrations of these particles in the air are just much, much, much higher than we typically see. And exposure to those high levels can lead to a host of health problems.",
"start": 93550,
"end": 123350,
"speaker": "B",
"reason": "Gives specific concentration figures and links to health risks."
},
{
"text": "It's the youngest. So children, obviously, whose bodies are still developing, the elderly who are, you know, their bodies are more in decline and they're more susceptible to the health impacts of breathing, the poor air quality. And then people who have pre existing health conditions, people with respiratory conditions or heart conditions, can be triggered by high levels of air pollution.",
"start": 137610,
"end": 156650,
"speaker": "B",
"reason": "Highlights the most vulnerable groups affected by poor air quality."
},
{
"text": "Well, I think the fires are going to burn for a little bit longer. But the key for us in the US Is the weather system changing. Right now it's the weather systems that are pulling that air into our Mid Atlantic and Northeast region. As those weather systems change and shift, we'll see that smoke going elsewhere and not impact us in this region as much. I think that's going to be the defining factor. I think the next couple days we're going to see a shift in that weather pattern and start to push the smoke away from where we are.",
"start": 198280,
"end": 227480,
"speaker": "B",
"reason": "Offers an outlook on how weather patterns may reduce exposure."
},
{
"text": "I mean, that is one of the predictions for climate change. Looking into the future, the fire season is starting earlier and lasting longer and we're seeing more frequent fires. So yeah, this is probably something that we'll be seeing more, more frequently. This tends to be much more of an issue in the western U.S. so the eastern U.S. getting hit right now is a little bit new. But yeah, I think with climate change moving forward, this is something that is going to happen more frequently.",
"start": 241370,
"end": 267570,
"speaker": "B",
"reason": "Connects current event to longer-term climate change trends and future frequency."
}
]
}
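Since the quotes are generated by a model, it may be worth confirming that each one matches the transcript before using its timestamps. As a rough sketch under that assumption, the hypothetical helper `verify_quotes` below looks up each quote by its start timestamp and accepts it only if the end time and speaker match and the quote text appears in the source utterance.

```python
def verify_quotes(quotes: list[dict], utterances: list[dict]) -> tuple[list[dict], list[dict]]:
    """Split quotes into (verified, rejected) by comparing them to the utterances."""
    by_start = {u["start"]: u for u in utterances}
    verified, rejected = [], []
    for quote in quotes:
        source = by_start.get(quote.get("start"))
        text = quote.get("text", "")
        ok = (
            source is not None
            and text != ""
            and quote.get("end") == source["end"]
            and quote.get("speaker") == source["speaker"]
            and text in source["text"]
        )
        (verified if ok else rejected).append(quote)
    return verified, rejected


# Usage with the variables from the steps above:
# verified, rejected = verify_quotes(quotes_json["quotes"], utterances_data)
```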