A Large Language Model (LLM) is a machine learning model that uses natural language processing (NLP) to generate text. LLM Gateway is a unified API that provides access to 25+ models from Claude, GPT, Gemini, and more through a single interface. You can use LLM Gateway to analyze audio transcripts, for example to ask questions about a call, or to summarize a meeting.

By the end of this tutorial, you’ll be able to use LLM Gateway to summarize an audio file.

Here’s the full sample code for what you’ll build in this tutorial:
Python

import requests
import time

# Step 1: Transcribe the audio
base_url = "https://api.assemblyai.com"
headers = {
    "authorization": "<YOUR_API_KEY>"
}

# You can use a local filepath:
# with open("./my-audio.mp3", "rb") as f:
#     response = requests.post(base_url + "/v2/upload",
#                              headers=headers,
#                              data=f)
#     upload_url = response.json()["upload_url"]

# Or use a publicly-accessible URL:
upload_url = "https://assembly.ai/sports_injuries.mp3"

data = {
    "audio_url": upload_url
}

response = requests.post(base_url + "/v2/transcript", headers=headers, json=data)
transcript_id = response.json()["id"]
polling_endpoint = base_url + f"/v2/transcript/{transcript_id}"

while True:
    transcript = requests.get(polling_endpoint, headers=headers).json()
    if transcript["status"] == "completed":
        break
    elif transcript["status"] == "error":
        raise RuntimeError(f"Transcription failed: {transcript['error']}")
    else:
        time.sleep(3)

# Step 2: Send transcript to LLM Gateway
prompt = "Provide a brief summary of the transcript."

llm_gateway_data = {
    "model": "claude-sonnet-4-6",
    "messages": [
        {"role": "user", "content": f"{prompt}\n\n{{{{ transcript }}}}"}
    ],
    "transcript_id": transcript_id,
    "max_tokens": 1000
}

response = requests.post(
    "https://llm-gateway.assemblyai.com/v1/chat/completions",
    headers=headers,
    json=llm_gateway_data
)
print(response.json()["choices"][0]["message"]["content"])

JavaScript

import fs from "fs-extra";

// Step 1: Transcribe the audio
const base_url = "https://api.assemblyai.com";
const headers = {
  authorization: "<YOUR_API_KEY>",
};

const path = "./my-audio.mp3";
const audioData = await fs.readFile(path);

let res = await fetch(`${base_url}/v2/upload`, {
  method: "POST",
  headers,
  body: audioData,
});
if (!res.ok) throw new Error(`Error: ${res.status}`);
const uploadResponse = await res.json();
const uploadUrl = uploadResponse.upload_url;

const data = {
  audio_url: uploadUrl, // You can also use a URL of an audio or video file on the web
};

res = await fetch(base_url + "/v2/transcript", {
  method: "POST",
  headers: { ...headers, "Content-Type": "application/json" },
  body: JSON.stringify(data),
});
if (!res.ok) throw new Error(`Error: ${res.status}`);
const response = await res.json();
const transcript_id = response.id;
const polling_endpoint = base_url + `/v2/transcript/${transcript_id}`;

let transcript;
while (true) {
  res = await fetch(polling_endpoint, { headers });
  if (!res.ok) throw new Error(`Error: ${res.status}`);
  transcript = await res.json();
  if (transcript.status === "completed") {
    break;
  } else if (transcript.status === "error") {
    throw new Error(`Transcription failed: ${transcript.error}`);
  } else {
    await new Promise((resolve) => setTimeout(resolve, 3000));
  }
}

// Step 2: Send transcript to LLM Gateway
const prompt = "Provide a brief summary of the transcript.";

const llm_gateway_data = {
  model: "claude-sonnet-4-6",
  messages: [
    { role: "user", content: `${prompt}\n\n{{ transcript }}` },
  ],
  transcript_id: transcript_id,
  max_tokens: 1000,
};

res = await fetch("https://llm-gateway.assemblyai.com/v1/chat/completions", {
  method: "POST",
  headers: { ...headers, "Content-Type": "application/json" },
  body: JSON.stringify(llm_gateway_data),
});
if (!res.ok) throw new Error(`Error: ${res.status}`);
const result = await res.json();
console.log(result.choices[0].message.content);
If you run the code above, you’ll see output similar to the following:
The transcript describes several common sports injuries - runner's knee, sprained ankle, meniscus tear, rotator cuff tear, and ACL tear. It provides definitions, causes, and symptoms for each injury. The transcript seems to be narrating sports footage and describing injuries as they occur to the athletes. Overall, it provides an overview of these common sports injuries that can result from overuse or sudden trauma during athletic activities
When you pass transcript_id to LLM Gateway and include the {{ transcript }} tag in a message, the API substitutes the tag with the transcript’s text field before running the completion. It does not include other fields such as utterances or speaker labels. If you need speaker-separated context, format the utterances yourself and include them in your prompt.

In this step, you’ll transcribe an audio file that you can later use with LLM Gateway.

For more information about transcribing audio, see Transcribe an audio file.
Python

import requests
import time

base_url = "https://api.assemblyai.com"
headers = {"authorization": "<YOUR_API_KEY>"}

# You can use a local filepath:
# with open("./my-audio.mp3", "rb") as f:
#     response = requests.post(base_url + "/v2/upload", headers=headers, data=f)
#     upload_url = response.json()["upload_url"]

# Or use a publicly-accessible URL:
upload_url = "https://assembly.ai/sports_injuries.mp3"

data = {"audio_url": upload_url}

response = requests.post(base_url + "/v2/transcript", headers=headers, json=data)
transcript_id = response.json()["id"]
polling_endpoint = base_url + f"/v2/transcript/{transcript_id}"

while True:
    transcript = requests.get(polling_endpoint, headers=headers).json()
    if transcript["status"] == "completed":
        break
    elif transcript["status"] == "error":
        raise RuntimeError(f"Transcription failed: {transcript['error']}")
    else:
        time.sleep(3)

JavaScript

import fs from "fs-extra";

const base_url = "https://api.assemblyai.com";
const headers = {
  authorization: "<YOUR_API_KEY>",
};

const path = "./my-audio.mp3";
const audioData = await fs.readFile(path);

let res = await fetch(`${base_url}/v2/upload`, {
  method: "POST",
  headers,
  body: audioData,
});
if (!res.ok) throw new Error(`Error: ${res.status}`);
const uploadResponse = await res.json();
const uploadUrl = uploadResponse.upload_url;

const data = {
  audio_url: uploadUrl, // You can also use a URL of an audio or video file on the web
};

res = await fetch(base_url + "/v2/transcript", {
  method: "POST",
  headers: { ...headers, "Content-Type": "application/json" },
  body: JSON.stringify(data),
});
if (!res.ok) throw new Error(`Error: ${res.status}`);
const response = await res.json();
const transcript_id = response.id;
const polling_endpoint = base_url + `/v2/transcript/${transcript_id}`;

while (true) {
  res = await fetch(polling_endpoint, { headers });
  if (!res.ok) throw new Error(`Error: ${res.status}`);
  const transcript = await res.json();
  if (transcript.status === "completed") {
    break;
  } else if (transcript.status === "error") {
    throw new Error(`Transcription failed: ${transcript.error}`);
  } else {
    await new Promise((resolve) => setTimeout(resolve, 3000));
  }
}
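The polling loop above keeps waiting for as long as the transcript is neither completed nor failed. As a minimal sketch, the same loop can be wrapped in a helper that gives up after a timeout. The name `wait_for_transcript` and its parameters are hypothetical and not part of any AssemblyAI SDK; `fetch_transcript` is assumed to be any callable that returns the transcript JSON as a dictionary, which keeps the helper independent of the HTTP library.

```python
import time


def wait_for_transcript(fetch_transcript, interval_seconds=3.0,
                        timeout_seconds=600.0, sleep=time.sleep,
                        clock=time.monotonic):
    """Poll until the transcript completes, fails, or the timeout expires.

    fetch_transcript: hypothetical callable returning the transcript dict.
    sleep / clock: injectable so the loop can be exercised without waiting.
    """
    deadline = clock() + timeout_seconds
    while True:
        transcript = fetch_transcript()
        status = transcript["status"]
        if status == "completed":
            return transcript
        if status == "error":
            raise RuntimeError(f"Transcription failed: {transcript['error']}")
        # Any other status is treated as "still in progress",
        # matching the behavior of the loop in the tutorial.
        if clock() >= deadline:
            raise TimeoutError(
                f"Transcript not ready after {timeout_seconds} seconds"
            )
        sleep(interval_seconds)
```

With the variables from the Python sample above, a call might look like `wait_for_transcript(lambda: requests.get(polling_endpoint, headers=headers).json())`.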
Use existing transcript

If you’ve already transcribed an audio file you want to use, you can get an existing transcript using its ID. You can find the ID for previously transcribed audio files in the Processing queue.
In this step, you’ll send the transcript ID to LLM Gateway along with a prompt to generate text output.

The prompt is a text string that provides the LLM with instructions on how to generate the text output. You’ll write a prompt that references the transcript with a {{ transcript }} tag and send it to LLM Gateway using the chat completions API. Pass the transcript ID as the top-level transcript_id field; the API substitutes the tag with the transcript’s text before running the completion.
Only the first occurrence of {{ transcript }} in the first message that contains it is substituted — additional tags or tags in later messages are left as-is. The tag must be exactly {{ transcript }} (with the spaces); variants like {{transcript}} or {{ TRANSCRIPT }} are not substituted. The endpoint returns 404 if the transcript ID does not exist or belongs to a different account.
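To make the substitution rule concrete, here is a small local model of it in Python. This is only an illustration of the behavior described above, not the gateway's actual implementation; `substitute_transcript` and `find_unsupported_tags` are hypothetical helper names. The second helper may be useful as a pre-flight check that flags near-miss tags before you send a request.

```python
import copy
import re

TAG = "{{ transcript }}"

# Matches the exact tag as well as near-miss variants such as
# "{{transcript}}" or "{{ TRANSCRIPT }}".
TAG_LIKE = re.compile(r"\{\{\s*transcript\s*\}\}", re.IGNORECASE)


def substitute_transcript(messages, transcript_text):
    """Return a copy of messages with the documented rule applied locally.

    Only the first occurrence of the tag, in the first message that
    contains it, is replaced. Everything else is left as-is.
    """
    result = copy.deepcopy(messages)
    for message in result:
        content = message.get("content", "")
        if isinstance(content, str) and TAG in content:
            message["content"] = content.replace(TAG, transcript_text, 1)
            break  # later messages are not touched
    return result


def find_unsupported_tags(content):
    """List tag-like strings that would not be substituted."""
    return [match for match in TAG_LIKE.findall(content) if match != TAG]
```

For example, a message whose content is `"A {{ transcript }} B {{ transcript }}"` becomes `"A <text> B {{ transcript }}"` under this model, and `find_unsupported_tags("{{transcript}}")` reports the variant without spaces.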
1. Write a prompt with instructions on how the LLM should generate the text output.

Python

prompt = "Provide a brief summary of the transcript."

JavaScript

const prompt = "Provide a brief summary of the transcript.";
2. Send the transcript ID and prompt to LLM Gateway. The model parameter defines which LLM to use. For available models, see LLM Gateway Overview.
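A minimal Python sketch of this step follows, using the same endpoint, model, and payload fields as the full sample at the top of this tutorial. The helper names `build_gateway_request` and `send_to_gateway` are hypothetical. The sketch uses the standard library's `urllib.request` instead of `requests` to avoid a third-party dependency, and it turns the 404 response described earlier into a `LookupError`; the error-handling choices are assumptions for illustration, not prescribed by the API.

```python
import json
import urllib.error
import urllib.request

GATEWAY_URL = "https://llm-gateway.assemblyai.com/v1/chat/completions"


def build_gateway_request(prompt, transcript_id,
                          model="claude-sonnet-4-6", max_tokens=1000):
    """Build the chat completions payload used in this tutorial."""
    return {
        "model": model,
        "messages": [
            # The tag is replaced by the API with the transcript's text.
            {"role": "user", "content": prompt + "\n\n{{ transcript }}"}
        ],
        "transcript_id": transcript_id,
        "max_tokens": max_tokens,
    }


def send_to_gateway(payload, api_key):
    """POST the payload and return the text of the first choice."""
    request = urllib.request.Request(
        GATEWAY_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"authorization": api_key,
                 "Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(request) as response:
            body = json.load(response)
    except urllib.error.HTTPError as err:
        if err.code == 404:
            # Unknown transcript ID, or one owned by another account.
            raise LookupError("Transcript ID not found") from err
        raise
    return body["choices"][0]["message"]["content"]
```

With the `prompt` and `transcript_id` from the earlier steps, calling `print(send_to_gateway(build_gateway_request(prompt, transcript_id), "<YOUR_API_KEY>"))` prints the generated text.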
The transcript describes several common sports injuries - runner's knee, sprained ankle, meniscus tear, rotator cuff tear, and ACL tear. It provides definitions, causes, and symptoms for each injury. The transcript seems to be narrating sports footage and describing injuries as they occur to the athletes. Overall, it provides an overview of these common sports injuries that can result from overuse or sudden trauma during athletic activities
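Because the {{ transcript }} tag only inserts the transcript's text field, a prompt that needs speaker-separated context has to carry the formatted utterances itself. The sketch below shows one way to do that. It assumes the transcript was requested with speaker labels enabled and that each entry in its `utterances` list has `speaker` and `text` fields; check the transcript response you receive before relying on those names. Both helper names are hypothetical.

```python
def format_utterances(utterances):
    """Render utterances as one 'Speaker X: text' line per turn."""
    lines = []
    for utterance in utterances:
        lines.append(f"Speaker {utterance['speaker']}: {utterance['text']}")
    return "\n".join(lines)


def build_speaker_prompt(prompt, utterances):
    """Combine the instructions with the speaker-separated transcript.

    The formatted text takes the place of the {{ transcript }} tag
    in the message content.
    """
    return f"{prompt}\n\n{format_utterances(utterances)}"
```

The returned string can then be used as the content of the user message in place of the prompt that ends with the tag.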
Want to make your LLM requests more resilient? Use fallback models to automatically switch to a backup model if your primary model is unavailable.
In this tutorial, you’ve learned how to generate LLM output based on your audio transcripts using LLM Gateway. The type of output depends on your prompt, so try exploring different prompts to see how they affect the output. Here are a few more prompts to try:
“Provide an analysis of the transcript and offer areas to improve with exact quotes.”
“What’s the main take-away from the transcript?”
“Generate a set of action items from this transcript.”
To learn more about LLM Gateway and working with different models, see the following resources:
If you get stuck, or have any other questions, we’d love to help you out. Contact our support team at support@assemblyai.com or create a support ticket.