Apply LLM Gateway to Audio Transcripts

Overview

A Large Language Model (LLM) is a machine learning model that uses natural language processing (NLP) to generate text. LLM Gateway is a unified API that provides access to 25+ models from Claude, GPT, Gemini, and more through a single interface. You can use LLM Gateway to analyze audio transcripts, for example to ask questions about a call, or to summarize a meeting.

By the end of this tutorial, you’ll be able to use LLM Gateway to summarize an audio file. Here’s the full sample code for what you’ll build in this tutorial:

Python
JavaScript

import requests
import time

# Step 1: Transcribe the audio
base_url = "https://api.assemblyai.com"

headers = {
  "authorization": "<YOUR_API_KEY>"
}

# You can use a local filepath:
# with open("./my-audio.mp3", "rb") as f:
#   response = requests.post(base_url + "/v2/upload",
#                            headers=headers,
#                            data=f)
#   upload_url = response.json()["upload_url"]

# Or use a publicly-accessible URL:
upload_url = "https://assembly.ai/sports_injuries.mp3"

data = {
  "audio_url": upload_url
}

response = requests.post(base_url + "/v2/transcript", headers=headers, json=data)

transcript_id = response.json()["id"]
polling_endpoint = base_url + f"/v2/transcript/{transcript_id}"

while True:
  transcript = requests.get(polling_endpoint, headers=headers).json()

  if transcript["status"] == "completed":
    break

  elif transcript["status"] == "error":
    raise RuntimeError(f"Transcription failed: {transcript['error']}")

  else:
    time.sleep(3)

# Step 2: Send transcript to LLM Gateway
prompt = "Provide a brief summary of the transcript."

llm_gateway_data = {
  "model": "claude-sonnet-4-6",
  "messages": [
    {"role": "user", "content": f"{prompt}\n\n{{{{ transcript }}}}"}
  ],
  "transcript_id": transcript_id,
  "max_tokens": 1000
}

response = requests.post(
  "https://llm-gateway.assemblyai.com/v1/chat/completions",
  headers=headers,
  json=llm_gateway_data
)
print(response.json()["choices"][0]["message"]["content"])

import fs from "fs-extra";

// Step 1: Transcribe the audio
const base_url = "https://api.assemblyai.com";

const headers = {
  authorization: "<YOUR_API_KEY>",
};

const path = "./my-audio.mp3";
const audioData = await fs.readFile(path);
let res = await fetch(`${base_url}/v2/upload`, {
  method: "POST",
  headers,
  body: audioData,
});
if (!res.ok) throw new Error(`Error: ${res.status}`);
const uploadResponse = await res.json();

const uploadUrl = uploadResponse.upload_url;

const data = {
  audio_url: uploadUrl, // You can also use a URL of an audio or video file on the web
};

res = await fetch(base_url + "/v2/transcript", {
  method: "POST",
  headers: { ...headers, "Content-Type": "application/json" },
  body: JSON.stringify(data),
});
if (!res.ok) throw new Error(`Error: ${res.status}`);
const response = await res.json();

const transcript_id = response.id;
const polling_endpoint = base_url + `/v2/transcript/${transcript_id}`;

let transcript;
while (true) {
  res = await fetch(polling_endpoint, { headers });
  if (!res.ok) throw new Error(`Error: ${res.status}`);
  transcript = await res.json();

  if (transcript.status === "completed") {
    break;
  } else if (transcript.status === "error") {
    throw new Error(`Transcription failed: ${transcript.error}`);
  } else {
    await new Promise((resolve) => setTimeout(resolve, 3000));
  }
}

// Step 2: Send transcript to LLM Gateway
const prompt = "Provide a brief summary of the transcript.";

const llm_gateway_data = {
  model: "claude-sonnet-4-6",
  messages: [
    { role: "user", content: `${prompt}\n\n{{ transcript }}` },
  ],
  transcript_id: transcript_id,
  max_tokens: 1000,
};

res = await fetch("https://llm-gateway.assemblyai.com/v1/chat/completions", {
  method: "POST",
  headers: { ...headers, "Content-Type": "application/json" },
  body: JSON.stringify(llm_gateway_data),
});
if (!res.ok) throw new Error(`Error: ${res.status}`);
const result = await res.json();
console.log(result.choices[0].message.content);

If you run the code above, you’ll see the following output:

The transcript describes several common sports injuries - runner's knee,
sprained ankle, meniscus tear, rotator cuff tear, and ACL tear. It provides
definitions, causes, and symptoms for each injury. The transcript seems to be
narrating sports footage and describing injuries as they occur to the athletes.
Overall, it provides an overview of these common sports injuries that can result
from overuse or sudden trauma during athletic activities

Before you begin

To complete this tutorial, you need:

Python or Node installed.
An AssemblyAI account with a credit card set up.
Basic understanding of how to Transcribe an audio file.

Step 1: Install prerequisites

Python
JavaScript

Install the package via pip:

pip install requests

Step 2: Transcribe an audio file

In this step, you’ll transcribe an audio file that you can later use with LLM Gateway. For more information about transcribing audio, see Transcribe an audio file.

When you pass transcript_id to LLM Gateway and include the {{ transcript }} tag in a message, the API substitutes the tag with the transcript’s text field before running the completion. It does not include other fields such as utterances or speaker labels. If you need speaker-separated context, format the utterances yourself and include them in your prompt.

Python
JavaScript

import requests
import time

base_url = "https://api.assemblyai.com"

headers = {"authorization": "<YOUR_API_KEY>"}

# You can use a local filepath:
# with open("./my-audio.mp3", "rb") as f:
#     response = requests.post(base_url + "/v2/upload", headers=headers, data=f)
#     upload_url = response.json()["upload_url"]

# Or use a publicly-accessible URL:
upload_url = "https://assembly.ai/sports_injuries.mp3"

data = {"audio_url": upload_url}

response = requests.post(base_url + "/v2/transcript", headers=headers, json=data)

transcript_id = response.json()["id"]
polling_endpoint = base_url + f"/v2/transcript/{transcript_id}"

while True:
    transcript = requests.get(polling_endpoint, headers=headers).json()

    if transcript["status"] == "completed":
        break

    elif transcript["status"] == "error":
        raise RuntimeError(f"Transcription failed: {transcript['error']}")

    else:
        time.sleep(3)

import fs from "fs-extra";

const base_url = "https://api.assemblyai.com";

const headers = {
  authorization: "<YOUR_API_KEY>",
};

const path = "./my-audio.mp3";
const audioData = await fs.readFile(path);
let res = await fetch(`${base_url}/v2/upload`, {
  method: "POST",
  headers,
  body: audioData,
});
if (!res.ok) throw new Error(`Error: ${res.status}`);
const uploadResponse = await res.json();
const uploadUrl = uploadResponse.upload_url;

const data = {
  audio_url: uploadUrl, // You can also use a URL of an audio or video file on the web
};

res = await fetch(base_url + "/v2/transcript", {
  method: "POST",
  headers: { ...headers, "Content-Type": "application/json" },
  body: JSON.stringify(data),
});
if (!res.ok) throw new Error(`Error: ${res.status}`);
const response = await res.json();

const transcript_id = response.id;
const polling_endpoint = base_url + `/v2/transcript/${transcript_id}`;

// Declare transcript outside the loop so it stays available after polling
let transcript;
while (true) {
  res = await fetch(polling_endpoint, { headers });
  if (!res.ok) throw new Error(`Error: ${res.status}`);
  transcript = await res.json();

  if (transcript.status === "completed") {
    break;
  } else if (transcript.status === "error") {
    throw new Error(`Transcription failed: ${transcript.error}`);
  } else {
    await new Promise((resolve) => setTimeout(resolve, 3000));
  }
}
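The polling loops above run until the status is completed or error, with no upper bound on how long they wait. The following is a minimal sketch of the same loop with a timeout added. The function name wait_for_transcript and its parameters are hypothetical helpers, not part of the AssemblyAI API; only the status values and the error field come from the code above.

```python
import time


def wait_for_transcript(fetch_transcript, interval=3.0, timeout=600.0, sleep=time.sleep):
    """Poll until the transcript completes, fails, or the timeout elapses.

    fetch_transcript is any zero-argument callable that returns the
    transcript JSON as a dict (hypothetical helper, not an API call).
    """
    deadline = time.monotonic() + timeout
    while True:
        transcript = fetch_transcript()
        status = transcript["status"]
        if status == "completed":
            return transcript
        if status == "error":
            raise RuntimeError(f"Transcription failed: {transcript['error']}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Transcript still '{status}' after {timeout} seconds")
        sleep(interval)


# Usage with the variables from this step (requires network access):
# transcript = wait_for_transcript(
#     lambda: requests.get(polling_endpoint, headers=headers).json()
# )
```

Passing the fetch step in as a callable keeps the loop independent of the HTTP client, which also makes it easy to exercise with canned responses.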

Use existing transcript

If you’ve already transcribed an audio file you want to use, you can get an existing transcript using its ID. You can find the ID for previously transcribed audio files in the Processing queue.

Python
JavaScript

transcript = requests.get("https://api.assemblyai.com/v2/transcript/YOUR_TRANSCRIPT_ID", headers=headers).json()

const res = await fetch("https://api.assemblyai.com/v2/transcript/YOUR_TRANSCRIPT_ID", { headers });
if (!res.ok) throw new Error(`Error: ${res.status}`);
const transcript = await res.json();
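As noted at the start of this step, the {{ transcript }} tag only carries the transcript’s text field, so speaker-separated context has to be formatted by hand. The sketch below shows one way to do that. It assumes the transcript was requested with speaker labels enabled and that each entry in its utterances array has speaker and text keys; the function name format_utterances and the sample data are hypothetical.

```python
def format_utterances(utterances):
    """Render each utterance as a 'Speaker X: text' line."""
    lines = []
    for utterance in utterances:
        lines.append(f"Speaker {utterance['speaker']}: {utterance['text']}")
    return "\n".join(lines)


# Hypothetical sample shaped like an assumed `utterances` array.
sample_utterances = [
    {"speaker": "A", "text": "How did the injury happen?"},
    {"speaker": "B", "text": "I twisted my knee while running."},
]

speaker_prompt = "Provide a brief summary of the conversation."

# The formatted utterances go into the message content directly,
# in place of the {{ transcript }} tag.
speaker_content = f"{speaker_prompt}\n\n{format_utterances(sample_utterances)}"
print(speaker_content)
```

With a real transcript, you would pass transcript["utterances"] instead of the sample list, provided that field is populated.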

Step 3: Send transcript to LLM Gateway

In this step, you’ll send the transcript ID to LLM Gateway along with a prompt, a text string that tells the LLM how to generate its output. Write a prompt that references the transcript with a {{ transcript }} tag, pass the transcript ID as the top-level transcript_id field, and send the request to the chat completions API. The API substitutes the tag with the transcript’s text before running the completion.

Only the first occurrence of {{ transcript }} in the first message that contains it is substituted — additional tags or tags in later messages are left as-is. The tag must be exactly {{ transcript }} (with the spaces); variants like {{transcript}} or {{ TRANSCRIPT }} are not substituted. The endpoint returns 404 if the transcript ID does not exist or belongs to a different account.
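To make the substitution rule above concrete, here is a local sketch that mimics it: only the first occurrence of the exact tag, in the first message that contains it, is replaced. This is an illustration of the documented behavior, not the gateway’s implementation. The function name substitute_transcript is hypothetical, and the sketch assumes every message’s content is a plain string. It also shows why the Python samples write the tag with four braces inside an f-string.

```python
import copy

TAG = "{{ transcript }}"  # exact spelling, including the spaces


def substitute_transcript(messages, transcript_text):
    """Return a copy of messages with the first exact tag replaced once."""
    result = copy.deepcopy(messages)
    for message in result:
        if TAG in message["content"]:
            message["content"] = message["content"].replace(TAG, transcript_text, 1)
            break  # later messages are left as-is
    return result


prompt = "Provide a brief summary of the transcript."

# Inside an f-string, "{{{{" and "}}}}" each collapse to two literal braces,
# so this produces the exact tag "{{ transcript }}".
content = f"{prompt}\n\n{{{{ transcript }}}}"

messages = [
    {"role": "user", "content": content},
    {"role": "user", "content": "Again: {{ transcript }} and {{transcript}}"},
]

preview = substitute_transcript(messages, "Hello world.")
print(preview[0]["content"])  # tag replaced
print(preview[1]["content"])  # tag and variant left untouched
```

A preview like this can help catch a misspelled tag before a request is sent, since a variant such as {{transcript}} would reach the model as literal text.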

Write a prompt with instructions on how the LLM should generate the text output.

Python
JavaScript

prompt = "Provide a brief summary of the transcript."

const prompt = "Provide a brief summary of the transcript.";

Send the transcript ID and prompt to LLM Gateway. The model parameter defines which LLM to use. For available models, see LLM Gateway Overview.

Python
JavaScript

llm_gateway_data = {
  "model": "claude-sonnet-4-6",
  "messages": [
    {"role": "user", "content": f"{prompt}\n\n{{{{ transcript }}}}"}
  ],
  "transcript_id": transcript_id,
  "max_tokens": 1000
}

result = requests.post(
  "https://llm-gateway.assemblyai.com/v1/chat/completions",
  headers=headers,
  json=llm_gateway_data
)

const llm_gateway_data = {
  model: "claude-sonnet-4-6",
  messages: [
    { role: "user", content: `${prompt}\n\n{{ transcript }}` },
  ],
  transcript_id: transcript_id,
  max_tokens: 1000,
};

// res was already declared with let in Step 2, so reassign it here
res = await fetch("https://llm-gateway.assemblyai.com/v1/chat/completions", {
  method: "POST",
  headers: { ...headers, "Content-Type": "application/json" },
  body: JSON.stringify(llm_gateway_data),
});
if (!res.ok) throw new Error(`Error: ${res.status}`);
const result = await res.json();

Print the result.

Python
JavaScript

print(result.json()["choices"][0]["message"]["content"])

console.log(result.choices[0].message.content);

The output will look something like this:

 The transcript describes several common sports injuries - runner's knee,
 sprained ankle, meniscus tear, rotator cuff tear, and ACL tear. It provides
 definitions, causes, and symptoms for each injury. The transcript seems to be
 narrating sports footage and describing injuries as they occur to the athletes.
 Overall, it provides an overview of these common sports injuries that can
 result from overuse or sudden trauma during athletic activities
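The Python snippet in this step reads the response body without checking the HTTP status, so a 404 for an unknown transcript ID would surface as a confusing KeyError. The following is a minimal sketch of a status check before reading the completion. The function name extract_content is hypothetical, and the error body format is not assumed; only the 404 behavior and the choices[0].message.content path come from this tutorial.

```python
def extract_content(status_code, body):
    """Return the completion text, or raise a descriptive error.

    body is the parsed JSON response as a dict, or None when the
    request failed and the body was not parsed.
    """
    if status_code == 404:
        raise LookupError(
            "Transcript not found: the ID does not exist or belongs to a different account"
        )
    if status_code >= 400:
        raise RuntimeError(f"LLM Gateway request failed with HTTP {status_code}")
    return body["choices"][0]["message"]["content"]


# Usage with the `result` object from this step (requires network access):
# body = result.json() if result.ok else None
# print(extract_content(result.status_code, body))
```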

Want to make your LLM requests more resilient? Use fallback models to automatically switch to a backup model if your primary model is unavailable.

Next steps

In this tutorial, you’ve learned how to generate LLM output based on your audio transcripts using LLM Gateway. The type of output depends on your prompt, so try exploring different prompts to see how they affect the output. Here are a few more prompts to try:

“Provide an analysis of the transcript and offer areas to improve with exact quotes.”
“What’s the main take-away from the transcript?”
“Generate a set of action items from this transcript.”
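One way to try the prompts above is to build one request body per prompt against the same transcript. The sketch below only constructs the payloads, using the same fields as Step 3. The function name build_payload is hypothetical, and YOUR_TRANSCRIPT_ID is a placeholder for a real ID.

```python
PROMPTS = [
    "Provide an analysis of the transcript and offer areas to improve with exact quotes.",
    "What's the main take-away from the transcript?",
    "Generate a set of action items from this transcript.",
]


def build_payload(prompt, transcript_id, model="claude-sonnet-4-6", max_tokens=1000):
    """Build a chat completions request body that references the transcript."""
    return {
        "model": model,
        "messages": [
            # Quadruple braces in the f-string yield the literal {{ transcript }} tag.
            {"role": "user", "content": f"{prompt}\n\n{{{{ transcript }}}}"}
        ],
        "transcript_id": transcript_id,
        "max_tokens": max_tokens,
    }


payloads = [build_payload(prompt, "YOUR_TRANSCRIPT_ID") for prompt in PROMPTS]

for payload in payloads:
    print(payload["messages"][0]["content"])
```

Each payload can then be sent with the same requests.post call shown in Step 3.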

To learn more about LLM Gateway and working with different models, see the following resources:

Need some help?

If you get stuck, or have any other questions, we’d love to help you out. Contact our support team at support@assemblyai.com or create a support ticket.
