AssemblyAI’s LLM Gateway is a unified API providing access to 25+ models from Claude, GPT, Gemini, and more through a single interface.
It’s a powerful way to extract insights from transcripts generated from audio and video files. Because inputs and outputs vary widely across these use cases, LLM Gateway pricing is based on both input and output tokens.
Output tokens will vary depending on the model and the complexity of your request, but how do you determine the number of input tokens you’ll be sending to LLM Gateway?
How many tokens do an audio file’s transcript and your prompt contain? This guide shows how to roughly calculate that number so you can predict LLM Gateway’s input token cost ahead of time.
This guide calculates input token costs only.
Output token costs will vary based on the model used and the length of the generated response. To see the specific cost of each model (per 1M input and output tokens) applicable to your AssemblyAI account, refer to the Rates table on the Billing page of the dashboard.
Quickstart
Python:

import requests
import time

base_url = "https://api.assemblyai.com"
headers = {"authorization": "YOUR_API_KEY"}

# Transcribe audio file
audio_url = "https://assembly.ai/wildfires.mp3"
data = {"audio_url": audio_url, "speech_models": ["universal-3-pro"]}
response = requests.post(base_url + "/v2/transcript", headers=headers, json=data)
transcript_id = response.json()["id"]
polling_endpoint = base_url + f"/v2/transcript/{transcript_id}"

# Poll for completion
print("Waiting for transcription to complete...")
while True:
    transcript = requests.get(polling_endpoint, headers=headers).json()
    if transcript["status"] == "completed":
        break
    elif transcript["status"] == "error":
        raise RuntimeError(f"Transcription failed: {transcript['error']}")
    time.sleep(3)

# Define your prompt
prompt = "Provide a brief summary of the transcript."

# Calculate character count (transcript + prompt)
transcript_chars = len(transcript["text"])
prompt_chars = len(prompt)
total_chars = transcript_chars + prompt_chars
print(f"\nTotal characters: {total_chars}")

# Estimate tokens (roughly 4 characters = 1 token)
estimated_tokens = total_chars / 4
tokens_in_millions = estimated_tokens / 1_000_000

# Calculate input costs for different models (rates per 1M tokens)
gpt5_cost = 1.25 * tokens_in_millions
claude_sonnet_cost = 3.00 * tokens_in_millions
gemini_pro_cost = 1.25 * tokens_in_millions

print(f"Estimated input tokens: {estimated_tokens:,.0f}")
print("\nEstimated input costs:")
print(f"GPT-5: ${gpt5_cost:.4f}")
print(f"Claude 4.5 Sonnet: ${claude_sonnet_cost:.4f}")
print(f"Gemini 2.5 Pro: ${gemini_pro_cost:.4f}")
JavaScript:

const baseUrl = "https://api.assemblyai.com";
const headers = { authorization: "YOUR_API_KEY" };

// Transcribe audio file
const audioUrl = "https://assembly.ai/wildfires.mp3";
const data = { audio_url: audioUrl, speech_models: ["universal-3-pro"] };

let res = await fetch(`${baseUrl}/v2/transcript`, {
  method: "POST",
  headers: { ...headers, "Content-Type": "application/json" },
  body: JSON.stringify(data),
});
if (!res.ok) throw new Error(`Error: ${res.status}`);

const transcriptResponse = await res.json();
const transcriptId = transcriptResponse.id;
const pollingEndpoint = `${baseUrl}/v2/transcript/${transcriptId}`;

// Poll for completion
console.log("Waiting for transcription to complete...");
let transcript;
while (true) {
  res = await fetch(pollingEndpoint, { headers });
  if (!res.ok) throw new Error(`Error: ${res.status}`);
  transcript = await res.json();
  if (transcript.status === "completed") {
    break;
  } else if (transcript.status === "error") {
    throw new Error(`Transcription failed: ${transcript.error}`);
  }
  await new Promise((resolve) => setTimeout(resolve, 3000));
}

// Define your prompt
const prompt = "Provide a brief summary of the transcript.";

// Calculate character count (transcript + prompt)
const transcriptChars = transcript.text.length;
const promptChars = prompt.length;
const totalChars = transcriptChars + promptChars;
console.log(`\nTotal characters: ${totalChars}`);

// Estimate tokens (roughly 4 characters = 1 token)
const estimatedTokens = totalChars / 4;
const tokensInMillions = estimatedTokens / 1_000_000;

// Calculate input costs for different models (rates per 1M tokens)
const gpt5Cost = 1.25 * tokensInMillions;
const claudeSonnetCost = 3.00 * tokensInMillions;
const geminiProCost = 1.25 * tokensInMillions;

console.log(`Estimated input tokens: ${estimatedTokens.toLocaleString("en-US", { maximumFractionDigits: 0 })}`);
console.log(`\nEstimated input costs:`);
console.log(`GPT-5: $${gpt5Cost.toFixed(4)}`);
console.log(`Claude 4.5 Sonnet: $${claudeSonnetCost.toFixed(4)}`);
console.log(`Gemini 2.5 Pro: $${geminiProCost.toFixed(4)}`);
Step-by-Step Guide
Install dependencies
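Install the required library. The Python examples in this guide use the requests package; the JavaScript examples use fetch, which needs no installation in recent Node.js versions and in browsers.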
Set up your API key
Import the necessary libraries and set your AssemblyAI API key, which can be found on your account dashboard:
Python:

import requests
import time

base_url = "https://api.assemblyai.com"
headers = {"authorization": "YOUR_API_KEY"}

JavaScript:

const baseUrl = "https://api.assemblyai.com";
const headers = { authorization: "YOUR_API_KEY" };
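Hard-coding the key is fine for a quick test, but a common alternative is to read it from an environment variable. The sketch below assumes a variable named ASSEMBLYAI_API_KEY, which is a naming choice made for this example rather than something the API requires; build_headers is likewise a hypothetical helper, not part of any AssemblyAI library.

```python
import os


def build_headers(env_var: str = "ASSEMBLYAI_API_KEY") -> dict:
    """Return request headers using an API key read from the environment.

    The default variable name is an assumption made for this example.
    """
    api_key = os.environ.get(env_var)
    if not api_key:
        raise RuntimeError(f"Set the {env_var} environment variable first.")
    return {"authorization": api_key}
```

With this in place, headers = build_headers() can stand in for the hard-coded dictionary in the Python snippet above.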
Transcribe your audio file
Transcribe your audio file using AssemblyAI:
Python:

audio_url = "https://assembly.ai/wildfires.mp3"
data = {"audio_url": audio_url, "speech_models": ["universal-3-pro"]}
response = requests.post(base_url + "/v2/transcript", headers=headers, json=data)
transcript_id = response.json()["id"]
polling_endpoint = base_url + f"/v2/transcript/{transcript_id}"

# Poll for completion
print("Waiting for transcription to complete...")
while True:
    transcript = requests.get(polling_endpoint, headers=headers).json()
    if transcript["status"] == "completed":
        break
    elif transcript["status"] == "error":
        raise RuntimeError(f"Transcription failed: {transcript['error']}")
    time.sleep(3)

JavaScript:

const audioUrl = "https://assembly.ai/wildfires.mp3";
const data = { audio_url: audioUrl, speech_models: ["universal-3-pro"] };

let res = await fetch(`${baseUrl}/v2/transcript`, {
  method: "POST",
  headers: { ...headers, "Content-Type": "application/json" },
  body: JSON.stringify(data),
});
if (!res.ok) throw new Error(`Error: ${res.status}`);

const transcriptResponse = await res.json();
const transcriptId = transcriptResponse.id;
const pollingEndpoint = `${baseUrl}/v2/transcript/${transcriptId}`;

// Poll for completion
console.log("Waiting for transcription to complete...");
let transcript;
while (true) {
  res = await fetch(pollingEndpoint, { headers });
  if (!res.ok) throw new Error(`Error: ${res.status}`);
  transcript = await res.json();
  if (transcript.status === "completed") {
    break;
  } else if (transcript.status === "error") {
    throw new Error(`Transcription failed: ${transcript.error}`);
  }
  await new Promise((resolve) => setTimeout(resolve, 3000));
}
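The polling loops above run until the transcript completes or fails. If you would rather give up after a fixed amount of time, one option is to wrap the loop in a helper with a deadline. This is a minimal sketch, not an AssemblyAI utility: wait_for_transcript, its parameters, and the 600-second default are all assumptions chosen for illustration. The fetch step is passed in as a callable so the helper can be exercised without a network call.

```python
import time
from typing import Callable


def wait_for_transcript(
    fetch_transcript: Callable[[], dict],
    interval_seconds: float = 3.0,
    timeout_seconds: float = 600.0,
) -> dict:
    """Poll until the transcript completes, fails, or the timeout elapses.

    fetch_transcript is any zero-argument callable that returns the
    transcript JSON as a dict (hypothetical interface for this sketch).
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        transcript = fetch_transcript()
        status = transcript.get("status")
        if status == "completed":
            return transcript
        if status == "error":
            raise RuntimeError(f"Transcription failed: {transcript.get('error')}")
        if time.monotonic() >= deadline:
            raise TimeoutError("Transcription did not finish before the timeout.")
        time.sleep(interval_seconds)
```

With the variables from the Python snippet above, a call might look like wait_for_transcript(lambda: requests.get(polling_endpoint, headers=headers).json()).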
Calculate character count
We’ll count the characters in both the transcript and your prompt:
Python:

# Define your prompt
prompt = "Provide a brief summary of the transcript."

# Calculate character count (transcript + prompt)
transcript_chars = len(transcript["text"])
prompt_chars = len(prompt)
total_chars = transcript_chars + prompt_chars
print(f"\nTotal characters: {total_chars}")

JavaScript:

// Define your prompt
const prompt = "Provide a brief summary of the transcript.";

// Calculate character count (transcript + prompt)
const transcriptChars = transcript.text.length;
const promptChars = prompt.length;
const totalChars = transcriptChars + promptChars;
console.log(`\nTotal characters: ${totalChars}`);
For this specific file with the example prompt, the transcript contains approximately 4,880 characters and the prompt contains 42 characters, for a total of 4,922 characters.
Estimate tokens
Different LLM providers use different tokenization methods, but a rough estimate is that 4 characters equals approximately 1 token. This is based on guidance from:
Python:

# Estimate tokens (roughly 4 characters = 1 token)
estimated_tokens = total_chars / 4
tokens_in_millions = estimated_tokens / 1_000_000
print(f"Estimated input tokens: {estimated_tokens:,.0f}")

JavaScript:

// Estimate tokens (roughly 4 characters = 1 token)
const estimatedTokens = totalChars / 4;
const tokensInMillions = estimatedTokens / 1_000_000;
console.log(`Estimated input tokens: ${estimatedTokens.toLocaleString("en-US", { maximumFractionDigits: 0 })}`);
Language considerations

Token counts can differ significantly across languages. Non-English languages typically require more tokens per character than English. For instance, text in languages like Spanish, Chinese, or Arabic may use 2-3 characters per token instead of 4, resulting in higher token costs for the same amount of content.
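To account for the language note above, the characters-per-token ratio can be made a parameter and the estimate reported as a range instead of a single number. The sketch below uses the ratios mentioned in this guide (4 for English, down to 2 for other languages); estimate_tokens and token_range are hypothetical helper names, and the results are rough heuristics rather than real tokenizer counts.

```python
def estimate_tokens(text: str, chars_per_token: float = 4.0) -> float:
    """Estimate the token count from character length using a fixed ratio."""
    if chars_per_token <= 0:
        raise ValueError("chars_per_token must be positive")
    return len(text) / chars_per_token


def token_range(text: str, low_ratio: float = 2.0, high_ratio: float = 4.0) -> tuple:
    """Return (fewest_tokens, most_tokens) for a range of chars-per-token ratios.

    A higher ratio means fewer tokens, so high_ratio gives the lower bound.
    """
    return estimate_tokens(text, high_ratio), estimate_tokens(text, low_ratio)


# Stand-in text with the same length as the example in this guide (4,922 characters).
sample_text = "x" * 4922
low, high = token_range(sample_text)
print(f"Estimated input tokens: between {low:,.0f} and {high:,.0f}")
```

Budgeting against the upper end of the range is the more cautious choice when the transcript is not in English.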
LLM Gateway’s pricing is calculated per 1M input tokens. Here are the current rates for popular models:
Python:

# Calculate input costs for different models (rates per 1M tokens)
gpt5_cost = 1.25 * tokens_in_millions
claude_sonnet_cost = 3.00 * tokens_in_millions
gemini_pro_cost = 1.25 * tokens_in_millions

print("\nEstimated input costs:")
print(f"GPT-5: ${gpt5_cost:.4f}")
print(f"Claude 4.5 Sonnet: ${claude_sonnet_cost:.4f}")
print(f"Gemini 2.5 Pro: ${gemini_pro_cost:.4f}")

JavaScript:

// Calculate input costs for different models (rates per 1M tokens)
const gpt5Cost = 1.25 * tokensInMillions;
const claudeSonnetCost = 3.00 * tokensInMillions;
const geminiProCost = 1.25 * tokensInMillions;

console.log(`\nEstimated input costs:`);
console.log(`GPT-5: $${gpt5Cost.toFixed(4)}`);
console.log(`Claude 4.5 Sonnet: $${claudeSonnetCost.toFixed(4)}`);
console.log(`Gemini 2.5 Pro: $${geminiProCost.toFixed(4)}`);
For our example file with approximately 1,230 input tokens:
GPT-5 (gpt-5): ~$0.0015
Claude 4.5 Sonnet (claude-sonnet-4-5-20250929): ~$0.0037
Gemini 2.5 Pro (gemini-2.5-pro): ~$0.0015
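The figures above follow from one formula: characters divided by 4, divided by 1,000,000, multiplied by the per-million rate. As a sketch, the same arithmetic can be wrapped in a small function and run against the 4,922-character example. The rates in the dictionary are copied from this guide and may not match the rates on your account; estimate_input_cost is a hypothetical helper name.

```python
# Input rates in USD per 1M tokens, copied from the example in this guide.
# Check the Rates table in your dashboard for the rates that apply to you.
INPUT_RATES_PER_MILLION = {
    "gpt-5": 1.25,
    "claude-sonnet-4-5-20250929": 3.00,
    "gemini-2.5-pro": 1.25,
}


def estimate_input_cost(
    total_chars: int,
    rate_per_million: float,
    chars_per_token: float = 4.0,
) -> float:
    """Estimate the input cost in USD from a character count and a per-1M-token rate."""
    estimated_tokens = total_chars / chars_per_token
    return estimated_tokens / 1_000_000 * rate_per_million


# 4,880 transcript characters + 42 prompt characters = 4,922 total.
for model, rate in INPUT_RATES_PER_MILLION.items():
    print(f"{model}: ~${estimate_input_cost(4922, rate):.4f}")
```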
These calculations estimate input token costs only. Output tokens are not included and will vary based on:
The model you choose
The complexity of your request
The length of the generated response
To see the complete pricing for both input and output tokens for all available models, visit the Rates table on the Billing page of your dashboard.
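Once you have looked up an output rate and have a rough idea of how long the response will be, the input estimate can be extended to a total. The sketch below is illustrative only: estimate_total_cost is a hypothetical helper, and the output token count and output rate in the example call are placeholders, not real prices or measurements.

```python
def estimate_total_cost(
    input_tokens: float,
    output_tokens: float,
    input_rate_per_million: float,
    output_rate_per_million: float,
) -> float:
    """Combine input and output token estimates into one USD figure."""
    input_cost = input_tokens / 1_000_000 * input_rate_per_million
    output_cost = output_tokens / 1_000_000 * output_rate_per_million
    return input_cost + output_cost


# Placeholder values: 200 output tokens and a 5.00 output rate are made up
# for this example. Substitute the rates from your dashboard.
example_total = estimate_total_cost(
    input_tokens=1_230,
    output_tokens=200,
    input_rate_per_million=1.25,
    output_rate_per_million=5.00,
)
print(f"Estimated total cost: ${example_total:.4f}")
```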
Next steps