Below is a side-by-side comparison of basic snippets that transcribe a file with Google Speech-to-Text and with AssemblyAI.
Google STT

    from google.cloud import speech

    client = speech.SpeechClient()

    audio = speech.RecognitionAudio(
        uri="gs://cloud-samples-tests/speech/Google_Gnome.wav"
    )
    config = speech.RecognitionConfig(
        encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
        sample_rate_hertz=16000,
        language_code="en-US",
        model="video",  # Chosen model
    )

    operation = client.long_running_recognize(config=config, audio=audio)

    print("Waiting for operation to complete...")
    response = operation.result(timeout=90)

    for i, result in enumerate(response.results):
        alternative = result.alternatives[0]
        print("-" * 20)
        print(f"First alternative of result {i}")
        print(f"Transcript: {alternative.transcript}")

AssemblyAI

    import assemblyai as aai

    aai.settings.api_key = "YOUR-API-KEY"
    transcriber = aai.Transcriber()

    # You can use a local filepath:
    # audio_file = "./example.mp3"

    # Or use a publicly-accessible URL:
    audio_file = (
        "https://assembly.ai/sports_injuries.mp3"
    )

    config = aai.TranscriptionConfig(
        speech_models=["universal-3-pro", "universal-2"],
        language_detection=True,
    )

    transcript = transcriber.transcribe(audio_file, config)

    if transcript.status == aai.TranscriptStatus.error:
        print(f"Transcription failed: {transcript.error}")
        exit(1)

    print(transcript.text)
Google STT

    from google.cloud import speech

    client = speech.SpeechClient()

AssemblyAI

    import assemblyai as aai

    aai.settings.api_key = "YOUR-API-KEY"
    transcriber = aai.Transcriber()
When migrating from Google Speech-to-Text to AssemblyAI, you’ll first need to handle authentication and SDK setup:
Get your API key from your AssemblyAI dashboard.
Things to know:
Store your API key securely in an environment variable
API key authentication works the same across all AssemblyAI SDKs
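As a minimal sketch of the environment-variable advice above, the helper below reads the key at startup and fails early if it is missing. The variable name `ASSEMBLYAI_API_KEY` and the helper names `load_api_key` and `configure_sdk` are assumptions made for illustration; only `aai.settings.api_key` comes from the snippets in this guide.

```python
import os


def load_api_key(env=None, name="ASSEMBLYAI_API_KEY"):
    """Return the API key from the environment, or raise if it is unset.

    `name` is a hypothetical variable name chosen for this example.
    """
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"Set the {name} environment variable first")
    return key


def configure_sdk():
    """Apply the key to the AssemblyAI SDK (requires `pip install assemblyai`)."""
    import assemblyai as aai  # imported lazily so the helper above has no SDK dependency

    aai.settings.api_key = load_api_key()
    return aai.Transcriber()
```

Calling `configure_sdk()` once at startup keeps the key out of source control, and the early `RuntimeError` makes a missing key obvious before any audio is submitted.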
Here are helpful things to know when migrating your audio input handling:
There’s no need to specify the audio encoding format when using AssemblyAI. Our transcoding pipeline handles all supported file types under the hood, so you get the most accurate transcription without extra configuration.
You can submit a local file, URL, stream, buffer, blob, etc., directly to our transcriber. Check out some common ways you can host audio files here.
You can transcribe audio files up to 10 hours long, and you can transcribe multiple files in parallel. On the pay-as-you-go (PAYG) plan, the default limit is 200 concurrent transcription jobs.
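The points above about local files, URLs, and parallel jobs could be combined roughly as follows. This is a sketch, not the SDK's own batching mechanism: `transcribe_many` and `assemblyai_transcribe` are hypothetical helper names, the thread pool is a standard-library stand-in for whatever concurrency your application already uses, and `max_workers=8` is an arbitrary value well below the default limit mentioned above.

```python
from concurrent.futures import ThreadPoolExecutor


def transcribe_many(sources, transcribe_one, max_workers=8):
    """Run `transcribe_one` over every source in parallel.

    Returns a dict mapping each source (local path or URL) to its result.
    """
    sources = list(sources)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results line up with sources.
        return dict(zip(sources, pool.map(transcribe_one, sources)))


def assemblyai_transcribe(source):
    """Transcribe one local path or public URL; no audio encoding is specified."""
    import assemblyai as aai  # assumes aai.settings.api_key is already set

    transcript = aai.Transcriber().transcribe(source)
    if transcript.status == aai.TranscriptStatus.error:
        raise RuntimeError(f"Transcription failed: {transcript.error}")
    return transcript.text
```

A call such as `transcribe_many(["./example.mp3", "https://assembly.ai/sports_injuries.mp3"], assemblyai_transcribe)` would submit both inputs at once. Passing the per-file function as an argument keeps the batching logic independent of the SDK, which also makes it easy to test with a stub.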
Google STT

    print("Waiting for operation to complete...")
    response = operation.result(timeout=90)

    for i, result in enumerate(response.results):
        alternative = result.alternatives[0]
        print("-" * 20)
        print(f"First alternative of result {i}")
        print(f"Transcript: {alternative.transcript}")