Amazon Transcribe

Automatically convert speech to text

Amazon Transcribe is an automatic speech recognition (ASR) service that makes it easy for developers to add speech-to-text capability to their applications. Using the Amazon Transcribe API, you can analyze audio files stored in Amazon S3 and have the service return a text file of the transcribed speech. You can also send a live audio stream to Amazon Transcribe and receive a stream of transcripts in real time.

Amazon Transcribe can be used for many common applications, including transcribing customer service calls and generating subtitles for audio and video content. The service can transcribe audio files stored in common formats, like WAV and MP3, with time stamps for every word so that you can easily locate the audio in the original source by searching for the text. Amazon Transcribe is continually learning and improving to keep pace with the evolution of language.
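
As a minimal sketch of the batch workflow described above, the Python below assembles the parameters for a transcription job on a file in Amazon S3. The parameter names follow the boto3 `start_transcription_job` call; the job name, bucket, and object key are hypothetical placeholders, and the helper function itself is illustrative rather than part of any SDK.

```python
def build_transcription_job_request(job_name, media_uri, media_format="mp3",
                                    language_code="en-US", output_bucket=None):
    """Assemble keyword arguments for a batch transcription job."""
    request = {
        "TranscriptionJobName": job_name,
        "Media": {"MediaFileUri": media_uri},   # audio file stored in Amazon S3
        "MediaFormat": media_format,            # e.g. "wav" or "mp3"
        "LanguageCode": language_code,
    }
    if output_bucket is not None:
        # Optional: write the transcript to a bucket you own.
        request["OutputBucketName"] = output_bucket
    return request


# Hypothetical job name and S3 location, for illustration only.
request = build_transcription_job_request(
    "support-call-0001", "s3://example-bucket/calls/0001.mp3"
)

# With boto3 installed and credentials configured, the request could be sent as:
#   import boto3
#   boto3.client("transcribe").start_transcription_job(**request)
```

Building the request as a plain dictionary keeps the example runnable without network access; the commented lines show where the actual service call would go.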

Key Features

Easy-to-Read Transcriptions
Most speech recognition systems output a string of text without punctuation. Amazon Transcribe uses deep learning to add punctuation and formatting automatically, so that the output is more intelligible and can be used without any further editing.
Recognize Multiple Speakers
Amazon Transcribe is able to recognize when the speaker changes and attribute the transcribed text appropriately. This can significantly reduce the amount of work needed to transcribe audio with multiple speakers like telephone calls, meetings, and television shows.
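
To illustrate how speaker attribution might be consumed, here is a hedged Python sketch that groups transcribed words into speaker turns. It assumes the transcript JSON has a `results.speaker_labels.segments` list and a `results.items` list shaped like the small hand-written sample below; the sample data and the `speaker_turns` helper are illustrative, not real service output.

```python
def speaker_turns(transcript):
    """Group words into (speaker, text) turns using per-word start times."""
    results = transcript["results"]

    # Map each word's start time to the speaker label of its segment.
    speaker_at = {}
    for segment in results["speaker_labels"]["segments"]:
        for item in segment["items"]:
            speaker_at[item["start_time"]] = segment["speaker_label"]

    turns = []
    for item in results["items"]:
        content = item["alternatives"][0]["content"]
        if item["type"] == "punctuation":
            # Punctuation has no timestamp; attach it to the previous word.
            if turns:
                turns[-1][1] += content
            continue
        speaker = speaker_at.get(item["start_time"], "unknown")
        if turns and turns[-1][0] == speaker:
            turns[-1][1] += " " + content
        else:
            turns.append([speaker, content])
    return [(speaker, text) for speaker, text in turns]


# Hand-written sample in the assumed transcript shape.
sample = {
    "results": {
        "speaker_labels": {
            "segments": [
                {"speaker_label": "spk_0", "items": [
                    {"start_time": "0.0"}, {"start_time": "0.3"}]},
                {"speaker_label": "spk_1", "items": [{"start_time": "1.0"}]},
            ]
        },
        "items": [
            {"type": "pronunciation", "start_time": "0.0", "end_time": "0.3",
             "alternatives": [{"content": "Hi"}]},
            {"type": "pronunciation", "start_time": "0.3", "end_time": "0.6",
             "alternatives": [{"content": "there"}]},
            {"type": "punctuation", "alternatives": [{"content": "."}]},
            {"type": "pronunciation", "start_time": "1.0", "end_time": "1.4",
             "alternatives": [{"content": "Hello"}]},
            {"type": "punctuation", "alternatives": [{"content": "."}]},
        ],
    }
}

turns = speaker_turns(sample)
# turns == [("spk_0", "Hi there."), ("spk_1", "Hello.")]
```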
Custom Vocabulary
Amazon Transcribe gives you the ability to expand and customize the speech recognition vocabulary. You can add new words to the base vocabulary, such as product names, domain-specific terminology, or names of individuals, and generate highly accurate transcriptions specific to your use case.
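
A custom vocabulary is typically created once and then referenced by name when a job is started. The Python sketch below builds both sets of parameters, using the boto3 parameter names `VocabularyName`, `LanguageCode`, `Phrases`, and `Settings`. The vocabulary name, terms, and S3 location are hypothetical, and the two helper functions are illustrative.

```python
def build_vocabulary_request(name, phrases, language_code="en-US"):
    """Assemble keyword arguments for creating a custom vocabulary."""
    cleaned = [p.strip() for p in phrases if p.strip()]
    if not cleaned:
        raise ValueError("a custom vocabulary needs at least one phrase")
    return {
        "VocabularyName": name,
        "LanguageCode": language_code,
        "Phrases": cleaned,
    }


def build_job_with_vocabulary(job_name, media_uri, vocabulary_name,
                              language_code="en-US"):
    """Assemble a job request that references an existing custom vocabulary."""
    return {
        "TranscriptionJobName": job_name,
        "Media": {"MediaFileUri": media_uri},
        "LanguageCode": language_code,
        "Settings": {"VocabularyName": vocabulary_name},
    }


# Hypothetical product terms and locations, for illustration only.
vocab_request = build_vocabulary_request("product-terms", ["Transcribe", " Comprehend ", ""])
job_request = build_job_with_vocabulary(
    "demo-job", "s3://example-bucket/demo.wav", "product-terms"
)

# With boto3 installed and credentials configured:
#   client = boto3.client("transcribe")
#   client.create_vocabulary(**vocab_request)
#   client.start_transcription_job(**job_request)
```

The vocabulary and the job should use the same language code, which is why both helpers default to the same value.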
Channel Identification
Amazon Transcribe is able to process audio and video where each speaker is recorded on a separate channel. Contact centers stand to benefit significantly: they can submit a single audio file to Amazon Transcribe, which will identify each channel and produce a single transcript annotated with channel labels.
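
The sketch below shows how channel identification might be requested for a two-channel contact center recording. It uses the boto3 `Settings` keys `ChannelIdentification`, `ShowSpeakerLabels`, and `MaxSpeakerLabels`. The check that rejects combining channel identification with speaker labels reflects our reading of the API reference and should be treated as an assumption; the job name and S3 location are hypothetical.

```python
def build_settings(channel_identification=False, show_speaker_labels=False,
                   max_speakers=None):
    """Assemble the Settings block for a transcription job request."""
    if channel_identification and show_speaker_labels:
        # Assumption: the service rejects requests that enable both options.
        raise ValueError("choose channel identification or speaker labels, not both")
    settings = {}
    if channel_identification:
        settings["ChannelIdentification"] = True
    if show_speaker_labels:
        if max_speakers is None:
            raise ValueError("max_speakers is required with speaker labels")
        settings["ShowSpeakerLabels"] = True
        settings["MaxSpeakerLabels"] = max_speakers
    return settings


def build_call_center_request(job_name, media_uri, language_code="en-US"):
    """Request for a recording with agent and customer on separate channels."""
    return {
        "TranscriptionJobName": job_name,
        "Media": {"MediaFileUri": media_uri},
        "MediaFormat": "wav",
        "LanguageCode": language_code,
        "Settings": build_settings(channel_identification=True),
    }


# Hypothetical job name and S3 location, for illustration only.
call_request = build_call_center_request(
    "call-2024-001", "s3://example-bucket/calls/stereo-call.wav"
)
```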
Support for a Wide Range of Use Cases

Amazon Transcribe is designed to provide accurate, automated transcripts across a wide range of audio quality. You can generate subtitles for video or audio files, and even transcribe low-quality telephony recordings such as customer service calls.

Streaming Transcription
With Amazon Transcribe, you can transcribe audio to text in real time. Using a secure connection over the HTTP/2 protocol, you can send a live audio stream to the service and receive a stream of text in return.
Timestamp Generation
Amazon Transcribe returns a timestamp for each word, so that you can easily locate the audio in the original recording by searching for the text.
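
As a rough illustration of searching by text, the Python below looks up every occurrence of a word in a transcript and returns its start and end times in seconds. It assumes the transcript JSON exposes a `results.items` list in which spoken words have `type` set to `"pronunciation"` and carry string `start_time` and `end_time` fields; the sample data and the `find_word` helper are hand-written for this example.

```python
def find_word(transcript, word):
    """Return (start, end) times in seconds for each occurrence of `word`."""
    target = word.lower()
    hits = []
    for item in transcript["results"]["items"]:
        if item["type"] != "pronunciation":
            continue  # punctuation items carry no timestamps
        if item["alternatives"][0]["content"].lower() == target:
            hits.append((float(item["start_time"]), float(item["end_time"])))
    return hits


# Hand-written sample in the assumed transcript shape.
sample = {
    "results": {
        "items": [
            {"type": "pronunciation", "start_time": "12.34", "end_time": "12.80",
             "alternatives": [{"content": "Refund"}]},
            {"type": "punctuation", "alternatives": [{"content": ","}]},
            {"type": "pronunciation", "start_time": "13.00", "end_time": "13.40",
             "alternatives": [{"content": "please"}]},
            {"type": "pronunciation", "start_time": "45.10", "end_time": "45.60",
             "alternatives": [{"content": "refund"}]},
        ]
    }
}

hits = find_word(sample, "refund")
# hits == [(12.34, 12.8), (45.1, 45.6)]
```

The returned offsets can be passed to an audio player to jump straight to the matching moment in the original recording.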

Use Cases

Improving Customer Service
By converting audio input into text, Amazon Transcribe lets you build text analytics applications that can search and analyze voice input. Customer contact centers can use Amazon Transcribe to transcribe voice-based interactions, and then mine the data for insights using other Amazon Web Services offerings such as Amazon Comprehend to extract meaning and intent from conversations.
Cataloging Audio Archives
The service enables you to transcribe audio and video assets into fully searchable archives for compliance monitoring and risk management. Customers can use Amazon Transcribe to convert audio to text, and use Amazon Elasticsearch Service to index and perform text-based search across their audio/video library.
Captioning/Subtitling Workflows
Amazon Transcribe can help content producers and media distributors improve reach and accessibility by automatically generating time-stamped subtitles that can be displayed along with the video content.
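
One way to turn word-level timestamps into subtitles is sketched below: the Python groups words into fixed-size cues and formats them as SubRip (SRT) text. The input is assumed to be the `results.items` list of a transcript, shaped like the hand-written sample; the `max_words` grouping rule and both helper functions are illustrative choices, not a service feature.

```python
def srt_time(seconds):
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(items, max_words=7):
    """Build SRT text from transcript items, max_words words per cue."""
    words = []
    for item in items:
        content = item["alternatives"][0]["content"]
        if item["type"] == "punctuation":
            # Punctuation has no timestamp; attach it to the previous word.
            if words:
                words[-1]["text"] += content
            continue
        words.append({"text": content,
                      "start": float(item["start_time"]),
                      "end": float(item["end_time"])})

    blocks = []
    for number, i in enumerate(range(0, len(words), max_words), start=1):
        chunk = words[i:i + max_words]
        text = " ".join(w["text"] for w in chunk)
        blocks.append(
            f"{number}\n{srt_time(chunk[0]['start'])} --> "
            f"{srt_time(chunk[-1]['end'])}\n{text}\n"
        )
    return "\n".join(blocks)


# Hand-written sample in the assumed transcript shape.
items = [
    {"type": "pronunciation", "start_time": "0.0", "end_time": "0.4",
     "alternatives": [{"content": "Hello"}]},
    {"type": "pronunciation", "start_time": "0.5", "end_time": "0.9",
     "alternatives": [{"content": "world"}]},
    {"type": "punctuation", "alternatives": [{"content": "."}]},
]

print(to_srt(items))
```

A production workflow would likely break cues on sentence boundaries or pauses rather than a fixed word count; the fixed count simply keeps the sketch short.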

Benefits

Unlock the value of audio and video content
Raw audio is difficult for computers to search and analyze directly. Amazon Transcribe makes it easy to convert recorded speech into text and integrate these capabilities into applications and downstream tasks. Transcribe can be applied to live audio and video streams or broadcast content for real-time subtitling or transcription.
Save time & money with accurate transcripts

Transcribe uses a deep learning process called automatic speech recognition (ASR) to deliver highly accurate transcripts. Define vocabulary words to generate more accurate transcriptions for domain-specific words and phrases like names or technical terminology.

Transform customer experiences
You can transform customer experiences with Transcribe’s optimized models for call transcription, live video subtitling, and clinical documentation.
