Sign up for free

Generating video transcripts

api.video's AI-driven transcription feature can generate video transcripts using a single API call. You can now avoid manually uploading captions, as the generated transcripts are available as video captions in the standard WebVTT format.

Enable your audience to have seamless user experience regardless of their language or location, and also provide more inclusive and accessible content by inviting deaf or hard-of-hearing users!

api.video uses Whisper for multilingual speech recognition in videos. With data security in mind, we run Whisper's ASR models on our own infrastructure and do not expose data outside our service. You control who gets access to your videos and transcripts.

How to generate transcripts

To enable transcription, set these optional parameters when you create a video object using a POST request to the Create video object endpoint:

FieldTypeDescription
transcriptbooleanWhen true, the API generates a transcript for the video. The default value is false.
languagestringA valid language identifier using IETF language tags. You can use primary subtags like en or fr.

When the value in your request does not match any covered language, the API returns an error.

The API generates transcripts using the transcript parameter. You can define the video language, and the API creates a transcript of the video using the language you specify. If you do not specify a language for the video, the API will detect it automatically.

If you do not set the language parameter, the API analyzes the first 30 seconds of the video and determines the language. When this analysis does not have confident results, for example in case of low quality audio, the API does not generate a transcript.

To help you understand how a video's language was defined, the API returns the languageOrigin attribute in the response when you create a video object:

FieldTypeDescription
languageOriginstringReturns the origin of the last update on the video's language attribute.

The possible values are: api and auto.

- api means that the last update was requested from the API.
- auto means that the last update was done automatically by the API.

About the language parameter

When you set the language parameter, make sure that it matches the actual language used in the video. Your setting forces the API to transcribe in that language. Mismatching language settings or videos with dialogue in multiple languages can return low quality transcripts.

language is a permanent attribute of a video object. You can update it to another language using the PATCH /videos/{videoId} operation. This triggers the API to generate a new transcript using a different language.

When the API generates a transcript, it will be available as a caption for your video. Your audience will be able to select during video playback. You can interact with captions generated through transcription using the Captions endpoints.

Examples

Transcribe a video regardless of the language, using POST /videos:

{
  "title": "An awesome video",
  "transcript": true
}

Transcribe a video and force the source language to English, using POST /videos:

{
  "title": "Yet another awesome video",
  "language": "en",
  "transcript": true
}

Generate transcript for a previously uploaded video, using PATCH /videos/{videoId}:

{
  "language": "en",
  "transcript": true
}

Set a default language for a new video, using POST /videos:

{
  "title": "Another awesome video",
  "language": "en"
}

Set a default language for an existing video, using PATCH /videos/{videoId}:

{
  "language": "fr"
}

Supported languages

Transcription is currently available for these languages:

LanguageIETF Language Tag
Arabicar
Catalanca
Czechcs
Danishda
Germande
Greekel
Englishen
Spanishes
Persianfa
Finnishfi
Frenchfr
Hebrewhe
Hindihi
Croatianhr
Hungarianhu
Italianit
Japaneseja
Koreanko
Malayalamml
Dutchnl
Norwegian Nynorsknn
Norwegianno
Polishpl
Portuguesept
Russianru
Slovaksk
Sloveniansl
Telugute
Turkishtr
Ukrainianuk
Urduur
Vietnamesevi
Chinesezh

Next steps

  • Learn more about the WebVTT format and managing captions in the Captions guide
  • Check out other AI-driven features like video translation and AI summary in our blog

Was this page helpful?