Audio to subtitle

Upload an audio file, generate subtitles with Whisper, then review and export as SRT, VTT, or ASS. For video, use the Video to Subtitle tool.

Upload audio
Please upload audio only: MP3, WAV, M4A, AAC, FLAC, OGG, Opus.

Convert audio to subtitles with AI

Powered by OpenAI Whisper, this free online tool transcribes audio files into accurate, timestamped subtitles you can edit and export as SRT, VTT, or ASS.

Powered by OpenAI Whisper

Uses OpenAI's Whisper model to deliver accurate speech-to-text transcription across many languages and accents.

Common audio formats

Upload MP3, WAV, M4A, AAC, FLAC, OGG, Opus, and more up to 100 MB. This page is for audio files only.
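The format and size limits above can be checked before a file is ever uploaded. The following is a minimal sketch, not the tool's actual validation: the function name `check_upload` is hypothetical, the extension list covers only the formats named on this page (the page says "and more", so the real list is likely longer), and "100 MB" is assumed to mean 100 × 1024 × 1024 bytes.

```python
from pathlib import Path
from typing import Optional

# Extensions named on this page; the real tool may accept others ("and more").
ALLOWED_EXTENSIONS = {".mp3", ".wav", ".m4a", ".aac", ".flac", ".ogg", ".opus"}

# Assumption: "100 MB" is interpreted as 100 MiB.
MAX_BYTES = 100 * 1024 * 1024


def check_upload(filename: str, size_bytes: int) -> Optional[str]:
    """Return an error message, or None if the file looks acceptable."""
    ext = Path(filename).suffix.lower()  # case-insensitive extension check
    if ext not in ALLOWED_EXTENSIONS:
        return f"Unsupported file type: {ext or '(none)'}"
    if size_bytes > MAX_BYTES:
        return "File is larger than 100 MB"
    return None
```

A video file such as `clip.mp4` would be rejected by this check, which matches the page's advice to use the Video to Subtitle tool for video.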

Automatic language detection

Let Whisper detect the spoken language, or manually select from 19 supported languages for more accurate results.

Export SRT, VTT, or ASS

Edit cues in the preview, pick SubRip, WebVTT, or Advanced SubStation Alpha, then download a file ready for players or editors.

How to generate subtitles from audio

  1. Upload your audio file

    Drag and drop or click to select an audio file from your device. Supported formats include MP3, WAV, M4A, AAC, FLAC, OGG, and more, up to 100 MB.

  2. Select language and transcribe

    Choose the spoken language or use auto-detect, then click Generate Subtitles. Whisper processes the audio and returns timed subtitles.

  3. Review, edit, and download

    Check the subtitle preview, double-click any cue to correct errors, choose SRT, VTT, or ASS, then download.
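For readers who want to reproduce the transcription step locally, the steps above can be sketched as follows. This page does not say how the tool calls Whisper, so the sketch assumes the open-source `openai-whisper` Python package (which also needs ffmpeg installed). The helper names `segments_to_cues` and `transcribe`, and the choice of the `base` model, are illustrative assumptions.

```python
from typing import List, Optional, Tuple

# A cue is (start seconds, end seconds, text).
Cue = Tuple[float, float, str]


def segments_to_cues(segments: List[dict]) -> List[Cue]:
    """Convert Whisper-style segment dicts into (start, end, text) cues."""
    cues = []
    for seg in segments:
        text = seg["text"].strip()
        if text:  # skip segments that contain only whitespace
            cues.append((float(seg["start"]), float(seg["end"]), text))
    return cues


def transcribe(path: str, language: Optional[str] = None) -> List[Cue]:
    """Transcribe an audio file; language=None lets Whisper auto-detect."""
    # Imported lazily so the pure helper above works without the package.
    import whisper  # assumption: `pip install openai-whisper`

    model = whisper.load_model("base")  # model size is an arbitrary choice
    result = model.transcribe(path, language=language)
    return segments_to_cues(result["segments"])
```

Passing a language code such as `language="en"` corresponds to choosing the language manually in step 2; leaving it as `None` corresponds to auto-detect.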

Audio to subtitle examples, transcription quality, and export tips

Generate subtitles from audio when you have a podcast, voice memo, interview, lecture, or narration track and need timed captions for editing or publishing.

Example input and output

Audio source
MP3, WAV, M4A, AAC, FLAC, OGG, or Opus speech file
Subtitle output
1
00:00:00,000 --> 00:00:03,200
Welcome to today's episode.
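The sample cue above follows the SubRip (SRT) layout: a running index, a time range written as `HH:MM:SS,mmm --> HH:MM:SS,mmm`, then the cue text, with a blank line between cues. A minimal sketch of producing that layout from timed cues is shown below; the function names are hypothetical and this is not the tool's own exporter.

```python
from typing import Iterable, Tuple


def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp (comma before milliseconds)."""
    ms = round(seconds * 1000)
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(cues: Iterable[Tuple[float, float, str]]) -> str:
    """Render (start, end, text) cues as SRT text, numbered from 1."""
    blocks = []
    for index, (start, end, text) in enumerate(cues, start=1):
        time_range = f"{srt_timestamp(start)} --> {srt_timestamp(end)}"
        blocks.append(f"{index}\n{time_range}\n{text}\n")
    return "\n".join(blocks)  # blank line between consecutive cues


print(to_srt([(0.0, 3.2, "Welcome to today's episode.")]))
```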

Best for

Podcast captions

Create subtitles or transcripts from spoken audio before publishing clips.

Interviews and lectures

Turn long-form spoken recordings into timed text for review and editing.

Narration workflows

Generate SRT or VTT captions for voiceover tracks used in videos.

Common file issues handled

Audio clarity

Clear speech and low background noise improve transcription accuracy.

Language selection

Manual language selection can help when auto-detect guesses incorrectly.

Editable output

Review cue text before downloading to catch names and specialized terms.

Multiple formats

Download SRT for broad support, VTT for web, or ASS for styled workflows.
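SRT and VTT are close relatives: a WebVTT file starts with a `WEBVTT` header line and uses a period instead of a comma before the milliseconds. The sketch below converts simple SRT text to WebVTT under those two rules. It is a hedged illustration with a hypothetical function name, and it does not handle styling tags or the ASS format.

```python
def srt_to_vtt(srt_text: str) -> str:
    """Convert simple SRT text to WebVTT."""
    lines = ["WEBVTT", ""]  # required header followed by a blank line
    for line in srt_text.splitlines():
        if "-->" in line:
            # Timing line: WebVTT writes milliseconds after a period.
            line = line.replace(",", ".")
        lines.append(line)
    return "\n".join(lines) + "\n"
```

The numeric cue indexes from SRT are kept as they are, since WebVTT allows an optional identifier line before each cue's timing line.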

Frequently Asked Questions

What audio formats are supported?
The tool accepts common audio formats such as MP3, WAV, M4A, AAC, FLAC, OGG, and Opus. The maximum file size is 100 MB. Video files such as MP4 are not supported on this page; use the Video to Subtitle tool instead.
How accurate is the automatic transcription?
Transcription accuracy depends on audio quality, background noise, and the clarity of speech. Whisper performs best with clear, single-speaker audio. You can always edit the generated subtitles before downloading.
Does it support multiple languages?
Yes. Whisper supports multilingual transcription, and the current UI lets you manually select 19 common languages or use automatic language detection.
Is there a file length limit?
Audio under 1 minute can be transcribed without signing in. For longer audio files, sign in to use free credits or upgrade to an active membership. The maximum file size is 100 MB.
What subtitle formats can I download?
You can export SubRip (.srt), WebVTT (.vtt), or Advanced SubStation Alpha (.ass). SRT and VTT work almost everywhere; ASS is useful when you need richer styling in compatible players.
Can I use this to add subtitles to a video?
Yes. Generate subtitles from your audio, edit them if needed, then download an SRT, VTT, or ASS file and load it in your video editor or player. To transcribe directly from a video file, use the Video to Subtitle tool.