Free · Private · In-Browser

VTT to TXT Converter — Extract Just the Dialogue

This VTT to TXT converter strips the WEBVTT header, cue identifiers, timestamps, and STYLE / NOTE / REGION blocks from a WebVTT caption file and returns only the spoken text. The conversion runs entirely in your browser, so transcripts of private material never leave your device.

WebVTT is the caption format used by HTML5 video and exported by tools such as YouTube and Whisper. It works well for playback but is hard to read: the header, metadata, cue identifiers, timestamps, and style markup all clutter the file. For summarization, search, translation, or quoting in a document, you almost always want just the words.

The output is a UTF-8 .txt file with one cue per line, in original cue order. Multi-line cues are joined into a single line with a space separator, and inline cue tags (such as <v Speaker>, <c.class>, or <i>) are stripped so only the spoken dialogue remains.
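
As a rough illustration of the rules above, here is a minimal TypeScript sketch. It is not the converter's actual source: the function name vttToText and the regular expressions are assumptions made for this example. It treats any blank-line-separated block that contains a "-->" timing line as a cue and skips every other block, which covers the header and the STYLE, NOTE, and REGION blocks.

```typescript
// Hypothetical helper for illustration only; not the tool's real implementation.
function vttToText(vtt: string): string {
  const normalized = vtt
    .replace(/^\uFEFF/, "")   // drop a leading byte-order mark, if any
    .replace(/\r\n?/g, "\n"); // normalize CRLF and CR line endings to LF

  const output: string[] = [];

  // Blocks in a WebVTT file are separated by one or more blank lines.
  for (const block of normalized.split(/\n(?:[ \t]*\n)+/)) {
    const blockLines = block.split("\n");

    // Only cue blocks contain a timing line such as "00:01.000 --> 00:04.000".
    const timingIndex = blockLines.findIndex((line) => line.includes("-->"));
    if (timingIndex === -1) continue; // header, NOTE, STYLE, or REGION block

    // Everything after the timing line is cue text; the identifier, if present,
    // sits before it and is dropped along with the timestamps.
    const text = blockLines
      .slice(timingIndex + 1)
      .join(" ")                // join multi-line cues with a space
      .replace(/<[^>]*>/g, "")  // strip inline tags like <v Speaker>, <c.class>, <i>
      .replace(/\s+/g, " ")     // collapse leftover runs of whitespace
      .trim();

    if (text) output.push(text); // one cue per output line, in original order
  }

  return output.join("\n");
}
```

Given a cue whose text lines are "<v Alice>Hello there," and "how are <i>you</i>?", the sketch returns the single line "Hello there, how are you?". Character references such as &amp; are left untouched in this sketch.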

Use the copy-to-clipboard button to paste straight into an editor or chat, or download the .txt for archival or further processing.

Frequently asked questions

Are my captions uploaded?

No. The transformation runs locally in your browser. Captions and the extracted text never leave your device.

What about STYLE, NOTE, and REGION blocks?

They are stripped along with timestamps and cue identifiers. Only the spoken dialogue is kept.

Are inline cue tags removed?

Yes. Tags like <v Speaker>, <c.class>, or <i> are stripped from the text so the output reads cleanly.

How are multi-line cues handled?

Lines within a cue are joined with a space, so each cue becomes a single line in the output.

Can I feed the output into an LLM?

Yes. Plain dialogue without timestamps or markup works well as input for summarization, translation, or transcript-search prompts.

Files stay on your device. No login. Installs as a PWA on iPhone, Android, and desktop.