@jones2126
Created November 25, 2025 14:19
Python script to download transcripts from YouTube videos with optional timestamps.

YouTube Transcript Downloader

A simple Python script to download transcripts from YouTube videos with optional timestamps.

Description

This script downloads the transcript (subtitles/captions) of a YouTube video, provided the video has captions available, and saves it as a text file. It also fetches the video title and puts it at the top of the transcript for easy reference.

Features

  • Downloads transcripts from YouTube videos
  • Automatically fetches and includes video title
  • Optional timestamp inclusion in [MM:SS] format
  • Saves transcript to a text file named {video_id}_transcript.txt
  • Clean, readable output

Prerequisites

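You'll need Python 3 installed on your system. The script itself relies only on Python 3.6 features such as f-strings, but current releases of yt-dlp and youtube-transcript-api require newer Python versions, so check each package's documented minimum before installing.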

Installation

  1. Install the required dependencies:
pip install youtube-transcript-api yt-dlp

Note: Make sure to use pip install (not apt-get) for Python packages.

Usage

Basic Usage

  1. Open youtube_transcript_download.py in a text editor
  2. Update the url variable with your YouTube video URL:
    url = "https://www.youtube.com/watch?v=YOUR_VIDEO_ID"
  3. Run the script:
    python youtube_transcript_download.py

The transcript will be printed to the console and saved to a file named {video_id}_transcript.txt.

With Timestamps

To include timestamps in the transcript, set include_timestamps to True:

include_timestamps = True

This will format each line like:

[00:15] Welcome to this video
[00:18] Today we're going to talk about...
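
As a minimal sketch of how the [MM:SS] prefix can be derived from a line's start time in seconds: the helper name format_timestamp is hypothetical and does not appear in the script, although the script's own loop uses the same divmod arithmetic. Minutes are not wrapped at 60, so a line that starts 75 minutes into the video is shown as [75:03] rather than in an hours format.

```python
def format_timestamp(start_seconds: float) -> str:
    """Format a start offset in seconds as [MM:SS] (hypothetical helper)."""
    # Fractions of a second are dropped, then the total is split
    # into whole minutes and remaining seconds.
    mins, secs = divmod(int(start_seconds), 60)
    return f"[{mins:02d}:{secs:02d}]"


print(format_timestamp(15.4))  # [00:15]
print(format_timestamp(4503))  # [75:03]
```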

Without Timestamps (Default)

Keep include_timestamps as False for a clean transcript without timestamps:

include_timestamps = False

Configuration

The script has two main configuration options in the === CONFIGURATION === section:

  • url: The YouTube video URL you want to download the transcript from
  • include_timestamps: Boolean flag (True or False) to include/exclude timestamps
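
The script reads the video ID by splitting the URL on "v=", so it expects a standard watch URL. A possible alternative, sketched here with the standard library only, also accepts youtu.be short links. The helper name extract_video_id is an assumption for illustration and is not part of the script.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse


def extract_video_id(video_url: str) -> Optional[str]:
    """Return the video ID from a watch or youtu.be URL, else None (hypothetical helper)."""
    parsed = urlparse(video_url)
    host = (parsed.hostname or "").lower()
    if host == "youtu.be":
        # Short links carry the ID as the first path segment.
        candidate = parsed.path.lstrip("/").split("/")[0]
        return candidate or None
    if host == "youtube.com" or host.endswith(".youtube.com"):
        # Standard watch links carry the ID in the v= query parameter.
        values = parse_qs(parsed.query).get("v")
        return values[0] if values else None
    return None


print(extract_video_id("https://www.youtube.com/watch?v=kE3hPpAanXg&t=30s"))
print(extract_video_id("https://youtu.be/kE3hPpAanXg"))
```

Returning None for unrecognized URLs lets the caller print a clear message instead of failing with an IndexError.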

Example Output

File: kE3hPpAanXg_transcript.txt

Title: Example Video Title

This is the first line of the transcript.
This is the second line.
And so on...
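
The layout above (a title header, a blank line, then one transcript line per row) and the file name can be reproduced with a few lines of string handling. This is only a sketch: build_output is a hypothetical helper, and the sample lines are taken from the example above rather than from a real video.

```python
from typing import List, Tuple


def build_output(title: str, lines: List[str], video_id: str) -> Tuple[str, str]:
    """Return (file contents, file name) in the layout shown above (hypothetical helper)."""
    body = "\n".join(lines)
    # The title header is separated from the transcript by one blank line.
    contents = f"Title: {title}\n\n{body}"
    filename = f"{video_id}_transcript.txt"
    return contents, filename


contents, filename = build_output(
    "Example Video Title",
    ["This is the first line of the transcript.", "This is the second line."],
    "kE3hPpAanXg",
)
print(filename)
print(contents)
```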

Troubleshooting

"No transcript found" error

  • The video may not have captions/subtitles available
  • Try a different video that you know has captions

"Could not fetch title" warning

  • The script will still work, but the title will show as "Unknown Title"
  • Check your internet connection

Import errors

  • Make sure you've installed both dependencies: youtube-transcript-api and yt-dlp
  • Use pip install, not apt-get, for Python packages
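
To see which dependency is missing before running the script, a quick check along these lines may help. It is a sketch using only the standard library, and missing_modules is a hypothetical helper name. The import names use underscores (youtube_transcript_api, yt_dlp) even though the pip package names use hyphens.

```python
from importlib.util import find_spec
from typing import Iterable, List


def missing_modules(module_names: Iterable[str]) -> List[str]:
    """Return the module names that cannot be found for import (hypothetical helper)."""
    return [name for name in module_names if find_spec(name) is None]


# Import names as used by the script, not the pip package names.
missing = missing_modules(["youtube_transcript_api", "yt_dlp"])
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("Both dependencies are importable.")
```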

API Version Note

This script uses youtube-transcript-api version 1.x, which uses the instance-based approach:

ytt_api = YouTubeTranscriptApi()
transcript = ytt_api.fetch(video_id)
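
If it is unclear which version is installed, the major version can be read with the standard library's importlib.metadata (available from Python 3.8). This is a sketch under that assumption; installed_major_version is a hypothetical helper and is not part of the script.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_major_version(dist_name: str) -> Optional[int]:
    """Return the major version of an installed distribution, or None (hypothetical helper)."""
    try:
        raw = version(dist_name)
    except PackageNotFoundError:
        return None
    # Take the part before the first dot, e.g. "1" from "1.2.3".
    head = raw.split(".")[0]
    return int(head) if head.isdigit() else None


major = installed_major_version("youtube-transcript-api")
if major is None:
    print("youtube-transcript-api is not installed (or its version is unreadable).")
elif major < 1:
    print("Installed version predates 1.x; the instance-based calls may not be available.")
else:
    print(f"Major version {major} detected.")
```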

# youtube_transcript_download.py
from youtube_transcript_api import YouTubeTranscriptApi
import yt_dlp  # used only to fetch the video title


def get_transcript(video_url, include_timestamps=False):
    """
    Download transcript from a YouTube video, with optional title.

    Args:
        video_url: Full YouTube URL (must contain a v= query parameter)
        include_timestamps: If True, include [MM:SS] timestamps (default: False)

    Returns:
        Tuple of (transcript text with title header, video ID)

    Raises:
        RuntimeError: If the transcript cannot be fetched
    """
    # Extract video ID from URL (expects the standard watch?v=... form)
    video_id = video_url.split("v=")[1].split("&")[0]

    # Fetch title using yt-dlp; fall back to a placeholder on any failure
    title = None
    try:
        ydl_opts = {'quiet': True}  # Minimal output
        with yt_dlp.YoutubeDL(ydl_opts) as ydl:
            info = ydl.extract_info(video_url, download=False)
            title = info.get('title', 'Unknown Title')
    except Exception as e:
        print(f"Warning: Could not fetch title: {e}")
        title = 'Unknown Title'

    # Fetch transcript
    ytt_api = YouTubeTranscriptApi()
    try:
        transcript = ytt_api.fetch(video_id)
    except Exception as e:  # Or specifically TranscriptsDisabled, NoTranscriptFound
        # Raise instead of returning a bare string, so the caller's
        # two-value unpacking does not fail with a confusing ValueError.
        raise RuntimeError(f"Error fetching transcript: {e}") from e

    if include_timestamps:
        lines = []
        for snippet in transcript.snippets:
            mins, secs = divmod(int(snippet.start), 60)
            timestamp = f"[{mins:02d}:{secs:02d}]"
            lines.append(f"{timestamp} {snippet.text}")
        full_text = "\n".join(lines)
    else:
        full_text = "\n".join(snippet.text for snippet in transcript.snippets)

    # Prepend title as header
    full_text_with_title = f"Title: {title}\n\n{full_text}"
    return full_text_with_title, video_id


# === CONFIGURATION ===
# url = "https://www.youtube.com/watch?v=eRhw-TYEKy4"
url = "https://www.youtube.com/watch?v=kE3hPpAanXg"
include_timestamps = False

# === RUN ===
if __name__ == "__main__":
    try:
        text, video_id = get_transcript(url, include_timestamps)
    except RuntimeError as e:
        # Exit with the error message rather than writing it to a file
        raise SystemExit(str(e))
    print(text)

    # Save to file
    filename = f"{video_id}_transcript.txt"
    with open(filename, "w", encoding="utf-8") as f:
        f.write(text)
    print(f"\nSaved to {filename}")