How to summarize a YouTube video with ChatGPT

ChatGPT is free only through the web UI, which limits the input (aka the prompt) to approx. 4000 characters (~500 words).

There are easy-to-use tools that generate a summary for short videos,
e.g. https://notegpt.io/youtube-video-summarizer

There's a Chrome & Safari browser extension that does the trick very nicely for short videos:
YouTube Summary with ChatGPT & Claude

It can download a video's subtitles and feed them to ChatGPT, but it works well only for shorter (< 20-30 min) videos.
For anything longer, the extension has to cut corners by submitting only a part of the subtitles.
I think that in a standard browser environment an extension running on one page cannot open a new page and manipulate the new tab's content,
and it would take something like that to copy the entire subtitles of a YT video over to ChatGPT in the browser.
You could probably do this in an Electron app, but that's a lot of work and I have no experience with Electron.

I've worked out a half-automated, quick & dirty method to feed all of the subtitles into ChatGPT and get a summary of the entire video.

Of course this works best if the video has proper (human-written/verified) subtitles and not the YT auto-generated stuff.

First, download the video's subtitles in SRT format using yt-dlp (imho youtube-dl is outdated and doesn't work for some videos):
yt-dlp --skip-download --write-subs --convert-subs srt --sub-langs en --output input "https://www.youtube.com/watch?v=pq34V_V5j18"
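
Since the result depends so much on whether the video has human-written subtitles, it can help to check what's available before downloading. Below is a minimal sketch of that. The two function names are made up for illustration; --list-subs and --write-auto-subs are regular yt-dlp options. As far as I know, when both --write-subs and --write-auto-subs are given, a human-written track is preferred over the automatic captions for the same language, but treat that as an assumption and verify it on your video.

```shell
# Hypothetical helper names; only the yt-dlp options themselves are real.

# Print the subtitle tracks that exist for a video without downloading anything.
list_subs() {
  yt-dlp --list-subs "$1"
}

# Same download as the command above, but also accept the auto-generated
# English captions in case there is no human-written English track.
fetch_subs_with_fallback() {
  yt-dlp --skip-download --write-subs --write-auto-subs \
    --convert-subs srt --sub-langs en --output input "$1"
}
```

Usage: list_subs "https://www.youtube.com/watch?v=pq34V_V5j18" and see whether English shows up among the regular subtitles or only among the automatic captions.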

Now split the subtitle file into chunks of 45 subtitles (this chunk size works nicely with ChatGPT's free web UI) and prepend some instructions to each chunk to turn it into a ChatGPT prompt.
I wrote a simple Awk script to do this, but you can apply a similar logic in any other language:
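awk '
BEGIN {
  cnt = 1        # number of the subtitle currently being read
  idx = 1        # number of the current chunk file
  split_by = 45  # subtitles per chunk
  between = 1    # 1 while we are between subtitles (file start or blank lines)
  file = sprintf("subtitles_%03u.srt", idx)
  printf "Summarize the conversation of a video based on its subtitles in SRT format as seen below.\n\n" > file
}
# A blank line ends the current subtitle.
/^[[:space:]]*$/ {
  if (between == 0) {
    cnt = cnt + 1
    if (cnt % split_by == 1) {
      # The chunk is full: close it and start the next one with its own prompt.
      close(file)
      idx = idx + 1
      file = sprintf("subtitles_%03u.srt", idx)
      printf "Continue the summary of the conversation based on the SRT subtitles below.\n\n" > file
    } else {
      print "" > file
    }
    between = 1
  }
  next
}
# Any other line belongs to the current subtitle.
{
  print > file
  between = 0
}
' input.en.srt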
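
To see whether the generated chunks really fit into a prompt, you can count the characters in each file. This is just a rough sketch: check_chunks and the LIMIT variable are names I made up, and the default of 4000 is the approximate limit mentioned at the top, not an exact figure.

```shell
# Report the character count of every chunk file in the current directory
# and flag the ones above the limit.
# LIMIT is a hypothetical knob; it defaults to the approx. 4000 characters.
check_chunks() {
  limit=${LIMIT:-4000}
  for f in subtitles_*.srt; do
    [ -e "$f" ] || continue                  # no chunk files: nothing to do
    chars=$(wc -m < "$f" | tr -d ' ')        # characters, per the current locale
    if [ "$chars" -gt "$limit" ]; then
      printf '%s: %s characters (exceeds the limit of %s)\n' "$f" "$chars" "$limit"
    else
      printf '%s: %s characters\n' "$f" "$chars"
    fi
  done
}
```

If a chunk gets flagged, lowering split_by in the Awk script and re-running it is the simplest fix.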

You could even use ChatGPT to generate the source code to produce the splits in any programming language. :)

Now start a new chat in ChatGPT and copy & paste the prompts one by one from the subtitles_*.srt files, in increasing order.
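
The copy & paste part can be made a bit less tedious with a small loop that puts each prompt on the clipboard and waits for you. This is only a sketch: feed_prompts and CLIP_CMD are made-up names, and the clipboard command is an assumption about your system (xclip on Linux/X11, pbcopy on macOS).

```shell
# Copy each prompt file to the clipboard in increasing order, pausing after
# each one so it can be pasted into ChatGPT.
# CLIP_CMD is a hypothetical setting: any command that reads the text to
# copy from its standard input.
feed_prompts() {
  clip_cmd=${CLIP_CMD:-xclip -selection clipboard}
  for f in subtitles_*.srt; do
    [ -e "$f" ] || continue
    # Intentionally unquoted so that a command with arguments is split into words.
    $clip_cmd < "$f"
    printf 'Copied %s. Paste it into ChatGPT, then press Enter here.\n' "$f" >&2
    read -r reply || break
  done
}
```

On macOS you would run it as CLIP_CMD=pbcopy feed_prompts in the directory that holds the chunk files.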

It will result in something like this:
https://chat.openai.com/share/a9131cba-5bc6-46e1-a931-47711606a6b0

This way you create a summary for each chunk of subtitles.
Of course you can fine-tune the ChatGPT prompts (in the Awk script) to produce a better result or even to get the result in another language.
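
If you end up tweaking the prompts a lot, it may be handier to pass them in as arguments instead of editing the script each time. Here's a rough sketch of that idea, not a drop-in replacement for the script above: split_srt and its argument order are made up for illustration, and unlike the original it opens a chunk file only when there's a subtitle to put into it.

```shell
# split_srt FILE FIRST_PROMPT NEXT_PROMPT [CHUNK_SIZE]
# Writes subtitles_001.srt, subtitles_002.srt, ... into the current directory.
# Note: awk interprets backslash escapes in values passed with -v.
split_srt() {
  awk -v first="$2" -v cont="$3" -v split_by="${4:-45}" '
    # Close the previous chunk (if any) and start a new one with the given prompt.
    function start_chunk(prompt) {
      if (file != "") close(file)
      idx++
      file = sprintf("subtitles_%03d.srt", idx)
      printf "%s\n\n", prompt > file
    }
    BEGIN { n = 0; between = 1 }   # n counts the subtitles completed so far
    # Blank lines after a subtitle mark its end.
    /^[[:space:]]*$/ {
      if (!between) { n++; between = 1 }
      next
    }
    {
      if (between) {
        # First line of a subtitle: start a new chunk or add a separator line.
        if (n % split_by == 0) start_chunk((n == 0) ? first : cont)
        else print "" > file
      }
      print > file
      between = 0
    }
  ' "$1"
}

# Example with illustrative prompts asking for a German summary:
# split_srt input.en.srt \
#   "Summarize in German the conversation of a video based on its SRT subtitles below." \
#   "Continue the German summary based on the SRT subtitles below."
```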

One variation: feed the entire subtitle file in chunks and ask for a summary of the video only at the end.
awk '
BEGIN {
  cnt = 1
  idx = 1
  split_by = 45
  between = 1
  file = sprintf("subtitles_%03u.srt", idx)
  # Within one awk run, ">" truncates a file only when it is first opened; later prints append.
  print "There'"'"'s a video recording of a conversation for which I'"'"'ll give you the subtitles in SRT format." > file
  print "I'"'"'ll give you parts of the subtitles and after each part, confirm that you read and processed it by replying with a simple \"OK\"." > file
  printf "Here comes the first part of the SRT formatted subtitles:\n\n" > file
}
# A blank line ends the current subtitle.
/^[[:space:]]*$/ {
  if (between == 0) {
    cnt = cnt + 1
    if (cnt % split_by == 1) {
      close(file)
      idx = idx + 1
      file = sprintf("subtitles_%03u.srt", idx)
      printf "Here comes the next part of the SRT formatted subtitles. Reply with a simple \"OK\" to let me know that you understood.\n\n" > file
    } else {
      print "" > file
    }
    between = 1
  }
  next
}
{
  print > file
  between = 0
}
# The final prompt goes into a file of its own; ">" avoids appending to a leftover file from an earlier run.
END {
  close(file)
  idx = idx + 1
  file = sprintf("subtitles_%03u.srt", idx)
  print "That was the last part of the subtitles." > file
  print "Summarize the video for me." > file
}
' input.en.srt

An example output of the above is here:
https://chat.openai.com/share/6b6bc713-476d-4d0a-ae8a-15fcc128539f

Unfortunately, my experience is that this way you only get a summary of the last chunk. :(
E.g. if I ask questions about stuff that was mentioned in the first chunks (names, etc.), ChatGPT doesn't seem to "remember" it, presumably because the earlier chunks no longer fit into the amount of conversation it can take into account.