Skip to content

Instantly share code, notes, and snippets.

@jnv
jnv / srt2txt.rb
Last active November 20, 2022 16:11
Converts SRT subtitles to plain text, strips irrelevant parts. Requires gems: srt, sanitize. Created for Doctor Who text analysis project.
#!/usr/bin/env ruby
require "srt"
require "sanitize"
REJECT_LINES = [/Best watched using Open Subtitles MKV Player/,
/Subtitles downloaded from www.OpenSubtitles.org/, /^Subtitles by/,
/www.tvsubtitles.net/, /[email protected]/, /addic7ed/, /allsubs.org/,
/www.seriessub.com/, /www.transcripts.subtitle.me.uk/, /~ Bad Wolf Team/,
/^Transcript by/, /^Update by /, /UKsubtitles.ru/
]