@seungwon0
Created February 4, 2012 15:19
Convert SRT into Text
#!/usr/bin/env perl
#
# srt2txt - Convert SRT into Text
#
# Seungwon Jeong <[email protected]>
#
# Copyright (C) 2012 by Seungwon Jeong
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful, but
# WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
# General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see
# <http://www.gnu.org/licenses/>.
use strict;
use warnings;
use English qw< -no_match_vars >;
use HTML::Strip;
# Paragraph mode: each read returns one blank-line-separated subtitle block
local $INPUT_RECORD_SEPARATOR = q{};
my $hs = HTML::Strip->new();
while ( defined( my $input = <> ) ) {

    # Remove subtitle number and start/end time
    $input =~ s{\A \d+ \n [\d:,]+ [ ]-->[ ] [\d:,]+ \n}{}xms;

    # Strip HTML-like markup
    $input = $hs->parse($input);
    $hs->eof();

    print $input;
}