pavelbinar/extract-subtitles-from-mkv.md

Forked from bmaeser/subtitle-extract.txt

Last active June 16, 2025 06:11


Extract subtitles from .mkv files on Mac OS X


Extract Subtitles From mkv

Instructions have moved to the REMOTE ORIGIN website: Extract Subtitles From mkv

sunsetrunner commented Dec 14, 2020

Works on an M1 Mac mini. You just need to open Terminal in Rosetta 2 mode before installing brew. Don't forget to set a folder location for the extracted SRT files; otherwise they are saved in your user home directory. Thanks for sharing.
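To illustrate the tip about choosing an output folder: mkvextract writes each track to the path given after the colon in a `TRACK_ID:path` argument, so the destination can be controlled there. The following is only a minimal sketch; the helper name `build_track_spec`, the `subs` folder, and the example file name are hypothetical and not part of the original instructions.

```python
from pathlib import Path


def build_track_spec(track_id, mkv_path, out_dir, ext="srt"):
    """Return a 'TRACK_ID:output_path' argument for mkvextract.

    The output file is named after the MKV file but placed in out_dir.
    """
    out_file = Path(out_dir) / (Path(mkv_path).stem + "." + ext)
    return "{}:{}".format(track_id, out_file)


# Hypothetical example: track 2 of an MKV, written into a "subs" folder.
spec = build_track_spec(2, "movies/Example.Film.mkv", "subs")
print(spec)
```

The resulting string would be passed to mkvextract in place of a bare `TRACK_ID:name.srt` argument; the output folder itself should exist before extraction.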

larryy commented Dec 14, 2020 (edited)

If the tool is installed and in your $PATH it will run. The name of the file is irrelevant to the .srt subtitle problem. Unfortunately, there are multiple issues being discussed, not all having to do with subtitles.

That said, trying to extract text subtitles, like .srt, from a .mkv file that has bitmap subtitles won't work, whether from the command line or using a GUI utility like MKVToolNix. MKVToolNix will extract bitmap subtitles to a .mks file, and I suspect that's what this tool is doing as well. Either this tool or ffmpeg would have to implement OCR to convert bitmap subtitles to text subtitles. Sadly, neither one does, but that's understandable, since open source OCR tools are not very good without a language model of some kind. There's a Mac application called Subtitle Extractor in the App Store that does this, but it has no language model, so it makes silly mistakes like rendering "silly" as "sil/y", "I'm" as "I 'm", "with" as "With", "won't" as "won 't", and so on. It gets enough right that you could probably fix the result by hand, but it would be tedious because there are a lot of errors. Better than nothing, I guess.

Consider it a complex feature request I guess, to implement OCR with a language model to convert bitmap subtitles to text subtitles.
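One way to tell in advance whether a file has text or bitmap subtitles is to look at the codec label that `mkvmerge -i` prints for each subtitle track. The sketch below parses such output and sorts tracks into text and bitmap groups. The codec labels in `TEXT_CODECS` and `BITMAP_CODECS` are an assumed, non-exhaustive list, the sample output is made up for illustration, and `classify_subtitle_tracks` is a hypothetical helper.

```python
import re

# Codec labels as printed by `mkvmerge -i` (assumed, not exhaustive).
TEXT_CODECS = {"SubRip/SRT", "SubStationAlpha"}
BITMAP_CODECS = {"HDMV PGS", "VobSub"}

TRACK_RE = re.compile(r"Track ID (\d+): subtitles \(([^)]+)\)")


def classify_subtitle_tracks(identify_output):
    """Split subtitle tracks into (text, bitmap, unknown) lists of (id, codec)."""
    text, bitmap, unknown = [], [], []
    for match in TRACK_RE.finditer(identify_output):
        track = (int(match.group(1)), match.group(2))
        if track[1] in TEXT_CODECS:
            text.append(track)
        elif track[1] in BITMAP_CODECS:
            bitmap.append(track)
        else:
            unknown.append(track)
    return text, bitmap, unknown


# Made-up sample of `mkvmerge -i` output, for illustration only.
sample = """File 'example.mkv': container: Matroska
Track ID 0: video (AVC/H.264/MPEG-4p10)
Track ID 1: audio (AC-3)
Track ID 2: subtitles (SubRip/SRT)
Track ID 3: subtitles (HDMV PGS)
"""

text, bitmap, unknown = classify_subtitle_tracks(sample)
print("text:", text)
print("bitmap (would need OCR):", bitmap)
```

Only the tracks in the text group can be extracted directly to .srt or .ssa; anything in the bitmap group would still need an OCR step.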

victornpb commented Dec 31, 2020

Extract subtitles from MKV on all subdirectories

extract_subtitles.py

import re
import subprocess
from os import path, walk

# Adjust this to the MKVToolNix version installed on your machine.
tool_path = "/Applications/MKVToolNix-51.0.0.app/Contents/MacOS/"
root_dir = "./"

def find_files(root, ext):
    """Return a sorted list of files under root whose names end with ext."""
    file_list = []
    for dirpath, dirnames, filenames in walk(root):
        for filename in filenames:
            # Skip macOS "._" metadata files.
            if filename.endswith(ext) and not filename.startswith('._'):
                file_list.append(path.join(dirpath, filename))
    file_list.sort()
    return file_list

for file in find_files(root_dir, ".mkv"):
    # Strip only the trailing extension, not every ".mkv" in the path.
    basename = path.splitext(file)[0]

    if path.exists(basename + ".srt") or path.exists(basename + ".ssa"):
        print("Already exists, skipping...", basename)
        continue

    # List the tracks in the file
    result = subprocess.run([tool_path + "mkvmerge", "-i", file], stdout=subprocess.PIPE, check=True)
    output = result.stdout.decode("utf-8", errors="replace")

    # SubRip .srt (first matching track only)
    srt_track = re.search(r'Track ID (\d+): subtitles \(SubRip/SRT\)', output)
    if srt_track:
        srt_track = "{}:{}.{}".format(srt_track.group(1), basename, "srt")

    # SubStation Alpha .ssa (first matching track only)
    ssa_track = re.search(r'Track ID (\d+): subtitles \(SubStationAlpha\)', output)
    if ssa_track:
        ssa_track = "{}:{}.{}".format(ssa_track.group(1), basename, "ssa")

    if not srt_track and not ssa_track:
        print('No SRT or SSA track found!', file, output)
        continue

    # Extract the tracks that were found; filter(None, ...) drops the missing one.
    subprocess.run(list(filter(None, [tool_path + "mkvextract", "tracks", file, srt_track, ssa_track])), check=True)

print("Finished!")

Then run it in a terminal:

python3 extract_subtitles.py
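The script above uses `re.search`, so it picks up only the first SRT track and the first SSA track of each file. For files with several subtitle tracks (for example, multiple languages), a possible variation is to collect every match with `re.finditer` and name each output after its track ID. This is a sketch, not part of the original script: `build_extract_args`, the `.trackN.` naming scheme, and the sample output are hypothetical.

```python
import re

SUB_RE = re.compile(r"Track ID (\d+): subtitles \((SubRip/SRT|SubStationAlpha)\)")
EXTENSIONS = {"SubRip/SRT": "srt", "SubStationAlpha": "ssa"}


def build_extract_args(identify_output, mkv_file, basename, tool="mkvextract"):
    """Build an mkvextract argument list covering every text subtitle track.

    Returns an empty list when no SRT or SSA track is present.
    """
    specs = []
    for match in SUB_RE.finditer(identify_output):
        track_id, codec = match.group(1), match.group(2)
        # One output file per track, e.g. film.track2.srt
        specs.append("{}:{}.track{}.{}".format(track_id, basename, track_id, EXTENSIONS[codec]))
    if not specs:
        return []
    return [tool, "tracks", mkv_file] + specs


# Made-up `mkvmerge -i` output with three text subtitle tracks.
sample = """Track ID 0: video (AVC/H.264/MPEG-4p10)
Track ID 1: audio (AAC)
Track ID 2: subtitles (SubRip/SRT)
Track ID 3: subtitles (SubRip/SRT)
Track ID 4: subtitles (SubStationAlpha)
"""

args = build_extract_args(sample, "film.mkv", "film")
print(args)
```

The returned list could be handed to `subprocess.run(args, check=True)` in the same way the script does, with `tool` set to the full path of mkvextract.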

victorboykocom commented Jan 11, 2021

same issue exactly with an srt file of over 10MB with binary data.
SAD!
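A multi-megabyte .srt full of binary data suggests the track was not a text subtitle to begin with. As a rough sanity check, the start of an extracted file can be inspected for NUL bytes and for an SRT timecode line. This is a heuristic sketch that assumes UTF-8 text; `looks_like_text_srt` is a hypothetical helper and the sample byte strings are made up.

```python
import re

# SRT cue timing line, e.g. "00:00:01,000 --> 00:00:02,500"
TIMECODE_RE = re.compile(r"\d{2}:\d{2}:\d{2},\d{3} --> \d{2}:\d{2}:\d{2},\d{3}")


def looks_like_text_srt(data):
    """Heuristically decide whether a bytes sample is a text SRT file.

    Assumes UTF-8 encoded subtitles; NUL bytes are treated as a sign of binary data.
    """
    if b"\x00" in data:
        return False
    text = data.decode("utf-8", errors="replace")
    return TIMECODE_RE.search(text) is not None


good = b"1\n00:00:01,000 --> 00:00:02,500\nHello\n"
bad = b"PG\x00\x00\x12\x34\x00\x00\x16\x00\x13"  # made-up binary bytes

print(looks_like_text_srt(good), looks_like_text_srt(bad))
```

In practice the sample would come from something like `open(srt_path, "rb").read(4096)`, where `srt_path` is the extracted file.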

victorboykocom commented Jan 11, 2021

There's a Mac application called Subtitle Extractor in the App Store that does this

Thank you for this @larryy ! I needed an SRT to translate subtitles into another language and needed precise time codes, this app does it!

kwccoin commented Apr 15, 2021

For SRT it works, but for SUP subtitles, like Japanese and Chinese, it seems OCR is needed.

salishrodinger commented Jun 26, 2021

Thank you so much, it worked !!

alsciende commented Nov 16, 2021

Thanks a lot!

phamhphuc commented Mar 11, 2022

Very simple, and it works really well. Thank you very much.

RichardsonTxMan commented Jun 15, 2025

There's a Mac application called Subtitle Extractor in the App Store that does this

Thank you for this @larryy ! I needed an SRT to translate subtitles into another language and needed precise time codes, this app does it!

Yeah, it costs $15, too!

Check out the Python script that victornpb posted here, it works perfectly! All you have to do is change the text of the script to whatever version of MKVToolNix you have and then run it with an MKV file in the same folder. I got it to work on the first try... easy, peasy.

Don't forget to buy him a coffee via the link on his page to say thanks.
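Instead of editing the version number in `tool_path` by hand, the tools could be looked up at run time: first on `$PATH`, then among any installed `MKVToolNix-*.app` bundles, preferring the highest version. The sketch below is one possible approach under those assumptions; `version_key`, `pick_tool_dir`, and `find_tool` are hypothetical helpers, and the glob pattern simply mirrors the path used in the script.

```python
import glob
import os
import re
import shutil


def version_key(app_path):
    """Return the version in '.../MKVToolNix-51.0.0.app/...' as a tuple of ints."""
    match = re.search(r"MKVToolNix-(\d+(?:\.\d+)*)\.app", app_path)
    return tuple(int(part) for part in match.group(1).split(".")) if match else ()


def pick_tool_dir(candidates):
    """Return the candidate directory with the highest version, or None if empty."""
    return max(candidates, key=version_key, default=None)


def find_tool(name, candidates):
    """Prefer a tool found on $PATH; otherwise fall back to the newest app bundle."""
    on_path = shutil.which(name)
    if on_path:
        return on_path
    tool_dir = pick_tool_dir(candidates)
    return os.path.join(tool_dir, name) if tool_dir else None


# Directories of installed app bundles (an empty list if none are installed).
candidates = glob.glob("/Applications/MKVToolNix-*.app/Contents/MacOS/")
print(find_tool("mkvmerge", candidates))
```

If `find_tool` returns `None`, neither a `$PATH` install nor an app bundle was found, and the path still has to be set manually.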
