Created
November 2, 2013 06:54
Another script that translates text from one non-standard Cyrillic code page to another by adding a fixed offset to each character code. This one is in Python, circa 2004.
#!/usr/bin/python
# -*- coding: windows-1251 -*-
r"""
:Authors: Dimitar A. Dimitrov
:Contact: dimiter[at]blue[dash]edge[dot]bg
:Copyright: This work is licensed under the X license.
    For the full text of the license see http://www.opensource.org/licenses/xnet.php
:Version: 0.1
:Date: 2004-08-22
:Abstract: This script tries to extract Cyrillic text from a custom code page.
"""
import sys

# First codes of the upper- and lower-case letter blocks in the source code page.
alpha_upper = 0x80
alpha_lower = 0xA0

# Read the whole input file named on the command line.
infile = open(sys.argv[1])
text = infile.read()
infile.close()

# Write the converted text next to the input, with a ".cyr" suffix.
outfile = open(sys.argv[1] + ".cyr", "w")
for c in text:
    code = ord(c)
    # Special cases: individual characters mapped one by one.
    if c == "�": c = "-"
    elif c in "��": c = "|"
    elif c == "�": c = "�"
    # Letters: shift by the distance between the source and target blocks.
    elif code in range(alpha_upper, alpha_upper + 31): c = chr(ord("�") + code - alpha_upper)
    elif code in range(alpha_lower, alpha_lower + 31): c = chr(ord("�") + code - alpha_lower)
    outfile.write(c)
outfile.close()
print "done."
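
The offset translation used by the script can be sketched in Python 3 with a 256-entry lookup table. This is only an illustration under stated assumptions, not the author's code: the source offsets 0x80 and 0xA0 and the span of 31 codes come from the script, while the target offsets 0xC0 and 0xE0 are an assumption (they are where windows-1251 places "А" and "а", which fits the script's declared encoding). The names `build_table`, `translate`, `DST_UPPER`, `DST_LOWER` and `SPAN` are hypothetical, and the single-character special cases are left out because their literals cannot be read from the listing.

```python
# Offsets taken from the script: start of the upper- and lower-case blocks
# in the custom source code page.
SRC_UPPER = 0x80
SRC_LOWER = 0xA0

# Assumption: the target is windows-1251, where "А" is 0xC0 and "а" is 0xE0.
DST_UPPER = 0xC0
DST_LOWER = 0xE0

# The script shifts 31 consecutive codes per block; the sketch keeps that value.
SPAN = 31


def build_table() -> bytes:
    """Return a 256-byte table that shifts both letter blocks and keeps the rest."""
    table = bytearray(range(256))  # identity mapping to start with
    for i in range(SPAN):
        table[SRC_UPPER + i] = DST_UPPER + i
        table[SRC_LOWER + i] = DST_LOWER + i
    return bytes(table)


TABLE = build_table()


def translate(data: bytes) -> bytes:
    """Translate raw bytes from the custom code page using the offset table."""
    return data.translate(TABLE)


# Small demonstration on hand-made input bytes.
sample = bytes([0x80, 0xA0, 0x20, 0x81, 0xA1])
print(translate(sample).decode("windows-1251"))  # Аа Бб
```

Working on `bytes` rather than text keeps the arithmetic explicit and avoids guessing a decoder for the non-standard source code page; only the final `decode("windows-1251")` depends on the assumed target encoding.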