@ramanqul
Created March 14, 2015 20:42
Cyrillic to Latin File Changer
#!/usr/bin/python
# -*- coding: utf-8 -*-
from os import walk, rename, unlink, mkdir, remove
from os.path import isdir, exists
from sys import argv, exit, getfilesystemencoding
from shutil import copyfile
import shutil

conversion = {
    u'А': u'A',
    u'Б': u'B',
    u'В': u'V',
    u'Г': u'G',
    u'Д': u'D',
    u'Е': u'E',
    u'Ё': u'E',
    u'Ж': u'Zh',
    u'З': u'Z',
    u'И': u'I',
    u'Й': u'Y',
    u'К': u'K',
    u'Л': u'L',
    u'М': u'M',
    u'Н': u'N',
    u'О': u'O',
    u'П': u'P',
    u'Р': u'R',
    u'С': u'S',
    u'Т': u'T',
    u'У': u'U',
    u'Ф': u'F',
    u'Х': u'H',
    u'Ц': u'Ts',
    u'Ч': u'Ch',
    u'Ш': u'Sh',
    u'Щ': u'Sch',
    u'Ъ': u'',
    u'Ы': u'Y',
    u'Ь': u'',
    u'Э': u'E',
    u' ': u'_',
}

def cyr2lat(s):
    retval = ""
    for c in s:
        if ord(c) > 128:
            try:
                c = conversion[c.upper()].lower()
            except KeyError:
                c = ''
        elif c == ' ':
            c = '_'
        retval += c
    return retval

if len(argv) == 1:
    print "Usage: %s <dirs>" % argv[0]
    exit(-1)

processed = []

def recursive_walk(dir):
    # See http://docs.activestate.com/activepython/2.5/whatsnew/2.3/node6.html
    found = []
    dir = unicode(dir)
    for finfo in walk(dir, True):
        dirnames = finfo[1]
        fnames = finfo[2]
        for subdir in dirnames:
            subdir = "%s/%s" % (dir, subdir)
            if subdir in processed:
                continue
            for yield_val in recursive_walk(subdir):
                yield yield_val
        for fname in fnames:
            yield '%s/%s' % (dir, fname)
        raise StopIteration

if __name__ == "__main__":
    fs_enc = getfilesystemencoding()
    for dir in argv[1:]:
        for fpath in recursive_walk(dir):
            new_fpath = cyr2lat(fpath).lower()
            print 'new path %s' % new_fpath
            print fpath.encode('utf-8')
            # First make dirs
            path_elts = new_fpath.split('/')
            for idx in range(len(path_elts))[1:]:
                subpath = '/'.join(path_elts[:idx])
                while True:
                    i = 0
                    if exists(subpath):
                        if not isdir(subpath):
                            print '%s exists but is not a directory, will try again' % subpath
                            subpath += str(i)
                            continue
                        else:
                            path_elts[idx - 1] = subpath.split('/')[-1]
                            break
                    else:
                        print 'Creating directory: %s' % subpath
                        mkdir(subpath)
                        break
            print 'Copying %s to %s' % (fpath, new_fpath)
            shutil.copyfile(fpath, new_fpath)
            remove(fpath)

@a-x-

a-x- commented Nov 2, 2015

Hi,
Would you like a bug report?

Prerequisites:
wget https://gist.githubusercontent.com/braman/a8504c15ca537ea49c6a/raw/cyrillic2latin_file_renamer.py
chmod a+x ./cyrillic2latin_file_renamer.py
ls

.
..
243
cyrillic2latin_file_renamer.py

./cyrillic2latin_file_renamer.py .

Result:

new path ./243
./243
Copying ./243 to ./243
Traceback (most recent call last):
  File "./cyrillic2latin_file_renamer.py", line 106, in <module>
    shutil.copyfile(fpath, new_fpath)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/shutil.py", line 69, in copyfile
    raise Error("`%s` and `%s` are the same file" % (src, dst))
shutil.Error: `./243` and `./243` are the same file

This happens because the file path is the same before and after renaming, so shutil.copyfile() is asked to copy the file onto itself.
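
The same-file error in the traceback above could be avoided by comparing the two paths before copying. What follows is only a sketch under assumptions: it is written for Python 3 rather than the Python 2 of the gist, and the helper name `needs_move` is hypothetical, not something the script defines.

```python
import os


def needs_move(src, dst):
    """Return True only when dst names a different path than src.

    Hypothetical helper (not part of the gist). normpath is used so that
    spellings such as './243' and '243' compare as equal.
    """
    return os.path.normpath(src) != os.path.normpath(dst)


if __name__ == "__main__":
    # './243' contains no Cyrillic, so transliteration leaves it unchanged.
    for src, dst in [("./243", "./243"), ("./old_name", "./new_name")]:
        if needs_move(src, dst):
            print("would move %s -> %s" % (src, dst))
        else:
            print("skipping %s (name unchanged)" % src)
```

In the script, such a check would sit just before the `shutil.copyfile(fpath, new_fpath)` call, so that unchanged files are neither copied nor removed.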

@cirona

cirona commented Mar 17, 2017

Hi,
How can I change the script so that it only renames each file to its new name?
At the moment the names come out duplicated:
adriana - znaya, znaya adriana - znaq, znaq.mp3

@etrushkin

etrushkin commented Apr 1, 2017

Hi,
I've fixed errors in the script. You can get the modified version here.

@tigran123

Various bugs fixed in this version:

https://gist.github.com/tigran123/aafee6f58295ce6e6c06af9eb0aab0fb

First of all, the case should be preserved (no need to lowercase). Second, you forgot two letters: Ю and Я. Third, you should not copy the file as this is extremely slow (on a multi-terabyte library) and will fail if the amount of free filesystem space is less than the size of the file. Instead, just use rename().
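
The three points above can be sketched roughly as follows. This is not the code from the linked gist; it is a minimal Python 3 illustration under stated assumptions: the names `TABLE`, `translit` and `rename_tree` are hypothetical, the mappings for Ю and Я (`yu`, `ya`) are one common choice rather than anything the original table specifies, and the remaining mappings are copied from the original script.

```python
import os

# Lowercase table; an uppercase source letter gets a capitalised replacement.
# 'ю' and 'я' are assumed mappings, the rest follow the original script.
TABLE = {
    'а': 'a', 'б': 'b', 'в': 'v', 'г': 'g', 'д': 'd', 'е': 'e', 'ё': 'e',
    'ж': 'zh', 'з': 'z', 'и': 'i', 'й': 'y', 'к': 'k', 'л': 'l', 'м': 'm',
    'н': 'n', 'о': 'o', 'п': 'p', 'р': 'r', 'с': 's', 'т': 't', 'у': 'u',
    'ф': 'f', 'х': 'h', 'ц': 'ts', 'ч': 'ch', 'ш': 'sh', 'щ': 'sch',
    'ъ': '', 'ы': 'y', 'ь': '', 'э': 'e', 'ю': 'yu', 'я': 'ya', ' ': '_',
}


def translit(name):
    """Transliterate one path component, preserving letter case."""
    out = []
    for ch in name:
        low = ch.lower()
        if low in TABLE:
            rep = TABLE[low]
            # ch != low means the source letter was uppercase.
            out.append(rep.capitalize() if ch != low else rep)
        else:
            out.append(ch)  # leave everything else untouched
    return ''.join(out)


def rename_tree(root):
    """Rename files and directories under root in place with os.rename().

    Walks bottom-up so entries are renamed before their parent directory.
    Returns the list of (old_path, new_path) pairs that were renamed.
    """
    renamed = []
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        for name in filenames + dirnames:
            new = translit(name)
            if not new or new == name:
                continue  # nothing to do, or the name would become empty
            src = os.path.join(dirpath, name)
            dst = os.path.join(dirpath, new)
            if os.path.exists(dst):
                continue  # never overwrite an existing entry
            os.rename(src, dst)
            renamed.append((src, dst))
    return renamed
```

Because `os.rename()` within one directory only changes the directory entry, no file data is copied and no extra free space is needed, which is the reason given above for preferring it to a copy followed by a delete.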
