Created December 1, 2011 14:51
I use this all the time to convert plain text into something the Festival text-to-speech package can understand.
#!/usr/bin/env python
# coding=UTF-8
import sys

for path in sys.argv[1:]:
    with open(path) as f:
        text = f.read()
    # Join everything into one long line.
    text = text.replace('\n', ' ').replace('\r', ' ')
    # Put each sentence in its own paragraph.
    text = text.replace('. ', '.\n\n')
    # Drop the accent from é.
    text = text.replace('é', 'e')
    # '\xc2' and '\xa0' are the two bytes of a UTF-8 non-breaking space.
    text = text.replace('\xc2', ' ').replace('\xa0', ' ')
    # Collapse doubled spaces (two passes), then strip the space left at the start of a line.
    text = text.replace('  ', ' ').replace('  ', ' ')
    text = text.replace('\n ', '\n')
    print(text)
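
The replacement chain above can also be wrapped in a function, which makes its effect easier to check on a small sample. This is only a sketch, assuming Python 3. The name `prepare_for_festival` is hypothetical and not part of the original script, and a regular expression stands in for the repeated double-space replacements. Because spaces are collapsed before the sentences are split, no separate clean-up of leading spaces is needed.

```python
import re


def prepare_for_festival(text):
    """Hypothetical helper: flatten text into one sentence per paragraph."""
    # Join lines and treat a non-breaking space as an ordinary space.
    text = text.replace('\r', ' ').replace('\n', ' ').replace('\xa0', ' ')
    # Same substitution as the script: drop the accent from é.
    text = text.replace('é', 'e')
    # Collapse any run of spaces to a single space
    # (assumption: a regex replaces the repeated replace() calls).
    text = re.sub(r' +', ' ', text)
    # Start a new paragraph after each sentence-ending ". ".
    text = text.replace('. ', '.\n\n')
    return text.strip()


if __name__ == '__main__':
    sample = 'The café is open.\nCome in   today. Bye.'
    print(prepare_for_festival(sample))
```

For the sample string, the function returns three short paragraphs: "The cafe is open.", "Come in today." and "Bye.", each separated by a blank line.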