@tetsuok
Created May 10, 2013 12:08
Data format converter for Chris' fast_align: https://github.com/clab/fast_align
#!/usr/bin/env python3
"""
Data format converter for Chris' fast_align: https://github.com/clab/fast_align
For the input format, please see the README of his code.
"""
import sys


def main():
    if len(sys.argv) != 3:
        # Report usage on stderr and exit with a non-zero status.
        print('Usage: %s fr en' % sys.argv[0], file=sys.stderr)
        sys.exit(1)
    fr = sys.argv[1]
    en = sys.argv[2]
    # TODO: error check for the input data.
    # zip() stops at the shorter file, so extra lines in the longer one are dropped.
    with open(fr) as fr_file, open(en) as en_file:
        for fr_line, en_line in zip(fr_file, en_file):
            print('%s ||| %s' % (fr_line.rstrip(), en_line.rstrip()))


if __name__ == '__main__':
    main()
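
The TODO in the script leaves input validation open. Below is a minimal sketch of one way to approach it. It is not part of the original script: the helper name `paired_lines` is hypothetical, and treating mismatched line counts, empty sentences, and a literal `|||` inside a sentence as errors is an assumption about what the check should cover.

```python
import itertools


def paired_lines(fr_lines, en_lines):
    """Yield 'fr ||| en' strings, raising ValueError on malformed input.

    Hypothetical helper: the checks below are assumptions about what
    "error check" should mean, not part of the original script.
    """
    pairs = itertools.zip_longest(fr_lines, en_lines)
    for i, (fr_line, en_line) in enumerate(pairs, 1):
        # zip_longest pads the shorter input with None.
        if fr_line is None or en_line is None:
            raise ValueError('line %d: inputs differ in length' % i)
        fr_sent, en_sent = fr_line.strip(), en_line.strip()
        if not fr_sent or not en_sent:
            raise ValueError('line %d: empty sentence' % i)
        if '|||' in fr_sent or '|||' in en_sent:
            raise ValueError('line %d: sentence contains the separator' % i)
        yield '%s ||| %s' % (fr_sent, en_sent)
```

The helper accepts any two iterables of lines, so it can be fed two open file objects in place of the plain pairing loop, or two lists of strings when testing.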