@ymoslem
Last active May 19, 2024 05:12
Compute WER score for the whole dataset
# Corpus WER
# WER score for the whole corpus
# Run this file from CMD/Terminal
# Example Command: python3 corpus-wer.py test_file_name.txt mt_file_name.txt
import sys
from jiwer import wer
target_test = sys.argv[1]  # Reference file (human translations), one segment per line
target_pred = sys.argv[2]  # Machine-translated (MT) file, one segment per line
# Open the test dataset human translation file
with open(target_test) as test:
    refs = test.readlines()
# Open the translation file by the NMT model
with open(target_pred) as pred:
    preds = pred.readlines()
wer_file = "wer-" + target_pred + ".txt"
# Calculate WER for the whole corpus
wer_score = wer(refs, preds)
print("WER Score:", wer_score)
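
To make clear what a corpus-level score means here, the following is a minimal pure-Python sketch, not jiwer's implementation. It assumes corpus WER is the total number of word-level edits (substitutions, deletions, insertions) across all segments divided by the total number of reference words, rather than the mean of per-sentence scores. The helper names `word_edit_distance` and `corpus_wer` are hypothetical and exist only for this illustration. Tokenization is plain whitespace splitting, so results can differ from jiwer wherever its default text transforms behave differently.

```python
def word_edit_distance(ref_words, hyp_words):
    """Word-level Levenshtein distance between two token lists."""
    # prev[j] holds the distance between the reference prefix seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion (reference word missing from hypothesis)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def corpus_wer(refs, preds):
    """Pooled WER: total edits divided by total reference words."""
    if len(refs) != len(preds):
        raise ValueError("Reference and prediction files must have the same number of lines")
    total_edits = 0
    total_ref_words = 0
    for ref, pred in zip(refs, preds):
        ref_words = ref.split()  # split() also drops the trailing newline
        total_edits += word_edit_distance(ref_words, pred.split())
        total_ref_words += len(ref_words)
    if total_ref_words == 0:
        raise ValueError("References contain no words")
    return total_edits / total_ref_words


if __name__ == "__main__":
    refs = ["the cat sat\n", "hello world\n"]
    preds = ["the cat sat down\n", "hello\n"]
    # One insertion plus one deletion over five reference words
    print("WER Score:", corpus_wer(refs, preds))  # 0.4
```

Pooling the counts means long segments weigh more than short ones, which is why this number generally differs from an average of sentence-level WER scores.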