# Sentence METEOR
# METEOR mainly works on sentence evaluation rather than corpus evaluation
# Run this file from CMD/Terminal
# Example Command: python3 sentence-meteor.py test_file_name.txt mt_file_name.txt

import sys
from nltk.translate.meteor_score import meteor_score

target_test = sys.argv[1]  # Test file argument
target_pred = sys.argv[2]  # MTed file argument

# Open the test dataset human translation file
with open(target_test) as test:
    refs = test.readlines()
#print("Reference 1st sentence:", refs[0])

# Open the translation file by the NMT model
with open(target_pred) as pred:
    preds = pred.readlines()

meteor_file = "meteor-" + target_pred + ".txt"

# Calculate METEOR for each sentence and save the result to a file
with open(meteor_file, "w+") as output:
    for line in zip(refs, preds):
        test = line[0]
        pred = line[1]
        #print(test, pred)

        meteor = round(meteor_score([test], pred), 2)  # list of references
        #print(meteor, "\n")

        output.write(str(meteor) + "\n")

print("Done! Please check the METEOR file '" + meteor_file + "' in the same folder!")
You can only use BLEU during validation in OpenNMT-tf. Other evaluation metrics should be run independently on the test dataset after training a model.
If you have further questions about OpenNMT, please send them to its GitHub repository or its forum.
How to run this with OpenNMT-py
The script above is not tied to any particular framework. First, you train an NMT model, or you might already have a pretrained NMT model. Then, you translate your test dataset source with that model. Finally, you use this script, or any evaluation tool such as sacreBLEU, to compare the MT output with the reference target translation from your test dataset.
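For the last step, here is a minimal sketch of corpus-level scoring with the sacrebleu Python API, assuming sacrebleu is installed; "mt.txt" (the MT output) and "ref.txt" (the reference translation of the test dataset) are hypothetical file names. The same two files are what you would pass to the METEOR script above.

# Minimal sketch: corpus-level BLEU with the sacrebleu Python API.
# "mt.txt" and "ref.txt" are hypothetical file names for the MT output
# and the reference translation of the test dataset.
from sacrebleu.metrics import BLEU

with open("mt.txt") as f:
    hypotheses = [line.strip() for line in f]

with open("ref.txt") as f:
    references = [line.strip() for line in f]

bleu = BLEU()
print(bleu.corpus_score(hypotheses, [references]))  # one inner list per reference set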
You can find two notebooks here that explain how to use OpenNMT-py:
https://github.com/ymoslem/OpenNMT-Tutorial
I hope this helps.
Dear @ymoslem, the following error shows up when line no 34 (meteor = round(meteor_score([test], pred), 2) # list of references) is used exactly as written:
"hypothesis" expects pre-tokenized hypothesis (Iterable[str]): xxxx xxxxxx xxxxxx .
But if lines no 30 and 31 are cast to the list type, i.e. test = list(line[0]) and pred = list(line[1]), then everything is fine. Should it be modified/pulled?
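For reference, recent NLTK releases (roughly 3.6.6 and later) expect meteor_score to receive pre-tokenized input, i.e. lists of tokens rather than raw strings; note that list() on a string splits it into characters, whereas the metric expects word tokens. A minimal sketch with hypothetical example sentences:

# Minimal sketch for newer NLTK releases where meteor_score expects
# pre-tokenized input (lists of tokens) rather than raw strings.
from nltk.translate.meteor_score import meteor_score

reference = "the cat sat on the mat"   # hypothetical example sentences
hypothesis = "the cat is on the mat"

# Word-level tokenization; list(reference) would split the string into characters.
meteor = round(meteor_score([reference.split()], hypothesis.split()), 2)
print(meteor)  # nltk.download('wordnet') may be needed on first use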
Hi, I'm new to machine translation. Could anyone please help me find a command to run METEOR, F-measure, and sacreBLEU in OpenNMT-py?