"""
This is a workaround for `examples/run_mlm.py` for pretraining models
on big text files, read line by line.
For the time being, `datasets` has some issues dealing with really
big text files, so we use a custom dataset until this is fixed.
August 3rd 2021
Author: Juan Manuel Pérez