A clean installation of Ubuntu 18.04.2 LTS was used.
This gist is an extension of the official docs, adding the missing parts and instructions.
Follow the pre-installation actions at:
```python
# MIT License
# Copyright (c) 2018 Yuze Huang ([email protected])

from torch.optim import Optimizer


class AdamW(Optimizer):
    """
    Implements the Adam algorithm with the weight decay fix in PyTorch.
    Paper: "Fixing Weight Decay Regularization in Adam" by Ilya Loshchilov and Frank Hutter,
    https://arxiv.org/abs/1711.05101
    """

    def __init__(self, params, lr, b1=0.9, b2=0.999, e=1e-8, l2=0,
                 vector_l2=False, max_grad_norm=-1, **kwargs):
        # Register the hyper-parameters as per-group defaults, following the
        # standard torch.optim.Optimizer pattern.
        defaults = dict(lr=lr, b1=b1, b2=b2, e=e, l2=l2,
                        vector_l2=vector_l2, max_grad_norm=max_grad_norm)
        super(AdamW, self).__init__(params, defaults)
```
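The optimizer is constructed like any other `torch.optim` optimizer. The sketch below only shows construction; the model and the hyper-parameter values are illustrative and not taken from the gist:

```python
import torch.nn as nn

# Illustrative model; any nn.Module's parameters can be passed the same way.
model = nn.Linear(10, 2)

# Example hyper-parameters only; the paper and gist do not prescribe these values.
optimizer = AdamW(model.parameters(), lr=6.25e-5, l2=0.01, max_grad_norm=1.0)

# In a training loop the optimizer is then driven as usual:
#   optimizer.zero_grad(); loss.backward(); optimizer.step()
# where step() relies on the full implementation from the gist (not shown above).
```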
```bash
fairseq-train qa_en_small-bin \
    --log-interval=10 \
    --log-format=json \
    --tensorboard-logdir=/users/tom/ed/sp/pretrain/tests/fairseq/bart_en_small/logs \
    --seed=1 \
    --cpu \
    --min-loss-scale=0.0001 \
    --model-parallel-size=1 \
    --criterion=cross_entropy \
```
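Because `--log-format=json` makes the progress lines carry a JSON object (in addition to the TensorBoard events written under `--tensorboard-logdir`), the logged metrics can be pulled back out of a captured log with a few lines of Python. This is a rough sketch, not part of the gist; `train.log` is an assumed name for a file that stdout/stderr was redirected to:

```python
import json

records = []
# "train.log" is a hypothetical capture of fairseq-train's console output.
with open("train.log") as f:
    for line in f:
        # Progress lines contain a JSON object; skip any logging prefix
        # before the first "{" and ignore lines that do not parse.
        start = line.find("{")
        if start == -1:
            continue
        try:
            records.append(json.loads(line[start:]))
        except json.JSONDecodeError:
            continue

# e.g. inspect the logged loss values, if present
losses = [r["loss"] for r in records if "loss" in r]
print(losses[:5])
```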