@taylanbil
Created August 19, 2019 16:53
Fairseq Transformer, 1 TPU on the small dataset.
Fri Aug 16 18:43:11 UTC 2019
#!/bin/bash
batch_size=128
n_words=64
#data_path=/home/taylanbil/data/wmt18_en_de_bpej32k  # overridden by the next line
data_path=/home/taylanbil/data/dummy
#conda activate pytorch
pkill -9 python
#TPU_IP_ADDRESS=10.1.2.2 # nightly (overridden by the next line)
TPU_IP_ADDRESS=10.1.4.2 # nightly
#export XLA_USE_32BIT_LONG=1
#export XLA_IR_DEBUG=1
#export XLA_HLO_DEBUG=1
#export GET_TENSORS_OPBYOP=1
#export SYNC_TENSORS_OPBYOP=1
#export XLA_SAVE_TENSORS_FILE=$tensors_dir/${taskname}_tensors.txt
#export TRIM_GRAPH_SIZE=50000
#export XLA_SYNC_WAIT=1
export XRT_TPU_CONFIG="tpu_worker;0;$TPU_IP_ADDRESS:8470"
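# NOTE: other_flags is never passed to the python command below, so these flags do not apply to this run.
# --max-tokens has no effect w/ TPUs.
other_flags="
--disable-validation \
--max-tokens=4096 \
--num-workers=8 \
"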
#LD_PRELOAD=/usr/lib/libtcmalloc_minimal.so.4 python tpu-examples/fairseq_train_tpu.py \
python tpu-examples/fairseq_train_tpu.py \
$data_path \
--arch=transformer_vaswani_wmt_en_de_big \
--max-sentences=$batch_size \
--max-sentences-valid=$batch_size \
--max-source-positions=$n_words \
--max-target-positions=$n_words \
--required-batch-size-multiple=$batch_size \
--no-save \
--attention-dropout=0.1 \
--no-progress-bar \
--criterion=label_smoothed_cross_entropy \
--log-interval=100 \
--source-lang=en \
--lr-scheduler=inverse_sqrt \
--min-lr 1e-09 \
--skip-invalid-size-inputs-valid-test \
--target-lang=de \
--label-smoothing=0.1 \
--update-freq=1 \
--optimizer adam \
--adam-betas '(0.9, 0.98)' \
--warmup-init-lr 1e-07 \
--lr 0.0005 \
--warmup-updates 4000 \
--share-all-embeddings \
--dropout 0.3 \
--weight-decay 0.0 \
--valid-subset=valid \
--curriculum=4 \
--max-epoch=50 \
--num_cores=1 \
--metrics_debug \
--pad_to_length=$n_words \
--log_steps=10
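
The learning rates reported in the log can be checked against the scheduler flags above (`--lr-scheduler=inverse_sqrt`, `--warmup-init-lr 1e-07`, `--lr 0.0005`, `--warmup-updates 4000`). The following is a minimal sketch of the usual inverse-square-root schedule with linear warmup, assuming fairseq applies it in this form. The function name `warmup_lr` and its parameter names are illustrative and are not part of fairseq.

```python
def warmup_lr(num_updates, peak_lr=5e-4, warmup_init_lr=1e-7, warmup_updates=4000):
    """Learning rate after `num_updates` optimizer steps.

    Assumed form of the inverse_sqrt schedule: linear warmup from
    warmup_init_lr to peak_lr over warmup_updates steps, then decay
    proportional to 1 / sqrt(num_updates).
    """
    if num_updates < warmup_updates:
        # Linear warmup: equal increments per update.
        lr_step = (peak_lr - warmup_init_lr) / warmup_updates
        return warmup_init_lr + num_updates * lr_step
    # Inverse square-root decay, continuous with the warmup at warmup_updates.
    decay_factor = peak_lr * warmup_updates ** 0.5
    return decay_factor * num_updates ** -0.5


if __name__ == "__main__":
    # num_updates at the end of epochs 1-4 in this run (22 updates per epoch).
    for n in (22, 44, 66, 88):
        print(n, warmup_lr(n))
```

With 22 updates per epoch, this gives about 2.84945e-06, 5.5989e-06, 8.34835e-06 and 1.10978e-05 after epochs 1 to 4, which agrees with the `lr` values printed in the training stats of this run, up to floating-point rounding.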
--------------
nohup: ignoring input
2019-08-16 18:43:12.286463: I tensorflow/compiler/xla/xla_client/xrt_computation_client.cc:196] XRT device (LOCAL) CPU:0 -> /job:tpu_worker/replica:0/task:0/device:XLA_CPU:0
2019-08-16 18:43:12.286515: I tensorflow/compiler/xla/xla_client/xrt_computation_client.cc:196] XRT device (LOCAL) TPU:0 -> /job:tpu_worker/replica:0/task:0/device:TPU:0
2019-08-16 18:43:12.286522: I tensorflow/compiler/xla/xla_client/xrt_computation_client.cc:196] XRT device (LOCAL) TPU:1 -> /job:tpu_worker/replica:0/task:0/device:TPU:1
2019-08-16 18:43:12.286528: I tensorflow/compiler/xla/xla_client/xrt_computation_client.cc:196] XRT device (LOCAL) TPU:2 -> /job:tpu_worker/replica:0/task:0/device:TPU:2
2019-08-16 18:43:12.286534: I tensorflow/compiler/xla/xla_client/xrt_computation_client.cc:196] XRT device (LOCAL) TPU:3 -> /job:tpu_worker/replica:0/task:0/device:TPU:3
2019-08-16 18:43:12.286540: I tensorflow/compiler/xla/xla_client/xrt_computation_client.cc:196] XRT device (LOCAL) TPU:4 -> /job:tpu_worker/replica:0/task:0/device:TPU:4
2019-08-16 18:43:12.286545: I tensorflow/compiler/xla/xla_client/xrt_computation_client.cc:196] XRT device (LOCAL) TPU:5 -> /job:tpu_worker/replica:0/task:0/device:TPU:5
2019-08-16 18:43:12.286551: I tensorflow/compiler/xla/xla_client/xrt_computation_client.cc:196] XRT device (LOCAL) TPU:6 -> /job:tpu_worker/replica:0/task:0/device:TPU:6
2019-08-16 18:43:12.286556: I tensorflow/compiler/xla/xla_client/xrt_computation_client.cc:196] XRT device (LOCAL) TPU:7 -> /job:tpu_worker/replica:0/task:0/device:TPU:7
2019-08-16 18:43:12.286591: I tensorflow/compiler/xla/xla_client/xrt_computation_client.cc:200] Worker grpc://10.1.4.2:8470 for /job:tpu_worker/replica:0/task:0
2019-08-16 18:43:12.286598: I tensorflow/compiler/xla/xla_client/xrt_computation_client.cc:204] XRT default device: TPU:0
2019-08-16 18:43:12.288808: I tensorflow/compiler/xla/xla_client/xrt_computation_client.cc:1086] Configuring TPU for worker tpu_worker:0 at grpc://10.1.4.2:8470
2019-08-16 18:43:16.981469: I tensorflow/compiler/xla/xla_client/xrt_computation_client.cc:1102] TPU topology: mesh_shape: 2
mesh_shape: 2
mesh_shape: 2
num_tasks: 1
num_tpu_devices_per_task: 8
device_coordinates: 0
device_coordinates: 0
device_coordinates: 0
device_coordinates: 0
device_coordinates: 0
device_coordinates: 1
device_coordinates: 0
device_coordinates: 1
device_coordinates: 0
device_coordinates: 0
device_coordinates: 1
device_coordinates: 1
device_coordinates: 1
device_coordinates: 0
device_coordinates: 0
device_coordinates: 1
device_coordinates: 0
device_coordinates: 1
device_coordinates: 1
device_coordinates: 1
device_coordinates: 0
device_coordinates: 1
device_coordinates: 1
device_coordinates: 1
| [en] dictionary: 35662 types
| [de] dictionary: 35662 types
| /home/taylanbil/data/dummy valid en-de 3004 examples
TransformerModel(
(encoder): TransformerEncoder(
(embed_tokens): Embedding(35662, 1024, padding_idx=1)
(embed_positions): SinusoidalPositionalEmbedding()
(layers): ModuleList(
(0): TransformerEncoderLayer(
(self_attn): MultiheadAttention(
(out_proj): Linear(in_features=1024, out_features=1024, bias=True)
)
(self_attn_layer_norm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
(fc1): Linear(in_features=1024, out_features=4096, bias=True)
(fc2): Linear(in_features=4096, out_features=1024, bias=True)
(final_layer_norm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
)
(1): TransformerEncoderLayer(
(self_attn): MultiheadAttention(
(out_proj): Linear(in_features=1024, out_features=1024, bias=True)
)
(self_attn_layer_norm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
(fc1): Linear(in_features=1024, out_features=4096, bias=True)
(fc2): Linear(in_features=4096, out_features=1024, bias=True)
(final_layer_norm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
)
(2): TransformerEncoderLayer(
(self_attn): MultiheadAttention(
(out_proj): Linear(in_features=1024, out_features=1024, bias=True)
)
(self_attn_layer_norm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
(fc1): Linear(in_features=1024, out_features=4096, bias=True)
(fc2): Linear(in_features=4096, out_features=1024, bias=True)
(final_layer_norm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
)
(3): TransformerEncoderLayer(
(self_attn): MultiheadAttention(
(out_proj): Linear(in_features=1024, out_features=1024, bias=True)
)
(self_attn_layer_norm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
(fc1): Linear(in_features=1024, out_features=4096, bias=True)
(fc2): Linear(in_features=4096, out_features=1024, bias=True)
(final_layer_norm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
)
(4): TransformerEncoderLayer(
(self_attn): MultiheadAttention(
(out_proj): Linear(in_features=1024, out_features=1024, bias=True)
)
(self_attn_layer_norm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
(fc1): Linear(in_features=1024, out_features=4096, bias=True)
(fc2): Linear(in_features=4096, out_features=1024, bias=True)
(final_layer_norm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
)
(5): TransformerEncoderLayer(
(self_attn): MultiheadAttention(
(out_proj): Linear(in_features=1024, out_features=1024, bias=True)
)
(self_attn_layer_norm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
(fc1): Linear(in_features=1024, out_features=4096, bias=True)
(fc2): Linear(in_features=4096, out_features=1024, bias=True)
(final_layer_norm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
)
)
)
(decoder): TransformerDecoder(
(embed_tokens): Embedding(35662, 1024, padding_idx=1)
(embed_positions): SinusoidalPositionalEmbedding()
(layers): ModuleList(
(0): TransformerDecoderLayer(
(self_attn): MultiheadAttention(
(out_proj): Linear(in_features=1024, out_features=1024, bias=True)
)
(self_attn_layer_norm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
(encoder_attn): MultiheadAttention(
(out_proj): Linear(in_features=1024, out_features=1024, bias=True)
)
(encoder_attn_layer_norm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
(fc1): Linear(in_features=1024, out_features=4096, bias=True)
(fc2): Linear(in_features=4096, out_features=1024, bias=True)
(final_layer_norm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
)
(1): TransformerDecoderLayer(
(self_attn): MultiheadAttention(
(out_proj): Linear(in_features=1024, out_features=1024, bias=True)
)
(self_attn_layer_norm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
(encoder_attn): MultiheadAttention(
(out_proj): Linear(in_features=1024, out_features=1024, bias=True)
)
(encoder_attn_layer_norm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
(fc1): Linear(in_features=1024, out_features=4096, bias=True)
(fc2): Linear(in_features=4096, out_features=1024, bias=True)
(final_layer_norm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
)
(2): TransformerDecoderLayer(
(self_attn): MultiheadAttention(
(out_proj): Linear(in_features=1024, out_features=1024, bias=True)
)
(self_attn_layer_norm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
(encoder_attn): MultiheadAttention(
(out_proj): Linear(in_features=1024, out_features=1024, bias=True)
)
(encoder_attn_layer_norm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
(fc1): Linear(in_features=1024, out_features=4096, bias=True)
(fc2): Linear(in_features=4096, out_features=1024, bias=True)
(final_layer_norm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
)
(3): TransformerDecoderLayer(
(self_attn): MultiheadAttention(
(out_proj): Linear(in_features=1024, out_features=1024, bias=True)
)
(self_attn_layer_norm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
(encoder_attn): MultiheadAttention(
(out_proj): Linear(in_features=1024, out_features=1024, bias=True)
)
(encoder_attn_layer_norm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
(fc1): Linear(in_features=1024, out_features=4096, bias=True)
(fc2): Linear(in_features=4096, out_features=1024, bias=True)
(final_layer_norm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
)
(4): TransformerDecoderLayer(
(self_attn): MultiheadAttention(
(out_proj): Linear(in_features=1024, out_features=1024, bias=True)
)
(self_attn_layer_norm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
(encoder_attn): MultiheadAttention(
(out_proj): Linear(in_features=1024, out_features=1024, bias=True)
)
(encoder_attn_layer_norm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
(fc1): Linear(in_features=1024, out_features=4096, bias=True)
(fc2): Linear(in_features=4096, out_features=1024, bias=True)
(final_layer_norm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
)
(5): TransformerDecoderLayer(
(self_attn): MultiheadAttention(
(out_proj): Linear(in_features=1024, out_features=1024, bias=True)
)
(self_attn_layer_norm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
(encoder_attn): MultiheadAttention(
(out_proj): Linear(in_features=1024, out_features=1024, bias=True)
)
(encoder_attn_layer_norm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
(fc1): Linear(in_features=1024, out_features=4096, bias=True)
(fc2): Linear(in_features=4096, out_features=1024, bias=True)
(final_layer_norm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
)
)
)
)
| model transformer_vaswani_wmt_en_de_big, criterion LabelSmoothedCrossEntropyCriterion
| num. model params: 212875264 (num. trained: 212875264)
| no existing checkpoint found checkpoints/checkpoint_last.pt
| loading train data for epoch 0
| /home/taylanbil/data/dummy train en-de 3004 examples
| WARNING: 72 samples have invalid sizes and will be skipped, max_positions=(64, 64), first few sample ids=[1808, 612, 1824, 2185, 1244, 723, 1226, 388, 1791, 378]
Epoch 1 begin 18:43:24
training/ 18:46:12, device xla:1, step 10, Rate=136.28, Global Rate=7.61
training/ 18:46:20, device xla:1, step 20, Rate=172.04, Global Rate=14.58
Epoch 1 Training stats:
device xla:1
| epoch 001 | loss 15.308 | nll_loss 15.248 | ppl 38903.59 | wps 410 | ups 0 | wpb 3319.773 | bsz 128.000 | num_updates 22 | lr 2.84945e-06 | gnorm 6.421 | clip 0.000 | oom 0.000 | wall 178 | train_wall 28
Epoch 1 Tracker Rates:
Rate=172.26, Global Rate=15.90
Epoch 1 end 18:46:22
Metric: CompileTime
TotalSamples: 3
Counter: 02m24s661ms768.778us
ValueRate: 947ms183.191us / second
Rate: 0.0197796 / second
Percentiles: 1%=091ms699.712us; 5%=091ms699.712us; 10%=091ms699.712us; 20%=091ms699.712us; 50%=01m11s278ms507.643us; 80%=01m12s293ms561.423us; 90%=01m12s293ms561.423us; 95%=01m12s293ms561.423us; 99%=01m12s293ms561.423us
Metric: ExecuteTime
TotalSamples: 23
Counter: 28s546ms982.069us
ValueRate: 158ms194.850us / second
Rate: 0.132088 / second
Percentiles: 1%=045ms741.955us; 5%=612ms203.275us; 10%=612ms442.063us; 20%=613ms682.177us; 50%=615ms568.250us; 80%=647ms132.993us; 90%=654ms914.822us; 95%=07s337ms68.986us; 99%=08s684ms985.577us
Metric: InboundData
TotalSamples: 5
Counter: 20.00B
ValueRate: 466.25B / second
Rate: 116.564 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=4.00B; 80%=4.00B; 90%=4.00B; 95%=4.00B; 99%=4.00B
Metric: OutboundData
TotalSamples: 135
Counter: 816.74MB
ValueRate: 4.52MB / second
Rate: 0.746446 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=8.00B; 80%=8.00B; 90%=8.00B; 95%=776.00KB; 99%=139.30MB
Metric: ReleaseDataHandlesTime
TotalSamples: 517
Counter: 06s787ms358.285us
ValueRate: 033ms234.090us / second
Rate: 2.96889 / second
Percentiles: 1%=645.234us; 5%=749.296us; 10%=864.516us; 20%=001ms40.561us; 50%=002ms867.943us; 80%=005ms82.749us; 90%=007ms116.753us; 95%=009ms844.443us; 99%=304ms302.831us
Metric: TransferFromServerTime
TotalSamples: 5
Counter: 081ms978.255us
ValueRate: 02s888ms936.259us / second
Rate: 116.571 / second
Percentiles: 1%=001ms339.909us; 5%=001ms339.909us; 10%=001ms339.909us; 20%=001ms346.061us; 50%=001ms388.896us; 80%=041ms732.304us; 90%=041ms732.304us; 95%=041ms732.304us; 99%=041ms732.304us
Metric: TransferToServerTime
TotalSamples: 135
Counter: 31s756ms219.402us
ValueRate: 171ms626.786us / second
Rate: 0.748942 / second
Percentiles: 1%=001ms186.559us; 5%=001ms369.036us; 10%=001ms491.537us; 20%=002ms626.568us; 50%=002ms97.526us; 80%=028ms965.181us; 90%=515ms641.945us; 95%=552ms749.428us; 99%=07s243ms1.796us
Counter: CachedSyncTensors
Value: 20
Counter: CreateCompileHandles
Value: 3
Counter: CreateDataHandles
Value: 16552
Counter: CreateXlaTensor
Value: 106708
Counter: DestroyDataHandles
Value: 15682
Counter: DestroyXlaTensor
Value: 105963
Counter: ReleaseDataHandles
Value: 15682
Counter: SyncTensorsToData
Value: 184
Counter: UncachedSyncTensors
Value: 3
Counter: XRTAllocateFromTensor_Empty
Value: 229
Counter: XrtCompile_Empty
Value: 1152
Counter: XrtExecuteChained_Empty
Value: 1152
Counter: XrtExecute_Empty
Value: 1152
Counter: XrtRead_Empty
Value: 1152
Counter: XrtReleaseAllocationHandle_Empty
Value: 1152
Counter: XrtReleaseCompileHandle_Empty
Value: 1152
Counter: XrtSessionCount
Value: 11
Counter: XrtSubTuple_Empty
Value: 1152
Counter: aten::_local_scalar_dense
Value: 5
Validating the subset "valid"
| WARNING: 72 samples have invalid sizes and will be skipped, max_positions=(64, 64), first few sample ids=[1808, 612, 1824, 2185, 1244, 723, 1226, 388, 1791, 378]
validation/ 18:46:22, device xla:1, step 0
validation/ 18:47:02, device xla:1, step 10
validation/ 18:47:04, device xla:1, step 20
validation stats on subset "valid" - 18:47:05
| epoch 001 | valid on 'valid' subset | loss 14.455 | nll_loss 14.280 | ppl 19897.77 | num_updates 22
old learning rate: 1e-07
new learning rate: 2.8494500000000003e-06
Metric: CompileTime
TotalSamples: 6
Counter: 03m56s691ms419.112us
ValueRate: 832ms51.214us / second
Rate: 0.0284152 / second
Percentiles: 1%=091ms699.712us; 5%=091ms699.712us; 10%=091ms699.712us; 20%=11s642ms545.220us; 50%=11s726ms428.582us; 80%=01m11s278ms507.643us; 90%=01m12s293ms561.423us; 95%=01m12s293ms561.423us; 99%=01m12s293ms561.423us
Metric: ExecuteTime
TotalSamples: 45
Counter: 39s516ms831.257us
ValueRate: 177ms9.162us / second
Rate: 0.206809 / second
Percentiles: 1%=045ms741.955us; 5%=215ms130.838us; 10%=216ms593.944us; 20%=216ms824.426us; 50%=613ms638.846us; 80%=646ms239.932us; 90%=02s284ms239.225us; 95%=02s291ms408.715us; 99%=08s684ms985.577us
Metric: InboundData
TotalSamples: 8
Counter: 32.00B
ValueRate: 0.74B / second
Rate: 0.18409 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=4.00B; 80%=4.00B; 90%=4.00B; 95%=4.00B; 99%=4.00B
Metric: OutboundData
TotalSamples: 141
Counter: 820.91MB
ValueRate: 3.68MB / second
Rate: 0.63164 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=8.00B; 80%=8.00B; 90%=264.00KB; 95%=776.00KB; 99%=139.30MB
Metric: ReleaseDataHandlesTime
TotalSamples: 572
Counter: 13s702ms496.690us
ValueRate: 058ms387.180us / second
Rate: 2.6292 / second
Percentiles: 1%=645.234us; 5%=771.474us; 10%=868.210us; 20%=001ms15.402us; 50%=002ms561.276us; 80%=005ms846.460us; 90%=007ms947.415us; 95%=009ms844.443us; 99%=600ms487.286us
Metric: TransferFromServerTime
TotalSamples: 8
Counter: 123ms81.219us
ValueRate: 003ms832.253us / second
Rate: 0.18409 / second
Percentiles: 1%=001ms332.362us; 5%=001ms332.362us; 10%=001ms332.362us; 20%=001ms339.909us; 50%=004ms32.002us; 80%=037ms738.600us; 90%=041ms732.304us; 95%=041ms732.304us; 99%=041ms732.304us
Metric: TransferToServerTime
TotalSamples: 141
Counter: 34s700ms510.581us
ValueRate: 151ms228.790us / second
Rate: 0.632747 / second
Percentiles: 1%=001ms186.559us; 5%=001ms387.748us; 10%=002ms510.013us; 20%=002ms627.845us; 50%=002ms193.784us; 80%=214ms864.585us; 90%=515ms641.945us; 95%=552ms749.428us; 99%=07s243ms1.796us
Counter: CachedSyncTensors
Value: 39
Counter: CreateCompileHandles
Value: 6
Counter: CreateDataHandles
Value: 16805
Counter: CreateXlaTensor
Value: 131308
Counter: DestroyDataHandles
Value: 15929
Counter: DestroyXlaTensor
Value: 130557
Counter: ReleaseDataHandles
Value: 15929
Counter: SyncTensorsToData
Value: 184
Counter: UncachedSyncTensors
Value: 6
Counter: XRTAllocateFromTensor_Empty
Value: 229
Counter: XrtCompile_Empty
Value: 1152
Counter: XrtExecuteChained_Empty
Value: 1152
Counter: XrtExecute_Empty
Value: 1152
Counter: XrtRead_Empty
Value: 1152
Counter: XrtReleaseAllocationHandle_Empty
Value: 1152
Counter: XrtReleaseCompileHandle_Empty
Value: 1152
Counter: XrtSessionCount
Value: 11
Counter: XrtSubTuple_Empty
Value: 1152
Counter: aten::_local_scalar_dense
Value: 8
Epoch 2 begin 18:47:05
training/ 18:47:12, device xla:1, step 10, Rate=166.65, Global Rate=192.18
training/ 18:47:19, device xla:1, step 20, Rate=178.87, Global Rate=186.02
Epoch 2 Training stats:
device xla:1
| epoch 002 | loss 14.739 | nll_loss 14.609 | ppl 24984.08 | wps 616 | ups 0 | wpb 3319.773 | bsz 128.000 | num_updates 44 | lr 5.5989e-06 | gnorm 5.143 | clip 0.000 | oom 0.000 | wall 237 | train_wall 42
Epoch 2 Tracker Rates:
Rate=179.61, Global Rate=185.55
Epoch 2 end 18:47:21
Metric: CompileTime
TotalSamples: 6
Counter: 03m56s691ms419.112us
ValueRate: 832ms51.214us / second
Rate: 0.0284152 / second
Percentiles: 1%=091ms699.712us; 5%=091ms699.712us; 10%=091ms699.712us; 20%=11s642ms545.220us; 50%=11s726ms428.582us; 80%=01m11s278ms507.643us; 90%=01m12s293ms561.423us; 95%=01m12s293ms561.423us; 99%=01m12s293ms561.423us
Metric: ExecuteTime
TotalSamples: 67
Counter: 52s074ms574.626us
ValueRate: 223ms58.332us / second
Rate: 0.286996 / second
Percentiles: 1%=045ms741.955us; 5%=215ms286.808us; 10%=216ms641.270us; 20%=216ms256.299us; 50%=614ms770.004us; 80%=617ms762.261us; 90%=650ms479.576us; 95%=02s290ms903.845us; 99%=08s684ms985.577us
Metric: InboundData
TotalSamples: 13
Counter: 52.00B
ValueRate: 0.88B / second
Rate: 0.219125 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=4.00B; 80%=4.00B; 90%=4.00B; 95%=4.00B; 99%=4.00B
Metric: OutboundData
TotalSamples: 231
Counter: 825.08MB
ValueRate: 3.44MB / second
Rate: 0.961765 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=8.00B; 80%=8.00B; 90%=8.00B; 95%=776.00KB; 99%=776.00KB
Metric: ReleaseDataHandlesTime
TotalSamples: 1137
Counter: 14s301ms33.127us
ValueRate: 153ms600.086us / second
Rate: 14.3941 / second
Percentiles: 1%=583.596us; 5%=712.428us; 10%=812.044us; 20%=972.568us; 50%=002ms553.319us; 80%=005ms899.604us; 90%=007ms918.383us; 95%=008ms62.307us; 99%=010ms802.259us
Metric: TransferFromServerTime
TotalSamples: 13
Counter: 170ms751.930us
ValueRate: 003ms861.302us / second
Rate: 0.219125 / second
Percentiles: 1%=001ms332.362us; 5%=001ms332.362us; 10%=001ms339.909us; 20%=001ms346.061us; 50%=002ms571.634us; 80%=037ms738.600us; 90%=041ms541.436us; 95%=041ms732.304us; 99%=041ms732.304us
Metric: TransferToServerTime
TotalSamples: 231
Counter: 45s888ms452.385us
ValueRate: 187ms362.324us / second
Rate: 0.964183 / second
Percentiles: 1%=001ms186.559us; 5%=001ms356.282us; 10%=001ms436.467us; 20%=002ms624.598us; 50%=002ms357.997us; 80%=468ms175.133us; 90%=526ms63.777us; 95%=535ms201.625us; 99%=03s478ms159.301us
Counter: CachedSyncTensors
Value: 61
Counter: CreateCompileHandles
Value: 6
Counter: CreateDataHandles
Value: 33125
Counter: CreateXlaTensor
Value: 237458
Counter: DestroyDataHandles
Value: 32246
Counter: DestroyXlaTensor
Value: 236707
Counter: ReleaseDataHandles
Value: 32246
Counter: SyncTensorsToData
Value: 184
Counter: UncachedSyncTensors
Value: 6
Counter: XRTAllocateFromTensor_Empty
Value: 229
Counter: XrtCompile_Empty
Value: 1152
Counter: XrtExecuteChained_Empty
Value: 1152
Counter: XrtExecute_Empty
Value: 1152
Counter: XrtRead_Empty
Value: 1152
Counter: XrtReleaseAllocationHandle_Empty
Value: 1152
Counter: XrtReleaseCompileHandle_Empty
Value: 1152
Counter: XrtSessionCount
Value: 11
Counter: XrtSubTuple_Empty
Value: 1152
Counter: aten::_local_scalar_dense
Value: 13
Validating the subset "valid"
| WARNING: 72 samples have invalid sizes and will be skipped, max_positions=(64, 64), first few sample ids=[1808, 612, 1824, 2185, 1244, 723, 1226, 388, 1791, 378]
validation/ 18:47:21, device xla:1, step 0
validation/ 18:47:23, device xla:1, step 10
validation/ 18:47:25, device xla:1, step 20
validation stats on subset "valid" - 18:47:26
| epoch 002 | valid on 'valid' subset | loss 13.741 | nll_loss 13.478 | ppl 11413.00 | num_updates 44
old learning rate: 2.8494500000000003e-06
new learning rate: 5.598900000000001e-06
Metric: CompileTime
TotalSamples: 7
Counter: 03m56s897ms46.438us
ValueRate: 737ms590.273us / second
Rate: 0.0293134 / second
Percentiles: 1%=091ms699.712us; 5%=091ms699.712us; 10%=091ms699.712us; 20%=206ms627.326us; 50%=11s663ms676.532us; 80%=01m11s278ms507.643us; 90%=01m12s293ms561.423us; 95%=01m12s293ms561.423us; 99%=01m12s293ms561.423us
Metric: ExecuteTime
TotalSamples: 90
Counter: 57s854ms943.040us
ValueRate: 238ms123.584us / second
Rate: 0.376951 / second
Percentiles: 1%=004ms950.332us; 5%=215ms286.808us; 10%=216ms734.852us; 20%=216ms428.616us; 50%=613ms638.846us; 80%=616ms687.131us; 90%=647ms132.993us; 95%=02s284ms239.225us; 99%=08s684ms985.577us
Metric: InboundData
TotalSamples: 17
Counter: 65.00B
ValueRate: 1.01B / second
Rate: 0.263234 / second
Percentiles: 1%=1.00B; 5%=1.00B; 10%=4.00B; 20%=4.00B; 50%=4.00B; 80%=4.00B; 90%=4.00B; 95%=4.00B; 99%=4.00B
Metric: OutboundData
TotalSamples: 258
Counter: 829.24MB
ValueRate: 3.38MB / second
Rate: 1.05085 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=8.00B; 80%=8.00B; 90%=388.00KB; 95%=776.00KB; 99%=776.00KB
Metric: ReleaseDataHandlesTime
TotalSamples: 1225
Counter: 14s402ms463.691us
ValueRate: 137ms757.981us / second
Rate: 13.8467 / second
Percentiles: 1%=582.288us; 5%=714.747us; 10%=815.507us; 20%=968.504us; 50%=001ms428.069us; 80%=005ms583.335us; 90%=007ms632.481us; 95%=008ms916.006us; 99%=010ms783.866us
Metric: TransferFromServerTime
TotalSamples: 17
Counter: 177ms522.409us
ValueRate: 003ms733.336us / second
Rate: 0.263234 / second
Percentiles: 1%=001ms332.362us; 5%=001ms332.362us; 10%=001ms339.909us; 20%=001ms354.718us; 50%=002ms571.634us; 80%=036ms171.085us; 90%=041ms541.436us; 95%=041ms732.304us; 99%=041ms732.304us
Metric: TransferToServerTime
TotalSamples: 258
Counter: 50s836ms450.919us
ValueRate: 203ms324.339us / second
Rate: 1.0526 / second
Percentiles: 1%=001ms186.559us; 5%=001ms369.036us; 10%=001ms458.320us; 20%=002ms659.172us; 50%=005ms280.134us; 80%=215ms734.920us; 90%=524ms893.625us; 95%=532ms253.496us; 99%=03s478ms159.301us
Counter: CachedSyncTensors
Value: 83
Counter: CreateCompileHandles
Value: 7
Counter: CreateDataHandles
Value: 33400
Counter: CreateXlaTensor
Value: 262059
Counter: DestroyDataHandles
Value: 32519
Counter: DestroyXlaTensor
Value: 261308
Counter: ReleaseDataHandles
Value: 32521
Counter: SyncTensorsToData
Value: 184
Counter: UncachedSyncTensors
Value: 7
Counter: XRTAllocateFromTensor_Empty
Value: 229
Counter: XrtCompile_Empty
Value: 1152
Counter: XrtExecuteChained_Empty
Value: 1152
Counter: XrtExecute_Empty
Value: 1152
Counter: XrtRead_Empty
Value: 1152
Counter: XrtReleaseAllocationHandle_Empty
Value: 1152
Counter: XrtReleaseCompileHandle_Empty
Value: 1152
Counter: XrtSessionCount
Value: 11
Counter: XrtSubTuple_Empty
Value: 1152
Counter: aten::_local_scalar_dense
Value: 17
Epoch 3 begin 18:47:26
training/ 18:47:33, device xla:1, step 10, Rate=167.98, Global Rate=191.20
training/ 18:47:40, device xla:1, step 20, Rate=176.87, Global Rate=184.79
Epoch 3 Training stats:
device xla:1
| epoch 003 | loss 14.387 | nll_loss 14.214 | ppl 19000.78 | wps 848 | ups 0 | wpb 3319.773 | bsz 128.000 | num_updates 66 | lr 8.34835e-06 | gnorm 4.541 | clip 0.000 | oom 0.000 | wall 258 | train_wall 56
Epoch 3 Tracker Rates:
Rate=176.79, Global Rate=184.02
Epoch 3 end 18:47:42
Metric: CompileTime
TotalSamples: 7
Counter: 03m56s897ms46.438us
ValueRate: 737ms590.273us / second
Rate: 0.0293134 / second
Percentiles: 1%=091ms699.712us; 5%=091ms699.712us; 10%=091ms699.712us; 20%=206ms627.326us; 50%=11s663ms676.532us; 80%=01m11s278ms507.643us; 90%=01m12s293ms561.423us; 95%=01m12s293ms561.423us; 99%=01m12s293ms561.423us
Metric: ExecuteTime
TotalSamples: 112
Counter: 01m10s396ms180.759us
ValueRate: 276ms378.324us / second
Rate: 0.439717 / second
Percentiles: 1%=045ms741.955us; 5%=216ms593.944us; 10%=216ms824.426us; 20%=217ms618.217us; 50%=614ms770.004us; 80%=616ms879.971us; 90%=645ms821.911us; 95%=654ms914.822us; 99%=07s337ms68.986us
Metric: InboundData
TotalSamples: 22
Counter: 85.00B
ValueRate: 1.06B / second
Rate: 0.27313 / second
Percentiles: 1%=1.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=4.00B; 80%=4.00B; 90%=4.00B; 95%=4.00B; 99%=4.00B
Metric: OutboundData
TotalSamples: 352
Counter: 833.41MB
ValueRate: 3.19MB / second
Rate: 1.34645 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=8.00B; 80%=8.00B; 90%=8.00B; 95%=776.00KB; 99%=776.00KB
Metric: ReleaseDataHandlesTime
TotalSamples: 1811
Counter: 16s090ms648.981us
ValueRate: 090ms687.304us / second
Rate: 32.8642 / second
Percentiles: 1%=572.069us; 5%=683.579us; 10%=786.361us; 20%=934.406us; 50%=001ms424.814us; 80%=005ms957.055us; 90%=007ms298.334us; 95%=009ms517.754us; 99%=010ms981.629us
Metric: TransferFromServerTime
TotalSamples: 22
Counter: 188ms699.957us
ValueRate: 002ms330.296us / second
Rate: 0.27313 / second
Percentiles: 1%=807.358us; 5%=896.996us; 10%=001ms198.477us; 20%=001ms339.909us; 50%=002ms571.634us; 80%=006ms413.828us; 90%=037ms738.600us; 95%=041ms541.436us; 99%=041ms732.304us
Metric: TransferToServerTime
TotalSamples: 352
Counter: 01m01s143ms37.134us
ValueRate: 234ms421.837us / second
Rate: 1.34956 / second
Percentiles: 1%=001ms186.559us; 5%=001ms372.156us; 10%=001ms491.323us; 20%=002ms627.845us; 50%=005ms371.245us; 80%=215ms174.103us; 90%=526ms136.635us; 95%=530ms138.061us; 99%=02s281ms882.730us
Counter: CachedSyncTensors
Value: 105
Counter: CreateCompileHandles
Value: 7
Counter: CreateDataHandles
Value: 49724
Counter: CreateXlaTensor
Value: 368209
Counter: DestroyDataHandles
Value: 48845
Counter: DestroyXlaTensor
Value: 367458
Counter: ReleaseDataHandles
Value: 48845
Counter: SyncTensorsToData
Value: 184
Counter: UncachedSyncTensors
Value: 7
Counter: XRTAllocateFromTensor_Empty
Value: 229
Counter: XrtCompile_Empty
Value: 1152
Counter: XrtExecuteChained_Empty
Value: 1152
Counter: XrtExecute_Empty
Value: 1152
Counter: XrtRead_Empty
Value: 1152
Counter: XrtReleaseAllocationHandle_Empty
Value: 1152
Counter: XrtReleaseCompileHandle_Empty
Value: 1152
Counter: XrtSessionCount
Value: 11
Counter: XrtSubTuple_Empty
Value: 1152
Counter: aten::_local_scalar_dense
Value: 22
Validating the subset "valid"
| WARNING: 72 samples have invalid sizes and will be skipped, max_positions=(64, 64), first few sample ids=[1808, 612, 1824, 2185, 1244, 723, 1226, 388, 1791, 378]
validation/ 18:47:42, device xla:1, step 0
validation/ 18:47:44, device xla:1, step 10
validation/ 18:47:47, device xla:1, step 20
validation stats on subset "valid" - 18:47:47
| epoch 003 | valid on 'valid' subset | loss 13.368 | nll_loss 13.063 | ppl 8555.88 | num_updates 66
old learning rate: 5.598900000000001e-06
new learning rate: 8.348350000000001e-06
Metric: CompileTime
TotalSamples: 7
Counter: 03m56s897ms46.438us
ValueRate: 737ms590.273us / second
Rate: 0.0293134 / second
Percentiles: 1%=091ms699.712us; 5%=091ms699.712us; 10%=091ms699.712us; 20%=206ms627.326us; 50%=11s663ms676.532us; 80%=01m11s278ms507.643us; 90%=01m12s293ms561.423us; 95%=01m12s293ms561.423us; 99%=01m12s293ms561.423us
Metric: ExecuteTime
TotalSamples: 135
Counter: 01m15s167ms931.875us
ValueRate: 289ms292.215us / second
Rate: 0.51957 / second
Percentiles: 1%=004ms950.332us; 5%=216ms577.911us; 10%=216ms824.426us; 20%=217ms521.388us; 50%=612ms442.063us; 80%=616ms706.146us; 90%=618ms619.181us; 95%=650ms479.576us; 99%=07s337ms68.986us
Metric: InboundData
TotalSamples: 26
Counter: 98.00B
ValueRate: 1.14B / second
Rate: 0.303549 / second
Percentiles: 1%=1.00B; 5%=1.00B; 10%=4.00B; 20%=4.00B; 50%=4.00B; 80%=4.00B; 90%=4.00B; 95%=4.00B; 99%=4.00B
Metric: OutboundData
TotalSamples: 379
Counter: 837.58MB
ValueRate: 3.14MB / second
Rate: 1.42076 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=8.00B; 80%=8.00B; 90%=388.00KB; 95%=776.00KB; 99%=776.00KB
Metric: ReleaseDataHandlesTime
TotalSamples: 1905
Counter: 16s191ms680.978us
ValueRate: 079ms579.802us / second
Rate: 30.6363 / second
Percentiles: 1%=572.590us; 5%=684.337us; 10%=796.268us; 20%=934.159us; 50%=001ms348.154us; 80%=004ms492.247us; 90%=007ms70.361us; 95%=008ms442.885us; 99%=010ms905.492us
Metric: TransferFromServerTime
TotalSamples: 26
Counter: 230ms810.379us
ValueRate: 003ms683.028us / second
Rate: 0.303549 / second
Percentiles: 1%=807.358us; 5%=896.996us; 10%=001ms198.477us; 20%=001ms339.909us; 50%=002ms571.634us; 80%=006ms413.828us; 90%=037ms114.494us; 95%=041ms541.436us; 99%=041ms732.304us
Metric: TransferToServerTime
TotalSamples: 379
Counter: 01m06s058ms350.032us
ValueRate: 248ms11.224us / second
Rate: 1.42293 / second
Percentiles: 1%=001ms186.559us; 5%=001ms383.621us; 10%=001ms491.796us; 20%=002ms660.697us; 50%=006ms609.880us; 80%=215ms703.215us; 90%=526ms991.890us; 95%=530ms740.707us; 99%=02s281ms882.730us
Counter: CachedSyncTensors
Value: 128
Counter: CreateCompileHandles
Value: 7
Counter: CreateDataHandles
Value: 49999
Counter: CreateXlaTensor
Value: 392810
Counter: DestroyDataHandles
Value: 49118
Counter: DestroyXlaTensor
Value: 392059
Counter: ReleaseDataHandles
Value: 49120
Counter: SyncTensorsToData
Value: 184
Counter: UncachedSyncTensors
Value: 7
Counter: XRTAllocateFromTensor_Empty
Value: 229
Counter: XrtCompile_Empty
Value: 1152
Counter: XrtExecuteChained_Empty
Value: 1152
Counter: XrtExecute_Empty
Value: 1152
Counter: XrtRead_Empty
Value: 1152
Counter: XrtReleaseAllocationHandle_Empty
Value: 1152
Counter: XrtReleaseCompileHandle_Empty
Value: 1152
Counter: XrtSessionCount
Value: 11
Counter: XrtSubTuple_Empty
Value: 1152
Counter: aten::_local_scalar_dense
Value: 26
Epoch 4 begin 18:47:47
training/ 18:47:54, device xla:1, step 10, Rate=166.85, Global Rate=190.14
training/ 18:48:01, device xla:1, step 20, Rate=178.27, Global Rate=184.86
Epoch 4 Training stats:
device xla:1
| epoch 004 | loss 14.124 | nll_loss 13.920 | ppl 15504.79 | wps 1045 | ups 0 | wpb 3319.773 | bsz 128.000 | num_updates 88 | lr 1.10978e-05 | gnorm 4.167 | clip 0.000 | oom 0.000 | wall 280 | train_wall 70
Epoch 4 Tracker Rates:
Rate=178.67, Global Rate=184.34
Epoch 4 end 18:48:03
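Note on the stats line above: the logged `ppl` values are consistent with perplexity computed as 2 raised to `nll_loss`, i.e. a loss measured in bits. The sketch below is only a sanity check against numbers copied from this log; the helper name `perplexity_from_nll` is hypothetical and is not part of fairseq. Small differences are expected because the log prints `nll_loss` rounded to three decimals.

```python
import math


def perplexity_from_nll(nll_loss_bits: float) -> float:
    """Return 2 ** nll_loss, assuming the loss is expressed in bits (base 2)."""
    return 2.0 ** nll_loss_bits


# (nll_loss, logged ppl) pairs copied from the epoch 4 train and valid lines.
logged_pairs = [(13.920, 15504.79), (12.712, 6710.52)]

for nll, logged_ppl in logged_pairs:
    estimate = perplexity_from_nll(nll)
    # The relative gap stays small; it comes from the rounded nll_loss.
    rel_gap = abs(estimate - logged_ppl) / logged_ppl
    print(f"nll={nll:.3f} estimate={estimate:.2f} logged={logged_ppl:.2f} rel_gap={rel_gap:.1e}")
```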
Metric: CompileTime
TotalSamples: 7
Counter: 03m56s897ms46.438us
ValueRate: 737ms590.273us / second
Rate: 0.0293134 / second
Percentiles: 1%=091ms699.712us; 5%=091ms699.712us; 10%=091ms699.712us; 20%=206ms627.326us; 50%=11s663ms676.532us; 80%=01m11s278ms507.643us; 90%=01m12s293ms561.423us; 95%=01m12s293ms561.423us; 99%=01m12s293ms561.423us
Metric: ExecuteTime
TotalSamples: 157
Counter: 01m29s707ms386.858us
ValueRate: 322ms693.578us / second
Rate: 0.569354 / second
Percentiles: 1%=004ms950.332us; 5%=216ms593.944us; 10%=216ms985.178us; 20%=217ms598.813us; 50%=613ms475.091us; 80%=616ms817.813us; 90%=617ms222.841us; 95%=649ms7.344us; 99%=07s337ms68.986us
Metric: InboundData
TotalSamples: 31
Counter: 118.00B
ValueRate: 1.16B / second
Rate: 0.305163 / second
Percentiles: 1%=1.00B; 5%=1.00B; 10%=4.00B; 20%=4.00B; 50%=4.00B; 80%=4.00B; 90%=4.00B; 95%=4.00B; 99%=4.00B
Metric: OutboundData
TotalSamples: 473
Counter: 841.75MB
ValueRate: 2.98MB / second
Rate: 1.67447 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=8.00B; 80%=8.00B; 90%=8.00B; 95%=776.00KB; 99%=776.00KB
Metric: ReleaseDataHandlesTime
TotalSamples: 2472
Counter: 18s865ms772.683us
ValueRate: 093ms314.555us / second
Rate: 33.7124 / second
Percentiles: 1%=600.593us; 5%=680.980us; 10%=772.825us; 20%=934.159us; 50%=001ms356.274us; 80%=005ms870.760us; 90%=008ms661.908us; 95%=009ms885.795us; 99%=010ms404.235us
Metric: TransferFromServerTime
TotalSamples: 31
Counter: 237ms661.107us
ValueRate: 002ms329.686us / second
Rate: 0.305163 / second
Percentiles: 1%=807.358us; 5%=835.320us; 10%=877.012us; 20%=001ms253.647us; 50%=002ms550.440us; 80%=004ms32.002us; 90%=037ms738.600us; 95%=041ms541.436us; 99%=041ms732.304us
Metric: TransferToServerTime
TotalSamples: 473
Counter: 01m18s981ms362.677us
ValueRate: 277ms651.422us / second
Rate: 1.67804 / second
Percentiles: 1%=001ms184.075us; 5%=001ms356.282us; 10%=001ms458.119us; 20%=002ms618.829us; 50%=006ms629.548us; 80%=215ms32.955us; 90%=527ms629.473us; 95%=530ms138.061us; 99%=665ms425.857us
Counter: CachedSyncTensors
Value: 150
Counter: CreateCompileHandles
Value: 7
Counter: CreateDataHandles
Value: 66323
Counter: CreateXlaTensor
Value: 498960
Counter: DestroyDataHandles
Value: 65444
Counter: DestroyXlaTensor
Value: 498209
Counter: ReleaseDataHandles
Value: 65444
Counter: SyncTensorsToData
Value: 184
Counter: UncachedSyncTensors
Value: 7
Counter: XRTAllocateFromTensor_Empty
Value: 230
Counter: XrtCompile_Empty
Value: 1152
Counter: XrtExecuteChained_Empty
Value: 1152
Counter: XrtExecute_Empty
Value: 1152
Counter: XrtRead_Empty
Value: 1152
Counter: XrtReleaseAllocationHandle_Empty
Value: 1152
Counter: XrtReleaseCompileHandle_Empty
Value: 1152
Counter: XrtSessionCount
Value: 11
Counter: XrtSubTuple_Empty
Value: 1152
Counter: aten::_local_scalar_dense
Value: 31
Validating the subset "valid"
| WARNING: 72 samples have invalid sizes and will be skipped, max_positions=(64, 64), first few sample ids=[1808, 612, 1824, 2185, 1244, 723, 1226, 388, 1791, 378]
validation/ 18:48:03, device xla:1, step 0
validation/ 18:48:05, device xla:1, step 10
validation/ 18:48:08, device xla:1, step 20
validation stats on subset "valid" - 18:48:08
| epoch 004 | valid on 'valid' subset | loss 13.052 | nll_loss 12.712 | ppl 6710.52 | num_updates 88
old learning rate: 8.348350000000001e-06
new learning rate: 1.1097800000000002e-05
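Note on the learning rates above: they match a linear warmup from `--warmup-init-lr 1e-07` to `--lr 0.0005` over `--warmup-updates 4000`, evaluated at the current `num_updates` (22 updates per epoch in this run). The sketch below reproduces the logged values under that assumption. The function name `inverse_sqrt_lr` is hypothetical, and the decay branch after warmup follows the usual inverse square root rule; this log never reaches it.

```python
import math


def inverse_sqrt_lr(num_updates: int,
                    warmup_init_lr: float = 1e-7,
                    peak_lr: float = 5e-4,
                    warmup_updates: int = 4000) -> float:
    """Linear warmup to peak_lr, then decay proportional to 1/sqrt(num_updates)."""
    if num_updates < warmup_updates:
        step = (peak_lr - warmup_init_lr) / warmup_updates
        return warmup_init_lr + num_updates * step
    # Assumed post-warmup behaviour; not exercised anywhere in this log.
    return peak_lr * math.sqrt(warmup_updates / num_updates)


# num_updates values taken from the log: 66 (old rate), 88 (new rate), 110 (next epoch).
for n in (66, 88, 110):
    print(n, inverse_sqrt_lr(n))
```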
Metric: CompileTime
TotalSamples: 7
Counter: 03m56s897ms46.438us
ValueRate: 737ms590.273us / second
Rate: 0.0293134 / second
Percentiles: 1%=091ms699.712us; 5%=091ms699.712us; 10%=091ms699.712us; 20%=206ms627.326us; 50%=11s663ms676.532us; 80%=01m11s278ms507.643us; 90%=01m12s293ms561.423us; 95%=01m12s293ms561.423us; 99%=01m12s293ms561.423us
Metric: ExecuteTime
TotalSamples: 180
Counter: 02m33s484ms575.151us
ValueRate: 333ms845.731us / second
Rate: 0.640885 / second
Percentiles: 1%=003ms959.118us; 5%=216ms607.570us; 10%=216ms985.178us; 20%=217ms533.342us; 50%=612ms442.063us; 80%=616ms771.427us; 90%=617ms502.284us; 95%=647ms132.993us; 99%=07s337ms68.986us
Metric: InboundData
TotalSamples: 35
Counter: 131.00B
ValueRate: 1.23B / second
Rate: 0.328069 / second
Percentiles: 1%=1.00B; 5%=1.00B; 10%=4.00B; 20%=4.00B; 50%=4.00B; 80%=4.00B; 90%=4.00B; 95%=4.00B; 99%=4.00B
Metric: OutboundData
TotalSamples: 500
Counter: 845.92MB
ValueRate: 2.94MB / second
Rate: 1.73738 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=8.00B; 80%=8.00B; 90%=388.00KB; 95%=776.00KB; 99%=776.00KB
Metric: ReleaseDataHandlesTime
TotalSamples: 2566
Counter: 18s973ms811.534us
ValueRate: 081ms943.880us / second
Rate: 30.7203 / second
Percentiles: 1%=620.583us; 5%=689.899us; 10%=783.977us; 20%=931.680us; 50%=001ms298.530us; 80%=004ms434.362us; 90%=008ms509.365us; 95%=009ms869.047us; 99%=010ms404.235us
Metric: TransferFromServerTime
TotalSamples: 35
Counter: 281ms345.456us
ValueRate: 003ms637.167us / second
Rate: 0.328069 / second
Percentiles: 1%=807.358us; 5%=835.320us; 10%=877.012us; 20%=001ms253.647us; 50%=002ms526.568us; 80%=006ms413.828us; 90%=037ms114.494us; 95%=041ms732.304us; 99%=041ms787.486us
Metric: TransferToServerTime
TotalSamples: 500
Counter: 01m23s873ms307.424us
ValueRate: 288ms373.518us / second
Rate: 1.73985 / second
Percentiles: 1%=001ms186.559us; 5%=001ms372.156us; 10%=001ms475.538us; 20%=002ms626.568us; 50%=006ms831.257us; 80%=215ms734.920us; 90%=526ms351.236us; 95%=530ms858.513us; 99%=665ms425.857us
Counter: CachedSyncTensors
Value: 173
Counter: CreateCompileHandles
Value: 7
Counter: CreateDataHandles
Value: 66598
Counter: CreateXlaTensor
Value: 523561
Counter: DestroyDataHandles
Value: 65717
Counter: DestroyXlaTensor
Value: 522810
Counter: ReleaseDataHandles
Value: 65719
Counter: SyncTensorsToData
Value: 184
Counter: UncachedSyncTensors
Value: 7
Counter: XRTAllocateFromTensor_Empty
Value: 230
Counter: XrtCompile_Empty
Value: 1152
Counter: XrtExecuteChained_Empty
Value: 1152
Counter: XrtExecute_Empty
Value: 1152
Counter: XrtRead_Empty
Value: 1152
Counter: XrtReleaseAllocationHandle_Empty
Value: 1152
Counter: XrtReleaseCompileHandle_Empty
Value: 1152
Counter: XrtSessionCount
Value: 11
Counter: XrtSubTuple_Empty
Value: 1152
Counter: aten::_local_scalar_dense
Value: 35
Epoch 5 begin 18:48:08
training/ 18:48:15, device xla:1, step 10, Rate=168.43, Global Rate=192.39
training/ 18:48:22, device xla:1, step 20, Rate=177.06, Global Rate=185.54
Epoch 5 Training stats:
device xla:1
| epoch 005 | loss 13.905 | nll_loss 13.676 | ppl 13086.37 | wps 1215 | ups 0 | wpb 3319.773 | bsz 128.000 | num_updates 110 | lr 1.38473e-05 | gnorm 3.844 | clip 0.000 | oom 0.000 | wall 301 | train_wall 84
Epoch 5 Tracker Rates:
Rate=177.57, Global Rate=184.88
Epoch 5 end 18:48:24
Metric: CompileTime
TotalSamples: 7
Counter: 03m56s897ms46.438us
ValueRate: 737ms590.273us / second
Rate: 0.0293134 / second
Percentiles: 1%=091ms699.712us; 5%=091ms699.712us; 10%=091ms699.712us; 20%=206ms627.326us; 50%=11s663ms676.532us; 80%=01m11s278ms507.643us; 90%=01m12s293ms561.423us; 95%=01m12s293ms561.423us; 99%=01m12s293ms561.423us
Metric: ExecuteTime
TotalSamples: 202
Counter: 02m47s052ms112.630us
ValueRate: 361ms761.487us / second
Rate: 0.680732 / second
Percentiles: 1%=004ms950.332us; 5%=216ms613.492us; 10%=216ms45.304us; 20%=217ms573.285us; 50%=613ms475.091us; 80%=616ms0.284us; 90%=618ms619.181us; 95%=646ms192.311us; 99%=02s291ms408.715us
Metric: InboundData
TotalSamples: 40
Counter: 151.00B
ValueRate: 1.23B / second
Rate: 0.32633 / second
Percentiles: 1%=1.00B; 5%=1.00B; 10%=4.00B; 20%=4.00B; 50%=4.00B; 80%=4.00B; 90%=4.00B; 95%=4.00B; 99%=4.00B
Metric: OutboundData
TotalSamples: 591
Counter: 850.09MB
ValueRate: 2.80MB / second
Rate: 1.94753 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=8.00B; 80%=8.00B; 90%=8.00B; 95%=776.00KB; 99%=776.00KB
Metric: ReleaseDataHandlesTime
TotalSamples: 3143
Counter: 20s722ms985.202us
ValueRate: 095ms497.717us / second
Rate: 33.0803 / second
Percentiles: 1%=628.257us; 5%=708.821us; 10%=800.551us; 20%=959.738us; 50%=001ms436.521us; 80%=005ms341.243us; 90%=008ms635.395us; 95%=009ms992.154us; 99%=010ms404.235us
Metric: TransferFromServerTime
TotalSamples: 40
Counter: 290ms835.732us
ValueRate: 002ms364.549us / second
Rate: 0.32633 / second
Percentiles: 1%=807.358us; 5%=860.901us; 10%=896.996us; 20%=001ms198.477us; 50%=001ms496.556us; 80%=004ms32.002us; 90%=037ms114.494us; 95%=041ms732.304us; 99%=041ms787.486us
Metric: TransferToServerTime
TotalSamples: 591
Counter: 02m34s962ms312.525us
ValueRate: 310ms250.972us / second
Rate: 1.9514 / second
Percentiles: 1%=001ms185.203us; 5%=001ms372.156us; 10%=001ms458.302us; 20%=002ms620.201us; 50%=006ms827.034us; 80%=215ms991.852us; 90%=526ms63.777us; 95%=529ms367.292us; 99%=640ms715.818us
Counter: CachedSyncTensors
Value: 195
Counter: CreateCompileHandles
Value: 7
Counter: CreateDataHandles
Value: 82919
Counter: CreateXlaTensor
Value: 629711
Counter: DestroyDataHandles
Value: 82040
Counter: DestroyXlaTensor
Value: 628960
Counter: ReleaseDataHandles
Value: 82040
Counter: SyncTensorsToData
Value: 184
Counter: UncachedSyncTensors
Value: 7
Counter: XRTAllocateFromTensor_Empty
Value: 230
Counter: XrtCompile_Empty
Value: 1152
Counter: XrtExecuteChained_Empty
Value: 1152
Counter: XrtExecute_Empty
Value: 1152
Counter: XrtRead_Empty
Value: 1152
Counter: XrtReleaseAllocationHandle_Empty
Value: 1152
Counter: XrtReleaseCompileHandle_Empty
Value: 1152
Counter: XrtSessionCount
Value: 11
Counter: XrtSubTuple_Empty
Value: 1152
Counter: aten::_local_scalar_dense
Value: 40
Validating the subset "valid"
| WARNING: 72 samples have invalid sizes and will be skipped, max_positions=(64, 64), first few sample ids=[1808, 612, 1824, 2185, 1244, 723, 1226, 388, 1791, 378]
validation/ 18:48:24, device xla:1, step 0
validation/ 18:48:26, device xla:1, step 10
validation/ 18:48:29, device xla:1, step 20
validation stats on subset "valid" - 18:48:29
| epoch 005 | valid on 'valid' subset | loss 12.692 | nll_loss 12.303 | ppl 5052.19 | num_updates 110
old learning rate: 1.1097800000000002e-05
new learning rate: 1.3847250000000004e-05
Metric: CompileTime
TotalSamples: 7
Counter: 03m56s897ms46.438us
ValueRate: 737ms590.273us / second
Rate: 0.0293134 / second
Percentiles: 1%=091ms699.712us; 5%=091ms699.712us; 10%=091ms699.712us; 20%=206ms627.326us; 50%=11s663ms676.532us; 80%=01m11s278ms507.643us; 90%=01m12s293ms561.423us; 95%=01m12s293ms561.423us; 99%=01m12s293ms561.423us
Metric: ExecuteTime
TotalSamples: 225
Counter: 02m52s819ms966.189us
ValueRate: 370ms483.835us / second
Rate: 0.745481 / second
Percentiles: 1%=003ms959.118us; 5%=216ms613.492us; 10%=216ms967.264us; 20%=217ms516.335us; 50%=612ms203.275us; 80%=616ms894.985us; 90%=617ms762.261us; 95%=645ms821.911us; 99%=02s291ms408.715us
Metric: InboundData
TotalSamples: 44
Counter: 164.00B
ValueRate: 1.28B / second
Rate: 0.344715 / second
Percentiles: 1%=1.00B; 5%=1.00B; 10%=4.00B; 20%=4.00B; 50%=4.00B; 80%=4.00B; 90%=4.00B; 95%=4.00B; 99%=4.00B
Metric: OutboundData
TotalSamples: 619
Counter: 854.25MB
ValueRate: 2.77MB / second
Rate: 2.00488 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=8.00B; 80%=8.00B; 90%=388.00KB; 95%=776.00KB; 99%=776.00KB
Metric: ReleaseDataHandlesTime
TotalSamples: 3243
Counter: 20s832ms98.987us
ValueRate: 083ms863.297us / second
Rate: 30.86 / second
Percentiles: 1%=639.553us; 5%=722.512us; 10%=826.725us; 20%=957.435us; 50%=001ms356.274us; 80%=005ms626.009us; 90%=007ms430.141us; 95%=009ms896.738us; 99%=010ms269.518us
Metric: TransferFromServerTime
TotalSamples: 44
Counter: 335ms503.257us
ValueRate: 003ms620.644us / second
Rate: 0.344715 / second
Percentiles: 1%=807.358us; 5%=860.901us; 10%=896.996us; 20%=001ms157.936us; 50%=001ms463.409us; 80%=004ms32.002us; 90%=037ms114.494us; 95%=041ms732.304us; 99%=041ms986.188us
Metric: TransferToServerTime
TotalSamples: 619
Counter: 02m39s874ms713.813us
ValueRate: 321ms666.023us / second
Rate: 2.00753 / second
Percentiles: 1%=001ms186.559us; 5%=001ms383.621us; 10%=001ms475.538us; 20%=002ms631.245us; 50%=006ms185.505us; 80%=215ms734.920us; 90%=526ms840.855us; 95%=529ms306.112us; 99%=613ms188.762us
Counter: CachedSyncTensors
Value: 218
Counter: CreateCompileHandles
Value: 7
Counter: CreateDataHandles
Value: 83195
Counter: CreateXlaTensor
Value: 654312
Counter: DestroyDataHandles
Value: 82314
Counter: DestroyXlaTensor
Value: 653561
Counter: ReleaseDataHandles
Value: 82316
Counter: SyncTensorsToData
Value: 184
Counter: UncachedSyncTensors
Value: 7
Counter: XRTAllocateFromTensor_Empty
Value: 230
Counter: XrtCompile_Empty
Value: 1152
Counter: XrtExecuteChained_Empty
Value: 1152
Counter: XrtExecute_Empty
Value: 1152
Counter: XrtRead_Empty
Value: 1152
Counter: XrtReleaseAllocationHandle_Empty
Value: 1152
Counter: XrtReleaseCompileHandle_Empty
Value: 1152
Counter: XrtSessionCount
Value: 11
Counter: XrtSubTuple_Empty
Value: 1152
Counter: aten::_local_scalar_dense
Value: 44
Epoch 6 begin 18:48:29
training/ 18:48:36, device xla:1, step 10, Rate=169.00, Global Rate=192.36
training/ 18:48:43, device xla:1, step 20, Rate=179.18, Global Rate=187.01
Epoch 6 Training stats:
device xla:1
| epoch 006 | loss 13.695 | nll_loss 13.441 | ppl 11118.70 | wps 1363 | ups 0 | wpb 3319.773 | bsz 128.000 | num_updates 132 | lr 1.65967e-05 | gnorm 3.584 | clip 0.000 | oom 0.000 | wall 321 | train_wall 98
Epoch 6 Tracker Rates:
Rate=179.10, Global Rate=186.24
Epoch 6 end 18:48:45
Metric: CompileTime
TotalSamples: 7
Counter: 03m56s897ms46.438us
ValueRate: 737ms590.273us / second
Rate: 0.0293134 / second
Percentiles: 1%=091ms699.712us; 5%=091ms699.712us; 10%=091ms699.712us; 20%=206ms627.326us; 50%=11s663ms676.532us; 80%=01m11s278ms507.643us; 90%=01m12s293ms561.423us; 95%=01m12s293ms561.423us; 99%=01m12s293ms561.423us
Metric: ExecuteTime
TotalSamples: 247
Counter: 02m05s378ms967.654us
ValueRate: 395ms780.730us / second
Rate: 0.777735 / second
Percentiles: 1%=003ms959.118us; 5%=216ms634.839us; 10%=216ms987.439us; 20%=217ms538.597us; 50%=613ms324.068us; 80%=616ms87.465us; 90%=617ms388.057us; 95%=623ms868.638us; 99%=02s291ms408.715us
Metric: InboundData
TotalSamples: 49
Counter: 184.00B
ValueRate: 1.28B / second
Rate: 0.341538 / second
Percentiles: 1%=1.00B; 5%=1.00B; 10%=4.00B; 20%=4.00B; 50%=4.00B; 80%=4.00B; 90%=4.00B; 95%=4.00B; 99%=4.00B
Metric: OutboundData
TotalSamples: 707
Counter: 858.42MB
ValueRate: 2.65MB / second
Rate: 2.17999 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=8.00B; 80%=8.00B; 90%=8.00B; 95%=776.00KB; 99%=776.00KB
Metric: ReleaseDataHandlesTime
TotalSamples: 3748
Counter: 21s398ms382.735us
ValueRate: 094ms857.360us / second
Rate: 31.7134 / second
Percentiles: 1%=639.394us; 5%=736.666us; 10%=852.136us; 20%=999.278us; 50%=002ms508.441us; 80%=005ms387.361us; 90%=008ms690.883us; 95%=009ms109.084us; 99%=011ms644.054us
Metric: TransferFromServerTime
TotalSamples: 49
Counter: 387ms567.475us
ValueRate: 003ms694.436us / second
Rate: 0.341538 / second
Percentiles: 1%=807.358us; 5%=860.901us; 10%=896.996us; 20%=001ms198.477us; 50%=001ms422.211us; 80%=004ms32.002us; 90%=041ms541.436us; 95%=041ms787.486us; 99%=047ms816.580us
Metric: TransferToServerTime
TotalSamples: 707
Counter: 02m50s456ms262.493us
ValueRate: 341ms218.665us / second
Rate: 2.18405 / second
Percentiles: 1%=001ms186.559us; 5%=001ms360.509us; 10%=001ms475.538us; 20%=002ms624.598us; 50%=006ms80.178us; 80%=215ms18.662us; 90%=526ms662.574us; 95%=529ms64.262us; 99%=613ms188.762us
Counter: CachedSyncTensors
Value: 240
Counter: CreateCompileHandles
Value: 7
Counter: CreateDataHandles
Value: 99513
Counter: CreateXlaTensor
Value: 760462
Counter: DestroyDataHandles
Value: 98634
Counter: DestroyXlaTensor
Value: 759711
Counter: ReleaseDataHandles
Value: 98634
Counter: SyncTensorsToData
Value: 184
Counter: UncachedSyncTensors
Value: 7
Counter: XRTAllocateFromTensor_Empty
Value: 230
Counter: XrtCompile_Empty
Value: 1152
Counter: XrtExecuteChained_Empty
Value: 1152
Counter: XrtExecute_Empty
Value: 1152
Counter: XrtRead_Empty
Value: 1152
Counter: XrtReleaseAllocationHandle_Empty
Value: 1152
Counter: XrtReleaseCompileHandle_Empty
Value: 1152
Counter: XrtSessionCount
Value: 11
Counter: XrtSubTuple_Empty
Value: 1152
Counter: aten::_local_scalar_dense
Value: 49
Validating the subset "valid"
| WARNING: 72 samples have invalid sizes and will be skipped, max_positions=(64, 64), first few sample ids=[1808, 612, 1824, 2185, 1244, 723, 1226, 388, 1791, 378]
validation/ 18:48:45, device xla:1, step 0
validation/ 18:48:47, device xla:1, step 10
validation/ 18:48:50, device xla:1, step 20
validation stats on subset "valid" - 18:48:50
| epoch 006 | valid on 'valid' subset | loss 12.301 | nll_loss 11.851 | ppl 3693.81 | num_updates 132
old learning rate: 1.3847250000000004e-05
new learning rate: 1.65967e-05
Metric: CompileTime
TotalSamples: 7
Counter: 03m56s897ms46.438us
ValueRate: 737ms590.273us / second
Rate: 0.0293134 / second
Percentiles: 1%=091ms699.712us; 5%=091ms699.712us; 10%=091ms699.712us; 20%=206ms627.326us; 50%=11s663ms676.532us; 80%=01m11s278ms507.643us; 90%=01m12s293ms561.423us; 95%=01m12s293ms561.423us; 99%=01m12s293ms561.423us
Metric: ExecuteTime
TotalSamples: 270
Counter: 02m10s171ms433.007us
ValueRate: 403ms299.667us / second
Rate: 0.836519 / second
Percentiles: 1%=003ms506.921us; 5%=216ms613.492us; 10%=216ms28.840us; 20%=217ms570.631us; 50%=612ms203.275us; 80%=616ms985.133us; 90%=617ms222.841us; 95%=621ms398.382us; 99%=02s291ms408.715us
Metric: InboundData
TotalSamples: 53
Counter: 197.00B
ValueRate: 1.33B / second
Rate: 0.356688 / second
Percentiles: 1%=1.00B; 5%=1.00B; 10%=4.00B; 20%=4.00B; 50%=4.00B; 80%=4.00B; 90%=4.00B; 95%=4.00B; 99%=4.00B
Metric: OutboundData
TotalSamples: 735
Counter: 862.59MB
ValueRate: 2.61MB / second
Rate: 2.22804 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=8.00B; 80%=8.00B; 90%=388.00KB; 95%=776.00KB; 99%=776.00KB
Metric: ReleaseDataHandlesTime
TotalSamples: 3847
Counter: 22s511ms388.486us
ValueRate: 081ms363.023us / second
Rate: 29.6324 / second
Percentiles: 1%=639.394us; 5%=736.666us; 10%=852.136us; 20%=995.345us; 50%=001ms414.040us; 80%=005ms599.007us; 90%=007ms401.526us; 95%=009ms965.430us; 99%=010ms498.658us
Metric: TransferFromServerTime
TotalSamples: 53
Counter: 431ms283.500us
ValueRate: 003ms902.524us / second
Rate: 0.356688 / second
Percentiles: 1%=807.358us; 5%=860.901us; 10%=001ms58.001us; 20%=001ms209.838us; 50%=001ms463.409us; 80%=004ms32.002us; 90%=039ms243.925us; 95%=041ms787.486us; 99%=047ms816.580us
Metric: TransferToServerTime
TotalSamples: 735
Counter: 02m55s403ms646.395us
ValueRate: 350ms464.680us / second
Rate: 2.23211 / second
Percentiles: 1%=001ms186.559us; 5%=001ms369.036us; 10%=001ms483.836us; 20%=002ms640.027us; 50%=006ms393.907us; 80%=215ms970.770us; 90%=525ms987.607us; 95%=529ms862.003us; 99%=613ms188.762us
Counter: CachedSyncTensors
Value: 263
Counter: CreateCompileHandles
Value: 7
Counter: CreateDataHandles
Value: 99789
Counter: CreateXlaTensor
Value: 785063
Counter: DestroyDataHandles
Value: 98908
Counter: DestroyXlaTensor
Value: 784312
Counter: ReleaseDataHandles
Value: 98910
Counter: SyncTensorsToData
Value: 184
Counter: UncachedSyncTensors
Value: 7
Counter: XRTAllocateFromTensor_Empty
Value: 230
Counter: XrtCompile_Empty
Value: 1152
Counter: XrtExecuteChained_Empty
Value: 1152
Counter: XrtExecute_Empty
Value: 1152
Counter: XrtRead_Empty
Value: 1152
Counter: XrtReleaseAllocationHandle_Empty
Value: 1152
Counter: XrtReleaseCompileHandle_Empty
Value: 1152
Counter: XrtSessionCount
Value: 11
Counter: XrtSubTuple_Empty
Value: 1152
Counter: aten::_local_scalar_dense
Value: 53
Epoch 7 begin 18:48:50
training/ 18:48:57, device xla:1, step 10, Rate=168.90, Global Rate=193.68
training/ 18:49:04, device xla:1, step 20, Rate=177.12, Global Rate=185.99
Epoch 7 Training stats:
device xla:1
| epoch 007 | loss 13.480 | nll_loss 13.199 | ppl 9401.78 | wps 1493 | ups 0 | wpb 3319.773 | bsz 128.000 | num_updates 154 | lr 1.93462e-05 | gnorm 3.472 | clip 0.000 | oom 0.000 | wall 342 | train_wall 112
Epoch 7 Tracker Rates:
Rate=177.38, Global Rate=185.23
Epoch 7 end 18:49:06
Metric: CompileTime
TotalSamples: 7
Counter: 03m56s897ms46.438us
ValueRate: 737ms590.273us / second
Rate: 0.0293134 / second
Percentiles: 1%=091ms699.712us; 5%=091ms699.712us; 10%=091ms699.712us; 20%=206ms627.326us; 50%=11s663ms676.532us; 80%=01m11s278ms507.643us; 90%=01m12s293ms561.423us; 95%=01m12s293ms561.423us; 99%=01m12s293ms561.423us
Metric: ExecuteTime
TotalSamples: 292
Counter: 02m24s754ms270.990us
ValueRate: 425ms537.795us / second
Rate: 0.86234 / second
Percentiles: 1%=003ms506.921us; 5%=216ms634.839us; 10%=216ms84.506us; 20%=217ms618.217us; 50%=613ms324.068us; 80%=616ms180.683us; 90%=618ms882.314us; 95%=621ms245.737us; 99%=02s291ms408.715us
Metric: InboundData
TotalSamples: 58
Counter: 217.00B
ValueRate: 1.32B / second
Rate: 0.352693 / second
Percentiles: 1%=1.00B; 5%=1.00B; 10%=4.00B; 20%=4.00B; 50%=4.00B; 80%=4.00B; 90%=4.00B; 95%=4.00B; 99%=4.00B
Metric: OutboundData
TotalSamples: 823
Counter: 866.76MB
ValueRate: 2.51MB / second
Rate: 2.38317 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=8.00B; 80%=8.00B; 90%=8.00B; 95%=776.00KB; 99%=776.00KB
Metric: ReleaseDataHandlesTime
TotalSamples: 4406
Counter: 23s159ms297.613us
ValueRate: 088ms295.671us / second
Rate: 31.0941 / second
Percentiles: 1%=621.565us; 5%=733.307us; 10%=848.092us; 20%=001ms10.398us; 50%=001ms484.745us; 80%=005ms885.934us; 90%=008ms554.615us; 95%=009ms887.611us; 99%=011ms681.455us
Metric: TransferFromServerTime
TotalSamples: 58
Counter: 439ms92.804us
ValueRate: 003ms670.085us / second
Rate: 0.352693 / second
Percentiles: 1%=807.358us; 5%=860.901us; 10%=001ms58.001us; 20%=001ms228.991us; 50%=001ms458.171us; 80%=004ms543.772us; 90%=039ms243.925us; 95%=041ms787.486us; 99%=047ms816.580us
Metric: TransferToServerTime
TotalSamples: 823
Counter: 02m06s425ms570.105us
ValueRate: 367ms728.955us / second
Rate: 2.38734 / second
Percentiles: 1%=001ms186.559us; 5%=001ms360.509us; 10%=001ms458.302us; 20%=002ms620.201us; 50%=006ms363.697us; 80%=215ms32.955us; 90%=525ms593.086us; 95%=528ms469.060us; 99%=603ms299.454us
Counter: CachedSyncTensors
Value: 285
Counter: CreateCompileHandles
Value: 7
Counter: CreateDataHandles
Value: 116107
Counter: CreateXlaTensor
Value: 891213
Counter: DestroyDataHandles
Value: 115228
Counter: DestroyXlaTensor
Value: 890462
Counter: ReleaseDataHandles
Value: 115228
Counter: SyncTensorsToData
Value: 184
Counter: UncachedSyncTensors
Value: 7
Counter: XRTAllocateFromTensor_Empty
Value: 230
Counter: XrtCompile_Empty
Value: 1152
Counter: XrtExecuteChained_Empty
Value: 1152
Counter: XrtExecute_Empty
Value: 1152
Counter: XrtRead_Empty
Value: 1152
Counter: XrtReleaseAllocationHandle_Empty
Value: 1152
Counter: XrtReleaseCompileHandle_Empty
Value: 1152
Counter: XrtSessionCount
Value: 11
Counter: XrtSubTuple_Empty
Value: 1152
Counter: aten::_local_scalar_dense
Value: 58
Validating the subset "valid"
| WARNING: 72 samples have invalid sizes and will be skipped, max_positions=(64, 64), first few sample ids=[1808, 612, 1824, 2185, 1244, 723, 1226, 388, 1791, 378]
validation/ 18:49:06, device xla:1, step 0
validation/ 18:49:08, device xla:1, step 10
validation/ 18:49:11, device xla:1, step 20
validation stats on subset "valid" - 18:49:11
| epoch 007 | valid on 'valid' subset | loss 12.066 | nll_loss 11.580 | ppl 3060.60 | num_updates 154
old learning rate: 1.65967e-05
new learning rate: 1.934615e-05
Metric: CompileTime
TotalSamples: 7
Counter: 03m56s897ms46.438us
ValueRate: 737ms590.273us / second
Rate: 0.0293134 / second
Percentiles: 1%=091ms699.712us; 5%=091ms699.712us; 10%=091ms699.712us; 20%=206ms627.326us; 50%=11s663ms676.532us; 80%=01m11s278ms507.643us; 90%=01m12s293ms561.423us; 95%=01m12s293ms561.423us; 99%=01m12s293ms561.423us
Metric: ExecuteTime
TotalSamples: 315
Counter: 02m29s552ms216.117us
ValueRate: 432ms190.280us / second
Rate: 0.916445 / second
Percentiles: 1%=003ms506.921us; 5%=216ms634.839us; 10%=216ms102.252us; 20%=217ms656.533us; 50%=224ms742.434us; 80%=616ms123.879us; 90%=618ms679.766us; 95%=621ms45.607us; 99%=02s290ms903.845us
Metric: InboundData
TotalSamples: 62
Counter: 230.00B
ValueRate: 1.36B / second
Rate: 0.36569 / second
Percentiles: 1%=1.00B; 5%=1.00B; 10%=4.00B; 20%=4.00B; 50%=4.00B; 80%=4.00B; 90%=4.00B; 95%=4.00B; 99%=4.00B
Metric: OutboundData
TotalSamples: 851
Counter: 870.93MB
ValueRate: 2.48MB / second
Rate: 2.4267 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=8.00B; 80%=8.00B; 90%=388.00KB; 95%=776.00KB; 99%=776.00KB
Metric: ReleaseDataHandlesTime
TotalSamples: 4507
Counter: 23s268ms360.411us
ValueRate: 077ms168.846us / second
Rate: 29.6077 / second
Percentiles: 1%=621.565us; 5%=744.034us; 10%=844.554us; 20%=989.667us; 50%=001ms377.160us; 80%=004ms162.704us; 90%=007ms130.144us; 95%=008ms416.038us; 99%=010ms110.494us
Metric: TransferFromServerTime
TotalSamples: 62
Counter: 447ms808.272us
ValueRate: 003ms635.375us / second
Rate: 0.36569 / second
Percentiles: 1%=807.358us; 5%=877.012us; 10%=001ms63.480us; 20%=001ms248.953us; 50%=001ms463.409us; 80%=003ms754.984us; 90%=037ms114.494us; 95%=041ms732.304us; 99%=047ms816.580us
Metric: TransferToServerTime
TotalSamples: 851
Counter: 02m11s365ms853.115us
ValueRate: 375ms34.003us / second
Rate: 2.42952 / second
Percentiles: 1%=001ms186.559us; 5%=001ms361.802us; 10%=001ms475.741us; 20%=002ms633.623us; 50%=007ms515.518us; 80%=215ms991.852us; 90%=524ms314.346us; 95%=528ms332.390us; 99%=603ms299.454us
Counter: CachedSyncTensors
Value: 308
Counter: CreateCompileHandles
Value: 7
Counter: CreateDataHandles
Value: 116383
Counter: CreateXlaTensor
Value: 915814
Counter: DestroyDataHandles
Value: 115502
Counter: DestroyXlaTensor
Value: 915063
Counter: ReleaseDataHandles
Value: 115504
Counter: SyncTensorsToData
Value: 184
Counter: UncachedSyncTensors
Value: 7
Counter: XRTAllocateFromTensor_Empty
Value: 230
Counter: XrtCompile_Empty
Value: 1152
Counter: XrtExecuteChained_Empty
Value: 1152
Counter: XrtExecute_Empty
Value: 1152
Counter: XrtRead_Empty
Value: 1152
Counter: XrtReleaseAllocationHandle_Empty
Value: 1152
Counter: XrtReleaseCompileHandle_Empty
Value: 1152
Counter: XrtSessionCount
Value: 11
Counter: XrtSubTuple_Empty
Value: 1152
Counter: aten::_local_scalar_dense
Value: 62
Epoch 8 begin 18:49:11
training/ 18:49:18, device xla:1, step 10, Rate=167.15, Global Rate=192.84
training/ 18:49:25, device xla:1, step 20, Rate=176.74, Global Rate=185.01
Epoch 8 Training stats:
device xla:1
| epoch 008 | loss 13.278 | nll_loss 12.970 | ppl 8025.07 | wps 1608 | ups 0 | wpb 3319.773 | bsz 128.000 | num_updates 176 | lr 2.20956e-05 | gnorm 3.440 | clip 0.000 | oom 0.000 | wall 363 | train_wall 125
Epoch 8 Tracker Rates:
Rate=176.40, Global Rate=184.14
Epoch 8 end 18:49:27
Metric: CompileTime
TotalSamples: 7
Counter: 03m56s897ms46.438us
ValueRate: 737ms590.273us / second
Rate: 0.0293134 / second
Percentiles: 1%=091ms699.712us; 5%=091ms699.712us; 10%=091ms699.712us; 20%=206ms627.326us; 50%=11s663ms676.532us; 80%=01m11s278ms507.643us; 90%=01m12s293ms561.423us; 95%=01m12s293ms561.423us; 99%=01m12s293ms561.423us
Metric: ExecuteTime
TotalSamples: 337
Counter: 03m42s148ms344.212us
ValueRate: 451ms837.390us / second
Rate: 0.936995 / second
Percentiles: 1%=003ms506.921us; 5%=216ms641.270us; 10%=216ms177.857us; 20%=217ms734.938us; 50%=613ms266.858us; 80%=616ms493.661us; 90%=618ms155.472us; 95%=621ms245.737us; 99%=02s290ms903.845us
Metric: InboundData
TotalSamples: 67
Counter: 250.00B
ValueRate: 1.35B / second
Rate: 0.361192 / second
Percentiles: 1%=1.00B; 5%=1.00B; 10%=4.00B; 20%=4.00B; 50%=4.00B; 80%=4.00B; 90%=4.00B; 95%=4.00B; 99%=4.00B
Metric: OutboundData
TotalSamples: 942
Counter: 875.10MB
ValueRate: 2.39MB / second
Rate: 2.57111 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=8.00B; 80%=8.00B; 90%=8.00B; 95%=776.00KB; 99%=776.00KB
Metric: ReleaseDataHandlesTime
TotalSamples: 5076
Counter: 25s027ms170.818us
ValueRate: 095ms735.811us / second
Rate: 33.7138 / second
Percentiles: 1%=596.245us; 5%=732.382us; 10%=807.201us; 20%=977.528us; 50%=001ms445.529us; 80%=005ms16.896us; 90%=008ms554.238us; 95%=009ms763.283us; 99%=010ms308.221us
Metric: TransferFromServerTime
TotalSamples: 67
Counter: 455ms447.787us
ValueRate: 002ms455.282us / second
Rate: 0.361192 / second
Percentiles: 1%=807.358us; 5%=877.012us; 10%=001ms58.001us; 20%=001ms228.991us; 50%=001ms463.409us; 80%=003ms754.984us; 90%=037ms114.494us; 95%=041ms732.304us; 99%=047ms816.580us
Metric: TransferToServerTime
TotalSamples: 942
Counter: 02m22s463ms908.184us
ValueRate: 389ms479.759us / second
Rate: 2.57534 / second
Percentiles: 1%=001ms186.559us; 5%=001ms354.117us; 10%=001ms475.538us; 20%=002ms624.063us; 50%=006ms470.971us; 80%=215ms93.400us; 90%=524ms968.991us; 95%=528ms631.834us; 99%=552ms749.428us
Counter: CachedSyncTensors
Value: 330
Counter: CreateCompileHandles
Value: 7
Counter: CreateDataHandles
Value: 132704
Counter: CreateXlaTensor
Value: 1021964
Counter: DestroyDataHandles
Value: 131825
Counter: DestroyXlaTensor
Value: 1021213
Counter: ReleaseDataHandles
Value: 131825
Counter: SyncTensorsToData
Value: 184
Counter: UncachedSyncTensors
Value: 7
Counter: XRTAllocateFromTensor_Empty
Value: 230
Counter: XrtCompile_Empty
Value: 1152
Counter: XrtExecuteChained_Empty
Value: 1152
Counter: XrtExecute_Empty
Value: 1152
Counter: XrtRead_Empty
Value: 1152
Counter: XrtReleaseAllocationHandle_Empty
Value: 1152
Counter: XrtReleaseCompileHandle_Empty
Value: 1152
Counter: XrtSessionCount
Value: 11
Counter: XrtSubTuple_Empty
Value: 1152
Counter: aten::_local_scalar_dense
Value: 67
Validating the subset "valid"
| WARNING: 72 samples have invalid sizes and will be skipped, max_positions=(64, 64), first few sample ids=[1808, 612, 1824, 2185, 1244, 723, 1226, 388, 1791, 378]
validation/ 18:49:27, device xla:1, step 0
validation/ 18:49:29, device xla:1, step 10
validation/ 18:49:32, device xla:1, step 20
validation stats on subset "valid" - 18:49:32
| epoch 008 | valid on 'valid' subset | loss 11.730 | nll_loss 11.180 | ppl 2320.68 | num_updates 176
old learning rate: 1.934615e-05
new learning rate: 2.2095600000000002e-05
Metric: CompileTime
TotalSamples: 7
Counter: 03m56s897ms46.438us
ValueRate: 737ms590.273us / second
Rate: 0.0293134 / second
Percentiles: 1%=091ms699.712us; 5%=091ms699.712us; 10%=091ms699.712us; 20%=206ms627.326us; 50%=11s663ms676.532us; 80%=01m11s278ms507.643us; 90%=01m12s293ms561.423us; 95%=01m12s293ms561.423us; 99%=01m12s293ms561.423us
Metric: ExecuteTime
TotalSamples: 360
Counter: 03m47s961ms373.449us
ValueRate: 458ms727.731us / second
Rate: 0.986947 / second
Percentiles: 1%=003ms506.921us; 5%=216ms698.612us; 10%=216ms203.962us; 20%=217ms797.392us; 50%=226ms439.386us; 80%=616ms350.427us; 90%=618ms112.541us; 95%=621ms45.607us; 99%=02s290ms903.845us
Metric: InboundData
TotalSamples: 71
Counter: 263.00B
ValueRate: 1.38B / second
Rate: 0.372538 / second
Percentiles: 1%=1.00B; 5%=1.00B; 10%=4.00B; 20%=4.00B; 50%=4.00B; 80%=4.00B; 90%=4.00B; 95%=4.00B; 99%=4.00B
Metric: OutboundData
TotalSamples: 970
Counter: 879.26MB
ValueRate: 2.37MB / second
Rate: 2.60946 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=8.00B; 80%=8.00B; 90%=388.00KB; 95%=776.00KB; 99%=776.00KB
Metric: ReleaseDataHandlesTime
TotalSamples: 5171
Counter: 25s158ms17.311us
ValueRate: 082ms162.285us / second
Rate: 30.7709 / second
Percentiles: 1%=601.902us; 5%=736.493us; 10%=825.656us; 20%=984.604us; 50%=001ms400.680us; 80%=005ms583.245us; 90%=007ms297.972us; 95%=009ms676.729us; 99%=010ms178.754us
Metric: TransferFromServerTime
TotalSamples: 71
Counter: 463ms856.758us
ValueRate: 002ms428.618us / second
Rate: 0.372538 / second
Percentiles: 1%=807.358us; 5%=877.012us; 10%=001ms63.480us; 20%=001ms248.953us; 50%=001ms489.224us; 80%=003ms754.984us; 90%=037ms738.600us; 95%=041ms732.304us; 99%=047ms816.580us
Metric: TransferToServerTime
TotalSamples: 970
Counter: 02m27s443ms518.260us
ValueRate: 397ms80.599us / second
Rate: 2.61233 / second
Percentiles: 1%=001ms186.559us; 5%=001ms356.282us; 10%=001ms479.558us; 20%=002ms633.623us; 50%=007ms574.337us; 80%=215ms32.955us; 90%=524ms893.625us; 95%=528ms582.800us; 99%=552ms749.428us
Counter: CachedSyncTensors
Value: 353
Counter: CreateCompileHandles
Value: 7
Counter: CreateDataHandles
Value: 132980
Counter: CreateXlaTensor
Value: 1046565
Counter: DestroyDataHandles
Value: 132099
Counter: DestroyXlaTensor
Value: 1045814
Counter: ReleaseDataHandles
Value: 132101
Counter: SyncTensorsToData
Value: 184
Counter: UncachedSyncTensors
Value: 7
Counter: XRTAllocateFromTensor_Empty
Value: 230
Counter: XrtCompile_Empty
Value: 1152
Counter: XrtExecuteChained_Empty
Value: 1152
Counter: XrtExecute_Empty
Value: 1152
Counter: XrtRead_Empty
Value: 1152
Counter: XrtReleaseAllocationHandle_Empty
Value: 1152
Counter: XrtReleaseCompileHandle_Empty
Value: 1152
Counter: XrtSessionCount
Value: 11
Counter: XrtSubTuple_Empty
Value: 1152
Counter: aten::_local_scalar_dense
Value: 71
Epoch 9 begin 18:49:32
training/ 18:49:39, device xla:1, step 10, Rate=168.87, Global Rate=192.77
training/ 18:49:46, device xla:1, step 20, Rate=177.72, Global Rate=186.35
Epoch 9 Training stats:
device xla:1
| epoch 009 | loss 13.074 | nll_loss 12.739 | ppl 6837.09 | wps 1710 | ups 1 | wpb 3319.773 | bsz 128.000 | num_updates 198 | lr 2.48451e-05 | gnorm 3.358 | clip 0.000 | oom 0.000 | wall 384 | train_wall 139
Epoch 9 Tracker Rates:
Rate=177.40, Global Rate=185.44
Epoch 9 end 18:49:48
Metric: CompileTime
TotalSamples: 7
Counter: 03m56s897ms46.438us
ValueRate: 737ms590.273us / second
Rate: 0.0293134 / second
Percentiles: 1%=091ms699.712us; 5%=091ms699.712us; 10%=091ms699.712us; 20%=206ms627.326us; 50%=11s663ms676.532us; 80%=01m11s278ms507.643us; 90%=01m12s293ms561.423us; 95%=01m12s293ms561.423us; 99%=01m12s293ms561.423us
Metric: ExecuteTime
TotalSamples: 382
Counter: 03m01s568ms887.063us
ValueRate: 474ms436.441us / second
Rate: 1.00369 / second
Percentiles: 1%=003ms506.921us; 5%=216ms734.852us; 10%=216ms256.299us; 20%=217ms901.192us; 50%=613ms266.858us; 80%=617ms760.984us; 90%=619ms631.829us; 95%=621ms245.737us; 99%=02s290ms903.845us
Metric: InboundData
TotalSamples: 76
Counter: 283.00B
ValueRate: 1.37B / second
Rate: 0.368161 / second
Percentiles: 1%=1.00B; 5%=1.00B; 10%=4.00B; 20%=4.00B; 50%=4.00B; 80%=4.00B; 90%=4.00B; 95%=4.00B; 99%=4.00B
Metric: OutboundData
TotalSamples: 1058
Counter: 883.43MB
ValueRate: 321.98KB / second
Rate: 4.64309 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=8.00B; 80%=8.00B; 90%=8.00B; 95%=776.00KB; 99%=776.00KB
Metric: ReleaseDataHandlesTime
TotalSamples: 5702
Counter: 27s965ms838.010us
ValueRate: 098ms100.362us / second
Rate: 32.2562 / second
Percentiles: 1%=613.042us; 5%=739.971us; 10%=856.007us; 20%=001ms23.525us; 50%=002ms526.614us; 80%=006ms712.408us; 90%=008ms820.404us; 95%=009ms203.821us; 99%=011ms332.571us
Metric: TransferFromServerTime
TotalSamples: 76
Counter: 472ms553.885us
ValueRate: 002ms284.312us / second
Rate: 0.368161 / second
Percentiles: 1%=807.358us; 5%=877.012us; 10%=001ms63.480us; 20%=001ms233.625us; 50%=001ms489.224us; 80%=003ms754.984us; 90%=037ms738.600us; 95%=041ms732.304us; 99%=047ms816.580us
Metric: TransferToServerTime
TotalSamples: 1058
Counter: 03m38s421ms824.234us
ValueRate: 630ms230.416us / second
Rate: 4.64318 / second
Percentiles: 1%=001ms196.486us; 5%=001ms360.509us; 10%=001ms483.836us; 20%=002ms639.428us; 50%=007ms611.894us; 80%=215ms491.355us; 90%=524ms893.625us; 95%=527ms450.307us; 99%=536ms610.401us
Counter: CachedSyncTensors
Value: 375
Counter: CreateCompileHandles
Value: 7
Counter: CreateDataHandles
Value: 149298
Counter: CreateXlaTensor
Value: 1152715
Counter: DestroyDataHandles
Value: 148419
Counter: DestroyXlaTensor
Value: 1151964
Counter: ReleaseDataHandles
Value: 148419
Counter: SyncTensorsToData
Value: 184
Counter: UncachedSyncTensors
Value: 7
Counter: XRTAllocateFromTensor_Empty
Value: 230
Counter: XrtCompile_Empty
Value: 1152
Counter: XrtExecuteChained_Empty
Value: 1152
Counter: XrtExecute_Empty
Value: 1152
Counter: XrtRead_Empty
Value: 1152
Counter: XrtReleaseAllocationHandle_Empty
Value: 1152
Counter: XrtReleaseCompileHandle_Empty
Value: 1152
Counter: XrtSessionCount
Value: 11
Counter: XrtSubTuple_Empty
Value: 1152
Counter: aten::_local_scalar_dense
Value: 76
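Note on reading the report above: time-valued metrics (CompileTime, ExecuteTime, TransferToServerTime, and so on) are printed as concatenated fields such as `03m56s897ms46.438us`. A minimal sketch for converting such a string to seconds follows. It assumes only the `m`, `s`, `ms` and `us` units that appear in this log; the helper name `duration_to_seconds` is hypothetical and not part of torch_xla.

```python
import re

# Units observed in this log only; other units (if any exist) are not handled.
_UNIT_SECONDS = {"m": 60.0, "s": 1.0, "ms": 1e-3, "us": 1e-6}
# "ms" and "us" are listed before "m" and "s" so the longer unit wins.
_PART = re.compile(r"(\d+(?:\.\d+)?)(ms|us|m|s)")


def duration_to_seconds(text):
    """Convert a string like '03m56s897ms46.438us' to seconds (hypothetical helper)."""
    text = text.strip()
    parts = _PART.findall(text)
    # Reject input that is not made up entirely of number+unit fields.
    if not parts or "".join(num + unit for num, unit in parts) != text:
        raise ValueError("unrecognised duration: %r" % text)
    return sum(float(num) * _UNIT_SECONDS[unit] for num, unit in parts)


if __name__ == "__main__":
    # CompileTime accumulator from the report above.
    print(duration_to_seconds("03m56s897ms46.438us"))
```

Applied to the CompileTime accumulator, this gives roughly 236.9 seconds across the 7 compilations.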
Validating the subset "valid"
| WARNING: 72 samples have invalid sizes and will be skipped, max_positions=(64, 64), first few sample ids=[1808, 612, 1824, 2185, 1244, 723, 1226, 388, 1791, 378]
validation/ 18:49:48, device xla:1, step 0
validation/ 18:49:50, device xla:1, step 10
validation/ 18:49:53, device xla:1, step 20
validation stats on subset "valid" - 18:49:53
| epoch 009 | valid on 'valid' subset | loss 11.568 | nll_loss 10.974 | ppl 2011.65 | num_updates 198
old learning rate: 2.2095600000000002e-05
new learning rate: 2.4845050000000004e-05
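Note on the learning-rate lines: the logged values are consistent with a linear warmup from `--warmup-init-lr 1e-07` to `--lr 0.0005` over `--warmup-updates 4000`, as set in the launch script. The sketch below reproduces that arithmetic under this assumption; `warmup_lr` is a hypothetical name, not a fairseq function.

```python
# Values taken from the launch script flags.
WARMUP_INIT_LR = 1e-07   # --warmup-init-lr
PEAK_LR = 0.0005         # --lr
WARMUP_UPDATES = 4000    # --warmup-updates


def warmup_lr(num_updates):
    """Linear warmup learning rate; assumes num_updates <= WARMUP_UPDATES."""
    step = (PEAK_LR - WARMUP_INIT_LR) / WARMUP_UPDATES
    return WARMUP_INIT_LR + num_updates * step


if __name__ == "__main__":
    # Compare with the "old"/"new" learning rates logged at num_updates 176 and 198.
    for n in (176, 198):
        print(n, warmup_lr(n))
```

With these flags each update adds about 1.24975e-07, so the 22 updates of one epoch raise the rate by about 2.74945e-06, which matches the gap between the old and new values logged above.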
Metric: CompileTime
TotalSamples: 7
Counter: 03m56s897ms46.438us
ValueRate: 737ms590.273us / second
Rate: 0.0293134 / second
Percentiles: 1%=091ms699.712us; 5%=091ms699.712us; 10%=091ms699.712us; 20%=206ms627.326us; 50%=11s663ms676.532us; 80%=01m11s278ms507.643us; 90%=01m12s293ms561.423us; 95%=01m12s293ms561.423us; 99%=01m12s293ms561.423us
Metric: ExecuteTime
TotalSamples: 405
Counter: 03m05s371ms605.597us
ValueRate: 481ms618.322us / second
Rate: 1.05006 / second
Percentiles: 1%=003ms530.828us; 5%=216ms734.852us; 10%=216ms306.967us; 20%=217ms927.395us; 50%=224ms742.434us; 80%=617ms727.369us; 90%=618ms340.313us; 95%=621ms45.607us; 99%=02s284ms239.225us
Metric: InboundData
TotalSamples: 80
Counter: 296.00B
ValueRate: 1.40B / second
Rate: 0.378223 / second
Percentiles: 1%=1.00B; 5%=1.00B; 10%=4.00B; 20%=4.00B; 50%=4.00B; 80%=4.00B; 90%=4.00B; 95%=4.00B; 99%=4.00B
Metric: OutboundData
TotalSamples: 1086
Counter: 887.60MB
ValueRate: 331.94KB / second
Rate: 4.61042 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=8.00B; 80%=8.00B; 90%=388.00KB; 95%=776.00KB; 99%=776.00KB
Metric: ReleaseDataHandlesTime
TotalSamples: 5804
Counter: 27s087ms390.578us
ValueRate: 086ms964.845us / second
Rate: 30.195 / second
Percentiles: 1%=618.109us; 5%=753.375us; 10%=868.918us; 20%=001ms26.608us; 50%=001ms452.297us; 80%=005ms225.259us; 90%=008ms695.803us; 95%=009ms0.006us; 99%=011ms289.634us
Metric: TransferFromServerTime
TotalSamples: 80
Counter: 479ms99.645us
ValueRate: 002ms265.083us / second
Rate: 0.378223 / second
Percentiles: 1%=807.358us; 5%=896.996us; 10%=001ms96.757us; 20%=001ms248.953us; 50%=001ms496.556us; 80%=003ms774.990us; 90%=037ms738.600us; 95%=041ms732.304us; 99%=047ms816.580us
Metric: TransferToServerTime
TotalSamples: 1086
Counter: 03m43s379ms629.750us
ValueRate: 633ms853.607us / second
Rate: 4.60635 / second
Percentiles: 1%=001ms196.486us; 5%=001ms361.802us; 10%=001ms490.004us; 20%=002ms659.172us; 50%=007ms745.436us; 80%=215ms174.103us; 90%=524ms774.530us; 95%=527ms378.305us; 99%=532ms380.907us
Counter: CachedSyncTensors
Value: 398
Counter: CreateCompileHandles
Value: 7
Counter: CreateDataHandles
Value: 149574
Counter: CreateXlaTensor
Value: 1177316
Counter: DestroyDataHandles
Value: 148694
Counter: DestroyXlaTensor
Value: 1176565
Counter: ReleaseDataHandles
Value: 148695
Counter: SyncTensorsToData
Value: 184
Counter: UncachedSyncTensors
Value: 7
Counter: XRTAllocateFromTensor_Empty
Value: 230
Counter: XrtCompile_Empty
Value: 1152
Counter: XrtExecuteChained_Empty
Value: 1152
Counter: XrtExecute_Empty
Value: 1152
Counter: XrtRead_Empty
Value: 1152
Counter: XrtReleaseAllocationHandle_Empty
Value: 1152
Counter: XrtReleaseCompileHandle_Empty
Value: 1152
Counter: XrtSessionCount
Value: 11
Counter: XrtSubTuple_Empty
Value: 1152
Counter: aten::_local_scalar_dense
Value: 80
Epoch 10 begin 18:49:53
training/ 18:50:00, device xla:1, step 10, Rate=171.00, Global Rate=194.64
training/ 18:50:07, device xla:1, step 20, Rate=177.35, Global Rate=186.34
Epoch 10 Training stats:
device xla:1
| epoch 010 | loss 12.879 | nll_loss 12.517 | ppl 5861.84 | wps 1801 | ups 1 | wpb 3319.773 | bsz 128.000 | num_updates 220 | lr 2.75945e-05 | gnorm 3.429 | clip 0.000 | oom 0.000 | wall 405 | train_wall 153
Epoch 10 Tracker Rates:
Rate=177.27, Global Rate=185.46
Epoch 10 end 18:50:09
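Note on the `ppl` column: the logged perplexities are consistent with 2 raised to `nll_loss`, that is, with the loss being reported in bits. The sketch below checks that relation against values from this log; it is an inference from the numbers, and `perplexity_from_nll` is a hypothetical helper. Small differences are expected because the log rounds `nll_loss` to three decimals.

```python
def perplexity_from_nll(nll_loss_bits):
    """Perplexity for a base-2 negative log-likelihood (assumed convention)."""
    return 2.0 ** nll_loss_bits


if __name__ == "__main__":
    # (nll_loss, logged ppl) pairs copied from the epoch 9 validation
    # line and the epoch 10 training line of this log.
    for nll, logged in ((10.974, 2011.65), (12.517, 5861.84)):
        print(nll, perplexity_from_nll(nll), logged)
```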
Metric: CompileTime
TotalSamples: 7
Counter: 03m56s897ms46.438us
ValueRate: 737ms590.273us / second
Rate: 0.0293134 / second
Percentiles: 1%=091ms699.712us; 5%=091ms699.712us; 10%=091ms699.712us; 20%=206ms627.326us; 50%=11s663ms676.532us; 80%=01m11s278ms507.643us; 90%=01m12s293ms561.423us; 95%=01m12s293ms561.423us; 99%=01m12s293ms561.423us
Metric: ExecuteTime
TotalSamples: 427
Counter: 03m19s968ms39.420us
ValueRate: 496ms534.817us / second
Rate: 1.06345 / second
Percentiles: 1%=003ms530.828us; 5%=216ms776.965us; 10%=216ms399.379us; 20%=217ms936.771us; 50%=613ms70.522us; 80%=617ms866.539us; 90%=619ms774.216us; 95%=621ms312.657us; 99%=02s284ms239.225us
Metric: InboundData
TotalSamples: 85
Counter: 316.00B
ValueRate: 1.39B / second
Rate: 0.373641 / second
Percentiles: 1%=1.00B; 5%=1.00B; 10%=4.00B; 20%=4.00B; 50%=4.00B; 80%=4.00B; 90%=4.00B; 95%=4.00B; 99%=4.00B
Metric: OutboundData
TotalSamples: 1173
Counter: 891.77MB
ValueRate: 390.02KB / second
Rate: 5.62433 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=8.00B; 80%=8.00B; 90%=8.00B; 95%=776.00KB; 99%=776.00KB
Metric: ReleaseDataHandlesTime
TotalSamples: 6376
Counter: 29s068ms627.766us
ValueRate: 105ms258.265us / second
Rate: 32.4241 / second
Percentiles: 1%=613.042us; 5%=749.904us; 10%=868.918us; 20%=001ms26.301us; 50%=001ms492.272us; 80%=006ms723.872us; 90%=008ms359.640us; 95%=010ms680.954us; 99%=012ms992.002us
Metric: TransferFromServerTime
TotalSamples: 85
Counter: 618ms300.311us
ValueRate: 003ms717.913us / second
Rate: 0.373641 / second
Percentiles: 1%=807.358us; 5%=896.996us; 10%=001ms96.757us; 20%=001ms248.953us; 50%=001ms489.224us; 80%=003ms774.990us; 90%=037ms738.600us; 95%=041ms732.304us; 99%=134ms529.880us
Metric: TransferToServerTime
TotalSamples: 1173
Counter: 03m54s355ms866.538us
ValueRate: 772ms917.792us / second
Rate: 5.64058 / second
Percentiles: 1%=001ms196.486us; 5%=001ms349.114us; 10%=001ms479.231us; 20%=002ms654.841us; 50%=007ms797.709us; 80%=216ms621.187us; 90%=524ms616.373us; 95%=527ms136.013us; 99%=531ms823.814us
Counter: CachedSyncTensors
Value: 420
Counter: CreateCompileHandles
Value: 7
Counter: CreateDataHandles
Value: 165891
Counter: CreateXlaTensor
Value: 1283466
Counter: DestroyDataHandles
Value: 165007
Counter: DestroyXlaTensor
Value: 1282715
Counter: ReleaseDataHandles
Value: 165012
Counter: SyncTensorsToData
Value: 184
Counter: UncachedSyncTensors
Value: 7
Counter: XRTAllocateFromTensor_Empty
Value: 230
Counter: XrtCompile_Empty
Value: 1280
Counter: XrtExecuteChained_Empty
Value: 1280
Counter: XrtExecute_Empty
Value: 1280
Counter: XrtRead_Empty
Value: 1280
Counter: XrtReleaseAllocationHandle_Empty
Value: 1280
Counter: XrtReleaseCompileHandle_Empty
Value: 1280
Counter: XrtSessionCount
Value: 12
Counter: XrtSubTuple_Empty
Value: 1280
Counter: aten::_local_scalar_dense
Value: 85
Validating the subset "valid"
| WARNING: 72 samples have invalid sizes and will be skipped, max_positions=(64, 64), first few sample ids=[1808, 612, 1824, 2185, 1244, 723, 1226, 388, 1791, 378]
validation/ 18:50:09, device xla:1, step 0
validation/ 18:50:11, device xla:1, step 10
validation/ 18:50:14, device xla:1, step 20
validation stats on subset "valid" - 18:50:14
| epoch 010 | valid on 'valid' subset | loss 11.415 | nll_loss 10.782 | ppl 1761.11 | num_updates 220
old learning rate: 2.4845050000000004e-05
new learning rate: 2.7594500000000005e-05
Metric: CompileTime
TotalSamples: 7
Counter: 03m56s897ms46.438us
ValueRate: 737ms590.273us / second
Rate: 0.0293134 / second
Percentiles: 1%=091ms699.712us; 5%=091ms699.712us; 10%=091ms699.712us; 20%=206ms627.326us; 50%=11s663ms676.532us; 80%=01m11s278ms507.643us; 90%=01m12s293ms561.423us; 95%=01m12s293ms561.423us; 99%=01m12s293ms561.423us
Metric: ExecuteTime
TotalSamples: 450
Counter: 03m24s799ms248.395us
ValueRate: 501ms10.895us / second
Rate: 1.10626 / second
Percentiles: 1%=003ms506.921us; 5%=216ms776.965us; 10%=216ms428.616us; 20%=217ms980.059us; 50%=226ms439.386us; 80%=617ms761.537us; 90%=619ms631.829us; 95%=621ms245.737us; 99%=02s284ms239.225us
Metric: InboundData
TotalSamples: 89
Counter: 329.00B
ValueRate: 1.41B / second
Rate: 0.382633 / second
Percentiles: 1%=1.00B; 5%=1.00B; 10%=1.00B; 20%=4.00B; 50%=4.00B; 80%=4.00B; 90%=4.00B; 95%=4.00B; 99%=4.00B
Metric: OutboundData
TotalSamples: 1201
Counter: 895.94MB
ValueRate: 403.21KB / second
Rate: 5.60035 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=8.00B; 80%=8.00B; 90%=388.00KB; 95%=776.00KB; 99%=776.00KB
Metric: ReleaseDataHandlesTime
TotalSamples: 6480
Counter: 29s223ms263.061us
ValueRate: 094ms140.113us / second
Rate: 31.3699 / second
Percentiles: 1%=620.326us; 5%=756.409us; 10%=880.538us; 20%=001ms18.084us; 50%=001ms389.629us; 80%=005ms147.455us; 90%=008ms947.192us; 95%=009ms471.213us; 99%=012ms992.002us
Metric: TransferFromServerTime
TotalSamples: 89
Counter: 625ms385.276us
ValueRate: 003ms688.683us / second
Rate: 0.382633 / second
Percentiles: 1%=807.358us; 5%=896.996us; 10%=001ms63.480us; 20%=001ms233.625us; 50%=001ms481.028us; 80%=003ms774.990us; 90%=037ms738.600us; 95%=041ms732.304us; 99%=134ms529.880us
Metric: TransferToServerTime
TotalSamples: 1201
Counter: 03m59s333ms99.528us
ValueRate: 776ms695.351us / second
Rate: 5.61654 / second
Percentiles: 1%=001ms209.890us; 5%=001ms360.509us; 10%=001ms490.004us; 20%=002ms664.272us; 50%=007ms930.488us; 80%=215ms190.248us; 90%=523ms326.408us; 95%=527ms934.592us; 99%=531ms801.619us
Counter: CachedSyncTensors
Value: 443
Counter: CreateCompileHandles
Value: 7
Counter: CreateDataHandles
Value: 166167
Counter: CreateXlaTensor
Value: 1308067
Counter: DestroyDataHandles
Value: 165286
Counter: DestroyXlaTensor
Value: 1307316
Counter: ReleaseDataHandles
Value: 165288
Counter: SyncTensorsToData
Value: 184
Counter: UncachedSyncTensors
Value: 7
Counter: XRTAllocateFromTensor_Empty
Value: 230
Counter: XrtCompile_Empty
Value: 1280
Counter: XrtExecuteChained_Empty
Value: 1280
Counter: XrtExecute_Empty
Value: 1280
Counter: XrtRead_Empty
Value: 1280
Counter: XrtReleaseAllocationHandle_Empty
Value: 1280
Counter: XrtReleaseCompileHandle_Empty
Value: 1280
Counter: XrtSessionCount
Value: 12
Counter: XrtSubTuple_Empty
Value: 1280
Counter: aten::_local_scalar_dense
Value: 89
Epoch 11 begin 18:50:14
training/ 18:50:21, device xla:1, step 10, Rate=167.94, Global Rate=191.82
training/ 18:50:28, device xla:1, step 20, Rate=176.39, Global Rate=184.53
Epoch 11 Training stats:
device xla:1
| epoch 011 | loss 12.689 | nll_loss 12.300 | ppl 5043.12 | wps 1884 | ups 1 | wpb 3319.773 | bsz 128.000 | num_updates 242 | lr 3.0344e-05 | gnorm 3.545 | clip 0.000 | oom 0.000 | wall 427 | train_wall 167
Epoch 11 Tracker Rates:
Rate=176.52, Global Rate=183.80
Epoch 11 end 18:50:30
Metric: CompileTime
TotalSamples: 7
Counter: 03m56s897ms46.438us
ValueRate: 737ms590.273us / second
Rate: 0.0293134 / second
Percentiles: 1%=091ms699.712us; 5%=091ms699.712us; 10%=091ms699.712us; 20%=206ms627.326us; 50%=11s663ms676.532us; 80%=01m11s278ms507.643us; 90%=01m12s293ms561.423us; 95%=01m12s293ms561.423us; 99%=01m12s293ms561.423us
Metric: ExecuteTime
TotalSamples: 472
Counter: 04m37s380ms818.886us
ValueRate: 514ms211.665us / second
Rate: 1.11652 / second
Percentiles: 1%=003ms506.921us; 5%=216ms824.426us; 10%=216ms445.337us; 20%=217ms1.299us; 50%=613ms70.522us; 80%=617ms964.829us; 90%=619ms724.413us; 95%=621ms245.737us; 99%=02s284ms239.225us
Metric: InboundData
TotalSamples: 94
Counter: 349.00B
ValueRate: 1.40B / second
Rate: 0.378146 / second
Percentiles: 1%=1.00B; 5%=1.00B; 10%=4.00B; 20%=4.00B; 50%=4.00B; 80%=4.00B; 90%=4.00B; 95%=4.00B; 99%=4.00B
Metric: OutboundData
TotalSamples: 1291
Counter: 900.10MB
ValueRate: 389.46KB / second
Rate: 5.61619 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=8.00B; 80%=8.00B; 90%=8.00B; 95%=776.00KB; 99%=776.00KB
Metric: ReleaseDataHandlesTime
TotalSamples: 7040
Counter: 31s982ms573.823us
ValueRate: 104ms154.518us / second
Rate: 32.83 / second
Percentiles: 1%=646.429us; 5%=775.071us; 10%=868.993us; 20%=001ms21.438us; 50%=001ms461.711us; 80%=005ms418.392us; 90%=008ms222.277us; 95%=010ms532.088us; 99%=012ms992.002us
Metric: TransferFromServerTime
TotalSamples: 94
Counter: 634ms582.723us
ValueRate: 003ms548.798us / second
Rate: 0.378146 / second
Percentiles: 1%=807.358us; 5%=896.996us; 10%=001ms34.917us; 20%=001ms201.001us; 50%=001ms474.338us; 80%=003ms774.990us; 90%=036ms171.085us; 95%=041ms732.304us; 99%=134ms529.880us
Metric: TransferToServerTime
TotalSamples: 1291
Counter: 03m10s307ms969.441us
ValueRate: 767ms340.329us / second
Rate: 5.61619 / second
Percentiles: 1%=001ms209.890us; 5%=001ms360.509us; 10%=001ms491.323us; 20%=002ms656.292us; 50%=007ms848.256us; 80%=216ms33.287us; 90%=523ms967.978us; 95%=527ms535.561us; 99%=531ms695.655us
Counter: CachedSyncTensors
Value: 465
Counter: CreateCompileHandles
Value: 7
Counter: CreateDataHandles
Value: 182487
Counter: CreateXlaTensor
Value: 1414217
Counter: DestroyDataHandles
Value: 181608
Counter: DestroyXlaTensor
Value: 1413466
Counter: ReleaseDataHandles
Value: 181608
Counter: SyncTensorsToData
Value: 184
Counter: UncachedSyncTensors
Value: 7
Counter: XRTAllocateFromTensor_Empty
Value: 230
Counter: XrtCompile_Empty
Value: 1280
Counter: XrtExecuteChained_Empty
Value: 1280
Counter: XrtExecute_Empty
Value: 1280
Counter: XrtRead_Empty
Value: 1280
Counter: XrtReleaseAllocationHandle_Empty
Value: 1280
Counter: XrtReleaseCompileHandle_Empty
Value: 1280
Counter: XrtSessionCount
Value: 12
Counter: XrtSubTuple_Empty
Value: 1280
Counter: aten::_local_scalar_dense
Value: 94
Validating the subset "valid"
| WARNING: 72 samples have invalid sizes and will be skipped, max_positions=(64, 64), first few sample ids=[1808, 612, 1824, 2185, 1244, 723, 1226, 388, 1791, 378]
validation/ 18:50:30, device xla:1, step 0
validation/ 18:50:32, device xla:1, step 10
validation/ 18:50:35, device xla:1, step 20
validation stats on subset "valid" - 18:50:35
| epoch 011 | valid on 'valid' subset | loss 11.191 | nll_loss 10.507 | ppl 1455.60 | num_updates 242
old learning rate: 2.7594500000000005e-05
new learning rate: 3.0343950000000003e-05
Metric: CompileTime
TotalSamples: 7
Counter: 03m56s897ms46.438us
ValueRate: 737ms590.273us / second
Rate: 0.0293134 / second
Percentiles: 1%=091ms699.712us; 5%=091ms699.712us; 10%=091ms699.712us; 20%=206ms627.326us; 50%=11s663ms676.532us; 80%=01m11s278ms507.643us; 90%=01m12s293ms561.423us; 95%=01m12s293ms561.423us; 99%=01m12s293ms561.423us
Metric: ExecuteTime
TotalSamples: 495
Counter: 04m42s180ms568.503us
ValueRate: 519ms322.369us / second
Rate: 1.15701 / second
Percentiles: 1%=003ms506.921us; 5%=216ms824.426us; 10%=216ms461.610us; 20%=217ms28.863us; 50%=224ms742.434us; 80%=617ms866.539us; 90%=619ms595.887us; 95%=621ms45.607us; 99%=02s284ms239.225us
Metric: InboundData
TotalSamples: 98
Counter: 362.00B
ValueRate: 1.43B / second
Rate: 0.386361 / second
Percentiles: 1%=1.00B; 5%=1.00B; 10%=1.00B; 20%=4.00B; 50%=4.00B; 80%=4.00B; 90%=4.00B; 95%=4.00B; 99%=4.00B
Metric: OutboundData
TotalSamples: 1321
Counter: 904.27MB
ValueRate: 403.17KB / second
Rate: 5.59974 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=8.00B; 80%=8.00B; 90%=388.00KB; 95%=776.00KB; 99%=776.00KB
Metric: ReleaseDataHandlesTime
TotalSamples: 7144
Counter: 31s102ms488.671us
ValueRate: 092ms215.407us / second
Rate: 31.2381 / second
Percentiles: 1%=656.349us; 5%=775.336us; 10%=868.993us; 20%=001ms16.215us; 50%=001ms389.629us; 80%=005ms801.903us; 90%=008ms782.790us; 95%=009ms338.444us; 99%=012ms749.463us
Metric: TransferFromServerTime
TotalSamples: 98
Counter: 641ms974.102us
ValueRate: 003ms527.013us / second
Rate: 0.386361 / second
Percentiles: 1%=807.358us; 5%=896.996us; 10%=001ms34.917us; 20%=001ms209.838us; 50%=001ms474.338us; 80%=003ms774.990us; 90%=036ms171.085us; 95%=041ms732.304us; 99%=134ms529.880us
Metric: TransferToServerTime
TotalSamples: 1321
Counter: 03m15s264ms994.753us
ValueRate: 772ms672.573us / second
Rate: 5.59973 / second
Percentiles: 1%=001ms209.890us; 5%=001ms361.802us; 10%=001ms497.495us; 20%=002ms667.286us; 50%=007ms944.868us; 80%=216ms621.842us; 90%=522ms430.074us; 95%=526ms298.841us; 99%=531ms687.751us
Counter: CachedSyncTensors
Value: 488
Counter: CreateCompileHandles
Value: 7
Counter: CreateDataHandles
Value: 182765
Counter: CreateXlaTensor
Value: 1438818
Counter: DestroyDataHandles
Value: 181884
Counter: DestroyXlaTensor
Value: 1438067
Counter: ReleaseDataHandles
Value: 181886
Counter: SyncTensorsToData
Value: 184
Counter: UncachedSyncTensors
Value: 7
Counter: XRTAllocateFromTensor_Empty
Value: 230
Counter: XrtCompile_Empty
Value: 1280
Counter: XrtExecuteChained_Empty
Value: 1280
Counter: XrtExecute_Empty
Value: 1280
Counter: XrtRead_Empty
Value: 1280
Counter: XrtReleaseAllocationHandle_Empty
Value: 1280
Counter: XrtReleaseCompileHandle_Empty
Value: 1280
Counter: XrtSessionCount
Value: 12
Counter: XrtSubTuple_Empty
Value: 1280
Counter: aten::_local_scalar_dense
Value: 98
Epoch 12 begin 18:50:35
training/ 18:50:42, device xla:1, step 10, Rate=166.35, Global Rate=190.89
training/ 18:50:49, device xla:1, step 20, Rate=176.82, Global Rate=184.40
Epoch 12 Training stats:
device xla:1
| epoch 012 | loss 12.498 | nll_loss 12.083 | ppl 4337.40 | wps 1958 | ups 1 | wpb 3319.773 | bsz 128.000 | num_updates 264 | lr 3.30934e-05 | gnorm 3.677 | clip 0.000 | oom 0.000 | wall 448 | train_wall 181
Epoch 12 Tracker Rates:
Rate=177.02, Global Rate=183.74
Epoch 12 end 18:50:51
Metric: CompileTime
TotalSamples: 7
Counter: 03m56s897ms46.438us
ValueRate: 737ms590.273us / second
Rate: 0.0293134 / second
Percentiles: 1%=091ms699.712us; 5%=091ms699.712us; 10%=091ms699.712us; 20%=206ms627.326us; 50%=11s663ms676.532us; 80%=01m11s278ms507.643us; 90%=01m12s293ms561.423us; 95%=01m12s293ms561.423us; 99%=01m12s293ms561.423us
Metric: ExecuteTime
TotalSamples: 517
Counter: 04m56s768ms139.950us
ValueRate: 531ms248.543us / second
Rate: 1.16494 / second
Percentiles: 1%=003ms530.828us; 5%=216ms852.580us; 10%=216ms490.345us; 20%=217ms75.045us; 50%=613ms67.724us; 80%=617ms66.154us; 90%=619ms773.910us; 95%=621ms245.737us; 99%=654ms914.822us
Metric: InboundData
TotalSamples: 103
Counter: 382.00B
ValueRate: 1.42B / second
Rate: 0.381994 / second
Percentiles: 1%=1.00B; 5%=1.00B; 10%=4.00B; 20%=4.00B; 50%=4.00B; 80%=4.00B; 90%=4.00B; 95%=4.00B; 99%=4.00B
Metric: OutboundData
TotalSamples: 1409
Counter: 908.44MB
ValueRate: 388.29KB / second
Rate: 5.59939 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=8.00B; 80%=8.00B; 90%=8.00B; 95%=776.00KB; 99%=776.00KB
Metric: ReleaseDataHandlesTime
TotalSamples: 7702
Counter: 33s855ms299.852us
ValueRate: 097ms562.762us / second
Rate: 32.9094 / second
Percentiles: 1%=646.429us; 5%=774.135us; 10%=844.420us; 20%=994.190us; 50%=001ms439.678us; 80%=005ms420.521us; 90%=008ms713.862us; 95%=009ms118.417us; 99%=011ms828.842us
Metric: TransferFromServerTime
TotalSamples: 103
Counter: 649ms937.311us
ValueRate: 002ms406.700us / second
Rate: 0.381994 / second
Percentiles: 1%=832.812us; 5%=896.996us; 10%=001ms33.243us; 20%=001ms198.477us; 50%=001ms463.409us; 80%=003ms774.990us; 90%=006ms413.828us; 95%=041ms541.436us; 99%=047ms816.580us
Metric: TransferToServerTime
TotalSamples: 1409
Counter: 03m27s491ms65.090us
ValueRate: 775ms447.313us / second
Rate: 5.61553 / second
Percentiles: 1%=001ms209.890us; 5%=001ms361.802us; 10%=001ms487.826us; 20%=002ms654.841us; 50%=007ms865.301us; 80%=217ms524.614us; 90%=523ms596.230us; 95%=526ms48.288us; 99%=531ms687.751us
Counter: CachedSyncTensors
Value: 510
Counter: CreateCompileHandles
Value: 7
Counter: CreateDataHandles
Value: 199083
Counter: CreateXlaTensor
Value: 1544968
Counter: DestroyDataHandles
Value: 198204
Counter: DestroyXlaTensor
Value: 1544217
Counter: ReleaseDataHandles
Value: 198204
Counter: SyncTensorsToData
Value: 184
Counter: UncachedSyncTensors
Value: 7
Counter: XRTAllocateFromTensor_Empty
Value: 230
Counter: XrtCompile_Empty
Value: 1280
Counter: XrtExecuteChained_Empty
Value: 1280
Counter: XrtExecute_Empty
Value: 1280
Counter: XrtRead_Empty
Value: 1280
Counter: XrtReleaseAllocationHandle_Empty
Value: 1280
Counter: XrtReleaseCompileHandle_Empty
Value: 1280
Counter: XrtSessionCount
Value: 12
Counter: XrtSubTuple_Empty
Value: 1280
Counter: aten::_local_scalar_dense
Value: 103
Validating the subset "valid"
| WARNING: 72 samples have invalid sizes and will be skipped, max_positions=(64, 64), first few sample ids=[1808, 612, 1824, 2185, 1244, 723, 1226, 388, 1791, 378]
validation/ 18:50:51, device xla:1, step 0
validation/ 18:50:54, device xla:1, step 10
validation/ 18:50:56, device xla:1, step 20
validation stats on subset "valid" - 18:50:56
| epoch 012 | valid on 'valid' subset | loss 11.301 | nll_loss 10.577 | ppl 1527.06 | num_updates 264
old learning rate: 3.0343950000000003e-05
new learning rate: 3.3093400000000004e-05
Metric: CompileTime
TotalSamples: 7
Counter: 03m56s897ms46.438us
ValueRate: 737ms590.273us / second
Rate: 0.0293134 / second
Percentiles: 1%=091ms699.712us; 5%=091ms699.712us; 10%=091ms699.712us; 20%=206ms627.326us; 50%=11s663ms676.532us; 80%=01m11s278ms507.643us; 90%=01m12s293ms561.423us; 95%=01m12s293ms561.423us; 99%=01m12s293ms561.423us
Metric: ExecuteTime
TotalSamples: 540
Counter: 04m01s569ms273.144us
ValueRate: 536ms892.884us / second
Rate: 1.20291 / second
Percentiles: 1%=003ms506.921us; 5%=216ms861.222us; 10%=217ms516.335us; 20%=217ms119.954us; 50%=224ms742.434us; 80%=617ms20.529us; 90%=619ms631.829us; 95%=621ms45.607us; 99%=654ms914.822us
Metric: InboundData
TotalSamples: 107
Counter: 395.00B
ValueRate: 1.44B / second
Rate: 0.389465 / second
Percentiles: 1%=1.00B; 5%=1.00B; 10%=1.00B; 20%=4.00B; 50%=4.00B; 80%=4.00B; 90%=4.00B; 95%=4.00B; 99%=4.00B
Metric: OutboundData
TotalSamples: 1438
Counter: 912.61MB
ValueRate: 402.09KB / second
Rate: 5.58476 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=8.00B; 80%=8.00B; 90%=388.00KB; 95%=776.00KB; 99%=776.00KB
Metric: ReleaseDataHandlesTime
TotalSamples: 7798
Counter: 33s969ms80.204us
ValueRate: 083ms188.796us / second
Rate: 30.7106 / second
Percentiles: 1%=646.429us; 5%=774.135us; 10%=854.750us; 20%=994.190us; 50%=001ms369.811us; 80%=005ms763.347us; 90%=007ms289.165us; 95%=009ms946.335us; 99%=011ms698.090us
Metric: TransferFromServerTime
TotalSamples: 107
Counter: 656ms399.700us
ValueRate: 002ms389.204us / second
Rate: 0.389465 / second
Percentiles: 1%=832.812us; 5%=896.996us; 10%=001ms33.243us; 20%=001ms198.477us; 50%=001ms471.897us; 80%=003ms774.990us; 90%=006ms413.828us; 95%=041ms541.436us; 99%=047ms816.580us
Metric: TransferToServerTime
TotalSamples: 1438
Counter: 04m32s439ms52.873us
ValueRate: 774ms658.527us / second
Rate: 5.57883 / second
Percentiles: 1%=001ms209.890us; 5%=001ms374.454us; 10%=002ms504.221us; 20%=002ms672.013us; 50%=007ms952.035us; 80%=216ms204.504us; 90%=522ms130.728us; 95%=526ms662.574us; 99%=530ms948.977us
Counter: CachedSyncTensors
Value: 533
Counter: CreateCompileHandles
Value: 7
Counter: CreateDataHandles
Value: 199360
Counter: CreateXlaTensor
Value: 1569569
Counter: DestroyDataHandles
Value: 198479
Counter: DestroyXlaTensor
Value: 1568817
Counter: ReleaseDataHandles
Value: 198480
Counter: SyncTensorsToData
Value: 184
Counter: UncachedSyncTensors
Value: 7
Counter: XRTAllocateFromTensor_Empty
Value: 230
Counter: XrtCompile_Empty
Value: 1280
Counter: XrtExecuteChained_Empty
Value: 1280
Counter: XrtExecute_Empty
Value: 1280
Counter: XrtRead_Empty
Value: 1280
Counter: XrtReleaseAllocationHandle_Empty
Value: 1280
Counter: XrtReleaseCompileHandle_Empty
Value: 1280
Counter: XrtSessionCount
Value: 12
Counter: XrtSubTuple_Empty
Value: 1280
Counter: aten::_local_scalar_dense
Value: 107
Epoch 13 begin 18:50:56
training/ 18:51:03, device xla:1, step 10, Rate=169.15, Global Rate=193.19
training/ 18:51:10, device xla:1, step 20, Rate=176.90, Global Rate=185.36
Epoch 13 Training stats:
device xla:1
| epoch 013 | loss 12.315 | nll_loss 11.873 | ppl 3751.06 | wps 2026 | ups 1 | wpb 3319.773 | bsz 128.000 | num_updates 286 | lr 3.58429e-05 | gnorm 3.978 | clip 0.000 | oom 0.000 | wall 469 | train_wall 195
Epoch 13 Tracker Rates:
Rate=176.31, Global Rate=184.40
Epoch 13 end 18:51:12
Metric: CompileTime
TotalSamples: 7
Counter: 03m56s897ms46.438us
ValueRate: 737ms590.273us / second
Rate: 0.0293134 / second
Percentiles: 1%=091ms699.712us; 5%=091ms699.712us; 10%=091ms699.712us; 20%=206ms627.326us; 50%=11s663ms676.532us; 80%=01m11s278ms507.643us; 90%=01m12s293ms561.423us; 95%=01m12s293ms561.423us; 99%=01m12s293ms561.423us
Metric: ExecuteTime
TotalSamples: 562
Counter: 04m14s158ms225.971us
ValueRate: 547ms772.410us / second
Rate: 1.20903 / second
Percentiles: 1%=003ms506.921us; 5%=216ms932.768us; 10%=217ms521.388us; 20%=217ms156.534us; 50%=613ms67.724us; 80%=617ms199.013us; 90%=619ms774.216us; 95%=621ms790.392us; 99%=654ms914.822us
Metric: InboundData
TotalSamples: 112
Counter: 415.00B
ValueRate: 1.43B / second
Rate: 0.385315 / second
Percentiles: 1%=1.00B; 5%=1.00B; 10%=4.00B; 20%=4.00B; 50%=4.00B; 80%=4.00B; 90%=4.00B; 95%=4.00B; 99%=4.00B
Metric: OutboundData
TotalSamples: 1527
Counter: 916.78MB
ValueRate: 387.94KB / second
Rate: 5.59426 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=8.00B; 80%=8.00B; 90%=8.00B; 95%=776.00KB; 99%=776.00KB
Metric: ReleaseDataHandlesTime
TotalSamples: 8359
Counter: 35s720ms196.148us
ValueRate: 099ms37.029us / second
Rate: 32.927 / second
Percentiles: 1%=602.587us; 5%=758.216us; 10%=866.505us; 20%=001ms16.828us; 50%=001ms466.713us; 80%=006ms676.705us; 90%=008ms672.513us; 95%=009ms194.780us; 99%=011ms859.429us
Metric: TransferFromServerTime
TotalSamples: 112
Counter: 664ms281.680us
ValueRate: 002ms285.339us / second
Rate: 0.385315 / second
Percentiles: 1%=832.812us; 5%=896.996us; 10%=001ms34.917us; 20%=001ms201.001us; 50%=001ms471.897us; 80%=003ms754.984us; 90%=004ms32.002us; 95%=041ms541.436us; 99%=047ms816.580us
Metric: TransferToServerTime
TotalSamples: 1527
Counter: 04m43s314ms674.668us
ValueRate: 767ms121.955us / second
Rate: 5.59425 / second
Percentiles: 1%=001ms218.218us; 5%=001ms380.158us; 10%=002ms513.834us; 20%=002ms686.206us; 50%=007ms829.021us; 80%=217ms588.563us; 90%=521ms211.626us; 95%=525ms529.338us; 99%=529ms557.776us
Counter: CachedSyncTensors
Value: 555
Counter: CreateCompileHandles
Value: 7
Counter: CreateDataHandles
Value: 215679
Counter: CreateXlaTensor
Value: 1675719
Counter: DestroyDataHandles
Value: 214799
Counter: DestroyXlaTensor
Value: 1674967
Counter: ReleaseDataHandles
Value: 214799
Counter: SyncTensorsToData
Value: 184
Counter: UncachedSyncTensors
Value: 7
Counter: XRTAllocateFromTensor_Empty
Value: 230
Counter: XrtCompile_Empty
Value: 1280
Counter: XrtExecuteChained_Empty
Value: 1280
Counter: XrtExecute_Empty
Value: 1280
Counter: XrtRead_Empty
Value: 1280
Counter: XrtReleaseAllocationHandle_Empty
Value: 1280
Counter: XrtReleaseCompileHandle_Empty
Value: 1280
Counter: XrtSessionCount
Value: 12
Counter: XrtSubTuple_Empty
Value: 1280
Counter: aten::_local_scalar_dense
Value: 112
Validating the subset "valid"
| WARNING: 72 samples have invalid sizes and will be skipped, max_positions=(64, 64), first few sample ids=[1808, 612, 1824, 2185, 1244, 723, 1226, 388, 1791, 378]
validation/ 18:51:12, device xla:1, step 0
validation/ 18:51:15, device xla:1, step 10
validation/ 18:51:17, device xla:1, step 20
validation stats on subset "valid" - 18:51:17
| epoch 013 | valid on 'valid' subset | loss 11.041 | nll_loss 10.275 | ppl 1239.33 | num_updates 286
old learning rate: 3.3093400000000004e-05
new learning rate: 3.5842850000000005e-05
Metric: CompileTime
TotalSamples: 7
Counter: 03m56s897ms46.438us
ValueRate: 737ms590.273us / second
Rate: 0.0293134 / second
Percentiles: 1%=091ms699.712us; 5%=091ms699.712us; 10%=091ms699.712us; 20%=206ms627.326us; 50%=11s663ms676.532us; 80%=01m11s278ms507.643us; 90%=01m12s293ms561.423us; 95%=01m12s293ms561.423us; 99%=01m12s293ms561.423us
Metric: ExecuteTime
TotalSamples: 585
Counter: 04m19s954ms253.251us
ValueRate: 551ms49.134us / second
Rate: 1.24487 / second
Percentiles: 1%=002ms450.590us; 5%=216ms932.768us; 10%=217ms533.342us; 20%=217ms191.835us; 50%=220ms568.183us; 80%=617ms81.351us; 90%=619ms724.413us; 95%=621ms657.780us; 99%=654ms914.822us
Metric: InboundData
TotalSamples: 116
Counter: 428.00B
ValueRate: 1.45B / second
Rate: 0.39222 / second
Percentiles: 1%=1.00B; 5%=1.00B; 10%=1.00B; 20%=4.00B; 50%=4.00B; 80%=4.00B; 90%=4.00B; 95%=4.00B; 99%=4.00B
Metric: OutboundData
TotalSamples: 1556
Counter: 920.95MB
ValueRate: 401.76KB / second
Rate: 5.58015 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=8.00B; 80%=8.00B; 90%=388.00KB; 95%=776.00KB; 99%=776.00KB
Metric: ReleaseDataHandlesTime
TotalSamples: 8457
Counter: 35s836ms681.330us
ValueRate: 086ms24.858us / second
Rate: 30.7194 / second
Percentiles: 1%=625.346us; 5%=767.121us; 10%=872.320us; 20%=001ms16.828us; 50%=001ms398.602us; 80%=005ms998.822us; 90%=007ms463.766us; 95%=009ms59.259us; 99%=011ms859.429us
Metric: TransferFromServerTime
TotalSamples: 116
Counter: 672ms774.107us
ValueRate: 002ms271.406us / second
Rate: 0.39222 / second
Percentiles: 1%=832.812us; 5%=896.996us; 10%=001ms34.917us; 20%=001ms201.001us; 50%=001ms463.409us; 80%=003ms754.984us; 90%=004ms32.002us; 95%=041ms541.436us; 99%=047ms816.580us
Metric: TransferToServerTime
TotalSamples: 1556
Counter: 04m48s261ms968.870us
ValueRate: 775ms574.406us / second
Rate: 5.59615 / second
Percentiles: 1%=001ms218.218us; 5%=001ms393.471us; 10%=002ms518.507us; 20%=002ms694.782us; 50%=007ms929.593us; 80%=216ms244.089us; 90%=521ms877.546us; 95%=524ms397.290us; 99%=528ms135.425us
Counter: CachedSyncTensors
Value: 578
Counter: CreateCompileHandles
Value: 7
Counter: CreateDataHandles
Value: 215956
Counter: CreateXlaTensor
Value: 1700320
Counter: DestroyDataHandles
Value: 215075
Counter: DestroyXlaTensor
Value: 1699569
Counter: ReleaseDataHandles
Value: 215077
Counter: SyncTensorsToData
Value: 184
Counter: UncachedSyncTensors
Value: 7
Counter: XRTAllocateFromTensor_Empty
Value: 230
Counter: XrtCompile_Empty
Value: 1280
Counter: XrtExecuteChained_Empty
Value: 1280
Counter: XrtExecute_Empty
Value: 1280
Counter: XrtRead_Empty
Value: 1280
Counter: XrtReleaseAllocationHandle_Empty
Value: 1280
Counter: XrtReleaseCompileHandle_Empty
Value: 1280
Counter: XrtSessionCount
Value: 12
Counter: XrtSubTuple_Empty
Value: 1280
Counter: aten::_local_scalar_dense
Value: 116
Epoch 14 begin 18:51:17
training/ 18:51:24, device xla:1, step 10, Rate=164.90, Global Rate=188.71
training/ 18:51:31, device xla:1, step 20, Rate=176.67, Global Rate=183.42
Epoch 14 Training stats:
device xla:1
| epoch 014 | loss 12.133 | nll_loss 11.665 | ppl 3246.82 | wps 2088 | ups 1 | wpb 3319.773 | bsz 128.000 | num_updates 308 | lr 3.85923e-05 | gnorm 4.294 | clip 0.000 | oom 0.000 | wall 490 | train_wall 209
Epoch 14 Tracker Rates:
Rate=176.17, Global Rate=182.65
Epoch 14 end 18:51:34
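Note on the `ppl` column: the logged perplexities match `2 ** nll_loss`, i.e. the NLL appears to be reported in base 2. A small sketch to cross-check this against the printed values; the helper name `perplexity` is hypothetical, and the small mismatch comes from `nll_loss` being rounded to three decimals in the log.

```python
def perplexity(nll_loss_base2):
    """Perplexity from a base-2 negative log-likelihood (assumed relation)."""
    return 2.0 ** nll_loss_base2


# (nll_loss, logged ppl) pairs copied from the epoch 13 validation
# and epoch 14 training lines.
logged = [(10.275, 1239.33), (11.665, 3246.82)]
for nll, ppl in logged:
    relative_error = abs(perplexity(nll) - ppl) / ppl
    print(f"nll={nll} logged_ppl={ppl} relative_error={relative_error:.1e}")
```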
Metric: CompileTime
TotalSamples: 7
Counter: 03m56s897ms46.438us
ValueRate: 737ms590.273us / second
Rate: 0.0293134 / second
Percentiles: 1%=091ms699.712us; 5%=091ms699.712us; 10%=091ms699.712us; 20%=206ms627.326us; 50%=11s663ms676.532us; 80%=01m11s278ms507.643us; 90%=01m12s293ms561.423us; 95%=01m12s293ms561.423us; 99%=01m12s293ms561.423us
Metric: ExecuteTime
TotalSamples: 607
Counter: 05m33s546ms152.554us
ValueRate: 561ms797.021us / second
Rate: 1.24898 / second
Percentiles: 1%=003ms506.921us; 5%=216ms948.975us; 10%=217ms553.987us; 20%=217ms264.844us; 50%=613ms37.188us; 80%=617ms303.839us; 90%=619ms773.910us; 95%=621ms657.780us; 99%=650ms479.576us
Metric: InboundData
TotalSamples: 121
Counter: 448.00B
ValueRate: 1.44B / second
Rate: 0.388026 / second
Percentiles: 1%=1.00B; 5%=1.00B; 10%=4.00B; 20%=4.00B; 50%=4.00B; 80%=4.00B; 90%=4.00B; 95%=4.00B; 99%=4.00B
Metric: OutboundData
TotalSamples: 1649
Counter: 925.11MB
ValueRate: 389.12KB / second
Rate: 5.61132 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=8.00B; 80%=8.00B; 90%=8.00B; 95%=776.00KB; 99%=776.00KB
Metric: ReleaseDataHandlesTime
TotalSamples: 9064
Counter: 37s685ms251.110us
ValueRate: 099ms911.540us / second
Rate: 34.3536 / second
Percentiles: 1%=601.534us; 5%=755.680us; 10%=861.023us; 20%=001ms23.461us; 50%=001ms446.189us; 80%=005ms364.555us; 90%=008ms586.241us; 95%=009ms205.223us; 99%=011ms812.067us
Metric: TransferFromServerTime
TotalSamples: 121
Counter: 679ms141.709us
ValueRate: 002ms177.889us / second
Rate: 0.388026 / second
Percentiles: 1%=807.358us; 5%=896.996us; 10%=001ms34.917us; 20%=001ms190.435us; 50%=001ms448.596us; 80%=003ms754.984us; 90%=004ms23.556us; 95%=039ms243.925us; 99%=047ms816.580us
Metric: TransferToServerTime
TotalSamples: 1649
Counter: 04m60s931ms683.605us
ValueRate: 770ms940.890us / second
Rate: 5.61147 / second
Percentiles: 1%=001ms218.218us; 5%=001ms383.993us; 10%=002ms516.547us; 20%=002ms686.265us; 50%=007ms845.766us; 80%=217ms588.563us; 90%=521ms872.466us; 95%=524ms228.723us; 99%=528ms928.130us
Counter: CachedSyncTensors
Value: 600
Counter: CreateCompileHandles
Value: 7
Counter: CreateDataHandles
Value: 232279
Counter: CreateXlaTensor
Value: 1806470
Counter: DestroyDataHandles
Value: 231400
Counter: DestroyXlaTensor
Value: 1805719
Counter: ReleaseDataHandles
Value: 231400
Counter: SyncTensorsToData
Value: 184
Counter: UncachedSyncTensors
Value: 7
Counter: XRTAllocateFromTensor_Empty
Value: 230
Counter: XrtCompile_Empty
Value: 1280
Counter: XrtExecuteChained_Empty
Value: 1280
Counter: XrtExecute_Empty
Value: 1280
Counter: XrtRead_Empty
Value: 1280
Counter: XrtReleaseAllocationHandle_Empty
Value: 1280
Counter: XrtReleaseCompileHandle_Empty
Value: 1280
Counter: XrtSessionCount
Value: 12
Counter: XrtSubTuple_Empty
Value: 1280
Counter: aten::_local_scalar_dense
Value: 121
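Note on reading the counter dumps: each `Counter: <name>` line is followed by a `Value: <n>` line, whereas inside a `Metric:` block the `Counter:` line holds an accumulated total and is followed by `ValueRate:`. A rough sketch for pulling the plain counters out of a report, assuming exactly that layout; `parse_counters` is a hypothetical helper, not part of any library.

```python
def parse_counters(lines):
    """Collect 'Counter: name' / 'Value: n' pairs, skipping Metric totals."""
    counters = {}
    stripped = [line.strip() for line in lines]
    for current, following in zip(stripped, stripped[1:]):
        # Metric totals are followed by 'ValueRate:', so they do not match here.
        if current.startswith("Counter: ") and following.startswith("Value: "):
            name = current[len("Counter: "):]
            counters[name] = int(following[len("Value: "):])
    return counters


# Lines copied from the report printed at the end of epoch 14.
sample = """Metric: InboundData
TotalSamples: 121
Counter: 448.00B
ValueRate: 1.44B / second
Counter: CreateDataHandles
Value: 232279
Counter: DestroyDataHandles
Value: 231400
Counter: aten::_local_scalar_dense
Value: 121
""".splitlines()

counters = parse_counters(sample)
live_handles = counters["CreateDataHandles"] - counters["DestroyDataHandles"]
print(live_handles)  # 879
```

Comparing successive reports this way shows, for example, that `CachedSyncTensors` keeps growing while `UncachedSyncTensors` and `CreateCompileHandles` stay at 7 throughout this section of the log.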
Validating the subset "valid"
| WARNING: 72 samples have invalid sizes and will be skipped, max_positions=(64, 64), first few sample ids=[1808, 612, 1824, 2185, 1244, 723, 1226, 388, 1791, 378]
validation/ 18:51:34, device xla:1, step 0
validation/ 18:51:36, device xla:1, step 10
validation/ 18:51:38, device xla:1, step 20
validation stats on subset "valid" - 18:51:39
| epoch 014 | valid on 'valid' subset | loss 11.186 | nll_loss 10.396 | ppl 1347.20 | num_updates 308
old learning rate: 3.5842850000000005e-05
new learning rate: 3.8592300000000007e-05
Metric: CompileTime
TotalSamples: 7
Counter: 03m56s897ms46.438us
ValueRate: 737ms590.273us / second
Rate: 0.0293134 / second
Percentiles: 1%=091ms699.712us; 5%=091ms699.712us; 10%=091ms699.712us; 20%=206ms627.326us; 50%=11s663ms676.532us; 80%=01m11s278ms507.643us; 90%=01m12s293ms561.423us; 95%=01m12s293ms561.423us; 99%=01m12s293ms561.423us
Metric: ExecuteTime
TotalSamples: 630
Counter: 05m37s345ms366.112us
ValueRate: 565ms732.070us / second
Rate: 1.28281 / second
Percentiles: 1%=002ms490.525us; 5%=216ms948.975us; 10%=217ms571.138us; 20%=217ms300.167us; 50%=220ms568.183us; 80%=617ms235.664us; 90%=619ms631.829us; 95%=620ms495.733us; 99%=650ms479.576us
Metric: InboundData
TotalSamples: 125
Counter: 461.00B
ValueRate: 1.45B / second
Rate: 0.394406 / second
Percentiles: 1%=1.00B; 5%=1.00B; 10%=1.00B; 20%=4.00B; 50%=4.00B; 80%=4.00B; 90%=4.00B; 95%=4.00B; 99%=4.00B
Metric: OutboundData
TotalSamples: 1680
Counter: 929.28MB
ValueRate: 404.14KB / second
Rate: 5.6132 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=8.00B; 80%=8.00B; 90%=388.00KB; 95%=776.00KB; 99%=776.00KB
Metric: ReleaseDataHandlesTime
TotalSamples: 9157
Counter: 37s799ms327.457us
ValueRate: 087ms869.602us / second
Rate: 31.9536 / second
Percentiles: 1%=601.534us; 5%=758.629us; 10%=872.150us; 20%=001ms17.146us; 50%=001ms384.379us; 80%=005ms533.727us; 90%=008ms501.630us; 95%=009ms73.532us; 99%=011ms812.067us
Metric: TransferFromServerTime
TotalSamples: 125
Counter: 687ms923.137us
ValueRate: 002ms167.410us / second
Rate: 0.394406 / second
Percentiles: 1%=807.358us; 5%=896.996us; 10%=001ms34.917us; 20%=001ms190.435us; 50%=001ms447.697us; 80%=003ms774.990us; 90%=004ms23.556us; 95%=039ms243.925us; 99%=047ms816.580us
Metric: TransferToServerTime
TotalSamples: 1680
Counter: 04m05s877ms745.743us
ValueRate: 771ms380.618us / second
Rate: 5.6132 / second
Percentiles: 1%=001ms220.544us; 5%=001ms394.929us; 10%=002ms522.652us; 20%=002ms694.782us; 50%=007ms929.593us; 80%=216ms302.680us; 90%=520ms414.235us; 95%=524ms968.991us; 99%=528ms631.834us
Counter: CachedSyncTensors
Value: 623
Counter: CreateCompileHandles
Value: 7
Counter: CreateDataHandles
Value: 232558
Counter: CreateXlaTensor
Value: 1831071
Counter: DestroyDataHandles
Value: 231677
Counter: DestroyXlaTensor
Value: 1830319
Counter: ReleaseDataHandles
Value: 231678
Counter: SyncTensorsToData
Value: 184
Counter: UncachedSyncTensors
Value: 7
Counter: XRTAllocateFromTensor_Empty
Value: 230
Counter: XrtCompile_Empty
Value: 1280
Counter: XrtExecuteChained_Empty
Value: 1280
Counter: XrtExecute_Empty
Value: 1280
Counter: XrtRead_Empty
Value: 1280
Counter: XrtReleaseAllocationHandle_Empty
Value: 1280
Counter: XrtReleaseCompileHandle_Empty
Value: 1280
Counter: XrtSessionCount
Value: 12
Counter: XrtSubTuple_Empty
Value: 1280
Counter: aten::_local_scalar_dense
Value: 125
Epoch 15 begin 18:51:39
training/ 18:51:45, device xla:1, step 10, Rate=167.69, Global Rate=192.88
training/ 18:51:53, device xla:1, step 20, Rate=174.89, Global Rate=184.39
Epoch 15 Training stats:
device xla:1
| epoch 015 | loss 11.950 | nll_loss 11.456 | ppl 2808.97 | wps 2144 | ups 1 | wpb 3319.773 | bsz 128.000 | num_updates 330 | lr 4.13418e-05 | gnorm 4.548 | clip 0.000 | oom 0.000 | wall 511 | train_wall 223
Epoch 15 Tracker Rates:
Rate=175.87, Global Rate=183.76
Epoch 15 end 18:51:55
Metric: CompileTime
TotalSamples: 7
Counter: 03m56s897ms46.438us
ValueRate: 737ms590.273us / second
Rate: 0.0293134 / second
Percentiles: 1%=091ms699.712us; 5%=091ms699.712us; 10%=091ms699.712us; 20%=206ms627.326us; 50%=11s663ms676.532us; 80%=01m11s278ms507.643us; 90%=01m12s293ms561.423us; 95%=01m12s293ms561.423us; 99%=01m12s293ms561.423us
Metric: ExecuteTime
TotalSamples: 652
Counter: 05m51s955ms116.795us
ValueRate: 574ms780.710us / second
Rate: 1.28578 / second
Percentiles: 1%=002ms490.525us; 5%=216ms967.264us; 10%=217ms598.813us; 20%=217ms330.676us; 50%=613ms37.188us; 80%=617ms375.181us; 90%=619ms934.726us; 95%=621ms790.392us; 99%=650ms479.576us
Metric: InboundData
TotalSamples: 130
Counter: 481.00B
ValueRate: 1.44B / second
Rate: 0.390475 / second
Percentiles: 1%=1.00B; 5%=1.00B; 10%=4.00B; 20%=4.00B; 50%=4.00B; 80%=4.00B; 90%=4.00B; 95%=4.00B; 99%=4.00B
Metric: OutboundData
TotalSamples: 1769
Counter: 933.45MB
ValueRate: 390.13KB / second
Rate: 5.62581 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=8.00B; 80%=8.00B; 90%=8.00B; 95%=776.00KB; 99%=776.00KB
Metric: ReleaseDataHandlesTime
TotalSamples: 9700
Counter: 39s584ms200.177us
ValueRate: 099ms346.366us / second
Rate: 32.8275 / second
Percentiles: 1%=667.856us; 5%=773.739us; 10%=877.230us; 20%=001ms13.258us; 50%=001ms476.573us; 80%=006ms575.833us; 90%=008ms153.950us; 95%=009ms391.060us; 99%=011ms915.522us
Metric: TransferFromServerTime
TotalSamples: 130
Counter: 698ms44.887us
ValueRate: 002ms96.685us / second
Rate: 0.390475 / second
Percentiles: 1%=807.358us; 5%=896.996us; 10%=001ms58.001us; 20%=001ms198.477us; 50%=001ms463.409us; 80%=003ms774.990us; 90%=004ms32.002us; 95%=039ms243.925us; 99%=047ms816.580us
Metric: TransferToServerTime
TotalSamples: 1769
Counter: 04m16s846ms588.675us
ValueRate: 766ms40.840us / second
Rate: 5.62603 / second
Percentiles: 1%=001ms229.882us; 5%=001ms407.530us; 10%=002ms524.274us; 20%=002ms690.923us; 50%=007ms873.531us; 80%=216ms484.629us; 90%=521ms782.688us; 95%=524ms924.946us; 99%=528ms661.921us
Counter: CachedSyncTensors
Value: 645
Counter: CreateCompileHandles
Value: 7
Counter: CreateDataHandles
Value: 248877
Counter: CreateXlaTensor
Value: 1937221
Counter: DestroyDataHandles
Value: 247997
Counter: DestroyXlaTensor
Value: 1936469
Counter: ReleaseDataHandles
Value: 247997
Counter: SyncTensorsToData
Value: 184
Counter: UncachedSyncTensors
Value: 7
Counter: XRTAllocateFromTensor_Empty
Value: 230
Counter: XrtCompile_Empty
Value: 1280
Counter: XrtExecuteChained_Empty
Value: 1280
Counter: XrtExecute_Empty
Value: 1280
Counter: XrtRead_Empty
Value: 1280
Counter: XrtReleaseAllocationHandle_Empty
Value: 1280
Counter: XrtReleaseCompileHandle_Empty
Value: 1280
Counter: XrtSessionCount
Value: 12
Counter: XrtSubTuple_Empty
Value: 1280
Counter: aten::_local_scalar_dense
Value: 130
Validating the subset "valid"
| WARNING: 72 samples have invalid sizes and will be skipped, max_positions=(64, 64), first few sample ids=[1808, 612, 1824, 2185, 1244, 723, 1226, 388, 1791, 378]
validation/ 18:51:55, device xla:1, step 0
validation/ 18:51:57, device xla:1, step 10
validation/ 18:51:59, device xla:1, step 20
validation stats on subset "valid" - 18:52:00
| epoch 015 | valid on 'valid' subset | loss 10.993 | nll_loss 10.175 | ppl 1156.18 | num_updates 330
old learning rate: 3.8592300000000007e-05
new learning rate: 4.134175000000001e-05
Metric: CompileTime
TotalSamples: 7
Counter: 03m56s897ms46.438us
ValueRate: 737ms590.273us / second
Rate: 0.0293134 / second
Percentiles: 1%=091ms699.712us; 5%=091ms699.712us; 10%=091ms699.712us; 20%=206ms627.326us; 50%=11s663ms676.532us; 80%=01m11s278ms507.643us; 90%=01m12s293ms561.423us; 95%=01m12s293ms561.423us; 99%=01m12s293ms561.423us
Metric: ExecuteTime
TotalSamples: 675
Counter: 05m56s759ms696.051us
ValueRate: 577ms438.586us / second
Rate: 1.31787 / second
Percentiles: 1%=002ms477.732us; 5%=216ms967.264us; 10%=217ms618.217us; 20%=217ms386.671us; 50%=220ms568.183us; 80%=617ms322.772us; 90%=619ms822.331us; 95%=621ms781.875us; 99%=650ms479.576us
Metric: InboundData
TotalSamples: 134
Counter: 494.00B
ValueRate: 1.46B / second
Rate: 0.396434 / second
Percentiles: 1%=1.00B; 5%=1.00B; 10%=1.00B; 20%=4.00B; 50%=4.00B; 80%=4.00B; 90%=4.00B; 95%=4.00B; 99%=4.00B
Metric: OutboundData
TotalSamples: 1798
Counter: 937.62MB
ValueRate: 405.67KB / second
Rate: 5.63443 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=8.00B; 80%=8.00B; 90%=388.00KB; 95%=776.00KB; 99%=776.00KB
Metric: ReleaseDataHandlesTime
TotalSamples: 9799
Counter: 39s923ms393.123us
ValueRate: 092ms802.856us / second
Rate: 30.0452 / second
Percentiles: 1%=658.878us; 5%=788.595us; 10%=881.619us; 20%=001ms19.032us; 50%=001ms399.253us; 80%=005ms735.249us; 90%=008ms912.755us; 95%=009ms353.159us; 99%=011ms975.093us
Metric: TransferFromServerTime
TotalSamples: 134
Counter: 705ms117.718us
ValueRate: 002ms86.064us / second
Rate: 0.396434 / second
Percentiles: 1%=807.358us; 5%=896.996us; 10%=001ms58.001us; 20%=001ms189.327us; 50%=001ms458.171us; 80%=003ms774.990us; 90%=004ms23.556us; 95%=039ms243.925us; 99%=047ms816.580us
Metric: TransferToServerTime
TotalSamples: 1798
Counter: 04m21s811ms677.293us
ValueRate: 771ms969.901us / second
Rate: 5.62836 / second
Percentiles: 1%=001ms229.882us; 5%=001ms421.986us; 10%=002ms533.839us; 20%=002ms696.814us; 50%=007ms958.380us; 80%=216ms244.089us; 90%=520ms414.235us; 95%=524ms825.563us; 99%=528ms661.921us
Counter: CachedSyncTensors
Value: 668
Counter: CreateCompileHandles
Value: 7
Counter: CreateDataHandles
Value: 249154
Counter: CreateXlaTensor
Value: 1961822
Counter: DestroyDataHandles
Value: 248273
Counter: DestroyXlaTensor
Value: 1961071
Counter: ReleaseDataHandles
Value: 248275
Counter: SyncTensorsToData
Value: 184
Counter: UncachedSyncTensors
Value: 7
Counter: XRTAllocateFromTensor_Empty
Value: 230
Counter: XrtCompile_Empty
Value: 1280
Counter: XrtExecuteChained_Empty
Value: 1280
Counter: XrtExecute_Empty
Value: 1280
Counter: XrtRead_Empty
Value: 1280
Counter: XrtReleaseAllocationHandle_Empty
Value: 1280
Counter: XrtReleaseCompileHandle_Empty
Value: 1280
Counter: XrtSessionCount
Value: 12
Counter: XrtSubTuple_Empty
Value: 1280
Counter: aten::_local_scalar_dense
Value: 134
Epoch 16 begin 18:52:00
training/ 18:52:07, device xla:1, step 10, Rate=164.27, Global Rate=188.80
training/ 18:52:14, device xla:1, step 20, Rate=175.99, Global Rate=183.60
Epoch 16 Training stats:
device xla:1
| epoch 016 | loss 11.760 | nll_loss 11.239 | ppl 2417.22 | wps 2197 | ups 1 | wpb 3319.773 | bsz 128.000 | num_updates 352 | lr 4.40912e-05 | gnorm 4.765 | clip 0.000 | oom 0.000 | wall 532 | train_wall 237
Epoch 16 Tracker Rates:
Rate=176.44, Global Rate=182.98
Epoch 16 end 18:52:16
Metric: CompileTime
TotalSamples: 7
Counter: 03m56s897ms46.438us
ValueRate: 737ms590.273us / second
Rate: 0.0293134 / second
Percentiles: 1%=091ms699.712us; 5%=091ms699.712us; 10%=091ms699.712us; 20%=206ms627.326us; 50%=11s663ms676.532us; 80%=01m11s278ms507.643us; 90%=01m12s293ms561.423us; 95%=01m12s293ms561.423us; 99%=01m12s293ms561.423us
Metric: ExecuteTime
TotalSamples: 697
Counter: 05m09s351ms275.430us
ValueRate: 586ms640.948us / second
Rate: 1.31951 / second
Percentiles: 1%=002ms477.732us; 5%=216ms985.178us; 10%=217ms639.892us; 20%=217ms421.260us; 50%=613ms7.897us; 80%=617ms388.057us; 90%=619ms940.091us; 95%=621ms790.392us; 99%=650ms479.576us
Metric: InboundData
TotalSamples: 139
Counter: 514.00B
ValueRate: 1.45B / second
Rate: 0.392584 / second
Percentiles: 1%=1.00B; 5%=1.00B; 10%=1.00B; 20%=4.00B; 50%=4.00B; 80%=4.00B; 90%=4.00B; 95%=4.00B; 99%=4.00B
Metric: OutboundData
TotalSamples: 1888
Counter: 941.79MB
ValueRate: 385.96KB / second
Rate: 5.62725 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=8.00B; 80%=8.00B; 90%=8.00B; 95%=776.00KB; 99%=776.00KB
Metric: ReleaseDataHandlesTime
TotalSamples: 10382
Counter: 41s719ms852.226us
ValueRate: 106ms945.175us / second
Rate: 33.5074 / second
Percentiles: 1%=635.124us; 5%=785.145us; 10%=874.525us; 20%=001ms32.989us; 50%=001ms490.788us; 80%=005ms263.322us; 90%=008ms995.643us; 95%=009ms146.383us; 99%=011ms781.559us
Metric: TransferFromServerTime
TotalSamples: 139
Counter: 712ms890.245us
ValueRate: 002ms10.622us / second
Rate: 0.392584 / second
Percentiles: 1%=807.358us; 5%=877.012us; 10%=001ms33.243us; 20%=001ms178.371us; 50%=001ms447.697us; 80%=003ms754.984us; 90%=004ms23.556us; 95%=039ms243.925us; 99%=047ms816.580us
Metric: TransferToServerTime
TotalSamples: 1888
Counter: 05m32s835ms117.075us
ValueRate: 768ms229.112us / second
Rate: 5.64156 / second
Percentiles: 1%=001ms229.882us; 5%=001ms421.629us; 10%=002ms535.941us; 20%=002ms694.782us; 50%=007ms910.035us; 80%=217ms588.563us; 90%=521ms650.510us; 95%=524ms856.070us; 99%=528ms591.524us
Counter: CachedSyncTensors
Value: 690
Counter: CreateCompileHandles
Value: 7
Counter: CreateDataHandles
Value: 265474
Counter: CreateXlaTensor
Value: 2067972
Counter: DestroyDataHandles
Value: 264595
Counter: DestroyXlaTensor
Value: 2067221
Counter: ReleaseDataHandles
Value: 264595
Counter: SyncTensorsToData
Value: 184
Counter: UncachedSyncTensors
Value: 7
Counter: XRTAllocateFromTensor_Empty
Value: 230
Counter: XrtCompile_Empty
Value: 1280
Counter: XrtExecuteChained_Empty
Value: 1280
Counter: XrtExecute_Empty
Value: 1280
Counter: XrtRead_Empty
Value: 1280
Counter: XrtReleaseAllocationHandle_Empty
Value: 1280
Counter: XrtReleaseCompileHandle_Empty
Value: 1280
Counter: XrtSessionCount
Value: 12
Counter: XrtSubTuple_Empty
Value: 1280
Counter: aten::_local_scalar_dense
Value: 139
Validating the subset "valid"
| WARNING: 72 samples have invalid sizes and will be skipped, max_positions=(64, 64), first few sample ids=[1808, 612, 1824, 2185, 1244, 723, 1226, 388, 1791, 378]
validation/ 18:52:16, device xla:1, step 0
validation/ 18:52:18, device xla:1, step 10
validation/ 18:52:20, device xla:1, step 20
validation stats on subset "valid" - 18:52:21
| epoch 016 | valid on 'valid' subset | loss 11.079 | nll_loss 10.231 | ppl 1201.75 | num_updates 352
old learning rate: 4.134175000000001e-05
new learning rate: 4.409120000000001e-05
Metric: CompileTime
TotalSamples: 7
Counter: 03m56s897ms46.438us
ValueRate: 737ms590.273us / second
Rate: 0.0293134 / second
Percentiles: 1%=091ms699.712us; 5%=091ms699.712us; 10%=091ms699.712us; 20%=206ms627.326us; 50%=11s663ms676.532us; 80%=01m11s278ms507.643us; 90%=01m12s293ms561.423us; 95%=01m12s293ms561.423us; 99%=01m12s293ms561.423us
Metric: ExecuteTime
TotalSamples: 720
Counter: 05m14s158ms146.163us
ValueRate: 589ms48.380us / second
Rate: 1.35 / second
Percentiles: 1%=002ms477.732us; 5%=216ms987.439us; 10%=217ms656.533us; 20%=217ms442.620us; 50%=221ms690.555us; 80%=617ms375.181us; 90%=619ms848.354us; 95%=621ms781.875us; 99%=649ms7.344us
Metric: InboundData
TotalSamples: 143
Counter: 527.00B
ValueRate: 1.47B / second
Rate: 0.398158 / second
Percentiles: 1%=1.00B; 5%=1.00B; 10%=1.00B; 20%=4.00B; 50%=4.00B; 80%=4.00B; 90%=4.00B; 95%=4.00B; 99%=4.00B
Metric: OutboundData
TotalSamples: 1917
Counter: 945.96MB
ValueRate: 405.41KB / second
Rate: 5.6308 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=8.00B; 80%=8.00B; 90%=388.00KB; 95%=776.00KB; 99%=776.00KB
Metric: ReleaseDataHandlesTime
TotalSamples: 10482
Counter: 41s835ms766.253us
ValueRate: 092ms66.156us / second
Rate: 31.2261 / second
Percentiles: 1%=647.780us; 5%=808.899us; 10%=881.770us; 20%=001ms28.286us; 50%=001ms424.081us; 80%=005ms591.063us; 90%=008ms581.769us; 95%=009ms828.775us; 99%=010ms391.832us
Metric: TransferFromServerTime
TotalSamples: 143
Counter: 718ms497.404us
ValueRate: 002ms0.526us / second
Rate: 0.398158 / second
Percentiles: 1%=807.358us; 5%=893.248us; 10%=001ms33.243us; 20%=001ms163.119us; 50%=001ms440.582us; 80%=003ms754.984us; 90%=004ms934.657us; 95%=037ms114.494us; 99%=047ms816.580us
Metric: TransferToServerTime
TotalSamples: 1917
Counter: 05m37s778ms901.103us
ValueRate: 770ms91.843us / second
Rate: 5.62476 / second
Percentiles: 1%=001ms246.597us; 5%=001ms424.834us; 10%=002ms538.407us; 20%=002ms700.844us; 50%=007ms55.187us; 80%=216ms384.457us; 90%=520ms75.327us; 95%=524ms775.119us; 99%=528ms575.317us
Counter: CachedSyncTensors
Value: 713
Counter: CreateCompileHandles
Value: 7
Counter: CreateDataHandles
Value: 265751
Counter: CreateXlaTensor
Value: 2092573
Counter: DestroyDataHandles
Value: 264870
Counter: DestroyXlaTensor
Value: 2091821
Counter: ReleaseDataHandles
Value: 264871
Counter: SyncTensorsToData
Value: 184
Counter: UncachedSyncTensors
Value: 7
Counter: XRTAllocateFromTensor_Empty
Value: 230
Counter: XrtCompile_Empty
Value: 1280
Counter: XrtExecuteChained_Empty
Value: 1280
Counter: XrtExecute_Empty
Value: 1280
Counter: XrtRead_Empty
Value: 1280
Counter: XrtReleaseAllocationHandle_Empty
Value: 1280
Counter: XrtReleaseCompileHandle_Empty
Value: 1280
Counter: XrtSessionCount
Value: 12
Counter: XrtSubTuple_Empty
Value: 1280
Counter: aten::_local_scalar_dense
Value: 143
Epoch 17 begin 18:52:21
training/ 18:52:27, device xla:1, step 10, Rate=170.80, Global Rate=194.68
training/ 18:52:35, device xla:1, step 20, Rate=176.35, Global Rate=185.50
Epoch 17 Training stats:
device xla:1
| epoch 017 | loss 11.566 | nll_loss 11.018 | ppl 2074.33 | wps 2245 | ups 1 | wpb 3319.773 | bsz 128.000 | num_updates 374 | lr 4.68407e-05 | gnorm 5.037 | clip 0.000 | oom 0.000 | wall 553 | train_wall 250
Epoch 17 Tracker Rates:
Rate=176.58, Global Rate=184.70
Epoch 17 end 18:52:37
Metric: CompileTime
TotalSamples: 7
Counter: 03m56s897ms46.438us
ValueRate: 737ms590.273us / second
Rate: 0.0293134 / second
Percentiles: 1%=091ms699.712us; 5%=091ms699.712us; 10%=091ms699.712us; 20%=206ms627.326us; 50%=11s663ms676.532us; 80%=01m11s278ms507.643us; 90%=01m12s293ms561.423us; 95%=01m12s293ms561.423us; 99%=01m12s293ms561.423us
Metric: ExecuteTime
TotalSamples: 742
Counter: 05m28s763ms367.076us
ValueRate: 597ms769.570us / second
Rate: 1.35098 / second
Percentiles: 1%=002ms477.732us; 5%=216ms28.840us; 10%=217ms675.414us; 20%=217ms450.504us; 50%=613ms7.897us; 80%=618ms541.287us; 90%=619ms3.245us; 95%=621ms790.392us; 99%=649ms7.344us
Metric: InboundData
TotalSamples: 148
Counter: 547.00B
ValueRate: 1.46B / second
Rate: 0.394596 / second
Percentiles: 1%=1.00B; 5%=1.00B; 10%=1.00B; 20%=4.00B; 50%=4.00B; 80%=4.00B; 90%=4.00B; 95%=4.00B; 99%=4.00B
Metric: OutboundData
TotalSamples: 2006
Counter: 950.12MB
ValueRate: 390.15KB / second
Rate: 5.62612 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=8.00B; 80%=8.00B; 90%=8.00B; 95%=776.00KB; 99%=776.00KB
Metric: ReleaseDataHandlesTime
TotalSamples: 11062
Counter: 43s588ms997.350us
ValueRate: 097ms302.801us / second
Rate: 33.7955 / second
Percentiles: 1%=628.026us; 5%=758.102us; 10%=857.777us; 20%=001ms8.535us; 50%=001ms443.902us; 80%=005ms329.879us; 90%=008ms659.773us; 95%=009ms662.159us; 99%=011ms589.102us
Metric: TransferFromServerTime
TotalSamples: 148
Counter: 727ms121.188us
ValueRate: 002ms938.644us / second
Rate: 0.394596 / second
Percentiles: 1%=807.358us; 5%=893.248us; 10%=001ms19.072us; 20%=001ms157.936us; 50%=001ms429.678us; 80%=003ms754.984us; 90%=004ms934.657us; 95%=037ms114.494us; 99%=047ms816.580us
Metric: TransferToServerTime
TotalSamples: 2006
Counter: 05m48s713ms736.156us
ValueRate: 765ms932.326us / second
Rate: 5.62644 / second
Percentiles: 1%=001ms250.567us; 5%=001ms421.629us; 10%=002ms538.407us; 20%=002ms700.844us; 50%=007ms975.307us; 80%=217ms588.563us; 90%=520ms75.327us; 95%=524ms533.171us; 99%=528ms575.317us
Counter: CachedSyncTensors
Value: 735
Counter: CreateCompileHandles
Value: 7
Counter: CreateDataHandles
Value: 282070
Counter: CreateXlaTensor
Value: 2198723
Counter: DestroyDataHandles
Value: 281190
Counter: DestroyXlaTensor
Value: 2197971
Counter: ReleaseDataHandles
Value: 281190
Counter: SyncTensorsToData
Value: 184
Counter: UncachedSyncTensors
Value: 7
Counter: XRTAllocateFromTensor_Empty
Value: 230
Counter: XrtCompile_Empty
Value: 1280
Counter: XrtExecuteChained_Empty
Value: 1280
Counter: XrtExecute_Empty
Value: 1280
Counter: XrtRead_Empty
Value: 1280
Counter: XrtReleaseAllocationHandle_Empty
Value: 1280
Counter: XrtReleaseCompileHandle_Empty
Value: 1280
Counter: XrtSessionCount
Value: 12
Counter: XrtSubTuple_Empty
Value: 1280
Counter: aten::_local_scalar_dense
Value: 148
Validating the subset "valid"
| WARNING: 72 samples have invalid sizes and will be skipped, max_positions=(64, 64), first few sample ids=[1808, 612, 1824, 2185, 1244, 723, 1226, 388, 1791, 378]
validation/ 18:52:37, device xla:1, step 0
validation/ 18:52:39, device xla:1, step 10
validation/ 18:52:41, device xla:1, step 20
validation stats on subset "valid" - 18:52:42
| epoch 017 | valid on 'valid' subset | loss 10.886 | nll_loss 10.001 | ppl 1024.48 | num_updates 374
old learning rate: 4.409120000000001e-05
new learning rate: 4.684065000000001e-05
Metric: CompileTime
TotalSamples: 7
Counter: 03m56s897ms46.438us
ValueRate: 737ms590.273us / second
Rate: 0.0293134 / second
Percentiles: 1%=091ms699.712us; 5%=091ms699.712us; 10%=091ms699.712us; 20%=206ms627.326us; 50%=11s663ms676.532us; 80%=01m11s278ms507.643us; 90%=01m12s293ms561.423us; 95%=01m12s293ms561.423us; 99%=01m12s293ms561.423us
Metric: ExecuteTime
TotalSamples: 765
Counter: 06m33s561ms651.157us
ValueRate: 600ms926.956us / second
Rate: 1.38003 / second
Percentiles: 1%=002ms450.590us; 5%=216ms28.840us; 10%=217ms698.500us; 20%=217ms450.397us; 50%=220ms568.183us; 80%=617ms442.023us; 90%=619ms951.484us; 95%=621ms781.875us; 99%=649ms7.344us
Metric: InboundData
TotalSamples: 152
Counter: 560.00B
ValueRate: 1.47B / second
Rate: 0.399834 / second
Percentiles: 1%=1.00B; 5%=1.00B; 10%=1.00B; 20%=4.00B; 50%=4.00B; 80%=4.00B; 90%=4.00B; 95%=4.00B; 99%=4.00B
Metric: OutboundData
TotalSamples: 2035
Counter: 954.29MB
ValueRate: 405.51KB / second
Rate: 5.6323 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=8.00B; 80%=8.00B; 90%=388.00KB; 95%=776.00KB; 99%=776.00KB
Metric: ReleaseDataHandlesTime
TotalSamples: 11168
Counter: 43s707ms357.446us
ValueRate: 085ms379.862us / second
Rate: 32.1097 / second
Percentiles: 1%=628.026us; 5%=775.907us; 10%=862.272us; 20%=996.857us; 50%=001ms358.903us; 80%=005ms672.844us; 90%=007ms311.439us; 95%=009ms500.091us; 99%=010ms353.125us
Metric: TransferFromServerTime
TotalSamples: 152
Counter: 733ms627.642us
ValueRate: 002ms927.167us / second
Rate: 0.399834 / second
Percentiles: 1%=807.358us; 5%=893.248us; 10%=001ms33.243us; 20%=001ms157.936us; 50%=001ms422.211us; 80%=003ms750.927us; 90%=004ms829.824us; 95%=037ms114.494us; 99%=047ms816.580us
Metric: TransferToServerTime
TotalSamples: 2035
Counter: 05m53s661ms885.368us
ValueRate: 769ms438.477us / second
Rate: 5.62626 / second
Percentiles: 1%=001ms250.567us; 5%=001ms423.877us; 10%=002ms553.796us; 20%=002ms711.828us; 50%=007ms110.754us; 80%=216ms302.680us; 90%=520ms854.247us; 95%=523ms332.310us; 99%=527ms338.321us
Counter: CachedSyncTensors
Value: 758
Counter: CreateCompileHandles
Value: 7
Counter: CreateDataHandles
Value: 282347
Counter: CreateXlaTensor
Value: 2223324
Counter: DestroyDataHandles
Value: 281466
Counter: DestroyXlaTensor
Value: 2222573
Counter: ReleaseDataHandles
Value: 281468
Counter: SyncTensorsToData
Value: 184
Counter: UncachedSyncTensors
Value: 7
Counter: XRTAllocateFromTensor_Empty
Value: 230
Counter: XrtCompile_Empty
Value: 1280
Counter: XrtExecuteChained_Empty
Value: 1280
Counter: XrtExecute_Empty
Value: 1280
Counter: XrtRead_Empty
Value: 1280
Counter: XrtReleaseAllocationHandle_Empty
Value: 1280
Counter: XrtReleaseCompileHandle_Empty
Value: 1280
Counter: XrtSessionCount
Value: 12
Counter: XrtSubTuple_Empty
Value: 1280
Counter: aten::_local_scalar_dense
Value: 152
Epoch 18 begin 18:52:42
training/ 18:52:49, device xla:1, step 10, Rate=166.73, Global Rate=191.88
training/ 18:52:56, device xla:1, step 20, Rate=175.28, Global Rate=183.57
Epoch 18 Training stats:
device xla:1
| epoch 018 | loss 11.370 | nll_loss 10.796 | ppl 1777.86 | wps 2290 | ups 1 | wpb 3319.773 | bsz 128.000 | num_updates 396 | lr 4.95901e-05 | gnorm 5.259 | clip 0.000 | oom 0.000 | wall 574 | train_wall 264
Epoch 18 Tracker Rates:
Rate=175.86, Global Rate=182.93
Epoch 18 end 18:52:58
Metric: CompileTime
TotalSamples: 7
Counter: 03m56s897ms46.438us
ValueRate: 737ms590.273us / second
Rate: 0.0293134 / second
Percentiles: 1%=091ms699.712us; 5%=091ms699.712us; 10%=091ms699.712us; 20%=206ms627.326us; 50%=11s663ms676.532us; 80%=01m11s278ms507.643us; 90%=01m12s293ms561.423us; 95%=01m12s293ms561.423us; 99%=01m12s293ms561.423us
Metric: ExecuteTime
TotalSamples: 787
Counter: 06m46s149ms326.378us
ValueRate: 607ms871.157us / second
Rate: 1.37977 / second
Percentiles: 1%=002ms450.590us; 5%=216ms45.304us; 10%=217ms734.938us; 20%=217ms489.827us; 50%=613ms706.699us; 80%=618ms558.532us; 90%=619ms965.706us; 95%=621ms781.875us; 99%=649ms7.344us
Metric: InboundData
TotalSamples: 157
Counter: 580.00B
ValueRate: 1.46B / second
Rate: 0.396243 / second
Percentiles: 1%=1.00B; 5%=1.00B; 10%=1.00B; 20%=4.00B; 50%=4.00B; 80%=4.00B; 90%=4.00B; 95%=4.00B; 99%=4.00B
Metric: OutboundData
TotalSamples: 2127
Counter: 958.46MB
ValueRate: 388.11KB / second
Rate: 5.6586 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=8.00B; 80%=8.00B; 90%=8.00B; 95%=776.00KB; 99%=776.00KB
Metric: ReleaseDataHandlesTime
TotalSamples: 11739
Counter: 44s432ms606.975us
ValueRate: 094ms134.964us / second
Rate: 33.5552 / second
Percentiles: 1%=630.125us; 5%=758.102us; 10%=828.796us; 20%=982.225us; 50%=001ms408.762us; 80%=005ms183.030us; 90%=007ms265.227us; 95%=009ms670.325us; 99%=011ms589.102us
Metric: TransferFromServerTime
TotalSamples: 157
Counter: 745ms266.764us
ValueRate: 002ms880.935us / second
Rate: 0.396243 / second
Percentiles: 1%=807.358us; 5%=893.248us; 10%=001ms33.243us; 20%=001ms157.936us; 50%=001ms417.568us; 80%=003ms750.927us; 90%=004ms934.657us; 95%=037ms114.494us; 99%=047ms816.580us
Metric: TransferToServerTime
TotalSamples: 2127
Counter: 05m04s754ms161.679us
ValueRate: 764ms477.840us / second
Rate: 5.65878 / second
Percentiles: 1%=001ms250.567us; 5%=001ms403.994us; 10%=002ms538.407us; 20%=002ms696.814us; 50%=007ms959.956us; 80%=216ms392.195us; 90%=520ms197.717us; 95%=523ms289.491us; 99%=527ms934.592us
Counter: CachedSyncTensors
Value: 780
Counter: CreateCompileHandles
Value: 7
Counter: CreateDataHandles
Value: 298669
Counter: CreateXlaTensor
Value: 2329474
Counter: DestroyDataHandles
Value: 297790
Counter: DestroyXlaTensor
Value: 2328723
Counter: ReleaseDataHandles
Value: 297790
Counter: SyncTensorsToData
Value: 184
Counter: UncachedSyncTensors
Value: 7
Counter: XRTAllocateFromTensor_Empty
Value: 230
Counter: XrtCompile_Empty
Value: 1280
Counter: XrtExecuteChained_Empty
Value: 1280
Counter: XrtExecute_Empty
Value: 1280
Counter: XrtRead_Empty
Value: 1280
Counter: XrtReleaseAllocationHandle_Empty
Value: 1280
Counter: XrtReleaseCompileHandle_Empty
Value: 1280
Counter: XrtSessionCount
Value: 12
Counter: XrtSubTuple_Empty
Value: 1280
Counter: aten::_local_scalar_dense
Value: 157
Validating the subset "valid"
| WARNING: 72 samples have invalid sizes and will be skipped, max_positions=(64, 64), first few sample ids=[1808, 612, 1824, 2185, 1244, 723, 1226, 388, 1791, 378]
validation/ 18:52:58, device xla:1, step 0
validation/ 18:53:00, device xla:1, step 10
validation/ 18:53:02, device xla:1, step 20
validation stats on subset "valid" - 18:53:03
| epoch 018 | valid on 'valid' subset | loss 11.015 | nll_loss 10.127 | ppl 1117.94 | num_updates 396
old learning rate: 4.684065000000001e-05
new learning rate: 4.959010000000001e-05
Metric: CompileTime
TotalSamples: 7
Counter: 03m56s897ms46.438us
ValueRate: 737ms590.273us / second
Rate: 0.0293134 / second
Percentiles: 1%=091ms699.712us; 5%=091ms699.712us; 10%=091ms699.712us; 20%=206ms627.326us; 50%=11s663ms676.532us; 80%=01m11s278ms507.643us; 90%=01m12s293ms561.423us; 95%=01m12s293ms561.423us; 99%=01m12s293ms561.423us
Metric: ExecuteTime
TotalSamples: 810
Counter: 06m51s947ms88.172us
ValueRate: 610ms842.843us / second
Rate: 1.40754 / second
Percentiles: 1%=002ms450.590us; 5%=216ms45.304us; 10%=217ms762.922us; 20%=217ms489.827us; 50%=220ms568.183us; 80%=617ms497.848us; 90%=619ms940.091us; 95%=621ms768.494us; 99%=647ms132.993us
Metric: InboundData
TotalSamples: 161
Counter: 593.00B
ValueRate: 1.48B / second
Rate: 0.401202 / second
Percentiles: 1%=1.00B; 5%=1.00B; 10%=1.00B; 20%=4.00B; 50%=4.00B; 80%=4.00B; 90%=4.00B; 95%=4.00B; 99%=4.00B
Metric: OutboundData
TotalSamples: 2157
Counter: 962.63MB
ValueRate: 403.76KB / second
Rate: 5.66765 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=8.00B; 80%=8.00B; 90%=388.00KB; 95%=776.00KB; 99%=776.00KB
Metric: ReleaseDataHandlesTime
TotalSamples: 11835
Counter: 45s538ms561.388us
ValueRate: 082ms185.464us / second
Rate: 31.2888 / second
Percentiles: 1%=643.020us; 5%=762.353us; 10%=831.967us; 20%=971.887us; 50%=001ms333.273us; 80%=005ms516.270us; 90%=007ms113.392us; 95%=009ms604.635us; 99%=011ms569.638us
Metric: TransferFromServerTime
TotalSamples: 161
Counter: 752ms617.963us
ValueRate: 002ms872.985us / second
Rate: 0.401202 / second
Percentiles: 1%=807.358us; 5%=896.996us; 10%=001ms34.917us; 20%=001ms157.936us; 50%=001ms417.568us; 80%=003ms664.984us; 90%=004ms829.824us; 95%=037ms738.600us; 99%=047ms816.580us
Metric: TransferToServerTime
TotalSamples: 2157
Counter: 05m09s719ms664.618us
ValueRate: 769ms450.108us / second
Rate: 5.66149 / second
Percentiles: 1%=001ms252.304us; 5%=001ms405.270us; 10%=002ms550.184us; 20%=002ms705.467us; 50%=007ms64.479us; 80%=216ms204.504us; 90%=520ms63.196us; 95%=523ms223.041us; 99%=527ms910.451us
Counter: CachedSyncTensors
Value: 803
Counter: CreateCompileHandles
Value: 7
Counter: CreateDataHandles
Value: 298947
Counter: CreateXlaTensor
Value: 2354075
Counter: DestroyDataHandles
Value: 298066
Counter: DestroyXlaTensor
Value: 2353323
Counter: ReleaseDataHandles
Value: 298067
Counter: SyncTensorsToData
Value: 184
Counter: UncachedSyncTensors
Value: 7
Counter: XRTAllocateFromTensor_Empty
Value: 230
Counter: XrtCompile_Empty
Value: 1280
Counter: XrtExecuteChained_Empty
Value: 1280
Counter: XrtExecute_Empty
Value: 1280
Counter: XrtRead_Empty
Value: 1280
Counter: XrtReleaseAllocationHandle_Empty
Value: 1280
Counter: XrtReleaseCompileHandle_Empty
Value: 1280
Counter: XrtSessionCount
Value: 12
Counter: XrtSubTuple_Empty
Value: 1280
Counter: aten::_local_scalar_dense
Value: 161
Epoch 19 begin 18:53:03
training/ 18:53:10, device xla:1, step 10, Rate=166.47, Global Rate=190.98
training/ 18:53:17, device xla:1, step 20, Rate=177.32, Global Rate=185.08
Epoch 19 Training stats:
device xla:1
| epoch 019 | loss 11.177 | nll_loss 10.576 | ppl 1526.29 | wps 2332 | ups 1 | wpb 3319.773 | bsz 128.000 | num_updates 418 | lr 5.23396e-05 | gnorm 5.484 | clip 0.000 | oom 0.000 | wall 595 | train_wall 278
Epoch 19 Tracker Rates:
Rate=177.14, Global Rate=184.30
Epoch 19 end 18:53:19
Metric: CompileTime
TotalSamples: 7
Counter: 03m56s897ms46.438us
ValueRate: 737ms590.273us / second
Rate: 0.0293134 / second
Percentiles: 1%=091ms699.712us; 5%=091ms699.712us; 10%=091ms699.712us; 20%=206ms627.326us; 50%=11s663ms676.532us; 80%=01m11s278ms507.643us; 90%=01m12s293ms561.423us; 95%=01m12s293ms561.423us; 99%=01m12s293ms561.423us
Metric: ExecuteTime
TotalSamples: 832
Counter: 06m05s530ms258.568us
ValueRate: 616ms390.622us / second
Rate: 1.40684 / second
Percentiles: 1%=002ms450.590us; 5%=216ms84.506us; 10%=217ms797.076us; 20%=218ms527.880us; 50%=613ms706.699us; 80%=618ms558.532us; 90%=619ms951.484us; 95%=621ms768.494us; 99%=647ms132.993us
Metric: InboundData
TotalSamples: 166
Counter: 613.00B
ValueRate: 1.47B / second
Rate: 0.397861 / second
Percentiles: 1%=1.00B; 5%=1.00B; 10%=1.00B; 20%=4.00B; 50%=4.00B; 80%=4.00B; 90%=4.00B; 95%=4.00B; 99%=4.00B
Metric: OutboundData
TotalSamples: 2245
Counter: 966.80MB
ValueRate: 388.51KB / second
Rate: 5.66448 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=8.00B; 80%=8.00B; 90%=8.00B; 95%=776.00KB; 99%=776.00KB
Metric: ReleaseDataHandlesTime
TotalSamples: 12388
Counter: 46s263ms234.879us
ValueRate: 094ms880.750us / second
Rate: 32.8677 / second
Percentiles: 1%=619.237us; 5%=751.493us; 10%=842.616us; 20%=990.865us; 50%=001ms467.340us; 80%=005ms120.088us; 90%=007ms443.798us; 95%=009ms60.695us; 99%=011ms754.189us
Metric: TransferFromServerTime
TotalSamples: 166
Counter: 759ms239.993us
ValueRate: 002ms819.712us / second
Rate: 0.397861 / second
Percentiles: 1%=807.358us; 5%=893.248us; 10%=001ms33.243us; 20%=001ms153.023us; 50%=001ms417.568us; 80%=003ms664.984us; 90%=004ms829.824us; 95%=037ms738.600us; 99%=047ms816.580us
Metric: TransferToServerTime
TotalSamples: 2245
Counter: 05m20s726ms720.161us
ValueRate: 765ms33.704us / second
Rate: 5.66447 / second
Percentiles: 1%=001ms250.567us; 5%=001ms414.044us; 10%=002ms548.938us; 20%=002ms696.814us; 50%=007ms958.380us; 80%=216ms331.601us; 90%=520ms45.431us; 95%=523ms289.491us; 99%=527ms428.442us
Counter: CachedSyncTensors
Value: 825
Counter: CreateCompileHandles
Value: 7
Counter: CreateDataHandles
Value: 315265
Counter: CreateXlaTensor
Value: 2460225
Counter: DestroyDataHandles
Value: 314385
Counter: DestroyXlaTensor
Value: 2459473
Counter: ReleaseDataHandles
Value: 314385
Counter: SyncTensorsToData
Value: 184
Counter: UncachedSyncTensors
Value: 7
Counter: XRTAllocateFromTensor_Empty
Value: 230
Counter: XrtCompile_Empty
Value: 1280
Counter: XrtExecuteChained_Empty
Value: 1280
Counter: XrtExecute_Empty
Value: 1280
Counter: XrtRead_Empty
Value: 1280
Counter: XrtReleaseAllocationHandle_Empty
Value: 1280
Counter: XrtReleaseCompileHandle_Empty
Value: 1280
Counter: XrtSessionCount
Value: 12
Counter: XrtSubTuple_Empty
Value: 1280
Counter: aten::_local_scalar_dense
Value: 166
Validating the subset "valid"
| WARNING: 72 samples have invalid sizes and will be skipped, max_positions=(64, 64), first few sample ids=[1808, 612, 1824, 2185, 1244, 723, 1226, 388, 1791, 378]
validation/ 18:53:19, device xla:1, step 0
validation/ 18:53:21, device xla:1, step 10
validation/ 18:53:23, device xla:1, step 20
validation stats on subset "valid" - 18:53:24
| epoch 019 | valid on 'valid' subset | loss 10.998 | nll_loss 10.066 | ppl 1072.00 | num_updates 418
old learning rate: 4.959010000000001e-05
new learning rate: 5.233955000000001e-05
Metric: CompileTime
TotalSamples: 7
Counter: 03m56s897ms46.438us
ValueRate: 737ms590.273us / second
Rate: 0.0293134 / second
Percentiles: 1%=091ms699.712us; 5%=091ms699.712us; 10%=091ms699.712us; 20%=206ms627.326us; 50%=11s663ms676.532us; 80%=01m11s278ms507.643us; 90%=01m12s293ms561.423us; 95%=01m12s293ms561.423us; 99%=01m12s293ms561.423us
Metric: ExecuteTime
TotalSamples: 855
Counter: 06m09s317ms275.361us
ValueRate: 619ms169.847us / second
Rate: 1.43343 / second
Percentiles: 1%=002ms434.899us; 5%=216ms84.506us; 10%=217ms723.561us; 20%=217ms473.985us; 50%=219ms490.397us; 80%=617ms497.848us; 90%=619ms934.726us; 95%=621ms657.780us; 99%=647ms132.993us
Metric: InboundData
TotalSamples: 170
Counter: 626.00B
ValueRate: 1.48B / second
Rate: 0.402563 / second
Percentiles: 1%=1.00B; 5%=1.00B; 10%=1.00B; 20%=4.00B; 50%=4.00B; 80%=4.00B; 90%=4.00B; 95%=4.00B; 99%=4.00B
Metric: OutboundData
TotalSamples: 2275
Counter: 970.97MB
ValueRate: 404.51KB / second
Rate: 5.67818 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=8.00B; 80%=8.00B; 90%=388.00KB; 95%=776.00KB; 99%=776.00KB
Metric: ReleaseDataHandlesTime
TotalSamples: 12486
Counter: 46s375ms5.103us
ValueRate: 083ms103.091us / second
Rate: 30.7008 / second
Percentiles: 1%=639.532us; 5%=766.912us; 10%=855.214us; 20%=989.790us; 50%=001ms364.523us; 80%=005ms594.561us; 90%=007ms187.925us; 95%=009ms25.975us; 99%=011ms754.189us
Metric: TransferFromServerTime
TotalSamples: 170
Counter: 766ms117.415us
ValueRate: 002ms814.178us / second
Rate: 0.402563 / second
Percentiles: 1%=807.358us; 5%=893.248us; 10%=001ms34.917us; 20%=001ms153.023us; 50%=001ms417.568us; 80%=003ms750.927us; 90%=004ms829.824us; 95%=037ms738.600us; 99%=047ms816.580us
Metric: TransferToServerTime
TotalSamples: 2275
Counter: 05m25s682ms265.537us
ValueRate: 773ms209.283us / second
Rate: 5.68861 / second
Percentiles: 1%=001ms250.567us; 5%=001ms421.629us; 10%=002ms553.796us; 20%=002ms709.316us; 50%=007ms85.027us; 80%=216ms87.056us; 90%=520ms953.967us; 95%=523ms266.415us; 99%=527ms428.442us
Counter: CachedSyncTensors
Value: 848
Counter: CreateCompileHandles
Value: 7
Counter: CreateDataHandles
Value: 315543
Counter: CreateXlaTensor
Value: 2484826
Counter: DestroyDataHandles
Value: 314662
Counter: DestroyXlaTensor
Value: 2484074
Counter: ReleaseDataHandles
Value: 314663
Counter: SyncTensorsToData
Value: 184
Counter: UncachedSyncTensors
Value: 7
Counter: XRTAllocateFromTensor_Empty
Value: 230
Counter: XrtCompile_Empty
Value: 1280
Counter: XrtExecuteChained_Empty
Value: 1280
Counter: XrtExecute_Empty
Value: 1280
Counter: XrtRead_Empty
Value: 1280
Counter: XrtReleaseAllocationHandle_Empty
Value: 1280
Counter: XrtReleaseCompileHandle_Empty
Value: 1280
Counter: XrtSessionCount
Value: 12
Counter: XrtSubTuple_Empty
Value: 1280
Counter: aten::_local_scalar_dense
Value: 170
Epoch 20 begin 18:53:24
training/ 18:53:31, device xla:1, step 10, Rate=168.64, Global Rate=192.88
training/ 18:53:38, device xla:1, step 20, Rate=178.81, Global Rate=186.07
Epoch 20 Training stats:
device xla:1
| epoch 020 | loss 10.973 | nll_loss 10.345 | ppl 1300.39 | wps 2371 | ups 1 | wpb 3319.773 | bsz 128.000 | num_updates 440 | lr 5.5089e-05 | gnorm 5.673 | clip 0.000 | oom 0.000 | wall 616 | train_wall 292
Epoch 20 Tracker Rates:
Rate=179.01, Global Rate=185.44
Epoch 20 end 18:53:40
Metric: CompileTime
TotalSamples: 7
Counter: 03m56s897ms46.438us
ValueRate: 737ms590.273us / second
Rate: 0.0293134 / second
Percentiles: 1%=091ms699.712us; 5%=091ms699.712us; 10%=091ms699.712us; 20%=206ms627.326us; 50%=11s663ms676.532us; 80%=01m11s278ms507.643us; 90%=01m12s293ms561.423us; 95%=01m12s293ms561.423us; 99%=01m12s293ms561.423us
Metric: ExecuteTime
TotalSamples: 877
Counter: 06m23s895ms166.660us
ValueRate: 625ms339.846us / second
Rate: 1.43231 / second
Percentiles: 1%=002ms434.899us; 5%=216ms102.252us; 10%=217ms762.922us; 20%=218ms506.786us; 50%=613ms696.634us; 80%=618ms549.319us; 90%=619ms5.387us; 95%=621ms657.780us; 99%=647ms132.993us
Metric: InboundData
TotalSamples: 175
Counter: 646.00B
ValueRate: 1.47B / second
Rate: 0.399415 / second
Percentiles: 1%=1.00B; 5%=1.00B; 10%=1.00B; 20%=4.00B; 50%=4.00B; 80%=4.00B; 90%=4.00B; 95%=4.00B; 99%=4.00B
Metric: OutboundData
TotalSamples: 2363
Counter: 975.13MB
ValueRate: 388.74KB / second
Rate: 5.66783 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=8.00B; 80%=8.00B; 90%=8.00B; 95%=776.00KB; 99%=776.00KB
Metric: ReleaseDataHandlesTime
TotalSamples: 13027
Counter: 48s171ms0.690us
ValueRate: 099ms617.390us / second
Rate: 32.3774 / second
Percentiles: 1%=599.611us; 5%=734.993us; 10%=831.663us; 20%=001ms3.466us; 50%=001ms475.067us; 80%=006ms547.764us; 90%=008ms174.827us; 95%=010ms538.845us; 99%=011ms931.308us
Metric: TransferFromServerTime
TotalSamples: 175
Counter: 776ms357.185us
ValueRate: 002ms771.938us / second
Rate: 0.399415 / second
Percentiles: 1%=807.358us; 5%=877.012us; 10%=001ms18.120us; 20%=001ms148.777us; 50%=001ms415.855us; 80%=003ms750.927us; 90%=004ms829.824us; 95%=037ms738.600us; 99%=047ms816.580us
Metric: TransferToServerTime
TotalSamples: 2363
Counter: 06m36s600ms351.297us
ValueRate: 762ms6.019us / second
Rate: 5.66784 / second
Percentiles: 1%=001ms246.583us; 5%=001ms399.499us; 10%=002ms535.919us; 20%=002ms688.733us; 50%=007ms944.511us; 80%=216ms384.457us; 90%=520ms834.883us; 95%=523ms993.495us; 99%=527ms615.552us
Counter: CachedSyncTensors
Value: 870
Counter: CreateCompileHandles
Value: 7
Counter: CreateDataHandles
Value: 331861
Counter: CreateXlaTensor
Value: 2590976
Counter: DestroyDataHandles
Value: 330981
Counter: DestroyXlaTensor
Value: 2590224
Counter: ReleaseDataHandles
Value: 330981
Counter: SyncTensorsToData
Value: 184
Counter: UncachedSyncTensors
Value: 7
Counter: XRTAllocateFromTensor_Empty
Value: 230
Counter: XrtCompile_Empty
Value: 1280
Counter: XrtExecuteChained_Empty
Value: 1280
Counter: XrtExecute_Empty
Value: 1280
Counter: XrtRead_Empty
Value: 1280
Counter: XrtReleaseAllocationHandle_Empty
Value: 1280
Counter: XrtReleaseCompileHandle_Empty
Value: 1280
Counter: XrtSessionCount
Value: 12
Counter: XrtSubTuple_Empty
Value: 1280
Counter: aten::_local_scalar_dense
Value: 175
Validating the subset "valid"
| WARNING: 72 samples have invalid sizes and will be skipped, max_positions=(64, 64), first few sample ids=[1808, 612, 1824, 2185, 1244, 723, 1226, 388, 1791, 378]
validation/ 18:53:40, device xla:1, step 0
validation/ 18:53:42, device xla:1, step 10
validation/ 18:53:44, device xla:1, step 20
validation stats on subset "valid" - 18:53:45
| epoch 020 | valid on 'valid' subset | loss 10.735 | nll_loss 9.786 | ppl 882.72 | num_updates 440
old learning rate: 5.233955000000001e-05
new learning rate: 5.5089000000000014e-05
Metric: CompileTime
TotalSamples: 7
Counter: 03m56s897ms46.438us
ValueRate: 737ms590.273us / second
Rate: 0.0293134 / second
Percentiles: 1%=091ms699.712us; 5%=091ms699.712us; 10%=091ms699.712us; 20%=206ms627.326us; 50%=11s663ms676.532us; 80%=01m11s278ms507.643us; 90%=01m12s293ms561.423us; 95%=01m12s293ms561.423us; 99%=01m12s293ms561.423us
Metric: ExecuteTime
TotalSamples: 900
Counter: 06m28s679ms350.488us
ValueRate: 628ms934.458us / second
Rate: 1.45775 / second
Percentiles: 1%=002ms434.899us; 5%=216ms134.186us; 10%=217ms711.861us; 20%=217ms442.620us; 50%=220ms568.183us; 80%=617ms497.350us; 90%=619ms965.706us; 95%=620ms495.733us; 99%=647ms132.993us
Metric: InboundData
TotalSamples: 179
Counter: 659.00B
ValueRate: 1.49B / second
Rate: 0.403871 / second
Percentiles: 1%=1.00B; 5%=1.00B; 10%=1.00B; 20%=4.00B; 50%=4.00B; 80%=4.00B; 90%=4.00B; 95%=4.00B; 99%=4.00B
Metric: OutboundData
TotalSamples: 2392
Counter: 979.30MB
ValueRate: 403.27KB / second
Rate: 5.66067 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=8.00B; 80%=8.00B; 90%=388.00KB; 95%=776.00KB; 99%=776.00KB
Metric: ReleaseDataHandlesTime
TotalSamples: 13127
Counter: 48s286ms420.171us
ValueRate: 086ms832.695us / second
Rate: 30.2795 / second
Percentiles: 1%=618.671us; 5%=736.639us; 10%=841.538us; 20%=980.464us; 50%=001ms359.124us; 80%=005ms816.583us; 90%=008ms7.777us; 95%=009ms474.594us; 99%=011ms931.308us
Metric: TransferFromServerTime
TotalSamples: 179
Counter: 783ms694.363us
ValueRate: 002ms765.963us / second
Rate: 0.403871 / second
Percentiles: 1%=807.358us; 5%=877.012us; 10%=001ms18.120us; 20%=001ms141.469us; 50%=001ms411.588us; 80%=003ms712.052us; 90%=004ms829.824us; 95%=037ms738.600us; 99%=047ms816.580us
Metric: TransferToServerTime
TotalSamples: 2392
Counter: 06m41s538ms913.184us
ValueRate: 770ms691.145us / second
Rate: 5.67093 / second
Percentiles: 1%=001ms246.583us; 5%=001ms405.270us; 10%=002ms546.919us; 20%=002ms709.316us; 50%=007ms99.571us; 80%=216ms12.894us; 90%=519ms426.308us; 95%=523ms856.287us; 99%=527ms519.317us
Counter: CachedSyncTensors
Value: 893
Counter: CreateCompileHandles
Value: 7
Counter: CreateDataHandles
Value: 332138
Counter: CreateXlaTensor
Value: 2615577
Counter: DestroyDataHandles
Value: 331257
Counter: DestroyXlaTensor
Value: 2614826
Counter: ReleaseDataHandles
Value: 331259
Counter: SyncTensorsToData
Value: 184
Counter: UncachedSyncTensors
Value: 7
Counter: XRTAllocateFromTensor_Empty
Value: 230
Counter: XrtCompile_Empty
Value: 1280
Counter: XrtExecuteChained_Empty
Value: 1280
Counter: XrtExecute_Empty
Value: 1280
Counter: XrtRead_Empty
Value: 1280
Counter: XrtReleaseAllocationHandle_Empty
Value: 1280
Counter: XrtReleaseCompileHandle_Empty
Value: 1280
Counter: XrtSessionCount
Value: 12
Counter: XrtSubTuple_Empty
Value: 1280
Counter: aten::_local_scalar_dense
Value: 179
Epoch 21 begin 18:53:45
training/ 18:53:52, device xla:1, step 10, Rate=168.17, Global Rate=193.11
training/ 18:53:59, device xla:1, step 20, Rate=176.76, Global Rate=185.44
Epoch 21 Training stats:
device xla:1
| epoch 021 | loss 10.769 | nll_loss 10.113 | ppl 1107.75 | wps 2407 | ups 1 | wpb 3319.773 | bsz 128.000 | num_updates 462 | lr 5.78385e-05 | gnorm 5.882 | clip 0.000 | oom 0.000 | wall 637 | train_wall 306
Epoch 21 Tracker Rates:
Rate=176.80, Global Rate=184.63
Epoch 21 end 18:54:01
Metric: CompileTime
TotalSamples: 7
Counter: 03m56s897ms46.438us
ValueRate: 737ms590.273us / second
Rate: 0.0293134 / second
Percentiles: 1%=091ms699.712us; 5%=091ms699.712us; 10%=091ms699.712us; 20%=206ms627.326us; 50%=11s663ms676.532us; 80%=01m11s278ms507.643us; 90%=01m12s293ms561.423us; 95%=01m12s293ms561.423us; 99%=01m12s293ms561.423us
Metric: ExecuteTime
TotalSamples: 922
Counter: 07m41s242ms659.581us
ValueRate: 634ms586.027us / second
Rate: 1.4559 / second
Percentiles: 1%=002ms434.899us; 5%=216ms173.261us; 10%=217ms723.561us; 20%=217ms450.504us; 50%=613ms696.634us; 80%=618ms506.360us; 90%=619ms949.063us; 95%=620ms405.265us; 99%=646ms239.932us
Metric: InboundData
TotalSamples: 184
Counter: 679.00B
ValueRate: 1.48B / second
Rate: 0.40076 / second
Percentiles: 1%=1.00B; 5%=1.00B; 10%=1.00B; 20%=4.00B; 50%=4.00B; 80%=4.00B; 90%=4.00B; 95%=4.00B; 99%=4.00B
Metric: OutboundData
TotalSamples: 2481
Counter: 983.47MB
ValueRate: 388.82KB / second
Rate: 5.6689 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=8.00B; 80%=8.00B; 90%=8.00B; 95%=776.00KB; 99%=776.00KB
Metric: ReleaseDataHandlesTime
TotalSamples: 13682
Counter: 50s017ms442.106us
ValueRate: 097ms912.468us / second
Rate: 32.3889 / second
Percentiles: 1%=630.262us; 5%=757.061us; 10%=856.123us; 20%=001ms12.092us; 50%=001ms443.918us; 80%=005ms406.849us; 90%=008ms29.255us; 95%=009ms488.190us; 99%=011ms41.751us
Metric: TransferFromServerTime
TotalSamples: 184
Counter: 792ms940.141us
ValueRate: 002ms724.878us / second
Rate: 0.40076 / second
Percentiles: 1%=807.358us; 5%=878.660us; 10%=001ms19.072us; 20%=001ms148.777us; 50%=001ms408.434us; 80%=003ms712.052us; 90%=004ms829.824us; 95%=036ms171.085us; 99%=047ms816.580us
Metric: TransferToServerTime
TotalSamples: 2481
Counter: 06m52s101ms479.493us
ValueRate: 762ms50.966us / second
Rate: 5.66898 / second
Percentiles: 1%=001ms250.567us; 5%=001ms404.496us; 10%=002ms535.140us; 20%=002ms698.785us; 50%=007ms943.129us; 80%=216ms120.506us; 90%=519ms426.308us; 95%=523ms806.250us; 99%=527ms910.451us
Counter: CachedSyncTensors
Value: 915
Counter: CreateCompileHandles
Value: 7
Counter: CreateDataHandles
Value: 348457
Counter: CreateXlaTensor
Value: 2721727
Counter: DestroyDataHandles
Value: 347578
Counter: DestroyXlaTensor
Value: 2720976
Counter: ReleaseDataHandles
Value: 347578
Counter: SyncTensorsToData
Value: 184
Counter: UncachedSyncTensors
Value: 7
Counter: XRTAllocateFromTensor_Empty
Value: 230
Counter: XrtCompile_Empty
Value: 1280
Counter: XrtExecuteChained_Empty
Value: 1280
Counter: XrtExecute_Empty
Value: 1280
Counter: XrtRead_Empty
Value: 1280
Counter: XrtReleaseAllocationHandle_Empty
Value: 1280
Counter: XrtReleaseCompileHandle_Empty
Value: 1280
Counter: XrtSessionCount
Value: 12
Counter: XrtSubTuple_Empty
Value: 1280
Counter: aten::_local_scalar_dense
Value: 184
Validating the subset "valid"
| WARNING: 72 samples have invalid sizes and will be skipped, max_positions=(64, 64), first few sample ids=[1808, 612, 1824, 2185, 1244, 723, 1226, 388, 1791, 378]
validation/ 18:54:01, device xla:1, step 0
validation/ 18:54:03, device xla:1, step 10
validation/ 18:54:05, device xla:1, step 20
validation stats on subset "valid" - 18:54:06
| epoch 021 | valid on 'valid' subset | loss 10.804 | nll_loss 9.843 | ppl 918.42 | num_updates 462
old learning rate: 5.5089000000000014e-05
new learning rate: 5.783845000000001e-05
Metric: CompileTime
TotalSamples: 7
Counter: 03m56s897ms46.438us
ValueRate: 737ms590.273us / second
Rate: 0.0293134 / second
Percentiles: 1%=091ms699.712us; 5%=091ms699.712us; 10%=091ms699.712us; 20%=206ms627.326us; 50%=11s663ms676.532us; 80%=01m11s278ms507.643us; 90%=01m12s293ms561.423us; 95%=01m12s293ms561.423us; 99%=01m12s293ms561.423us
Metric: ExecuteTime
TotalSamples: 945
Counter: 07m46s011ms325.770us
ValueRate: 636ms2.522us / second
Rate: 1.48031 / second
Percentiles: 1%=002ms384.020us; 5%=216ms948.975us; 10%=217ms639.892us; 20%=217ms357.129us; 50%=219ms490.397us; 80%=617ms442.023us; 90%=619ms937.467us; 95%=620ms401.254us; 99%=646ms239.932us
Metric: InboundData
TotalSamples: 188
Counter: 692.00B
ValueRate: 1.49B / second
Rate: 0.404995 / second
Percentiles: 1%=1.00B; 5%=1.00B; 10%=1.00B; 20%=4.00B; 50%=4.00B; 80%=4.00B; 90%=4.00B; 95%=4.00B; 99%=4.00B
Metric: OutboundData
TotalSamples: 2510
Counter: 987.64MB
ValueRate: 404.60KB / second
Rate: 5.67931 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=8.00B; 80%=8.00B; 90%=388.00KB; 95%=776.00KB; 99%=776.00KB
Metric: ReleaseDataHandlesTime
TotalSamples: 13787
Counter: 50s344ms778.708us
ValueRate: 091ms927.938us / second
Rate: 30.8536 / second
Percentiles: 1%=636.193us; 5%=770.025us; 10%=858.407us; 20%=988.214us; 50%=001ms344.782us; 80%=005ms635.157us; 90%=008ms539.864us; 95%=009ms223.837us; 99%=011ms962.558us
Metric: TransferFromServerTime
TotalSamples: 188
Counter: 798ms331.522us
ValueRate: 002ms719.791us / second
Rate: 0.404995 / second
Percentiles: 1%=807.358us; 5%=878.660us; 10%=001ms19.072us; 20%=001ms141.469us; 50%=001ms407.300us; 80%=003ms712.052us; 90%=004ms829.824us; 95%=036ms171.085us; 99%=047ms816.580us
Metric: TransferToServerTime
TotalSamples: 2510
Counter: 06m57s028ms970.675us
ValueRate: 767ms414.574us / second
Rate: 5.6732 / second
Percentiles: 1%=001ms250.567us; 5%=001ms405.270us; 10%=002ms538.407us; 20%=002ms722.703us; 50%=007ms93.776us; 80%=216ms901.493us; 90%=519ms337.895us; 95%=523ms806.250us; 99%=527ms910.451us
Counter: CachedSyncTensors
Value: 938
Counter: CreateCompileHandles
Value: 7
Counter: CreateDataHandles
Value: 348734
Counter: CreateXlaTensor
Value: 2746328
Counter: DestroyDataHandles
Value: 347853
Counter: DestroyXlaTensor
Value: 2745576
Counter: ReleaseDataHandles
Value: 347854
Counter: SyncTensorsToData
Value: 184
Counter: UncachedSyncTensors
Value: 7
Counter: XRTAllocateFromTensor_Empty
Value: 230
Counter: XrtCompile_Empty
Value: 1280
Counter: XrtExecuteChained_Empty
Value: 1280
Counter: XrtExecute_Empty
Value: 1280
Counter: XrtRead_Empty
Value: 1280
Counter: XrtReleaseAllocationHandle_Empty
Value: 1280
Counter: XrtReleaseCompileHandle_Empty
Value: 1280
Counter: XrtSessionCount
Value: 12
Counter: XrtSubTuple_Empty
Value: 1280
Counter: aten::_local_scalar_dense
Value: 188
Epoch 22 begin 18:54:06
training/ 18:54:13, device xla:1, step 10, Rate=168.78, Global Rate=192.84
training/ 18:54:20, device xla:1, step 20, Rate=176.34, Global Rate=184.71
Epoch 22 Training stats:
device xla:1
| epoch 022 | loss 10.564 | nll_loss 9.881 | ppl 943.11 | wps 2441 | ups 1 | wpb 3319.773 | bsz 128.000 | num_updates 484 | lr 6.05879e-05 | gnorm 6.074 | clip 0.000 | oom 0.000 | wall 658 | train_wall 320
Epoch 22 Tracker Rates:
Rate=176.31, Global Rate=183.91
Epoch 22 end 18:54:22
Metric: CompileTime
TotalSamples: 7
Counter: 03m56s897ms46.438us
ValueRate: 737ms590.273us / second
Rate: 0.0293134 / second
Percentiles: 1%=091ms699.712us; 5%=091ms699.712us; 10%=091ms699.712us; 20%=206ms627.326us; 50%=11s663ms676.532us; 80%=01m11s278ms507.643us; 90%=01m12s293ms561.423us; 95%=01m12s293ms561.423us; 99%=01m12s293ms561.423us
Metric: ExecuteTime
TotalSamples: 967
Counter: 07m60s553ms46.661us
ValueRate: 641ms185.230us / second
Rate: 1.47783 / second
Percentiles: 1%=002ms384.020us; 5%=216ms967.264us; 10%=217ms654.643us; 20%=217ms417.335us; 50%=613ms682.177us; 80%=617ms415.825us; 90%=619ms934.726us; 95%=620ms300.225us; 99%=646ms239.932us
Metric: InboundData
TotalSamples: 193
Counter: 712.00B
ValueRate: 1.48B / second
Rate: 0.401934 / second
Percentiles: 1%=1.00B; 5%=1.00B; 10%=1.00B; 20%=4.00B; 50%=4.00B; 80%=4.00B; 90%=4.00B; 95%=4.00B; 99%=4.00B
Metric: OutboundData
TotalSamples: 2600
Counter: 991.81MB
ValueRate: 388.96KB / second
Rate: 5.67091 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=8.00B; 80%=8.00B; 90%=8.00B; 95%=776.00KB; 99%=776.00KB
Metric: ReleaseDataHandlesTime
TotalSamples: 14371
Counter: 52s162ms404.457us
ValueRate: 105ms835.234us / second
Rate: 33.6812 / second
Percentiles: 1%=648.843us; 5%=775.081us; 10%=853.557us; 20%=983.229us; 50%=001ms409.490us; 80%=005ms456.062us; 90%=008ms850.969us; 95%=009ms931.433us; 99%=011ms937.007us
Metric: TransferFromServerTime
TotalSamples: 193
Counter: 806ms333.341us
ValueRate: 002ms679.236us / second
Rate: 0.401934 / second
Percentiles: 1%=785.058us; 5%=860.901us; 10%=001ms18.120us; 20%=001ms132.293us; 50%=001ms384.375us; 80%=003ms712.052us; 90%=004ms829.824us; 95%=036ms171.085us; 99%=047ms816.580us
Metric: TransferToServerTime
TotalSamples: 2600
Counter: 06m08s051ms524.490us
ValueRate: 762ms390.134us / second
Rate: 5.67092 / second
Percentiles: 1%=001ms254.353us; 5%=001ms422.269us; 10%=002ms543.415us; 20%=002ms707.501us; 50%=007ms983.762us; 80%=216ms120.506us; 90%=519ms105.204us; 95%=523ms806.250us; 99%=527ms706.163us
Counter: CachedSyncTensors
Value: 960
Counter: CreateCompileHandles
Value: 7
Counter: CreateDataHandles
Value: 365054
Counter: CreateXlaTensor
Value: 2852478
Counter: DestroyDataHandles
Value: 364174
Counter: DestroyXlaTensor
Value: 2851726
Counter: ReleaseDataHandles
Value: 364174
Counter: SyncTensorsToData
Value: 184
Counter: UncachedSyncTensors
Value: 7
Counter: XRTAllocateFromTensor_Empty
Value: 230
Counter: XrtCompile_Empty
Value: 1280
Counter: XrtExecuteChained_Empty
Value: 1280
Counter: XrtExecute_Empty
Value: 1280
Counter: XrtRead_Empty
Value: 1280
Counter: XrtReleaseAllocationHandle_Empty
Value: 1280
Counter: XrtReleaseCompileHandle_Empty
Value: 1280
Counter: XrtSessionCount
Value: 12
Counter: XrtSubTuple_Empty
Value: 1280
Counter: aten::_local_scalar_dense
Value: 193
Validating the subset "valid"
| WARNING: 72 samples have invalid sizes and will be skipped, max_positions=(64, 64), first few sample ids=[1808, 612, 1824, 2185, 1244, 723, 1226, 388, 1791, 378]
validation/ 18:54:22, device xla:1, step 0
validation/ 18:54:24, device xla:1, step 10
validation/ 18:54:26, device xla:1, step 20
validation stats on subset "valid" - 18:54:27
| epoch 022 | valid on 'valid' subset | loss 11.012 | nll_loss 9.984 | ppl 1012.76 | num_updates 484
old learning rate: 5.783845000000001e-05
new learning rate: 6.058790000000001e-05
Metric: CompileTime
TotalSamples: 7
Counter: 03m56s897ms46.438us
ValueRate: 737ms590.273us / second
Rate: 0.0293134 / second
Percentiles: 1%=091ms699.712us; 5%=091ms699.712us; 10%=091ms699.712us; 20%=206ms627.326us; 50%=11s663ms676.532us; 80%=01m11s278ms507.643us; 90%=01m12s293ms561.423us; 95%=01m12s293ms561.423us; 99%=01m12s293ms561.423us
Metric: ExecuteTime
TotalSamples: 990
Counter: 07m04s330ms173.981us
ValueRate: 643ms473.996us / second
Rate: 1.50128 / second
Percentiles: 1%=002ms316.927us; 5%=216ms948.458us; 10%=217ms618.217us; 20%=217ms317.213us; 50%=219ms490.397us; 80%=617ms381.741us; 90%=619ms831.666us; 95%=620ms244.807us; 99%=646ms239.932us
Metric: InboundData
TotalSamples: 197
Counter: 725.00B
ValueRate: 1.49B / second
Rate: 0.405969 / second
Percentiles: 1%=1.00B; 5%=1.00B; 10%=1.00B; 20%=4.00B; 50%=4.00B; 80%=4.00B; 90%=4.00B; 95%=4.00B; 99%=4.00B
Metric: OutboundData
TotalSamples: 2630
Counter: 995.98MB
ValueRate: 403.58KB / second
Rate: 5.66505 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=8.00B; 80%=8.00B; 90%=388.00KB; 95%=776.00KB; 99%=776.00KB
Metric: ReleaseDataHandlesTime
TotalSamples: 14470
Counter: 52s275ms680.590us
ValueRate: 092ms697.984us / second
Rate: 31.365 / second
Percentiles: 1%=660.747us; 5%=778.438us; 10%=855.798us; 20%=986.460us; 50%=001ms335.787us; 80%=005ms906.098us; 90%=007ms431.319us; 95%=009ms787.980us; 99%=011ms734.001us
Metric: TransferFromServerTime
TotalSamples: 197
Counter: 813ms157.020us
ValueRate: 002ms675.717us / second
Rate: 0.405969 / second
Percentiles: 1%=785.058us; 5%=860.901us; 10%=997.327us; 20%=001ms128.042us; 50%=001ms384.375us; 80%=003ms712.052us; 90%=004ms829.824us; 95%=036ms171.085us; 99%=047ms816.580us
Metric: TransferToServerTime
TotalSamples: 2630
Counter: 06m13s970ms30.149us
ValueRate: 767ms673.969us / second
Rate: 5.67529 / second
Percentiles: 1%=001ms254.353us; 5%=001ms422.269us; 10%=002ms547.035us; 20%=002ms726.048us; 50%=007ms110.754us; 80%=216ms722.907us; 90%=519ms581.513us; 95%=523ms620.003us; 99%=527ms519.317us
Counter: CachedSyncTensors
Value: 983
Counter: CreateCompileHandles
Value: 7
Counter: CreateDataHandles
Value: 365332
Counter: CreateXlaTensor
Value: 2877079
Counter: DestroyDataHandles
Value: 364451
Counter: DestroyXlaTensor
Value: 2876327
Counter: ReleaseDataHandles
Value: 364452
Counter: SyncTensorsToData
Value: 184
Counter: UncachedSyncTensors
Value: 7
Counter: XRTAllocateFromTensor_Empty
Value: 230
Counter: XrtCompile_Empty
Value: 1280
Counter: XrtExecuteChained_Empty
Value: 1280
Counter: XrtExecute_Empty
Value: 1280
Counter: XrtRead_Empty
Value: 1280
Counter: XrtReleaseAllocationHandle_Empty
Value: 1280
Counter: XrtReleaseCompileHandle_Empty
Value: 1280
Counter: XrtSessionCount
Value: 12
Counter: XrtSubTuple_Empty
Value: 1280
Counter: aten::_local_scalar_dense
Value: 197
Epoch 23 begin 18:54:27
training/ 18:54:34, device xla:1, step 10, Rate=170.62, Global Rate=195.95
training/ 18:54:41, device xla:1, step 20, Rate=177.81, Global Rate=187.69
Epoch 23 Training stats:
device xla:1
| epoch 023 | loss 10.359 | nll_loss 9.649 | ppl 803.10 | wps 2474 | ups 1 | wpb 3319.773 | bsz 128.000 | num_updates 506 | lr 6.33374e-05 | gnorm 6.244 | clip 0.000 | oom 0.000 | wall 679 | train_wall 334
Epoch 23 Tracker Rates:
Rate=177.85, Global Rate=186.77
Epoch 23 end 18:54:43
Metric: CompileTime
TotalSamples: 7
Counter: 03m56s897ms46.438us
ValueRate: 737ms590.273us / second
Rate: 0.0293134 / second
Percentiles: 1%=091ms699.712us; 5%=091ms699.712us; 10%=091ms699.712us; 20%=206ms627.326us; 50%=11s663ms676.532us; 80%=01m11s278ms507.643us; 90%=01m12s293ms561.423us; 95%=01m12s293ms561.423us; 99%=01m12s293ms561.423us
Metric: ExecuteTime
TotalSamples: 1012
Counter: 07m18s932ms509.370us
ValueRate: 649ms635.083us / second
Rate: 1.49891 / second
Percentiles: 1%=002ms384.020us; 5%=216ms948.975us; 10%=217ms630.661us; 20%=217ms350.715us; 50%=613ms682.177us; 80%=617ms497.848us; 90%=619ms934.726us; 95%=620ms401.254us; 99%=646ms192.311us
Metric: InboundData
TotalSamples: 202
Counter: 745.00B
ValueRate: 1.49B / second
Rate: 0.403193 / second
Percentiles: 1%=1.00B; 5%=1.00B; 10%=1.00B; 20%=4.00B; 50%=4.00B; 80%=4.00B; 90%=4.00B; 95%=4.00B; 99%=4.00B
Metric: OutboundData
TotalSamples: 2719
Counter: 1000.14MB
ValueRate: 388.36KB / second
Rate: 5.66216 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=8.00B; 80%=8.00B; 90%=8.00B; 95%=776.00KB; 99%=776.00KB
Metric: ReleaseDataHandlesTime
TotalSamples: 14998
Counter: 54s921ms40.107us
ValueRate: 095ms171.485us / second
Rate: 32.3948 / second
Percentiles: 1%=668.011us; 5%=774.557us; 10%=862.109us; 20%=991.580us; 50%=001ms459.450us; 80%=005ms369.819us; 90%=008ms156.824us; 95%=009ms198.779us; 99%=011ms815.343us
Metric: TransferFromServerTime
TotalSamples: 202
Counter: 823ms491.477us
ValueRate: 002ms643.694us / second
Rate: 0.403193 / second
Percentiles: 1%=807.358us; 5%=877.012us; 10%=001ms18.120us; 20%=001ms132.293us; 50%=001ms384.375us; 80%=003ms712.052us; 90%=004ms829.824us; 95%=008ms588.023us; 99%=041ms986.188us
Metric: TransferToServerTime
TotalSamples: 2719
Counter: 06m24s013ms667.638us
ValueRate: 763ms254.722us / second
Rate: 5.67856 / second
Percentiles: 1%=001ms254.353us; 5%=001ms411.684us; 10%=002ms544.998us; 20%=002ms722.635us; 50%=007ms68.672us; 80%=216ms972.989us; 90%=519ms105.204us; 95%=523ms813.210us; 99%=527ms980.216us
Counter: CachedSyncTensors
Value: 1005
Counter: CreateCompileHandles
Value: 7
Counter: CreateDataHandles
Value: 381651
Counter: CreateXlaTensor
Value: 2983229
Counter: DestroyDataHandles
Value: 380771
Counter: DestroyXlaTensor
Value: 2982477
Counter: ReleaseDataHandles
Value: 380771
Counter: SyncTensorsToData
Value: 184
Counter: UncachedSyncTensors
Value: 7
Counter: XRTAllocateFromTensor_Empty
Value: 230
Counter: XrtCompile_Empty
Value: 1280
Counter: XrtExecuteChained_Empty
Value: 1280
Counter: XrtExecute_Empty
Value: 1280
Counter: XrtRead_Empty
Value: 1280
Counter: XrtReleaseAllocationHandle_Empty
Value: 1280
Counter: XrtReleaseCompileHandle_Empty
Value: 1280
Counter: XrtSessionCount
Value: 12
Counter: XrtSubTuple_Empty
Value: 1280
Counter: aten::_local_scalar_dense
Value: 202
Validating the subset "valid"
| WARNING: 72 samples have invalid sizes and will be skipped, max_positions=(64, 64), first few sample ids=[1808, 612, 1824, 2185, 1244, 723, 1226, 388, 1791, 378]
validation/ 18:54:43, device xla:1, step 0
validation/ 18:54:45, device xla:1, step 10
validation/ 18:54:47, device xla:1, step 20
validation stats on subset "valid" - 18:54:48
| epoch 023 | valid on 'valid' subset | loss 10.578 | nll_loss 9.600 | ppl 776.14 | num_updates 506
old learning rate: 6.058790000000001e-05
new learning rate: 6.333735000000002e-05
Metric: CompileTime
TotalSamples: 7
Counter: 03m56s897ms46.438us
ValueRate: 737ms590.273us / second
Rate: 0.0293134 / second
Percentiles: 1%=091ms699.712us; 5%=091ms699.712us; 10%=091ms699.712us; 20%=206ms627.326us; 50%=11s663ms676.532us; 80%=01m11s278ms507.643us; 90%=01m12s293ms561.423us; 95%=01m12s293ms561.423us; 99%=01m12s293ms561.423us
Metric: ExecuteTime
TotalSamples: 1035
Counter: 07m23s717ms808.816us
ValueRate: 822ms88.057us / second
Rate: 1.99168 / second
Percentiles: 1%=002ms332.318us; 5%=216ms948.458us; 10%=217ms590.807us; 20%=217ms294.495us; 50%=219ms253.193us; 80%=617ms415.825us; 90%=619ms784.975us; 95%=620ms202.262us; 99%=623ms607.722us
Metric: InboundData
TotalSamples: 206
Counter: 758.00B
ValueRate: 1.50B / second
Rate: 0.40707 / second
Percentiles: 1%=1.00B; 5%=1.00B; 10%=1.00B; 20%=4.00B; 50%=4.00B; 80%=4.00B; 90%=4.00B; 95%=4.00B; 99%=4.00B
Metric: OutboundData
TotalSamples: 2748
Counter: 1004.31MB
ValueRate: 403.84KB / second
Rate: 5.66872 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=8.00B; 80%=8.00B; 90%=388.00KB; 95%=776.00KB; 99%=776.00KB
Metric: ReleaseDataHandlesTime
TotalSamples: 15092
Counter: 54s035ms914.477us
ValueRate: 083ms982.041us / second
Rate: 30.2706 / second
Percentiles: 1%=680.215us; 5%=783.649us; 10%=876.863us; 20%=995.329us; 50%=001ms376.586us; 80%=004ms471.726us; 90%=008ms861.408us; 95%=009ms934.776us; 99%=011ms648.261us
Metric: TransferFromServerTime
TotalSamples: 206
Counter: 830ms557.339us
ValueRate: 002ms639.260us / second
Rate: 0.40707 / second
Percentiles: 1%=807.358us; 5%=877.012us; 10%=997.327us; 20%=001ms132.293us; 50%=001ms369.974us; 80%=003ms664.984us; 90%=004ms829.824us; 95%=008ms588.023us; 99%=041ms986.188us
Metric: TransferToServerTime
TotalSamples: 2748
Counter: 06m29s985ms403.502us
ValueRate: 766ms536.413us / second
Rate: 5.66262 / second
Percentiles: 1%=001ms254.353us; 5%=001ms411.684us; 10%=002ms547.137us; 20%=002ms726.048us; 50%=007ms151.826us; 80%=216ms722.907us; 90%=519ms569.274us; 95%=522ms335.988us; 99%=527ms910.451us
Counter: CachedSyncTensors
Value: 1028
Counter: CreateCompileHandles
Value: 7
Counter: CreateDataHandles
Value: 381928
Counter: CreateXlaTensor
Value: 3007830
Counter: DestroyDataHandles
Value: 381048
Counter: DestroyXlaTensor
Value: 3007079
Counter: ReleaseDataHandles
Value: 381049
Counter: SyncTensorsToData
Value: 184
Counter: UncachedSyncTensors
Value: 7
Counter: XRTAllocateFromTensor_Empty
Value: 230
Counter: XrtCompile_Empty
Value: 1280
Counter: XrtExecuteChained_Empty
Value: 1280
Counter: XrtExecute_Empty
Value: 1280
Counter: XrtRead_Empty
Value: 1280
Counter: XrtReleaseAllocationHandle_Empty
Value: 1280
Counter: XrtReleaseCompileHandle_Empty
Value: 1280
Counter: XrtSessionCount
Value: 12
Counter: XrtSubTuple_Empty
Value: 1280
Counter: aten::_local_scalar_dense
Value: 206
Epoch 24 begin 18:54:48
training/ 18:54:54, device xla:1, step 10, Rate=172.11, Global Rate=196.15
training/ 18:55:01, device xla:1, step 20, Rate=177.95, Global Rate=187.03
Epoch 24 Training stats:
device xla:1
| epoch 024 | loss 10.158 | nll_loss 9.421 | ppl 685.45 | wps 2505 | ups 1 | wpb 3319.773 | bsz 128.000 | num_updates 528 | lr 6.60868e-05 | gnorm 6.423 | clip 0.000 | oom 0.000 | wall 700 | train_wall 347
Epoch 24 Tracker Rates:
Rate=178.73, Global Rate=186.38
Epoch 24 end 18:55:04
Metric: CompileTime
TotalSamples: 7
Counter: 03m56s897ms46.438us
ValueRate: 737ms590.273us / second
Rate: 0.0293134 / second
Percentiles: 1%=091ms699.712us; 5%=091ms699.712us; 10%=091ms699.712us; 20%=206ms627.326us; 50%=11s663ms676.532us; 80%=01m11s278ms507.643us; 90%=01m12s293ms561.423us; 95%=01m12s293ms561.423us; 99%=01m12s293ms561.423us
Metric: ExecuteTime
TotalSamples: 1057
Counter: 08m36s277ms111.307us
ValueRate: 874ms236.587us / second
Rate: 2.1297 / second
Percentiles: 1%=002ms332.318us; 5%=216ms948.975us; 10%=217ms630.661us; 20%=217ms320.349us; 50%=221ms690.555us; 80%=617ms407.815us; 90%=619ms699.820us; 95%=620ms945.091us; 99%=622ms752.701us
Metric: InboundData
TotalSamples: 211
Counter: 778.00B
ValueRate: 1.49B / second
Rate: 0.404347 / second
Percentiles: 1%=1.00B; 5%=1.00B; 10%=1.00B; 20%=4.00B; 50%=4.00B; 80%=4.00B; 90%=4.00B; 95%=4.00B; 99%=4.00B
Metric: OutboundData
TotalSamples: 2836
Counter: 1008.48MB
ValueRate: 388.69KB / second
Rate: 5.66697 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=8.00B; 80%=8.00B; 90%=8.00B; 95%=776.00KB; 99%=776.00KB
Metric: ReleaseDataHandlesTime
TotalSamples: 15618
Counter: 56s731ms375.330us
ValueRate: 094ms81.582us / second
Rate: 31.7957 / second
Percentiles: 1%=640.890us; 5%=767.553us; 10%=876.025us; 20%=001ms2.708us; 50%=001ms484.388us; 80%=005ms124.279us; 90%=008ms987.867us; 95%=009ms354.390us; 99%=011ms124.758us
Metric: TransferFromServerTime
TotalSamples: 211
Counter: 839ms463.551us
ValueRate: 002ms608.693us / second
Rate: 0.404347 / second
Percentiles: 1%=807.358us; 5%=877.012us; 10%=001ms18.120us; 20%=001ms132.293us; 50%=001ms369.974us; 80%=003ms664.984us; 90%=004ms829.824us; 95%=008ms588.023us; 99%=041ms986.188us
Metric: TransferToServerTime
TotalSamples: 2836
Counter: 07m41s542ms554.199us
ValueRate: 765ms703.681us / second
Rate: 5.66714 / second
Percentiles: 1%=001ms279.910us; 5%=001ms422.269us; 10%=002ms547.137us; 20%=002ms722.289us; 50%=007ms58.766us; 80%=216ms12.894us; 90%=519ms93.484us; 95%=522ms468.692us; 99%=527ms980.216us
Counter: CachedSyncTensors
Value: 1050
Counter: CreateCompileHandles
Value: 7
Counter: CreateDataHandles
Value: 398246
Counter: CreateXlaTensor
Value: 3113980
Counter: DestroyDataHandles
Value: 397367
Counter: DestroyXlaTensor
Value: 3113229
Counter: ReleaseDataHandles
Value: 397367
Counter: SyncTensorsToData
Value: 184
Counter: UncachedSyncTensors
Value: 7
Counter: XRTAllocateFromTensor_Empty
Value: 230
Counter: XrtCompile_Empty
Value: 1280
Counter: XrtExecuteChained_Empty
Value: 1280
Counter: XrtExecute_Empty
Value: 1280
Counter: XrtRead_Empty
Value: 1280
Counter: XrtReleaseAllocationHandle_Empty
Value: 1280
Counter: XrtReleaseCompileHandle_Empty
Value: 1280
Counter: XrtSessionCount
Value: 12
Counter: XrtSubTuple_Empty
Value: 1280
Counter: aten::_local_scalar_dense
Value: 211
Validating the subset "valid"
| WARNING: 72 samples have invalid sizes and will be skipped, max_positions=(64, 64), first few sample ids=[1808, 612, 1824, 2185, 1244, 723, 1226, 388, 1791, 378]
validation/ 18:55:04, device xla:1, step 0
validation/ 18:55:06, device xla:1, step 10
validation/ 18:55:08, device xla:1, step 20
validation stats on subset "valid" - 18:55:09
| epoch 024 | valid on 'valid' subset | loss 10.442 | nll_loss 9.456 | ppl 702.09 | num_updates 528
old learning rate: 6.333735000000002e-05
new learning rate: 6.60868e-05
Metric: CompileTime
TotalSamples: 7
Counter: 03m56s897ms46.438us
ValueRate: 737ms590.273us / second
Rate: 0.0293134 / second
Percentiles: 1%=091ms699.712us; 5%=091ms699.712us; 10%=091ms699.712us; 20%=206ms627.326us; 50%=11s663ms676.532us; 80%=01m11s278ms507.643us; 90%=01m12s293ms561.423us; 95%=01m12s293ms561.423us; 99%=01m12s293ms561.423us
Metric: ExecuteTime
TotalSamples: 1080
Counter: 08m41s049ms599.905us
ValueRate: 876ms800.462us / second
Rate: 2.15719 / second
Percentiles: 1%=002ms316.927us; 5%=216ms987.439us; 10%=217ms639.892us; 20%=217ms294.495us; 50%=219ms75.287us; 80%=617ms388.057us; 90%=619ms636.655us; 95%=620ms934.552us; 99%=622ms734.815us
Metric: InboundData
TotalSamples: 215
Counter: 791.00B
ValueRate: 1.50B / second
Rate: 0.408073 / second
Percentiles: 1%=1.00B; 5%=1.00B; 10%=1.00B; 20%=4.00B; 50%=4.00B; 80%=4.00B; 90%=4.00B; 95%=4.00B; 99%=4.00B
Metric: OutboundData
TotalSamples: 2864
Counter: 1012.65MB
ValueRate: 407.32KB / second
Rate: 5.65736 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=8.00B; 80%=8.00B; 90%=388.00KB; 95%=776.00KB; 99%=776.00KB
Metric: ReleaseDataHandlesTime
TotalSamples: 15719
Counter: 56s849ms302.972us
ValueRate: 081ms934.528us / second
Rate: 29.2201 / second
Percentiles: 1%=641.138us; 5%=770.831us; 10%=876.999us; 20%=001ms9.139us; 50%=001ms399.079us; 80%=004ms468.053us; 90%=008ms692.353us; 95%=009ms276.241us; 99%=011ms28.312us
Metric: TransferFromServerTime
TotalSamples: 215
Counter: 845ms304.078us
ValueRate: 002ms604.397us / second
Rate: 0.408073 / second
Percentiles: 1%=807.358us; 5%=860.901us; 10%=997.327us; 20%=001ms128.042us; 50%=001ms369.903us; 80%=003ms712.052us; 90%=004ms829.824us; 95%=008ms588.023us; 99%=041ms986.188us
Metric: TransferToServerTime
TotalSamples: 2864
Counter: 07m45s454ms883.154us
ValueRate: 770ms641.109us / second
Rate: 5.65133 / second
Percentiles: 1%=001ms279.910us; 5%=001ms422.269us; 10%=002ms547.137us; 20%=002ms727.058us; 50%=007ms158.095us; 80%=216ms513.652us; 90%=519ms849.810us; 95%=522ms449.923us; 99%=527ms980.216us
Counter: CachedSyncTensors
Value: 1073
Counter: CreateCompileHandles
Value: 7
Counter: CreateDataHandles
Value: 398522
Counter: CreateXlaTensor
Value: 3138581
Counter: DestroyDataHandles
Value: 397641
Counter: DestroyXlaTensor
Value: 3137830
Counter: ReleaseDataHandles
Value: 397643
Counter: SyncTensorsToData
Value: 184
Counter: UncachedSyncTensors
Value: 7
Counter: XRTAllocateFromTensor_Empty
Value: 230
Counter: XrtCompile_Empty
Value: 1280
Counter: XrtExecuteChained_Empty
Value: 1280
Counter: XrtExecute_Empty
Value: 1280
Counter: XrtRead_Empty
Value: 1280
Counter: XrtReleaseAllocationHandle_Empty
Value: 1280
Counter: XrtReleaseCompileHandle_Empty
Value: 1280
Counter: XrtSessionCount
Value: 12
Counter: XrtSubTuple_Empty
Value: 1280
Counter: aten::_local_scalar_dense
Value: 215
Epoch 25 begin 18:55:09
training/ 18:55:15, device xla:1, step 10, Rate=169.60, Global Rate=194.83
training/ 18:55:22, device xla:1, step 20, Rate=177.09, Global Rate=186.81
Epoch 25 Training stats:
device xla:1
| epoch 025 | loss 9.951 | nll_loss 9.187 | ppl 582.98 | wps 2534 | ups 1 | wpb 3319.773 | bsz 128.000 | num_updates 550 | lr 6.88363e-05 | gnorm 6.537 | clip 0.000 | oom 0.000 | wall 721 | train_wall 361
Epoch 25 Tracker Rates:
Rate=176.86, Global Rate=185.82
Epoch 25 end 18:55:24
Metric: CompileTime
TotalSamples: 7
Counter: 03m56s897ms46.438us
ValueRate: 737ms590.273us / second
Rate: 0.0293134 / second
Percentiles: 1%=091ms699.712us; 5%=091ms699.712us; 10%=091ms699.712us; 20%=206ms627.326us; 50%=11s663ms676.532us; 80%=01m11s278ms507.643us; 90%=01m12s293ms561.423us; 95%=01m12s293ms561.423us; 99%=01m12s293ms561.423us
Metric: ExecuteTime
TotalSamples: 1102
Counter: 08m55s620ms485.995us
ValueRate: 874ms394.521us / second
Rate: 2.13105 / second
Percentiles: 1%=002ms316.927us; 5%=216ms20.961us; 10%=217ms654.643us; 20%=217ms339.626us; 50%=221ms690.555us; 80%=617ms497.350us; 90%=619ms744.182us; 95%=620ms945.091us; 99%=622ms734.815us
Metric: InboundData
TotalSamples: 220
Counter: 811.00B
ValueRate: 1.49B / second
Rate: 0.405394 / second
Percentiles: 1%=1.00B; 5%=1.00B; 10%=1.00B; 20%=4.00B; 50%=4.00B; 80%=4.00B; 90%=4.00B; 95%=4.00B; 99%=4.00B
Metric: OutboundData
TotalSamples: 2952
Counter: 1016.82MB
ValueRate: 391.85KB / second
Rate: 5.65065 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=8.00B; 80%=8.00B; 90%=8.00B; 95%=776.00KB; 99%=776.00KB
Metric: ReleaseDataHandlesTime
TotalSamples: 16282
Counter: 58s652ms590.733us
ValueRate: 098ms333.703us / second
Rate: 32.5227 / second
Percentiles: 1%=624.651us; 5%=726.046us; 10%=837.856us; 20%=976.634us; 50%=001ms434.884us; 80%=006ms507.237us; 90%=008ms92.041us; 95%=009ms461.264us; 99%=011ms109.183us
Metric: TransferFromServerTime
TotalSamples: 220
Counter: 852ms393.992us
ValueRate: 002ms570.705us / second
Rate: 0.405394 / second
Percentiles: 1%=807.358us; 5%=877.012us; 10%=001ms14.648us; 20%=001ms128.042us; 50%=001ms369.856us; 80%=003ms664.984us; 90%=004ms829.824us; 95%=008ms588.023us; 99%=041ms986.188us
Metric: TransferToServerTime
TotalSamples: 2952
Counter: 07m56s426ms517.733us
ValueRate: 765ms31.749us / second
Rate: 5.65071 / second
Percentiles: 1%=001ms279.910us; 5%=001ms431.185us; 10%=002ms549.652us; 20%=002ms722.289us; 50%=007ms51.864us; 80%=216ms901.493us; 90%=519ms105.204us; 95%=522ms468.692us; 99%=528ms799.967us
Counter: CachedSyncTensors
Value: 1095
Counter: CreateCompileHandles
Value: 7
Counter: CreateDataHandles
Value: 414840
Counter: CreateXlaTensor
Value: 3244731
Counter: DestroyDataHandles
Value: 413961
Counter: DestroyXlaTensor
Value: 3243980
Counter: ReleaseDataHandles
Value: 413961
Counter: SyncTensorsToData
Value: 184
Counter: UncachedSyncTensors
Value: 7
Counter: XRTAllocateFromTensor_Empty
Value: 230
Counter: XrtCompile_Empty
Value: 1280
Counter: XrtExecuteChained_Empty
Value: 1280
Counter: XrtExecute_Empty
Value: 1280
Counter: XrtRead_Empty
Value: 1280
Counter: XrtReleaseAllocationHandle_Empty
Value: 1280
Counter: XrtReleaseCompileHandle_Empty
Value: 1280
Counter: XrtSessionCount
Value: 12
Counter: XrtSubTuple_Empty
Value: 1280
Counter: aten::_local_scalar_dense
Value: 220
Validating the subset "valid"
| WARNING: 72 samples have invalid sizes and will be skipped, max_positions=(64, 64), first few sample ids=[1808, 612, 1824, 2185, 1244, 723, 1226, 388, 1791, 378]
validation/ 18:55:24, device xla:1, step 0
validation/ 18:55:27, device xla:1, step 10
validation/ 18:55:29, device xla:1, step 20
validation stats on subset "valid" - 18:55:29
| epoch 025 | valid on 'valid' subset | loss 10.506 | nll_loss 9.499 | ppl 723.79 | num_updates 550
old learning rate: 6.60868e-05
new learning rate: 6.883625000000001e-05
Metric: CompileTime
TotalSamples: 7
Counter: 03m56s897ms46.438us
ValueRate: 737ms590.273us / second
Rate: 0.0293134 / second
Percentiles: 1%=091ms699.712us; 5%=091ms699.712us; 10%=091ms699.712us; 20%=206ms627.326us; 50%=11s663ms676.532us; 80%=01m11s278ms507.643us; 90%=01m12s293ms561.423us; 95%=01m12s293ms561.423us; 99%=01m12s293ms561.423us
Metric: ExecuteTime
TotalSamples: 1125
Counter: 08m59s393ms836.031us
ValueRate: 876ms394.897us / second
Rate: 2.15851 / second
Percentiles: 1%=002ms316.927us; 5%=216ms982.426us; 10%=217ms594.216us; 20%=217ms278.866us; 50%=219ms79.744us; 80%=617ms495.486us; 90%=619ms744.182us; 95%=620ms945.091us; 99%=622ms734.815us
Metric: InboundData
TotalSamples: 224
Counter: 824.00B
ValueRate: 1.50B / second
Rate: 0.408957 / second
Percentiles: 1%=1.00B; 5%=1.00B; 10%=1.00B; 20%=4.00B; 50%=4.00B; 80%=4.00B; 90%=4.00B; 95%=4.00B; 99%=4.00B
Metric: OutboundData
TotalSamples: 2982
Counter: 1020.99MB
ValueRate: 407.61KB / second
Rate: 5.66145 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=8.00B; 80%=8.00B; 90%=388.00KB; 95%=776.00KB; 99%=776.00KB
Metric: ReleaseDataHandlesTime
TotalSamples: 16376
Counter: 58s768ms702.921us
ValueRate: 086ms257.503us / second
Rate: 30.9518 / second
Percentiles: 1%=628.510us; 5%=730.633us; 10%=847.663us; 20%=982.724us; 50%=001ms368.768us; 80%=005ms887.928us; 90%=008ms859.577us; 95%=009ms213.267us; 99%=011ms100.835us
Metric: TransferFromServerTime
TotalSamples: 224
Counter: 859ms14.249us
ValueRate: 002ms568.304us / second
Rate: 0.408957 / second
Percentiles: 1%=807.358us; 5%=877.012us; 10%=001ms14.648us; 20%=001ms123.852us; 50%=001ms369.856us; 80%=003ms664.984us; 90%=004ms543.772us; 95%=006ms413.828us; 99%=041ms986.188us
Metric: TransferToServerTime
TotalSamples: 2982
Counter: 07m01s375ms740.660us
ValueRate: 770ms314.321us / second
Rate: 5.65525 / second
Percentiles: 1%=001ms279.910us; 5%=001ms433.332us; 10%=002ms551.406us; 20%=002ms727.457us; 50%=007ms158.095us; 80%=215ms164.198us; 90%=519ms42.867us; 95%=522ms414.219us; 99%=527ms980.216us
Counter: CachedSyncTensors
Value: 1118
Counter: CreateCompileHandles
Value: 7
Counter: CreateDataHandles
Value: 415118
Counter: CreateXlaTensor
Value: 3269332
Counter: DestroyDataHandles
Value: 414237
Counter: DestroyXlaTensor
Value: 3268580
Counter: ReleaseDataHandles
Value: 414238
Counter: SyncTensorsToData
Value: 184
Counter: UncachedSyncTensors
Value: 7
Counter: XRTAllocateFromTensor_Empty
Value: 230
Counter: XrtCompile_Empty
Value: 1280
Counter: XrtExecuteChained_Empty
Value: 1280
Counter: XrtExecute_Empty
Value: 1280
Counter: XrtRead_Empty
Value: 1280
Counter: XrtReleaseAllocationHandle_Empty
Value: 1280
Counter: XrtReleaseCompileHandle_Empty
Value: 1280
Counter: XrtSessionCount
Value: 12
Counter: XrtSubTuple_Empty
Value: 1280
Counter: aten::_local_scalar_dense
Value: 224
Epoch 26 begin 18:55:29
training/ 18:55:36, device xla:1, step 10, Rate=171.21, Global Rate=194.78
training/ 18:55:43, device xla:1, step 20, Rate=178.74, Global Rate=187.19
Epoch 26 Training stats:
device xla:1
| epoch 026 | loss 9.751 | nll_loss 8.960 | ppl 498.01 | wps 2561 | ups 1 | wpb 3319.773 | bsz 128.000 | num_updates 572 | lr 7.15857e-05 | gnorm 6.660 | clip 0.000 | oom 0.000 | wall 741 | train_wall 375
Epoch 26 Tracker Rates:
Rate=179.21, Global Rate=186.51
Epoch 26 end 18:55:45
Metric: CompileTime
TotalSamples: 7
Counter: 03m56s897ms46.438us
ValueRate: 737ms590.273us / second
Rate: 0.0293134 / second
Percentiles: 1%=091ms699.712us; 5%=091ms699.712us; 10%=091ms699.712us; 20%=206ms627.326us; 50%=11s663ms676.532us; 80%=01m11s278ms507.643us; 90%=01m12s293ms561.423us; 95%=01m12s293ms561.423us; 99%=01m12s293ms561.423us
Metric: ExecuteTime
TotalSamples: 1147
Counter: 08m13s960ms933.633us
ValueRate: 875ms230.349us / second
Rate: 2.13298 / second
Percentiles: 1%=002ms316.927us; 5%=216ms987.439us; 10%=217ms639.892us; 20%=217ms320.349us; 50%=221ms690.555us; 80%=618ms541.287us; 90%=619ms784.975us; 95%=620ms971.459us; 99%=622ms734.815us
Metric: InboundData
TotalSamples: 229
Counter: 844.00B
ValueRate: 1.50B / second
Rate: 0.406389 / second
Percentiles: 1%=1.00B; 5%=1.00B; 10%=1.00B; 20%=4.00B; 50%=4.00B; 80%=4.00B; 90%=4.00B; 95%=4.00B; 99%=4.00B
Metric: OutboundData
TotalSamples: 3071
Counter: 1.00GB
ValueRate: 388.43KB / second
Rate: 5.66325 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=8.00B; 80%=8.00B; 90%=8.00B; 95%=776.00KB; 99%=776.00KB
Metric: ReleaseDataHandlesTime
TotalSamples: 16944
Counter: 60s543ms902.895us
ValueRate: 100ms121.012us / second
Rate: 34.0011 / second
Percentiles: 1%=608.519us; 5%=724.818us; 10%=833.389us; 20%=988.271us; 50%=001ms451.969us; 80%=005ms226.609us; 90%=008ms55.677us; 95%=009ms238.293us; 99%=011ms597.524us
Metric: TransferFromServerTime
TotalSamples: 229
Counter: 876ms708.519us
ValueRate: 002ms554.053us / second
Rate: 0.406389 / second
Percentiles: 1%=807.358us; 5%=877.012us; 10%=001ms14.648us; 20%=001ms128.042us; 50%=001ms369.856us; 80%=003ms712.052us; 90%=004ms898.422us; 95%=008ms588.023us; 99%=041ms986.188us
Metric: TransferToServerTime
TotalSamples: 3071
Counter: 07m12s423ms91.031us
ValueRate: 769ms144.457us / second
Rate: 5.67727 / second
Percentiles: 1%=001ms222.465us; 5%=001ms432.409us; 10%=002ms548.857us; 20%=002ms707.224us; 50%=007ms37.786us; 80%=216ms752.150us; 90%=519ms198.661us; 95%=523ms813.210us; 99%=528ms799.967us
Counter: CachedSyncTensors
Value: 1140
Counter: CreateCompileHandles
Value: 7
Counter: CreateDataHandles
Value: 431437
Counter: CreateXlaTensor
Value: 3375482
Counter: DestroyDataHandles
Value: 430557
Counter: DestroyXlaTensor
Value: 3374730
Counter: ReleaseDataHandles
Value: 430557
Counter: SyncTensorsToData
Value: 184
Counter: UncachedSyncTensors
Value: 7
Counter: XRTAllocateFromTensor_Empty
Value: 230
Counter: XrtCompile_Empty
Value: 1280
Counter: XrtExecuteChained_Empty
Value: 1280
Counter: XrtExecute_Empty
Value: 1280
Counter: XrtRead_Empty
Value: 1280
Counter: XrtReleaseAllocationHandle_Empty
Value: 1280
Counter: XrtReleaseCompileHandle_Empty
Value: 1280
Counter: XrtSessionCount
Value: 12
Counter: XrtSubTuple_Empty
Value: 1280
Counter: aten::_local_scalar_dense
Value: 229
Validating the subset "valid"
| WARNING: 72 samples have invalid sizes and will be skipped, max_positions=(64, 64), first few sample ids=[1808, 612, 1824, 2185, 1244, 723, 1226, 388, 1791, 378]
validation/ 18:55:45, device xla:1, step 0
validation/ 18:55:47, device xla:1, step 10
validation/ 18:55:50, device xla:1, step 20
validation stats on subset "valid" - 18:55:50
| epoch 026 | valid on 'valid' subset | loss 10.596 | nll_loss 9.570 | ppl 759.84 | num_updates 572
old learning rate: 6.883625000000001e-05
new learning rate: 7.158570000000001e-05
Metric: CompileTime
TotalSamples: 7
Counter: 03m56s897ms46.438us
ValueRate: 737ms590.273us / second
Rate: 0.0293134 / second
Percentiles: 1%=091ms699.712us; 5%=091ms699.712us; 10%=091ms699.712us; 20%=206ms627.326us; 50%=11s663ms676.532us; 80%=01m11s278ms507.643us; 90%=01m12s293ms561.423us; 95%=01m12s293ms561.423us; 99%=01m12s293ms561.423us
Metric: ExecuteTime
TotalSamples: 1170
Counter: 08m18s731ms913.319us
ValueRate: 877ms979.187us / second
Rate: 2.15982 / second
Percentiles: 1%=002ms301.524us; 5%=216ms982.426us; 10%=217ms601.459us; 20%=217ms285.109us; 50%=219ms79.744us; 80%=618ms517.320us; 90%=619ms774.216us; 95%=620ms945.091us; 99%=622ms732.126us
Metric: InboundData
TotalSamples: 233
Counter: 857.00B
ValueRate: 1.51B / second
Rate: 0.409809 / second
Percentiles: 1%=1.00B; 5%=1.00B; 10%=1.00B; 20%=4.00B; 50%=4.00B; 80%=4.00B; 90%=4.00B; 95%=4.00B; 99%=4.00B
Metric: OutboundData
TotalSamples: 3099
Counter: 1.01GB
ValueRate: 406.93KB / second
Rate: 5.65196 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=8.00B; 80%=8.00B; 90%=388.00KB; 95%=776.00KB; 99%=776.00KB
Metric: ReleaseDataHandlesTime
TotalSamples: 17041
Counter: 60s653ms461.916us
ValueRate: 087ms500.852us / second
Rate: 31.6511 / second
Percentiles: 1%=617.638us; 5%=743.600us; 10%=842.392us; 20%=984.673us; 50%=001ms380.863us; 80%=005ms658.451us; 90%=008ms603.397us; 95%=009ms117.115us; 99%=011ms570.472us
Metric: TransferFromServerTime
TotalSamples: 233
Counter: 881ms54.057us
ValueRate: 002ms549.631us / second
Rate: 0.409809 / second
Percentiles: 1%=807.358us; 5%=864.616us; 10%=997.327us; 20%=001ms123.852us; 50%=001ms357.165us; 80%=003ms664.984us; 90%=004ms829.824us; 95%=008ms588.023us; 99%=041ms986.188us
Metric: TransferToServerTime
TotalSamples: 3099
Counter: 07m17s326ms132.920us
ValueRate: 774ms970.469us / second
Rate: 5.6621 / second
Percentiles: 1%=001ms246.583us; 5%=001ms434.772us; 10%=002ms552.496us; 20%=002ms727.058us; 50%=007ms150.705us; 80%=215ms161.913us; 90%=519ms42.867us; 95%=522ms449.923us; 99%=528ms799.967us
Counter: CachedSyncTensors
Value: 1163
Counter: CreateCompileHandles
Value: 7
Counter: CreateDataHandles
Value: 431713
Counter: CreateXlaTensor
Value: 3400083
Counter: DestroyDataHandles
Value: 430832
Counter: DestroyXlaTensor
Value: 3399331
Counter: ReleaseDataHandles
Value: 430833
Counter: SyncTensorsToData
Value: 184
Counter: UncachedSyncTensors
Value: 7
Counter: XRTAllocateFromTensor_Empty
Value: 230
Counter: XrtCompile_Empty
Value: 1280
Counter: XrtExecuteChained_Empty
Value: 1280
Counter: XrtExecute_Empty
Value: 1280
Counter: XrtRead_Empty
Value: 1280
Counter: XrtReleaseAllocationHandle_Empty
Value: 1280
Counter: XrtReleaseCompileHandle_Empty
Value: 1280
Counter: XrtSessionCount
Value: 12
Counter: XrtSubTuple_Empty
Value: 1280
Counter: aten::_local_scalar_dense
Value: 233
Epoch 27 begin 18:55:50
training/ 18:55:57, device xla:1, step 10, Rate=173.45, Global Rate=197.22
training/ 18:56:04, device xla:1, step 20, Rate=179.86, Global Rate=189.40
Epoch 27 Training stats:
device xla:1
| epoch 027 | loss 9.553 | nll_loss 8.736 | ppl 426.24 | wps 2588 | ups 1 | wpb 3319.773 | bsz 128.000 | num_updates 594 | lr 7.43352e-05 | gnorm 6.763 | clip 0.000 | oom 0.000 | wall 762 | train_wall 389
Epoch 27 Tracker Rates:
Rate=179.89, Global Rate=188.51
Epoch 27 end 18:56:06
Metric: CompileTime
TotalSamples: 7
Counter: 03m56s897ms46.438us
ValueRate: 737ms590.273us / second
Rate: 0.0293134 / second
Percentiles: 1%=091ms699.712us; 5%=091ms699.712us; 10%=091ms699.712us; 20%=206ms627.326us; 50%=11s663ms676.532us; 80%=01m11s278ms507.643us; 90%=01m12s293ms561.423us; 95%=01m12s293ms561.423us; 99%=01m12s293ms561.423us
Metric: ExecuteTime
TotalSamples: 1192
Counter: 09m31s321ms866.530us
ValueRate: 876ms989.047us / second
Rate: 2.13461 / second
Percentiles: 1%=002ms301.524us; 5%=216ms20.961us; 10%=217ms619.208us; 20%=217ms350.715us; 50%=220ms823.010us; 80%=618ms618.646us; 90%=619ms858.864us; 95%=620ms122.883us; 99%=622ms732.126us
Metric: InboundData
TotalSamples: 238
Counter: 877.00B
ValueRate: 1.50B / second
Rate: 0.407424 / second
Percentiles: 1%=1.00B; 5%=1.00B; 10%=1.00B; 20%=4.00B; 50%=4.00B; 80%=4.00B; 90%=4.00B; 95%=4.00B; 99%=4.00B
Metric: OutboundData
TotalSamples: 3187
Counter: 1.01GB
ValueRate: 391.91KB / second
Rate: 5.65152 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=8.00B; 80%=8.00B; 90%=8.00B; 95%=776.00KB; 99%=776.00KB
Metric: ReleaseDataHandlesTime
TotalSamples: 17561
Counter: 01m01s370ms42.028us
ValueRate: 098ms130.565us / second
Rate: 32.0263 / second
Percentiles: 1%=589.281us; 5%=728.558us; 10%=841.297us; 20%=001ms10.744us; 50%=001ms490.128us; 80%=006ms566.644us; 90%=008ms229.750us; 95%=010ms580.871us; 99%=011ms97.745us
Metric: TransferFromServerTime
TotalSamples: 238
Counter: 891ms777.283us
ValueRate: 002ms524.891us / second
Rate: 0.407424 / second
Percentiles: 1%=785.058us; 5%=851.507us; 10%=960.713us; 20%=001ms121.827us; 50%=001ms357.165us; 80%=003ms664.984us; 90%=004ms898.422us; 95%=008ms588.023us; 99%=041ms986.188us
Metric: TransferToServerTime
TotalSamples: 3187
Counter: 07m29s920ms838.523us
ValueRate: 771ms949.435us / second
Rate: 5.65153 / second
Percentiles: 1%=001ms222.465us; 5%=001ms433.798us; 10%=002ms547.137us; 20%=002ms707.224us; 50%=007ms23.125us; 80%=216ms589.328us; 90%=519ms93.484us; 95%=523ms967.085us; 99%=529ms620.605us
Counter: CachedSyncTensors
Value: 1185
Counter: CreateCompileHandles
Value: 7
Counter: CreateDataHandles
Value: 448031
Counter: CreateXlaTensor
Value: 3506233
Counter: DestroyDataHandles
Value: 447151
Counter: DestroyXlaTensor
Value: 3505481
Counter: ReleaseDataHandles
Value: 447151
Counter: SyncTensorsToData
Value: 184
Counter: UncachedSyncTensors
Value: 7
Counter: XRTAllocateFromTensor_Empty
Value: 230
Counter: XrtCompile_Empty
Value: 1280
Counter: XrtExecuteChained_Empty
Value: 1280
Counter: XrtExecute_Empty
Value: 1280
Counter: XrtRead_Empty
Value: 1280
Counter: XrtReleaseAllocationHandle_Empty
Value: 1280
Counter: XrtReleaseCompileHandle_Empty
Value: 1280
Counter: XrtSessionCount
Value: 12
Counter: XrtSubTuple_Empty
Value: 1280
Counter: aten::_local_scalar_dense
Value: 238
Validating the subset "valid"
| WARNING: 72 samples have invalid sizes and will be skipped, max_positions=(64, 64), first few sample ids=[1808, 612, 1824, 2185, 1244, 723, 1226, 388, 1791, 378]
validation/ 18:56:06, device xla:1, step 0
validation/ 18:56:08, device xla:1, step 10
validation/ 18:56:10, device xla:1, step 20
validation stats on subset "valid" - 18:56:11
| epoch 027 | valid on 'valid' subset | loss 10.560 | nll_loss 9.467 | ppl 707.48 | num_updates 594
old learning rate: 7.158570000000001e-05
new learning rate: 7.433515000000001e-05
Metric: CompileTime
TotalSamples: 7
Counter: 03m56s897ms46.438us
ValueRate: 737ms590.273us / second
Rate: 0.0293134 / second
Percentiles: 1%=091ms699.712us; 5%=091ms699.712us; 10%=091ms699.712us; 20%=206ms627.326us; 50%=11s663ms676.532us; 80%=01m11s278ms507.643us; 90%=01m12s293ms561.423us; 95%=01m12s293ms561.423us; 99%=01m12s293ms561.423us
Metric: ExecuteTime
TotalSamples: 1215
Counter: 09m36s084ms963.233us
ValueRate: 878ms615.117us / second
Rate: 2.16127 / second
Percentiles: 1%=002ms274.850us; 5%=216ms948.458us; 10%=217ms563.635us; 20%=217ms294.495us; 50%=219ms75.287us; 80%=618ms579.077us; 90%=619ms831.666us; 95%=620ms118.734us; 99%=622ms732.126us
Metric: InboundData
TotalSamples: 242
Counter: 890.00B
ValueRate: 1.51B / second
Rate: 0.410733 / second
Percentiles: 1%=1.00B; 5%=1.00B; 10%=1.00B; 20%=4.00B; 50%=4.00B; 80%=4.00B; 90%=4.00B; 95%=4.00B; 99%=4.00B
Metric: OutboundData
TotalSamples: 3217
Counter: 1.01GB
ValueRate: 407.32KB / second
Rate: 5.6574 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=8.00B; 80%=8.00B; 90%=388.00KB; 95%=776.00KB; 99%=776.00KB
Metric: ReleaseDataHandlesTime
TotalSamples: 17654
Counter: 01m01s478ms238.491us
ValueRate: 085ms14.853us / second
Rate: 29.9455 / second
Percentiles: 1%=592.922us; 5%=749.532us; 10%=853.658us; 20%=001ms8.542us; 50%=001ms408.520us; 80%=005ms921.926us; 90%=008ms17.640us; 95%=009ms372.025us; 99%=011ms79.264us
Metric: TransferFromServerTime
TotalSamples: 242
Counter: 897ms432.246us
ValueRate: 002ms523.161us / second
Rate: 0.410733 / second
Percentiles: 1%=785.058us; 5%=860.901us; 10%=967.728us; 20%=001ms122.360us; 50%=001ms357.165us; 80%=003ms615.203us; 90%=004ms829.824us; 95%=006ms413.828us; 99%=041ms986.188us
Metric: TransferToServerTime
TotalSamples: 3217
Counter: 08m34s837ms164.600us
ValueRate: 776ms96.001us / second
Rate: 5.65755 / second
Percentiles: 1%=001ms222.465us; 5%=001ms433.798us; 10%=002ms548.857us; 20%=002ms723.209us; 50%=007ms151.826us; 80%=215ms56.845us; 90%=519ms830.442us; 95%=523ms856.287us; 99%=529ms620.605us
Counter: CachedSyncTensors
Value: 1208
Counter: CreateCompileHandles
Value: 7
Counter: CreateDataHandles
Value: 448309
Counter: CreateXlaTensor
Value: 3530834
Counter: DestroyDataHandles
Value: 447428
Counter: DestroyXlaTensor
Value: 3530082
Counter: ReleaseDataHandles
Value: 447429
Counter: SyncTensorsToData
Value: 184
Counter: UncachedSyncTensors
Value: 7
Counter: XRTAllocateFromTensor_Empty
Value: 230
Counter: XrtCompile_Empty
Value: 1280
Counter: XrtExecuteChained_Empty
Value: 1280
Counter: XrtExecute_Empty
Value: 1280
Counter: XrtRead_Empty
Value: 1280
Counter: XrtReleaseAllocationHandle_Empty
Value: 1280
Counter: XrtReleaseCompileHandle_Empty
Value: 1280
Counter: XrtSessionCount
Value: 12
Counter: XrtSubTuple_Empty
Value: 1280
Counter: aten::_local_scalar_dense
Value: 242
Epoch 28 begin 18:56:11
training/ 18:56:18, device xla:1, step 10, Rate=168.36, Global Rate=193.01
training/ 18:56:25, device xla:1, step 20, Rate=178.10, Global Rate=186.55
Epoch 28 Training stats:
device xla:1
| epoch 028 | loss 9.363 | nll_loss 8.520 | ppl 367.05 | wps 2612 | ups 1 | wpb 3319.773 | bsz 128.000 | num_updates 616 | lr 7.70846e-05 | gnorm 6.895 | clip 0.000 | oom 0.000 | wall 783 | train_wall 403
Epoch 28 Tracker Rates:
Rate=177.43, Global Rate=185.56
Epoch 28 end 18:56:27
Metric: CompileTime
TotalSamples: 7
Counter: 03m56s897ms46.438us
ValueRate: 737ms590.273us / second
Rate: 0.0293134 / second
Percentiles: 1%=091ms699.712us; 5%=091ms699.712us; 10%=091ms699.712us; 20%=206ms627.326us; 50%=11s663ms676.532us; 80%=01m11s278ms507.643us; 90%=01m12s293ms561.423us; 95%=01m12s293ms561.423us; 99%=01m12s293ms561.423us
Metric: ExecuteTime
TotalSamples: 1237
Counter: 09m50s660ms268.770us
ValueRate: 876ms141.360us / second
Rate: 2.13496 / second
Percentiles: 1%=002ms274.850us; 5%=216ms954.673us; 10%=217ms606.075us; 20%=217ms357.129us; 50%=220ms823.010us; 80%=618ms618.646us; 90%=619ms889.324us; 95%=620ms150.485us; 99%=622ms734.815us
Metric: InboundData
TotalSamples: 247
Counter: 910.00B
ValueRate: 1.50B / second
Rate: 0.408242 / second
Percentiles: 1%=1.00B; 5%=1.00B; 10%=1.00B; 20%=4.00B; 50%=4.00B; 80%=4.00B; 90%=4.00B; 95%=4.00B; 99%=4.00B
Metric: OutboundData
TotalSamples: 3305
Counter: 1.02GB
ValueRate: 392.21KB / second
Rate: 5.65585 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=8.00B; 80%=8.00B; 90%=8.00B; 95%=776.00KB; 99%=776.00KB
Metric: ReleaseDataHandlesTime
TotalSamples: 18210
Counter: 01m03s227ms270.717us
ValueRate: 096ms32.465us / second
Rate: 31.928 / second
Percentiles: 1%=629.686us; 5%=737.042us; 10%=823.957us; 20%=990.075us; 50%=001ms463.931us; 80%=005ms428.837us; 90%=008ms201.149us; 95%=010ms751.234us; 99%=011ms364.554us
Metric: TransferFromServerTime
TotalSamples: 247
Counter: 908ms904.755us
ValueRate: 002ms500.587us / second
Rate: 0.408242 / second
Percentiles: 1%=763.276us; 5%=849.291us; 10%=944.157us; 20%=001ms121.827us; 50%=001ms354.718us; 80%=003ms664.984us; 90%=004ms898.422us; 95%=006ms413.828us; 99%=041ms986.188us
Metric: TransferToServerTime
TotalSamples: 3305
Counter: 08m45s812ms699.393us
ValueRate: 771ms181.884us / second
Rate: 5.65586 / second
Percentiles: 1%=001ms213.529us; 5%=001ms426.122us; 10%=002ms539.731us; 20%=002ms700.090us; 50%=007ms51.983us; 80%=215ms164.198us; 90%=519ms93.484us; 95%=522ms449.923us; 99%=528ms55.995us
Counter: CachedSyncTensors
Value: 1230
Counter: CreateCompileHandles
Value: 7
Counter: CreateDataHandles
Value: 464627
Counter: CreateXlaTensor
Value: 3636984
Counter: DestroyDataHandles
Value: 463747
Counter: DestroyXlaTensor
Value: 3636232
Counter: ReleaseDataHandles
Value: 463747
Counter: SyncTensorsToData
Value: 184
Counter: UncachedSyncTensors
Value: 7
Counter: XRTAllocateFromTensor_Empty
Value: 230
Counter: XrtCompile_Empty
Value: 1280
Counter: XrtExecuteChained_Empty
Value: 1280
Counter: XrtExecute_Empty
Value: 1280
Counter: XrtRead_Empty
Value: 1280
Counter: XrtReleaseAllocationHandle_Empty
Value: 1280
Counter: XrtReleaseCompileHandle_Empty
Value: 1280
Counter: XrtSessionCount
Value: 12
Counter: XrtSubTuple_Empty
Value: 1280
Counter: aten::_local_scalar_dense
Value: 247
Validating the subset "valid"
| WARNING: 72 samples have invalid sizes and will be skipped, max_positions=(64, 64), first few sample ids=[1808, 612, 1824, 2185, 1244, 723, 1226, 388, 1791, 378]
validation/ 18:56:27, device xla:1, step 0
validation/ 18:56:29, device xla:1, step 10
validation/ 18:56:31, device xla:1, step 20
validation stats on subset "valid" - 18:56:32
| epoch 028 | valid on 'valid' subset | loss 10.202 | nll_loss 9.227 | ppl 599.42 | num_updates 616
old learning rate: 7.433515000000001e-05
new learning rate: 7.708460000000001e-05
Metric: CompileTime
TotalSamples: 7
Counter: 03m56s897ms46.438us
ValueRate: 737ms590.273us / second
Rate: 0.0293134 / second
Percentiles: 1%=091ms699.712us; 5%=091ms699.712us; 10%=091ms699.712us; 20%=206ms627.326us; 50%=11s663ms676.532us; 80%=01m11s278ms507.643us; 90%=01m12s293ms561.423us; 95%=01m12s293ms561.423us; 99%=01m12s293ms561.423us
Metric: ExecuteTime
TotalSamples: 1260
Counter: 09m54s434ms997.704us
ValueRate: 878ms753.135us / second
Rate: 2.16151 / second
Percentiles: 1%=002ms274.850us; 5%=216ms976.284us; 10%=217ms594.216us; 20%=217ms317.213us; 50%=219ms75.287us; 80%=618ms597.866us; 90%=619ms848.354us; 95%=620ms118.734us; 99%=622ms734.815us
Metric: InboundData
TotalSamples: 251
Counter: 923.00B
ValueRate: 1.51B / second
Rate: 0.411421 / second
Percentiles: 1%=1.00B; 5%=1.00B; 10%=1.00B; 20%=4.00B; 50%=4.00B; 80%=4.00B; 90%=4.00B; 95%=4.00B; 99%=4.00B
Metric: OutboundData
TotalSamples: 3335
Counter: 1.02GB
ValueRate: 407.88KB / second
Rate: 5.6651 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=8.00B; 80%=8.00B; 90%=388.00KB; 95%=776.00KB; 99%=776.00KB
Metric: ReleaseDataHandlesTime
TotalSamples: 18312
Counter: 01m03s335ms348.404us
ValueRate: 083ms812.845us / second
Rate: 30.3814 / second
Percentiles: 1%=629.686us; 5%=747.925us; 10%=823.841us; 20%=967.216us; 50%=001ms336.088us; 80%=005ms566.126us; 90%=008ms648.880us; 95%=010ms515.300us; 99%=011ms913.232us
Metric: TransferFromServerTime
TotalSamples: 251
Counter: 914ms936.101us
ValueRate: 001ms498.056us / second
Rate: 0.411421 / second
Percentiles: 1%=763.276us; 5%=849.291us; 10%=944.157us; 20%=001ms111.567us; 50%=001ms353.268us; 80%=003ms664.984us; 90%=004ms829.824us; 95%=006ms413.828us; 99%=041ms986.188us
Metric: TransferToServerTime
TotalSamples: 3335
Counter: 08m50s725ms16.562us
ValueRate: 776ms463.663us / second
Rate: 5.65922 / second
Percentiles: 1%=001ms213.529us; 5%=001ms432.409us; 10%=002ms545.531us; 20%=002ms707.224us; 50%=007ms179.751us; 80%=215ms26.475us; 90%=519ms963.268us; 95%=522ms449.923us; 99%=528ms55.995us
Counter: CachedSyncTensors
Value: 1253
Counter: CreateCompileHandles
Value: 7
Counter: CreateDataHandles
Value: 464905
Counter: CreateXlaTensor
Value: 3661585
Counter: DestroyDataHandles
Value: 464024
Counter: DestroyXlaTensor
Value: 3660834
Counter: ReleaseDataHandles
Value: 464026
Counter: SyncTensorsToData
Value: 184
Counter: UncachedSyncTensors
Value: 7
Counter: XRTAllocateFromTensor_Empty
Value: 230
Counter: XrtCompile_Empty
Value: 1280
Counter: XrtExecuteChained_Empty
Value: 1280
Counter: XrtExecute_Empty
Value: 1280
Counter: XrtRead_Empty
Value: 1280
Counter: XrtReleaseAllocationHandle_Empty
Value: 1280
Counter: XrtReleaseCompileHandle_Empty
Value: 1280
Counter: XrtSessionCount
Value: 12
Counter: XrtSubTuple_Empty
Value: 1280
Counter: aten::_local_scalar_dense
Value: 251
Epoch 29 begin 18:56:32
training/ 18:56:38, device xla:1, step 10, Rate=166.77, Global Rate=191.41
training/ 18:56:46, device xla:1, step 20, Rate=178.81, Global Rate=185.28
Epoch 29 Training stats:
device xla:1
| epoch 029 | loss 9.177 | nll_loss 8.309 | ppl 317.14 | wps 2635 | ups 1 | wpb 3319.773 | bsz 128.000 | num_updates 638 | lr 7.98341e-05 | gnorm 6.985 | clip 0.000 | oom 0.000 | wall 804 | train_wall 416
Epoch 29 Tracker Rates:
Rate=179.23, Global Rate=184.79
Epoch 29 end 18:56:48
Metric: CompileTime
TotalSamples: 7
Counter: 03m56s897ms46.438us
ValueRate: 737ms590.273us / second
Rate: 0.0293134 / second
Percentiles: 1%=091ms699.712us; 5%=091ms699.712us; 10%=091ms699.712us; 20%=206ms627.326us; 50%=11s663ms676.532us; 80%=01m11s278ms507.643us; 90%=01m12s293ms561.423us; 95%=01m12s293ms561.423us; 99%=01m12s293ms561.423us
Metric: ExecuteTime
TotalSamples: 1282
Counter: 09m08s956ms654.431us
ValueRate: 876ms22.940us / second
Rate: 2.1349 / second
Percentiles: 1%=002ms274.850us; 5%=216ms980.352us; 10%=217ms601.459us; 20%=217ms334.220us; 50%=220ms823.010us; 80%=618ms597.866us; 90%=619ms848.354us; 95%=620ms118.734us; 99%=622ms734.815us
Metric: InboundData
TotalSamples: 256
Counter: 943.00B
ValueRate: 1.51B / second
Rate: 0.408958 / second
Percentiles: 1%=1.00B; 5%=1.00B; 10%=1.00B; 20%=4.00B; 50%=4.00B; 80%=4.00B; 90%=4.00B; 95%=4.00B; 99%=4.00B
Metric: OutboundData
TotalSamples: 3426
Counter: 1.03GB
ValueRate: 393.70KB / second
Rate: 5.67733 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=8.00B; 80%=8.00B; 90%=8.00B; 95%=776.00KB; 99%=776.00KB
Metric: ReleaseDataHandlesTime
TotalSamples: 18876
Counter: 01m05s002ms272.284us
ValueRate: 094ms983.048us / second
Rate: 33.1362 / second
Percentiles: 1%=642.363us; 5%=749.445us; 10%=819.585us; 20%=967.216us; 50%=001ms421.822us; 80%=005ms893.174us; 90%=008ms656.473us; 95%=010ms526.250us; 99%=011ms863.356us
Metric: TransferFromServerTime
TotalSamples: 256
Counter: 923ms419.284us
ValueRate: 001ms475.154us / second
Rate: 0.408958 / second
Percentiles: 1%=763.276us; 5%=849.291us; 10%=944.157us; 20%=001ms118.044us; 50%=001ms353.268us; 80%=003ms664.984us; 90%=004ms898.422us; 95%=006ms413.828us; 99%=041ms986.188us
Metric: TransferToServerTime
TotalSamples: 3426
Counter: 08m02s921ms617.806us
ValueRate: 778ms44.734us / second
Rate: 5.67734 / second
Percentiles: 1%=001ms181.137us; 5%=001ms428.759us; 10%=002ms543.415us; 20%=002ms703.450us; 50%=007ms18.726us; 80%=215ms333.745us; 90%=519ms581.513us; 95%=523ms924.050us; 99%=530ms946.063us
Counter: CachedSyncTensors
Value: 1275
Counter: CreateCompileHandles
Value: 7
Counter: CreateDataHandles
Value: 481226
Counter: CreateXlaTensor
Value: 3767735
Counter: DestroyDataHandles
Value: 480347
Counter: DestroyXlaTensor
Value: 3766984
Counter: ReleaseDataHandles
Value: 480347
Counter: SyncTensorsToData
Value: 184
Counter: UncachedSyncTensors
Value: 7
Counter: XRTAllocateFromTensor_Empty
Value: 230
Counter: XrtCompile_Empty
Value: 1280
Counter: XrtExecuteChained_Empty
Value: 1280
Counter: XrtExecute_Empty
Value: 1280
Counter: XrtRead_Empty
Value: 1280
Counter: XrtReleaseAllocationHandle_Empty
Value: 1280
Counter: XrtReleaseCompileHandle_Empty
Value: 1280
Counter: XrtSessionCount
Value: 12
Counter: XrtSubTuple_Empty
Value: 1280
Counter: aten::_local_scalar_dense
Value: 256
Validating the subset "valid"
| WARNING: 72 samples have invalid sizes and will be skipped, max_positions=(64, 64), first few sample ids=[1808, 612, 1824, 2185, 1244, 723, 1226, 388, 1791, 378]
validation/ 18:56:48, device xla:1, step 0
validation/ 18:56:50, device xla:1, step 10
validation/ 18:56:52, device xla:1, step 20
validation stats on subset "valid" - 18:56:53
| epoch 029 | valid on 'valid' subset | loss 10.624 | nll_loss 9.630 | ppl 792.41 | num_updates 638
old learning rate: 7.708460000000001e-05
new learning rate: 7.983405000000001e-05
Metric: CompileTime
TotalSamples: 7
Counter: 03m56s897ms46.438us
ValueRate: 737ms590.273us / second
Rate: 0.0293134 / second
Percentiles: 1%=091ms699.712us; 5%=091ms699.712us; 10%=091ms699.712us; 20%=206ms627.326us; 50%=11s663ms676.532us; 80%=01m11s278ms507.643us; 90%=01m12s293ms561.423us; 95%=01m12s293ms561.423us; 99%=01m12s293ms561.423us
Metric: ExecuteTime
TotalSamples: 1305
Counter: 09m13s738ms505.698us
ValueRate: 878ms544.169us / second
Rate: 2.16134 / second
Percentiles: 1%=002ms274.850us; 5%=216ms914.202us; 10%=217ms547.116us; 20%=217ms285.109us; 50%=219ms50.229us; 80%=618ms541.287us; 90%=619ms773.910us; 95%=620ms999.913us; 99%=622ms734.815us
Metric: InboundData
TotalSamples: 260
Counter: 956.00B
ValueRate: 1.51B / second
Rate: 0.412024 / second
Percentiles: 1%=1.00B; 5%=1.00B; 10%=1.00B; 20%=4.00B; 50%=4.00B; 80%=4.00B; 90%=4.00B; 95%=4.00B; 99%=4.00B
Metric: OutboundData
TotalSamples: 3454
Counter: 1.03GB
ValueRate: 408.20KB / second
Rate: 5.66962 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=8.00B; 80%=8.00B; 90%=388.00KB; 95%=776.00KB; 99%=776.00KB
Metric: ReleaseDataHandlesTime
TotalSamples: 18971
Counter: 01m05s115ms446.207us
ValueRate: 083ms272.838us / second
Rate: 30.8963 / second
Percentiles: 1%=643.375us; 5%=762.181us; 10%=841.043us; 20%=984.972us; 50%=001ms355.152us; 80%=004ms237.825us; 90%=007ms298.514us; 95%=009ms370.475us; 99%=011ms863.356us
Metric: TransferFromServerTime
TotalSamples: 260
Counter: 930ms113.218us
ValueRate: 001ms473.957us / second
Rate: 0.412024 / second
Percentiles: 1%=763.276us; 5%=851.507us; 10%=960.713us; 20%=001ms121.827us; 50%=001ms354.718us; 80%=003ms664.984us; 90%=004ms898.422us; 95%=006ms413.828us; 99%=041ms986.188us
Metric: TransferToServerTime
TotalSamples: 3454
Counter: 08m07s851ms531.713us
ValueRate: 786ms865.072us / second
Rate: 5.67885 / second
Percentiles: 1%=001ms213.529us; 5%=001ms433.332us; 10%=002ms546.907us; 20%=002ms713.811us; 50%=007ms160.425us; 80%=215ms26.475us; 90%=518ms151.149us; 95%=523ms657.012us; 99%=530ms946.063us
Counter: CachedSyncTensors
Value: 1298
Counter: CreateCompileHandles
Value: 7
Counter: CreateDataHandles
Value: 481502
Counter: CreateXlaTensor
Value: 3792336
Counter: DestroyDataHandles
Value: 480621
Counter: DestroyXlaTensor
Value: 3791584
Counter: ReleaseDataHandles
Value: 480622
Counter: SyncTensorsToData
Value: 184
Counter: UncachedSyncTensors
Value: 7
Counter: XRTAllocateFromTensor_Empty
Value: 230
Counter: XrtCompile_Empty
Value: 1280
Counter: XrtExecuteChained_Empty
Value: 1280
Counter: XrtExecute_Empty
Value: 1280
Counter: XrtRead_Empty
Value: 1280
Counter: XrtReleaseAllocationHandle_Empty
Value: 1280
Counter: XrtReleaseCompileHandle_Empty
Value: 1280
Counter: XrtSessionCount
Value: 12
Counter: XrtSubTuple_Empty
Value: 1280
Counter: aten::_local_scalar_dense
Value: 260
Epoch 30 begin 18:56:53
training/ 18:56:59, device xla:1, step 10, Rate=173.82, Global Rate=197.27
training/ 18:57:06, device xla:1, step 20, Rate=179.44, Global Rate=188.28
Epoch 30 Training stats:
device xla:1
| epoch 030 | loss 8.992 | nll_loss 8.100 | ppl 274.34 | wps 2657 | ups 1 | wpb 3319.773 | bsz 128.000 | num_updates 660 | lr 8.25835e-05 | gnorm 7.020 | clip 0.000 | oom 0.000 | wall 825 | train_wall 430
Epoch 30 Tracker Rates:
Rate=179.49, Global Rate=187.46
Epoch 30 end 18:57:08
Metric: CompileTime
TotalSamples: 7
Counter: 03m56s897ms46.438us
ValueRate: 737ms590.273us / second
Rate: 0.0293134 / second
Percentiles: 1%=091ms699.712us; 5%=091ms699.712us; 10%=091ms699.712us; 20%=206ms627.326us; 50%=11s663ms676.532us; 80%=01m11s278ms507.643us; 90%=01m12s293ms561.423us; 95%=01m12s293ms561.423us; 99%=01m12s293ms561.423us
Metric: ExecuteTime
TotalSamples: 1327
Counter: 09m26s287ms680.816us
ValueRate: 876ms396.659us / second
Rate: 2.13607 / second
Percentiles: 1%=002ms274.850us; 5%=216ms914.202us; 10%=217ms547.116us; 20%=217ms285.109us; 50%=220ms823.010us; 80%=618ms558.532us; 90%=619ms784.975us; 95%=620ms999.913us; 99%=622ms734.815us
Metric: InboundData
TotalSamples: 265
Counter: 976.00B
ValueRate: 1.51B / second
Rate: 0.409764 / second
Percentiles: 1%=1.00B; 5%=1.00B; 10%=1.00B; 20%=4.00B; 50%=4.00B; 80%=4.00B; 90%=4.00B; 95%=4.00B; 99%=4.00B
Metric: OutboundData
TotalSamples: 3542
Counter: 1.03GB
ValueRate: 392.75KB / second
Rate: 5.6636 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=8.00B; 80%=8.00B; 90%=8.00B; 95%=776.00KB; 99%=776.00KB
Metric: ReleaseDataHandlesTime
TotalSamples: 19551
Counter: 01m07s904ms984.403us
ValueRate: 095ms816.945us / second
Rate: 33.351 / second
Percentiles: 1%=628.065us; 5%=715.582us; 10%=821.775us; 20%=001ms15.574us; 50%=001ms442.766us; 80%=005ms59.668us; 90%=008ms529.154us; 95%=009ms840.124us; 99%=011ms706.997us
Metric: TransferFromServerTime
TotalSamples: 265
Counter: 940ms988.175us
ValueRate: 001ms453.484us / second
Rate: 0.409764 / second
Percentiles: 1%=763.276us; 5%=851.507us; 10%=960.713us; 20%=001ms121.827us; 50%=001ms353.268us; 80%=003ms664.984us; 90%=004ms898.422us; 95%=005ms178.402us; 99%=041ms986.188us
Metric: TransferToServerTime
TotalSamples: 3542
Counter: 08m18s923ms905.824us
ValueRate: 776ms184.399us / second
Rate: 5.66361 / second
Percentiles: 1%=001ms195.611us; 5%=001ms428.084us; 10%=002ms536.714us; 20%=002ms702.708us; 50%=007ms6.516us; 80%=215ms97.111us; 90%=519ms715.385us; 95%=524ms516.679us; 99%=530ms946.063us
Counter: CachedSyncTensors
Value: 1320
Counter: CreateCompileHandles
Value: 7
Counter: CreateDataHandles
Value: 497820
Counter: CreateXlaTensor
Value: 3898486
Counter: DestroyDataHandles
Value: 496940
Counter: DestroyXlaTensor
Value: 3897734
Counter: ReleaseDataHandles
Value: 496940
Counter: SyncTensorsToData
Value: 184
Counter: UncachedSyncTensors
Value: 7
Counter: XRTAllocateFromTensor_Empty
Value: 230
Counter: XrtCompile_Empty
Value: 1280
Counter: XrtExecuteChained_Empty
Value: 1280
Counter: XrtExecute_Empty
Value: 1280
Counter: XrtRead_Empty
Value: 1280
Counter: XrtReleaseAllocationHandle_Empty
Value: 1280
Counter: XrtReleaseCompileHandle_Empty
Value: 1280
Counter: XrtSessionCount
Value: 12
Counter: XrtSubTuple_Empty
Value: 1280
Counter: aten::_local_scalar_dense
Value: 265
Validating the subset "valid"
| WARNING: 72 samples have invalid sizes and will be skipped, max_positions=(64, 64), first few sample ids=[1808, 612, 1824, 2185, 1244, 723, 1226, 388, 1791, 378]
validation/ 18:57:08, device xla:1, step 0
validation/ 18:57:11, device xla:1, step 10
validation/ 18:57:13, device xla:1, step 20
validation stats on subset "valid" - 18:57:13
| epoch 030 | valid on 'valid' subset | loss 10.001 | nll_loss 8.993 | ppl 509.38 | num_updates 660
old learning rate: 7.983405000000001e-05
new learning rate: 8.258350000000001e-05
Metric: CompileTime
TotalSamples: 7
Counter: 03m56s897ms46.438us
ValueRate: 737ms590.273us / second
Rate: 0.0293134 / second
Percentiles: 1%=091ms699.712us; 5%=091ms699.712us; 10%=091ms699.712us; 20%=206ms627.326us; 50%=11s663ms676.532us; 80%=01m11s278ms507.643us; 90%=01m12s293ms561.423us; 95%=01m12s293ms561.423us; 99%=01m12s293ms561.423us
Metric: ExecuteTime
TotalSamples: 1350
Counter: 10m31s081ms359.667us
ValueRate: 878ms973.631us / second
Rate: 2.16261 / second
Percentiles: 1%=002ms138.434us; 5%=216ms914.202us; 10%=217ms547.116us; 20%=217ms266.069us; 50%=219ms75.287us; 80%=618ms501.147us; 90%=619ms699.820us; 95%=620ms911.328us; 99%=622ms734.815us
Metric: InboundData
TotalSamples: 269
Counter: 989.00B
ValueRate: 1.52B / second
Rate: 0.412722 / second
Percentiles: 1%=1.00B; 5%=1.00B; 10%=1.00B; 20%=4.00B; 50%=4.00B; 80%=4.00B; 90%=4.00B; 95%=4.00B; 99%=4.00B
Metric: OutboundData
TotalSamples: 3570
Counter: 1.04GB
ValueRate: 407.29KB / second
Rate: 5.65699 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=8.00B; 80%=8.00B; 90%=388.00KB; 95%=776.00KB; 99%=776.00KB
Metric: ReleaseDataHandlesTime
TotalSamples: 19650
Counter: 01m07s026ms576.693us
ValueRate: 084ms476.513us / second
Rate: 31.1267 / second
Percentiles: 1%=628.065us; 5%=733.695us; 10%=831.097us; 20%=001ms19.111us; 50%=001ms366.335us; 80%=005ms598.015us; 90%=007ms248.197us; 95%=009ms717.095us; 99%=011ms706.997us
Metric: TransferFromServerTime
TotalSamples: 269
Counter: 946ms798.081us
ValueRate: 001ms451.121us / second
Rate: 0.412722 / second
Percentiles: 1%=763.276us; 5%=851.507us; 10%=960.713us; 20%=001ms118.044us; 50%=001ms353.268us; 80%=003ms648.791us; 90%=004ms898.422us; 95%=005ms178.402us; 99%=041ms986.188us
Metric: TransferToServerTime
TotalSamples: 3570
Counter: 08m23s889ms283.857us
ValueRate: 784ms62.949us / second
Rate: 5.66587 / second
Percentiles: 1%=001ms195.611us; 5%=001ms428.084us; 10%=002ms536.714us; 20%=002ms720.719us; 50%=007ms150.705us; 80%=215ms26.475us; 90%=519ms705.845us; 95%=524ms516.679us; 99%=530ms946.063us
Counter: CachedSyncTensors
Value: 1343
Counter: CreateCompileHandles
Value: 7
Counter: CreateDataHandles
Value: 498096
Counter: CreateXlaTensor
Value: 3923087
Counter: DestroyDataHandles
Value: 497215
Counter: DestroyXlaTensor
Value: 3922336
Counter: ReleaseDataHandles
Value: 497217
Counter: SyncTensorsToData
Value: 184
Counter: UncachedSyncTensors
Value: 7
Counter: XRTAllocateFromTensor_Empty
Value: 230
Counter: XrtCompile_Empty
Value: 1280
Counter: XrtExecuteChained_Empty
Value: 1280
Counter: XrtExecute_Empty
Value: 1280
Counter: XrtRead_Empty
Value: 1280
Counter: XrtReleaseAllocationHandle_Empty
Value: 1280
Counter: XrtReleaseCompileHandle_Empty
Value: 1280
Counter: XrtSessionCount
Value: 12
Counter: XrtSubTuple_Empty
Value: 1280
Counter: aten::_local_scalar_dense
Value: 269
Epoch 31 begin 18:57:13
training/ 18:57:20, device xla:1, step 10, Rate=169.19, Global Rate=194.46
training/ 18:57:27, device xla:1, step 20, Rate=178.50, Global Rate=186.97
Epoch 31 Training stats:
device xla:1
| epoch 031 | loss 8.814 | nll_loss 7.897 | ppl 238.39 | wps 2678 | ups 1 | wpb 3319.773 | bsz 128.000 | num_updates 682 | lr 8.5333e-05 | gnorm 7.055 | clip 0.000 | oom 0.000 | wall 845 | train_wall 444
Epoch 31 Tracker Rates:
Rate=178.50, Global Rate=186.14
Epoch 31 end 18:57:29
Metric: CompileTime
TotalSamples: 7
Counter: 03m56s897ms46.438us
ValueRate: 737ms590.273us / second
Rate: 0.0293134 / second
Percentiles: 1%=091ms699.712us; 5%=091ms699.712us; 10%=091ms699.712us; 20%=206ms627.326us; 50%=11s663ms676.532us; 80%=01m11s278ms507.643us; 90%=01m12s293ms561.423us; 95%=01m12s293ms561.423us; 99%=01m12s293ms561.423us
Metric: ExecuteTime
TotalSamples: 1372
Counter: 10m45s630ms66.391us
ValueRate: 877ms627.313us / second
Rate: 2.13695 / second
Percentiles: 1%=002ms138.434us; 5%=216ms914.202us; 10%=217ms547.116us; 20%=217ms266.069us; 50%=220ms823.010us; 80%=617ms497.350us; 90%=619ms699.820us; 95%=620ms911.328us; 99%=622ms734.815us
Metric: InboundData
TotalSamples: 274
Counter: 1009.00B
ValueRate: 1.51B / second
Rate: 0.410452 / second
Percentiles: 1%=1.00B; 5%=1.00B; 10%=1.00B; 20%=4.00B; 50%=4.00B; 80%=4.00B; 90%=4.00B; 95%=4.00B; 99%=4.00B
Metric: OutboundData
TotalSamples: 3661
Counter: 1.04GB
ValueRate: 393.01KB / second
Rate: 5.66743 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=8.00B; 80%=8.00B; 90%=8.00B; 95%=776.00KB; 99%=776.00KB
Metric: ReleaseDataHandlesTime
TotalSamples: 20254
Counter: 01m09s843ms156.649us
ValueRate: 099ms54.316us / second
Rate: 34.856 / second
Percentiles: 1%=592.854us; 5%=708.435us; 10%=823.421us; 20%=997.204us; 50%=001ms438.399us; 80%=005ms837.883us; 90%=008ms892.885us; 95%=009ms753.000us; 99%=011ms887.324us
Metric: TransferFromServerTime
TotalSamples: 274
Counter: 954ms795.733us
ValueRate: 001ms428.785us / second
Rate: 0.410452 / second
Percentiles: 1%=737.370us; 5%=849.291us; 10%=944.157us; 20%=001ms100.351us; 50%=001ms351.346us; 80%=003ms648.791us; 90%=004ms898.422us; 95%=005ms178.402us; 99%=041ms986.188us
Metric: TransferToServerTime
TotalSamples: 3661
Counter: 09m34s954ms808.243us
ValueRate: 778ms500.080us / second
Rate: 5.66756 / second
Percentiles: 1%=001ms158.120us; 5%=001ms399.672us; 10%=002ms501.234us; 20%=002ms674.388us; 50%=007ms947.845us; 80%=215ms233.425us; 90%=519ms198.661us; 95%=525ms504.477us; 99%=530ms946.063us
Counter: CachedSyncTensors
Value: 1365
Counter: CreateCompileHandles
Value: 7
Counter: CreateDataHandles
Value: 514417
Counter: CreateXlaTensor
Value: 4029237
Counter: DestroyDataHandles
Value: 513538
Counter: DestroyXlaTensor
Value: 4028486
Counter: ReleaseDataHandles
Value: 513538
Counter: SyncTensorsToData
Value: 184
Counter: UncachedSyncTensors
Value: 7
Counter: XRTAllocateFromTensor_Empty
Value: 230
Counter: XrtCompile_Empty
Value: 1280
Counter: XrtExecuteChained_Empty
Value: 1280
Counter: XrtExecute_Empty
Value: 1280
Counter: XrtRead_Empty
Value: 1280
Counter: XrtReleaseAllocationHandle_Empty
Value: 1280
Counter: XrtReleaseCompileHandle_Empty
Value: 1280
Counter: XrtSessionCount
Value: 12
Counter: XrtSubTuple_Empty
Value: 1280
Counter: aten::_local_scalar_dense
Value: 274
Validating the subset "valid"
| WARNING: 72 samples have invalid sizes and will be skipped, max_positions=(64, 64), first few sample ids=[1808, 612, 1824, 2185, 1244, 723, 1226, 388, 1791, 378]
validation/ 18:57:29, device xla:1, step 0
validation/ 18:57:31, device xla:1, step 10
validation/ 18:57:34, device xla:1, step 20
validation stats on subset "valid" - 18:57:34
| epoch 031 | valid on 'valid' subset | loss 10.172 | nll_loss 9.081 | ppl 541.46 | num_updates 682
old learning rate: 8.258350000000001e-05
new learning rate: 8.533295000000001e-05
Metric: CompileTime
TotalSamples: 7
Counter: 03m56s897ms46.438us
ValueRate: 737ms590.273us / second
Rate: 0.0293134 / second
Percentiles: 1%=091ms699.712us; 5%=091ms699.712us; 10%=091ms699.712us; 20%=206ms627.326us; 50%=11s663ms676.532us; 80%=01m11s278ms507.643us; 90%=01m12s293ms561.423us; 95%=01m12s293ms561.423us; 99%=01m12s293ms561.423us
Metric: ExecuteTime
TotalSamples: 1395
Counter: 10m49s420ms43.011us
ValueRate: 878ms81.927us / second
Rate: 2.16326 / second
Percentiles: 1%=002ms138.434us; 5%=216ms914.202us; 10%=217ms536.296us; 20%=217ms222.366us; 50%=219ms3.516us; 80%=617ms403.652us; 90%=619ms598.454us; 95%=620ms883.266us; 99%=622ms732.126us
Metric: InboundData
TotalSamples: 278
Counter: 1022.00B
ValueRate: 1.52B / second
Rate: 0.413299 / second
Percentiles: 1%=1.00B; 5%=1.00B; 10%=1.00B; 20%=4.00B; 50%=4.00B; 80%=4.00B; 90%=4.00B; 95%=4.00B; 99%=4.00B
Metric: OutboundData
TotalSamples: 3690
Counter: 1.05GB
ValueRate: 408.57KB / second
Rate: 5.67481 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=8.00B; 80%=8.00B; 90%=388.00KB; 95%=776.00KB; 99%=776.00KB
Metric: ReleaseDataHandlesTime
TotalSamples: 20352
Counter: 01m09s963ms851.868us
ValueRate: 086ms615.800us / second
Rate: 31.7265 / second
Percentiles: 1%=622.111us; 5%=729.710us; 10%=843.153us; 20%=997.204us; 50%=001ms378.442us; 80%=005ms555.049us; 90%=008ms711.662us; 95%=009ms709.163us; 99%=011ms887.324us
Metric: TransferFromServerTime
TotalSamples: 278
Counter: 961ms482.877us
ValueRate: 001ms429.423us / second
Rate: 0.413299 / second
Percentiles: 1%=737.370us; 5%=849.291us; 10%=944.157us; 20%=001ms111.567us; 50%=001ms353.268us; 80%=003ms648.791us; 90%=004ms898.422us; 95%=005ms178.402us; 99%=041ms986.188us
Metric: TransferToServerTime
TotalSamples: 3690
Counter: 09m39s897ms638.664us
ValueRate: 782ms89.437us / second
Rate: 5.66865 / second
Percentiles: 1%=001ms158.120us; 5%=001ms401.377us; 10%=002ms513.663us; 20%=002ms691.964us; 50%=007ms19.148us; 80%=215ms56.845us; 90%=519ms137.155us; 95%=524ms226.735us; 99%=530ms853.429us
Counter: CachedSyncTensors
Value: 1388
Counter: CreateCompileHandles
Value: 7
Counter: CreateDataHandles
Value: 514694
Counter: CreateXlaTensor
Value: 4053838
Counter: DestroyDataHandles
Value: 513813
Counter: DestroyXlaTensor
Value: 4053086
Counter: ReleaseDataHandles
Value: 513814
Counter: SyncTensorsToData
Value: 184
Counter: UncachedSyncTensors
Value: 7
Counter: XRTAllocateFromTensor_Empty
Value: 230
Counter: XrtCompile_Empty
Value: 1280
Counter: XrtExecuteChained_Empty
Value: 1280
Counter: XrtExecute_Empty
Value: 1280
Counter: XrtRead_Empty
Value: 1280
Counter: XrtReleaseAllocationHandle_Empty
Value: 1280
Counter: XrtReleaseCompileHandle_Empty
Value: 1280
Counter: XrtSessionCount
Value: 12
Counter: XrtSubTuple_Empty
Value: 1280
Counter: aten::_local_scalar_dense
Value: 278
Epoch 32 begin 18:57:34
training/ 18:57:41, device xla:1, step 10, Rate=170.05, Global Rate=195.00
training/ 18:57:48, device xla:1, step 20, Rate=176.58, Global Rate=185.57
Epoch 32 Training stats:
device xla:1
| epoch 032 | loss 8.646 | nll_loss 7.706 | ppl 208.85 | wps 2697 | ups 1 | wpb 3319.773 | bsz 128.000 | num_updates 704 | lr 8.80824e-05 | gnorm 7.107 | clip 0.000 | oom 0.000 | wall 866 | train_wall 458
Epoch 32 Tracker Rates:
Rate=176.94, Global Rate=184.79
Epoch 32 end 18:57:50
Metric: CompileTime
TotalSamples: 7
Counter: 03m56s897ms46.438us
ValueRate: 737ms590.273us / second
Rate: 0.0293134 / second
Percentiles: 1%=091ms699.712us; 5%=091ms699.712us; 10%=091ms699.712us; 20%=206ms627.326us; 50%=11s663ms676.532us; 80%=01m11s278ms507.643us; 90%=01m12s293ms561.423us; 95%=01m12s293ms561.423us; 99%=01m12s293ms561.423us
Metric: ExecuteTime
TotalSamples: 1417
Counter: 10m03s986ms34.344us
ValueRate: 876ms410.873us / second
Rate: 2.13669 / second
Percentiles: 1%=002ms138.434us; 5%=216ms914.202us; 10%=217ms536.296us; 20%=217ms222.366us; 50%=221ms690.555us; 80%=617ms375.181us; 90%=619ms598.454us; 95%=620ms827.221us; 99%=622ms734.815us
Metric: InboundData
TotalSamples: 283
Counter: 1.02KB
ValueRate: 1.51B / second
Rate: 0.411012 / second
Percentiles: 1%=1.00B; 5%=1.00B; 10%=1.00B; 20%=4.00B; 50%=4.00B; 80%=4.00B; 90%=4.00B; 95%=4.00B; 99%=4.00B
Metric: OutboundData
TotalSamples: 3779
Counter: 1.05GB
ValueRate: 392.69KB / second
Rate: 5.6628 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=8.00B; 80%=8.00B; 90%=8.00B; 95%=776.00KB; 99%=776.00KB
Metric: ReleaseDataHandlesTime
TotalSamples: 20914
Counter: 01m11s663ms449.774us
ValueRate: 095ms841.269us / second
Rate: 33.8928 / second
Percentiles: 1%=622.111us; 5%=733.350us; 10%=834.963us; 20%=962.533us; 50%=001ms411.961us; 80%=005ms154.201us; 90%=008ms774.351us; 95%=009ms714.490us; 99%=010ms306.281us
Metric: TransferFromServerTime
TotalSamples: 283
Counter: 975ms885.051us
ValueRate: 001ms415.864us / second
Rate: 0.411012 / second
Percentiles: 1%=737.370us; 5%=850.771us; 10%=955.894us; 20%=001ms118.044us; 50%=001ms357.165us; 80%=003ms664.984us; 90%=004ms934.657us; 95%=005ms113.124us; 99%=041ms986.188us
Metric: TransferToServerTime
TotalSamples: 3779
Counter: 09m50s848ms353.545us
ValueRate: 776ms903.629us / second
Rate: 5.66281 / second
Percentiles: 1%=001ms158.120us; 5%=001ms380.055us; 10%=001ms494.103us; 20%=002ms666.008us; 50%=007ms831.758us; 80%=215ms196.912us; 90%=519ms715.385us; 95%=524ms766.245us; 99%=530ms853.429us
Counter: CachedSyncTensors
Value: 1410
Counter: CreateCompileHandles
Value: 7
Counter: CreateDataHandles
Value: 531013
Counter: CreateXlaTensor
Value: 4159988
Counter: DestroyDataHandles
Value: 530133
Counter: DestroyXlaTensor
Value: 4159236
Counter: ReleaseDataHandles
Value: 530133
Counter: SyncTensorsToData
Value: 184
Counter: UncachedSyncTensors
Value: 7
Counter: XRTAllocateFromTensor_Empty
Value: 230
Counter: XrtCompile_Empty
Value: 1280
Counter: XrtExecuteChained_Empty
Value: 1280
Counter: XrtExecute_Empty
Value: 1280
Counter: XrtRead_Empty
Value: 1280
Counter: XrtReleaseAllocationHandle_Empty
Value: 1280
Counter: XrtReleaseCompileHandle_Empty
Value: 1280
Counter: XrtSessionCount
Value: 12
Counter: XrtSubTuple_Empty
Value: 1280
Counter: aten::_local_scalar_dense
Value: 283
Validating the subset "valid"
| WARNING: 72 samples have invalid sizes and will be skipped, max_positions=(64, 64), first few sample ids=[1808, 612, 1824, 2185, 1244, 723, 1226, 388, 1791, 378]
validation/ 18:57:50, device xla:1, step 0
validation/ 18:57:52, device xla:1, step 10
validation/ 18:57:55, device xla:1, step 20
validation stats on subset "valid" - 18:57:55
| epoch 032 | valid on 'valid' subset | loss 10.309 | nll_loss 9.258 | ppl 612.14 | num_updates 704
old learning rate: 8.533295000000001e-05
new learning rate: 8.808240000000002e-05
Metric: CompileTime
TotalSamples: 7
Counter: 03m56s897ms46.438us
ValueRate: 737ms590.273us / second
Rate: 0.0293134 / second
Percentiles: 1%=091ms699.712us; 5%=091ms699.712us; 10%=091ms699.712us; 20%=206ms627.326us; 50%=11s663ms676.532us; 80%=01m11s278ms507.643us; 90%=01m12s293ms561.423us; 95%=01m12s293ms561.423us; 99%=01m12s293ms561.423us
Metric: ExecuteTime
TotalSamples: 1440
Counter: 10m08s754ms173.942us
ValueRate: 878ms735.695us / second
Rate: 2.16278 / second
Percentiles: 1%=002ms138.434us; 5%=216ms832.391us; 10%=216ms460.790us; 20%=217ms112.909us; 50%=219ms973.944us; 80%=617ms317.428us; 90%=618ms380.440us; 95%=620ms682.373us; 99%=622ms608.159us
Metric: InboundData
TotalSamples: 287
Counter: 1.03KB
ValueRate: 1.52B / second
Rate: 0.413778 / second
Percentiles: 1%=1.00B; 5%=1.00B; 10%=1.00B; 20%=4.00B; 50%=4.00B; 80%=4.00B; 90%=4.00B; 95%=4.00B; 99%=4.00B
Metric: OutboundData
TotalSamples: 3809
Counter: 1.05GB
ValueRate: 408.15KB / second
Rate: 5.66888 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=8.00B; 80%=8.00B; 90%=388.00KB; 95%=776.00KB; 99%=776.00KB
Metric: ReleaseDataHandlesTime
TotalSamples: 21012
Counter: 01m11s781ms624.461us
ValueRate: 084ms173.544us / second
Rate: 31.5252 / second
Percentiles: 1%=622.111us; 5%=745.894us; 10%=846.030us; 20%=968.979us; 50%=001ms363.160us; 80%=004ms406.797us; 90%=008ms656.819us; 95%=009ms653.457us; 99%=010ms283.628us
Metric: TransferFromServerTime
TotalSamples: 287
Counter: 981ms874.155us
ValueRate: 001ms414.160us / second
Rate: 0.413778 / second
Percentiles: 1%=737.370us; 5%=850.771us; 10%=955.894us; 20%=001ms121.827us; 50%=001ms357.165us; 80%=003ms648.791us; 90%=004ms934.657us; 95%=005ms113.124us; 99%=041ms986.188us
Metric: TransferToServerTime
TotalSamples: 3809
Counter: 09m55s755ms556.854us
ValueRate: 777ms369.863us / second
Rate: 5.66302 / second
Percentiles: 1%=001ms158.120us; 5%=001ms380.055us; 10%=001ms496.124us; 20%=002ms672.262us; 50%=007ms939.987us; 80%=215ms26.475us; 90%=518ms151.149us; 95%=524ms598.321us; 99%=530ms682.508us
Counter: CachedSyncTensors
Value: 1433
Counter: CreateCompileHandles
Value: 7
Counter: CreateDataHandles
Value: 531291
Counter: CreateXlaTensor
Value: 4184589
Counter: DestroyDataHandles
Value: 530410
Counter: DestroyXlaTensor
Value: 4183837
Counter: ReleaseDataHandles
Value: 530411
Counter: SyncTensorsToData
Value: 184
Counter: UncachedSyncTensors
Value: 7
Counter: XRTAllocateFromTensor_Empty
Value: 230
Counter: XrtCompile_Empty
Value: 1280
Counter: XrtExecuteChained_Empty
Value: 1280
Counter: XrtExecute_Empty
Value: 1280
Counter: XrtRead_Empty
Value: 1280
Counter: XrtReleaseAllocationHandle_Empty
Value: 1280
Counter: XrtReleaseCompileHandle_Empty
Value: 1280
Counter: XrtSessionCount
Value: 12
Counter: XrtSubTuple_Empty
Value: 1280
Counter: aten::_local_scalar_dense
Value: 287
Epoch 33 begin 18:57:55
training/ 18:58:02, device xla:1, step 10, Rate=173.65, Global Rate=197.53
training/ 18:58:09, device xla:1, step 20, Rate=177.78, Global Rate=186.79
Epoch 33 Training stats:
device xla:1
| epoch 033 | loss 8.477 | nll_loss 7.515 | ppl 182.85 | wps 2716 | ups 1 | wpb 3319.773 | bsz 128.000 | num_updates 726 | lr 9.08319e-05 | gnorm 7.097 | clip 0.000 | oom 0.000 | wall 887 | train_wall 472
Epoch 33 Tracker Rates:
Rate=178.21, Global Rate=186.06
Epoch 33 end 18:58:11
Metric: CompileTime
TotalSamples: 7
Counter: 03m56s897ms46.438us
ValueRate: 737ms590.273us / second
Rate: 0.0293134 / second
Percentiles: 1%=091ms699.712us; 5%=091ms699.712us; 10%=091ms699.712us; 20%=206ms627.326us; 50%=11s663ms676.532us; 80%=01m11s278ms507.643us; 90%=01m12s293ms561.423us; 95%=01m12s293ms561.423us; 99%=01m12s293ms561.423us
Metric: ExecuteTime
TotalSamples: 1462
Counter: 10m21s465ms97.485us
ValueRate: 877ms919.111us / second
Rate: 2.13768 / second
Percentiles: 1%=002ms138.434us; 5%=216ms832.391us; 10%=216ms460.790us; 20%=217ms112.909us; 50%=220ms823.010us; 80%=617ms348.739us; 90%=619ms567.827us; 95%=620ms696.548us; 99%=622ms696.441us
Metric: InboundData
TotalSamples: 292
Counter: 1.05KB
ValueRate: 1.52B / second
Rate: 0.411616 / second
Percentiles: 1%=1.00B; 5%=1.00B; 10%=1.00B; 20%=4.00B; 50%=4.00B; 80%=4.00B; 90%=4.00B; 95%=4.00B; 99%=4.00B
Metric: OutboundData
TotalSamples: 3896
Counter: 1.06GB
ValueRate: 392.88KB / second
Rate: 5.66557 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=8.00B; 80%=8.00B; 90%=8.00B; 95%=776.00KB; 99%=776.00KB
Metric: ReleaseDataHandlesTime
TotalSamples: 21526
Counter: 01m12s379ms297.099us
ValueRate: 093ms179.616us / second
Rate: 32.3699 / second
Percentiles: 1%=613.784us; 5%=735.655us; 10%=826.475us; 20%=968.621us; 50%=001ms420.580us; 80%=005ms154.201us; 90%=008ms782.910us; 95%=009ms940.979us; 99%=011ms719.476us
Metric: TransferFromServerTime
TotalSamples: 292
Counter: 987ms563.826us
ValueRate: 001ms390.702us / second
Rate: 0.411616 / second
Percentiles: 1%=737.370us; 5%=848.229us; 10%=935.226us; 20%=001ms100.351us; 50%=001ms354.718us; 80%=003ms615.203us; 90%=004ms898.422us; 95%=005ms113.124us; 99%=041ms986.188us
Metric: TransferToServerTime
TotalSamples: 3896
Counter: 09m06s730ms197.371us
ValueRate: 775ms324.834us / second
Rate: 5.68105 / second
Percentiles: 1%=001ms146.135us; 5%=001ms365.526us; 10%=001ms488.694us; 20%=002ms661.707us; 50%=007ms800.587us; 80%=215ms395.242us; 90%=518ms206.431us; 95%=524ms776.884us; 99%=530ms853.429us
Counter: CachedSyncTensors
Value: 1455
Counter: CreateCompileHandles
Value: 7
Counter: CreateDataHandles
Value: 547608
Counter: CreateXlaTensor
Value: 4290739
Counter: DestroyDataHandles
Value: 546728
Counter: DestroyXlaTensor
Value: 4289987
Counter: ReleaseDataHandles
Value: 546728
Counter: SyncTensorsToData
Value: 184
Counter: UncachedSyncTensors
Value: 7
Counter: XRTAllocateFromTensor_Empty
Value: 230
Counter: XrtCompile_Empty
Value: 1280
Counter: XrtExecuteChained_Empty
Value: 1280
Counter: XrtExecute_Empty
Value: 1280
Counter: XrtRead_Empty
Value: 1280
Counter: XrtReleaseAllocationHandle_Empty
Value: 1280
Counter: XrtReleaseCompileHandle_Empty
Value: 1280
Counter: XrtSessionCount
Value: 12
Counter: XrtSubTuple_Empty
Value: 1280
Counter: aten::_local_scalar_dense
Value: 292
Validating the subset "valid"
| WARNING: 72 samples have invalid sizes and will be skipped, max_positions=(64, 64), first few sample ids=[1808, 612, 1824, 2185, 1244, 723, 1226, 388, 1791, 378]
validation/ 18:58:11, device xla:1, step 0
validation/ 18:58:13, device xla:1, step 10
validation/ 18:58:15, device xla:1, step 20
validation stats on subset "valid" - 18:58:16
| epoch 033 | valid on 'valid' subset | loss 10.081 | nll_loss 9.153 | ppl 569.11 | num_updates 726
old learning rate: 8.808240000000002e-05
new learning rate: 9.083185000000002e-05
Metric: CompileTime
TotalSamples: 7
Counter: 03m56s897ms46.438us
ValueRate: 737ms590.273us / second
Rate: 0.0293134 / second
Percentiles: 1%=091ms699.712us; 5%=091ms699.712us; 10%=091ms699.712us; 20%=206ms627.326us; 50%=11s663ms676.532us; 80%=01m11s278ms507.643us; 90%=01m12s293ms561.423us; 95%=01m12s293ms561.423us; 99%=01m12s293ms561.423us
Metric: ExecuteTime
TotalSamples: 1485
Counter: 10m26s227ms516.410us
ValueRate: 879ms564.094us / second
Rate: 2.16456 / second
Percentiles: 1%=002ms138.434us; 5%=216ms764.703us; 10%=216ms317.827us; 20%=217ms7.153us; 50%=219ms957.791us; 80%=617ms295.492us; 90%=618ms380.440us; 95%=620ms658.590us; 99%=622ms696.441us
Metric: InboundData
TotalSamples: 296
Counter: 1.06KB
ValueRate: 1.52B / second
Rate: 0.414304 / second
Percentiles: 1%=1.00B; 5%=1.00B; 10%=1.00B; 20%=4.00B; 50%=4.00B; 80%=4.00B; 90%=4.00B; 95%=4.00B; 99%=4.00B
Metric: OutboundData
TotalSamples: 3925
Counter: 1.06GB
ValueRate: 407.76KB / second
Rate: 5.66349 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=8.00B; 80%=8.00B; 90%=388.00KB; 95%=776.00KB; 99%=776.00KB
Metric: ReleaseDataHandlesTime
TotalSamples: 21626
Counter: 01m12s489ms780.330us
ValueRate: 080ms248.766us / second
Rate: 30.2544 / second
Percentiles: 1%=613.784us; 5%=738.312us; 10%=833.917us; 20%=952.945us; 50%=001ms344.943us; 80%=004ms265.141us; 90%=008ms567.020us; 95%=009ms806.286us; 99%=010ms410.829us
Metric: TransferFromServerTime
TotalSamples: 296
Counter: 993ms729.880us
ValueRate: 001ms389.499us / second
Rate: 0.414304 / second
Percentiles: 1%=737.370us; 5%=848.229us; 10%=935.226us; 20%=001ms100.351us; 50%=001ms354.718us; 80%=003ms581.618us; 90%=004ms898.422us; 95%=005ms113.124us; 99%=041ms986.188us
Metric: TransferToServerTime
TotalSamples: 3925
Counter: 09m11s632ms811.354us
ValueRate: 777ms173.211us / second
Rate: 5.66347 / second
Percentiles: 1%=001ms158.120us; 5%=001ms375.952us; 10%=001ms496.124us; 20%=002ms669.289us; 50%=007ms947.845us; 80%=215ms56.845us; 90%=518ms736.933us; 95%=523ms439.106us; 99%=530ms853.429us
Counter: CachedSyncTensors
Value: 1478
Counter: CreateCompileHandles
Value: 7
Counter: CreateDataHandles
Value: 547885
Counter: CreateXlaTensor
Value: 4315340
Counter: DestroyDataHandles
Value: 547004
Counter: DestroyXlaTensor
Value: 4314588
Counter: ReleaseDataHandles
Value: 547005
Counter: SyncTensorsToData
Value: 184
Counter: UncachedSyncTensors
Value: 7
Counter: XRTAllocateFromTensor_Empty
Value: 230
Counter: XrtCompile_Empty
Value: 1280
Counter: XrtExecuteChained_Empty
Value: 1280
Counter: XrtExecute_Empty
Value: 1280
Counter: XrtRead_Empty
Value: 1280
Counter: XrtReleaseAllocationHandle_Empty
Value: 1280
Counter: XrtReleaseCompileHandle_Empty
Value: 1280
Counter: XrtSessionCount
Value: 12
Counter: XrtSubTuple_Empty
Value: 1280
Counter: aten::_local_scalar_dense
Value: 296
Epoch 34 begin 18:58:16
training/ 18:58:23, device xla:1, step 10, Rate=171.54, Global Rate=195.95
training/ 18:58:30, device xla:1, step 20, Rate=176.53, Global Rate=186.32
Epoch 34 Training stats:
device xla:1
| epoch 034 | loss 8.317 | nll_loss 7.333 | ppl 161.18 | wps 2734 | ups 1 | wpb 3319.773 | bsz 128.000 | num_updates 748 | lr 9.35813e-05 | gnorm 7.109 | clip 0.000 | oom 0.000 | wall 908 | train_wall 486
Epoch 34 Tracker Rates:
Rate=177.74, Global Rate=185.71
Epoch 34 end 18:58:32
Metric: CompileTime
TotalSamples: 7
Counter: 03m56s897ms46.438us
ValueRate: 737ms590.273us / second
Rate: 0.0293134 / second
Percentiles: 1%=091ms699.712us; 5%=091ms699.712us; 10%=091ms699.712us; 20%=206ms627.326us; 50%=11s663ms676.532us; 80%=01m11s278ms507.643us; 90%=01m12s293ms561.423us; 95%=01m12s293ms561.423us; 99%=01m12s293ms561.423us
Metric: ExecuteTime
TotalSamples: 1507
Counter: 11m40s770ms173.879us
ValueRate: 877ms101.669us / second
Rate: 2.13851 / second
Percentiles: 1%=002ms138.434us; 5%=216ms764.703us; 10%=216ms317.827us; 20%=217ms7.153us; 50%=220ms823.010us; 80%=617ms303.839us; 90%=619ms534.130us; 95%=620ms682.373us; 99%=622ms864.520us
Metric: InboundData
TotalSamples: 301
Counter: 1.08KB
ValueRate: 1.52B / second
Rate: 0.412177 / second
Percentiles: 1%=1.00B; 5%=1.00B; 10%=1.00B; 20%=4.00B; 50%=4.00B; 80%=4.00B; 90%=4.00B; 95%=4.00B; 99%=4.00B
Metric: OutboundData
TotalSamples: 4015
Counter: 1.07GB
ValueRate: 392.92KB / second
Rate: 5.66615 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=8.00B; 80%=8.00B; 90%=8.00B; 95%=776.00KB; 99%=776.00KB
Metric: ReleaseDataHandlesTime
TotalSamples: 22205
Counter: 01m14s175ms262.678us
ValueRate: 093ms322.322us / second
Rate: 33.8982 / second
Percentiles: 1%=578.794us; 5%=712.034us; 10%=820.323us; 20%=957.918us; 50%=001ms422.807us; 80%=005ms547.324us; 90%=007ms252.255us; 95%=009ms719.575us; 99%=011ms692.772us
Metric: TransferFromServerTime
TotalSamples: 301
Counter: 999ms414.719us
ValueRate: 001ms368.557us / second
Rate: 0.412177 / second
Percentiles: 1%=738.270us; 5%=849.291us; 10%=944.157us; 20%=001ms98.713us; 50%=001ms351.346us; 80%=003ms522.011us; 90%=004ms829.824us; 95%=005ms817.284us; 99%=041ms787.486us
Metric: TransferToServerTime
TotalSamples: 4015
Counter: 09m22s651ms729.757us
ValueRate: 775ms330.598us / second
Rate: 5.68248 / second
Percentiles: 1%=001ms146.135us; 5%=001ms351.065us; 10%=001ms487.383us; 20%=002ms656.955us; 50%=007ms698.995us; 80%=215ms308.015us; 90%=518ms458.133us; 95%=524ms805.913us; 99%=530ms853.429us
Counter: CachedSyncTensors
Value: 1500
Counter: CreateCompileHandles
Value: 7
Counter: CreateDataHandles
Value: 564205
Counter: CreateXlaTensor
Value: 4421490
Counter: DestroyDataHandles
Value: 563325
Counter: DestroyXlaTensor
Value: 4420738
Counter: ReleaseDataHandles
Value: 563325
Counter: SyncTensorsToData
Value: 184
Counter: UncachedSyncTensors
Value: 7
Counter: XRTAllocateFromTensor_Empty
Value: 230
Counter: XrtCompile_Empty
Value: 1280
Counter: XrtExecuteChained_Empty
Value: 1280
Counter: XrtExecute_Empty
Value: 1280
Counter: XrtRead_Empty
Value: 1280
Counter: XrtReleaseAllocationHandle_Empty
Value: 1280
Counter: XrtReleaseCompileHandle_Empty
Value: 1280
Counter: XrtSessionCount
Value: 12
Counter: XrtSubTuple_Empty
Value: 1280
Counter: aten::_local_scalar_dense
Value: 301
Validating the subset "valid"
| WARNING: 72 samples have invalid sizes and will be skipped, max_positions=(64, 64), first few sample ids=[1808, 612, 1824, 2185, 1244, 723, 1226, 388, 1791, 378]
validation/ 18:58:32, device xla:1, step 0
validation/ 18:58:34, device xla:1, step 10
validation/ 18:58:36, device xla:1, step 20
validation stats on subset "valid" - 18:58:37
| epoch 034 | valid on 'valid' subset | loss 10.087 | nll_loss 9.176 | ppl 578.27 | num_updates 748
old learning rate: 9.083185000000002e-05
new learning rate: 9.358130000000002e-05
Metric: CompileTime
TotalSamples: 7
Counter: 03m56s897ms46.438us
ValueRate: 737ms590.273us / second
Rate: 0.0293134 / second
Percentiles: 1%=091ms699.712us; 5%=091ms699.712us; 10%=091ms699.712us; 20%=206ms627.326us; 50%=11s663ms676.532us; 80%=01m11s278ms507.643us; 90%=01m12s293ms561.423us; 95%=01m12s293ms561.423us; 99%=01m12s293ms561.423us
Metric: ExecuteTime
TotalSamples: 1530
Counter: 11m45s530ms484.790us
ValueRate: 879ms758.937us / second
Rate: 2.16545 / second
Percentiles: 1%=002ms97.286us; 5%=216ms592.444us; 10%=216ms238.827us; 20%=217ms903.937us; 50%=219ms922.931us; 80%=617ms257.187us; 90%=618ms341.629us; 95%=620ms607.270us; 99%=622ms696.441us
Metric: InboundData
TotalSamples: 305
Counter: 1.09KB
ValueRate: 1.52B / second
Rate: 0.4148 / second
Percentiles: 1%=1.00B; 5%=1.00B; 10%=1.00B; 20%=4.00B; 50%=4.00B; 80%=4.00B; 90%=4.00B; 95%=4.00B; 99%=4.00B
Metric: OutboundData
TotalSamples: 4043
Counter: 1.07GB
ValueRate: 408.24KB / second
Rate: 5.67011 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=8.00B; 80%=8.00B; 90%=388.00KB; 95%=776.00KB; 99%=776.00KB
Metric: ReleaseDataHandlesTime
TotalSamples: 22297
Counter: 01m14s278ms218.918us
ValueRate: 081ms944.533us / second
Rate: 31.68 / second
Percentiles: 1%=578.930us; 5%=737.035us; 10%=824.931us; 20%=959.982us; 50%=001ms331.280us; 80%=004ms982.741us; 90%=007ms868.960us; 95%=008ms280.727us; 99%=010ms323.952us
Metric: TransferFromServerTime
TotalSamples: 305
Counter: 01s007ms182.688us
ValueRate: 001ms369.768us / second
Rate: 0.4148 / second
Percentiles: 1%=738.270us; 5%=849.291us; 10%=944.157us; 20%=001ms100.351us; 50%=001ms354.718us; 80%=003ms581.618us; 90%=004ms829.824us; 95%=005ms817.284us; 99%=041ms787.486us
Metric: TransferToServerTime
TotalSamples: 4043
Counter: 09m27s577ms845.833us
ValueRate: 777ms326.969us / second
Rate: 5.66402 / second
Percentiles: 1%=001ms146.135us; 5%=001ms365.526us; 10%=001ms494.103us; 20%=002ms668.018us; 50%=007ms910.660us; 80%=215ms756.405us; 90%=518ms840.016us; 95%=524ms776.884us; 99%=530ms853.429us
Counter: CachedSyncTensors
Value: 1523
Counter: CreateCompileHandles
Value: 7
Counter: CreateDataHandles
Value: 564481
Counter: CreateXlaTensor
Value: 4446091
Counter: DestroyDataHandles
Value: 563600
Counter: DestroyXlaTensor
Value: 4445339
Counter: ReleaseDataHandles
Value: 563601
Counter: SyncTensorsToData
Value: 184
Counter: UncachedSyncTensors
Value: 7
Counter: XRTAllocateFromTensor_Empty
Value: 230
Counter: XrtCompile_Empty
Value: 1280
Counter: XrtExecuteChained_Empty
Value: 1280
Counter: XrtExecute_Empty
Value: 1280
Counter: XrtRead_Empty
Value: 1280
Counter: XrtReleaseAllocationHandle_Empty
Value: 1280
Counter: XrtReleaseCompileHandle_Empty
Value: 1280
Counter: XrtSessionCount
Value: 12
Counter: XrtSubTuple_Empty
Value: 1280
Counter: aten::_local_scalar_dense
Value: 305
Epoch 35 begin 18:58:37
training/ 18:58:44, device xla:1, step 10, Rate=169.60, Global Rate=195.16
training/ 18:58:51, device xla:1, step 20, Rate=179.81, Global Rate=187.80
Epoch 35 Training stats:
device xla:1
| epoch 035 | loss 8.158 | nll_loss 7.151 | ppl 142.17 | wps 2752 | ups 1 | wpb 3319.773 | bsz 128.000 | num_updates 770 | lr 9.63308e-05 | gnorm 7.064 | clip 0.000 | oom 0.000 | wall 929 | train_wall 499
Epoch 35 Tracker Rates:
Rate=179.77, Global Rate=187.04
Epoch 35 end 18:58:53
Metric: CompileTime
TotalSamples: 7
Counter: 03m56s897ms46.438us
ValueRate: 737ms590.273us / second
Rate: 0.0293134 / second
Percentiles: 1%=091ms699.712us; 5%=091ms699.712us; 10%=091ms699.712us; 20%=206ms627.326us; 50%=11s663ms676.532us; 80%=01m11s278ms507.643us; 90%=01m12s293ms561.423us; 95%=01m12s293ms561.423us; 99%=01m12s293ms561.423us
Metric: ExecuteTime
TotalSamples: 1552
Counter: 11m58s056ms18.878us
ValueRate: 878ms517.554us / second
Rate: 2.14005 / second
Percentiles: 1%=002ms97.286us; 5%=216ms592.444us; 10%=216ms238.827us; 20%=217ms903.937us; 50%=220ms823.010us; 80%=617ms253.517us; 90%=618ms341.629us; 95%=620ms650.605us; 99%=622ms696.441us
Metric: InboundData
TotalSamples: 310
Counter: 1.11KB
ValueRate: 1.52B / second
Rate: 0.412778 / second
Percentiles: 1%=1.00B; 5%=1.00B; 10%=1.00B; 20%=4.00B; 50%=4.00B; 80%=4.00B; 90%=4.00B; 95%=4.00B; 99%=4.00B
Metric: OutboundData
TotalSamples: 4133
Counter: 1.07GB
ValueRate: 394.15KB / second
Rate: 5.6838 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=8.00B; 80%=8.00B; 90%=8.00B; 95%=776.00KB; 99%=776.00KB
Metric: ReleaseDataHandlesTime
TotalSamples: 22839
Counter: 01m16s932ms303.137us
ValueRate: 095ms911.626us / second
Rate: 33.2769 / second
Percentiles: 1%=597.142us; 5%=729.082us; 10%=827.754us; 20%=975.261us; 50%=001ms445.545us; 80%=005ms34.761us; 90%=007ms286.443us; 95%=008ms455.896us; 99%=011ms848.315us
Metric: TransferFromServerTime
TotalSamples: 310
Counter: 01s016ms811.896us
ValueRate: 001ms352.595us / second
Rate: 0.412778 / second
Percentiles: 1%=737.370us; 5%=845.519us; 10%=935.226us; 20%=001ms96.757us; 50%=001ms354.718us; 80%=003ms581.618us; 90%=004ms898.422us; 95%=005ms817.284us; 99%=041ms787.486us
Metric: TransferToServerTime
TotalSamples: 4133
Counter: 10m38s528ms450.238us
ValueRate: 773ms501.443us / second
Rate: 5.68383 / second
Percentiles: 1%=001ms146.135us; 5%=001ms351.065us; 10%=001ms483.796us; 20%=002ms644.786us; 50%=007ms588.402us; 80%=215ms86.356us; 90%=518ms206.431us; 95%=524ms807.196us; 99%=530ms946.063us
Counter: CachedSyncTensors
Value: 1545
Counter: CreateCompileHandles
Value: 7
Counter: CreateDataHandles
Value: 580801
Counter: CreateXlaTensor
Value: 4552241
Counter: DestroyDataHandles
Value: 579921
Counter: DestroyXlaTensor
Value: 4551489
Counter: ReleaseDataHandles
Value: 579921
Counter: SyncTensorsToData
Value: 184
Counter: UncachedSyncTensors
Value: 7
Counter: XRTAllocateFromTensor_Empty
Value: 230
Counter: XrtCompile_Empty
Value: 1280
Counter: XrtExecuteChained_Empty
Value: 1280
Counter: XrtExecute_Empty
Value: 1280
Counter: XrtRead_Empty
Value: 1280
Counter: XrtReleaseAllocationHandle_Empty
Value: 1280
Counter: XrtReleaseCompileHandle_Empty
Value: 1280
Counter: XrtSessionCount
Value: 12
Counter: XrtSubTuple_Empty
Value: 1280
Counter: aten::_local_scalar_dense
Value: 310
Validating the subset "valid"
| WARNING: 72 samples have invalid sizes and will be skipped, max_positions=(64, 64), first few sample ids=[1808, 612, 1824, 2185, 1244, 723, 1226, 388, 1791, 378]
validation/ 18:58:53, device xla:1, step 0
validation/ 18:58:55, device xla:1, step 10
validation/ 18:58:57, device xla:1, step 20
validation stats on subset "valid" - 18:58:58
| epoch 035 | valid on 'valid' subset | loss 9.842 | nll_loss 8.895 | ppl 476.22 | num_updates 770
old learning rate: 9.358130000000002e-05
new learning rate: 9.633075000000002e-05
Metric: CompileTime
TotalSamples: 7
Counter: 03m56s897ms46.438us
ValueRate: 737ms590.273us / second
Rate: 0.0293134 / second
Percentiles: 1%=091ms699.712us; 5%=091ms699.712us; 10%=091ms699.712us; 20%=206ms627.326us; 50%=11s663ms676.532us; 80%=01m11s278ms507.643us; 90%=01m12s293ms561.423us; 95%=01m12s293ms561.423us; 99%=01m12s293ms561.423us
Metric: ExecuteTime
TotalSamples: 1575
Counter: 11m03s849ms505.643us
ValueRate: 879ms147.335us / second
Rate: 2.16678 / second
Percentiles: 1%=002ms97.286us; 5%=216ms573.053us; 10%=216ms232.045us; 20%=217ms887.373us; 50%=219ms937.099us; 80%=617ms82.513us; 90%=618ms227.071us; 95%=620ms526.896us; 99%=622ms696.441us
Metric: InboundData
TotalSamples: 314
Counter: 1.13KB
ValueRate: 1.53B / second
Rate: 0.415307 / second
Percentiles: 1%=1.00B; 5%=1.00B; 10%=1.00B; 20%=4.00B; 50%=4.00B; 80%=4.00B; 90%=4.00B; 95%=4.00B; 99%=4.00B
Metric: OutboundData
TotalSamples: 4162
Counter: 1.08GB
ValueRate: 409.71KB / second
Rate: 5.69056 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=8.00B; 80%=8.00B; 90%=388.00KB; 95%=776.00KB; 99%=776.00KB
Metric: ReleaseDataHandlesTime
TotalSamples: 22941
Counter: 01m16s044ms938.293us
ValueRate: 082ms999.765us / second
Rate: 31.0285 / second
Percentiles: 1%=601.098us; 5%=732.029us; 10%=837.703us; 20%=974.652us; 50%=001ms341.010us; 80%=005ms520.557us; 90%=007ms116.186us; 95%=008ms244.133us; 99%=010ms474.594us
Metric: TransferFromServerTime
TotalSamples: 314
Counter: 01s022ms780.154us
ValueRate: 001ms351.440us / second
Rate: 0.415307 / second
Percentiles: 1%=737.370us; 5%=845.519us; 10%=935.226us; 20%=001ms96.757us; 50%=001ms353.268us; 80%=003ms522.011us; 90%=004ms829.824us; 95%=005ms817.284us; 99%=041ms787.486us
Metric: TransferToServerTime
TotalSamples: 4162
Counter: 10m42s486ms723.190us
ValueRate: 774ms895.884us / second
Rate: 5.68438 / second
Percentiles: 1%=001ms146.135us; 5%=001ms362.109us; 10%=001ms487.383us; 20%=002ms648.096us; 50%=007ms684.198us; 80%=215ms557.017us; 90%=518ms713.572us; 95%=524ms766.245us; 99%=530ms682.508us
Counter: CachedSyncTensors
Value: 1568
Counter: CreateCompileHandles
Value: 7
Counter: CreateDataHandles
Value: 581078
Counter: CreateXlaTensor
Value: 4576842
Counter: DestroyDataHandles
Value: 580197
Counter: DestroyXlaTensor
Value: 4576091
Counter: ReleaseDataHandles
Value: 580199
Counter: SyncTensorsToData
Value: 184
Counter: UncachedSyncTensors
Value: 7
Counter: XRTAllocateFromTensor_Empty
Value: 230
Counter: XrtCompile_Empty
Value: 1280
Counter: XrtExecuteChained_Empty
Value: 1280
Counter: XrtExecute_Empty
Value: 1280
Counter: XrtRead_Empty
Value: 1280
Counter: XrtReleaseAllocationHandle_Empty
Value: 1280
Counter: XrtReleaseCompileHandle_Empty
Value: 1280
Counter: XrtSessionCount
Value: 12
Counter: XrtSubTuple_Empty
Value: 1280
Counter: aten::_local_scalar_dense
Value: 314
Epoch 36 begin 18:58:58
training/ 18:59:04, device xla:1, step 10, Rate=171.18, Global Rate=197.02
training/ 18:59:11, device xla:1, step 20, Rate=178.03, Global Rate=187.80
Epoch 36 Training stats:
device xla:1
| epoch 036 | loss 8.002 | nll_loss 6.974 | ppl 125.71 | wps 2768 | ups 1 | wpb 3319.773 | bsz 128.000 | num_updates 792 | lr 9.90802e-05 | gnorm 6.988 | clip 0.000 | oom 0.000 | wall 950 | train_wall 513
Epoch 36 Tracker Rates:
Rate=178.70, Global Rate=187.04
Epoch 36 end 18:59:13
Metric: CompileTime
TotalSamples: 7
Counter: 03m56s897ms46.438us
ValueRate: 737ms590.273us / second
Rate: 0.0293134 / second
Percentiles: 1%=091ms699.712us; 5%=091ms699.712us; 10%=091ms699.712us; 20%=206ms627.326us; 50%=11s663ms676.532us; 80%=01m11s278ms507.643us; 90%=01m12s293ms561.423us; 95%=01m12s293ms561.423us; 99%=01m12s293ms561.423us
Metric: ExecuteTime
TotalSamples: 1597
Counter: 11m16s411ms116.455us
ValueRate: 878ms888.467us / second
Rate: 2.14112 / second
Percentiles: 1%=002ms97.286us; 5%=216ms573.053us; 10%=216ms232.045us; 20%=217ms887.373us; 50%=220ms823.010us; 80%=617ms90.001us; 90%=618ms263.931us; 95%=620ms607.270us; 99%=622ms864.520us
Metric: InboundData
TotalSamples: 319
Counter: 1.15KB
ValueRate: 1.52B / second
Rate: 0.413328 / second
Percentiles: 1%=1.00B; 5%=1.00B; 10%=1.00B; 20%=4.00B; 50%=4.00B; 80%=4.00B; 90%=4.00B; 95%=4.00B; 99%=4.00B
Metric: OutboundData
TotalSamples: 4249
Counter: 1.08GB
ValueRate: 392.47KB / second
Rate: 5.65968 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=8.00B; 80%=8.00B; 90%=8.00B; 95%=776.00KB; 99%=776.00KB
Metric: ReleaseDataHandlesTime
TotalSamples: 23474
Counter: 01m18s713ms226.168us
ValueRate: 092ms261.461us / second
Rate: 32.5824 / second
Percentiles: 1%=594.351us; 5%=734.257us; 10%=849.539us; 20%=994.113us; 50%=001ms392.098us; 80%=005ms64.507us; 90%=007ms453.193us; 95%=009ms738.299us; 99%=011ms643.463us
Metric: TransferFromServerTime
TotalSamples: 319
Counter: 01s030ms833.946us
ValueRate: 001ms334.356us / second
Rate: 0.413328 / second
Percentiles: 1%=737.370us; 5%=845.519us; 10%=935.226us; 20%=001ms98.713us; 50%=001ms354.718us; 80%=003ms522.011us; 90%=004ms829.824us; 95%=005ms817.284us; 99%=041ms787.486us
Metric: TransferToServerTime
TotalSamples: 4249
Counter: 10m53s367ms6.091us
ValueRate: 768ms128.216us / second
Rate: 5.65968 / second
Percentiles: 1%=001ms188.492us; 5%=001ms393.852us; 10%=001ms494.103us; 20%=002ms667.147us; 50%=006ms445.888us; 80%=215ms344.062us; 90%=517ms239.596us; 95%=523ms439.106us; 99%=530ms682.508us
Counter: CachedSyncTensors
Value: 1590
Counter: CreateCompileHandles
Value: 7
Counter: CreateDataHandles
Value: 597395
Counter: CreateXlaTensor
Value: 4682992
Counter: DestroyDataHandles
Value: 596516
Counter: DestroyXlaTensor
Value: 4682241
Counter: ReleaseDataHandles
Value: 596516
Counter: SyncTensorsToData
Value: 184
Counter: UncachedSyncTensors
Value: 7
Counter: XRTAllocateFromTensor_Empty
Value: 230
Counter: XrtCompile_Empty
Value: 1280
Counter: XrtExecuteChained_Empty
Value: 1280
Counter: XrtExecute_Empty
Value: 1280
Counter: XrtRead_Empty
Value: 1280
Counter: XrtReleaseAllocationHandle_Empty
Value: 1280
Counter: XrtReleaseCompileHandle_Empty
Value: 1280
Counter: XrtSessionCount
Value: 12
Counter: XrtSubTuple_Empty
Value: 1280
Counter: aten::_local_scalar_dense
Value: 319
Validating the subset "valid"
| WARNING: 72 samples have invalid sizes and will be skipped, max_positions=(64, 64), first few sample ids=[1808, 612, 1824, 2185, 1244, 723, 1226, 388, 1791, 378]
validation/ 18:59:14, device xla:1, step 0
validation/ 18:59:16, device xla:1, step 10
validation/ 18:59:18, device xla:1, step 20
validation stats on subset "valid" - 18:59:18
| epoch 036 | valid on 'valid' subset | loss 9.868 | nll_loss 8.920 | ppl 484.45 | num_updates 792
old learning rate: 9.633075000000002e-05
new learning rate: 9.908020000000002e-05
Metric: CompileTime
TotalSamples: 7
Counter: 03m56s897ms46.438us
ValueRate: 737ms590.273us / second
Rate: 0.0293134 / second
Percentiles: 1%=091ms699.712us; 5%=091ms699.712us; 10%=091ms699.712us; 20%=206ms627.326us; 50%=11s663ms676.532us; 80%=01m11s278ms507.643us; 90%=01m12s293ms561.423us; 95%=01m12s293ms561.423us; 99%=01m12s293ms561.423us
Metric: ExecuteTime
TotalSamples: 1620
Counter: 11m21s185ms286.754us
ValueRate: 880ms755.299us / second
Rate: 2.16848 / second
Percentiles: 1%=002ms97.286us; 5%=215ms475.731us; 10%=216ms150.289us; 20%=217ms819.327us; 50%=219ms941.547us; 80%=617ms12.284us; 90%=618ms217.504us; 95%=619ms489.421us; 99%=622ms864.520us
Metric: InboundData
TotalSamples: 323
Counter: 1.16KB
ValueRate: 1.53B / second
Rate: 0.415799 / second
Percentiles: 1%=1.00B; 5%=1.00B; 10%=1.00B; 20%=4.00B; 50%=4.00B; 80%=4.00B; 90%=4.00B; 95%=4.00B; 99%=4.00B
Metric: OutboundData
TotalSamples: 4278
Counter: 1.09GB
ValueRate: 408.22KB / second
Rate: 5.66995 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=8.00B; 80%=8.00B; 90%=388.00KB; 95%=776.00KB; 99%=776.00KB
Metric: ReleaseDataHandlesTime
TotalSamples: 23575
Counter: 01m18s829ms17.511us
ValueRate: 081ms483.542us / second
Rate: 30.4821 / second
Percentiles: 1%=604.035us; 5%=745.394us; 10%=858.932us; 20%=994.914us; 50%=001ms336.190us; 80%=005ms600.736us; 90%=007ms328.361us; 95%=009ms552.702us; 99%=011ms643.463us
Metric: TransferFromServerTime
TotalSamples: 323
Counter: 01s036ms338.599us
ValueRate: 001ms334.082us / second
Rate: 0.415799 / second
Percentiles: 1%=737.370us; 5%=848.229us; 10%=944.157us; 20%=001ms96.757us; 50%=001ms353.268us; 80%=003ms522.011us; 90%=004ms543.772us; 95%=005ms806.005us; 99%=041ms787.486us
Metric: TransferToServerTime
TotalSamples: 4278
Counter: 10m58s328ms729.955us
ValueRate: 773ms185.667us / second
Rate: 5.66404 / second
Percentiles: 1%=001ms188.492us; 5%=001ms401.377us; 10%=001ms498.951us; 20%=002ms680.420us; 50%=007ms588.402us; 80%=215ms977.602us; 90%=517ms124.482us; 95%=523ms393.753us; 99%=529ms326.354us
Counter: CachedSyncTensors
Value: 1613
Counter: CreateCompileHandles
Value: 7
Counter: CreateDataHandles
Value: 597672
Counter: CreateXlaTensor
Value: 4707593
Counter: DestroyDataHandles
Value: 596792
Counter: DestroyXlaTensor
Value: 4706841
Counter: ReleaseDataHandles
Value: 596792
Counter: SyncTensorsToData
Value: 184
Counter: UncachedSyncTensors
Value: 7
Counter: XRTAllocateFromTensor_Empty
Value: 230
Counter: XrtCompile_Empty
Value: 1280
Counter: XrtExecuteChained_Empty
Value: 1280
Counter: XrtExecute_Empty
Value: 1280
Counter: XrtRead_Empty
Value: 1280
Counter: XrtReleaseAllocationHandle_Empty
Value: 1280
Counter: XrtReleaseCompileHandle_Empty
Value: 1280
Counter: XrtSessionCount
Value: 12
Counter: XrtSubTuple_Empty
Value: 1280
Counter: aten::_local_scalar_dense
Value: 323
Epoch 37 begin 18:59:19
training/ 18:59:25, device xla:1, step 10, Rate=170.80, Global Rate=194.76
training/ 18:59:32, device xla:1, step 20, Rate=177.88, Global Rate=187.25
Epoch 37 Training stats:
device xla:1
| epoch 037 | loss 7.851 | nll_loss 6.802 | ppl 111.60 | wps 2784 | ups 1 | wpb 3319.773 | bsz 128.000 | num_updates 814 | lr 0.00010183 | gnorm 6.902 | clip 0.000 | oom 0.000 | wall 971 | train_wall 527
Epoch 37 Tracker Rates:
Rate=178.45, Global Rate=186.51
Epoch 37 end 18:59:34
Metric: CompileTime
TotalSamples: 7
Counter: 03m56s897ms46.438us
ValueRate: 737ms590.273us / second
Rate: 0.0293134 / second
Percentiles: 1%=091ms699.712us; 5%=091ms699.712us; 10%=091ms699.712us; 20%=206ms627.326us; 50%=11s663ms676.532us; 80%=01m11s278ms507.643us; 90%=01m12s293ms561.423us; 95%=01m12s293ms561.423us; 99%=01m12s293ms561.423us
Metric: ExecuteTime
TotalSamples: 1642
Counter: 12m35s713ms937.939us
ValueRate: 878ms417.228us / second
Rate: 2.14285 / second
Percentiles: 1%=002ms97.286us; 5%=215ms475.731us; 10%=216ms150.289us; 20%=217ms819.327us; 50%=220ms823.010us; 80%=617ms941.539us; 90%=618ms172.641us; 95%=619ms420.618us; 99%=622ms696.441us
Metric: InboundData
TotalSamples: 328
Counter: 1.18KB
ValueRate: 1.52B / second
Rate: 0.413839 / second
Percentiles: 1%=1.00B; 5%=1.00B; 10%=1.00B; 20%=4.00B; 50%=4.00B; 80%=4.00B; 90%=4.00B; 95%=4.00B; 99%=4.00B
Metric: OutboundData
TotalSamples: 4366
Counter: 1.09GB
ValueRate: 392.60KB / second
Rate: 5.66149 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=8.00B; 80%=8.00B; 90%=8.00B; 95%=776.00KB; 99%=776.00KB
Metric: ReleaseDataHandlesTime
TotalSamples: 24113
Counter: 01m20s524ms674.702us
ValueRate: 095ms109.323us / second
Rate: 32.583 / second
Percentiles: 1%=575.908us; 5%=732.473us; 10%=818.250us; 20%=987.147us; 50%=001ms437.711us; 80%=005ms297.228us; 90%=008ms942.504us; 95%=009ms998.606us; 99%=011ms686.824us
Metric: TransferFromServerTime
TotalSamples: 328
Counter: 01s046ms694.269us
ValueRate: 001ms319.358us / second
Rate: 0.413839 / second
Percentiles: 1%=737.370us; 5%=848.229us; 10%=935.226us; 20%=001ms96.757us; 50%=001ms357.165us; 80%=003ms522.011us; 90%=004ms543.772us; 95%=005ms806.005us; 99%=041ms787.486us
Metric: TransferToServerTime
TotalSamples: 4366
Counter: 10m09s220ms965.534us
ValueRate: 768ms466.481us / second
Rate: 5.66158 / second
Percentiles: 1%=001ms187.740us; 5%=001ms391.925us; 10%=001ms497.380us; 20%=002ms674.388us; 50%=006ms369.923us; 80%=215ms418.019us; 90%=517ms192.304us; 95%=524ms745.620us; 99%=529ms326.354us
Counter: CachedSyncTensors
Value: 1635
Counter: CreateCompileHandles
Value: 7
Counter: CreateDataHandles
Value: 613990
Counter: CreateXlaTensor
Value: 4813743
Counter: DestroyDataHandles
Value: 613110
Counter: DestroyXlaTensor
Value: 4812991
Counter: ReleaseDataHandles
Value: 613110
Counter: SyncTensorsToData
Value: 184
Counter: UncachedSyncTensors
Value: 7
Counter: XRTAllocateFromTensor_Empty
Value: 230
Counter: XrtCompile_Empty
Value: 1280
Counter: XrtExecuteChained_Empty
Value: 1280
Counter: XrtExecute_Empty
Value: 1280
Counter: XrtRead_Empty
Value: 1280
Counter: XrtReleaseAllocationHandle_Empty
Value: 1280
Counter: XrtReleaseCompileHandle_Empty
Value: 1280
Counter: XrtSessionCount
Value: 12
Counter: XrtSubTuple_Empty
Value: 1280
Counter: aten::_local_scalar_dense
Value: 328
Validating the subset "valid"
| WARNING: 72 samples have invalid sizes and will be skipped, max_positions=(64, 64), first few sample ids=[1808, 612, 1824, 2185, 1244, 723, 1226, 388, 1791, 378]
validation/ 18:59:34, device xla:1, step 0
validation/ 18:59:36, device xla:1, step 10
validation/ 18:59:39, device xla:1, step 20
validation stats on subset "valid" - 18:59:39
| epoch 037 | valid on 'valid' subset | loss 9.931 | nll_loss 9.040 | ppl 526.30 | num_updates 814
old learning rate: 9.908020000000002e-05
new learning rate: 0.00010182965000000002
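Note on the "old/new learning rate" lines: the logged values are consistent with a linear warmup from `--warmup-init-lr 1e-07` to `--lr 0.0005` over `--warmup-updates 4000`, as set in the launch script. The sketch below is only a reader's check under that assumption. The function name `warmup_lr` is hypothetical and not part of fairseq, and the mapping of the "old" value to update 792 (22 updates before 814) is inferred from the per-epoch update count rather than stated in the log.

```python
def warmup_lr(num_updates, warmup_init_lr=1e-07, peak_lr=0.0005, warmup_updates=4000):
    """Linear warmup: assumed form of the schedule during the first warmup_updates steps."""
    lr_step = (peak_lr - warmup_init_lr) / warmup_updates
    return warmup_init_lr + num_updates * lr_step


# Compare against values printed in the log (tolerance covers float noise).
logged = {792: 9.908020000000002e-05, 814: 0.00010182965000000002}
for n, lr in logged.items():
    print(n, warmup_lr(n), abs(warmup_lr(n) - lr) < 1e-12)
```

With 22 updates per epoch, each epoch raises the rate by about 2.75e-06, which is the gap between consecutive "old" and "new" values in this log.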
Metric: CompileTime
TotalSamples: 7
Counter: 03m56s897ms46.438us
ValueRate: 737ms590.273us / second
Rate: 0.0293134 / second
Percentiles: 1%=091ms699.712us; 5%=091ms699.712us; 10%=091ms699.712us; 20%=206ms627.326us; 50%=11s663ms676.532us; 80%=01m11s278ms507.643us; 90%=01m12s293ms561.423us; 95%=01m12s293ms561.423us; 99%=01m12s293ms561.423us
Metric: ExecuteTime
TotalSamples: 1665
Counter: 12m39s480ms645.685us
ValueRate: 880ms8.533us / second
Rate: 2.16969 / second
Percentiles: 1%=002ms97.286us; 5%=215ms432.182us; 10%=216ms58.274us; 20%=217ms717.107us; 50%=219ms941.547us; 80%=617ms789.188us; 90%=618ms62.793us; 95%=619ms227.877us; 99%=622ms591.221us
Metric: InboundData
TotalSamples: 332
Counter: 1.19KB
ValueRate: 1.53B / second
Rate: 0.416236 / second
Percentiles: 1%=1.00B; 5%=1.00B; 10%=1.00B; 20%=4.00B; 50%=4.00B; 80%=4.00B; 90%=4.00B; 95%=4.00B; 99%=4.00B
Metric: OutboundData
TotalSamples: 4395
Counter: 1.09GB
ValueRate: 407.27KB / second
Rate: 5.6567 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=8.00B; 80%=8.00B; 90%=388.00KB; 95%=776.00KB; 99%=776.00KB
Metric: ReleaseDataHandlesTime
TotalSamples: 24215
Counter: 01m20s655ms595.818us
ValueRate: 082ms298.987us / second
Rate: 30.4241 / second
Percentiles: 1%=589.542us; 5%=736.929us; 10%=831.777us; 20%=995.825us; 50%=001ms346.862us; 80%=005ms741.074us; 90%=008ms540.395us; 95%=009ms814.220us; 99%=011ms686.824us
Metric: TransferFromServerTime
TotalSamples: 332
Counter: 01s052ms311.967us
ValueRate: 001ms319.307us / second
Rate: 0.416236 / second
Percentiles: 1%=737.370us; 5%=848.229us; 10%=944.157us; 20%=001ms96.757us; 50%=001ms354.718us; 80%=003ms522.011us; 90%=003ms487.366us; 95%=005ms806.005us; 99%=041ms787.486us
Metric: TransferToServerTime
TotalSamples: 4395
Counter: 10m14s150ms724.285us
ValueRate: 773ms313.651us / second
Rate: 5.66658 / second
Percentiles: 1%=001ms187.740us; 5%=001ms391.925us; 10%=002ms504.700us; 20%=002ms682.203us; 50%=006ms463.869us; 80%=214ms493.168us; 90%=517ms183.693us; 95%=523ms439.106us; 99%=529ms246.676us
Counter: CachedSyncTensors
Value: 1658
Counter: CreateCompileHandles
Value: 7
Counter: CreateDataHandles
Value: 614267
Counter: CreateXlaTensor
Value: 4838344
Counter: DestroyDataHandles
Value: 613386
Counter: DestroyXlaTensor
Value: 4837592
Counter: ReleaseDataHandles
Value: 613387
Counter: SyncTensorsToData
Value: 184
Counter: UncachedSyncTensors
Value: 7
Counter: XRTAllocateFromTensor_Empty
Value: 230
Counter: XrtCompile_Empty
Value: 1280
Counter: XrtExecuteChained_Empty
Value: 1280
Counter: XrtExecute_Empty
Value: 1280
Counter: XrtRead_Empty
Value: 1280
Counter: XrtReleaseAllocationHandle_Empty
Value: 1280
Counter: XrtReleaseCompileHandle_Empty
Value: 1280
Counter: XrtSessionCount
Value: 12
Counter: XrtSubTuple_Empty
Value: 1280
Counter: aten::_local_scalar_dense
Value: 332
Epoch 38 begin 18:59:39
training/ 18:59:46, device xla:1, step 10, Rate=170.55, Global Rate=194.48
training/ 18:59:53, device xla:1, step 20, Rate=180.27, Global Rate=187.59
Epoch 38 Training stats:
device xla:1
| epoch 038 | loss 7.721 | nll_loss 6.654 | ppl 100.70 | wps 2800 | ups 1 | wpb 3319.773 | bsz 128.000 | num_updates 836 | lr 0.000104579 | gnorm 6.946 | clip 0.000 | oom 0.000 | wall 991 | train_wall 540
Epoch 38 Tracker Rates:
Rate=180.10, Global Rate=186.86
Epoch 38 end 18:59:55
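Note on the stats lines: the `ppl` column appears to be `2 ** nll_loss`, that is, perplexity computed from a base-2 negative log-likelihood. The sketch below checks this against a few rows of this log. It is an assumption inferred from the numbers, not something the log itself states, and small differences are expected because `nll_loss` is printed rounded to three decimals.

```python
def ppl_from_nll(nll_loss):
    """Perplexity from a base-2 negative log-likelihood (assumed convention)."""
    return 2.0 ** nll_loss


# (nll_loss, logged ppl) pairs copied from the training and validation lines.
rows = [(6.654, 100.70), (9.040, 526.30), (6.502, 90.64)]
for nll, logged_ppl in rows:
    print(nll, round(ppl_from_nll(nll), 2), logged_ppl)
```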
Metric: CompileTime
TotalSamples: 7
Counter: 03m56s897ms46.438us
ValueRate: 737ms590.273us / second
Rate: 0.0293134 / second
Percentiles: 1%=091ms699.712us; 5%=091ms699.712us; 10%=091ms699.712us; 20%=206ms627.326us; 50%=11s663ms676.532us; 80%=01m11s278ms507.643us; 90%=01m12s293ms561.423us; 95%=01m12s293ms561.423us; 99%=01m12s293ms561.423us
Metric: ExecuteTime
TotalSamples: 1687
Counter: 12m53s009ms246.459us
ValueRate: 879ms729.111us / second
Rate: 2.14421 / second
Percentiles: 1%=002ms97.286us; 5%=215ms432.182us; 10%=216ms58.274us; 20%=217ms717.107us; 50%=220ms823.010us; 80%=617ms673.083us; 90%=618ms31.524us; 95%=619ms176.792us; 99%=622ms539.778us
Metric: InboundData
TotalSamples: 337
Counter: 1.21KB
ValueRate: 1.52B / second
Rate: 0.414332 / second
Percentiles: 1%=1.00B; 5%=1.00B; 10%=1.00B; 20%=4.00B; 50%=4.00B; 80%=4.00B; 90%=4.00B; 95%=4.00B; 99%=4.00B
Metric: OutboundData
TotalSamples: 4483
Counter: 1.10GB
ValueRate: 391.77KB / second
Rate: 5.64956 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=8.00B; 80%=8.00B; 90%=8.00B; 95%=776.00KB; 99%=776.00KB
Metric: ReleaseDataHandlesTime
TotalSamples: 24751
Counter: 01m21s267ms864.044us
ValueRate: 092ms128.794us / second
Rate: 31.9082 / second
Percentiles: 1%=628.496us; 5%=740.751us; 10%=831.777us; 20%=001ms25.398us; 50%=001ms481.590us; 80%=005ms943.953us; 90%=008ms791.943us; 95%=009ms962.363us; 99%=011ms546.920us
Metric: TransferFromServerTime
TotalSamples: 337
Counter: 01s065ms748.637us
ValueRate: 001ms309.079us / second
Rate: 0.414332 / second
Percentiles: 1%=737.370us; 5%=848.229us; 10%=944.157us; 20%=001ms95.189us; 50%=001ms353.268us; 80%=003ms522.011us; 90%=004ms543.772us; 95%=005ms817.284us; 99%=041ms787.486us
Metric: TransferToServerTime
TotalSamples: 4483
Counter: 10m25s088ms175.972us
ValueRate: 765ms688.958us / second
Rate: 5.66608 / second
Percentiles: 1%=001ms188.492us; 5%=001ms393.852us; 10%=001ms498.951us; 20%=002ms680.420us; 50%=006ms403.103us; 80%=215ms401.166us; 90%=518ms840.016us; 95%=524ms723.805us; 99%=529ms246.676us
Counter: CachedSyncTensors
Value: 1680
Counter: CreateCompileHandles
Value: 7
Counter: CreateDataHandles
Value: 630585
Counter: CreateXlaTensor
Value: 4944494
Counter: DestroyDataHandles
Value: 629705
Counter: DestroyXlaTensor
Value: 4943742
Counter: ReleaseDataHandles
Value: 629705
Counter: SyncTensorsToData
Value: 184
Counter: UncachedSyncTensors
Value: 7
Counter: XRTAllocateFromTensor_Empty
Value: 230
Counter: XrtCompile_Empty
Value: 1280
Counter: XrtExecuteChained_Empty
Value: 1280
Counter: XrtExecute_Empty
Value: 1280
Counter: XrtRead_Empty
Value: 1280
Counter: XrtReleaseAllocationHandle_Empty
Value: 1280
Counter: XrtReleaseCompileHandle_Empty
Value: 1280
Counter: XrtSessionCount
Value: 12
Counter: XrtSubTuple_Empty
Value: 1280
Counter: aten::_local_scalar_dense
Value: 337
Validating the subset "valid"
| WARNING: 72 samples have invalid sizes and will be skipped, max_positions=(64, 64), first few sample ids=[1808, 612, 1824, 2185, 1244, 723, 1226, 388, 1791, 378]
validation/ 18:59:55, device xla:1, step 0
validation/ 18:59:57, device xla:1, step 10
validation/ 18:59:59, device xla:1, step 20
validation stats on subset "valid" - 19:00:00
| epoch 038 | valid on 'valid' subset | loss 9.998 | nll_loss 8.975 | ppl 503.24 | num_updates 836
old learning rate: 0.00010182965000000002
new learning rate: 0.00010457910000000002
Metric: CompileTime
TotalSamples: 7
Counter: 03m56s897ms46.438us
ValueRate: 737ms590.273us / second
Rate: 0.0293134 / second
Percentiles: 1%=091ms699.712us; 5%=091ms699.712us; 10%=091ms699.712us; 20%=206ms627.326us; 50%=11s663ms676.532us; 80%=01m11s278ms507.643us; 90%=01m12s293ms561.423us; 95%=01m12s293ms561.423us; 99%=01m12s293ms561.423us
Metric: ExecuteTime
TotalSamples: 1710
Counter: 12m58s779ms244.084us
ValueRate: 881ms538.211us / second
Rate: 2.17157 / second
Percentiles: 1%=002ms97.286us; 5%=215ms432.182us; 10%=216ms980.352us; 20%=217ms643.295us; 50%=219ms937.099us; 80%=617ms593.355us; 90%=618ms923.914us; 95%=619ms5.387us; 99%=621ms456.373us
Metric: InboundData
TotalSamples: 341
Counter: 1.22KB
ValueRate: 1.53B / second
Rate: 0.416668 / second
Percentiles: 1%=1.00B; 5%=1.00B; 10%=1.00B; 20%=4.00B; 50%=4.00B; 80%=4.00B; 90%=4.00B; 95%=4.00B; 99%=4.00B
Metric: OutboundData
TotalSamples: 4512
Counter: 1.10GB
ValueRate: 406.99KB / second
Rate: 5.65273 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=8.00B; 80%=8.00B; 90%=388.00KB; 95%=776.00KB; 99%=776.00KB
Metric: ReleaseDataHandlesTime
TotalSamples: 24853
Counter: 01m21s385ms184.762us
ValueRate: 080ms142.407us / second
Rate: 29.8354 / second
Percentiles: 1%=635.213us; 5%=775.692us; 10%=892.480us; 20%=001ms25.398us; 50%=001ms377.070us; 80%=004ms357.597us; 90%=008ms529.127us; 95%=009ms814.670us; 99%=010ms53.115us
Metric: TransferFromServerTime
TotalSamples: 341
Counter: 01s071ms415.414us
ValueRate: 001ms309.164us / second
Rate: 0.416668 / second
Percentiles: 1%=737.370us; 5%=848.229us; 10%=944.157us; 20%=001ms92.148us; 50%=001ms351.346us; 80%=003ms522.011us; 90%=003ms487.366us; 95%=005ms806.005us; 99%=041ms787.486us
Metric: TransferToServerTime
TotalSamples: 4512
Counter: 11m30s016ms649.789us
ValueRate: 766ms445.339us / second
Rate: 5.64676 / second
Percentiles: 1%=001ms188.492us; 5%=001ms397.360us; 10%=002ms507.902us; 20%=002ms692.550us; 50%=007ms501.475us; 80%=215ms977.602us; 90%=517ms183.693us; 95%=523ms393.753us; 99%=529ms867.017us
Counter: CachedSyncTensors
Value: 1703
Counter: CreateCompileHandles
Value: 7
Counter: CreateDataHandles
Value: 630862
Counter: CreateXlaTensor
Value: 4969095
Counter: DestroyDataHandles
Value: 629982
Counter: DestroyXlaTensor
Value: 4968343
Counter: ReleaseDataHandles
Value: 629982
Counter: SyncTensorsToData
Value: 184
Counter: UncachedSyncTensors
Value: 7
Counter: XRTAllocateFromTensor_Empty
Value: 230
Counter: XrtCompile_Empty
Value: 1280
Counter: XrtExecuteChained_Empty
Value: 1280
Counter: XrtExecute_Empty
Value: 1280
Counter: XrtRead_Empty
Value: 1280
Counter: XrtReleaseAllocationHandle_Empty
Value: 1280
Counter: XrtReleaseCompileHandle_Empty
Value: 1280
Counter: XrtSessionCount
Value: 12
Counter: XrtSubTuple_Empty
Value: 1280
Counter: aten::_local_scalar_dense
Value: 341
Epoch 39 begin 19:00:00
training/ 19:00:07, device xla:1, step 10, Rate=172.26, Global Rate=196.26
training/ 19:00:14, device xla:1, step 20, Rate=177.31, Global Rate=186.63
Epoch 39 Training stats:
device xla:1
| epoch 039 | loss 7.588 | nll_loss 6.502 | ppl 90.64 | wps 2814 | ups 1 | wpb 3319.773 | bsz 128.000 | num_updates 858 | lr 0.000107329 | gnorm 6.894 | clip 0.000 | oom 0.000 | wall 1012 | train_wall 554
Epoch 39 Tracker Rates:
Rate=177.12, Global Rate=185.69
Epoch 39 end 19:00:16
Metric: CompileTime
TotalSamples: 7
Counter: 03m56s897ms46.438us
ValueRate: 737ms590.273us / second
Rate: 0.0293134 / second
Percentiles: 1%=091ms699.712us; 5%=091ms699.712us; 10%=091ms699.712us; 20%=206ms627.326us; 50%=11s663ms676.532us; 80%=01m11s278ms507.643us; 90%=01m12s293ms561.423us; 95%=01m12s293ms561.423us; 99%=01m12s293ms561.423us
Metric: ExecuteTime
TotalSamples: 1732
Counter: 12m11s337ms413.920us
ValueRate: 879ms83.203us / second
Rate: 2.14544 / second
Percentiles: 1%=002ms97.286us; 5%=215ms432.182us; 10%=216ms980.352us; 20%=217ms643.295us; 50%=220ms823.010us; 80%=617ms593.355us; 90%=618ms943.225us; 95%=619ms116.560us; 99%=621ms456.373us
Metric: InboundData
TotalSamples: 346
Counter: 1.24KB
ValueRate: 1.53B / second
Rate: 0.414756 / second
Percentiles: 1%=1.00B; 5%=1.00B; 10%=1.00B; 20%=4.00B; 50%=4.00B; 80%=4.00B; 90%=4.00B; 95%=4.00B; 99%=4.00B
Metric: OutboundData
TotalSamples: 4603
Counter: 1.11GB
ValueRate: 394.25KB / second
Rate: 5.68523 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=8.00B; 80%=8.00B; 90%=8.00B; 95%=776.00KB; 99%=776.00KB
Metric: ReleaseDataHandlesTime
TotalSamples: 25396
Counter: 01m23s103ms260.457us
ValueRate: 093ms710.629us / second
Rate: 31.8384 / second
Percentiles: 1%=596.136us; 5%=740.751us; 10%=858.757us; 20%=001ms33.269us; 50%=001ms468.092us; 80%=005ms123.258us; 90%=008ms799.123us; 95%=009ms342.030us; 99%=011ms898.217us
Metric: TransferFromServerTime
TotalSamples: 346
Counter: 01s079ms483.697us
ValueRate: 001ms293.994us / second
Rate: 0.414756 / second
Percentiles: 1%=737.370us; 5%=848.229us; 10%=944.157us; 20%=001ms95.189us; 50%=001ms353.268us; 80%=003ms522.011us; 90%=003ms487.366us; 95%=005ms806.005us; 99%=041ms787.486us
Metric: TransferToServerTime
TotalSamples: 4603
Counter: 11m41s946ms397.571us
ValueRate: 761ms861.761us / second
Rate: 5.6852 / second
Percentiles: 1%=001ms188.492us; 5%=001ms392.394us; 10%=002ms505.759us; 20%=002ms689.738us; 50%=006ms405.271us; 80%=215ms308.015us; 90%=517ms168.960us; 95%=523ms169.517us; 99%=529ms521.048us
Counter: CachedSyncTensors
Value: 1725
Counter: CreateCompileHandles
Value: 7
Counter: CreateDataHandles
Value: 647183
Counter: CreateXlaTensor
Value: 5075245
Counter: DestroyDataHandles
Value: 646303
Counter: DestroyXlaTensor
Value: 5074493
Counter: ReleaseDataHandles
Value: 646303
Counter: SyncTensorsToData
Value: 184
Counter: UncachedSyncTensors
Value: 7
Counter: XRTAllocateFromTensor_Empty
Value: 230
Counter: XrtCompile_Empty
Value: 1280
Counter: XrtExecuteChained_Empty
Value: 1280
Counter: XrtExecute_Empty
Value: 1280
Counter: XrtRead_Empty
Value: 1280
Counter: XrtReleaseAllocationHandle_Empty
Value: 1280
Counter: XrtReleaseCompileHandle_Empty
Value: 1280
Counter: XrtSessionCount
Value: 12
Counter: XrtSubTuple_Empty
Value: 1280
Counter: aten::_local_scalar_dense
Value: 346
Validating the subset "valid"
| WARNING: 72 samples have invalid sizes and will be skipped, max_positions=(64, 64), first few sample ids=[1808, 612, 1824, 2185, 1244, 723, 1226, 388, 1791, 378]
validation/ 19:00:16, device xla:1, step 0
validation/ 19:00:18, device xla:1, step 10
validation/ 19:00:20, device xla:1, step 20
validation stats on subset "valid" - 19:00:21
| epoch 039 | valid on 'valid' subset | loss 9.862 | nll_loss 8.932 | ppl 488.57 | num_updates 858
old learning rate: 0.00010457910000000002
new learning rate: 0.00010732855000000002
Metric: CompileTime
TotalSamples: 7
Counter: 03m56s897ms46.438us
ValueRate: 737ms590.273us / second
Rate: 0.0293134 / second
Percentiles: 1%=091ms699.712us; 5%=091ms699.712us; 10%=091ms699.712us; 20%=206ms627.326us; 50%=11s663ms676.532us; 80%=01m11s278ms507.643us; 90%=01m12s293ms561.423us; 95%=01m12s293ms561.423us; 99%=01m12s293ms561.423us
Metric: ExecuteTime
TotalSamples: 1755
Counter: 12m16s126ms908.748us
ValueRate: 881ms557.605us / second
Rate: 2.17194 / second
Percentiles: 1%=002ms138.434us; 5%=215ms432.182us; 10%=216ms976.284us; 20%=217ms621.293us; 50%=219ms937.099us; 80%=616ms438.153us; 90%=618ms849.031us; 95%=619ms937.467us; 99%=621ms244.099us
Metric: InboundData
TotalSamples: 350
Counter: 1.26KB
ValueRate: 1.53B / second
Rate: 0.41701 / second
Percentiles: 1%=1.00B; 5%=1.00B; 10%=1.00B; 20%=4.00B; 50%=4.00B; 80%=4.00B; 90%=4.00B; 95%=4.00B; 99%=4.00B
Metric: OutboundData
TotalSamples: 4632
Counter: 1.11GB
ValueRate: 407.90KB / second
Rate: 5.66548 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=8.00B; 80%=8.00B; 90%=388.00KB; 95%=776.00KB; 99%=776.00KB
Metric: ReleaseDataHandlesTime
TotalSamples: 25495
Counter: 01m23s226ms116.577us
ValueRate: 081ms303.418us / second
Rate: 29.7563 / second
Percentiles: 1%=596.136us; 5%=772.004us; 10%=907.583us; 20%=001ms36.506us; 50%=001ms395.439us; 80%=004ms430.932us; 90%=008ms582.490us; 95%=009ms254.030us; 99%=011ms898.217us
Metric: TransferFromServerTime
TotalSamples: 350
Counter: 01s086ms464.175us
ValueRate: 001ms294.476us / second
Rate: 0.41701 / second
Percentiles: 1%=737.370us; 5%=848.229us; 10%=944.298us; 20%=001ms96.757us; 50%=001ms353.268us; 80%=003ms539.015us; 90%=003ms487.366us; 95%=005ms806.005us; 99%=041ms787.486us
Metric: TransferToServerTime
TotalSamples: 4632
Counter: 11m46s874ms458.798us
ValueRate: 765ms321.261us / second
Rate: 5.66545 / second
Percentiles: 1%=001ms196.212us; 5%=001ms395.497us; 10%=002ms511.684us; 20%=002ms699.786us; 50%=007ms534.215us; 80%=215ms557.017us; 90%=517ms888.110us; 95%=523ms934.179us; 99%=529ms521.048us
Counter: CachedSyncTensors
Value: 1748
Counter: CreateCompileHandles
Value: 7
Counter: CreateDataHandles
Value: 647460
Counter: CreateXlaTensor
Value: 5099846
Counter: DestroyDataHandles
Value: 646579
Counter: DestroyXlaTensor
Value: 5099094
Counter: ReleaseDataHandles
Value: 646580
Counter: SyncTensorsToData
Value: 184
Counter: UncachedSyncTensors
Value: 7
Counter: XRTAllocateFromTensor_Empty
Value: 230
Counter: XrtCompile_Empty
Value: 1280
Counter: XrtExecuteChained_Empty
Value: 1280
Counter: XrtExecute_Empty
Value: 1280
Counter: XrtRead_Empty
Value: 1280
Counter: XrtReleaseAllocationHandle_Empty
Value: 1280
Counter: XrtReleaseCompileHandle_Empty
Value: 1280
Counter: XrtSessionCount
Value: 12
Counter: XrtSubTuple_Empty
Value: 1280
Counter: aten::_local_scalar_dense
Value: 350
Epoch 40 begin 19:00:21
training/ 19:00:28, device xla:1, step 10, Rate=169.83, Global Rate=194.08
training/ 19:00:35, device xla:1, step 20, Rate=177.12, Global Rate=186.32
Epoch 40 Training stats:
device xla:1
| epoch 040 | loss 7.463 | nll_loss 6.359 | ppl 82.10 | wps 2828 | ups 1 | wpb 3319.773 | bsz 128.000 | num_updates 880 | lr 0.000110078 | gnorm 6.861 | clip 0.000 | oom 0.000 | wall 1033 | train_wall 568
Epoch 40 Tracker Rates:
Rate=176.68, Global Rate=185.31
Epoch 40 end 19:00:37
Metric: CompileTime
TotalSamples: 7
Counter: 03m56s897ms46.438us
ValueRate: 737ms590.273us / second
Rate: 0.0293134 / second
Percentiles: 1%=091ms699.712us; 5%=091ms699.712us; 10%=091ms699.712us; 20%=206ms627.326us; 50%=11s663ms676.532us; 80%=01m11s278ms507.643us; 90%=01m12s293ms561.423us; 95%=01m12s293ms561.423us; 99%=01m12s293ms561.423us
Metric: ExecuteTime
TotalSamples: 1777
Counter: 12m30s669ms731.358us
ValueRate: 879ms25.415us / second
Rate: 2.14567 / second
Percentiles: 1%=002ms138.434us; 5%=215ms432.182us; 10%=216ms976.284us; 20%=217ms621.293us; 50%=221ms825.769us; 80%=616ms435.354us; 90%=618ms842.188us; 95%=619ms889.324us; 99%=621ms244.099us
Metric: InboundData
TotalSamples: 355
Counter: 1.28KB
ValueRate: 1.53B / second
Rate: 0.415119 / second
Percentiles: 1%=1.00B; 5%=1.00B; 10%=1.00B; 20%=4.00B; 50%=4.00B; 80%=4.00B; 90%=4.00B; 95%=4.00B; 99%=4.00B
Metric: OutboundData
TotalSamples: 4722
Counter: 1.12GB
ValueRate: 392.69KB / second
Rate: 5.66283 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=8.00B; 80%=8.00B; 90%=8.00B; 95%=776.00KB; 99%=776.00KB
Metric: ReleaseDataHandlesTime
TotalSamples: 26065
Counter: 01m25s972ms81.813us
ValueRate: 094ms394.469us / second
Rate: 33.1335 / second
Percentiles: 1%=622.002us; 5%=755.219us; 10%=853.301us; 20%=001ms31.393us; 50%=001ms466.891us; 80%=005ms230.339us; 90%=007ms357.625us; 95%=009ms625.846us; 99%=011ms540.457us
Metric: TransferFromServerTime
TotalSamples: 355
Counter: 01s094ms240.294us
ValueRate: 001ms279.548us / second
Rate: 0.415119 / second
Percentiles: 1%=737.370us; 5%=848.229us; 10%=944.298us; 20%=001ms96.757us; 50%=001ms351.346us; 80%=003ms522.011us; 90%=003ms484.851us; 95%=005ms806.005us; 99%=041ms787.486us
Metric: TransferToServerTime
TotalSamples: 4722
Counter: 11m57s326ms903.925us
ValueRate: 762ms454.293us / second
Rate: 5.66285 / second
Percentiles: 1%=001ms233.342us; 5%=001ms408.457us; 10%=002ms535.982us; 20%=002ms714.540us; 50%=006ms414.591us; 80%=216ms552.281us; 90%=516ms252.667us; 95%=522ms741.275us; 99%=529ms867.017us
Counter: CachedSyncTensors
Value: 1770
Counter: CreateCompileHandles
Value: 7
Counter: CreateDataHandles
Value: 663780
Counter: CreateXlaTensor
Value: 5205996
Counter: DestroyDataHandles
Value: 662900
Counter: DestroyXlaTensor
Value: 5205244
Counter: ReleaseDataHandles
Value: 662900
Counter: SyncTensorsToData
Value: 184
Counter: UncachedSyncTensors
Value: 7
Counter: XRTAllocateFromTensor_Empty
Value: 230
Counter: XrtCompile_Empty
Value: 1280
Counter: XrtExecuteChained_Empty
Value: 1280
Counter: XrtExecute_Empty
Value: 1280
Counter: XrtRead_Empty
Value: 1280
Counter: XrtReleaseAllocationHandle_Empty
Value: 1280
Counter: XrtReleaseCompileHandle_Empty
Value: 1280
Counter: XrtSessionCount
Value: 12
Counter: XrtSubTuple_Empty
Value: 1280
Counter: aten::_local_scalar_dense
Value: 355
Validating the subset "valid"
| WARNING: 72 samples have invalid sizes and will be skipped, max_positions=(64, 64), first few sample ids=[1808, 612, 1824, 2185, 1244, 723, 1226, 388, 1791, 378]
validation/ 19:00:37, device xla:1, step 0
validation/ 19:00:39, device xla:1, step 10
validation/ 19:00:41, device xla:1, step 20
validation stats on subset "valid" - 19:00:42
| epoch 040 | valid on 'valid' subset | loss 10.028 | nll_loss 9.193 | ppl 585.22 | num_updates 880
old learning rate: 0.00010732855000000002
new learning rate: 0.00011007800000000003
Metric: CompileTime
TotalSamples: 7
Counter: 03m56s897ms46.438us
ValueRate: 737ms590.273us / second
Rate: 0.0293134 / second
Percentiles: 1%=091ms699.712us; 5%=091ms699.712us; 10%=091ms699.712us; 20%=206ms627.326us; 50%=11s663ms676.532us; 80%=01m11s278ms507.643us; 90%=01m12s293ms561.423us; 95%=01m12s293ms561.423us; 99%=01m12s293ms561.423us
Metric: ExecuteTime
TotalSamples: 1800
Counter: 13m34s439ms844.911us
ValueRate: 881ms713.411us / second
Rate: 2.17272 / second
Percentiles: 1%=002ms138.434us; 5%=215ms432.182us; 10%=216ms954.673us; 20%=217ms598.882us; 50%=219ms922.931us; 80%=616ms346.100us; 90%=618ms704.203us; 95%=619ms831.666us; 99%=621ms244.099us
Metric: InboundData
TotalSamples: 359
Counter: 1.29KB
ValueRate: 1.53B / second
Rate: 0.417328 / second
Percentiles: 1%=1.00B; 5%=1.00B; 10%=1.00B; 20%=4.00B; 50%=4.00B; 80%=4.00B; 90%=4.00B; 95%=4.00B; 99%=4.00B
Metric: OutboundData
TotalSamples: 4752
Counter: 1.12GB
ValueRate: 409.83KB / second
Rate: 5.69217 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=8.00B; 80%=8.00B; 90%=388.00KB; 95%=776.00KB; 99%=776.00KB
Metric: ReleaseDataHandlesTime
TotalSamples: 26168
Counter: 01m25s091ms514.229us
ValueRate: 084ms429.183us / second
Rate: 31.5001 / second
Percentiles: 1%=623.361us; 5%=765.306us; 10%=859.750us; 20%=001ms22.556us; 50%=001ms388.249us; 80%=005ms551.996us; 90%=007ms185.314us; 95%=009ms608.508us; 99%=011ms540.457us
Metric: TransferFromServerTime
TotalSamples: 359
Counter: 01s099ms335.494us
ValueRate: 001ms277.949us / second
Rate: 0.417328 / second
Percentiles: 1%=737.370us; 5%=848.229us; 10%=944.157us; 20%=001ms92.148us; 50%=001ms346.061us; 80%=002ms456.336us; 90%=003ms484.851us; 95%=005ms806.005us; 99%=041ms787.486us
Metric: TransferToServerTime
TotalSamples: 4752
Counter: 11m02s220ms993.573us
ValueRate: 767ms401.258us / second
Rate: 5.68618 / second
Percentiles: 1%=001ms233.342us; 5%=001ms410.938us; 10%=002ms546.292us; 20%=002ms718.537us; 50%=007ms520.178us; 80%=214ms426.682us; 90%=516ms832.050us; 95%=521ms449.951us; 99%=529ms521.048us
Counter: CachedSyncTensors
Value: 1793
Counter: CreateCompileHandles
Value: 7
Counter: CreateDataHandles
Value: 664058
Counter: CreateXlaTensor
Value: 5230597
Counter: DestroyDataHandles
Value: 663177
Counter: DestroyXlaTensor
Value: 5229845
Counter: ReleaseDataHandles
Value: 663178
Counter: SyncTensorsToData
Value: 184
Counter: UncachedSyncTensors
Value: 7
Counter: XRTAllocateFromTensor_Empty
Value: 230
Counter: XrtCompile_Empty
Value: 1280
Counter: XrtExecuteChained_Empty
Value: 1280
Counter: XrtExecute_Empty
Value: 1280
Counter: XrtRead_Empty
Value: 1280
Counter: XrtReleaseAllocationHandle_Empty
Value: 1280
Counter: XrtReleaseCompileHandle_Empty
Value: 1280
Counter: XrtSessionCount
Value: 12
Counter: XrtSubTuple_Empty
Value: 1280
Counter: aten::_local_scalar_dense
Value: 359
Epoch 41 begin 19:00:42
training/ 19:00:49, device xla:1, step 10, Rate=170.48, Global Rate=195.28
training/ 19:00:56, device xla:1, step 20, Rate=177.13, Global Rate=186.37
Epoch 41 Training stats:
device xla:1
| epoch 041 | loss 7.342 | nll_loss 6.221 | ppl 74.61 | wps 2841 | ups 1 | wpb 3319.773 | bsz 128.000 | num_updates 902 | lr 0.000112827 | gnorm 6.824 | clip 0.000 | oom 0.000 | wall 1054 | train_wall 582
Epoch 41 Tracker Rates:
Rate=177.07, Global Rate=185.47
Epoch 41 end 19:00:58
Metric: CompileTime
TotalSamples: 7
Counter: 03m56s897ms46.438us
ValueRate: 737ms590.273us / second
Rate: 0.0293134 / second
Percentiles: 1%=091ms699.712us; 5%=091ms699.712us; 10%=091ms699.712us; 20%=206ms627.326us; 50%=11s663ms676.532us; 80%=01m11s278ms507.643us; 90%=01m12s293ms561.423us; 95%=01m12s293ms561.423us; 99%=01m12s293ms561.423us
Metric: ExecuteTime
TotalSamples: 1822
Counter: 13m48s990ms236.091us
ValueRate: 879ms325.868us / second
Rate: 2.14674 / second
Percentiles: 1%=002ms138.434us; 5%=215ms432.182us; 10%=216ms954.673us; 20%=217ms598.882us; 50%=221ms825.769us; 80%=616ms336.608us; 90%=618ms682.408us; 95%=619ms831.666us; 99%=621ms244.099us
Metric: InboundData
TotalSamples: 364
Counter: 1.31KB
ValueRate: 1.53B / second
Rate: 0.415483 / second
Percentiles: 1%=1.00B; 5%=1.00B; 10%=1.00B; 20%=4.00B; 50%=4.00B; 80%=4.00B; 90%=4.00B; 95%=4.00B; 99%=4.00B
Metric: OutboundData
TotalSamples: 4839
Counter: 1.12GB
ValueRate: 391.57KB / second
Rate: 5.64667 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=8.00B; 80%=8.00B; 90%=8.00B; 95%=776.00KB; 99%=776.00KB
Metric: ReleaseDataHandlesTime
TotalSamples: 26719
Counter: 01m27s791ms114.421us
ValueRate: 095ms482.524us / second
Rate: 33.1113 / second
Percentiles: 1%=641.085us; 5%=755.770us; 10%=853.301us; 20%=001ms10.071us; 50%=001ms448.215us; 80%=005ms251.547us; 90%=008ms512.800us; 95%=009ms729.865us; 99%=011ms778.726us
Metric: TransferFromServerTime
TotalSamples: 364
Counter: 01s112ms397.308us
ValueRate: 001ms269.732us / second
Rate: 0.415483 / second
Percentiles: 1%=737.370us; 5%=849.291us; 10%=944.298us; 20%=001ms95.189us; 50%=001ms354.718us; 80%=002ms456.336us; 90%=003ms484.851us; 95%=005ms806.005us; 99%=041ms787.486us
Metric: TransferToServerTime
TotalSamples: 4839
Counter: 11m14s653ms164.678us
ValueRate: 768ms828.748us / second
Rate: 5.66178 / second
Percentiles: 1%=001ms258.410us; 5%=001ms423.861us; 10%=002ms556.070us; 20%=002ms723.344us; 50%=006ms441.240us; 80%=215ms418.019us; 90%=516ms75.596us; 95%=522ms741.275us; 99%=529ms867.017us
Counter: CachedSyncTensors
Value: 1815
Counter: CreateCompileHandles
Value: 7
Counter: CreateDataHandles
Value: 680375
Counter: CreateXlaTensor
Value: 5336747
Counter: DestroyDataHandles
Value: 679495
Counter: DestroyXlaTensor
Value: 5335995
Counter: ReleaseDataHandles
Value: 679495
Counter: SyncTensorsToData
Value: 184
Counter: UncachedSyncTensors
Value: 7
Counter: XRTAllocateFromTensor_Empty
Value: 230
Counter: XrtCompile_Empty
Value: 1280
Counter: XrtExecuteChained_Empty
Value: 1280
Counter: XrtExecute_Empty
Value: 1280
Counter: XrtRead_Empty
Value: 1280
Counter: XrtReleaseAllocationHandle_Empty
Value: 1280
Counter: XrtReleaseCompileHandle_Empty
Value: 1280
Counter: XrtSessionCount
Value: 12
Counter: XrtSubTuple_Empty
Value: 1280
Counter: aten::_local_scalar_dense
Value: 364
Validating the subset "valid"
| WARNING: 72 samples have invalid sizes and will be skipped, max_positions=(64, 64), first few sample ids=[1808, 612, 1824, 2185, 1244, 723, 1226, 388, 1791, 378]
validation/ 19:00:58, device xla:1, step 0
validation/ 19:01:00, device xla:1, step 10
validation/ 19:01:02, device xla:1, step 20
validation stats on subset "valid" - 19:01:03
| epoch 041 | valid on 'valid' subset | loss 9.865 | nll_loss 8.980 | ppl 505.01 | num_updates 902
old learning rate: 0.00011007800000000003
new learning rate: 0.00011282745000000001
Metric: CompileTime
TotalSamples: 7
Counter: 03m56s897ms46.438us
ValueRate: 737ms590.273us / second
Rate: 0.0293134 / second
Percentiles: 1%=091ms699.712us; 5%=091ms699.712us; 10%=091ms699.712us; 20%=206ms627.326us; 50%=11s663ms676.532us; 80%=01m11s278ms507.643us; 90%=01m12s293ms561.423us; 95%=01m12s293ms561.423us; 99%=01m12s293ms561.423us
Metric: ExecuteTime
TotalSamples: 1845
Counter: 13m53s765ms527.923us
ValueRate: 881ms978.411us / second
Rate: 2.17367 / second
Percentiles: 1%=002ms184.801us; 5%=215ms432.182us; 10%=216ms954.673us; 20%=217ms580.632us; 50%=219ms922.931us; 80%=616ms226.102us; 90%=618ms579.077us; 95%=619ms744.182us; 99%=621ms244.099us
Metric: InboundData
TotalSamples: 368
Counter: 1.32KB
ValueRate: 1.53B / second
Rate: 0.417632 / second
Percentiles: 1%=1.00B; 5%=1.00B; 10%=1.00B; 20%=4.00B; 50%=4.00B; 80%=4.00B; 90%=4.00B; 95%=4.00B; 99%=4.00B
Metric: OutboundData
TotalSamples: 4867
Counter: 1.13GB
ValueRate: 406.95KB / second
Rate: 5.65227 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=8.00B; 80%=8.00B; 90%=388.00KB; 95%=776.00KB; 99%=776.00KB
Metric: ReleaseDataHandlesTime
TotalSamples: 26818
Counter: 01m27s123ms618.098us
ValueRate: 091ms837.839us / second
Rate: 30.8756 / second
Percentiles: 1%=641.899us; 5%=775.690us; 10%=866.064us; 20%=001ms11.024us; 50%=001ms388.030us; 80%=005ms827.428us; 90%=007ms337.053us; 95%=009ms626.530us; 99%=011ms778.726us
Metric: TransferFromServerTime
TotalSamples: 368
Counter: 01s118ms679.974us
ValueRate: 001ms268.420us / second
Rate: 0.417632 / second
Percentiles: 1%=737.370us; 5%=849.291us; 10%=944.157us; 20%=001ms92.148us; 50%=001ms353.268us; 80%=002ms448.918us; 90%=003ms484.851us; 95%=005ms806.005us; 99%=041ms787.486us
Metric: TransferToServerTime
TotalSamples: 4867
Counter: 11m19s542ms350.829us
ValueRate: 772ms468.641us / second
Rate: 5.6613 / second
Percentiles: 1%=001ms258.410us; 5%=001ms425.017us; 10%=002ms558.423us; 20%=002ms745.164us; 50%=007ms580.976us; 80%=215ms756.311us; 90%=516ms967.850us; 95%=522ms649.530us; 99%=529ms867.017us
Counter: CachedSyncTensors
Value: 1838
Counter: CreateCompileHandles
Value: 7
Counter: CreateDataHandles
Value: 680651
Counter: CreateXlaTensor
Value: 5361348
Counter: DestroyDataHandles
Value: 679770
Counter: DestroyXlaTensor
Value: 5360596
Counter: ReleaseDataHandles
Value: 679771
Counter: SyncTensorsToData
Value: 184
Counter: UncachedSyncTensors
Value: 7
Counter: XRTAllocateFromTensor_Empty
Value: 230
Counter: XrtCompile_Empty
Value: 1280
Counter: XrtExecuteChained_Empty
Value: 1280
Counter: XrtExecute_Empty
Value: 1280
Counter: XrtRead_Empty
Value: 1280
Counter: XrtReleaseAllocationHandle_Empty
Value: 1280
Counter: XrtReleaseCompileHandle_Empty
Value: 1280
Counter: XrtSessionCount
Value: 12
Counter: XrtSubTuple_Empty
Value: 1280
Counter: aten::_local_scalar_dense
Value: 368
Epoch 42 begin 19:01:03
training/ 19:01:10, device xla:1, step 10, Rate=169.37, Global Rate=193.03
training/ 19:01:17, device xla:1, step 20, Rate=176.42, Global Rate=184.70
Epoch 42 Training stats:
device xla:1
| epoch 042 | loss 7.220 | nll_loss 6.081 | ppl 67.70 | wps 2853 | ups 1 | wpb 3319.773 | bsz 128.000 | num_updates 924 | lr 0.000115577 | gnorm 6.709 | clip 0.000 | oom 0.000 | wall 1075 | train_wall 596
Epoch 42 Tracker Rates:
Rate=177.08, Global Rate=184.10
Epoch 42 end 19:01:19
Metric: CompileTime
TotalSamples: 7
Counter: 03m56s897ms46.438us
ValueRate: 737ms590.273us / second
Rate: 0.0293134 / second
Percentiles: 1%=091ms699.712us; 5%=091ms699.712us; 10%=091ms699.712us; 20%=206ms627.326us; 50%=11s663ms676.532us; 80%=01m11s278ms507.643us; 90%=01m12s293ms561.423us; 95%=01m12s293ms561.423us; 99%=01m12s293ms561.423us
Metric: ExecuteTime
TotalSamples: 1867
Counter: 13m06s308ms808.307us
ValueRate: 879ms137.767us / second
Rate: 2.14657 / second
Percentiles: 1%=002ms184.801us; 5%=215ms432.182us; 10%=216ms954.673us; 20%=217ms580.632us; 50%=221ms825.769us; 80%=616ms220.654us; 90%=618ms558.984us; 95%=619ms720.166us; 99%=621ms244.099us
Metric: InboundData
TotalSamples: 373
Counter: 1.34KB
ValueRate: 1.53B / second
Rate: 0.415776 / second
Percentiles: 1%=1.00B; 5%=1.00B; 10%=1.00B; 20%=4.00B; 50%=4.00B; 80%=4.00B; 90%=4.00B; 95%=4.00B; 99%=4.00B
Metric: OutboundData
TotalSamples: 4958
Counter: 1.13GB
ValueRate: 393.77KB / second
Rate: 5.67831 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=8.00B; 80%=8.00B; 90%=8.00B; 95%=776.00KB; 99%=776.00KB
Metric: ReleaseDataHandlesTime
TotalSamples: 27410
Counter: 01m29s915ms292.993us
ValueRate: 105ms386.568us / second
Rate: 34.5042 / second
Percentiles: 1%=632.154us; 5%=760.590us; 10%=844.034us; 20%=994.145us; 50%=001ms424.969us; 80%=005ms116.315us; 90%=008ms782.199us; 95%=009ms326.836us; 99%=011ms105.077us
Metric: TransferFromServerTime
TotalSamples: 373
Counter: 01s125ms802.131us
ValueRate: 001ms253.794us / second
Rate: 0.415776 / second
Percentiles: 1%=737.370us; 5%=849.291us; 10%=944.157us; 20%=001ms91.944us; 50%=001ms351.346us; 80%=002ms448.918us; 90%=003ms432.681us; 95%=005ms806.005us; 99%=041ms787.486us
Metric: TransferToServerTime
TotalSamples: 4958
Counter: 11m29s486ms659.516us
ValueRate: 764ms155.418us / second
Rate: 5.67844 / second
Percentiles: 1%=001ms263.765us; 5%=001ms427.073us; 10%=002ms558.725us; 20%=002ms728.477us; 50%=006ms419.803us; 80%=215ms418.019us; 90%=516ms669.625us; 95%=521ms994.027us; 99%=529ms521.048us
Counter: CachedSyncTensors
Value: 1860
Counter: CreateCompileHandles
Value: 7
Counter: CreateDataHandles
Value: 696972
Counter: CreateXlaTensor
Value: 5467498
Counter: DestroyDataHandles
Value: 696092
Counter: DestroyXlaTensor
Value: 5466746
Counter: ReleaseDataHandles
Value: 696092
Counter: SyncTensorsToData
Value: 184
Counter: UncachedSyncTensors
Value: 7
Counter: XRTAllocateFromTensor_Empty
Value: 230
Counter: XrtCompile_Empty
Value: 1280
Counter: XrtExecuteChained_Empty
Value: 1280
Counter: XrtExecute_Empty
Value: 1280
Counter: XrtRead_Empty
Value: 1280
Counter: XrtReleaseAllocationHandle_Empty
Value: 1280
Counter: XrtReleaseCompileHandle_Empty
Value: 1280
Counter: XrtSessionCount
Value: 12
Counter: XrtSubTuple_Empty
Value: 1280
Counter: aten::_local_scalar_dense
Value: 373
Validating the subset "valid"
| WARNING: 72 samples have invalid sizes and will be skipped, max_positions=(64, 64), first few sample ids=[1808, 612, 1824, 2185, 1244, 723, 1226, 388, 1791, 378]
validation/ 19:01:19, device xla:1, step 0
validation/ 19:01:21, device xla:1, step 10
validation/ 19:01:23, device xla:1, step 20
validation stats on subset "valid" - 19:01:24
| epoch 042 | valid on 'valid' subset | loss 9.900 | nll_loss 9.045 | ppl 528.27 | num_updates 924
old learning rate: 0.00011282745000000001
new learning rate: 0.00011557690000000001
Metric: CompileTime
TotalSamples: 7
Counter: 03m56s897ms46.438us
ValueRate: 737ms590.273us / second
Rate: 0.0293134 / second
Percentiles: 1%=091ms699.712us; 5%=091ms699.712us; 10%=091ms699.712us; 20%=206ms627.326us; 50%=11s663ms676.532us; 80%=01m11s278ms507.643us; 90%=01m12s293ms561.423us; 95%=01m12s293ms561.423us; 99%=01m12s293ms561.423us
Metric: ExecuteTime
TotalSamples: 1890
Counter: 13m11s085ms183.475us
ValueRate: 881ms692.284us / second
Rate: 2.17326 / second
Percentiles: 1%=002ms75.239us; 5%=215ms432.182us; 10%=216ms954.673us; 20%=217ms563.288us; 50%=219ms937.099us; 80%=616ms131.752us; 90%=617ms483.404us; 95%=619ms598.454us; 99%=621ms117.380us
Metric: InboundData
TotalSamples: 377
Counter: 1.35KB
ValueRate: 1.54B / second
Rate: 0.417871 / second
Percentiles: 1%=1.00B; 5%=1.00B; 10%=1.00B; 20%=4.00B; 50%=4.00B; 80%=4.00B; 90%=4.00B; 95%=4.00B; 99%=4.00B
Metric: OutboundData
TotalSamples: 4987
Counter: 1.14GB
ValueRate: 407.64KB / second
Rate: 5.66178 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=8.00B; 80%=8.00B; 90%=388.00KB; 95%=776.00KB; 99%=776.00KB
Metric: ReleaseDataHandlesTime
TotalSamples: 27508
Counter: 01m29s043ms501.176us
ValueRate: 092ms167.304us / second
Rate: 32.1082 / second
Percentiles: 1%=634.686us; 5%=762.375us; 10%=847.809us; 20%=995.533us; 50%=001ms342.788us; 80%=004ms384.854us; 90%=007ms433.352us; 95%=009ms52.600us; 99%=011ms557.655us
Metric: TransferFromServerTime
TotalSamples: 377
Counter: 01s131ms776.553us
ValueRate: 001ms253.365us / second
Rate: 0.417871 / second
Percentiles: 1%=737.370us; 5%=849.291us; 10%=944.157us; 20%=001ms92.148us; 50%=001ms346.061us; 80%=002ms428.530us; 90%=003ms432.681us; 95%=005ms806.005us; 99%=041ms787.486us
Metric: TransferToServerTime
TotalSamples: 4987
Counter: 12m34s395ms197.938us
ValueRate: 772ms579.651us / second
Rate: 5.67823 / second
Percentiles: 1%=001ms263.765us; 5%=001ms433.528us; 10%=002ms561.079us; 20%=002ms748.512us; 50%=007ms551.861us; 80%=215ms729.433us; 90%=515ms328.662us; 95%=521ms939.560us; 99%=529ms521.048us
Counter: CachedSyncTensors
Value: 1883
Counter: CreateCompileHandles
Value: 7
Counter: CreateDataHandles
Value: 697249
Counter: CreateXlaTensor
Value: 5492099
Counter: DestroyDataHandles
Value: 696368
Counter: DestroyXlaTensor
Value: 5491347
Counter: ReleaseDataHandles
Value: 696369
Counter: SyncTensorsToData
Value: 184
Counter: UncachedSyncTensors
Value: 7
Counter: XRTAllocateFromTensor_Empty
Value: 230
Counter: XrtCompile_Empty
Value: 1280
Counter: XrtExecuteChained_Empty
Value: 1280
Counter: XrtExecute_Empty
Value: 1280
Counter: XrtRead_Empty
Value: 1280
Counter: XrtReleaseAllocationHandle_Empty
Value: 1280
Counter: XrtReleaseCompileHandle_Empty
Value: 1280
Counter: XrtSessionCount
Value: 12
Counter: XrtSubTuple_Empty
Value: 1280
Counter: aten::_local_scalar_dense
Value: 377
Epoch 43 begin 19:01:24
training/ 19:01:31, device xla:1, step 10, Rate=169.19, Global Rate=193.62
training/ 19:01:38, device xla:1, step 20, Rate=176.85, Global Rate=185.50
Epoch 43 Training stats:
device xla:1
| epoch 043 | loss 7.107 | nll_loss 5.952 | ppl 61.90 | wps 2865 | ups 1 | wpb 3319.773 | bsz 128.000 | num_updates 946 | lr 0.000118326 | gnorm 6.637 | clip 0.000 | oom 0.000 | wall 1096 | train_wall 610
Epoch 43 Tracker Rates:
Rate=177.03, Global Rate=184.73
Epoch 43 end 19:01:40
Metric: CompileTime
TotalSamples: 7
Counter: 03m56s897ms46.438us
ValueRate: 737ms590.273us / second
Rate: 0.0293134 / second
Percentiles: 1%=091ms699.712us; 5%=091ms699.712us; 10%=091ms699.712us; 20%=206ms627.326us; 50%=11s663ms676.532us; 80%=01m11s278ms507.643us; 90%=01m12s293ms561.423us; 95%=01m12s293ms561.423us; 99%=01m12s293ms561.423us
Metric: ExecuteTime
TotalSamples: 1912
Counter: 13m25s625ms261.954us
ValueRate: 879ms929.526us / second
Rate: 2.14628 / second
Percentiles: 1%=002ms75.239us; 5%=215ms432.182us; 10%=216ms954.673us; 20%=217ms573.188us; 50%=221ms825.769us; 80%=616ms131.752us; 90%=618ms501.531us; 95%=619ms586.896us; 99%=621ms117.380us
Metric: InboundData
TotalSamples: 382
Counter: 1.37KB
ValueRate: 1.53B / second
Rate: 0.416076 / second
Percentiles: 1%=1.00B; 5%=1.00B; 10%=1.00B; 20%=4.00B; 50%=4.00B; 80%=4.00B; 90%=4.00B; 95%=4.00B; 99%=4.00B
Metric: OutboundData
TotalSamples: 5076
Counter: 1.14GB
ValueRate: 393.44KB / second
Rate: 5.67366 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=8.00B; 80%=8.00B; 90%=8.00B; 95%=776.00KB; 99%=776.00KB
Metric: ReleaseDataHandlesTime
TotalSamples: 28089
Counter: 02m31s705ms472.277us
ValueRate: 095ms357.285us / second
Rate: 34.5241 / second
Percentiles: 1%=604.038us; 5%=739.563us; 10%=837.522us; 20%=996.785us; 50%=001ms415.499us; 80%=005ms993.458us; 90%=007ms410.766us; 95%=009ms826.304us; 99%=010ms186.010us
Metric: TransferFromServerTime
TotalSamples: 382
Counter: 01s138ms239.311us
ValueRate: 001ms239.775us / second
Rate: 0.416076 / second
Percentiles: 1%=737.370us; 5%=850.771us; 10%=944.298us; 20%=001ms95.189us; 50%=001ms346.016us; 80%=002ms428.530us; 90%=003ms415.193us; 95%=005ms753.517us; 99%=041ms787.486us
Metric: TransferToServerTime
TotalSamples: 5076
Counter: 12m46s872ms108.181us
ValueRate: 766ms3.699us / second
Rate: 5.67379 / second
Percentiles: 1%=001ms273.296us; 5%=001ms431.933us; 10%=002ms558.725us; 20%=002ms746.319us; 50%=006ms407.039us; 80%=216ms576.140us; 90%=515ms65.006us; 95%=521ms502.305us; 99%=529ms521.048us
Counter: CachedSyncTensors
Value: 1905
Counter: CreateCompileHandles
Value: 7
Counter: CreateDataHandles
Value: 713568
Counter: CreateXlaTensor
Value: 5598249
Counter: DestroyDataHandles
Value: 712688
Counter: DestroyXlaTensor
Value: 5597497
Counter: ReleaseDataHandles
Value: 712688
Counter: SyncTensorsToData
Value: 184
Counter: UncachedSyncTensors
Value: 7
Counter: XRTAllocateFromTensor_Empty
Value: 230
Counter: XrtCompile_Empty
Value: 1280
Counter: XrtExecuteChained_Empty
Value: 1280
Counter: XrtExecute_Empty
Value: 1280
Counter: XrtRead_Empty
Value: 1280
Counter: XrtReleaseAllocationHandle_Empty
Value: 1280
Counter: XrtReleaseCompileHandle_Empty
Value: 1280
Counter: XrtSessionCount
Value: 12
Counter: XrtSubTuple_Empty
Value: 1280
Counter: aten::_local_scalar_dense
Value: 382
Validating the subset "valid"
| WARNING: 72 samples have invalid sizes and will be skipped, max_positions=(64, 64), first few sample ids=[1808, 612, 1824, 2185, 1244, 723, 1226, 388, 1791, 378]
validation/ 19:01:40, device xla:1, step 0
validation/ 19:01:42, device xla:1, step 10
validation/ 19:01:44, device xla:1, step 20
validation stats on subset "valid" - 19:01:45
| epoch 043 | valid on 'valid' subset | loss 9.906 | nll_loss 8.953 | ppl 495.72 | num_updates 946
old learning rate: 0.00011557690000000001
new learning rate: 0.00011832635000000002
Metric: CompileTime
TotalSamples: 7
Counter: 03m56s897ms46.438us
ValueRate: 737ms590.273us / second
Rate: 0.0293134 / second
Percentiles: 1%=091ms699.712us; 5%=091ms699.712us; 10%=091ms699.712us; 20%=206ms627.326us; 50%=11s663ms676.532us; 80%=01m11s278ms507.643us; 90%=01m12s293ms561.423us; 95%=01m12s293ms561.423us; 99%=01m12s293ms561.423us
Metric: ExecuteTime
TotalSamples: 1935
Counter: 13m29s401ms615.437us
ValueRate: 880ms434.928us / second
Rate: 2.17281 / second
Percentiles: 1%=002ms184.801us; 5%=215ms432.182us; 10%=216ms954.673us; 20%=217ms573.188us; 50%=219ms922.931us; 80%=616ms87.357us; 90%=617ms374.538us; 95%=619ms501.157us; 99%=621ms117.380us
Metric: InboundData
TotalSamples: 386
Counter: 1.38KB
ValueRate: 1.54B / second
Rate: 0.418127 / second
Percentiles: 1%=1.00B; 5%=1.00B; 10%=1.00B; 20%=4.00B; 50%=4.00B; 80%=4.00B; 90%=4.00B; 95%=4.00B; 99%=4.00B
Metric: OutboundData
TotalSamples: 5104
Counter: 1.14GB
ValueRate: 407.51KB / second
Rate: 5.66007 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=8.00B; 80%=8.00B; 90%=388.00KB; 95%=776.00KB; 99%=776.00KB
Metric: ReleaseDataHandlesTime
TotalSamples: 28184
Counter: 02m31s818ms824.888us
ValueRate: 081ms212.470us / second
Rate: 31.5046 / second
Percentiles: 1%=604.038us; 5%=746.680us; 10%=837.522us; 20%=986.947us; 50%=001ms365.053us; 80%=004ms317.938us; 90%=007ms22.359us; 95%=009ms606.255us; 99%=010ms961.537us
Metric: TransferFromServerTime
TotalSamples: 386
Counter: 01s146ms611.942us
ValueRate: 001ms240.963us / second
Rate: 0.418127 / second
Percentiles: 1%=737.370us; 5%=850.771us; 10%=944.298us; 20%=001ms96.757us; 50%=001ms346.061us; 80%=002ms428.530us; 90%=003ms415.193us; 95%=005ms753.517us; 99%=041ms787.486us
Metric: TransferToServerTime
TotalSamples: 5104
Counter: 12m51s776ms641.206us
ValueRate: 771ms746.024us / second
Rate: 5.65402 / second
Percentiles: 1%=001ms286.057us; 5%=001ms437.565us; 10%=002ms565.348us; 20%=002ms755.781us; 50%=007ms530.882us; 80%=215ms336.753us; 90%=515ms29.689us; 95%=520ms467.318us; 99%=529ms521.048us
Counter: CachedSyncTensors
Value: 1928
Counter: CreateCompileHandles
Value: 7
Counter: CreateDataHandles
Value: 713844
Counter: CreateXlaTensor
Value: 5622850
Counter: DestroyDataHandles
Value: 712963
Counter: DestroyXlaTensor
Value: 5622098
Counter: ReleaseDataHandles
Value: 712964
Counter: SyncTensorsToData
Value: 184
Counter: UncachedSyncTensors
Value: 7
Counter: XRTAllocateFromTensor_Empty
Value: 230
Counter: XrtCompile_Empty
Value: 1280
Counter: XrtExecuteChained_Empty
Value: 1280
Counter: XrtExecute_Empty
Value: 1280
Counter: XrtRead_Empty
Value: 1280
Counter: XrtReleaseAllocationHandle_Empty
Value: 1280
Counter: XrtReleaseCompileHandle_Empty
Value: 1280
Counter: XrtSessionCount
Value: 12
Counter: XrtSubTuple_Empty
Value: 1280
Counter: aten::_local_scalar_dense
Value: 386
Epoch 44 begin 19:01:45
training/ 19:01:52, device xla:1, step 10, Rate=168.56, Global Rate=193.24
training/ 19:01:59, device xla:1, step 20, Rate=177.01, Global Rate=185.20
Epoch 44 Training stats:
device xla:1
| epoch 044 | loss 6.997 | nll_loss 5.826 | ppl 56.73 | wps 2877 | ups 1 | wpb 3319.773 | bsz 128.000 | num_updates 968 | lr 0.000121076 | gnorm 6.556 | clip 0.000 | oom 0.000 | wall 1117 | train_wall 623
Epoch 44 Tracker Rates:
Rate=177.07, Global Rate=184.43
Epoch 44 end 19:02:01
Metric: CompileTime
TotalSamples: 7
Counter: 03m56s897ms46.438us
ValueRate: 737ms590.273us / second
Rate: 0.0293134 / second
Percentiles: 1%=091ms699.712us; 5%=091ms699.712us; 10%=091ms699.712us; 20%=206ms627.326us; 50%=11s663ms676.532us; 80%=01m11s278ms507.643us; 90%=01m12s293ms561.423us; 95%=01m12s293ms561.423us; 99%=01m12s293ms561.423us
Metric: ExecuteTime
TotalSamples: 1957
Counter: 14m43s955ms513.109us
ValueRate: 879ms911.766us / second
Rate: 2.14631 / second
Percentiles: 1%=002ms184.801us; 5%=215ms436.254us; 10%=216ms20.961us; 20%=217ms596.593us; 50%=220ms823.010us; 80%=616ms121.658us; 90%=617ms447.165us; 95%=619ms534.130us; 99%=621ms456.373us
Metric: InboundData
TotalSamples: 391
Counter: 1.40KB
ValueRate: 1.53B / second
Rate: 0.416357 / second
Percentiles: 1%=1.00B; 5%=1.00B; 10%=1.00B; 20%=4.00B; 50%=4.00B; 80%=4.00B; 90%=4.00B; 95%=4.00B; 99%=4.00B
Metric: OutboundData
TotalSamples: 5194
Counter: 1.15GB
ValueRate: 391.89KB / second
Rate: 5.65124 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=8.00B; 80%=8.00B; 90%=8.00B; 95%=776.00KB; 99%=776.00KB
Metric: ReleaseDataHandlesTime
TotalSamples: 28782
Counter: 02m33s548ms691.738us
ValueRate: 094ms69.667us / second
Rate: 34.5451 / second
Percentiles: 1%=586.608us; 5%=749.492us; 10%=830.527us; 20%=983.689us; 50%=001ms421.350us; 80%=005ms993.458us; 90%=007ms949.815us; 95%=009ms567.122us; 99%=010ms851.335us
Metric: TransferFromServerTime
TotalSamples: 391
Counter: 01s154ms255.476us
ValueRate: 001ms229.110us / second
Rate: 0.416357 / second
Percentiles: 1%=737.370us; 5%=850.771us; 10%=944.298us; 20%=001ms95.189us; 50%=001ms346.061us; 80%=002ms428.530us; 90%=003ms395.031us; 95%=005ms753.517us; 99%=041ms787.486us
Metric: TransferToServerTime
TotalSamples: 5194
Counter: 12m03s796ms163.179us
ValueRate: 774ms545.113us / second
Rate: 5.66672 / second
Percentiles: 1%=001ms306.454us; 5%=001ms438.772us; 10%=002ms558.725us; 20%=002ms744.094us; 50%=006ms410.496us; 80%=216ms672.522us; 90%=515ms65.006us; 95%=520ms763.269us; 99%=528ms425.833us
Counter: CachedSyncTensors
Value: 1950
Counter: CreateCompileHandles
Value: 7
Counter: CreateDataHandles
Value: 730164
Counter: CreateXlaTensor
Value: 5729000
Counter: DestroyDataHandles
Value: 729284
Counter: DestroyXlaTensor
Value: 5728248
Counter: ReleaseDataHandles
Value: 729284
Counter: SyncTensorsToData
Value: 184
Counter: UncachedSyncTensors
Value: 7
Counter: XRTAllocateFromTensor_Empty
Value: 230
Counter: XrtCompile_Empty
Value: 1280
Counter: XrtExecuteChained_Empty
Value: 1280
Counter: XrtExecute_Empty
Value: 1280
Counter: XrtRead_Empty
Value: 1280
Counter: XrtReleaseAllocationHandle_Empty
Value: 1280
Counter: XrtReleaseCompileHandle_Empty
Value: 1280
Counter: XrtSessionCount
Value: 12
Counter: XrtSubTuple_Empty
Value: 1280
Counter: aten::_local_scalar_dense
Value: 391
Validating the subset "valid"
| WARNING: 72 samples have invalid sizes and will be skipped, max_positions=(64, 64), first few sample ids=[1808, 612, 1824, 2185, 1244, 723, 1226, 388, 1791, 378]
validation/ 19:02:01, device xla:1, step 0
validation/ 19:02:03, device xla:1, step 10
validation/ 19:02:05, device xla:1, step 20
validation stats on subset "valid" - 19:02:06
| epoch 044 | valid on 'valid' subset | loss 9.682 | nll_loss 8.751 | ppl 430.95 | num_updates 968
old learning rate: 0.00011832635000000002
new learning rate: 0.00012107580000000002
Metric: CompileTime
TotalSamples: 7
Counter: 03m56s897ms46.438us
ValueRate: 737ms590.273us / second
Rate: 0.0293134 / second
Percentiles: 1%=091ms699.712us; 5%=091ms699.712us; 10%=091ms699.712us; 20%=206ms627.326us; 50%=11s663ms676.532us; 80%=01m11s278ms507.643us; 90%=01m12s293ms561.423us; 95%=01m12s293ms561.423us; 99%=01m12s293ms561.423us
Metric: ExecuteTime
TotalSamples: 1980
Counter: 14m48s722ms141.872us
ValueRate: 881ms554.005us / second
Rate: 2.17308 / second
Percentiles: 1%=002ms184.801us; 5%=215ms479.522us; 10%=216ms24.612us; 20%=217ms577.610us; 50%=219ms922.931us; 80%=616ms87.357us; 90%=617ms374.538us; 95%=619ms501.157us; 99%=621ms456.373us
Metric: InboundData
TotalSamples: 395
Counter: 1.42KB
ValueRate: 1.54B / second
Rate: 0.418362 / second
Percentiles: 1%=1.00B; 5%=1.00B; 10%=1.00B; 20%=4.00B; 50%=4.00B; 80%=4.00B; 90%=4.00B; 95%=4.00B; 99%=4.00B
Metric: OutboundData
TotalSamples: 5226
Counter: 1.15GB
ValueRate: 408.19KB / second
Rate: 5.6694 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=8.00B; 80%=8.00B; 90%=388.00KB; 95%=776.00KB; 99%=776.00KB
Metric: ReleaseDataHandlesTime
TotalSamples: 28886
Counter: 02m33s657ms746.839us
ValueRate: 082ms949.193us / second
Rate: 32.1785 / second
Percentiles: 1%=593.883us; 5%=749.492us; 10%=836.037us; 20%=973.898us; 50%=001ms368.864us; 80%=004ms307.346us; 90%=007ms670.891us; 95%=008ms310.962us; 99%=010ms759.463us
Metric: TransferFromServerTime
TotalSamples: 395
Counter: 01s161ms867.144us
ValueRate: 001ms229.527us / second
Rate: 0.418362 / second
Percentiles: 1%=737.370us; 5%=850.771us; 10%=944.157us; 20%=001ms95.189us; 50%=001ms351.346us; 80%=002ms428.530us; 90%=003ms395.031us; 95%=005ms753.517us; 99%=041ms787.486us
Metric: TransferToServerTime
TotalSamples: 5226
Counter: 12m08s704ms478.661us
ValueRate: 776ms614.928us / second
Rate: 5.66958 / second
Percentiles: 1%=001ms306.454us; 5%=001ms438.772us; 10%=002ms558.725us; 20%=002ms747.905us; 50%=006ms491.423us; 80%=215ms208.396us; 90%=515ms29.689us; 95%=520ms547.619us; 99%=527ms768.419us
Counter: CachedSyncTensors
Value: 1973
Counter: CreateCompileHandles
Value: 7
Counter: CreateDataHandles
Value: 730444
Counter: CreateXlaTensor
Value: 5753601
Counter: DestroyDataHandles
Value: 729563
Counter: DestroyXlaTensor
Value: 5752850
Counter: ReleaseDataHandles
Value: 729565
Counter: SyncTensorsToData
Value: 184
Counter: UncachedSyncTensors
Value: 7
Counter: XRTAllocateFromTensor_Empty
Value: 230
Counter: XrtCompile_Empty
Value: 1280
Counter: XrtExecuteChained_Empty
Value: 1280
Counter: XrtExecute_Empty
Value: 1280
Counter: XrtRead_Empty
Value: 1280
Counter: XrtReleaseAllocationHandle_Empty
Value: 1280
Counter: XrtReleaseCompileHandle_Empty
Value: 1280
Counter: XrtSessionCount
Value: 12
Counter: XrtSubTuple_Empty
Value: 1280
Counter: aten::_local_scalar_dense
Value: 395
Epoch 45 begin 19:02:06
training/ 19:02:12, device xla:1, step 10, Rate=172.09, Global Rate=195.97
training/ 19:02:20, device xla:1, step 20, Rate=177.14, Global Rate=186.80
Epoch 45 Training stats:
device xla:1
| epoch 045 | loss 6.890 | nll_loss 5.703 | ppl 52.09 | wps 2888 | ups 1 | wpb 3319.773 | bsz 128.000 | num_updates 990 | lr 0.000123825 | gnorm 6.454 | clip 0.000 | oom 0.000 | wall 1138 | train_wall 637
Epoch 45 Tracker Rates:
Rate=176.77, Global Rate=185.78
Epoch 45 end 19:02:22
Metric: CompileTime
TotalSamples: 7
Counter: 03m56s897ms46.438us
ValueRate: 737ms590.273us / second
Rate: 0.0293134 / second
Percentiles: 1%=091ms699.712us; 5%=091ms699.712us; 10%=091ms699.712us; 20%=206ms627.326us; 50%=11s663ms676.532us; 80%=01m11s278ms507.643us; 90%=01m12s293ms561.423us; 95%=01m12s293ms561.423us; 99%=01m12s293ms561.423us
Metric: ExecuteTime
TotalSamples: 2002
Counter: 14m01s277ms723.398us
ValueRate: 879ms232.405us / second
Rate: 2.14704 / second
Percentiles: 1%=002ms184.801us; 5%=215ms479.522us; 10%=216ms58.274us; 20%=217ms594.216us; 50%=220ms823.010us; 80%=616ms160.190us; 90%=618ms501.147us; 95%=619ms567.827us; 99%=622ms591.221us
Metric: InboundData
TotalSamples: 400
Counter: 1.44KB
ValueRate: 1.53B / second
Rate: 0.416674 / second
Percentiles: 1%=1.00B; 5%=1.00B; 10%=1.00B; 20%=4.00B; 50%=4.00B; 80%=4.00B; 90%=4.00B; 95%=4.00B; 99%=4.00B
Metric: OutboundData
TotalSamples: 5313
Counter: 1.16GB
ValueRate: 392.80KB / second
Rate: 5.66444 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=8.00B; 80%=8.00B; 90%=8.00B; 95%=776.00KB; 99%=776.00KB
Metric: ReleaseDataHandlesTime
TotalSamples: 29429
Counter: 02m34s389ms736.936us
ValueRate: 094ms89.507us / second
Rate: 33.8888 / second
Percentiles: 1%=600.149us; 5%=759.486us; 10%=838.485us; 20%=989.820us; 50%=001ms422.148us; 80%=005ms136.954us; 90%=007ms113.015us; 95%=008ms497.834us; 99%=011ms815.553us
Metric: TransferFromServerTime
TotalSamples: 400
Counter: 01s171ms320.690us
ValueRate: 001ms220.147us / second
Rate: 0.416674 / second
Percentiles: 1%=738.270us; 5%=851.507us; 10%=944.298us; 20%=001ms96.757us; 50%=001ms354.718us; 80%=002ms428.530us; 90%=003ms415.193us; 95%=005ms753.517us; 99%=041ms787.486us
Metric: TransferToServerTime
TotalSamples: 5313
Counter: 12m18s442ms98.226us
ValueRate: 769ms226.385us / second
Rate: 5.66443 / second
Percentiles: 1%=001ms294.373us; 5%=001ms447.184us; 10%=002ms560.815us; 20%=002ms745.359us; 50%=006ms419.827us; 80%=215ms441.887us; 90%=515ms691.664us; 95%=519ms299.843us; 99%=527ms768.419us
Counter: CachedSyncTensors
Value: 1995
Counter: CreateCompileHandles
Value: 7
Counter: CreateDataHandles
Value: 746761
Counter: CreateXlaTensor
Value: 5859751
Counter: DestroyDataHandles
Value: 745882
Counter: DestroyXlaTensor
Value: 5859000
Counter: ReleaseDataHandles
Value: 745882
Counter: SyncTensorsToData
Value: 184
Counter: UncachedSyncTensors
Value: 7
Counter: XRTAllocateFromTensor_Empty
Value: 230
Counter: XrtCompile_Empty
Value: 1280
Counter: XrtExecuteChained_Empty
Value: 1280
Counter: XrtExecute_Empty
Value: 1280
Counter: XrtRead_Empty
Value: 1280
Counter: XrtReleaseAllocationHandle_Empty
Value: 1280
Counter: XrtReleaseCompileHandle_Empty
Value: 1280
Counter: XrtSessionCount
Value: 12
Counter: XrtSubTuple_Empty
Value: 1280
Counter: aten::_local_scalar_dense
Value: 400
Validating the subset "valid"
| WARNING: 72 samples have invalid sizes and will be skipped, max_positions=(64, 64), first few sample ids=[1808, 612, 1824, 2185, 1244, 723, 1226, 388, 1791, 378]
validation/ 19:02:22, device xla:1, step 0
validation/ 19:02:24, device xla:1, step 10
validation/ 19:02:26, device xla:1, step 20
validation stats on subset "valid" - 19:02:27
| epoch 045 | valid on 'valid' subset | loss 9.872 | nll_loss 9.008 | ppl 514.76 | num_updates 990
old learning rate: 0.00012107580000000002
new learning rate: 0.00012382525000000002
Metric: CompileTime
TotalSamples: 7
Counter: 03m56s897ms46.438us
ValueRate: 737ms590.273us / second
Rate: 0.0293134 / second
Percentiles: 1%=091ms699.712us; 5%=091ms699.712us; 10%=091ms699.712us; 20%=206ms627.326us; 50%=11s663ms676.532us; 80%=01m11s278ms507.643us; 90%=01m12s293ms561.423us; 95%=01m12s293ms561.423us; 99%=01m12s293ms561.423us
Metric: ExecuteTime
TotalSamples: 2025
Counter: 14m06s046ms981.291us
ValueRate: 880ms478.075us / second
Rate: 2.17301 / second
Percentiles: 1%=002ms174.273us; 5%=215ms479.522us; 10%=216ms58.274us; 20%=217ms563.288us; 50%=219ms922.931us; 80%=616ms87.357us; 90%=617ms298.979us; 95%=618ms282.824us; 99%=621ms117.380us
Metric: InboundData
TotalSamples: 404
Counter: 1.45KB
ValueRate: 1.54B / second
Rate: 0.418638 / second
Percentiles: 1%=1.00B; 5%=1.00B; 10%=1.00B; 20%=4.00B; 50%=4.00B; 80%=4.00B; 90%=4.00B; 95%=4.00B; 99%=4.00B
Metric: OutboundData
TotalSamples: 5341
Counter: 1.16GB
ValueRate: 407.92KB / second
Rate: 5.66571 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=8.00B; 80%=8.00B; 90%=388.00KB; 95%=776.00KB; 99%=776.00KB
Metric: ReleaseDataHandlesTime
TotalSamples: 29525
Counter: 02m34s498ms870.489us
ValueRate: 083ms852.979us / second
Rate: 31.5782 / second
Percentiles: 1%=600.149us; 5%=761.815us; 10%=852.394us; 20%=978.129us; 50%=001ms344.633us; 80%=004ms448.028us; 90%=007ms887.347us; 95%=008ms401.196us; 99%=011ms815.553us
Metric: TransferFromServerTime
TotalSamples: 404
Counter: 01s178ms66.429us
ValueRate: 001ms220.751us / second
Rate: 0.418638 / second
Percentiles: 1%=738.270us; 5%=851.507us; 10%=944.298us; 20%=001ms96.757us; 50%=001ms353.268us; 80%=002ms428.530us; 90%=003ms395.031us; 95%=005ms724.579us; 99%=041ms732.304us
Metric: TransferToServerTime
TotalSamples: 5341
Counter: 12m23s335ms243.842us
ValueRate: 774ms34.951us / second
Rate: 5.66583 / second
Percentiles: 1%=001ms295.422us; 5%=001ms457.853us; 10%=002ms565.848us; 20%=002ms755.781us; 50%=007ms510.475us; 80%=215ms935.472us; 90%=515ms594.842us; 95%=519ms917.157us; 99%=527ms768.419us
Counter: CachedSyncTensors
Value: 2018
Counter: CreateCompileHandles
Value: 7
Counter: CreateDataHandles
Value: 747037
Counter: CreateXlaTensor
Value: 5884352
Counter: DestroyDataHandles
Value: 746156
Counter: DestroyXlaTensor
Value: 5883600
Counter: ReleaseDataHandles
Value: 746157
Counter: SyncTensorsToData
Value: 184
Counter: UncachedSyncTensors
Value: 7
Counter: XRTAllocateFromTensor_Empty
Value: 230
Counter: XrtCompile_Empty
Value: 1280
Counter: XrtExecuteChained_Empty
Value: 1280
Counter: XrtExecute_Empty
Value: 1280
Counter: XrtRead_Empty
Value: 1280
Counter: XrtReleaseAllocationHandle_Empty
Value: 1280
Counter: XrtReleaseCompileHandle_Empty
Value: 1280
Counter: XrtSessionCount
Value: 12
Counter: XrtSubTuple_Empty
Value: 1280
Counter: aten::_local_scalar_dense
Value: 404
Epoch 46 begin 19:02:27
training/ 19:02:33, device xla:1, step 10, Rate=169.56, Global Rate=194.05
training/ 19:02:41, device xla:1, step 20, Rate=176.91, Global Rate=185.54
Epoch 46 Training stats:
device xla:1
| epoch 046 | loss 6.788 | nll_loss 5.584 | ppl 47.98 | wps 2899 | ups 1 | wpb 3319.773 | bsz 128.000 | num_updates 1012 | lr 0.000126575 | gnorm 6.348 | clip 0.000 | oom 0.000 | wall 1159 | train_wall 651
Epoch 46 Tracker Rates:
Rate=177.07, Global Rate=184.77
Epoch 46 end 19:02:43
Metric: CompileTime
TotalSamples: 7
Counter: 03m56s897ms46.438us
ValueRate: 737ms590.273us / second
Rate: 0.0293134 / second
Percentiles: 1%=091ms699.712us; 5%=091ms699.712us; 10%=091ms699.712us; 20%=206ms627.326us; 50%=11s663ms676.532us; 80%=01m11s278ms507.643us; 90%=01m12s293ms561.423us; 95%=01m12s293ms561.423us; 99%=01m12s293ms561.423us
Metric: ExecuteTime
TotalSamples: 2047
Counter: 14m20s587ms983.454us
ValueRate: 879ms788.848us / second
Rate: 2.14636 / second
Percentiles: 1%=002ms174.273us; 5%=215ms479.522us; 10%=216ms58.274us; 20%=217ms563.635us; 50%=220ms823.010us; 80%=616ms69.036us; 90%=617ms232.588us; 95%=618ms175.887us; 99%=621ms117.380us
Metric: InboundData
TotalSamples: 409
Counter: 1.47KB
ValueRate: 1.53B / second
Rate: 0.416948 / second
Percentiles: 1%=1.00B; 5%=1.00B; 10%=1.00B; 20%=4.00B; 50%=4.00B; 80%=4.00B; 90%=4.00B; 95%=4.00B; 99%=4.00B
Metric: OutboundData
TotalSamples: 5432
Counter: 1.16GB
ValueRate: 389.59KB / second
Rate: 5.68012 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=8.00B; 80%=8.00B; 90%=8.00B; 95%=776.00KB; 99%=776.00KB
Metric: ReleaseDataHandlesTime
TotalSamples: 30114
Counter: 02m36s263ms710.043us
ValueRate: 096ms735.822us / second
Rate: 33.828 / second
Percentiles: 1%=622.902us; 5%=769.258us; 10%=874.924us; 20%=001ms16.998us; 50%=001ms449.463us; 80%=005ms914.928us; 90%=007ms352.271us; 95%=009ms54.623us; 99%=011ms996.504us
Metric: TransferFromServerTime
TotalSamples: 409
Counter: 01s185ms940.995us
ValueRate: 001ms207.968us / second
Rate: 0.416948 / second
Percentiles: 1%=737.370us; 5%=849.291us; 10%=930.002us; 20%=001ms92.148us; 50%=001ms351.346us; 80%=002ms428.530us; 90%=003ms395.031us; 95%=005ms724.579us; 99%=041ms732.304us
Metric: TransferToServerTime
TotalSamples: 5432
Counter: 13m34s221ms34.175us
ValueRate: 769ms698.855us / second
Rate: 5.68024 / second
Percentiles: 1%=001ms295.422us; 5%=001ms447.184us; 10%=002ms558.885us; 20%=002ms738.934us; 50%=006ms437.105us; 80%=215ms422.581us; 90%=515ms910.015us; 95%=518ms233.474us; 99%=527ms768.419us
Counter: CachedSyncTensors
Value: 2040
Counter: CreateCompileHandles
Value: 7
Counter: CreateDataHandles
Value: 763358
Counter: CreateXlaTensor
Value: 5990502
Counter: DestroyDataHandles
Value: 762478
Counter: DestroyXlaTensor
Value: 5989750
Counter: ReleaseDataHandles
Value: 762478
Counter: SyncTensorsToData
Value: 184
Counter: UncachedSyncTensors
Value: 7
Counter: XRTAllocateFromTensor_Empty
Value: 230
Counter: XrtCompile_Empty
Value: 1280
Counter: XrtExecuteChained_Empty
Value: 1280
Counter: XrtExecute_Empty
Value: 1280
Counter: XrtRead_Empty
Value: 1280
Counter: XrtReleaseAllocationHandle_Empty
Value: 1280
Counter: XrtReleaseCompileHandle_Empty
Value: 1280
Counter: XrtSessionCount
Value: 12
Counter: XrtSubTuple_Empty
Value: 1280
Counter: aten::_local_scalar_dense
Value: 409
Validating the subset "valid"
| WARNING: 72 samples have invalid sizes and will be skipped, max_positions=(64, 64), first few sample ids=[1808, 612, 1824, 2185, 1244, 723, 1226, 388, 1791, 378]
validation/ 19:02:43, device xla:1, step 0
validation/ 19:02:45, device xla:1, step 10
validation/ 19:02:47, device xla:1, step 20
validation stats on subset "valid" - 19:02:48
| epoch 046 | valid on 'valid' subset | loss 10.194 | nll_loss 9.347 | ppl 651.36 | num_updates 1012
old learning rate: 0.00012382525000000002
new learning rate: 0.00012657470000000003
Metric: CompileTime
TotalSamples: 7
Counter: 03m56s897ms46.438us
ValueRate: 737ms590.273us / second
Rate: 0.0293134 / second
Percentiles: 1%=091ms699.712us; 5%=091ms699.712us; 10%=091ms699.712us; 20%=206ms627.326us; 50%=11s663ms676.532us; 80%=01m11s278ms507.643us; 90%=01m12s293ms561.423us; 95%=01m12s293ms561.423us; 99%=01m12s293ms561.423us
Metric: ExecuteTime
TotalSamples: 2070
Counter: 14m24s353ms274.797us
ValueRate: 880ms118.526us / second
Rate: 2.17245 / second
Percentiles: 1%=002ms174.273us; 5%=215ms475.731us; 10%=216ms24.612us; 20%=217ms548.642us; 50%=219ms916.603us; 80%=616ms956.780us; 90%=617ms180.598us; 95%=618ms64.007us; 99%=621ms117.380us
Metric: InboundData
TotalSamples: 413
Counter: 1.48KB
ValueRate: 1.54B / second
Rate: 0.418867 / second
Percentiles: 1%=1.00B; 5%=1.00B; 10%=1.00B; 20%=4.00B; 50%=4.00B; 80%=4.00B; 90%=4.00B; 95%=4.00B; 99%=4.00B
Metric: OutboundData
TotalSamples: 5460
Counter: 1.17GB
ValueRate: 408.09KB / second
Rate: 5.66805 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=8.00B; 80%=8.00B; 90%=388.00KB; 95%=776.00KB; 99%=776.00KB
Metric: ReleaseDataHandlesTime
TotalSamples: 30213
Counter: 02m36s371ms255.768us
ValueRate: 084ms886.725us / second
Rate: 31.4971 / second
Percentiles: 1%=635.189us; 5%=778.604us; 10%=861.083us; 20%=001ms0.091us; 50%=001ms357.254us; 80%=004ms351.014us; 90%=007ms144.557us; 95%=009ms927.213us; 99%=011ms836.486us
Metric: TransferFromServerTime
TotalSamples: 413
Counter: 01s191ms194.315us
ValueRate: 001ms208.116us / second
Rate: 0.418867 / second
Percentiles: 1%=737.370us; 5%=849.291us; 10%=935.226us; 20%=001ms95.189us; 50%=001ms354.718us; 80%=002ms418.245us; 90%=003ms232.427us; 95%=005ms724.579us; 99%=041ms732.304us
Metric: TransferToServerTime
TotalSamples: 5460
Counter: 13m39s116ms873.263us
ValueRate: 773ms287.581us / second
Rate: 5.66198 / second
Percentiles: 1%=001ms317.661us; 5%=001ms457.853us; 10%=002ms574.978us; 20%=002ms752.772us; 50%=007ms503.073us; 80%=215ms962.364us; 90%=515ms618.466us; 95%=518ms20.443us; 99%=527ms768.419us
Counter: CachedSyncTensors
Value: 2063
Counter: CreateCompileHandles
Value: 7
Counter: CreateDataHandles
Value: 763634
Counter: CreateXlaTensor
Value: 6015103
Counter: DestroyDataHandles
Value: 762753
Counter: DestroyXlaTensor
Value: 6014351
Counter: ReleaseDataHandles
Value: 762754
Counter: SyncTensorsToData
Value: 184
Counter: UncachedSyncTensors
Value: 7
Counter: XRTAllocateFromTensor_Empty
Value: 230
Counter: XrtCompile_Empty
Value: 1280
Counter: XrtExecuteChained_Empty
Value: 1280
Counter: XrtExecute_Empty
Value: 1280
Counter: XrtRead_Empty
Value: 1280
Counter: XrtReleaseAllocationHandle_Empty
Value: 1280
Counter: XrtReleaseCompileHandle_Empty
Value: 1280
Counter: XrtSessionCount
Value: 12
Counter: XrtSubTuple_Empty
Value: 1280
Counter: aten::_local_scalar_dense
Value: 413
Epoch 47 begin 19:02:48
training/ 19:02:54, device xla:1, step 10, Rate=169.32, Global Rate=194.25
training/ 19:03:01, device xla:1, step 20, Rate=177.32, Global Rate=186.04
Epoch 47 Training stats:
device xla:1
| epoch 047 | loss 6.689 | nll_loss 5.470 | ppl 44.32 | wps 2910 | ups 1 | wpb 3319.773 | bsz 128.000 | num_updates 1034 | lr 0.000129324 | gnorm 6.242 | clip 0.000 | oom 0.000 | wall 1180 | train_wall 665
Epoch 47 Tracker Rates:
Rate=177.23, Global Rate=185.21
Epoch 47 end 19:03:04
Metric: CompileTime
TotalSamples: 7
Counter: 03m56s897ms46.438us
ValueRate: 737ms590.273us / second
Rate: 0.0293134 / second
Percentiles: 1%=091ms699.712us; 5%=091ms699.712us; 10%=091ms699.712us; 20%=206ms627.326us; 50%=11s663ms676.532us; 80%=01m11s278ms507.643us; 90%=01m12s293ms561.423us; 95%=01m12s293ms561.423us; 99%=01m12s293ms561.423us
Metric: ExecuteTime
TotalSamples: 2092
Counter: 15m38s897ms334.927us
ValueRate: 879ms545.378us / second
Rate: 2.14588 / second
Percentiles: 1%=002ms174.273us; 5%=215ms475.731us; 10%=216ms77.350us; 20%=217ms563.288us; 50%=220ms823.010us; 80%=616ms1.509us; 90%=617ms200.964us; 95%=618ms80.485us; 99%=621ms117.380us
Metric: InboundData
TotalSamples: 418
Counter: 1.50KB
ValueRate: 1.53B / second
Rate: 0.417225 / second
Percentiles: 1%=1.00B; 5%=1.00B; 10%=1.00B; 20%=4.00B; 50%=4.00B; 80%=4.00B; 90%=4.00B; 95%=4.00B; 99%=4.00B
Metric: OutboundData
TotalSamples: 5548
Counter: 1.17GB
ValueRate: 392.13KB / second
Rate: 5.65467 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=8.00B; 80%=8.00B; 90%=8.00B; 95%=776.00KB; 99%=776.00KB
Metric: ReleaseDataHandlesTime
TotalSamples: 30804
Counter: 02m38s052ms688.899us
ValueRate: 095ms880.000us / second
Rate: 35.4807 / second
Percentiles: 1%=642.032us; 5%=770.113us; 10%=846.737us; 20%=982.598us; 50%=001ms383.228us; 80%=004ms426.080us; 90%=007ms159.906us; 95%=009ms635.416us; 99%=011ms901.314us
Metric: TransferFromServerTime
TotalSamples: 418
Counter: 01s200ms519.155us
ValueRate: 001ms197.294us / second
Rate: 0.417225 / second
Percentiles: 1%=737.370us; 5%=845.519us; 10%=929.694us; 20%=001ms92.148us; 50%=001ms354.718us; 80%=002ms418.245us; 90%=003ms395.031us; 95%=005ms724.579us; 99%=041ms732.304us
Metric: TransferToServerTime
TotalSamples: 5548
Counter: 13m50s951ms338.480us
ValueRate: 767ms884.520us / second
Rate: 5.65466 / second
Percentiles: 1%=001ms317.661us; 5%=001ms457.853us; 10%=002ms574.978us; 20%=002ms728.941us; 50%=006ms405.271us; 80%=215ms422.581us; 90%=515ms896.288us; 95%=518ms829.924us; 99%=523ms351.916us
Counter: CachedSyncTensors
Value: 2085
Counter: CreateCompileHandles
Value: 7
Counter: CreateDataHandles
Value: 779952
Counter: CreateXlaTensor
Value: 6121253
Counter: DestroyDataHandles
Value: 779072
Counter: DestroyXlaTensor
Value: 6120501
Counter: ReleaseDataHandles
Value: 779072
Counter: SyncTensorsToData
Value: 184
Counter: UncachedSyncTensors
Value: 7
Counter: XRTAllocateFromTensor_Empty
Value: 230
Counter: XrtCompile_Empty
Value: 1280
Counter: XrtExecuteChained_Empty
Value: 1280
Counter: XrtExecute_Empty
Value: 1280
Counter: XrtRead_Empty
Value: 1280
Counter: XrtReleaseAllocationHandle_Empty
Value: 1280
Counter: XrtReleaseCompileHandle_Empty
Value: 1280
Counter: XrtSessionCount
Value: 12
Counter: XrtSubTuple_Empty
Value: 1280
Counter: aten::_local_scalar_dense
Value: 418
Validating the subset "valid"
| WARNING: 72 samples have invalid sizes and will be skipped, max_positions=(64, 64), first few sample ids=[1808, 612, 1824, 2185, 1244, 723, 1226, 388, 1791, 378]
validation/ 19:03:04, device xla:1, step 0
validation/ 19:03:06, device xla:1, step 10
validation/ 19:03:08, device xla:1, step 20
validation stats on subset "valid" - 19:03:09
| epoch 047 | valid on 'valid' subset | loss 9.833 | nll_loss 8.941 | ppl 491.34 | num_updates 1034
old learning rate: 0.00012657470000000003
new learning rate: 0.00012932415000000002
Metric: CompileTime
TotalSamples: 7
Counter: 03m56s897ms46.438us
ValueRate: 737ms590.273us / second
Rate: 0.0293134 / second
Percentiles: 1%=091ms699.712us; 5%=091ms699.712us; 10%=091ms699.712us; 20%=206ms627.326us; 50%=11s663ms676.532us; 80%=01m11s278ms507.643us; 90%=01m12s293ms561.423us; 95%=01m12s293ms561.423us; 99%=01m12s293ms561.423us
Metric: ExecuteTime
TotalSamples: 2115
Counter: 15m43s662ms237.656us
ValueRate: 880ms921.586us / second
Rate: 2.17207 / second
Percentiles: 1%=002ms184.801us; 5%=215ms430.574us; 10%=216ms976.284us; 20%=217ms515.711us; 50%=219ms916.603us; 80%=616ms934.390us; 90%=617ms95.149us; 95%=618ms75.389us; 99%=621ms117.380us
Metric: InboundData
TotalSamples: 422
Counter: 1.51KB
ValueRate: 1.54B / second
Rate: 0.419098 / second
Percentiles: 1%=1.00B; 5%=1.00B; 10%=1.00B; 20%=4.00B; 50%=4.00B; 80%=4.00B; 90%=4.00B; 95%=4.00B; 99%=4.00B
Metric: OutboundData
TotalSamples: 5576
Counter: 1.18GB
ValueRate: 407.58KB / second
Rate: 5.66095 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=8.00B; 80%=8.00B; 90%=388.00KB; 95%=776.00KB; 99%=776.00KB
Metric: ReleaseDataHandlesTime
TotalSamples: 30904
Counter: 02m38s167ms880.836us
ValueRate: 080ms306.477us / second
Rate: 32.2385 / second
Percentiles: 1%=661.368us; 5%=781.077us; 10%=854.074us; 20%=984.982us; 50%=001ms311.127us; 80%=004ms747.181us; 90%=007ms557.584us; 95%=008ms344.942us; 99%=011ms836.486us
Metric: TransferFromServerTime
TotalSamples: 422
Counter: 01s207ms579.998us
ValueRate: 001ms198.282us / second
Rate: 0.419098 / second
Percentiles: 1%=737.370us; 5%=848.229us; 10%=930.002us; 20%=001ms95.189us; 50%=001ms354.718us; 80%=002ms418.245us; 90%=003ms232.427us; 95%=004ms473.119us; 99%=041ms732.304us
Metric: TransferToServerTime
TotalSamples: 5576
Counter: 13m55s853ms38.439us
ValueRate: 772ms777.744us / second
Rate: 5.65498 / second
Percentiles: 1%=001ms330.942us; 5%=001ms476.565us; 10%=002ms580.541us; 20%=002ms738.885us; 50%=006ms468.706us; 80%=215ms756.311us; 90%=515ms670.092us; 95%=518ms742.197us; 99%=523ms276.828us
Counter: CachedSyncTensors
Value: 2108
Counter: CreateCompileHandles
Value: 7
Counter: CreateDataHandles
Value: 780228
Counter: CreateXlaTensor
Value: 6145854
Counter: DestroyDataHandles
Value: 779347
Counter: DestroyXlaTensor
Value: 6145102
Counter: ReleaseDataHandles
Value: 779348
Counter: SyncTensorsToData
Value: 184
Counter: UncachedSyncTensors
Value: 7
Counter: XRTAllocateFromTensor_Empty
Value: 230
Counter: XrtCompile_Empty
Value: 1280
Counter: XrtExecuteChained_Empty
Value: 1280
Counter: XrtExecute_Empty
Value: 1280
Counter: XrtRead_Empty
Value: 1280
Counter: XrtReleaseAllocationHandle_Empty
Value: 1280
Counter: XrtReleaseCompileHandle_Empty
Value: 1280
Counter: XrtSessionCount
Value: 12
Counter: XrtSubTuple_Empty
Value: 1280
Counter: aten::_local_scalar_dense
Value: 422
Epoch 48 begin 19:03:09
training/ 19:03:15, device xla:1, step 10, Rate=169.55, Global Rate=193.79
training/ 19:03:22, device xla:1, step 20, Rate=178.20, Global Rate=186.49
Epoch 48 Training stats:
device xla:1
| epoch 048 | loss 6.601 | nll_loss 5.370 | ppl 41.34 | wps 2920 | ups 1 | wpb 3319.773 | bsz 128.000 | num_updates 1056 | lr 0.000132074 | gnorm 6.238 | clip 0.000 | oom 0.000 | wall 1201 | train_wall 679
Epoch 48 Tracker Rates:
Rate=177.90, Global Rate=185.62
Epoch 48 end 19:03:24
Metric: CompileTime
TotalSamples: 7
Counter: 03m56s897ms46.438us
ValueRate: 737ms590.273us / second
Rate: 0.0293134 / second
Percentiles: 1%=091ms699.712us; 5%=091ms699.712us; 10%=091ms699.712us; 20%=206ms627.326us; 50%=11s663ms676.532us; 80%=01m11s278ms507.643us; 90%=01m12s293ms561.423us; 95%=01m12s293ms561.423us; 99%=01m12s293ms561.423us
Metric: ExecuteTime
TotalSamples: 2137
Counter: 15m56s199ms88.063us
ValueRate: 878ms389.605us / second
Rate: 2.14572 / second
Percentiles: 1%=002ms184.801us; 5%=215ms432.182us; 10%=216ms17.653us; 20%=217ms548.642us; 50%=220ms823.010us; 80%=616ms934.390us; 90%=617ms57.405us; 95%=618ms11.687us; 99%=621ms105.453us
Metric: InboundData
TotalSamples: 427
Counter: 1.53KB
ValueRate: 1.54B / second
Rate: 0.417497 / second
Percentiles: 1%=1.00B; 5%=1.00B; 10%=1.00B; 20%=4.00B; 50%=4.00B; 80%=4.00B; 90%=4.00B; 95%=4.00B; 99%=4.00B
Metric: OutboundData
TotalSamples: 5662
Counter: 1.18GB
ValueRate: 390.60KB / second
Rate: 5.63271 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=8.00B; 80%=8.00B; 90%=8.00B; 95%=776.00KB; 99%=776.00KB
Metric: ReleaseDataHandlesTime
TotalSamples: 31454
Counter: 02m40s823ms243.608us
ValueRate: 092ms368.434us / second
Rate: 33.8568 / second
Percentiles: 1%=661.023us; 5%=777.195us; 10%=861.261us; 20%=001ms3.989us; 50%=001ms420.094us; 80%=005ms507.949us; 90%=007ms311.865us; 95%=009ms592.518us; 99%=010ms390.599us
Metric: TransferFromServerTime
TotalSamples: 427
Counter: 01s217ms609.152us
ValueRate: 001ms189.533us / second
Rate: 0.417497 / second
Percentiles: 1%=737.370us; 5%=848.229us; 10%=930.002us; 20%=001ms96.757us; 50%=001ms353.268us; 80%=002ms418.245us; 90%=003ms395.031us; 95%=005ms724.579us; 99%=041ms732.304us
Metric: TransferToServerTime
TotalSamples: 5662
Counter: 13m06s656ms287.880us
ValueRate: 766ms914.493us / second
Rate: 5.63285 / second
Percentiles: 1%=001ms330.942us; 5%=001ms461.985us; 10%=002ms579.010us; 20%=002ms728.941us; 50%=006ms399.529us; 80%=215ms255.390us; 90%=515ms618.466us; 95%=518ms545.432us; 99%=523ms701.595us
Counter: CachedSyncTensors
Value: 2130
Counter: CreateCompileHandles
Value: 7
Counter: CreateDataHandles
Value: 796544
Counter: CreateXlaTensor
Value: 6252004
Counter: DestroyDataHandles
Value: 795664
Counter: DestroyXlaTensor
Value: 6251252
Counter: ReleaseDataHandles
Value: 795664
Counter: SyncTensorsToData
Value: 184
Counter: UncachedSyncTensors
Value: 7
Counter: XRTAllocateFromTensor_Empty
Value: 230
Counter: XrtCompile_Empty
Value: 1280
Counter: XrtExecuteChained_Empty
Value: 1280
Counter: XrtExecute_Empty
Value: 1280
Counter: XrtRead_Empty
Value: 1280
Counter: XrtReleaseAllocationHandle_Empty
Value: 1280
Counter: XrtReleaseCompileHandle_Empty
Value: 1280
Counter: XrtSessionCount
Value: 12
Counter: XrtSubTuple_Empty
Value: 1280
Counter: aten::_local_scalar_dense
Value: 427
Validating the subset "valid"
| WARNING: 72 samples have invalid sizes and will be skipped, max_positions=(64, 64), first few sample ids=[1808, 612, 1824, 2185, 1244, 723, 1226, 388, 1791, 378]
validation/ 19:03:25, device xla:1, step 0
validation/ 19:03:27, device xla:1, step 10
validation/ 19:03:29, device xla:1, step 20
validation stats on subset "valid" - 19:03:30
| epoch 048 | valid on 'valid' subset | loss 9.633 | nll_loss 8.667 | ppl 406.58 | num_updates 1056
old learning rate: 0.00012932415000000002
new learning rate: 0.0001320736
Metric: CompileTime
TotalSamples: 7
Counter: 03m56s897ms46.438us
ValueRate: 737ms590.273us / second
Rate: 0.0293134 / second
Percentiles: 1%=091ms699.712us; 5%=091ms699.712us; 10%=091ms699.712us; 20%=206ms627.326us; 50%=11s663ms676.532us; 80%=01m11s278ms507.643us; 90%=01m12s293ms561.423us; 95%=01m12s293ms561.423us; 99%=01m12s293ms561.423us
Metric: ExecuteTime
TotalSamples: 2160
Counter: 15m01s975ms725.555us
ValueRate: 880ms797.604us / second
Rate: 2.17197 / second
Percentiles: 1%=002ms184.801us; 5%=215ms432.182us; 10%=216ms17.653us; 20%=217ms548.642us; 50%=219ms891.981us; 80%=616ms861.137us; 90%=617ms10.174us; 95%=618ms877.032us; 99%=621ms902.551us
Metric: InboundData
TotalSamples: 431
Counter: 1.55KB
ValueRate: 1.54B / second
Rate: 0.419328 / second
Percentiles: 1%=1.00B; 5%=1.00B; 10%=1.00B; 20%=4.00B; 50%=4.00B; 80%=4.00B; 90%=4.00B; 95%=4.00B; 99%=4.00B
Metric: OutboundData
TotalSamples: 5691
Counter: 1.18GB
ValueRate: 404.46KB / second
Rate: 5.61771 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=8.00B; 80%=8.00B; 90%=388.00KB; 95%=776.00KB; 99%=776.00KB
Metric: ReleaseDataHandlesTime
TotalSamples: 31548
Counter: 02m40s934ms866.056us
ValueRate: 081ms318.830us / second
Rate: 30.8847 / second
Percentiles: 1%=661.023us; 5%=789.460us; 10%=875.375us; 20%=001ms13.612us; 50%=001ms361.813us; 80%=004ms224.712us; 90%=007ms162.962us; 95%=009ms561.791us; 99%=010ms340.344us
Metric: TransferFromServerTime
TotalSamples: 431
Counter: 01s223ms178.759us
ValueRate: 001ms190.053us / second
Rate: 0.419328 / second
Percentiles: 1%=737.370us; 5%=848.229us; 10%=930.002us; 20%=001ms95.189us; 50%=001ms351.346us; 80%=002ms418.245us; 90%=003ms391.030us; 95%=005ms724.579us; 99%=041ms732.304us
Metric: TransferToServerTime
TotalSamples: 5691
Counter: 13m11s574ms389.825us
ValueRate: 774ms654.056us / second
Rate: 5.63349 / second
Percentiles: 1%=001ms333.997us; 5%=001ms475.630us; 10%=002ms581.652us; 20%=002ms735.972us; 50%=006ms490.709us; 80%=215ms935.472us; 90%=514ms493.751us; 95%=517ms385.913us; 99%=523ms701.595us
Counter: CachedSyncTensors
Value: 2153
Counter: CreateCompileHandles
Value: 7
Counter: CreateDataHandles
Value: 796821
Counter: CreateXlaTensor
Value: 6276605
Counter: DestroyDataHandles
Value: 795940
Counter: DestroyXlaTensor
Value: 6275854
Counter: ReleaseDataHandles
Value: 795942
Counter: SyncTensorsToData
Value: 184
Counter: UncachedSyncTensors
Value: 7
Counter: XRTAllocateFromTensor_Empty
Value: 230
Counter: XrtCompile_Empty
Value: 1280
Counter: XrtExecuteChained_Empty
Value: 1280
Counter: XrtExecute_Empty
Value: 1280
Counter: XrtRead_Empty
Value: 1280
Counter: XrtReleaseAllocationHandle_Empty
Value: 1280
Counter: XrtReleaseCompileHandle_Empty
Value: 1280
Counter: XrtSessionCount
Value: 12
Counter: XrtSubTuple_Empty
Value: 1280
Counter: aten::_local_scalar_dense
Value: 431
Epoch 49 begin 19:03:30
training/ 19:03:36, device xla:1, step 10, Rate=167.34, Global Rate=191.30
training/ 19:03:43, device xla:1, step 20, Rate=176.13, Global Rate=184.40
Epoch 49 Training stats:
device xla:1
| epoch 049 | loss 6.513 | nll_loss 5.269 | ppl 38.56 | wps 2929 | ups 1 | wpb 3319.773 | bsz 128.000 | num_updates 1078 | lr 0.000134823 | gnorm 6.177 | clip 0.000 | oom 0.000 | wall 1222 | train_wall 692
Epoch 49 Tracker Rates:
Rate=176.50, Global Rate=183.72
Epoch 49 end 19:03:46
Metric: CompileTime
TotalSamples: 7
Counter: 03m56s897ms46.438us
ValueRate: 737ms590.273us / second
Rate: 0.0293134 / second
Percentiles: 1%=091ms699.712us; 5%=091ms699.712us; 10%=091ms699.712us; 20%=206ms627.326us; 50%=11s663ms676.532us; 80%=01m11s278ms507.643us; 90%=01m12s293ms561.423us; 95%=01m12s293ms561.423us; 99%=01m12s293ms561.423us
Metric: ExecuteTime
TotalSamples: 2182
Counter: 15m15s508ms794.185us
ValueRate: 878ms881.089us / second
Rate: 2.14463 / second
Percentiles: 1%=002ms184.801us; 5%=215ms432.182us; 10%=216ms24.612us; 20%=217ms558.214us; 50%=219ms345.886us; 80%=616ms817.896us; 90%=617ms31.058us; 95%=618ms872.414us; 99%=621ms902.551us
Metric: InboundData
TotalSamples: 436
Counter: 1.57KB
ValueRate: 1.54B / second
Rate: 0.417693 / second
Percentiles: 1%=1.00B; 5%=1.00B; 10%=1.00B; 20%=4.00B; 50%=4.00B; 80%=4.00B; 90%=4.00B; 95%=4.00B; 99%=4.00B
Metric: OutboundData
TotalSamples: 5782
Counter: 1.19GB
ValueRate: 390.29KB / second
Rate: 5.62813 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=8.00B; 80%=8.00B; 90%=8.00B; 95%=776.00KB; 99%=776.00KB
Metric: ReleaseDataHandlesTime
TotalSamples: 32092
Counter: 02m42s684ms709.386us
ValueRate: 096ms826.827us / second
Rate: 32.2392 / second
Percentiles: 1%=659.784us; 5%=789.460us; 10%=885.585us; 20%=001ms50.123us; 50%=001ms446.422us; 80%=006ms570.775us; 90%=008ms743.546us; 95%=009ms98.951us; 99%=011ms602.028us
Metric: TransferFromServerTime
TotalSamples: 436
Counter: 01s233ms920.455us
ValueRate: 001ms181.152us / second
Rate: 0.417693 / second
Percentiles: 1%=737.370us; 5%=848.229us; 10%=930.002us; 20%=001ms95.189us; 50%=001ms351.346us; 80%=002ms428.530us; 90%=003ms395.031us; 95%=005ms724.579us; 99%=041ms732.304us
Metric: TransferToServerTime
TotalSamples: 5782
Counter: 13m22s528ms812.302us
ValueRate: 763ms698.144us / second
Rate: 5.62829 / second
Percentiles: 1%=001ms294.373us; 5%=001ms454.754us; 10%=002ms568.002us; 20%=002ms728.758us; 50%=006ms399.529us; 80%=215ms334.074us; 90%=515ms40.593us; 95%=518ms674.603us; 99%=522ms158.336us
Counter: CachedSyncTensors
Value: 2175
Counter: CreateCompileHandles
Value: 7
Counter: CreateDataHandles
Value: 813142
Counter: CreateXlaTensor
Value: 6382755
Counter: DestroyDataHandles
Value: 812263
Counter: DestroyXlaTensor
Value: 6382004
Counter: ReleaseDataHandles
Value: 812263
Counter: SyncTensorsToData
Value: 184
Counter: UncachedSyncTensors
Value: 7
Counter: XRTAllocateFromTensor_Empty
Value: 230
Counter: XrtCompile_Empty
Value: 1280
Counter: XrtExecuteChained_Empty
Value: 1280
Counter: XrtExecute_Empty
Value: 1280
Counter: XrtRead_Empty
Value: 1280
Counter: XrtReleaseAllocationHandle_Empty
Value: 1280
Counter: XrtReleaseCompileHandle_Empty
Value: 1280
Counter: XrtSessionCount
Value: 12
Counter: XrtSubTuple_Empty
Value: 1280
Counter: aten::_local_scalar_dense
Value: 436
Validating the subset "valid"
| WARNING: 72 samples have invalid sizes and will be skipped, max_positions=(64, 64), first few sample ids=[1808, 612, 1824, 2185, 1244, 723, 1226, 388, 1791, 378]
validation/ 19:03:46, device xla:1, step 0
validation/ 19:03:48, device xla:1, step 10
validation/ 19:03:50, device xla:1, step 20
validation stats on subset "valid" - 19:03:51
| epoch 049 | valid on 'valid' subset | loss 10.003 | nll_loss 8.987 | ppl 507.30 | num_updates 1078
old learning rate: 0.0001320736
new learning rate: 0.00013482305000000002
Metric: CompileTime
TotalSamples: 7
Counter: 03m56s897ms46.438us
ValueRate: 737ms590.273us / second
Rate: 0.0293134 / second
Percentiles: 1%=091ms699.712us; 5%=091ms699.712us; 10%=091ms699.712us; 20%=206ms627.326us; 50%=11s663ms676.532us; 80%=01m11s278ms507.643us; 90%=01m12s293ms561.423us; 95%=01m12s293ms561.423us; 99%=01m12s293ms561.423us
Metric: ExecuteTime
TotalSamples: 2205
Counter: 15m19s272ms123.187us
ValueRate: 879ms58.655us / second
Rate: 2.17042 / second
Percentiles: 1%=002ms197.620us; 5%=215ms415.896us; 10%=216ms950.239us; 20%=217ms502.551us; 50%=219ms891.981us; 80%=616ms716.770us; 90%=617ms944.981us; 95%=618ms759.235us; 99%=621ms594.478us
Metric: InboundData
TotalSamples: 440
Counter: 1.58KB
ValueRate: 1.54B / second
Rate: 0.419497 / second
Percentiles: 1%=1.00B; 5%=1.00B; 10%=1.00B; 20%=4.00B; 50%=4.00B; 80%=4.00B; 90%=4.00B; 95%=4.00B; 99%=4.00B
Metric: OutboundData
TotalSamples: 5812
Counter: 1.19GB
ValueRate: 405.78KB / second
Rate: 5.63597 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=8.00B; 80%=8.00B; 90%=388.00KB; 95%=776.00KB; 99%=776.00KB
Metric: ReleaseDataHandlesTime
TotalSamples: 32182
Counter: 02m42s791ms261.922us
ValueRate: 083ms273.468us / second
Rate: 30.1172 / second
Percentiles: 1%=661.023us; 5%=789.460us; 10%=883.625us; 20%=001ms41.906us; 50%=001ms368.401us; 80%=005ms18.136us; 90%=007ms338.894us; 95%=009ms859.693us; 99%=011ms529.809us
Metric: TransferFromServerTime
TotalSamples: 440
Counter: 01s240ms238.672us
ValueRate: 001ms182.446us / second
Rate: 0.419497 / second
Percentiles: 1%=737.370us; 5%=849.291us; 10%=935.226us; 20%=001ms96.757us; 50%=001ms354.718us; 80%=002ms448.918us; 90%=003ms395.031us; 95%=005ms724.579us; 99%=041ms732.304us
Metric: TransferToServerTime
TotalSamples: 5812
Counter: 13m26s418ms195.848us
ValueRate: 768ms918.977us / second
Rate: 5.62998 / second
Percentiles: 1%=001ms294.373us; 5%=001ms456.125us; 10%=002ms578.477us; 20%=002ms732.876us; 50%=006ms466.474us; 80%=215ms802.195us; 90%=515ms29.689us; 95%=518ms643.093us; 99%=522ms158.336us
Counter: CachedSyncTensors
Value: 2198
Counter: CreateCompileHandles
Value: 7
Counter: CreateDataHandles
Value: 813420
Counter: CreateXlaTensor
Value: 6407356
Counter: DestroyDataHandles
Value: 812539
Counter: DestroyXlaTensor
Value: 6406604
Counter: ReleaseDataHandles
Value: 812540
Counter: SyncTensorsToData
Value: 184
Counter: UncachedSyncTensors
Value: 7
Counter: XRTAllocateFromTensor_Empty
Value: 230
Counter: XrtCompile_Empty
Value: 1280
Counter: XrtExecuteChained_Empty
Value: 1280
Counter: XrtExecute_Empty
Value: 1280
Counter: XrtRead_Empty
Value: 1280
Counter: XrtReleaseAllocationHandle_Empty
Value: 1280
Counter: XrtReleaseCompileHandle_Empty
Value: 1280
Counter: XrtSessionCount
Value: 12
Counter: XrtSubTuple_Empty
Value: 1280
Counter: aten::_local_scalar_dense
Value: 440
Epoch 50 begin 19:03:51
training/ 19:03:57, device xla:1, step 10, Rate=167.32, Global Rate=191.69
training/ 19:04:04, device xla:1, step 20, Rate=176.86, Global Rate=185.37
Epoch 50 Training stats:
device xla:1
| epoch 050 | loss 6.429 | nll_loss 5.172 | ppl 36.04 | wps 2938 | ups 1 | wpb 3319.773 | bsz 128.000 | num_updates 1100 | lr 0.000137573 | gnorm 6.117 | clip 0.000 | oom 0.000 | wall 1243 | train_wall 706
Epoch 50 Tracker Rates:
Rate=176.68, Global Rate=184.51
Epoch 50 end 19:04:06
Metric: CompileTime
TotalSamples: 7
Counter: 03m56s897ms46.438us
ValueRate: 737ms590.273us / second
Rate: 0.0293134 / second
Percentiles: 1%=091ms699.712us; 5%=091ms699.712us; 10%=091ms699.712us; 20%=206ms627.326us; 50%=11s663ms676.532us; 80%=01m11s278ms507.643us; 90%=01m12s293ms561.423us; 95%=01m12s293ms561.423us; 99%=01m12s293ms561.423us
Metric: ExecuteTime
TotalSamples: 2227
Counter: 16m33s817ms265.187us
ValueRate: 877ms153.643us / second
Rate: 2.14312 / second
Percentiles: 1%=002ms197.620us; 5%=215ms415.896us; 10%=216ms950.239us; 20%=217ms529.887us; 50%=219ms345.886us; 80%=616ms753.025us; 90%=617ms903.403us; 95%=618ms759.235us; 99%=620ms468.095us
Metric: InboundData
TotalSamples: 445
Counter: 1.60KB
ValueRate: 1.54B / second
Rate: 0.417918 / second
Percentiles: 1%=1.00B; 5%=1.00B; 10%=1.00B; 20%=4.00B; 50%=4.00B; 80%=4.00B; 90%=4.00B; 95%=4.00B; 99%=4.00B
Metric: OutboundData
TotalSamples: 5903
Counter: 1.20GB
ValueRate: 391.87KB / second
Rate: 5.65096 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=8.00B; 80%=8.00B; 90%=8.00B; 95%=776.00KB; 99%=776.00KB
Metric: ReleaseDataHandlesTime
TotalSamples: 32777
Counter: 02m44s502ms942.370us
ValueRate: 095ms629.870us / second
Rate: 32.966 / second
Percentiles: 1%=688.850us; 5%=784.044us; 10%=870.453us; 20%=001ms24.992us; 50%=001ms453.325us; 80%=005ms105.105us; 90%=008ms506.282us; 95%=009ms684.548us; 99%=010ms396.635us
Metric: TransferFromServerTime
TotalSamples: 445
Counter: 01s249ms963.940us
ValueRate: 001ms172.954us / second
Rate: 0.417918 / second
Percentiles: 1%=737.370us; 5%=849.291us; 10%=935.226us; 20%=001ms96.757us; 50%=001ms351.346us; 80%=002ms448.918us; 90%=003ms395.031us; 95%=004ms473.119us; 99%=041ms732.304us
Metric: TransferToServerTime
TotalSamples: 5903
Counter: 14m37s337ms717.020us
ValueRate: 760ms278.621us / second
Rate: 5.65117 / second
Percentiles: 1%=001ms294.373us; 5%=001ms453.186us; 10%=002ms561.432us; 20%=002ms720.647us; 50%=006ms394.956us; 80%=215ms990.442us; 90%=515ms266.219us; 95%=518ms780.877us; 99%=522ms51.728us
Counter: CachedSyncTensors
Value: 2220
Counter: CreateCompileHandles
Value: 7
Counter: CreateDataHandles
Value: 829741
Counter: CreateXlaTensor
Value: 6513506
Counter: DestroyDataHandles
Value: 828861
Counter: DestroyXlaTensor
Value: 6512754
Counter: ReleaseDataHandles
Value: 828861
Counter: SyncTensorsToData
Value: 184
Counter: UncachedSyncTensors
Value: 7
Counter: XRTAllocateFromTensor_Empty
Value: 230
Counter: XrtCompile_Empty
Value: 1280
Counter: XrtExecuteChained_Empty
Value: 1280
Counter: XrtExecute_Empty
Value: 1280
Counter: XrtRead_Empty
Value: 1280
Counter: XrtReleaseAllocationHandle_Empty
Value: 1280
Counter: XrtReleaseCompileHandle_Empty
Value: 1280
Counter: XrtSessionCount
Value: 12
Counter: XrtSubTuple_Empty
Value: 1280
Counter: aten::_local_scalar_dense
Value: 445
Validating the subset "valid"
| WARNING: 72 samples have invalid sizes and will be skipped, max_positions=(64, 64), first few sample ids=[1808, 612, 1824, 2185, 1244, 723, 1226, 388, 1791, 378]
validation/ 19:04:07, device xla:1, step 0
validation/ 19:04:09, device xla:1, step 10
validation/ 19:04:11, device xla:1, step 20
validation stats on subset "valid" - 19:04:12
| epoch 050 | valid on 'valid' subset | loss 9.521 | nll_loss 8.592 | ppl 385.88 | num_updates 1100
Args
no_progress_bar True
log_interval 100
log_format None
tensorboard_logdir
tbmf_wrapper False
seed 1
cpu False
fp16 False
memory_efficient_fp16 False
fp16_init_scale 128
fp16_scale_window None
fp16_scale_tolerance 0.0
min_loss_scale 0.0001
threshold_loss_scale None
user_dir None
criterion label_smoothed_cross_entropy
optimizer adam
lr_scheduler inverse_sqrt
task translation
num_workers 0
skip_invalid_size_inputs_valid_test True
max_tokens None
max_sentences 128
required_batch_size_multiple 128
dataset_impl cached
train_subset train
valid_subset valid
validate_interval 1
disable_validation False
max_tokens_valid None
max_sentences_valid 128
curriculum 4
distributed_world_size 1
distributed_rank 0
distributed_backend nccl
distributed_init_method None
distributed_port -1
device_id 0
distributed_no_spawn False
ddp_backend c10d
bucket_cap_mb 25
fix_batches_to_gpus False
find_unused_parameters False
arch transformer_vaswani_wmt_en_de_big
max_epoch 50
max_update 0
clip_norm 25
sentence_avg False
update_freq [1]
lr [0.0005]
min_lr 1e-09
use_bmuf False
save_dir checkpoints
restore_file checkpoint_last.pt
reset_dataloader False
reset_lr_scheduler False
reset_meters False
reset_optimizer False
optimizer_overrides {}
save_interval 1
save_interval_updates 0
keep_interval_updates -1
keep_last_epochs -1
no_save True
no_epoch_checkpoints False
no_last_checkpoints False
no_save_optimizer_state False
best_checkpoint_metric loss
maximize_best_checkpoint_metric False
num_cores 1
pad_to_length 64
log_steps 10
use_gpu False
metrics_debug True
no_token_positional_embeddings False
label_smoothing 0.1
adam_betas (0.9, 0.98)
adam_eps 1e-08
weight_decay 0.0
warmup_updates 4000
warmup_init_lr 1e-07
data /home/taylanbil/data/dummy
source_lang en
target_lang de
lazy_load False
raw_text False
left_pad_source True
left_pad_target False
max_source_positions 64
max_target_positions 64
upsample_primary 1
attention_dropout 0.1
share_all_embeddings True
dropout 0.3
encoder_embed_dim 1024
encoder_ffn_embed_dim 4096
encoder_attention_heads 16
encoder_normalize_before False
decoder_embed_dim 1024
decoder_ffn_embed_dim 4096
decoder_attention_heads 16
encoder_embed_path None
encoder_layers 6
encoder_learned_pos False
decoder_embed_path None
decoder_layers 6
decoder_normalize_before False
decoder_learned_pos False
activation_dropout 0.0
activation_fn relu
adaptive_softmax_cutoff None
adaptive_softmax_dropout 0
share_decoder_input_output_embed False
adaptive_input False
decoder_output_dim 1024
decoder_input_dim 1024
---------
old learning rate: 0.00013482305000000002
new learning rate: 0.0001375725
Metric: CompileTime
TotalSamples: 7
Counter: 03m56s897ms46.438us
ValueRate: 737ms590.273us / second
Rate: 0.0293134 / second
Percentiles: 1%=091ms699.712us; 5%=091ms699.712us; 10%=091ms699.712us; 20%=206ms627.326us; 50%=11s663ms676.532us; 80%=01m11s278ms507.643us; 90%=01m12s293ms561.423us; 95%=01m12s293ms561.423us; 99%=01m12s293ms561.423us
Metric: ExecuteTime
TotalSamples: 2250
Counter: 16m38s581ms425.726us
ValueRate: 879ms676.934us / second
Rate: 2.16965 / second
Percentiles: 1%=002ms213.056us; 5%=215ms430.574us; 10%=216ms950.239us; 20%=217ms502.551us; 50%=219ms891.981us; 80%=616ms680.345us; 90%=617ms796.496us; 95%=618ms693.366us; 99%=620ms832.368us
Metric: InboundData
TotalSamples: 449
Counter: 1.61KB
ValueRate: 1.54B / second
Rate: 0.419687 / second
Percentiles: 1%=1.00B; 5%=1.00B; 10%=1.00B; 20%=4.00B; 50%=4.00B; 80%=4.00B; 90%=4.00B; 95%=4.00B; 99%=4.00B
Metric: OutboundData
TotalSamples: 5932
Counter: 1.20GB
ValueRate: 407.46KB / second
Rate: 5.65935 / second
Percentiles: 1%=4.00B; 5%=4.00B; 10%=4.00B; 20%=4.00B; 50%=8.00B; 80%=8.00B; 90%=388.00KB; 95%=776.00KB; 99%=776.00KB
Metric: ReleaseDataHandlesTime
TotalSamples: 32873
Counter: 02m44s617ms849.663us
ValueRate: 083ms566.978us / second
Rate: 30.8116 / second
Percentiles: 1%=689.665us; 5%=787.104us; 10%=877.768us; 20%=001ms32.331us; 50%=001ms398.897us; 80%=005ms592.619us; 90%=007ms130.802us; 95%=008ms496.955us; 99%=010ms365.974us
Metric: TransferFromServerTime
TotalSamples: 449
Counter: 01s256ms416.163us
ValueRate: 001ms174.391us / second
Rate: 0.419687 / second
Percentiles: 1%=737.370us; 5%=849.291us; 10%=935.226us; 20%=001ms96.757us; 50%=001ms353.268us; 80%=002ms448.918us; 90%=003ms395.031us; 95%=004ms473.119us; 99%=041ms732.304us
Metric: TransferToServerTime
TotalSamples: 5932
Counter: 14m42s234ms381.940us
ValueRate: 765ms17.654us / second
Rate: 5.65333 / second
Percentiles: 1%=001ms314.144us; 5%=001ms456.179us; 10%=002ms578.477us; 20%=002ms728.941us; 50%=006ms454.024us; 80%=215ms729.433us; 90%=515ms65.006us; 95%=518ms584.398us; 99%=522ms949.854us
Counter: CachedSyncTensors
Value: 2243
Counter: CreateCompileHandles
Value: 7
Counter: CreateDataHandles
Value: 830018
Counter: CreateXlaTensor
Value: 6538107
Counter: DestroyDataHandles
Value: 829137
Counter: DestroyXlaTensor
Value: 6537356
Counter: ReleaseDataHandles
Value: 829139
Counter: SyncTensorsToData
Value: 184
Counter: UncachedSyncTensors
Value: 7
Counter: XRTAllocateFromTensor_Empty
Value: 230
Counter: XrtCompile_Empty
Value: 1280
Counter: XrtExecuteChained_Empty
Value: 1280
Counter: XrtExecute_Empty
Value: 1280
Counter: XrtRead_Empty
Value: 1280
Counter: XrtReleaseAllocationHandle_Empty
Value: 1280
Counter: XrtReleaseCompileHandle_Empty
Value: 1280
Counter: XrtSessionCount
Value: 12
Counter: XrtSubTuple_Empty
Value: 1280
Counter: aten::_local_scalar_dense
Value: 449
| done training in 1247.7 seconds
Fri Aug 16 19:04:12 UTC 2019
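
Note on the "old learning rate" / "new learning rate" lines: the run stops at 1100 updates, well short of the 4000 warmup updates, so every learning rate in the log falls in the warmup phase of the inverse_sqrt scheduler. The sketch below assumes that warmup is a linear ramp from warmup_init_lr to lr over warmup_updates steps. Under that assumption it reproduces the logged values at num_updates 1034, 1056, 1078 and 1100. The function name `warmup_lr` is illustrative and is not taken from the training script.

```python
def warmup_lr(num_updates, warmup_init_lr=1e-07, lr=0.0005, warmup_updates=4000):
    """Linearly interpolate from warmup_init_lr to lr (assumed warmup behaviour).

    Defaults are the values shown in the Args dump of this run.
    """
    lr_step = (lr - warmup_init_lr) / warmup_updates
    return warmup_init_lr + num_updates * lr_step


# num_updates values reported at the end of epochs 47-50 in the log.
for n in (1034, 1056, 1078, 1100):
    print(n, warmup_lr(n))
```

At this rate the learning rate would only reach the configured 0.0005 at update 4000, which is beyond this run's 50 epochs of 22 updates each.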