
@vanbasten23
Last active July 8, 2025 22:12
WARNING:root:libtpu.so and TPU device found. Setting PJRT_DEVICE=TPU.
INFO 07-08 18:16:39 [__init__.py:253] Automatically detected platform tpu.
INFO 07-08 18:16:39 [tpu.py:187] tpu_commons not found, using vLLM's TpuPlatform
============================= test session starts ==============================
platform linux -- Python 3.10.18, pytest-8.4.1, pluggy-1.6.0 -- /home/xiowei/miniconda3/envs/vllm/bin/python3.10
cachedir: .pytest_cache
rootdir: /home/xiowei/vllm
configfile: pyproject.toml
plugins: anyio-4.9.0
collecting ... collected 1 item
vllm/tests/tpu/test_quantization_accuracy.py::test_gsm8k_correctness[config0] WARNING:lm_eval.evaluator:Model appears to be an instruct variant but chat template is not applied. Recommend setting `apply_chat_template` (optionally `fewshot_as_multiturn`).
INFO 07-08 18:16:46 [config.py:852] This model supports multiple tasks: {'generate', 'reward', 'embed', 'classify'}. Defaulting to 'generate'.
INFO 07-08 18:16:46 [config.py:1489] Using max model len 4096
INFO 07-08 18:16:46 [importing.py:43] Triton is installed but 0 active driver(s) found (expected 1). Disabling Triton to prevent runtime errors.
INFO 07-08 18:16:46 [importing.py:63] Triton not installed or not compatible; certain GPU-related functions will not be available.
INFO 07-08 18:16:46 [config.py:2302] Chunked prefill is enabled with max_num_batched_tokens=8192.
INFO 07-08 18:16:46 [arg_utils.py:1036] Using Tensorizer args from --model-loader-extra-config. Note that you can now simply pass the S3 directory in the model tag instead of providing the JSON string.
INFO 07-08 18:16:46 [tpu.py:103] [TPU] Forcing DYNAMO_ONCE compilation level
INFO 07-08 18:16:47 [core.py:526] Waiting for init message from front-end.
INFO 07-08 18:16:47 [core.py:69] Initializing a V1 LLM engine (v0.8.5.dev1734+g7bcf2c6dc) with config: model='neuralmagic/Meta-Llama-3.1-8B-Instruct-quantized.w8a8', speculative_config=None, tokenizer='neuralmagic/Meta-Llama-3.1-8B-Instruct-quantized.w8a8', skip_tokenizer_init=False, tokenizer_mode=auto, revision=None, override_neuron_config={}, tokenizer_revision=None, trust_remote_code=False, dtype=torch.bfloat16, max_seq_len=4096, download_dir=None, load_format=auto, tensor_parallel_size=1, pipeline_parallel_size=1, disable_custom_all_reduce=True, quantization=compressed-tensors, enforce_eager=False, kv_cache_dtype=auto, device_config=None, decoding_config=DecodingConfig(backend='auto', disable_fallback=False, disable_any_whitespace=False, disable_additional_properties=False, reasoning_backend=''), observability_config=ObservabilityConfig(show_hidden_metrics_for_version=None, otlp_traces_endpoint=None, collect_detailed_traces=None), seed=1234, served_model_name=neuralmagic/Meta-Llama-3.1-8B-Instruct-quantized.w8a8, num_scheduler_steps=1, multi_step_stream_outputs=True, enable_prefix_caching=True, chunked_prefill_enabled=True, use_async_output_proc=False, pooler_config=None, compilation_config={"level":2,"debug_dump_path":"","cache_dir":"","backend":"openxla","custom_ops":[],"splitting_ops":["vllm.unified_attention","vllm.unified_attention_with_output"],"use_inductor":true,"compile_sizes":[],"inductor_compile_config":{"enable_auto_functionalized_v2":false},"inductor_passes":{},"use_cudagraph":true,"cudagraph_num_of_warmups":1,"cudagraph_capture_sizes":[512,504,496,488,480,472,464,456,448,440,432,424,416,408,400,392,384,376,368,360,352,344,336,328,320,312,304,296,288,280,272,264,256,248,240,232,224,216,208,200,192,184,176,168,160,152,144,136,128,120,112,104,96,88,80,72,64,56,48,40,32,24,16,8,4,2,1],"cudagraph_copy_inputs":false,"full_cuda_graph":false,"max_capture_size":512,"local_cache_dir":null}
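The `cudagraph_capture_sizes` list in the config line above is every multiple of 8 from `max_capture_size` (512) down to 8, followed by 4, 2, and 1. A minimal sketch of how such a ladder could be built (hypothetical; the actual vLLM logic may differ):

```python
# Hypothetical reconstruction of the capture-size ladder seen in the log:
# multiples of 8 from max_capture_size down to 8, then 4, 2, 1.
max_capture_size = 512
capture_sizes = list(range(max_capture_size, 0, -8)) + [4, 2, 1]
print(len(capture_sizes), capture_sizes[:3], capture_sizes[-4:])
```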
INFO 07-08 18:16:47 [tpu_worker.py:318] tpu_commons not found, using vLLM's TPUWorker.
[Gloo] Rank 0 is connected to 0 peer ranks. Expected number of connected peer ranks is : 0
(last message repeated 10 more times)
INFO 07-08 18:16:47 [parallel_state.py:1076] rank 0 in world size 1 is assigned as DP rank 0, PP rank 0, TP rank 0, EP rank 0
WARNING 07-08 18:17:06 [tpu.py:154] Pin memory is not supported on TPU.
INFO 07-08 18:17:06 [tpu_model_runner.py:1780] Using exponential token paddings:
INFO 07-08 18:17:06 [tpu_model_runner.py:1782] 16
INFO 07-08 18:17:06 [tpu_model_runner.py:1782] 32
INFO 07-08 18:17:06 [tpu_model_runner.py:1782] 64
INFO 07-08 18:17:06 [tpu_model_runner.py:1782] 128
INFO 07-08 18:17:06 [tpu_model_runner.py:1782] 256
INFO 07-08 18:17:06 [tpu_model_runner.py:1782] 512
INFO 07-08 18:17:06 [tpu_model_runner.py:1782] 1024
INFO 07-08 18:17:06 [tpu_model_runner.py:1782] 2048
INFO 07-08 18:17:06 [tpu_model_runner.py:1782] 4096
INFO 07-08 18:17:06 [tpu_model_runner.py:1782] 8192
INFO 07-08 18:17:06 [tpu_model_runner.py:1746] Preparing request paddings:
INFO 07-08 18:17:06 [tpu_model_runner.py:1753] 8
INFO 07-08 18:17:06 [tpu_model_runner.py:1753] 16
INFO 07-08 18:17:06 [tpu_model_runner.py:1753] 32
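The "exponential token paddings" logged above double from 16 up to 8192. A minimal sketch of how such a padding ladder could be generated (an assumption for illustration; the actual vLLM helper may differ):

```python
def exponential_paddings(min_pad: int, max_pad: int) -> list[int]:
    """Doubling padding ladder from min_pad up to max_pad inclusive."""
    pads, n = [], min_pad
    while n <= max_pad:
        pads.append(n)
        n *= 2
    return pads

# Matches the token paddings 16..8192 printed in the log above.
print(exponential_paddings(16, 8192))
```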
INFO 07-08 18:17:06 [tpu_model_runner.py:1142] Loading model from scratch...
INFO 07-08 18:17:06 [compressed_tensors_w8a8_int8.py:52] Using XLAScaledMMLinearKernel for CompressedTensorsW8A8Int8
INFO 07-08 18:17:07 [tpu.py:51] Cannot use None backend on TPU.
INFO 07-08 18:17:07 [tpu.py:55] Using Pallas V1 backend.
INFO 07-08 18:17:07 [weight_utils.py:292] Using model weights format ['*.safetensors']
Loading safetensors checkpoint shards: 0% Completed | 0/2 [00:00<?, ?it/s]
2025-07-08 18:17:07.693570: W torch_xla/csrc/runtime/pjrt_computation_client.cpp:682] Failed to deserialize executable: UNIMPLEMENTED: Deserializing serialized executable not supported.
(last message repeated 10 more times)
Loading safetensors checkpoint shards: 50% Completed | 1/2 [00:03<00:03, 3.21s/it]
2025-07-08 18:17:11.031288: W torch_xla/csrc/runtime/pjrt_computation_client.cpp:682] Failed to deserialize executable: UNIMPLEMENTED: Deserializing serialized executable not supported.
Loading safetensors checkpoint shards: 100% Completed | 2/2 [00:03<00:00, 1.78s/it]
Loading safetensors checkpoint shards: 100% Completed | 2/2 [00:03<00:00, 1.99s/it]
INFO 07-08 18:17:11 [default_loader.py:272] Loading weights took 4.10 seconds
2025-07-08 18:17:11.742579: W torch_xla/csrc/runtime/pjrt_computation_client.cpp:682] Failed to deserialize executable: UNIMPLEMENTED: Deserializing serialized executable not supported.
(last message repeated 2 more times)
xw32 x.dtype=torch.bfloat16, out.dtype=torch.bfloat16
(last message repeated 127 more times)
INFO 07-08 18:17:12 [kv_cache_utils.py:716] GPU KV cache size: 159,232 tokens
INFO 07-08 18:17:12 [kv_cache_utils.py:720] Maximum concurrency for 4,096 tokens per request: 38.88x
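The concurrency figure follows directly from the KV cache size: 159,232 cached tokens divided by the 4,096-token max model length gives 38.875, which the log reports rounded as 38.88x. A quick check:

```python
kv_cache_tokens = 159_232   # "GPU KV cache size" from the log
max_model_len = 4_096       # "Using max model len 4096"
concurrency = kv_cache_tokens / max_model_len
print(f"{concurrency:.2f}x")  # 38.88x, matching the log
```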
INFO 07-08 18:17:13 [tpu_model_runner.py:1304] Compiling the model with different input shapes.
INFO 07-08 18:17:13 [tpu_model_runner.py:1307] -- num_tokens: 16
xw32 x.dtype=torch.bfloat16, out.dtype=torch.bfloat16
(last message repeated 127 more times)
INFO 07-08 18:17:14 [tpu_model_runner.py:1307] -- num_tokens: 32
xw32 x.dtype=torch.bfloat16, out.dtype=torch.bfloat16
(last message repeated 127 more times)
INFO 07-08 18:17:14 [tpu_model_runner.py:1307] -- num_tokens: 64
xw32 x.dtype=torch.bfloat16, out.dtype=torch.bfloat16
[last message repeated 127 more times]
INFO 07-08 18:17:15 [tpu_model_runner.py:1307] -- num_tokens: 128
xw32 x.dtype=torch.bfloat16, out.dtype=torch.bfloat16
[last message repeated 127 more times]
INFO 07-08 18:17:15 [tpu_model_runner.py:1307] -- num_tokens: 256
xw32 x.dtype=torch.bfloat16, out.dtype=torch.bfloat16
[last message repeated 127 more times]
INFO 07-08 18:17:16 [tpu_model_runner.py:1307] -- num_tokens: 512
xw32 x.dtype=torch.bfloat16, out.dtype=torch.bfloat16
[last message repeated 127 more times]
INFO 07-08 18:17:16 [tpu_model_runner.py:1307] -- num_tokens: 1024
xw32 x.dtype=torch.bfloat16, out.dtype=torch.bfloat16
[last message repeated 127 more times]
INFO 07-08 18:17:17 [tpu_model_runner.py:1307] -- num_tokens: 2048
xw32 x.dtype=torch.bfloat16, out.dtype=torch.bfloat16
(last message repeated 127 times)
INFO 07-08 18:17:17 [tpu_model_runner.py:1307] -- num_tokens: 4096
xw32 x.dtype=torch.bfloat16, out.dtype=torch.bfloat16
(last message repeated 127 times)
INFO 07-08 18:17:18 [tpu_model_runner.py:1307] -- num_tokens: 8192
xw32 x.dtype=torch.bfloat16, out.dtype=torch.bfloat16
(last message repeated 127 times)
INFO 07-08 18:17:19 [tpu_model_runner.py:1315] Compilation finished in 5.57 [secs].
INFO 07-08 18:17:19 [tpu_model_runner.py:1321] Compiling select_hidden_states with different input shapes.
2025-07-08 18:17:21.263973: W torch_xla/csrc/runtime/pjrt_computation_client.cpp:682] Failed to deserialize executable: UNIMPLEMENTED: Deserializing serialized executable not supported.
2025-07-08 18:18:18.555850: W torch_xla/csrc/runtime/pjrt_computation_client.cpp:682] Failed to deserialize executable: UNIMPLEMENTED: Deserializing serialized executable not supported.
INFO 07-08 18:18:18 [tpu_model_runner.py:1336] -- num_tokens: 16, num_seqs: 8
2025-07-08 18:18:18.602872: W torch_xla/csrc/runtime/pjrt_computation_client.cpp:682] Failed to deserialize executable: UNIMPLEMENTED: Deserializing serialized executable not supported.
2025-07-08 18:18:18.616456: W torch_xla/csrc/runtime/pjrt_computation_client.cpp:682] Failed to deserialize executable: UNIMPLEMENTED: Deserializing serialized executable not supported.
INFO 07-08 18:18:18 [tpu_model_runner.py:1336] -- num_tokens: 16, num_seqs: 16
2025-07-08 18:18:18.676210: W torch_xla/csrc/runtime/pjrt_computation_client.cpp:682] Failed to deserialize executable: UNIMPLEMENTED: Deserializing serialized executable not supported.
2025-07-08 18:18:18.692217: W torch_xla/csrc/runtime/pjrt_computation_client.cpp:682] Failed to deserialize executable: UNIMPLEMENTED: Deserializing serialized executable not supported.
INFO 07-08 18:18:18 [tpu_model_runner.py:1336] -- num_tokens: 32, num_seqs: 8
2025-07-08 18:18:18.736906: W torch_xla/csrc/runtime/pjrt_computation_client.cpp:682] Failed to deserialize executable: UNIMPLEMENTED: Deserializing serialized executable not supported.
INFO 07-08 18:18:18 [tpu_model_runner.py:1336] -- num_tokens: 32, num_seqs: 16
2025-07-08 18:18:18.800374: W torch_xla/csrc/runtime/pjrt_computation_client.cpp:682] Failed to deserialize executable: UNIMPLEMENTED: Deserializing serialized executable not supported.
2025-07-08 18:18:18.814655: W torch_xla/csrc/runtime/pjrt_computation_client.cpp:682] Failed to deserialize executable: UNIMPLEMENTED: Deserializing serialized executable not supported.
INFO 07-08 18:18:18 [tpu_model_runner.py:1336] -- num_tokens: 32, num_seqs: 32
2025-07-08 18:18:18.918232: W torch_xla/csrc/runtime/pjrt_computation_client.cpp:682] Failed to deserialize executable: UNIMPLEMENTED: Deserializing serialized executable not supported.
2025-07-08 18:18:18.936958: W torch_xla/csrc/runtime/pjrt_computation_client.cpp:682] Failed to deserialize executable: UNIMPLEMENTED: Deserializing serialized executable not supported.
INFO 07-08 18:18:18 [tpu_model_runner.py:1336] -- num_tokens: 64, num_seqs: 8
2025-07-08 18:18:18.989486: W torch_xla/csrc/runtime/pjrt_computation_client.cpp:682] Failed to deserialize executable: UNIMPLEMENTED: Deserializing serialized executable not supported.
INFO 07-08 18:18:19 [tpu_model_runner.py:1336] -- num_tokens: 64, num_seqs: 16
2025-07-08 18:18:19.062379: W torch_xla/csrc/runtime/pjrt_computation_client.cpp:682] Failed to deserialize executable: UNIMPLEMENTED: Deserializing serialized executable not supported.
INFO 07-08 18:18:19 [tpu_model_runner.py:1336] -- num_tokens: 64, num_seqs: 32
2025-07-08 18:18:19.175990: W torch_xla/csrc/runtime/pjrt_computation_client.cpp:682] Failed to deserialize executable: UNIMPLEMENTED: Deserializing serialized executable not supported.
2025-07-08 18:18:19.197618: W torch_xla/csrc/runtime/pjrt_computation_client.cpp:682] Failed to deserialize executable: UNIMPLEMENTED: Deserializing serialized executable not supported.
INFO 07-08 18:18:19 [tpu_model_runner.py:1336] -- num_tokens: 128, num_seqs: 8
2025-07-08 18:18:19.266100: W torch_xla/csrc/runtime/pjrt_computation_client.cpp:682] Failed to deserialize executable: UNIMPLEMENTED: Deserializing serialized executable not supported.
INFO 07-08 18:18:19 [tpu_model_runner.py:1336] -- num_tokens: 128, num_seqs: 16
2025-07-08 18:18:19.357400: W torch_xla/csrc/runtime/pjrt_computation_client.cpp:682] Failed to deserialize executable: UNIMPLEMENTED: Deserializing serialized executable not supported.
INFO 07-08 18:18:19 [tpu_model_runner.py:1336] -- num_tokens: 128, num_seqs: 32
2025-07-08 18:18:19.484463: W torch_xla/csrc/runtime/pjrt_computation_client.cpp:682] Failed to deserialize executable: UNIMPLEMENTED: Deserializing serialized executable not supported.
2025-07-08 18:18:19.512703: W torch_xla/csrc/runtime/pjrt_computation_client.cpp:682] Failed to deserialize executable: UNIMPLEMENTED: Deserializing serialized executable not supported.
INFO 07-08 18:18:19 [tpu_model_runner.py:1336] -- num_tokens: 256, num_seqs: 8
2025-07-08 18:18:19.614343: W torch_xla/csrc/runtime/pjrt_computation_client.cpp:682] Failed to deserialize executable: UNIMPLEMENTED: Deserializing serialized executable not supported.
INFO 07-08 18:18:19 [tpu_model_runner.py:1336] -- num_tokens: 256, num_seqs: 16
2025-07-08 18:18:19.737552: W torch_xla/csrc/runtime/pjrt_computation_client.cpp:682] Failed to deserialize executable: UNIMPLEMENTED: Deserializing serialized executable not supported.
INFO 07-08 18:18:19 [tpu_model_runner.py:1336] -- num_tokens: 256, num_seqs: 32
2025-07-08 18:18:19.898561: W torch_xla/csrc/runtime/pjrt_computation_client.cpp:682] Failed to deserialize executable: UNIMPLEMENTED: Deserializing serialized executable not supported.
2025-07-08 18:18:19.959538: W torch_xla/csrc/runtime/pjrt_computation_client.cpp:682] Failed to deserialize executable: UNIMPLEMENTED: Deserializing serialized executable not supported.
INFO 07-08 18:18:20 [tpu_model_runner.py:1336] -- num_tokens: 512, num_seqs: 8
2025-07-08 18:18:20.126451: W torch_xla/csrc/runtime/pjrt_computation_client.cpp:682] Failed to deserialize executable: UNIMPLEMENTED: Deserializing serialized executable not supported.
INFO 07-08 18:18:20 [tpu_model_runner.py:1336] -- num_tokens: 512, num_seqs: 16
2025-07-08 18:18:20.315569: W torch_xla/csrc/runtime/pjrt_computation_client.cpp:682] Failed to deserialize executable: UNIMPLEMENTED: Deserializing serialized executable not supported.
INFO 07-08 18:18:20 [tpu_model_runner.py:1336] -- num_tokens: 512, num_seqs: 32
2025-07-08 18:18:20.546707: W torch_xla/csrc/runtime/pjrt_computation_client.cpp:682] Failed to deserialize executable: UNIMPLEMENTED: Deserializing serialized executable not supported.
2025-07-08 18:18:20.564852: W torch_xla/csrc/runtime/pjrt_computation_client.cpp:682] Failed to deserialize executable: UNIMPLEMENTED: Deserializing serialized executable not supported.
INFO 07-08 18:18:20 [tpu_model_runner.py:1336] -- num_tokens: 1024, num_seqs: 8
2025-07-08 18:18:20.868885: W torch_xla/csrc/runtime/pjrt_computation_client.cpp:682] Failed to deserialize executable: UNIMPLEMENTED: Deserializing serialized executable not supported.
INFO 07-08 18:18:21 [tpu_model_runner.py:1336] -- num_tokens: 1024, num_seqs: 16
2025-07-08 18:18:21.197656: W torch_xla/csrc/runtime/pjrt_computation_client.cpp:682] Failed to deserialize executable: UNIMPLEMENTED: Deserializing serialized executable not supported.
INFO 07-08 18:18:21 [tpu_model_runner.py:1336] -- num_tokens: 1024, num_seqs: 32
2025-07-08 18:18:21.570854: W torch_xla/csrc/runtime/pjrt_computation_client.cpp:682] Failed to deserialize executable: UNIMPLEMENTED: Deserializing serialized executable not supported.
2025-07-08 18:18:21.589296: W torch_xla/csrc/runtime/pjrt_computation_client.cpp:682] Failed to deserialize executable: UNIMPLEMENTED: Deserializing serialized executable not supported.
INFO 07-08 18:18:21 [tpu_model_runner.py:1336] -- num_tokens: 2048, num_seqs: 8
2025-07-08 18:18:21.614033: W torch_xla/csrc/runtime/pjrt_computation_client.cpp:682] Failed to deserialize executable: UNIMPLEMENTED: Deserializing serialized executable not supported.
INFO 07-08 18:18:21 [tpu_model_runner.py:1336] -- num_tokens: 2048, num_seqs: 16
2025-07-08 18:18:21.645113: W torch_xla/csrc/runtime/pjrt_computation_client.cpp:682] Failed to deserialize executable: UNIMPLEMENTED: Deserializing serialized executable not supported.
INFO 07-08 18:18:21 [tpu_model_runner.py:1336] -- num_tokens: 2048, num_seqs: 32
2025-07-08 18:18:21.688410: W torch_xla/csrc/runtime/pjrt_computation_client.cpp:682] Failed to deserialize executable: UNIMPLEMENTED: Deserializing serialized executable not supported.
2025-07-08 18:18:21.706854: W torch_xla/csrc/runtime/pjrt_computation_client.cpp:682] Failed to deserialize executable: UNIMPLEMENTED: Deserializing serialized executable not supported.
INFO 07-08 18:18:21 [tpu_model_runner.py:1336] -- num_tokens: 4096, num_seqs: 8
2025-07-08 18:18:21.732181: W torch_xla/csrc/runtime/pjrt_computation_client.cpp:682] Failed to deserialize executable: UNIMPLEMENTED: Deserializing serialized executable not supported.
INFO 07-08 18:18:21 [tpu_model_runner.py:1336] -- num_tokens: 4096, num_seqs: 16
2025-07-08 18:18:21.763613: W torch_xla/csrc/runtime/pjrt_computation_client.cpp:682] Failed to deserialize executable: UNIMPLEMENTED: Deserializing serialized executable not supported.
INFO 07-08 18:18:21 [tpu_model_runner.py:1336] -- num_tokens: 4096, num_seqs: 32
2025-07-08 18:18:21.806565: W torch_xla/csrc/runtime/pjrt_computation_client.cpp:682] Failed to deserialize executable: UNIMPLEMENTED: Deserializing serialized executable not supported.
2025-07-08 18:18:21.825273: W torch_xla/csrc/runtime/pjrt_computation_client.cpp:682] Failed to deserialize executable: UNIMPLEMENTED: Deserializing serialized executable not supported.
INFO 07-08 18:18:21 [tpu_model_runner.py:1336] -- num_tokens: 8192, num_seqs: 8
2025-07-08 18:18:21.849867: W torch_xla/csrc/runtime/pjrt_computation_client.cpp:682] Failed to deserialize executable: UNIMPLEMENTED: Deserializing serialized executable not supported.
INFO 07-08 18:18:21 [tpu_model_runner.py:1336] -- num_tokens: 8192, num_seqs: 16
2025-07-08 18:18:21.881156: W torch_xla/csrc/runtime/pjrt_computation_client.cpp:682] Failed to deserialize executable: UNIMPLEMENTED: Deserializing serialized executable not supported.
INFO 07-08 18:18:21 [tpu_model_runner.py:1336] -- num_tokens: 8192, num_seqs: 32
INFO 07-08 18:18:21 [tpu_model_runner.py:1344] Compilation finished in 62.85 [secs].
INFO 07-08 18:18:21 [tpu_model_runner.py:1348] Compiling compute_logits with different input shapes.
2025-07-08 18:18:22.024261: W torch_xla/csrc/runtime/pjrt_computation_client.cpp:682] Failed to deserialize executable: UNIMPLEMENTED: Deserializing serialized executable not supported.
2025-07-08 18:18:22.041726: W torch_xla/csrc/runtime/pjrt_computation_client.cpp:682] Failed to deserialize executable: UNIMPLEMENTED: Deserializing serialized executable not supported.
INFO 07-08 18:18:22 [tpu_model_runner.py:1357] -- num_seqs: 8
2025-07-08 18:18:22.246278: W torch_xla/csrc/runtime/pjrt_computation_client.cpp:682] Failed to deserialize executable: UNIMPLEMENTED: Deserializing serialized executable not supported.
2025-07-08 18:18:22.261090: W torch_xla/csrc/runtime/pjrt_computation_client.cpp:682] Failed to deserialize executable: UNIMPLEMENTED: Deserializing serialized executable not supported.
INFO 07-08 18:18:22 [tpu_model_runner.py:1357] -- num_seqs: 16
2025-07-08 18:18:22.500162: W torch_xla/csrc/runtime/pjrt_computation_client.cpp:682] Failed to deserialize executable: UNIMPLEMENTED: Deserializing serialized executable not supported.
2025-07-08 18:18:22.515591: W torch_xla/csrc/runtime/pjrt_computation_client.cpp:682] Failed to deserialize executable: UNIMPLEMENTED: Deserializing serialized executable not supported.
INFO 07-08 18:18:22 [tpu_model_runner.py:1357] -- num_seqs: 32
INFO 07-08 18:18:22 [tpu_model_runner.py:1360] Compilation finished in 0.86 [secs].
INFO 07-08 18:18:22 [tpu_model_runner.py:1364] Compiling structured_decoding with different input shapes.
2025-07-08 18:18:22.993265: W torch_xla/csrc/runtime/pjrt_computation_client.cpp:682] Failed to deserialize executable: UNIMPLEMENTED: Deserializing serialized executable not supported.
2025-07-08 18:18:23.038304: W torch_xla/csrc/runtime/pjrt_computation_client.cpp:682] Failed to deserialize executable: UNIMPLEMENTED: Deserializing serialized executable not supported.
INFO 07-08 18:18:23 [tpu_model_runner.py:1381] -- num_seqs: 8
2025-07-08 18:18:23.916178: W torch_xla/csrc/runtime/pjrt_computation_client.cpp:682] Failed to deserialize executable: UNIMPLEMENTED: Deserializing serialized executable not supported.
2025-07-08 18:18:24.022642: W torch_xla/csrc/runtime/pjrt_computation_client.cpp:682] Failed to deserialize executable: UNIMPLEMENTED: Deserializing serialized executable not supported.
INFO 07-08 18:18:24 [tpu_model_runner.py:1381] -- num_seqs: 16
2025-07-08 18:18:25.470441: W torch_xla/csrc/runtime/pjrt_computation_client.cpp:682] Failed to deserialize executable: UNIMPLEMENTED: Deserializing serialized executable not supported.
2025-07-08 18:18:25.746117: W torch_xla/csrc/runtime/pjrt_computation_client.cpp:682] Failed to deserialize executable: UNIMPLEMENTED: Deserializing serialized executable not supported.
INFO 07-08 18:18:26 [tpu_model_runner.py:1381] -- num_seqs: 32
INFO 07-08 18:18:26 [tpu_model_runner.py:1384] Compilation finished in 4.12 [secs].
INFO 07-08 18:18:26 [tpu_model_runner.py:1388] Compiling sample_from_logits with different input shapes.
2025-07-08 18:18:27.051051: W torch_xla/csrc/runtime/pjrt_computation_client.cpp:682] Failed to deserialize executable: UNIMPLEMENTED: Deserializing serialized executable not supported.
2025-07-08 18:18:34.161226: W torch_xla/csrc/runtime/pjrt_computation_client.cpp:682] Failed to deserialize executable: UNIMPLEMENTED: Deserializing serialized executable not supported.
INFO 07-08 18:18:34 [tpu_model_runner.py:1412] -- num_seqs: 8
2025-07-08 18:18:34.595231: W torch_xla/csrc/runtime/pjrt_computation_client.cpp:682] Failed to deserialize executable: UNIMPLEMENTED: Deserializing serialized executable not supported.
2025-07-08 18:18:41.630993: W torch_xla/csrc/runtime/pjrt_computation_client.cpp:682] Failed to deserialize executable: UNIMPLEMENTED: Deserializing serialized executable not supported.
INFO 07-08 18:18:42 [tpu_model_runner.py:1412] -- num_seqs: 16
2025-07-08 18:18:42.387856: W torch_xla/csrc/runtime/pjrt_computation_client.cpp:682] Failed to deserialize executable: UNIMPLEMENTED: Deserializing serialized executable not supported.
2025-07-08 18:18:49.235962: W torch_xla/csrc/runtime/pjrt_computation_client.cpp:682] Failed to deserialize executable: UNIMPLEMENTED: Deserializing serialized executable not supported.
INFO 07-08 18:18:50 [tpu_model_runner.py:1412] -- num_seqs: 32
INFO 07-08 18:18:50 [tpu_model_runner.py:1415] Compilation finished in 23.70 [secs].
INFO 07-08 18:18:50 [tpu_model_runner.py:1419] Compiling gather_logprobs with different input shapes.
2025-07-08 18:18:50.659232: W torch_xla/csrc/runtime/pjrt_computation_client.cpp:682] Failed to deserialize executable: UNIMPLEMENTED: Deserializing serialized executable not supported.
INFO 07-08 18:18:51 [tpu_model_runner.py:1430] -- num_seqs: 8
2025-07-08 18:18:51.369842: W torch_xla/csrc/runtime/pjrt_computation_client.cpp:682] Failed to deserialize executable: UNIMPLEMENTED: Deserializing serialized executable not supported.
INFO 07-08 18:18:52 [tpu_model_runner.py:1430] -- num_seqs: 16
2025-07-08 18:18:52.258746: W torch_xla/csrc/runtime/pjrt_computation_client.cpp:682] Failed to deserialize executable: UNIMPLEMENTED: Deserializing serialized executable not supported.
INFO 07-08 18:18:53 [tpu_model_runner.py:1430] -- num_seqs: 32
INFO 07-08 18:18:53 [tpu_model_runner.py:1433] Compilation finished in 3.30 [secs].
INFO 07-08 18:18:53 [core.py:172] init engine (profile, create kv cache, warmup model) took 101.94 seconds
INFO 07-08 18:18:54 [tpu.py:103] [TPU] Forcing DYNAMO_ONCE compilation level
FAILED
WARNING 07-08 18:18:55 [interface.py:519] Current platform tpu does not have 'empty_cache' attribute.
WARNING 07-08 18:18:55 [parallel_state.py:1229] torch._C._host_emptyCache() only available in Pytorch >=2.5
=================================== FAILURES ===================================
_______________________ test_gsm8k_correctness[config0] ________________________
config = GSM8KAccuracyTestConfig(model_name='neuralmagic/Meta-Llama-3.1-8B-Instruct-quantized.w8a8', excepted_value=0.76)
@pytest.mark.parametrize("config", ACCURACY_CONFIGS)
def test_gsm8k_correctness(config: GSM8KAccuracyTestConfig):
> results = lm_eval.simple_evaluate(
model="vllm",
model_args=config.get_model_args(),
tasks="gsm8k",
batch_size="auto",
)
vllm/tests/tpu/test_quantization_accuracy.py:41:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
miniconda3/envs/vllm/lib/python3.10/site-packages/lm_eval/utils.py:439: in _wrapper
return fn(*args, **kwargs)
miniconda3/envs/vllm/lib/python3.10/site-packages/lm_eval/evaluator.py:230: in simple_evaluate
lm = lm_eval.api.registry.get_model(model).create_from_arg_string(
miniconda3/envs/vllm/lib/python3.10/site-packages/lm_eval/api/model.py:151: in create_from_arg_string
return cls(**args, **args2)
miniconda3/envs/vllm/lib/python3.10/site-packages/lm_eval/models/vllm_causallms.py:240: in __init__
self.hf_chat_template = resolve_hf_chat_template(
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
args = ()
kwargs = {'chat_template': None, 'tokenizer': CachedPreTrainedTokenizerFast(name_or_path='neuralmagic/Meta-Llama-3.1-8B-Instruc...alse, lstrip=False, single_word=False, normalized=False, special=True),
}
), 'tools': None, 'trust_remote_code': False}
deprecated_kwargs = {'trust_remote_code'}
msg = "The keyword arguments {'trust_remote_code'} are deprecated and will be removed in a future update. Please use `model_config.trust_remote_code` instead."
@wraps(fn)
def inner(*args, **kwargs):
if is_deprecated():
deprecated_kwargs = kwargs.keys() & deprecated_kws
if deprecated_kwargs:
msg = (
f"The keyword arguments {deprecated_kwargs} are "
"deprecated and will be removed in a future update.")
if additional_message is not None:
msg += f" {additional_message}"
warnings.warn(
DeprecationWarning(msg),
stacklevel=3, # The inner function takes up one level
)
> return fn(*args, **kwargs)
E TypeError: resolve_hf_chat_template() missing 1 required keyword-only argument: 'model_config'
vllm/vllm/utils/__init__.py:1484: TypeError
------------------------------ Captured log call -------------------------------
WARNING lm_eval.evaluator:evaluator.py:163 Model appears to be an instruct variant but chat template is not applied. Recommend setting `apply_chat_template` (optionally `fewshot_as_multiturn`).
=============================== warnings summary ===============================
tests/tpu/test_quantization_accuracy.py::test_gsm8k_correctness[config0]
/home/xiowei/miniconda3/envs/vllm/lib/python3.10/site-packages/lm_eval/api/model.py:151: DeprecationWarning: The keyword arguments {'trust_remote_code'} are deprecated and will be removed in a future update. Please use `model_config.trust_remote_code` instead.
return cls(**args, **args2)
-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
=========================== short test summary info ============================
FAILED vllm/tests/tpu/test_quantization_accuracy.py::test_gsm8k_correctness[config0]
=================== 1 failed, 1 warning in 135.28s (0:02:15) ===================
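The failure above is a keyword-signature mismatch: the installed lm_eval calls vLLM's `resolve_hf_chat_template()` without the `model_config` keyword-only argument that this vLLM build requires, so the call raises `TypeError` before evaluation starts. Pinning lm_eval and vLLM to compatible versions is the real fix; as an illustration only, a version-tolerant call can be sketched with `inspect.signature`. The `resolve_old`/`resolve_new` functions below are hypothetical stand-ins for the two signatures, not vLLM's actual implementations:

```python
import inspect

# Hypothetical stand-in for the older signature lm_eval expects.
def resolve_old(tokenizer, *, chat_template=None, tools=None):
    return "old"

# Hypothetical stand-in for the newer signature that requires `model_config`.
def resolve_new(tokenizer, *, chat_template=None, tools=None, model_config):
    return "new"

def call_resolve(fn, tokenizer, chat_template, tools, model_config):
    """Call either variant, passing `model_config` only when the
    target signature actually declares that parameter."""
    kwargs = {"chat_template": chat_template, "tools": tools}
    if "model_config" in inspect.signature(fn).parameters:
        kwargs["model_config"] = model_config
    return fn(tokenizer, **kwargs)
```

With a shim like this, the same caller works against both signatures; the deprecated `trust_remote_code` keyword (warned about in the log) would likewise be routed through `model_config` on newer versions.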