@davidberard98
Last active September 15, 2022 00:47
This file has been truncated, but you can view the full file.
submitit INFO (2022-09-15 00:31:04,046) - Starting with JobEnvironment(job_id=64923, hostname=a100-st-p4d24xlarge-15, local_rank=0(8), node=0(1), global_rank=0(8))
submitit INFO (2022-09-15 00:31:04,047) - Loading pickle: /fsx/users/dberard/scratch-local/bench-fast/benchmark/logs/64923_submitted.pkl
Process group: 8 tasks, rank: 0
a100-st-p4d24xlarge-15:26872:26872 [0] NCCL INFO NCCL_SOCKET_IFNAME set by environment to ens
a100-st-p4d24xlarge-15:26872:26872 [0] NCCL INFO NCCL_SOCKET_IFNAME set to ens
a100-st-p4d24xlarge-15:26872:26872 [0] NCCL INFO Bootstrap : Using ens32:10.200.83.149<0>
a100-st-p4d24xlarge-15:26872:26872 [0] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v6 symbol.
a100-st-p4d24xlarge-15:26872:26872 [0] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin symbol (v4 or v5).
a100-st-p4d24xlarge-15:26872:26872 [0] NCCL INFO cudaDriverVersion 11060
NCCL version 2.13.4+cuda11.6
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO NET/OFI Using aws-ofi-nccl 1.4.0aws
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO NET/OFI Running on p4d.24xlarge platform, Setting NCCL_TOPO_FILE environment variable to /fsx/users/dberard/scratch-local/bench-fast/aws-ofi-nccl/share/aws-ofi-nccl/xml/p4d-24xl-topo.xml
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO NET/OFI Setting FI_EFA_FORK_SAFE environment variable to 1
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO NET/OFI Selected Provider is efa
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Using network AWS Libfabric
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO NCCL_TOPO_FILE set by environment to /fsx/users/dberard/scratch-local/bench-fast/aws-ofi-nccl/share/aws-ofi-nccl/xml/p4d-24xl-topo.xml
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO NET/AWS Libfabric : GPU Direct RDMA Enabled for HCA 0 'rdmap16s27'
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO NET/AWS Libfabric : GPU Direct RDMA Enabled for HCA 1 'rdmap32s27'
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO NET/AWS Libfabric : GPU Direct RDMA Enabled for HCA 2 'rdmap144s27'
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO NET/AWS Libfabric : GPU Direct RDMA Enabled for HCA 3 'rdmap160s27'
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO GPU Direct RDMA Enabled for GPU 101c0 / HCA 0 (distance 3 <= 4), read 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO GPU Direct RDMA Enabled for GPU 101d0 / HCA 0 (distance 3 <= 4), read 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO GPU Direct RDMA Enabled for GPU 201c0 / HCA 0 (distance 3 <= 4), read 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO GPU Direct RDMA Enabled for GPU 201d0 / HCA 0 (distance 3 <= 4), read 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO GPU Direct RDMA Enabled for GPU 901c0 / HCA 0 (distance 3 <= 4), read 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO GPU Direct RDMA Enabled for GPU 901d0 / HCA 0 (distance 3 <= 4), read 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO GPU Direct RDMA Enabled for GPU a01c0 / HCA 0 (distance 3 <= 4), read 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO GPU Direct RDMA Enabled for GPU a01d0 / HCA 0 (distance 3 <= 4), read 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO GPU Direct RDMA Enabled for GPU 101c0 / HCA 1 (distance 3 <= 4), read 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO GPU Direct RDMA Enabled for GPU 101d0 / HCA 1 (distance 3 <= 4), read 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO GPU Direct RDMA Enabled for GPU 201c0 / HCA 1 (distance 3 <= 4), read 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO GPU Direct RDMA Enabled for GPU 201d0 / HCA 1 (distance 3 <= 4), read 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO GPU Direct RDMA Enabled for GPU 901c0 / HCA 1 (distance 3 <= 4), read 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO GPU Direct RDMA Enabled for GPU 901d0 / HCA 1 (distance 3 <= 4), read 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO GPU Direct RDMA Enabled for GPU a01c0 / HCA 1 (distance 3 <= 4), read 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO GPU Direct RDMA Enabled for GPU a01d0 / HCA 1 (distance 3 <= 4), read 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO GPU Direct RDMA Enabled for GPU 101c0 / HCA 2 (distance 3 <= 4), read 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO GPU Direct RDMA Enabled for GPU 101d0 / HCA 2 (distance 3 <= 4), read 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO GPU Direct RDMA Enabled for GPU 201c0 / HCA 2 (distance 3 <= 4), read 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO GPU Direct RDMA Enabled for GPU 201d0 / HCA 2 (distance 3 <= 4), read 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO GPU Direct RDMA Enabled for GPU 901c0 / HCA 2 (distance 3 <= 4), read 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO GPU Direct RDMA Enabled for GPU 901d0 / HCA 2 (distance 3 <= 4), read 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO GPU Direct RDMA Enabled for GPU a01c0 / HCA 2 (distance 3 <= 4), read 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO GPU Direct RDMA Enabled for GPU a01d0 / HCA 2 (distance 3 <= 4), read 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO GPU Direct RDMA Enabled for GPU 101c0 / HCA 3 (distance 3 <= 4), read 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO GPU Direct RDMA Enabled for GPU 101d0 / HCA 3 (distance 3 <= 4), read 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO GPU Direct RDMA Enabled for GPU 201c0 / HCA 3 (distance 3 <= 4), read 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO GPU Direct RDMA Enabled for GPU 201d0 / HCA 3 (distance 3 <= 4), read 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO GPU Direct RDMA Enabled for GPU 901c0 / HCA 3 (distance 3 <= 4), read 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO GPU Direct RDMA Enabled for GPU 901d0 / HCA 3 (distance 3 <= 4), read 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO GPU Direct RDMA Enabled for GPU a01c0 / HCA 3 (distance 3 <= 4), read 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO GPU Direct RDMA Enabled for GPU a01d0 / HCA 3 (distance 3 <= 4), read 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Setting affinity for GPU 0 to 1f0000,0000001f
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Channel 00/24 : 0 1 2 3 4 5 6 7
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Channel 01/24 : 0 1 2 3 4 5 6 7
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Channel 02/24 : 0 1 2 3 4 5 6 7
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Channel 03/24 : 0 1 2 3 4 5 6 7
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Channel 04/24 : 0 1 2 3 4 5 6 7
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Channel 05/24 : 0 1 2 3 4 5 6 7
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Channel 06/24 : 0 1 2 3 4 5 6 7
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Channel 07/24 : 0 1 2 3 4 5 6 7
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Channel 08/24 : 0 1 2 3 4 5 6 7
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Channel 09/24 : 0 1 2 3 4 5 6 7
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Channel 10/24 : 0 1 2 3 4 5 6 7
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Channel 11/24 : 0 1 2 3 4 5 6 7
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Channel 12/24 : 0 1 2 3 4 5 6 7
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Channel 13/24 : 0 1 2 3 4 5 6 7
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Channel 14/24 : 0 1 2 3 4 5 6 7
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Channel 15/24 : 0 1 2 3 4 5 6 7
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Channel 16/24 : 0 1 2 3 4 5 6 7
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Channel 17/24 : 0 1 2 3 4 5 6 7
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Channel 18/24 : 0 1 2 3 4 5 6 7
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Channel 19/24 : 0 1 2 3 4 5 6 7
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Channel 20/24 : 0 1 2 3 4 5 6 7
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Channel 21/24 : 0 1 2 3 4 5 6 7
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Channel 22/24 : 0 1 2 3 4 5 6 7
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Channel 23/24 : 0 1 2 3 4 5 6 7
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Trees [0] 1/-1/-1->0->-1 [1] 1/-1/-1->0->-1 [2] 1/-1/-1->0->-1 [3] 1/-1/-1->0->-1 [4] 1/-1/-1->0->-1 [5] 1/-1/-1->0->-1 [6] 1/-1/-1->0->-1 [7] 1/-1/-1->0->-1 [8] 1/-1/-1->0->-1 [9] 1/-1/-1->0->-1 [10] 1/-1/-1->0->-1 [11] 1/-1/-1->0->-1 [12] 1/-1/-1->0->-1 [13] 1/-1/-1->0->-1 [14] 1/-1/-1->0->-1 [15] 1/-1/-1->0->-1 [16] 1/-1/-1->0->-1 [17] 1/-1/-1->0->-1 [18] 1/-1/-1->0->-1 [19] 1/-1/-1->0->-1 [20] 1/-1/-1->0->-1 [21] 1/-1/-1->0->-1 [22] 1/-1/-1->0->-1 [23] 1/-1/-1->0->-1
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Connection to proxy localRank 0 -> connection 0x7f6964002d20
a100-st-p4d24xlarge-15:26872:27199 [0] NCCL INFO New proxy recv connection 0 from local rank 0, transport 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Connection to proxy localRank 0 -> connection 0x7f6964002d58
a100-st-p4d24xlarge-15:26872:27199 [0] NCCL INFO New proxy recv connection 1 from local rank 0, transport 0
a100-st-p4d24xlarge-15:26872:27199 [0] NCCL INFO New proxy recv connection 2 from local rank 0, transport 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Connection to proxy localRank 0 -> connection 0x7f6964002d90
a100-st-p4d24xlarge-15:26872:27199 [0] NCCL INFO New proxy recv connection 3 from local rank 0, transport 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Connection to proxy localRank 0 -> connection 0x7f6964002dc8
a100-st-p4d24xlarge-15:26872:27199 [0] NCCL INFO New proxy recv connection 4 from local rank 0, transport 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Connection to proxy localRank 0 -> connection 0x7f6964002e00
a100-st-p4d24xlarge-15:26872:27199 [0] NCCL INFO New proxy recv connection 5 from local rank 0, transport 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Connection to proxy localRank 0 -> connection 0x7f6964002e38
a100-st-p4d24xlarge-15:26872:27199 [0] NCCL INFO New proxy recv connection 6 from local rank 0, transport 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Connection to proxy localRank 0 -> connection 0x7f6964002e70
a100-st-p4d24xlarge-15:26872:27199 [0] NCCL INFO New proxy recv connection 7 from local rank 0, transport 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Connection to proxy localRank 0 -> connection 0x7f6964002ea8
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Connection to proxy localRank 0 -> connection 0x7f6964002ee0
a100-st-p4d24xlarge-15:26872:27199 [0] NCCL INFO New proxy recv connection 8 from local rank 0, transport 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Connection to proxy localRank 0 -> connection 0x7f6964002f18
a100-st-p4d24xlarge-15:26872:27199 [0] NCCL INFO New proxy recv connection 9 from local rank 0, transport 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Connection to proxy localRank 0 -> connection 0x7f6964002f50
a100-st-p4d24xlarge-15:26872:27199 [0] NCCL INFO New proxy recv connection 10 from local rank 0, transport 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Connection to proxy localRank 0 -> connection 0x7f6964002f88
a100-st-p4d24xlarge-15:26872:27199 [0] NCCL INFO New proxy recv connection 11 from local rank 0, transport 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Connection to proxy localRank 0 -> connection 0x7f6964002fc0
a100-st-p4d24xlarge-15:26872:27199 [0] NCCL INFO New proxy recv connection 12 from local rank 0, transport 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Connection to proxy localRank 0 -> connection 0x7f6964002ff8
a100-st-p4d24xlarge-15:26872:27199 [0] NCCL INFO New proxy recv connection 13 from local rank 0, transport 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Connection to proxy localRank 0 -> connection 0x7f6964003030
a100-st-p4d24xlarge-15:26872:27199 [0] NCCL INFO New proxy recv connection 14 from local rank 0, transport 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Connection to proxy localRank 0 -> connection 0x7f6964003068
a100-st-p4d24xlarge-15:26872:27199 [0] NCCL INFO New proxy recv connection 15 from local rank 0, transport 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Connection to proxy localRank 0 -> connection 0x7f69640030a0
a100-st-p4d24xlarge-15:26872:27199 [0] NCCL INFO New proxy recv connection 16 from local rank 0, transport 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Connection to proxy localRank 0 -> connection 0x7f69640030d8
a100-st-p4d24xlarge-15:26872:27199 [0] NCCL INFO New proxy recv connection 17 from local rank 0, transport 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Connection to proxy localRank 0 -> connection 0x7f6964003110
a100-st-p4d24xlarge-15:26872:27199 [0] NCCL INFO New proxy recv connection 18 from local rank 0, transport 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Connection to proxy localRank 0 -> connection 0x7f6964003148
a100-st-p4d24xlarge-15:26872:27199 [0] NCCL INFO New proxy recv connection 19 from local rank 0, transport 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Connection to proxy localRank 0 -> connection 0x7f6964003180
a100-st-p4d24xlarge-15:26872:27199 [0] NCCL INFO New proxy recv connection 20 from local rank 0, transport 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Connection to proxy localRank 0 -> connection 0x7f69640031b8
a100-st-p4d24xlarge-15:26872:27199 [0] NCCL INFO New proxy recv connection 21 from local rank 0, transport 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Connection to proxy localRank 0 -> connection 0x7f69640031f0
a100-st-p4d24xlarge-15:26872:27199 [0] NCCL INFO New proxy recv connection 22 from local rank 0, transport 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Connection to proxy localRank 0 -> connection 0x7f6964003228
a100-st-p4d24xlarge-15:26872:27199 [0] NCCL INFO New proxy recv connection 23 from local rank 0, transport 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Channel 00 : 0[101c0] -> 1[101d0] via P2P/IPC/read
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Connection to proxy localRank 0 -> connection 0x7f6964003260
a100-st-p4d24xlarge-15:26872:27199 [0] NCCL INFO New proxy send connection 24 from local rank 0, transport 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Channel 01 : 0[101c0] -> 1[101d0] via P2P/IPC/read
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Connection to proxy localRank 0 -> connection 0x7f6964003298
a100-st-p4d24xlarge-15:26872:27199 [0] NCCL INFO New proxy send connection 25 from local rank 0, transport 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Channel 02 : 0[101c0] -> 1[101d0] via P2P/IPC/read
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Connection to proxy localRank 0 -> connection 0x7f69640032d0
a100-st-p4d24xlarge-15:26872:27199 [0] NCCL INFO New proxy send connection 26 from local rank 0, transport 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Channel 03 : 0[101c0] -> 1[101d0] via P2P/IPC/read
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Connection to proxy localRank 0 -> connection 0x7f6964003308
a100-st-p4d24xlarge-15:26872:27199 [0] NCCL INFO New proxy send connection 27 from local rank 0, transport 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Channel 04 : 0[101c0] -> 1[101d0] via P2P/IPC/read
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Connection to proxy localRank 0 -> connection 0x7f6964003340
a100-st-p4d24xlarge-15:26872:27199 [0] NCCL INFO New proxy send connection 28 from local rank 0, transport 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Channel 05 : 0[101c0] -> 1[101d0] via P2P/IPC/read
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Connection to proxy localRank 0 -> connection 0x7f6964003378
a100-st-p4d24xlarge-15:26872:27199 [0] NCCL INFO New proxy send connection 29 from local rank 0, transport 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Channel 06 : 0[101c0] -> 1[101d0] via P2P/IPC/read
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Connection to proxy localRank 0 -> connection 0x7f69640033b0
a100-st-p4d24xlarge-15:26872:27199 [0] NCCL INFO New proxy send connection 30 from local rank 0, transport 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Channel 07 : 0[101c0] -> 1[101d0] via P2P/IPC/read
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Connection to proxy localRank 0 -> connection 0x7f69640033e8
a100-st-p4d24xlarge-15:26872:27199 [0] NCCL INFO New proxy send connection 31 from local rank 0, transport 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Channel 08 : 0[101c0] -> 1[101d0] via P2P/IPC/read
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Connection to proxy localRank 0 -> connection 0x7f6964003420
a100-st-p4d24xlarge-15:26872:27199 [0] NCCL INFO New proxy send connection 32 from local rank 0, transport 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Channel 09 : 0[101c0] -> 1[101d0] via P2P/IPC/read
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Connection to proxy localRank 0 -> connection 0x7f6964003458
a100-st-p4d24xlarge-15:26872:27199 [0] NCCL INFO New proxy send connection 33 from local rank 0, transport 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Channel 10 : 0[101c0] -> 1[101d0] via P2P/IPC/read
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Connection to proxy localRank 0 -> connection 0x7f6964003490
a100-st-p4d24xlarge-15:26872:27199 [0] NCCL INFO New proxy send connection 34 from local rank 0, transport 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Channel 11 : 0[101c0] -> 1[101d0] via P2P/IPC/read
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Connection to proxy localRank 0 -> connection 0x7f69640034c8
a100-st-p4d24xlarge-15:26872:27199 [0] NCCL INFO New proxy send connection 35 from local rank 0, transport 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Channel 12 : 0[101c0] -> 1[101d0] via P2P/IPC/read
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Connection to proxy localRank 0 -> connection 0x7f6964003500
a100-st-p4d24xlarge-15:26872:27199 [0] NCCL INFO New proxy send connection 36 from local rank 0, transport 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Channel 13 : 0[101c0] -> 1[101d0] via P2P/IPC/read
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Connection to proxy localRank 0 -> connection 0x7f6964003538
a100-st-p4d24xlarge-15:26872:27199 [0] NCCL INFO New proxy send connection 37 from local rank 0, transport 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Channel 14 : 0[101c0] -> 1[101d0] via P2P/IPC/read
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Connection to proxy localRank 0 -> connection 0x7f6964003570
a100-st-p4d24xlarge-15:26872:27199 [0] NCCL INFO New proxy send connection 38 from local rank 0, transport 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Channel 15 : 0[101c0] -> 1[101d0] via P2P/IPC/read
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Connection to proxy localRank 0 -> connection 0x7f69640035a8
a100-st-p4d24xlarge-15:26872:27199 [0] NCCL INFO New proxy send connection 39 from local rank 0, transport 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Channel 16 : 0[101c0] -> 1[101d0] via P2P/IPC/read
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Connection to proxy localRank 0 -> connection 0x7f69640035e0
a100-st-p4d24xlarge-15:26872:27199 [0] NCCL INFO New proxy send connection 40 from local rank 0, transport 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Channel 17 : 0[101c0] -> 1[101d0] via P2P/IPC/read
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Connection to proxy localRank 0 -> connection 0x7f6964003618
a100-st-p4d24xlarge-15:26872:27199 [0] NCCL INFO New proxy send connection 41 from local rank 0, transport 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Channel 18 : 0[101c0] -> 1[101d0] via P2P/IPC/read
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Connection to proxy localRank 0 -> connection 0x7f6964003650
a100-st-p4d24xlarge-15:26872:27199 [0] NCCL INFO New proxy send connection 42 from local rank 0, transport 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Channel 19 : 0[101c0] -> 1[101d0] via P2P/IPC/read
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Connection to proxy localRank 0 -> connection 0x7f6964003688
a100-st-p4d24xlarge-15:26872:27199 [0] NCCL INFO New proxy send connection 43 from local rank 0, transport 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Channel 20 : 0[101c0] -> 1[101d0] via P2P/IPC/read
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Connection to proxy localRank 0 -> connection 0x7f69640036c0
a100-st-p4d24xlarge-15:26872:27199 [0] NCCL INFO New proxy send connection 44 from local rank 0, transport 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Channel 21 : 0[101c0] -> 1[101d0] via P2P/IPC/read
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Connection to proxy localRank 0 -> connection 0x7f69640036f8
a100-st-p4d24xlarge-15:26872:27199 [0] NCCL INFO New proxy send connection 45 from local rank 0, transport 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Channel 22 : 0[101c0] -> 1[101d0] via P2P/IPC/read
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Connection to proxy localRank 0 -> connection 0x7f6964003730
a100-st-p4d24xlarge-15:26872:27199 [0] NCCL INFO New proxy send connection 46 from local rank 0, transport 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Channel 23 : 0[101c0] -> 1[101d0] via P2P/IPC/read
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Connection to proxy localRank 0 -> connection 0x7f6964003768
a100-st-p4d24xlarge-15:26872:27199 [0] NCCL INFO New proxy send connection 47 from local rank 0, transport 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Connected all rings
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Connection to proxy localRank 0 -> connection 0x7f69640037a0
a100-st-p4d24xlarge-15:26872:27199 [0] NCCL INFO New proxy recv connection 48 from local rank 0, transport 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Connection to proxy localRank 0 -> connection 0x7f69640037d8
a100-st-p4d24xlarge-15:26872:27199 [0] NCCL INFO New proxy recv connection 49 from local rank 0, transport 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Connection to proxy localRank 0 -> connection 0x7f6964003810
a100-st-p4d24xlarge-15:26872:27199 [0] NCCL INFO New proxy recv connection 50 from local rank 0, transport 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Connection to proxy localRank 0 -> connection 0x7f6964003848
a100-st-p4d24xlarge-15:26872:27199 [0] NCCL INFO New proxy recv connection 51 from local rank 0, transport 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Connection to proxy localRank 0 -> connection 0x7f6964003880
a100-st-p4d24xlarge-15:26872:27199 [0] NCCL INFO New proxy recv connection 52 from local rank 0, transport 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Connection to proxy localRank 0 -> connection 0x7f69640038b8
a100-st-p4d24xlarge-15:26872:27199 [0] NCCL INFO New proxy recv connection 53 from local rank 0, transport 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Connection to proxy localRank 0 -> connection 0x7f69640038f0
a100-st-p4d24xlarge-15:26872:27199 [0] NCCL INFO New proxy recv connection 54 from local rank 0, transport 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Connection to proxy localRank 0 -> connection 0x7f6964003928
a100-st-p4d24xlarge-15:26872:27199 [0] NCCL INFO New proxy recv connection 55 from local rank 0, transport 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Connection to proxy localRank 0 -> connection 0x7f6964003960
a100-st-p4d24xlarge-15:26872:27199 [0] NCCL INFO New proxy recv connection 56 from local rank 0, transport 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Connection to proxy localRank 0 -> connection 0x7f6964003998
a100-st-p4d24xlarge-15:26872:27199 [0] NCCL INFO New proxy recv connection 57 from local rank 0, transport 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Connection to proxy localRank 0 -> connection 0x7f69640039d0
a100-st-p4d24xlarge-15:26872:27199 [0] NCCL INFO New proxy recv connection 58 from local rank 0, transport 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Connection to proxy localRank 0 -> connection 0x7f6964003a08
a100-st-p4d24xlarge-15:26872:27199 [0] NCCL INFO New proxy recv connection 59 from local rank 0, transport 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Connection to proxy localRank 0 -> connection 0x7f6964003a40
a100-st-p4d24xlarge-15:26872:27199 [0] NCCL INFO New proxy recv connection 60 from local rank 0, transport 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Connection to proxy localRank 0 -> connection 0x7f6964003a78
a100-st-p4d24xlarge-15:26872:27199 [0] NCCL INFO New proxy recv connection 61 from local rank 0, transport 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Connection to proxy localRank 0 -> connection 0x7f6964003ab0
a100-st-p4d24xlarge-15:26872:27199 [0] NCCL INFO New proxy recv connection 62 from local rank 0, transport 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Connection to proxy localRank 0 -> connection 0x7f6964003ae8
a100-st-p4d24xlarge-15:26872:27199 [0] NCCL INFO New proxy recv connection 63 from local rank 0, transport 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Connection to proxy localRank 0 -> connection 0x7f6964003b20
a100-st-p4d24xlarge-15:26872:27199 [0] NCCL INFO New proxy recv connection 64 from local rank 0, transport 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Connection to proxy localRank 0 -> connection 0x7f6964003b58
a100-st-p4d24xlarge-15:26872:27199 [0] NCCL INFO New proxy recv connection 65 from local rank 0, transport 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Connection to proxy localRank 0 -> connection 0x7f6964003b90
a100-st-p4d24xlarge-15:26872:27199 [0] NCCL INFO New proxy recv connection 66 from local rank 0, transport 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Connection to proxy localRank 0 -> connection 0x7f6964003bc8
a100-st-p4d24xlarge-15:26872:27199 [0] NCCL INFO New proxy recv connection 67 from local rank 0, transport 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Connection to proxy localRank 0 -> connection 0x7f6964003c00
a100-st-p4d24xlarge-15:26872:27199 [0] NCCL INFO New proxy recv connection 68 from local rank 0, transport 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Connection to proxy localRank 0 -> connection 0x7f6964003c38
a100-st-p4d24xlarge-15:26872:27199 [0] NCCL INFO New proxy recv connection 69 from local rank 0, transport 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Connection to proxy localRank 0 -> connection 0x7f6964003c70
a100-st-p4d24xlarge-15:26872:27199 [0] NCCL INFO New proxy recv connection 70 from local rank 0, transport 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Connection to proxy localRank 0 -> connection 0x7f6964003ca8
a100-st-p4d24xlarge-15:26872:27199 [0] NCCL INFO New proxy recv connection 71 from local rank 0, transport 0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Connected all trees
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO NCCL_ALGO set by environment to ring
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO threadThresholds 8/8/64 | 64/8/64 | 8/8/512
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO 24 coll channels, 32 p2p channels, 32 p2p channels per peer
a100-st-p4d24xlarge-15:26872:27199 [0] NCCL INFO New proxy send connection 72 from local rank 0, transport 2
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO Connection to proxy localRank 0 -> connection 0x7f6964003ce0
a100-st-p4d24xlarge-15:26872:27153 [0] NCCL INFO comm 0x7f6968002c30 rank 0 nranks 8 cudaDev 0 busId 101c0 - Init COMPLETE
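The NCCL lines above appear to share a `host:pid:tid [rank] NCCL INFO message` layout, and the peer connections are reported as `Channel NN : src[busId] -> dst[busId] via transport`. A minimal sketch for pulling those connections out of a log like this one follows. The two regular expressions are inferred only from the lines shown here, and the names `ChannelLink` and `parse_channel_link` are hypothetical helpers, not part of NCCL or PyTorch.

```python
import re
from typing import NamedTuple, Optional

# Assumed outer layout, inferred from this log:
#   <host>:<pid>:<tid> [<local rank>] NCCL INFO <message>
NCCL_LINE = re.compile(
    r"^(?P<host>[^:\s]+):(?P<pid>\d+):(?P<tid>\d+) "
    r"\[(?P<rank>\d+)\] NCCL INFO (?P<msg>.*)$"
)

# Assumed layout of a peer-connection message, e.g.
#   Channel 00 : 0[101c0] -> 1[101d0] via P2P/IPC/read
CHANNEL_MSG = re.compile(
    r"^Channel (?P<channel>\d+) : "
    r"(?P<src>\d+)\[[0-9a-f]+\] -> (?P<dst>\d+)\[[0-9a-f]+\] "
    r"via (?P<transport>\S+)$"
)


class ChannelLink(NamedTuple):
    """One peer connection reported by a 'Channel NN : a -> b via ...' line."""
    channel: int
    src: int
    dst: int
    transport: str


def parse_channel_link(line: str) -> Optional[ChannelLink]:
    """Return the connection described by `line`, or None for any other line."""
    outer = NCCL_LINE.match(line.strip())
    if outer is None:
        return None
    inner = CHANNEL_MSG.match(outer.group("msg"))
    if inner is None:
        # Ring-order lines such as 'Channel 00/24 : 0 1 2 ...' land here.
        return None
    return ChannelLink(
        channel=int(inner["channel"]),
        src=int(inner["src"]),
        dst=int(inner["dst"]),
        transport=inner["transport"],
    )
```

Applied to every line of the log, the non-`None` results give one record per channel, which makes it easy to confirm that all 24 channels on this rank use the same transport.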
-> ctx <function enable_profiling_executor at 0x7f6a1abe19d0>
torchdynamo.eval_frame: [DEBUG] skipping __init__ /data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/contextlib.py
torchdynamo.eval_frame: [DEBUG] skipping __enter__ /data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/contextlib.py
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR model [UserDefinedObjectVariable(Model)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_TUPLE 0 [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [NNModuleVariable(), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR example_inputs [NNModuleVariable(), TupleVariable(), UserDefinedObjectVariable(Model)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION_EX 1 [NNModuleVariable(), TupleVariable(), ConstDictVariable()]
torchdynamo.convert_frame: [ERROR] WON'T CONVERT train /fsx/users/dberard/scratch-local/bench-fast/benchmark/torchbenchmark/util/framework/huggingface/model_factory.py line 119
120           0 LOAD_FAST                0 (self)
              2 LOAD_ATTR                0 (model)
              4 BUILD_TUPLE              0
              6 LOAD_FAST                0 (self)
              8 LOAD_ATTR                1 (example_inputs)
             10 CALL_FUNCTION_EX         1
             12 STORE_FAST               1 (outputs)

121          14 LOAD_FAST                1 (outputs)
             16 LOAD_ATTR                2 (loss)
             18 STORE_FAST               2 (loss)

122          20 LOAD_FAST                2 (loss)
             22 LOAD_METHOD              3 (backward)
             24 CALL_METHOD              0
             26 POP_TOP
             28 LOAD_CONST               0 (None)
             30 RETURN_VALUE
========== TorchDynamo Stack Trace ==========
Traceback (most recent call last):
  File "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/torchdynamo/convert_frame.py", line 313, in _convert_frame_assert
    code = transform_code_object(frame.f_code, transform)
  File "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/torchdynamo/bytecode_transformation.py", line 338, in transform_code_object
    transformations(instructions, code_options)
  File "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/torchdynamo/convert_frame.py", line 301, in transform
    tracer.run()
  File "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/torchdynamo/symbolic_convert.py", line 331, in run
    and self.step()
  File "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/torchdynamo/symbolic_convert.py", line 304, in step
    getattr(self, inst.opname)(inst)
  File "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/torchdynamo/symbolic_convert.py", line 154, in wrapper
    return inner_fn(self, inst)
  File "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/torchdynamo/symbolic_convert.py", line 731, in CALL_FUNCTION_EX
    self.call_function(fn, argsvars.items, kwargsvars.items)
  File "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/torchdynamo/symbolic_convert.py", line 241, in call_function
    self.push(fn.call_function(self, args, kwargs))
  File "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/torchdynamo/variables/nn_module.py", line 195, in call_function
    return variables.TensorVariable.create(
  File "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/torchdynamo/variables/tensor.py", line 278, in create
    assert (
AssertionError: torch.* op returned non-Tensor MaskedLMOutput call_module self_model
========== The above exception occurred while processing the following code ==========
  File "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/submitit/core/_submit.py", line 11, in <module>
    submitit_main()
  File "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/submitit/core/submission.py", line 72, in submitit_main
    process_job(args.folder)
  File "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/submitit/core/submission.py", line 54, in process_job
    result = delayed.result()
  File "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/submitit/core/utils.py", line 133, in result
    self._result = self.function(*self.args, **self.kwargs)
  File "ddp_experiments.py", line 151, in __call__
    return trainer_class(self.args, model_class, model_args=self.model_args).measure()
  File "/fsx/users/dberard/scratch-local/bench-fast/benchmark/torchbenchmark/util/distributed/core_model/trainer.py", line 79, in measure
    self.benchmark.invoke()
  File "/fsx/users/dberard/scratch-local/bench-fast/benchmark/torchbenchmark/util/model.py", line 190, in invoke
    self.train()
  File "/fsx/users/dberard/scratch-local/bench-fast/benchmark/torchbenchmark/util/framework/huggingface/model_factory.py", line 119, in train
    def train(self):
  File "/fsx/users/dberard/scratch-local/bench-fast/benchmark/torchbenchmark/util/framework/huggingface/model_factory.py", line 120, in train
    outputs = self.model(**self.example_inputs)
==========
torchdynamo.eval_frame: [DEBUG] skipping _call_impl /data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/torch/nn/modules/module.py
torchdynamo.eval_frame: [DEBUG] skipping forward /data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/torch/nn/parallel/distributed.py
torchdynamo.eval_frame: [DEBUG] skipping __setattr__ /data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/torch/nn/modules/module.py
torchdynamo.eval_frame: [DEBUG] skipping __instancecheck__ /data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/torch/nn/parameter.py
torchdynamo.eval_frame: [DEBUG] skipping notify_join_context /data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/torch/distributed/algorithms/join.py
torchdynamo.eval_frame: [DEBUG] skipping __getattr__ /data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/torch/nn/modules/module.py
torchdynamo.eval_frame: [DEBUG] skipping _check_sync_bufs_pre_fwd /data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/torch/nn/parallel/distributed.py
torchdynamo.eval_frame: [DEBUG] skipping will_sync_module_buffers /data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/torch/nn/parallel/distributed.py
torchdynamo.eval_frame: [DEBUG] skipping _run_ddp_forward /data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/torch/nn/parallel/distributed.py
torchdynamo.eval_frame: [DEBUG] skipping _to_kwargs /data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/torch/distributed/utils.py
torchdynamo.eval_frame: [DEBUG] skipping _recursive_to /data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/torch/distributed/utils.py
torchdynamo.eval_frame: [DEBUG] skipping to_map /data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/torch/distributed/utils.py
torchdynamo.eval_frame: [DEBUG] skipping _is_namedtuple /data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/torch/nn/parallel/scatter_gather.py
torchdynamo.eval_frame: [DEBUG] skipping <listcomp> /data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/torch/distributed/utils.py
torchdynamo.eval_frame: [DEBUG] skipping <listcomp> /data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/torch/distributed/utils.py
torchdynamo.eval_frame: [DEBUG] skipping helper /data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/contextlib.py
torchdynamo.eval_frame: [DEBUG] skipping __init__ /data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/contextlib.py
torchdynamo.eval_frame: [DEBUG] skipping __enter__ /data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/contextlib.py
torchdynamo.eval_frame: [DEBUG] skipping _inside_ddp_forward /data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/torch/nn/parallel/distributed.py
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST return_dict []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP is not [ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 12 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR config [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR use_return_dict [HFPretrainedConfigVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST return_dict [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR bert [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST input_ids [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_mask [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST token_type_ids [NNModuleVariable(), TensorVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST position_ids [NNModuleVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST head_mask [NNModuleVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST inputs_embeds [NNModuleVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST encoder_hidden_states [NNModuleVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST encoder_attention_mask [NNModuleVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST output_attentions [NNModuleVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST output_hidden_states [NNModuleVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST return_dict [NNModuleVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST ('attention_mask', 'token_type_ids', 'position_ids', 'head_mask', 'inputs_embeds', 'encoder_hidden_states', 'encoder_attention_mask', 'output_attentions', 'output_hidden_states', 'return_dict') [NNModuleVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION_KW 11 [NNModuleVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(bool), ConstantVariable(tuple)]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object forward at 0x7f6a100cdd40, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 909>
952 0 LOAD_FAST 11 (output_attentions)
2 LOAD_CONST 1 (None)
4 COMPARE_OP 9 (is not)
6 POP_JUMP_IF_FALSE 12
8 LOAD_FAST 11 (output_attentions)
10 JUMP_FORWARD 6 (to 18)
>> 12 LOAD_FAST 0 (self)
14 LOAD_ATTR 0 (config)
16 LOAD_ATTR 1 (output_attentions)
>> 18 STORE_FAST 11 (output_attentions)
954 20 LOAD_FAST 12 (output_hidden_states)
22 LOAD_CONST 1 (None)
24 COMPARE_OP 9 (is not)
26 POP_JUMP_IF_FALSE 32
28 LOAD_FAST 12 (output_hidden_states)
30 JUMP_FORWARD 6 (to 38)
>> 32 LOAD_FAST 0 (self)
34 LOAD_ATTR 0 (config)
36 LOAD_ATTR 2 (output_hidden_states)
953 >> 38 STORE_FAST 12 (output_hidden_states)
956 40 LOAD_FAST 13 (return_dict)
42 LOAD_CONST 1 (None)
44 COMPARE_OP 9 (is not)
46 POP_JUMP_IF_FALSE 52
48 LOAD_FAST 13 (return_dict)
50 JUMP_FORWARD 6 (to 58)
>> 52 LOAD_FAST 0 (self)
54 LOAD_ATTR 0 (config)
56 LOAD_ATTR 3 (use_return_dict)
>> 58 STORE_FAST 13 (return_dict)
958 60 LOAD_FAST 0 (self)
62 LOAD_ATTR 0 (config)
64 LOAD_ATTR 4 (is_decoder)
66 POP_JUMP_IF_FALSE 90
959 68 LOAD_FAST 10 (use_cache)
70 LOAD_CONST 1 (None)
72 COMPARE_OP 9 (is not)
74 POP_JUMP_IF_FALSE 80
76 LOAD_FAST 10 (use_cache)
78 JUMP_FORWARD 6 (to 86)
>> 80 LOAD_FAST 0 (self)
82 LOAD_ATTR 0 (config)
84 LOAD_ATTR 5 (use_cache)
>> 86 STORE_FAST 10 (use_cache)
88 JUMP_FORWARD 4 (to 94)
961 >> 90 LOAD_CONST 2 (False)
92 STORE_FAST 10 (use_cache)
963 >> 94 LOAD_FAST 1 (input_ids)
96 LOAD_CONST 1 (None)
98 COMPARE_OP 9 (is not)
100 POP_JUMP_IF_FALSE 120
102 LOAD_FAST 6 (inputs_embeds)
104 LOAD_CONST 1 (None)
106 COMPARE_OP 9 (is not)
108 POP_JUMP_IF_FALSE 120
964 110 LOAD_GLOBAL 6 (ValueError)
112 LOAD_CONST 3 ('You cannot specify both input_ids and inputs_embeds at the same time')
114 CALL_FUNCTION 1
116 RAISE_VARARGS 1
118 JUMP_FORWARD 52 (to 172)
965 >> 120 LOAD_FAST 1 (input_ids)
122 LOAD_CONST 1 (None)
124 COMPARE_OP 9 (is not)
126 POP_JUMP_IF_FALSE 138
966 128 LOAD_FAST 1 (input_ids)
130 LOAD_METHOD 7 (size)
132 CALL_METHOD 0
134 STORE_FAST 14 (input_shape)
136 JUMP_FORWARD 34 (to 172)
967 >> 138 LOAD_FAST 6 (inputs_embeds)
140 LOAD_CONST 1 (None)
142 COMPARE_OP 9 (is not)
144 POP_JUMP_IF_FALSE 164
968 146 LOAD_FAST 6 (inputs_embeds)
148 LOAD_METHOD 7 (size)
150 CALL_METHOD 0
152 LOAD_CONST 1 (None)
154 LOAD_CONST 4 (-1)
156 BUILD_SLICE 2
158 BINARY_SUBSCR
160 STORE_FAST 14 (input_shape)
162 JUMP_FORWARD 8 (to 172)
970 >> 164 LOAD_GLOBAL 6 (ValueError)
166 LOAD_CONST 5 ('You have to specify either input_ids or inputs_embeds')
168 CALL_FUNCTION 1
170 RAISE_VARARGS 1
972 >> 172 LOAD_FAST 14 (input_shape)
174 UNPACK_SEQUENCE 2
176 STORE_FAST 15 (batch_size)
178 STORE_FAST 16 (seq_length)
973 180 LOAD_FAST 1 (input_ids)
182 LOAD_CONST 1 (None)
184 COMPARE_OP 9 (is not)
186 POP_JUMP_IF_FALSE 194
188 LOAD_FAST 1 (input_ids)
190 LOAD_ATTR 8 (device)
192 JUMP_FORWARD 4 (to 198)
>> 194 LOAD_FAST 6 (inputs_embeds)
196 LOAD_ATTR 8 (device)
>> 198 STORE_FAST 17 (device)
976 200 LOAD_FAST 9 (past_key_values)
202 LOAD_CONST 1 (None)
204 COMPARE_OP 9 (is not)
206 POP_JUMP_IF_FALSE 226
208 LOAD_FAST 9 (past_key_values)
210 LOAD_CONST 6 (0)
212 BINARY_SUBSCR
214 LOAD_CONST 6 (0)
216 BINARY_SUBSCR
218 LOAD_ATTR 9 (shape)
220 LOAD_CONST 7 (2)
222 BINARY_SUBSCR
224 JUMP_FORWARD 2 (to 228)
>> 226 LOAD_CONST 6 (0)
>> 228 STORE_FAST 18 (past_key_values_length)
978 230 LOAD_FAST 2 (attention_mask)
232 LOAD_CONST 1 (None)
234 COMPARE_OP 8 (is)
236 EXTENDED_ARG 1
238 POP_JUMP_IF_FALSE 262
979 240 LOAD_GLOBAL 10 (torch)
242 LOAD_ATTR 11 (ones)
244 LOAD_FAST 15 (batch_size)
246 LOAD_FAST 16 (seq_length)
248 LOAD_FAST 18 (past_key_values_length)
250 BINARY_ADD
252 BUILD_TUPLE 2
254 LOAD_FAST 17 (device)
256 LOAD_CONST 8 (('device',))
258 CALL_FUNCTION_KW 2
260 STORE_FAST 2 (attention_mask)
981 >> 262 LOAD_FAST 3 (token_type_ids)
264 LOAD_CONST 1 (None)
266 COMPARE_OP 8 (is)
268 EXTENDED_ARG 1
270 POP_JUMP_IF_FALSE 346
982 272 LOAD_GLOBAL 12 (hasattr)
274 LOAD_FAST 0 (self)
276 LOAD_ATTR 13 (embeddings)
278 LOAD_CONST 9 ('token_type_ids')
280 CALL_FUNCTION 2
282 EXTENDED_ARG 1
284 POP_JUMP_IF_FALSE 328
983 286 LOAD_FAST 0 (self)
288 LOAD_ATTR 13 (embeddings)
290 LOAD_ATTR 14 (token_type_ids)
292 LOAD_CONST 1 (None)
294 LOAD_CONST 1 (None)
296 BUILD_SLICE 2
298 LOAD_CONST 1 (None)
300 LOAD_FAST 16 (seq_length)
302 BUILD_SLICE 2
304 BUILD_TUPLE 2
306 BINARY_SUBSCR
308 STORE_FAST 19 (buffered_token_type_ids)
984 310 LOAD_FAST 19 (buffered_token_type_ids)
312 LOAD_METHOD 15 (expand)
314 LOAD_FAST 15 (batch_size)
316 LOAD_FAST 16 (seq_length)
318 CALL_METHOD 2
320 STORE_FAST 20 (buffered_token_type_ids_expanded)
985 322 LOAD_FAST 20 (buffered_token_type_ids_expanded)
324 STORE_FAST 3 (token_type_ids)
326 JUMP_FORWARD 18 (to 346)
987 >> 328 LOAD_GLOBAL 10 (torch)
330 LOAD_ATTR 16 (zeros)
332 LOAD_FAST 14 (input_shape)
334 LOAD_GLOBAL 10 (torch)
336 LOAD_ATTR 17 (long)
338 LOAD_FAST 17 (device)
340 LOAD_CONST 10 (('dtype', 'device'))
342 CALL_FUNCTION_KW 3
344 STORE_FAST 3 (token_type_ids)
991 >> 346 LOAD_FAST 0 (self)
348 LOAD_METHOD 18 (get_extended_attention_mask)
350 LOAD_FAST 2 (attention_mask)
352 LOAD_FAST 14 (input_shape)
354 CALL_METHOD 2
356 STORE_FAST 21 (extended_attention_mask)
995 358 LOAD_FAST 0 (self)
360 LOAD_ATTR 0 (config)
362 LOAD_ATTR 4 (is_decoder)
364 EXTENDED_ARG 1
366 POP_JUMP_IF_FALSE 436
368 LOAD_FAST 7 (encoder_hidden_states)
370 LOAD_CONST 1 (None)
372 COMPARE_OP 9 (is not)
374 EXTENDED_ARG 1
376 POP_JUMP_IF_FALSE 436
996 378 LOAD_FAST 7 (encoder_hidden_states)
380 LOAD_METHOD 7 (size)
382 CALL_METHOD 0
384 UNPACK_SEQUENCE 3
386 STORE_FAST 22 (encoder_batch_size)
388 STORE_FAST 23 (encoder_sequence_length)
390 STORE_FAST 24 (_)
997 392 LOAD_FAST 22 (encoder_batch_size)
394 LOAD_FAST 23 (encoder_sequence_length)
396 BUILD_TUPLE 2
398 STORE_FAST 25 (encoder_hidden_shape)
998 400 LOAD_FAST 8 (encoder_attention_mask)
402 LOAD_CONST 1 (None)
404 COMPARE_OP 8 (is)
406 EXTENDED_ARG 1
408 POP_JUMP_IF_FALSE 424
999 410 LOAD_GLOBAL 10 (torch)
412 LOAD_ATTR 11 (ones)
414 LOAD_FAST 25 (encoder_hidden_shape)
416 LOAD_FAST 17 (device)
418 LOAD_CONST 8 (('device',))
420 CALL_FUNCTION_KW 2
422 STORE_FAST 8 (encoder_attention_mask)
1000 >> 424 LOAD_FAST 0 (self)
426 LOAD_METHOD 19 (invert_attention_mask)
428 LOAD_FAST 8 (encoder_attention_mask)
430 CALL_METHOD 1
432 STORE_FAST 26 (encoder_extended_attention_mask)
434 JUMP_FORWARD 4 (to 440)
1002 >> 436 LOAD_CONST 1 (None)
438 STORE_FAST 26 (encoder_extended_attention_mask)
1009 >> 440 LOAD_FAST 0 (self)
442 LOAD_METHOD 20 (get_head_mask)
444 LOAD_FAST 5 (head_mask)
446 LOAD_FAST 0 (self)
448 LOAD_ATTR 0 (config)
450 LOAD_ATTR 21 (num_hidden_layers)
452 CALL_METHOD 2
454 STORE_FAST 5 (head_mask)
1011 456 LOAD_FAST 0 (self)
458 LOAD_ATTR 13 (embeddings)
1012 460 LOAD_FAST 1 (input_ids)
1013 462 LOAD_FAST 4 (position_ids)
1014 464 LOAD_FAST 3 (token_type_ids)
1015 466 LOAD_FAST 6 (inputs_embeds)
1016 468 LOAD_FAST 18 (past_key_values_length)
1011 470 LOAD_CONST 11 (('input_ids', 'position_ids', 'token_type_ids', 'inputs_embeds', 'past_key_values_length'))
472 CALL_FUNCTION_KW 5
474 STORE_FAST 27 (embedding_output)
1018 476 LOAD_FAST 0 (self)
478 LOAD_ATTR 22 (encoder)
1019 480 LOAD_FAST 27 (embedding_output)
1020 482 LOAD_FAST 21 (extended_attention_mask)
1021 484 LOAD_FAST 5 (head_mask)
1022 486 LOAD_FAST 7 (encoder_hidden_states)
1023 488 LOAD_FAST 26 (encoder_extended_attention_mask)
1024 490 LOAD_FAST 9 (past_key_values)
1025 492 LOAD_FAST 10 (use_cache)
1026 494 LOAD_FAST 11 (output_attentions)
1027 496 LOAD_FAST 12 (output_hidden_states)
1028 498 LOAD_FAST 13 (return_dict)
1018 500 LOAD_CONST 12 (('attention_mask', 'head_mask', 'encoder_hidden_states', 'encoder_attention_mask', 'past_key_values', 'use_cache', 'output_attentions', 'output_hidden_states', 'return_dict'))
502 CALL_FUNCTION_KW 10
504 STORE_FAST 28 (encoder_outputs)
1030 506 LOAD_FAST 28 (encoder_outputs)
508 LOAD_CONST 6 (0)
510 BINARY_SUBSCR
512 STORE_FAST 29 (sequence_output)
1031 514 LOAD_FAST 0 (self)
516 LOAD_ATTR 23 (pooler)
518 LOAD_CONST 1 (None)
520 COMPARE_OP 9 (is not)
522 EXTENDED_ARG 2
524 POP_JUMP_IF_FALSE 536
526 LOAD_FAST 0 (self)
528 LOAD_METHOD 23 (pooler)
530 LOAD_FAST 29 (sequence_output)
532 CALL_METHOD 1
534 JUMP_FORWARD 2 (to 538)
>> 536 LOAD_CONST 1 (None)
>> 538 STORE_FAST 30 (pooled_output)
1033 540 LOAD_FAST 13 (return_dict)
542 EXTENDED_ARG 2
544 POP_JUMP_IF_TRUE 566
1034 546 LOAD_FAST 29 (sequence_output)
548 LOAD_FAST 30 (pooled_output)
550 BUILD_TUPLE 2
552 LOAD_FAST 28 (encoder_outputs)
554 LOAD_CONST 13 (1)
556 LOAD_CONST 1 (None)
558 BUILD_SLICE 2
560 BINARY_SUBSCR
562 BINARY_ADD
564 RETURN_VALUE
1036 >> 566 LOAD_GLOBAL 24 (BaseModelOutputWithPoolingAndCrossAttentions)
1037 568 LOAD_FAST 29 (sequence_output)
1038 570 LOAD_FAST 30 (pooled_output)
1039 572 LOAD_FAST 28 (encoder_outputs)
574 LOAD_ATTR 25 (past_key_values)
1040 576 LOAD_FAST 28 (encoder_outputs)
578 LOAD_ATTR 26 (hidden_states)
1041 580 LOAD_FAST 28 (encoder_outputs)
582 LOAD_ATTR 27 (attentions)
1042 584 LOAD_FAST 28 (encoder_outputs)
586 LOAD_ATTR 28 (cross_attentions)
1036 588 LOAD_CONST 14 (('last_hidden_state', 'pooler_output', 'past_key_values', 'hidden_states', 'attentions', 'cross_attentions'))
590 CALL_FUNCTION_KW 6
592 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST output_attentions []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP is not [ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 12 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR config [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR output_attentions [HFPretrainedConfigVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST output_attentions [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST output_hidden_states []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP is not [ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 32 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR config [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR output_hidden_states [HFPretrainedConfigVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST output_hidden_states [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST return_dict []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP is not [ConstantVariable(bool), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 52 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST return_dict []
torchdynamo.symbolic_convert: [DEBUG] TRACE JUMP_FORWARD 58 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST return_dict [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR config [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR is_decoder [HFPretrainedConfigVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 90 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST False []
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST use_cache [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST input_ids []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP is not [TensorVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 120 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST inputs_embeds []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP is not [ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 120 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST input_ids []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP is not [TensorVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 138 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST input_ids []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR size [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 0 [GetAttrVariable(TensorVariable(), size)]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST input_shape [SizeVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE JUMP_FORWARD 172 []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST input_shape []
torchdynamo.symbolic_convert: [DEBUG] TRACE UNPACK_SEQUENCE 2 [SizeVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST batch_size [ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST seq_length [ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST input_ids []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP is not [TensorVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 194 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST input_ids []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR device [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE JUMP_FORWARD 198 [TorchVariable(cuda:0)]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST device [TorchVariable(cuda:0)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST past_key_values []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP is not [ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 226 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 0 []
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST past_key_values_length [ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_mask []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP is [ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 262 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_GLOBAL torch []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR ones [TorchVariable(<module 'torch' from '/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/torch/__init__.py'>)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST batch_size [TorchVariable(<built-in method ones of type object at 0x7f6bb096cb40>)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST seq_length [TorchVariable(<built-in method ones of type object at 0x7f6bb096cb40>), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST past_key_values_length [TorchVariable(<built-in method ones of type object at 0x7f6bb096cb40>), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_ADD None [TorchVariable(<built-in method ones of type object at 0x7f6bb096cb40>), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_TUPLE 2 [TorchVariable(<built-in method ones of type object at 0x7f6bb096cb40>), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST device [TorchVariable(<built-in method ones of type object at 0x7f6bb096cb40>), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST ('device',) [TorchVariable(<built-in method ones of type object at 0x7f6bb096cb40>), TupleVariable(), TorchVariable(cuda:0)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION_KW 2 [TorchVariable(<built-in method ones of type object at 0x7f6bb096cb40>), TupleVariable(), TorchVariable(cuda:0), ConstantVariable(tuple)]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST attention_mask [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST token_type_ids []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP is [ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 346 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_GLOBAL hasattr []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [BuiltinVariable(hasattr)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR embeddings [BuiltinVariable(hasattr), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST token_type_ids [BuiltinVariable(hasattr), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 2 [BuiltinVariable(hasattr), NNModuleVariable(), ConstantVariable(str)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 328 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR embeddings [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR token_type_ids [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [TensorVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_SLICE 2 [TensorVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [TensorVariable(), SliceVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST seq_length [TensorVariable(), SliceVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_SLICE 2 [TensorVariable(), SliceVariable(), ConstantVariable(NoneType), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_TUPLE 2 [TensorVariable(), SliceVariable(), SliceVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_SUBSCR None [TensorVariable(), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST buffered_token_type_ids [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST buffered_token_type_ids []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR expand [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST batch_size [GetAttrVariable(TensorVariable(), expand)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST seq_length [GetAttrVariable(TensorVariable(), expand), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 2 [GetAttrVariable(TensorVariable(), expand), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST buffered_token_type_ids_expanded [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST buffered_token_type_ids_expanded []
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST token_type_ids [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE JUMP_FORWARD 346 []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR get_extended_attention_mask [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_mask [UserMethodVariable(<function ModuleUtilsMixin.get_extended_attention_mask at 0x7f69ff5bcca0>, NNModuleVariable())]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST input_shape [UserMethodVariable(<function ModuleUtilsMixin.get_extended_attention_mask at 0x7f69ff5bcca0>, NNModuleVariable()), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 2 [UserMethodVariable(<function ModuleUtilsMixin.get_extended_attention_mask at 0x7f69ff5bcca0>, NNModuleVariable()), TensorVariable(), SizeVariable()]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object get_extended_attention_mask at 0x7f6a1009a450, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/modeling_utils.py", line 794>
809 0 LOAD_FAST 1 (attention_mask)
2 LOAD_METHOD 0 (dim)
4 CALL_METHOD 0
6 LOAD_CONST 1 (2)
8 COMPARE_OP 2 (==)
10 POP_JUMP_IF_FALSE 20
12 LOAD_FAST 0 (self)
14 LOAD_ATTR 1 (config)
16 LOAD_ATTR 2 (is_decoder)
18 POP_JUMP_IF_TRUE 40
811 >> 20 LOAD_FAST 3 (device)
22 LOAD_CONST 2 (None)
24 COMPARE_OP 9 (is not)
26 POP_JUMP_IF_FALSE 40
812 28 LOAD_GLOBAL 3 (warnings)
30 LOAD_METHOD 4 (warn)
813 32 LOAD_CONST 3 ('The `device` argument is deprecated and will be removed in v5 of Transformers.')
34 LOAD_GLOBAL 5 (FutureWarning)
812 36 CALL_METHOD 2
38 POP_TOP
817 >> 40 LOAD_FAST 1 (attention_mask)
42 LOAD_METHOD 0 (dim)
44 CALL_METHOD 0
46 LOAD_CONST 4 (3)
48 COMPARE_OP 2 (==)
50 POP_JUMP_IF_FALSE 82
818 52 LOAD_FAST 1 (attention_mask)
54 LOAD_CONST 2 (None)
56 LOAD_CONST 2 (None)
58 BUILD_SLICE 2
60 LOAD_CONST 2 (None)
62 LOAD_CONST 2 (None)
64 LOAD_CONST 2 (None)
66 BUILD_SLICE 2
68 LOAD_CONST 2 (None)
70 LOAD_CONST 2 (None)
72 BUILD_SLICE 2
74 BUILD_TUPLE 4
76 BINARY_SUBSCR
78 STORE_FAST 4 (extended_attention_mask)
80 JUMP_FORWARD 86 (to 168)
819 >> 82 LOAD_FAST 1 (attention_mask)
84 LOAD_METHOD 0 (dim)
86 CALL_METHOD 0
88 LOAD_CONST 1 (2)
90 COMPARE_OP 2 (==)
92 POP_JUMP_IF_FALSE 144
823 94 LOAD_FAST 0 (self)
96 LOAD_ATTR 1 (config)
98 LOAD_ATTR 2 (is_decoder)
100 POP_JUMP_IF_FALSE 118
824 102 LOAD_GLOBAL 6 (ModuleUtilsMixin)
104 LOAD_METHOD 7 (create_extended_attention_mask_for_decoder)
825 106 LOAD_FAST 2 (input_shape)
108 LOAD_FAST 1 (attention_mask)
110 LOAD_FAST 3 (device)
824 112 CALL_METHOD 3
114 STORE_FAST 4 (extended_attention_mask)
116 JUMP_ABSOLUTE 168
828 >> 118 LOAD_FAST 1 (attention_mask)
120 LOAD_CONST 2 (None)
122 LOAD_CONST 2 (None)
124 BUILD_SLICE 2
126 LOAD_CONST 2 (None)
128 LOAD_CONST 2 (None)
130 LOAD_CONST 2 (None)
132 LOAD_CONST 2 (None)
134 BUILD_SLICE 2
136 BUILD_TUPLE 4
138 BINARY_SUBSCR
140 STORE_FAST 4 (extended_attention_mask)
142 JUMP_FORWARD 24 (to 168)
830 >> 144 LOAD_GLOBAL 8 (ValueError)
831 146 LOAD_CONST 5 ('Wrong shape for input_ids (shape ')
148 LOAD_FAST 2 (input_shape)
150 FORMAT_VALUE 0
152 LOAD_CONST 6 (') or attention_mask (shape ')
154 LOAD_FAST 1 (attention_mask)
156 LOAD_ATTR 9 (shape)
158 FORMAT_VALUE 0
160 LOAD_CONST 7 (')')
162 BUILD_STRING 5
830 164 CALL_FUNCTION 1
166 RAISE_VARARGS 1
839 >> 168 LOAD_FAST 4 (extended_attention_mask)
170 LOAD_ATTR 10 (to)
172 LOAD_FAST 0 (self)
174 LOAD_ATTR 11 (dtype)
176 LOAD_CONST 8 (('dtype',))
178 CALL_FUNCTION_KW 1
180 STORE_FAST 4 (extended_attention_mask)
840 182 LOAD_CONST 9 (1.0)
184 LOAD_FAST 4 (extended_attention_mask)
186 BINARY_SUBTRACT
188 LOAD_CONST 10 (-10000.0)
190 BINARY_MULTIPLY
192 STORE_FAST 4 (extended_attention_mask)
841 194 LOAD_FAST 4 (extended_attention_mask)
196 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_mask []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR dim [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 0 [GetAttrVariable(TensorVariable(), dim)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 2 [ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP == [ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 20 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR config [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR is_decoder [HFPretrainedConfigVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_TRUE 40 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST device []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP is not [ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 40 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_mask []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR dim [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 0 [GetAttrVariable(TensorVariable(), dim)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 3 [ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP == [ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 82 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_mask []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR dim [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 0 [GetAttrVariable(TensorVariable(), dim)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 2 [ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP == [ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 144 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR config [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR is_decoder [HFPretrainedConfigVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 118 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_mask []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [TensorVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_SLICE 2 [TensorVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [TensorVariable(), SliceVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [TensorVariable(), SliceVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [TensorVariable(), SliceVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [TensorVariable(), SliceVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_SLICE 2 [TensorVariable(), SliceVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_TUPLE 4 [TensorVariable(), SliceVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType), SliceVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_SUBSCR None [TensorVariable(), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST extended_attention_mask [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE JUMP_FORWARD 168 []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST extended_attention_mask []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR to [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [GetAttrVariable(TensorVariable(), to)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR dtype [GetAttrVariable(TensorVariable(), to), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object dtype at 0x7f6a1009a240, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/modeling_utils.py", line 727>
732 0 LOAD_GLOBAL 0 (get_parameter_dtype)
2 LOAD_FAST 0 (self)
4 CALL_FUNCTION 1
6 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_GLOBAL get_parameter_dtype []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [UserFunctionVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [UserFunctionVariable(), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object get_parameter_dtype at 0x7f6a100f5b30, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/modeling_utils.py", line 153>
157 0 LOAD_CONST 1 (None)
2 STORE_FAST 1 (last_dtype)
158 4 LOAD_FAST 0 (parameter)
6 LOAD_METHOD 0 (parameters)
8 CALL_METHOD 0
10 GET_ITER
>> 12 FOR_ITER 28 (to 42)
14 STORE_FAST 2 (t)
159 16 LOAD_FAST 2 (t)
18 LOAD_ATTR 1 (dtype)
20 STORE_FAST 1 (last_dtype)
160 22 LOAD_FAST 2 (t)
24 LOAD_METHOD 2 (is_floating_point)
26 CALL_METHOD 0
28 POP_JUMP_IF_FALSE 12
161 30 LOAD_FAST 2 (t)
32 LOAD_ATTR 1 (dtype)
34 ROT_TWO
36 POP_TOP
38 RETURN_VALUE
40 JUMP_ABSOLUTE 12
163 >> 42 LOAD_FAST 1 (last_dtype)
44 LOAD_CONST 1 (None)
46 COMPARE_OP 9 (is not)
48 POP_JUMP_IF_FALSE 54
165 50 LOAD_FAST 1 (last_dtype)
52 RETURN_VALUE
169 >> 54 LOAD_GLOBAL 3 (nn)
56 LOAD_ATTR 4 (Module)
58 LOAD_GLOBAL 5 (List)
60 LOAD_GLOBAL 6 (Tuple)
62 LOAD_GLOBAL 7 (str)
64 LOAD_GLOBAL 8 (Tensor)
66 BUILD_TUPLE 2
68 BINARY_SUBSCR
70 BINARY_SUBSCR
72 LOAD_CONST 2 (('module', 'return'))
74 BUILD_CONST_KEY_MAP 2
76 LOAD_CONST 3 (<code object find_tensor_attributes at 0x7f6a100f5a80, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/modeling_utils.py", line 169>)
78 LOAD_CONST 4 ('get_parameter_dtype.<locals>.find_tensor_attributes')
80 MAKE_FUNCTION 4 (annotations)
82 STORE_FAST 3 (find_tensor_attributes)
173 84 LOAD_FAST 0 (parameter)
86 LOAD_ATTR 9 (_named_members)
88 LOAD_FAST 3 (find_tensor_attributes)
90 LOAD_CONST 5 (('get_members_fn',))
92 CALL_FUNCTION_KW 1
94 STORE_FAST 4 (gen)
174 96 LOAD_CONST 1 (None)
98 STORE_FAST 5 (last_tuple)
175 100 LOAD_FAST 4 (gen)
102 GET_ITER
>> 104 FOR_ITER 34 (to 140)
106 STORE_FAST 6 (tuple)
176 108 LOAD_FAST 6 (tuple)
110 STORE_FAST 5 (last_tuple)
177 112 LOAD_FAST 6 (tuple)
114 LOAD_CONST 6 (1)
116 BINARY_SUBSCR
118 LOAD_METHOD 2 (is_floating_point)
120 CALL_METHOD 0
122 POP_JUMP_IF_FALSE 104
178 124 LOAD_FAST 6 (tuple)
126 LOAD_CONST 6 (1)
128 BINARY_SUBSCR
130 LOAD_ATTR 1 (dtype)
132 ROT_TWO
134 POP_TOP
136 RETURN_VALUE
138 JUMP_ABSOLUTE 104
181 >> 140 LOAD_FAST 5 (last_tuple)
142 LOAD_CONST 6 (1)
144 BINARY_SUBSCR
146 LOAD_ATTR 1 (dtype)
148 RETURN_VALUE
150 LOAD_CONST 1 (None)
152 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None []
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST last_dtype [ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST parameter []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR parameters [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 0 [UserMethodVariable(<function Module.parameters at 0x7f6a1d2a6a60>, NNModuleVariable())]
torchdynamo.symbolic_convert: [DEBUG] TRACE GET_ITER None [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE FOR_ITER 42 [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST t [ListIteratorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST t [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR dtype [ListIteratorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST last_dtype [ListIteratorVariable(), TorchVariable(torch.float32)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST t [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR is_floating_point [ListIteratorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 0 [ListIteratorVariable(), GetAttrVariable(TensorVariable(), is_floating_point)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 12 [ListIteratorVariable(), ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST t [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR dtype [ListIteratorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE ROT_TWO None [ListIteratorVariable(), TorchVariable(torch.float32)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_TOP None [TorchVariable(torch.float32), ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TorchVariable(torch.float32)]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object get_parameter_dtype at 0x7f6a100f5b30, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/modeling_utils.py", line 153>
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TorchVariable(torch.float32)]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object dtype at 0x7f6a1009a240, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/modeling_utils.py", line 727>
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST ('dtype',) [GetAttrVariable(TensorVariable(), to), TorchVariable(torch.float32)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION_KW 1 [GetAttrVariable(TensorVariable(), to), TorchVariable(torch.float32), ConstantVariable(tuple)]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST extended_attention_mask [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 1.0 []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST extended_attention_mask [ConstantVariable(float)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_SUBTRACT None [ConstantVariable(float), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST -10000.0 [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_MULTIPLY None [TensorVariable(), ConstantVariable(float)]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST extended_attention_mask [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST extended_attention_mask []
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object get_extended_attention_mask at 0x7f6a1009a450, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/modeling_utils.py", line 794>
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST extended_attention_mask [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR config [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR is_decoder [HFPretrainedConfigVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 436 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None []
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST encoder_extended_attention_mask [ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR get_head_mask [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST head_mask [UserMethodVariable(<function ModuleUtilsMixin.get_head_mask at 0x7f69ff5bcd30>, NNModuleVariable())]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [UserMethodVariable(<function ModuleUtilsMixin.get_head_mask at 0x7f69ff5bcd30>, NNModuleVariable()), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR config [UserMethodVariable(<function ModuleUtilsMixin.get_head_mask at 0x7f69ff5bcd30>, NNModuleVariable()), ConstantVariable(NoneType), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR num_hidden_layers [UserMethodVariable(<function ModuleUtilsMixin.get_head_mask at 0x7f69ff5bcd30>, NNModuleVariable()), ConstantVariable(NoneType), HFPretrainedConfigVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 2 [UserMethodVariable(<function ModuleUtilsMixin.get_head_mask at 0x7f69ff5bcd30>, NNModuleVariable()), ConstantVariable(NoneType), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object get_head_mask at 0x7f6a1009a500, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/modeling_utils.py", line 843>
861 0 LOAD_FAST 1 (head_mask)
2 LOAD_CONST 1 (None)
4 COMPARE_OP 9 (is not)
6 POP_JUMP_IF_FALSE 40
862 8 LOAD_FAST 0 (self)
10 LOAD_METHOD 0 (_convert_head_mask_to_5d)
12 LOAD_FAST 1 (head_mask)
14 LOAD_FAST 2 (num_hidden_layers)
16 CALL_METHOD 2
18 STORE_FAST 1 (head_mask)
863 20 LOAD_FAST 3 (is_attention_chunked)
22 LOAD_CONST 2 (True)
24 COMPARE_OP 8 (is)
26 POP_JUMP_IF_FALSE 50
864 28 LOAD_FAST 1 (head_mask)
30 LOAD_METHOD 1 (unsqueeze)
32 LOAD_CONST 3 (-1)
34 CALL_METHOD 1
36 STORE_FAST 1 (head_mask)
38 JUMP_FORWARD 10 (to 50)
866 >> 40 LOAD_CONST 1 (None)
42 BUILD_LIST 1
44 LOAD_FAST 2 (num_hidden_layers)
46 BINARY_MULTIPLY
48 STORE_FAST 1 (head_mask)
868 >> 50 LOAD_FAST 1 (head_mask)
52 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST head_mask []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP is not [ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 40 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None []
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_LIST 1 [ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST num_hidden_layers [ListVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_MULTIPLY None [ListVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST head_mask [ListVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST head_mask []
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [ListVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object get_head_mask at 0x7f6a1009a500, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/modeling_utils.py", line 843>
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST head_mask [ListVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR embeddings [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST input_ids [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST position_ids [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST token_type_ids [NNModuleVariable(), TensorVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST inputs_embeds [NNModuleVariable(), TensorVariable(), ConstantVariable(NoneType), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST past_key_values_length [NNModuleVariable(), TensorVariable(), ConstantVariable(NoneType), TensorVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST ('input_ids', 'position_ids', 'token_type_ids', 'inputs_embeds', 'past_key_values_length') [NNModuleVariable(), TensorVariable(), ConstantVariable(NoneType), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION_KW 5 [NNModuleVariable(), TensorVariable(), ConstantVariable(NoneType), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(int), ConstantVariable(tuple)]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object forward at 0x7f6a10140450, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 205>
213 0 LOAD_FAST 1 (input_ids)
2 LOAD_CONST 0 (None)
4 COMPARE_OP 9 (is not)
6 POP_JUMP_IF_FALSE 18
214 8 LOAD_FAST 1 (input_ids)
10 LOAD_METHOD 0 (size)
12 CALL_METHOD 0
14 STORE_FAST 6 (input_shape)
16 JUMP_FORWARD 16 (to 34)
216 >> 18 LOAD_FAST 4 (inputs_embeds)
20 LOAD_METHOD 0 (size)
22 CALL_METHOD 0
24 LOAD_CONST 0 (None)
26 LOAD_CONST 1 (-1)
28 BUILD_SLICE 2
30 BINARY_SUBSCR
32 STORE_FAST 6 (input_shape)
218 >> 34 LOAD_FAST 6 (input_shape)
36 LOAD_CONST 2 (1)
38 BINARY_SUBSCR
40 STORE_FAST 7 (seq_length)
220 42 LOAD_FAST 3 (position_ids)
44 LOAD_CONST 0 (None)
46 COMPARE_OP 8 (is)
48 POP_JUMP_IF_FALSE 76
221 50 LOAD_FAST 0 (self)
52 LOAD_ATTR 1 (position_ids)
54 LOAD_CONST 0 (None)
56 LOAD_CONST 0 (None)
58 BUILD_SLICE 2
60 LOAD_FAST 5 (past_key_values_length)
62 LOAD_FAST 7 (seq_length)
64 LOAD_FAST 5 (past_key_values_length)
66 BINARY_ADD
68 BUILD_SLICE 2
70 BUILD_TUPLE 2
72 BINARY_SUBSCR
74 STORE_FAST 3 (position_ids)
226 >> 76 LOAD_FAST 2 (token_type_ids)
78 LOAD_CONST 0 (None)
80 COMPARE_OP 8 (is)
82 POP_JUMP_IF_FALSE 160
227 84 LOAD_GLOBAL 2 (hasattr)
86 LOAD_FAST 0 (self)
88 LOAD_CONST 3 ('token_type_ids')
90 CALL_FUNCTION 2
92 POP_JUMP_IF_FALSE 138
228 94 LOAD_FAST 0 (self)
96 LOAD_ATTR 3 (token_type_ids)
98 LOAD_CONST 0 (None)
100 LOAD_CONST 0 (None)
102 BUILD_SLICE 2
104 LOAD_CONST 0 (None)
106 LOAD_FAST 7 (seq_length)
108 BUILD_SLICE 2
110 BUILD_TUPLE 2
112 BINARY_SUBSCR
114 STORE_FAST 8 (buffered_token_type_ids)
229 116 LOAD_FAST 8 (buffered_token_type_ids)
118 LOAD_METHOD 4 (expand)
120 LOAD_FAST 6 (input_shape)
122 LOAD_CONST 4 (0)
124 BINARY_SUBSCR
126 LOAD_FAST 7 (seq_length)
128 CALL_METHOD 2
130 STORE_FAST 9 (buffered_token_type_ids_expanded)
230 132 LOAD_FAST 9 (buffered_token_type_ids_expanded)
134 STORE_FAST 2 (token_type_ids)
136 JUMP_FORWARD 22 (to 160)
232 >> 138 LOAD_GLOBAL 5 (torch)
140 LOAD_ATTR 6 (zeros)
142 LOAD_FAST 6 (input_shape)
144 LOAD_GLOBAL 5 (torch)
146 LOAD_ATTR 7 (long)
148 LOAD_FAST 0 (self)
150 LOAD_ATTR 1 (position_ids)
152 LOAD_ATTR 8 (device)
154 LOAD_CONST 5 (('dtype', 'device'))
156 CALL_FUNCTION_KW 3
158 STORE_FAST 2 (token_type_ids)
234 >> 160 LOAD_FAST 4 (inputs_embeds)
162 LOAD_CONST 0 (None)
164 COMPARE_OP 8 (is)
166 POP_JUMP_IF_FALSE 178
235 168 LOAD_FAST 0 (self)
170 LOAD_METHOD 9 (word_embeddings)
172 LOAD_FAST 1 (input_ids)
174 CALL_METHOD 1
176 STORE_FAST 4 (inputs_embeds)
236 >> 178 LOAD_FAST 0 (self)
180 LOAD_METHOD 10 (token_type_embeddings)
182 LOAD_FAST 2 (token_type_ids)
184 CALL_METHOD 1
186 STORE_FAST 10 (token_type_embeddings)
238 188 LOAD_FAST 4 (inputs_embeds)
190 LOAD_FAST 10 (token_type_embeddings)
192 BINARY_ADD
194 STORE_FAST 11 (embeddings)
239 196 LOAD_FAST 0 (self)
198 LOAD_ATTR 11 (position_embedding_type)
200 LOAD_CONST 6 ('absolute')
202 COMPARE_OP 2 (==)
204 POP_JUMP_IF_FALSE 224
240 206 LOAD_FAST 0 (self)
208 LOAD_METHOD 12 (position_embeddings)
210 LOAD_FAST 3 (position_ids)
212 CALL_METHOD 1
214 STORE_FAST 12 (position_embeddings)
241 216 LOAD_FAST 11 (embeddings)
218 LOAD_FAST 12 (position_embeddings)
220 INPLACE_ADD
222 STORE_FAST 11 (embeddings)
242 >> 224 LOAD_FAST 0 (self)
226 LOAD_METHOD 13 (LayerNorm)
228 LOAD_FAST 11 (embeddings)
230 CALL_METHOD 1
232 STORE_FAST 11 (embeddings)
243 234 LOAD_FAST 0 (self)
236 LOAD_METHOD 14 (dropout)
238 LOAD_FAST 11 (embeddings)
240 CALL_METHOD 1
242 STORE_FAST 11 (embeddings)
244 244 LOAD_FAST 11 (embeddings)
246 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST input_ids []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP is not [TensorVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 18 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST input_ids []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR size [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 0 [GetAttrVariable(TensorVariable(), size)]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST input_shape [SizeVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE JUMP_FORWARD 34 []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST input_shape []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 1 [SizeVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_SUBSCR None [SizeVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST seq_length [ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST position_ids []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP is [ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 76 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR position_ids [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [TensorVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_SLICE 2 [TensorVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST past_key_values_length [TensorVariable(), SliceVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST seq_length [TensorVariable(), SliceVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST past_key_values_length [TensorVariable(), SliceVariable(), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_ADD None [TensorVariable(), SliceVariable(), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_SLICE 2 [TensorVariable(), SliceVariable(), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_TUPLE 2 [TensorVariable(), SliceVariable(), SliceVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_SUBSCR None [TensorVariable(), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST position_ids [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST token_type_ids []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP is [TensorVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 160 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST inputs_embeds []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP is [ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 178 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR word_embeddings [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST input_ids [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST inputs_embeds [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR token_type_embeddings [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST token_type_ids [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST token_type_embeddings [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST inputs_embeds []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST token_type_embeddings [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_ADD None [TensorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST embeddings [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR position_embedding_type [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST absolute [ConstantVariable(str)]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP == [ConstantVariable(str), ConstantVariable(str)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 224 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR position_embeddings [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST position_ids [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST position_embeddings [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST embeddings []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST position_embeddings [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE INPLACE_ADD None [TensorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST embeddings [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR LayerNorm [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST embeddings [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST embeddings [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR dropout [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST embeddings [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST embeddings [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST embeddings []
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object forward at 0x7f6a10140450, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 205>
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST embedding_output [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR encoder [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST embedding_output [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST extended_attention_mask [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST head_mask [NNModuleVariable(), TensorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST encoder_hidden_states [NNModuleVariable(), TensorVariable(), TensorVariable(), ListVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST encoder_extended_attention_mask [NNModuleVariable(), TensorVariable(), TensorVariable(), ListVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST past_key_values [NNModuleVariable(), TensorVariable(), TensorVariable(), ListVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST use_cache [NNModuleVariable(), TensorVariable(), TensorVariable(), ListVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST output_attentions [NNModuleVariable(), TensorVariable(), TensorVariable(), ListVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST output_hidden_states [NNModuleVariable(), TensorVariable(), TensorVariable(), ListVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(bool), ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST return_dict [NNModuleVariable(), TensorVariable(), TensorVariable(), ListVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(bool), ConstantVariable(bool), ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST ('attention_mask', 'head_mask', 'encoder_hidden_states', 'encoder_attention_mask', 'past_key_values', 'use_cache', 'output_attentions', 'output_hidden_states', 'return_dict') [NNModuleVariable(), TensorVariable(), TensorVariable(), ListVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(bool), ConstantVariable(bool), ConstantVariable(bool), ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION_KW 10 [NNModuleVariable(), TensorVariable(), TensorVariable(), ListVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(bool), ConstantVariable(bool), ConstantVariable(bool), ConstantVariable(bool), ConstantVariable(tuple)]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object forward at 0x7f6a100c8920, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 559>
572 0 LOAD_FAST 9 (output_hidden_states)
2 POP_JUMP_IF_FALSE 8
4 LOAD_CONST 1 (())
6 JUMP_FORWARD 2 (to 10)
>> 8 LOAD_CONST 0 (None)
>> 10 STORE_FAST 11 (all_hidden_states)
573 12 LOAD_DEREF 0 (output_attentions)
14 POP_JUMP_IF_FALSE 20
16 LOAD_CONST 1 (())
18 JUMP_FORWARD 2 (to 22)
>> 20 LOAD_CONST 0 (None)
>> 22 STORE_FAST 12 (all_self_attentions)
574 24 LOAD_DEREF 0 (output_attentions)
26 POP_JUMP_IF_FALSE 40
28 LOAD_FAST 0 (self)
30 LOAD_ATTR 0 (config)
32 LOAD_ATTR 1 (add_cross_attention)
34 POP_JUMP_IF_FALSE 40
36 LOAD_CONST 1 (())
38 JUMP_FORWARD 2 (to 42)
>> 40 LOAD_CONST 0 (None)
>> 42 STORE_FAST 13 (all_cross_attentions)
576 44 LOAD_FAST 7 (use_cache)
46 POP_JUMP_IF_FALSE 52
48 LOAD_CONST 1 (())
50 JUMP_FORWARD 2 (to 54)
>> 52 LOAD_CONST 0 (None)
>> 54 STORE_FAST 14 (next_decoder_cache)
577 56 LOAD_GLOBAL 2 (enumerate)
58 LOAD_FAST 0 (self)
60 LOAD_ATTR 3 (layer)
62 CALL_FUNCTION 1
64 GET_ITER
>> 66 FOR_ITER 222 (to 290)
68 UNPACK_SEQUENCE 2
70 STORE_FAST 15 (i)
72 STORE_FAST 16 (layer_module)
578 74 LOAD_FAST 9 (output_hidden_states)
76 POP_JUMP_IF_FALSE 88
579 78 LOAD_FAST 11 (all_hidden_states)
80 LOAD_FAST 1 (hidden_states)
82 BUILD_TUPLE 1
84 BINARY_ADD
86 STORE_FAST 11 (all_hidden_states)
581 >> 88 LOAD_FAST 3 (head_mask)
90 LOAD_CONST 0 (None)
92 COMPARE_OP 9 (is not)
94 POP_JUMP_IF_FALSE 104
96 LOAD_FAST 3 (head_mask)
98 LOAD_FAST 15 (i)
100 BINARY_SUBSCR
102 JUMP_FORWARD 2 (to 106)
>> 104 LOAD_CONST 0 (None)
>> 106 STORE_FAST 17 (layer_head_mask)
582 108 LOAD_FAST 6 (past_key_values)
110 LOAD_CONST 0 (None)
112 COMPARE_OP 9 (is not)
114 POP_JUMP_IF_FALSE 124
116 LOAD_FAST 6 (past_key_values)
118 LOAD_FAST 15 (i)
120 BINARY_SUBSCR
122 JUMP_FORWARD 2 (to 126)
>> 124 LOAD_CONST 0 (None)
>> 126 STORE_DEREF 1 (past_key_value)
584 128 LOAD_FAST 0 (self)
130 LOAD_ATTR 4 (gradient_checkpointing)
132 POP_JUMP_IF_FALSE 202
134 LOAD_FAST 0 (self)
136 LOAD_ATTR 5 (training)
138 POP_JUMP_IF_FALSE 202
586 140 LOAD_FAST 7 (use_cache)
142 POP_JUMP_IF_FALSE 158
587 144 LOAD_GLOBAL 6 (logger)
146 LOAD_METHOD 7 (warning)
588 148 LOAD_CONST 2 ('`use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...')
587 150 CALL_METHOD 1
152 POP_TOP
590 154 LOAD_CONST 3 (False)
156 STORE_FAST 7 (use_cache)
592 >> 158 LOAD_CLOSURE 0 (output_attentions)
160 LOAD_CLOSURE 1 (past_key_value)
162 BUILD_TUPLE 2
164 LOAD_CONST 4 (<code object create_custom_forward at 0x7f6a100c87c0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 592>)
166 LOAD_CONST 5 ('BertEncoder.forward.<locals>.create_custom_forward')
168 MAKE_FUNCTION 8 (closure)
170 STORE_FAST 18 (create_custom_forward)
598 172 LOAD_GLOBAL 8 (torch)
174 LOAD_ATTR 9 (utils)
176 LOAD_ATTR 10 (checkpoint)
178 LOAD_METHOD 10 (checkpoint)
599 180 LOAD_FAST 18 (create_custom_forward)
182 LOAD_FAST 16 (layer_module)
184 CALL_FUNCTION 1
600 186 LOAD_FAST 1 (hidden_states)
601 188 LOAD_FAST 2 (attention_mask)
602 190 LOAD_FAST 17 (layer_head_mask)
603 192 LOAD_FAST 4 (encoder_hidden_states)
604 194 LOAD_FAST 5 (encoder_attention_mask)
598 196 CALL_METHOD 6
198 STORE_FAST 19 (layer_outputs)
200 JUMP_FORWARD 20 (to 222)
607 >> 202 LOAD_FAST 16 (layer_module)
608 204 LOAD_FAST 1 (hidden_states)
609 206 LOAD_FAST 2 (attention_mask)
610 208 LOAD_FAST 17 (layer_head_mask)
611 210 LOAD_FAST 4 (encoder_hidden_states)
612 212 LOAD_FAST 5 (encoder_attention_mask)
613 214 LOAD_DEREF 1 (past_key_value)
614 216 LOAD_DEREF 0 (output_attentions)
607 218 CALL_FUNCTION 7
220 STORE_FAST 19 (layer_outputs)
617 >> 222 LOAD_FAST 19 (layer_outputs)
224 LOAD_CONST 6 (0)
226 BINARY_SUBSCR
228 STORE_FAST 1 (hidden_states)
618 230 LOAD_FAST 7 (use_cache)
232 POP_JUMP_IF_FALSE 248
619 234 LOAD_FAST 14 (next_decoder_cache)
236 LOAD_FAST 19 (layer_outputs)
238 LOAD_CONST 7 (-1)
240 BINARY_SUBSCR
242 BUILD_TUPLE 1
244 INPLACE_ADD
246 STORE_FAST 14 (next_decoder_cache)
620 >> 248 LOAD_DEREF 0 (output_attentions)
250 POP_JUMP_IF_FALSE 66
621 252 LOAD_FAST 12 (all_self_attentions)
254 LOAD_FAST 19 (layer_outputs)
256 LOAD_CONST 8 (1)
258 BINARY_SUBSCR
260 BUILD_TUPLE 1
262 BINARY_ADD
264 STORE_FAST 12 (all_self_attentions)
622 266 LOAD_FAST 0 (self)
268 LOAD_ATTR 0 (config)
270 LOAD_ATTR 1 (add_cross_attention)
272 POP_JUMP_IF_FALSE 66
623 274 LOAD_FAST 13 (all_cross_attentions)
276 LOAD_FAST 19 (layer_outputs)
278 LOAD_CONST 9 (2)
280 BINARY_SUBSCR
282 BUILD_TUPLE 1
284 BINARY_ADD
286 STORE_FAST 13 (all_cross_attentions)
288 JUMP_ABSOLUTE 66
625 >> 290 LOAD_FAST 9 (output_hidden_states)
292 EXTENDED_ARG 1
294 POP_JUMP_IF_FALSE 306
626 296 LOAD_FAST 11 (all_hidden_states)
298 LOAD_FAST 1 (hidden_states)
300 BUILD_TUPLE 1
302 BINARY_ADD
304 STORE_FAST 11 (all_hidden_states)
628 >> 306 LOAD_FAST 10 (return_dict)
308 EXTENDED_ARG 1
310 POP_JUMP_IF_TRUE 340
629 312 LOAD_GLOBAL 11 (tuple)
314 LOAD_CONST 10 (<code object <genexpr> at 0x7f6a100c8870, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 629>)
316 LOAD_CONST 11 ('BertEncoder.forward.<locals>.<genexpr>')
318 MAKE_FUNCTION 0
632 320 LOAD_FAST 1 (hidden_states)
633 322 LOAD_FAST 14 (next_decoder_cache)
634 324 LOAD_FAST 11 (all_hidden_states)
635 326 LOAD_FAST 12 (all_self_attentions)
636 328 LOAD_FAST 13 (all_cross_attentions)
631 330 BUILD_TUPLE 5
629 332 GET_ITER
334 CALL_FUNCTION 1
336 CALL_FUNCTION 1
338 RETURN_VALUE
640 >> 340 LOAD_GLOBAL 12 (BaseModelOutputWithPastAndCrossAttentions)
641 342 LOAD_FAST 1 (hidden_states)
642 344 LOAD_FAST 14 (next_decoder_cache)
643 346 LOAD_FAST 11 (all_hidden_states)
644 348 LOAD_FAST 12 (all_self_attentions)
645 350 LOAD_FAST 13 (all_cross_attentions)
640 352 LOAD_CONST 12 (('last_hidden_state', 'past_key_values', 'hidden_states', 'attentions', 'cross_attentions'))
354 CALL_FUNCTION_KW 5
356 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST output_hidden_states []
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 8 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None []
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST all_hidden_states [ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_DEREF output_attentions []
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 20 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None []
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST all_self_attentions [ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_DEREF output_attentions []
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 40 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None []
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST all_cross_attentions [ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST use_cache []
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 52 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None []
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST next_decoder_cache [ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_GLOBAL enumerate []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [BuiltinVariable(enumerate)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR layer [BuiltinVariable(enumerate), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [BuiltinVariable(enumerate), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE GET_ITER None [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE FOR_ITER 290 [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE UNPACK_SEQUENCE 2 [ListIteratorVariable(), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST i [ListIteratorVariable(), NNModuleVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST layer_module [ListIteratorVariable(), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST output_hidden_states [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 88 [ListIteratorVariable(), ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST head_mask [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [ListIteratorVariable(), ListVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP is not [ListIteratorVariable(), ListVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 104 [ListIteratorVariable(), ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST head_mask [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST i [ListIteratorVariable(), ListVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_SUBSCR None [ListIteratorVariable(), ListVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE JUMP_FORWARD 106 [ListIteratorVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST layer_head_mask [ListIteratorVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST past_key_values [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [ListIteratorVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP is not [ListIteratorVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 124 [ListIteratorVariable(), ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_DEREF past_key_value [ListIteratorVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR gradient_checkpointing [ListIteratorVariable(), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 202 [ListIteratorVariable(), ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST layer_module [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [ListIteratorVariable(), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_mask [ListIteratorVariable(), NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST layer_head_mask [ListIteratorVariable(), NNModuleVariable(), TensorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST encoder_hidden_states [ListIteratorVariable(), NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST encoder_attention_mask [ListIteratorVariable(), NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_DEREF past_key_value [ListIteratorVariable(), NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_DEREF output_attentions [ListIteratorVariable(), NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 7 [ListIteratorVariable(), NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object forward at 0x7f6a100c82f0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 481>
492 0 LOAD_FAST 6 (past_key_value)
2 LOAD_CONST 0 (None)
4 COMPARE_OP 9 (is not)
6 POP_JUMP_IF_FALSE 20
8 LOAD_FAST 6 (past_key_value)
10 LOAD_CONST 0 (None)
12 LOAD_CONST 1 (2)
14 BUILD_SLICE 2
16 BINARY_SUBSCR
18 JUMP_FORWARD 2 (to 22)
>> 20 LOAD_CONST 0 (None)
>> 22 STORE_FAST 8 (self_attn_past_key_value)
493 24 LOAD_FAST 0 (self)
26 LOAD_ATTR 0 (attention)
494 28 LOAD_FAST 1 (hidden_states)
495 30 LOAD_FAST 2 (attention_mask)
496 32 LOAD_FAST 3 (head_mask)
497 34 LOAD_FAST 7 (output_attentions)
498 36 LOAD_FAST 8 (self_attn_past_key_value)
493 38 LOAD_CONST 2 (('output_attentions', 'past_key_value'))
40 CALL_FUNCTION_KW 5
42 STORE_FAST 9 (self_attention_outputs)
500 44 LOAD_FAST 9 (self_attention_outputs)
46 LOAD_CONST 3 (0)
48 BINARY_SUBSCR
50 STORE_FAST 10 (attention_output)
503 52 LOAD_FAST 0 (self)
54 LOAD_ATTR 1 (is_decoder)
56 POP_JUMP_IF_FALSE 80
504 58 LOAD_FAST 9 (self_attention_outputs)
60 LOAD_CONST 4 (1)
62 LOAD_CONST 5 (-1)
64 BUILD_SLICE 2
66 BINARY_SUBSCR
68 STORE_FAST 11 (outputs)
505 70 LOAD_FAST 9 (self_attention_outputs)
72 LOAD_CONST 5 (-1)
74 BINARY_SUBSCR
76 STORE_FAST 12 (present_key_value)
78 JUMP_FORWARD 12 (to 92)
507 >> 80 LOAD_FAST 9 (self_attention_outputs)
82 LOAD_CONST 4 (1)
84 LOAD_CONST 0 (None)
86 BUILD_SLICE 2
88 BINARY_SUBSCR
90 STORE_FAST 11 (outputs)
509 >> 92 LOAD_CONST 0 (None)
94 STORE_FAST 13 (cross_attn_present_key_value)
510 96 LOAD_FAST 0 (self)
98 LOAD_ATTR 1 (is_decoder)
100 POP_JUMP_IF_FALSE 222
102 LOAD_FAST 4 (encoder_hidden_states)
104 LOAD_CONST 0 (None)
106 COMPARE_OP 9 (is not)
108 POP_JUMP_IF_FALSE 222
511 110 LOAD_GLOBAL 2 (hasattr)
112 LOAD_FAST 0 (self)
114 LOAD_CONST 6 ('crossattention')
116 CALL_FUNCTION 2
118 POP_JUMP_IF_TRUE 136
512 120 LOAD_GLOBAL 3 (ValueError)
513 122 LOAD_CONST 7 ('If `encoder_hidden_states` are passed, ')
124 LOAD_FAST 0 (self)
126 FORMAT_VALUE 0
128 LOAD_CONST 8 (' has to be instantiated with cross-attention layers by setting `config.add_cross_attention=True`')
130 BUILD_STRING 3
512 132 CALL_FUNCTION 1
134 RAISE_VARARGS 1
518 >> 136 LOAD_FAST 6 (past_key_value)
138 LOAD_CONST 0 (None)
140 COMPARE_OP 9 (is not)
142 POP_JUMP_IF_FALSE 156
144 LOAD_FAST 6 (past_key_value)
146 LOAD_CONST 9 (-2)
148 LOAD_CONST 0 (None)
150 BUILD_SLICE 2
152 BINARY_SUBSCR
154 JUMP_FORWARD 2 (to 158)
>> 156 LOAD_CONST 0 (None)
>> 158 STORE_FAST 14 (cross_attn_past_key_value)
519 160 LOAD_FAST 0 (self)
162 LOAD_METHOD 4 (crossattention)
520 164 LOAD_FAST 10 (attention_output)
521 166 LOAD_FAST 2 (attention_mask)
522 168 LOAD_FAST 3 (head_mask)
523 170 LOAD_FAST 4 (encoder_hidden_states)
524 172 LOAD_FAST 5 (encoder_attention_mask)
525 174 LOAD_FAST 14 (cross_attn_past_key_value)
526 176 LOAD_FAST 7 (output_attentions)
519 178 CALL_METHOD 7
180 STORE_FAST 15 (cross_attention_outputs)
528 182 LOAD_FAST 15 (cross_attention_outputs)
184 LOAD_CONST 3 (0)
186 BINARY_SUBSCR
188 STORE_FAST 10 (attention_output)
529 190 LOAD_FAST 11 (outputs)
192 LOAD_FAST 15 (cross_attention_outputs)
194 LOAD_CONST 4 (1)
196 LOAD_CONST 5 (-1)
198 BUILD_SLICE 2
200 BINARY_SUBSCR
202 BINARY_ADD
204 STORE_FAST 11 (outputs)
532 206 LOAD_FAST 15 (cross_attention_outputs)
208 LOAD_CONST 5 (-1)
210 BINARY_SUBSCR
212 STORE_FAST 13 (cross_attn_present_key_value)
533 214 LOAD_FAST 12 (present_key_value)
216 LOAD_FAST 13 (cross_attn_present_key_value)
218 BINARY_ADD
220 STORE_FAST 12 (present_key_value)
535 >> 222 LOAD_GLOBAL 5 (apply_chunking_to_forward)
536 224 LOAD_FAST 0 (self)
226 LOAD_ATTR 6 (feed_forward_chunk)
228 LOAD_FAST 0 (self)
230 LOAD_ATTR 7 (chunk_size_feed_forward)
232 LOAD_FAST 0 (self)
234 LOAD_ATTR 8 (seq_len_dim)
236 LOAD_FAST 10 (attention_output)
535 238 CALL_FUNCTION 4
240 STORE_FAST 16 (layer_output)
538 242 LOAD_FAST 16 (layer_output)
244 BUILD_TUPLE 1
246 LOAD_FAST 11 (outputs)
248 BINARY_ADD
250 STORE_FAST 11 (outputs)
541 252 LOAD_FAST 0 (self)
254 LOAD_ATTR 1 (is_decoder)
256 EXTENDED_ARG 1
258 POP_JUMP_IF_FALSE 270
542 260 LOAD_FAST 11 (outputs)
262 LOAD_FAST 12 (present_key_value)
264 BUILD_TUPLE 1
266 BINARY_ADD
268 STORE_FAST 11 (outputs)
544 >> 270 LOAD_FAST 11 (outputs)
272 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST past_key_value []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP is not [ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 20 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None []
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST self_attn_past_key_value [ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR attention [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_mask [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST head_mask [NNModuleVariable(), TensorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST output_attentions [NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self_attn_past_key_value [NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST ('output_attentions', 'past_key_value') [NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(bool), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION_KW 5 [NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(bool), ConstantVariable(NoneType), ConstantVariable(tuple)]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object forward at 0x7f6a10140c90, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 413>
423 0 LOAD_FAST 0 (self)
2 LOAD_METHOD 0 (self)
424 4 LOAD_FAST 1 (hidden_states)
425 6 LOAD_FAST 2 (attention_mask)
426 8 LOAD_FAST 3 (head_mask)
427 10 LOAD_FAST 4 (encoder_hidden_states)
428 12 LOAD_FAST 5 (encoder_attention_mask)
429 14 LOAD_FAST 6 (past_key_value)
430 16 LOAD_FAST 7 (output_attentions)
423 18 CALL_METHOD 7
20 STORE_FAST 8 (self_outputs)
432 22 LOAD_FAST 0 (self)
24 LOAD_METHOD 1 (output)
26 LOAD_FAST 8 (self_outputs)
28 LOAD_CONST 1 (0)
30 BINARY_SUBSCR
32 LOAD_FAST 1 (hidden_states)
34 CALL_METHOD 2
36 STORE_FAST 9 (attention_output)
433 38 LOAD_FAST 9 (attention_output)
40 BUILD_TUPLE 1
42 LOAD_FAST 8 (self_outputs)
44 LOAD_CONST 2 (1)
46 LOAD_CONST 0 (None)
48 BUILD_SLICE 2
50 BINARY_SUBSCR
52 BINARY_ADD
54 STORE_FAST 10 (outputs)
434 56 LOAD_FAST 10 (outputs)
58 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR self [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_mask [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST head_mask [NNModuleVariable(), TensorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST encoder_hidden_states [NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST encoder_attention_mask [NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST past_key_value [NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST output_attentions [NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 7 [NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object forward at 0x7f6a101407c0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 279>
289 0 LOAD_FAST 0 (self)
2 LOAD_METHOD 0 (query)
4 LOAD_FAST 1 (hidden_states)
6 CALL_METHOD 1
8 STORE_FAST 8 (mixed_query_layer)
294 10 LOAD_FAST 4 (encoder_hidden_states)
12 LOAD_CONST 0 (None)
14 COMPARE_OP 9 (is not)
16 STORE_FAST 9 (is_cross_attention)
296 18 LOAD_FAST 9 (is_cross_attention)
20 POP_JUMP_IF_FALSE 52
22 LOAD_FAST 6 (past_key_value)
24 LOAD_CONST 0 (None)
26 COMPARE_OP 9 (is not)
28 POP_JUMP_IF_FALSE 52
298 30 LOAD_FAST 6 (past_key_value)
32 LOAD_CONST 1 (0)
34 BINARY_SUBSCR
36 STORE_FAST 10 (key_layer)
299 38 LOAD_FAST 6 (past_key_value)
40 LOAD_CONST 2 (1)
42 BINARY_SUBSCR
44 STORE_FAST 11 (value_layer)
300 46 LOAD_FAST 5 (encoder_attention_mask)
48 STORE_FAST 2 (attention_mask)
50 JUMP_FORWARD 160 (to 212)
301 >> 52 LOAD_FAST 9 (is_cross_attention)
54 POP_JUMP_IF_FALSE 94
302 56 LOAD_FAST 0 (self)
58 LOAD_METHOD 1 (transpose_for_scores)
60 LOAD_FAST 0 (self)
62 LOAD_METHOD 2 (key)
64 LOAD_FAST 4 (encoder_hidden_states)
66 CALL_METHOD 1
68 CALL_METHOD 1
70 STORE_FAST 10 (key_layer)
303 72 LOAD_FAST 0 (self)
74 LOAD_METHOD 1 (transpose_for_scores)
76 LOAD_FAST 0 (self)
78 LOAD_METHOD 3 (value)
80 LOAD_FAST 4 (encoder_hidden_states)
82 CALL_METHOD 1
84 CALL_METHOD 1
86 STORE_FAST 11 (value_layer)
304 88 LOAD_FAST 5 (encoder_attention_mask)
90 STORE_FAST 2 (attention_mask)
92 JUMP_FORWARD 118 (to 212)
305 >> 94 LOAD_FAST 6 (past_key_value)
96 LOAD_CONST 0 (None)
98 COMPARE_OP 9 (is not)
100 POP_JUMP_IF_FALSE 180
306 102 LOAD_FAST 0 (self)
104 LOAD_METHOD 1 (transpose_for_scores)
106 LOAD_FAST 0 (self)
108 LOAD_METHOD 2 (key)
110 LOAD_FAST 1 (hidden_states)
112 CALL_METHOD 1
114 CALL_METHOD 1
116 STORE_FAST 10 (key_layer)
307 118 LOAD_FAST 0 (self)
120 LOAD_METHOD 1 (transpose_for_scores)
122 LOAD_FAST 0 (self)
124 LOAD_METHOD 3 (value)
126 LOAD_FAST 1 (hidden_states)
128 CALL_METHOD 1
130 CALL_METHOD 1
132 STORE_FAST 11 (value_layer)
308 134 LOAD_GLOBAL 4 (torch)
136 LOAD_ATTR 5 (cat)
138 LOAD_FAST 6 (past_key_value)
140 LOAD_CONST 1 (0)
142 BINARY_SUBSCR
144 LOAD_FAST 10 (key_layer)
146 BUILD_LIST 2
148 LOAD_CONST 3 (2)
150 LOAD_CONST 4 (('dim',))
152 CALL_FUNCTION_KW 2
154 STORE_FAST 10 (key_layer)
309 156 LOAD_GLOBAL 4 (torch)
158 LOAD_ATTR 5 (cat)
160 LOAD_FAST 6 (past_key_value)
162 LOAD_CONST 2 (1)
164 BINARY_SUBSCR
166 LOAD_FAST 11 (value_layer)
168 BUILD_LIST 2
170 LOAD_CONST 3 (2)
172 LOAD_CONST 4 (('dim',))
174 CALL_FUNCTION_KW 2
176 STORE_FAST 11 (value_layer)
178 JUMP_FORWARD 32 (to 212)
311 >> 180 LOAD_FAST 0 (self)
182 LOAD_METHOD 1 (transpose_for_scores)
184 LOAD_FAST 0 (self)
186 LOAD_METHOD 2 (key)
188 LOAD_FAST 1 (hidden_states)
190 CALL_METHOD 1
192 CALL_METHOD 1
194 STORE_FAST 10 (key_layer)
312 196 LOAD_FAST 0 (self)
198 LOAD_METHOD 1 (transpose_for_scores)
200 LOAD_FAST 0 (self)
202 LOAD_METHOD 3 (value)
204 LOAD_FAST 1 (hidden_states)
206 CALL_METHOD 1
208 CALL_METHOD 1
210 STORE_FAST 11 (value_layer)
314 >> 212 LOAD_FAST 0 (self)
214 LOAD_METHOD 1 (transpose_for_scores)
216 LOAD_FAST 8 (mixed_query_layer)
218 CALL_METHOD 1
220 STORE_FAST 12 (query_layer)
316 222 LOAD_FAST 0 (self)
224 LOAD_ATTR 6 (is_decoder)
226 POP_JUMP_IF_FALSE 236
324 228 LOAD_FAST 10 (key_layer)
230 LOAD_FAST 11 (value_layer)
232 BUILD_TUPLE 2
234 STORE_FAST 6 (past_key_value)
327 >> 236 LOAD_GLOBAL 4 (torch)
238 LOAD_METHOD 7 (matmul)
240 LOAD_FAST 12 (query_layer)
242 LOAD_FAST 10 (key_layer)
244 LOAD_METHOD 8 (transpose)
246 LOAD_CONST 5 (-1)
248 LOAD_CONST 6 (-2)
250 CALL_METHOD 2
252 CALL_METHOD 2
254 STORE_FAST 13 (attention_scores)
329 256 LOAD_FAST 0 (self)
258 LOAD_ATTR 9 (position_embedding_type)
260 LOAD_CONST 7 ('relative_key')
262 COMPARE_OP 2 (==)
264 EXTENDED_ARG 1
266 POP_JUMP_IF_TRUE 280
268 LOAD_FAST 0 (self)
270 LOAD_ATTR 9 (position_embedding_type)
272 LOAD_CONST 8 ('relative_key_query')
274 COMPARE_OP 2 (==)
276 EXTENDED_ARG 1
278 POP_JUMP_IF_FALSE 478
330 >> 280 LOAD_FAST 1 (hidden_states)
282 LOAD_METHOD 10 (size)
284 CALL_METHOD 0
286 LOAD_CONST 2 (1)
288 BINARY_SUBSCR
290 STORE_FAST 14 (seq_length)
331 292 LOAD_GLOBAL 4 (torch)
294 LOAD_ATTR 11 (arange)
296 LOAD_FAST 14 (seq_length)
298 LOAD_GLOBAL 4 (torch)
300 LOAD_ATTR 12 (long)
302 LOAD_FAST 1 (hidden_states)
304 LOAD_ATTR 13 (device)
306 LOAD_CONST 9 (('dtype', 'device'))
308 CALL_FUNCTION_KW 3
310 LOAD_METHOD 14 (view)
312 LOAD_CONST 5 (-1)
314 LOAD_CONST 2 (1)
316 CALL_METHOD 2
318 STORE_FAST 15 (position_ids_l)
332 320 LOAD_GLOBAL 4 (torch)
322 LOAD_ATTR 11 (arange)
324 LOAD_FAST 14 (seq_length)
326 LOAD_GLOBAL 4 (torch)
328 LOAD_ATTR 12 (long)
330 LOAD_FAST 1 (hidden_states)
332 LOAD_ATTR 13 (device)
334 LOAD_CONST 9 (('dtype', 'device'))
336 CALL_FUNCTION_KW 3
338 LOAD_METHOD 14 (view)
340 LOAD_CONST 2 (1)
342 LOAD_CONST 5 (-1)
344 CALL_METHOD 2
346 STORE_FAST 16 (position_ids_r)
333 348 LOAD_FAST 15 (position_ids_l)
350 LOAD_FAST 16 (position_ids_r)
352 BINARY_SUBTRACT
354 STORE_FAST 17 (distance)
334 356 LOAD_FAST 0 (self)
358 LOAD_METHOD 15 (distance_embedding)
360 LOAD_FAST 17 (distance)
362 LOAD_FAST 0 (self)
364 LOAD_ATTR 16 (max_position_embeddings)
366 BINARY_ADD
368 LOAD_CONST 2 (1)
370 BINARY_SUBTRACT
372 CALL_METHOD 1
374 STORE_FAST 18 (positional_embedding)
335 376 LOAD_FAST 18 (positional_embedding)
378 LOAD_ATTR 17 (to)
380 LOAD_FAST 12 (query_layer)
382 LOAD_ATTR 18 (dtype)
384 LOAD_CONST 10 (('dtype',))
386 CALL_FUNCTION_KW 1
388 STORE_FAST 18 (positional_embedding)
337 390 LOAD_FAST 0 (self)
392 LOAD_ATTR 9 (position_embedding_type)
394 LOAD_CONST 7 ('relative_key')
396 COMPARE_OP 2 (==)
398 EXTENDED_ARG 1
400 POP_JUMP_IF_FALSE 426
338 402 LOAD_GLOBAL 4 (torch)
404 LOAD_METHOD 19 (einsum)
406 LOAD_CONST 11 ('bhld,lrd->bhlr')
408 LOAD_FAST 12 (query_layer)
410 LOAD_FAST 18 (positional_embedding)
412 CALL_METHOD 3
414 STORE_FAST 19 (relative_position_scores)
339 416 LOAD_FAST 13 (attention_scores)
418 LOAD_FAST 19 (relative_position_scores)
420 BINARY_ADD
422 STORE_FAST 13 (attention_scores)
424 JUMP_FORWARD 52 (to 478)
340 >> 426 LOAD_FAST 0 (self)
428 LOAD_ATTR 9 (position_embedding_type)
430 LOAD_CONST 8 ('relative_key_query')
432 COMPARE_OP 2 (==)
434 EXTENDED_ARG 1
436 POP_JUMP_IF_FALSE 478
341 438 LOAD_GLOBAL 4 (torch)
440 LOAD_METHOD 19 (einsum)
442 LOAD_CONST 11 ('bhld,lrd->bhlr')
444 LOAD_FAST 12 (query_layer)
446 LOAD_FAST 18 (positional_embedding)
448 CALL_METHOD 3
450 STORE_FAST 20 (relative_position_scores_query)
342 452 LOAD_GLOBAL 4 (torch)
454 LOAD_METHOD 19 (einsum)
456 LOAD_CONST 12 ('bhrd,lrd->bhlr')
458 LOAD_FAST 10 (key_layer)
460 LOAD_FAST 18 (positional_embedding)
462 CALL_METHOD 3
464 STORE_FAST 21 (relative_position_scores_key)
343 466 LOAD_FAST 13 (attention_scores)
468 LOAD_FAST 20 (relative_position_scores_query)
470 BINARY_ADD
472 LOAD_FAST 21 (relative_position_scores_key)
474 BINARY_ADD
476 STORE_FAST 13 (attention_scores)
345 >> 478 LOAD_FAST 13 (attention_scores)
480 LOAD_GLOBAL 20 (math)
482 LOAD_METHOD 21 (sqrt)
484 LOAD_FAST 0 (self)
486 LOAD_ATTR 22 (attention_head_size)
488 CALL_METHOD 1
490 BINARY_TRUE_DIVIDE
492 STORE_FAST 13 (attention_scores)
346 494 LOAD_FAST 2 (attention_mask)
496 LOAD_CONST 0 (None)
498 COMPARE_OP 9 (is not)
500 EXTENDED_ARG 2
502 POP_JUMP_IF_FALSE 512
348 504 LOAD_FAST 13 (attention_scores)
506 LOAD_FAST 2 (attention_mask)
508 BINARY_ADD
510 STORE_FAST 13 (attention_scores)
351 >> 512 LOAD_GLOBAL 23 (nn)
514 LOAD_ATTR 24 (functional)
516 LOAD_ATTR 25 (softmax)
518 LOAD_FAST 13 (attention_scores)
520 LOAD_CONST 5 (-1)
522 LOAD_CONST 4 (('dim',))
524 CALL_FUNCTION_KW 2
526 STORE_FAST 22 (attention_probs)
355 528 LOAD_FAST 0 (self)
530 LOAD_METHOD 26 (dropout)
532 LOAD_FAST 22 (attention_probs)
534 CALL_METHOD 1
536 STORE_FAST 22 (attention_probs)
358 538 LOAD_FAST 3 (head_mask)
540 LOAD_CONST 0 (None)
542 COMPARE_OP 9 (is not)
544 EXTENDED_ARG 2
546 POP_JUMP_IF_FALSE 556
359 548 LOAD_FAST 22 (attention_probs)
550 LOAD_FAST 3 (head_mask)
552 BINARY_MULTIPLY
554 STORE_FAST 22 (attention_probs)
361 >> 556 LOAD_GLOBAL 4 (torch)
558 LOAD_METHOD 7 (matmul)
560 LOAD_FAST 22 (attention_probs)
562 LOAD_FAST 11 (value_layer)
564 CALL_METHOD 2
566 STORE_FAST 23 (context_layer)
363 568 LOAD_FAST 23 (context_layer)
570 LOAD_METHOD 27 (permute)
572 LOAD_CONST 1 (0)
574 LOAD_CONST 3 (2)
576 LOAD_CONST 2 (1)
578 LOAD_CONST 13 (3)
580 CALL_METHOD 4
582 LOAD_METHOD 28 (contiguous)
584 CALL_METHOD 0
586 STORE_FAST 23 (context_layer)
364 588 LOAD_FAST 23 (context_layer)
590 LOAD_METHOD 10 (size)
592 CALL_METHOD 0
594 LOAD_CONST 0 (None)
596 LOAD_CONST 6 (-2)
598 BUILD_SLICE 2
600 BINARY_SUBSCR
602 LOAD_FAST 0 (self)
604 LOAD_ATTR 29 (all_head_size)
606 BUILD_TUPLE 1
608 BINARY_ADD
610 STORE_FAST 24 (new_context_layer_shape)
365 612 LOAD_FAST 23 (context_layer)
614 LOAD_METHOD 14 (view)
616 LOAD_FAST 24 (new_context_layer_shape)
618 CALL_METHOD 1
620 STORE_FAST 23 (context_layer)
367 622 LOAD_FAST 7 (output_attentions)
624 EXTENDED_ARG 2
626 POP_JUMP_IF_FALSE 636
628 LOAD_FAST 23 (context_layer)
630 LOAD_FAST 22 (attention_probs)
632 BUILD_TUPLE 2
634 JUMP_FORWARD 4 (to 640)
>> 636 LOAD_FAST 23 (context_layer)
638 BUILD_TUPLE 1
>> 640 STORE_FAST 25 (outputs)
369 642 LOAD_FAST 0 (self)
644 LOAD_ATTR 6 (is_decoder)
646 EXTENDED_ARG 2
648 POP_JUMP_IF_FALSE 660
370 650 LOAD_FAST 25 (outputs)
652 LOAD_FAST 6 (past_key_value)
654 BUILD_TUPLE 1
656 BINARY_ADD
658 STORE_FAST 25 (outputs)
371 >> 660 LOAD_FAST 25 (outputs)
662 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR query [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST mixed_query_layer [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST encoder_hidden_states []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP is not [ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST is_cross_attention [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST is_cross_attention []
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 52 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST is_cross_attention []
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 94 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST past_key_value []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP is not [ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 180 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR transpose_for_scores [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable())]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR key [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable()), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable()), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable()), NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable()), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object transpose_for_scores at 0x7f6a10140710, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 274>
275 0 LOAD_FAST 1 (x)
2 LOAD_METHOD 0 (size)
4 CALL_METHOD 0
6 LOAD_CONST 0 (None)
8 LOAD_CONST 1 (-1)
10 BUILD_SLICE 2
12 BINARY_SUBSCR
14 LOAD_FAST 0 (self)
16 LOAD_ATTR 1 (num_attention_heads)
18 LOAD_FAST 0 (self)
20 LOAD_ATTR 2 (attention_head_size)
22 BUILD_TUPLE 2
24 BINARY_ADD
26 STORE_FAST 2 (new_x_shape)
276 28 LOAD_FAST 1 (x)
30 LOAD_METHOD 3 (view)
32 LOAD_FAST 2 (new_x_shape)
34 CALL_METHOD 1
36 STORE_FAST 1 (x)
277 38 LOAD_FAST 1 (x)
40 LOAD_METHOD 4 (permute)
42 LOAD_CONST 2 (0)
44 LOAD_CONST 3 (2)
46 LOAD_CONST 4 (1)
48 LOAD_CONST 5 (3)
50 CALL_METHOD 4
52 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST x []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR size [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 0 [GetAttrVariable(TensorVariable(), size)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [SizeVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST -1 [SizeVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_SLICE 2 [SizeVariable(), ConstantVariable(NoneType), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_SUBSCR None [SizeVariable(), SliceVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [SizeVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR num_attention_heads [SizeVariable(), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [SizeVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR attention_head_size [SizeVariable(), ConstantVariable(int), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_TUPLE 2 [SizeVariable(), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_ADD None [SizeVariable(), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST new_x_shape [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST x []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR view [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST new_x_shape [GetAttrVariable(TensorVariable(), view)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [GetAttrVariable(TensorVariable(), view), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST x [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST x []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR permute [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 0 [GetAttrVariable(TensorVariable(), permute)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 2 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 1 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 3 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 4 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object transpose_for_scores at 0x7f6a10140710, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 274>
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST key_layer [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR transpose_for_scores [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable())]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR value [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable()), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable()), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable()), NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable()), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object transpose_for_scores at 0x7f6a10140710, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 274>
275 0 LOAD_FAST 1 (x)
2 LOAD_METHOD 0 (size)
4 CALL_METHOD 0
6 LOAD_CONST 0 (None)
8 LOAD_CONST 1 (-1)
10 BUILD_SLICE 2
12 BINARY_SUBSCR
14 LOAD_FAST 0 (self)
16 LOAD_ATTR 1 (num_attention_heads)
18 LOAD_FAST 0 (self)
20 LOAD_ATTR 2 (attention_head_size)
22 BUILD_TUPLE 2
24 BINARY_ADD
26 STORE_FAST 2 (new_x_shape)
276 28 LOAD_FAST 1 (x)
30 LOAD_METHOD 3 (view)
32 LOAD_FAST 2 (new_x_shape)
34 CALL_METHOD 1
36 STORE_FAST 1 (x)
277 38 LOAD_FAST 1 (x)
40 LOAD_METHOD 4 (permute)
42 LOAD_CONST 2 (0)
44 LOAD_CONST 3 (2)
46 LOAD_CONST 4 (1)
48 LOAD_CONST 5 (3)
50 CALL_METHOD 4
52 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST x []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR size [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 0 [GetAttrVariable(TensorVariable(), size)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [SizeVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST -1 [SizeVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_SLICE 2 [SizeVariable(), ConstantVariable(NoneType), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_SUBSCR None [SizeVariable(), SliceVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [SizeVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR num_attention_heads [SizeVariable(), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [SizeVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR attention_head_size [SizeVariable(), ConstantVariable(int), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_TUPLE 2 [SizeVariable(), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_ADD None [SizeVariable(), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST new_x_shape [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST x []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR view [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST new_x_shape [GetAttrVariable(TensorVariable(), view)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [GetAttrVariable(TensorVariable(), view), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST x [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST x []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR permute [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 0 [GetAttrVariable(TensorVariable(), permute)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 2 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 1 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 3 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 4 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object transpose_for_scores at 0x7f6a10140710, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 274>
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST value_layer [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR transpose_for_scores [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST mixed_query_layer [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable())]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable()), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object transpose_for_scores at 0x7f6a10140710, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 274>
275 0 LOAD_FAST 1 (x)
2 LOAD_METHOD 0 (size)
4 CALL_METHOD 0
6 LOAD_CONST 0 (None)
8 LOAD_CONST 1 (-1)
10 BUILD_SLICE 2
12 BINARY_SUBSCR
14 LOAD_FAST 0 (self)
16 LOAD_ATTR 1 (num_attention_heads)
18 LOAD_FAST 0 (self)
20 LOAD_ATTR 2 (attention_head_size)
22 BUILD_TUPLE 2
24 BINARY_ADD
26 STORE_FAST 2 (new_x_shape)
276 28 LOAD_FAST 1 (x)
30 LOAD_METHOD 3 (view)
32 LOAD_FAST 2 (new_x_shape)
34 CALL_METHOD 1
36 STORE_FAST 1 (x)
277 38 LOAD_FAST 1 (x)
40 LOAD_METHOD 4 (permute)
42 LOAD_CONST 2 (0)
44 LOAD_CONST 3 (2)
46 LOAD_CONST 4 (1)
48 LOAD_CONST 5 (3)
50 CALL_METHOD 4
52 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST x []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR size [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 0 [GetAttrVariable(TensorVariable(), size)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [SizeVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST -1 [SizeVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_SLICE 2 [SizeVariable(), ConstantVariable(NoneType), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_SUBSCR None [SizeVariable(), SliceVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [SizeVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR num_attention_heads [SizeVariable(), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [SizeVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR attention_head_size [SizeVariable(), ConstantVariable(int), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_TUPLE 2 [SizeVariable(), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_ADD None [SizeVariable(), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST new_x_shape [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST x []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR view [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST new_x_shape [GetAttrVariable(TensorVariable(), view)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [GetAttrVariable(TensorVariable(), view), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST x [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST x []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR permute [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 0 [GetAttrVariable(TensorVariable(), permute)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 2 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 1 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 3 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 4 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object transpose_for_scores at 0x7f6a10140710, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 274>
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST query_layer [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR is_decoder [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 236 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_GLOBAL torch []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR matmul [TorchVariable(<module 'torch' from '/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/torch/__init__.py'>)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST query_layer [TorchVariable(<built-in method matmul of type object at 0x7f6bb096cb40>)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST key_layer [TorchVariable(<built-in method matmul of type object at 0x7f6bb096cb40>), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR transpose [TorchVariable(<built-in method matmul of type object at 0x7f6bb096cb40>), TensorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST -1 [TorchVariable(<built-in method matmul of type object at 0x7f6bb096cb40>), TensorVariable(), GetAttrVariable(TensorVariable(), transpose)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST -2 [TorchVariable(<built-in method matmul of type object at 0x7f6bb096cb40>), TensorVariable(), GetAttrVariable(TensorVariable(), transpose), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 2 [TorchVariable(<built-in method matmul of type object at 0x7f6bb096cb40>), TensorVariable(), GetAttrVariable(TensorVariable(), transpose), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 2 [TorchVariable(<built-in method matmul of type object at 0x7f6bb096cb40>), TensorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST attention_scores [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR position_embedding_type [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST relative_key [ConstantVariable(str)]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP == [ConstantVariable(str), ConstantVariable(str)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_TRUE 280 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR position_embedding_type [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST relative_key_query [ConstantVariable(str)]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP == [ConstantVariable(str), ConstantVariable(str)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 478 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_scores []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_GLOBAL math [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR sqrt [TensorVariable(), TorchVariable(<module 'math' from '/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/lib-dynload/math.cpython-38-x86_64-linux-gnu.so'>)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [TensorVariable(), TorchVariable(<built-in function sqrt>)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR attention_head_size [TensorVariable(), TorchVariable(<built-in function sqrt>), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [TensorVariable(), TorchVariable(<built-in function sqrt>), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_TRUE_DIVIDE None [TensorVariable(), ConstantVariable(float)]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST attention_scores [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_mask []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP is not [TensorVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 512 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_scores []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_mask [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_ADD None [TensorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST attention_scores [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_GLOBAL nn []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR functional [TorchVariable(<module 'torch.nn' from '/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/torch/nn/__init__.py'>)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR softmax [TorchVariable(<module 'torch.nn.functional' from '/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/torch/nn/functional.py'>)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_scores [TorchVariable(<function softmax at 0x7f6a1d0fb5e0>)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST -1 [TorchVariable(<function softmax at 0x7f6a1d0fb5e0>), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST ('dim',) [TorchVariable(<function softmax at 0x7f6a1d0fb5e0>), TensorVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION_KW 2 [TorchVariable(<function softmax at 0x7f6a1d0fb5e0>), TensorVariable(), ConstantVariable(int), ConstantVariable(tuple)]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST attention_probs [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR dropout [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_probs [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST attention_probs [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST head_mask []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP is not [ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 556 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_GLOBAL torch []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR matmul [TorchVariable(<module 'torch' from '/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/torch/__init__.py'>)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_probs [TorchVariable(<built-in method matmul of type object at 0x7f6bb096cb40>)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST value_layer [TorchVariable(<built-in method matmul of type object at 0x7f6bb096cb40>), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 2 [TorchVariable(<built-in method matmul of type object at 0x7f6bb096cb40>), TensorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST context_layer [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST context_layer []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR permute [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 0 [GetAttrVariable(TensorVariable(), permute)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 2 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 1 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 3 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 4 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR contiguous [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 0 [GetAttrVariable(TensorVariable(), contiguous)]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST context_layer [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST context_layer []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR size [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 0 [GetAttrVariable(TensorVariable(), size)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [SizeVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST -2 [SizeVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_SLICE 2 [SizeVariable(), ConstantVariable(NoneType), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_SUBSCR None [SizeVariable(), SliceVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [SizeVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR all_head_size [SizeVariable(), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_TUPLE 1 [SizeVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_ADD None [SizeVariable(), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST new_context_layer_shape [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST context_layer []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR view [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST new_context_layer_shape [GetAttrVariable(TensorVariable(), view)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [GetAttrVariable(TensorVariable(), view), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST context_layer [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST output_attentions []
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 636 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST context_layer []
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_TUPLE 1 [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST outputs [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR is_decoder [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 660 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST outputs []
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object forward at 0x7f6a101407c0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 279>
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST self_outputs [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR output [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self_outputs [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 0 [NNModuleVariable(), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_SUBSCR None [NNModuleVariable(), TupleVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 2 [NNModuleVariable(), TensorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object forward at 0x7f6a101409d0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 381>
382 0 LOAD_FAST 0 (self)
2 LOAD_METHOD 0 (dense)
4 LOAD_FAST 1 (hidden_states)
6 CALL_METHOD 1
8 STORE_FAST 1 (hidden_states)
383 10 LOAD_FAST 0 (self)
12 LOAD_METHOD 1 (dropout)
14 LOAD_FAST 1 (hidden_states)
16 CALL_METHOD 1
18 STORE_FAST 1 (hidden_states)
384 20 LOAD_FAST 0 (self)
22 LOAD_METHOD 2 (LayerNorm)
24 LOAD_FAST 1 (hidden_states)
26 LOAD_FAST 2 (input_tensor)
28 BINARY_ADD
30 CALL_METHOD 1
32 STORE_FAST 1 (hidden_states)
385 34 LOAD_FAST 1 (hidden_states)
36 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR dense [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST hidden_states [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR dropout [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST hidden_states [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR LayerNorm [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST input_tensor [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_ADD None [NNModuleVariable(), TensorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST hidden_states [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states []
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object forward at 0x7f6a101409d0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 381>
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST attention_output [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_output []
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_TUPLE 1 [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self_outputs [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 1 [TupleVariable(), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [TupleVariable(), TupleVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_SLICE 2 [TupleVariable(), TupleVariable(), ConstantVariable(int), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_SUBSCR None [TupleVariable(), TupleVariable(), SliceVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_ADD None [TupleVariable(), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST outputs [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST outputs []
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object forward at 0x7f6a10140c90, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 413>
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST self_attention_outputs [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self_attention_outputs []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 0 [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_SUBSCR None [TupleVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST attention_output [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR is_decoder [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 80 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self_attention_outputs []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 1 [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [TupleVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_SLICE 2 [TupleVariable(), ConstantVariable(int), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_SUBSCR None [TupleVariable(), SliceVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST outputs [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None []
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST cross_attn_present_key_value [ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR is_decoder [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 222 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_GLOBAL apply_chunking_to_forward []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [UserFunctionVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR feed_forward_chunk [UserFunctionVariable(), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [UserFunctionVariable(), UserMethodVariable(<function BertLayer.feed_forward_chunk at 0x7f69ff5474c0>, NNModuleVariable())]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR chunk_size_feed_forward [UserFunctionVariable(), UserMethodVariable(<function BertLayer.feed_forward_chunk at 0x7f69ff5474c0>, NNModuleVariable()), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [UserFunctionVariable(), UserMethodVariable(<function BertLayer.feed_forward_chunk at 0x7f69ff5474c0>, NNModuleVariable()), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR seq_len_dim [UserFunctionVariable(), UserMethodVariable(<function BertLayer.feed_forward_chunk at 0x7f69ff5474c0>, NNModuleVariable()), ConstantVariable(int), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_output [UserFunctionVariable(), UserMethodVariable(<function BertLayer.feed_forward_chunk at 0x7f69ff5474c0>, NNModuleVariable()), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 4 [UserFunctionVariable(), UserMethodVariable(<function BertLayer.feed_forward_chunk at 0x7f69ff5474c0>, NNModuleVariable()), ConstantVariable(int), ConstantVariable(int), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object apply_chunking_to_forward at 0x7f6a10006030, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/pytorch_utils.py", line 169>
207 0 LOAD_GLOBAL 0 (len)
2 LOAD_FAST 3 (input_tensors)
4 CALL_FUNCTION 1
6 LOAD_CONST 1 (0)
8 COMPARE_OP 4 (>)
10 POP_JUMP_IF_TRUE 26
12 LOAD_GLOBAL 1 (AssertionError)
14 LOAD_FAST 3 (input_tensors)
16 FORMAT_VALUE 0
18 LOAD_CONST 2 (' has to be a tuple/list of tensors')
20 BUILD_STRING 2
22 CALL_FUNCTION 1
24 RAISE_VARARGS 1
210 >> 26 LOAD_GLOBAL 0 (len)
28 LOAD_GLOBAL 2 (inspect)
30 LOAD_METHOD 3 (signature)
32 LOAD_DEREF 1 (forward_fn)
34 CALL_METHOD 1
36 LOAD_ATTR 4 (parameters)
38 CALL_FUNCTION 1
40 STORE_FAST 4 (num_args_in_forward_chunk_fn)
211 42 LOAD_FAST 4 (num_args_in_forward_chunk_fn)
44 LOAD_GLOBAL 0 (len)
46 LOAD_FAST 3 (input_tensors)
48 CALL_FUNCTION 1
50 COMPARE_OP 3 (!=)
52 POP_JUMP_IF_FALSE 80
212 54 LOAD_GLOBAL 5 (ValueError)
213 56 LOAD_CONST 3 ('forward_chunk_fn expects ')
58 LOAD_FAST 4 (num_args_in_forward_chunk_fn)
60 FORMAT_VALUE 0
62 LOAD_CONST 4 (' arguments, but only ')
64 LOAD_GLOBAL 0 (len)
66 LOAD_FAST 3 (input_tensors)
68 CALL_FUNCTION 1
70 FORMAT_VALUE 0
72 LOAD_CONST 5 (' input tensors are given')
74 BUILD_STRING 5
212 76 CALL_FUNCTION 1
78 RAISE_VARARGS 1
217 >> 80 LOAD_FAST 1 (chunk_size)
82 LOAD_CONST 1 (0)
84 COMPARE_OP 4 (>)
86 EXTENDED_ARG 1
88 POP_JUMP_IF_FALSE 288
218 90 LOAD_FAST 3 (input_tensors)
92 LOAD_CONST 1 (0)
94 BINARY_SUBSCR
96 LOAD_ATTR 6 (shape)
98 LOAD_DEREF 0 (chunk_dim)
100 BINARY_SUBSCR
102 STORE_FAST 5 (tensor_shape)
219 104 LOAD_FAST 3 (input_tensors)
106 GET_ITER
>> 108 FOR_ITER 44 (to 154)
110 STORE_FAST 6 (input_tensor)
220 112 LOAD_FAST 6 (input_tensor)
114 LOAD_ATTR 6 (shape)
116 LOAD_DEREF 0 (chunk_dim)
118 BINARY_SUBSCR
120 LOAD_FAST 5 (tensor_shape)
122 COMPARE_OP 3 (!=)
124 POP_JUMP_IF_FALSE 108
221 126 LOAD_GLOBAL 5 (ValueError)
222 128 LOAD_CONST 6 ('All input tenors have to be of the same shape: ')
130 LOAD_FAST 5 (tensor_shape)
132 FORMAT_VALUE 0
134 LOAD_CONST 7 (', found shape ')
136 LOAD_FAST 6 (input_tensor)
138 LOAD_ATTR 6 (shape)
140 LOAD_DEREF 0 (chunk_dim)
142 BINARY_SUBSCR
144 FORMAT_VALUE 0
146 BUILD_STRING 4
221 148 CALL_FUNCTION 1
150 RAISE_VARARGS 1
152 JUMP_ABSOLUTE 108
226 >> 154 LOAD_FAST 3 (input_tensors)
156 LOAD_CONST 1 (0)
158 BINARY_SUBSCR
160 LOAD_ATTR 6 (shape)
162 LOAD_DEREF 0 (chunk_dim)
164 BINARY_SUBSCR
166 LOAD_FAST 1 (chunk_size)
168 BINARY_MODULO
170 LOAD_CONST 1 (0)
172 COMPARE_OP 3 (!=)
174 POP_JUMP_IF_FALSE 206
227 176 LOAD_GLOBAL 5 (ValueError)
228 178 LOAD_CONST 8 ('The dimension to be chunked ')
180 LOAD_FAST 3 (input_tensors)
182 LOAD_CONST 1 (0)
184 BINARY_SUBSCR
186 LOAD_ATTR 6 (shape)
188 LOAD_DEREF 0 (chunk_dim)
190 BINARY_SUBSCR
192 FORMAT_VALUE 0
194 LOAD_CONST 9 (' has to be a multiple of the chunk size ')
196 LOAD_FAST 1 (chunk_size)
198 FORMAT_VALUE 0
200 BUILD_STRING 4
227 202 CALL_FUNCTION 1
204 RAISE_VARARGS 1
232 >> 206 LOAD_FAST 3 (input_tensors)
208 LOAD_CONST 1 (0)
210 BINARY_SUBSCR
212 LOAD_ATTR 6 (shape)
214 LOAD_DEREF 0 (chunk_dim)
216 BINARY_SUBSCR
218 LOAD_FAST 1 (chunk_size)
220 BINARY_FLOOR_DIVIDE
222 STORE_DEREF 2 (num_chunks)
235 224 LOAD_GLOBAL 7 (tuple)
226 LOAD_CLOSURE 0 (chunk_dim)
228 LOAD_CLOSURE 2 (num_chunks)
230 BUILD_TUPLE 2
232 LOAD_CONST 10 (<code object <genexpr> at 0x7f6a10083ea0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/pytorch_utils.py", line 235>)
234 LOAD_CONST 11 ('apply_chunking_to_forward.<locals>.<genexpr>')
236 MAKE_FUNCTION 8 (closure)
238 LOAD_FAST 3 (input_tensors)
240 GET_ITER
242 CALL_FUNCTION 1
244 CALL_FUNCTION 1
246 STORE_FAST 7 (input_tensors_chunks)
237 248 LOAD_GLOBAL 7 (tuple)
250 LOAD_CLOSURE 1 (forward_fn)
252 BUILD_TUPLE 1
254 LOAD_CONST 12 (<code object <genexpr> at 0x7f6a10083f50, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/pytorch_utils.py", line 237>)
256 LOAD_CONST 11 ('apply_chunking_to_forward.<locals>.<genexpr>')
258 MAKE_FUNCTION 8 (closure)
260 LOAD_GLOBAL 8 (zip)
262 LOAD_FAST 7 (input_tensors_chunks)
264 CALL_FUNCTION_EX 0
266 GET_ITER
268 CALL_FUNCTION 1
270 CALL_FUNCTION 1
272 STORE_FAST 8 (output_chunks)
239 274 LOAD_GLOBAL 9 (torch)
276 LOAD_ATTR 10 (cat)
278 LOAD_FAST 8 (output_chunks)
280 LOAD_DEREF 0 (chunk_dim)
282 LOAD_CONST 13 (('dim',))
284 CALL_FUNCTION_KW 2
286 RETURN_VALUE
241 >> 288 LOAD_DEREF 1 (forward_fn)
290 LOAD_FAST 3 (input_tensors)
292 CALL_FUNCTION_EX 0
294 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_GLOBAL len []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST input_tensors [BuiltinVariable(len)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [BuiltinVariable(len), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 0 [ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP > [ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_TRUE 26 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_GLOBAL len []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_GLOBAL inspect [BuiltinVariable(len)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR signature [BuiltinVariable(len), PythonModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_DEREF forward_fn [BuiltinVariable(len), LambdaVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [BuiltinVariable(len), LambdaVariable(), UserMethodVariable(<function BertLayer.feed_forward_chunk at 0x7f69ff5474c0>, NNModuleVariable())]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR parameters [BuiltinVariable(len), InspectSignatureVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [BuiltinVariable(len), GetAttrVariable(InspectSignatureVariable(), parameters)]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST num_args_in_forward_chunk_fn [ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST num_args_in_forward_chunk_fn []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_GLOBAL len [ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST input_tensors [ConstantVariable(int), BuiltinVariable(len)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [ConstantVariable(int), BuiltinVariable(len), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP != [ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 80 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST chunk_size []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 0 [ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP > [ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 288 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_DEREF forward_fn []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST input_tensors [UserMethodVariable(<function BertLayer.feed_forward_chunk at 0x7f69ff5474c0>, NNModuleVariable())]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION_EX 0 [UserMethodVariable(<function BertLayer.feed_forward_chunk at 0x7f69ff5474c0>, NNModuleVariable()), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object feed_forward_chunk at 0x7f6a100c83a0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 546>
547 0 LOAD_FAST 0 (self)
2 LOAD_METHOD 0 (intermediate)
4 LOAD_FAST 1 (attention_output)
6 CALL_METHOD 1
8 STORE_FAST 2 (intermediate_output)
548 10 LOAD_FAST 0 (self)
12 LOAD_METHOD 1 (output)
14 LOAD_FAST 2 (intermediate_output)
16 LOAD_FAST 1 (attention_output)
18 CALL_METHOD 2
20 STORE_FAST 3 (layer_output)
549 22 LOAD_FAST 3 (layer_output)
24 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR intermediate [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_output [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object forward at 0x7f6a10140ea0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 446>
447 0 LOAD_FAST 0 (self)
2 LOAD_METHOD 0 (dense)
4 LOAD_FAST 1 (hidden_states)
6 CALL_METHOD 1
8 STORE_FAST 1 (hidden_states)
448 10 LOAD_FAST 0 (self)
12 LOAD_METHOD 1 (intermediate_act_fn)
14 LOAD_FAST 1 (hidden_states)
16 CALL_METHOD 1
18 STORE_FAST 1 (hidden_states)
449 20 LOAD_FAST 1 (hidden_states)
22 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR dense [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST hidden_states [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR intermediate_act_fn [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object forward at 0x7f6a100e12f0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/activations.py", line 55>
56 0 LOAD_FAST 0 (self)
2 LOAD_METHOD 0 (act)
4 LOAD_FAST 1 (input)
6 CALL_METHOD 1
8 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR act [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST input [TorchVariable(<built-in function gelu>)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [TorchVariable(<built-in function gelu>), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object forward at 0x7f6a100e12f0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/activations.py", line 55>
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST hidden_states [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states []
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object forward at 0x7f6a10140ea0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 446>
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST intermediate_output [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR output [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST intermediate_output [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_output [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 2 [NNModuleVariable(), TensorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object forward at 0x7f6a100c80e0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 459>
460 0 LOAD_FAST 0 (self)
2 LOAD_METHOD 0 (dense)
4 LOAD_FAST 1 (hidden_states)
6 CALL_METHOD 1
8 STORE_FAST 1 (hidden_states)
461 10 LOAD_FAST 0 (self)
12 LOAD_METHOD 1 (dropout)
14 LOAD_FAST 1 (hidden_states)
16 CALL_METHOD 1
18 STORE_FAST 1 (hidden_states)
462 20 LOAD_FAST 0 (self)
22 LOAD_METHOD 2 (LayerNorm)
24 LOAD_FAST 1 (hidden_states)
26 LOAD_FAST 2 (input_tensor)
28 BINARY_ADD
30 CALL_METHOD 1
32 STORE_FAST 1 (hidden_states)
463 34 LOAD_FAST 1 (hidden_states)
36 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR dense [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST hidden_states [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR dropout [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST hidden_states [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR LayerNorm [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST input_tensor [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_ADD None [NNModuleVariable(), TensorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST hidden_states [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states []
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object forward at 0x7f6a100c80e0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 459>
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST layer_output [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST layer_output []
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object feed_forward_chunk at 0x7f6a100c83a0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 546>
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object apply_chunking_to_forward at 0x7f6a10006030, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/pytorch_utils.py", line 169>
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST layer_output [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST layer_output []
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_TUPLE 1 [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST outputs [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_ADD None [TupleVariable(), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST outputs [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR is_decoder [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 270 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST outputs []
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object forward at 0x7f6a100c82f0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 481>
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST layer_outputs [ListIteratorVariable(), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST layer_outputs [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 0 [ListIteratorVariable(), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_SUBSCR None [ListIteratorVariable(), TupleVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST hidden_states [ListIteratorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST use_cache [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 248 [ListIteratorVariable(), ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_DEREF output_attentions [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 66 [ListIteratorVariable(), ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE FOR_ITER 290 [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE UNPACK_SEQUENCE 2 [ListIteratorVariable(), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST i [ListIteratorVariable(), NNModuleVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST layer_module [ListIteratorVariable(), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST output_hidden_states [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 88 [ListIteratorVariable(), ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST head_mask [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [ListIteratorVariable(), ListVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP is not [ListIteratorVariable(), ListVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 104 [ListIteratorVariable(), ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST head_mask [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST i [ListIteratorVariable(), ListVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_SUBSCR None [ListIteratorVariable(), ListVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE JUMP_FORWARD 106 [ListIteratorVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST layer_head_mask [ListIteratorVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST past_key_values [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [ListIteratorVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP is not [ListIteratorVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 124 [ListIteratorVariable(), ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_DEREF past_key_value [ListIteratorVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR gradient_checkpointing [ListIteratorVariable(), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 202 [ListIteratorVariable(), ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST layer_module [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [ListIteratorVariable(), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_mask [ListIteratorVariable(), NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST layer_head_mask [ListIteratorVariable(), NNModuleVariable(), TensorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST encoder_hidden_states [ListIteratorVariable(), NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST encoder_attention_mask [ListIteratorVariable(), NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_DEREF past_key_value [ListIteratorVariable(), NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_DEREF output_attentions [ListIteratorVariable(), NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 7 [ListIteratorVariable(), NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object forward at 0x7f6a100c82f0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 481>
492 0 LOAD_FAST 6 (past_key_value)
2 LOAD_CONST 0 (None)
4 COMPARE_OP 9 (is not)
6 POP_JUMP_IF_FALSE 20
8 LOAD_FAST 6 (past_key_value)
10 LOAD_CONST 0 (None)
12 LOAD_CONST 1 (2)
14 BUILD_SLICE 2
16 BINARY_SUBSCR
18 JUMP_FORWARD 2 (to 22)
>> 20 LOAD_CONST 0 (None)
>> 22 STORE_FAST 8 (self_attn_past_key_value)
493 24 LOAD_FAST 0 (self)
26 LOAD_ATTR 0 (attention)
494 28 LOAD_FAST 1 (hidden_states)
495 30 LOAD_FAST 2 (attention_mask)
496 32 LOAD_FAST 3 (head_mask)
497 34 LOAD_FAST 7 (output_attentions)
498 36 LOAD_FAST 8 (self_attn_past_key_value)
493 38 LOAD_CONST 2 (('output_attentions', 'past_key_value'))
40 CALL_FUNCTION_KW 5
42 STORE_FAST 9 (self_attention_outputs)
500 44 LOAD_FAST 9 (self_attention_outputs)
46 LOAD_CONST 3 (0)
48 BINARY_SUBSCR
50 STORE_FAST 10 (attention_output)
503 52 LOAD_FAST 0 (self)
54 LOAD_ATTR 1 (is_decoder)
56 POP_JUMP_IF_FALSE 80
504 58 LOAD_FAST 9 (self_attention_outputs)
60 LOAD_CONST 4 (1)
62 LOAD_CONST 5 (-1)
64 BUILD_SLICE 2
66 BINARY_SUBSCR
68 STORE_FAST 11 (outputs)
505 70 LOAD_FAST 9 (self_attention_outputs)
72 LOAD_CONST 5 (-1)
74 BINARY_SUBSCR
76 STORE_FAST 12 (present_key_value)
78 JUMP_FORWARD 12 (to 92)
507 >> 80 LOAD_FAST 9 (self_attention_outputs)
82 LOAD_CONST 4 (1)
84 LOAD_CONST 0 (None)
86 BUILD_SLICE 2
88 BINARY_SUBSCR
90 STORE_FAST 11 (outputs)
509 >> 92 LOAD_CONST 0 (None)
94 STORE_FAST 13 (cross_attn_present_key_value)
510 96 LOAD_FAST 0 (self)
98 LOAD_ATTR 1 (is_decoder)
100 POP_JUMP_IF_FALSE 222
102 LOAD_FAST 4 (encoder_hidden_states)
104 LOAD_CONST 0 (None)
106 COMPARE_OP 9 (is not)
108 POP_JUMP_IF_FALSE 222
511 110 LOAD_GLOBAL 2 (hasattr)
112 LOAD_FAST 0 (self)
114 LOAD_CONST 6 ('crossattention')
116 CALL_FUNCTION 2
118 POP_JUMP_IF_TRUE 136
512 120 LOAD_GLOBAL 3 (ValueError)
513 122 LOAD_CONST 7 ('If `encoder_hidden_states` are passed, ')
124 LOAD_FAST 0 (self)
126 FORMAT_VALUE 0
128 LOAD_CONST 8 (' has to be instantiated with cross-attention layers by setting `config.add_cross_attention=True`')
130 BUILD_STRING 3
512 132 CALL_FUNCTION 1
134 RAISE_VARARGS 1
518 >> 136 LOAD_FAST 6 (past_key_value)
138 LOAD_CONST 0 (None)
140 COMPARE_OP 9 (is not)
142 POP_JUMP_IF_FALSE 156
144 LOAD_FAST 6 (past_key_value)
146 LOAD_CONST 9 (-2)
148 LOAD_CONST 0 (None)
150 BUILD_SLICE 2
152 BINARY_SUBSCR
154 JUMP_FORWARD 2 (to 158)
>> 156 LOAD_CONST 0 (None)
>> 158 STORE_FAST 14 (cross_attn_past_key_value)
519 160 LOAD_FAST 0 (self)
162 LOAD_METHOD 4 (crossattention)
520 164 LOAD_FAST 10 (attention_output)
521 166 LOAD_FAST 2 (attention_mask)
522 168 LOAD_FAST 3 (head_mask)
523 170 LOAD_FAST 4 (encoder_hidden_states)
524 172 LOAD_FAST 5 (encoder_attention_mask)
525 174 LOAD_FAST 14 (cross_attn_past_key_value)
526 176 LOAD_FAST 7 (output_attentions)
519 178 CALL_METHOD 7
180 STORE_FAST 15 (cross_attention_outputs)
528 182 LOAD_FAST 15 (cross_attention_outputs)
184 LOAD_CONST 3 (0)
186 BINARY_SUBSCR
188 STORE_FAST 10 (attention_output)
529 190 LOAD_FAST 11 (outputs)
192 LOAD_FAST 15 (cross_attention_outputs)
194 LOAD_CONST 4 (1)
196 LOAD_CONST 5 (-1)
198 BUILD_SLICE 2
200 BINARY_SUBSCR
202 BINARY_ADD
204 STORE_FAST 11 (outputs)
532 206 LOAD_FAST 15 (cross_attention_outputs)
208 LOAD_CONST 5 (-1)
210 BINARY_SUBSCR
212 STORE_FAST 13 (cross_attn_present_key_value)
533 214 LOAD_FAST 12 (present_key_value)
216 LOAD_FAST 13 (cross_attn_present_key_value)
218 BINARY_ADD
220 STORE_FAST 12 (present_key_value)
535 >> 222 LOAD_GLOBAL 5 (apply_chunking_to_forward)
536 224 LOAD_FAST 0 (self)
226 LOAD_ATTR 6 (feed_forward_chunk)
228 LOAD_FAST 0 (self)
230 LOAD_ATTR 7 (chunk_size_feed_forward)
232 LOAD_FAST 0 (self)
234 LOAD_ATTR 8 (seq_len_dim)
236 LOAD_FAST 10 (attention_output)
535 238 CALL_FUNCTION 4
240 STORE_FAST 16 (layer_output)
538 242 LOAD_FAST 16 (layer_output)
244 BUILD_TUPLE 1
246 LOAD_FAST 11 (outputs)
248 BINARY_ADD
250 STORE_FAST 11 (outputs)
541 252 LOAD_FAST 0 (self)
254 LOAD_ATTR 1 (is_decoder)
256 EXTENDED_ARG 1
258 POP_JUMP_IF_FALSE 270
542 260 LOAD_FAST 11 (outputs)
262 LOAD_FAST 12 (present_key_value)
264 BUILD_TUPLE 1
266 BINARY_ADD
268 STORE_FAST 11 (outputs)
544 >> 270 LOAD_FAST 11 (outputs)
272 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST past_key_value []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP is not [ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 20 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None []
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST self_attn_past_key_value [ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR attention [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_mask [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST head_mask [NNModuleVariable(), TensorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST output_attentions [NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self_attn_past_key_value [NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST ('output_attentions', 'past_key_value') [NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(bool), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION_KW 5 [NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(bool), ConstantVariable(NoneType), ConstantVariable(tuple)]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object forward at 0x7f6a10140c90, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 413>
423 0 LOAD_FAST 0 (self)
2 LOAD_METHOD 0 (self)
424 4 LOAD_FAST 1 (hidden_states)
425 6 LOAD_FAST 2 (attention_mask)
426 8 LOAD_FAST 3 (head_mask)
427 10 LOAD_FAST 4 (encoder_hidden_states)
428 12 LOAD_FAST 5 (encoder_attention_mask)
429 14 LOAD_FAST 6 (past_key_value)
430 16 LOAD_FAST 7 (output_attentions)
423 18 CALL_METHOD 7
20 STORE_FAST 8 (self_outputs)
432 22 LOAD_FAST 0 (self)
24 LOAD_METHOD 1 (output)
26 LOAD_FAST 8 (self_outputs)
28 LOAD_CONST 1 (0)
30 BINARY_SUBSCR
32 LOAD_FAST 1 (hidden_states)
34 CALL_METHOD 2
36 STORE_FAST 9 (attention_output)
433 38 LOAD_FAST 9 (attention_output)
40 BUILD_TUPLE 1
42 LOAD_FAST 8 (self_outputs)
44 LOAD_CONST 2 (1)
46 LOAD_CONST 0 (None)
48 BUILD_SLICE 2
50 BINARY_SUBSCR
52 BINARY_ADD
54 STORE_FAST 10 (outputs)
434 56 LOAD_FAST 10 (outputs)
58 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR self [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_mask [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST head_mask [NNModuleVariable(), TensorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST encoder_hidden_states [NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST encoder_attention_mask [NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST past_key_value [NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST output_attentions [NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 7 [NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object forward at 0x7f6a101407c0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 279>
289 0 LOAD_FAST 0 (self)
2 LOAD_METHOD 0 (query)
4 LOAD_FAST 1 (hidden_states)
6 CALL_METHOD 1
8 STORE_FAST 8 (mixed_query_layer)
294 10 LOAD_FAST 4 (encoder_hidden_states)
12 LOAD_CONST 0 (None)
14 COMPARE_OP 9 (is not)
16 STORE_FAST 9 (is_cross_attention)
296 18 LOAD_FAST 9 (is_cross_attention)
20 POP_JUMP_IF_FALSE 52
22 LOAD_FAST 6 (past_key_value)
24 LOAD_CONST 0 (None)
26 COMPARE_OP 9 (is not)
28 POP_JUMP_IF_FALSE 52
298 30 LOAD_FAST 6 (past_key_value)
32 LOAD_CONST 1 (0)
34 BINARY_SUBSCR
36 STORE_FAST 10 (key_layer)
299 38 LOAD_FAST 6 (past_key_value)
40 LOAD_CONST 2 (1)
42 BINARY_SUBSCR
44 STORE_FAST 11 (value_layer)
300 46 LOAD_FAST 5 (encoder_attention_mask)
48 STORE_FAST 2 (attention_mask)
50 JUMP_FORWARD 160 (to 212)
301 >> 52 LOAD_FAST 9 (is_cross_attention)
54 POP_JUMP_IF_FALSE 94
302 56 LOAD_FAST 0 (self)
58 LOAD_METHOD 1 (transpose_for_scores)
60 LOAD_FAST 0 (self)
62 LOAD_METHOD 2 (key)
64 LOAD_FAST 4 (encoder_hidden_states)
66 CALL_METHOD 1
68 CALL_METHOD 1
70 STORE_FAST 10 (key_layer)
303 72 LOAD_FAST 0 (self)
74 LOAD_METHOD 1 (transpose_for_scores)
76 LOAD_FAST 0 (self)
78 LOAD_METHOD 3 (value)
80 LOAD_FAST 4 (encoder_hidden_states)
82 CALL_METHOD 1
84 CALL_METHOD 1
86 STORE_FAST 11 (value_layer)
304 88 LOAD_FAST 5 (encoder_attention_mask)
90 STORE_FAST 2 (attention_mask)
92 JUMP_FORWARD 118 (to 212)
305 >> 94 LOAD_FAST 6 (past_key_value)
96 LOAD_CONST 0 (None)
98 COMPARE_OP 9 (is not)
100 POP_JUMP_IF_FALSE 180
306 102 LOAD_FAST 0 (self)
104 LOAD_METHOD 1 (transpose_for_scores)
106 LOAD_FAST 0 (self)
108 LOAD_METHOD 2 (key)
110 LOAD_FAST 1 (hidden_states)
112 CALL_METHOD 1
114 CALL_METHOD 1
116 STORE_FAST 10 (key_layer)
307 118 LOAD_FAST 0 (self)
120 LOAD_METHOD 1 (transpose_for_scores)
122 LOAD_FAST 0 (self)
124 LOAD_METHOD 3 (value)
126 LOAD_FAST 1 (hidden_states)
128 CALL_METHOD 1
130 CALL_METHOD 1
132 STORE_FAST 11 (value_layer)
308 134 LOAD_GLOBAL 4 (torch)
136 LOAD_ATTR 5 (cat)
138 LOAD_FAST 6 (past_key_value)
140 LOAD_CONST 1 (0)
142 BINARY_SUBSCR
144 LOAD_FAST 10 (key_layer)
146 BUILD_LIST 2
148 LOAD_CONST 3 (2)
150 LOAD_CONST 4 (('dim',))
152 CALL_FUNCTION_KW 2
154 STORE_FAST 10 (key_layer)
309 156 LOAD_GLOBAL 4 (torch)
158 LOAD_ATTR 5 (cat)
160 LOAD_FAST 6 (past_key_value)
162 LOAD_CONST 2 (1)
164 BINARY_SUBSCR
166 LOAD_FAST 11 (value_layer)
168 BUILD_LIST 2
170 LOAD_CONST 3 (2)
172 LOAD_CONST 4 (('dim',))
174 CALL_FUNCTION_KW 2
176 STORE_FAST 11 (value_layer)
178 JUMP_FORWARD 32 (to 212)
311 >> 180 LOAD_FAST 0 (self)
182 LOAD_METHOD 1 (transpose_for_scores)
184 LOAD_FAST 0 (self)
186 LOAD_METHOD 2 (key)
188 LOAD_FAST 1 (hidden_states)
190 CALL_METHOD 1
192 CALL_METHOD 1
194 STORE_FAST 10 (key_layer)
312 196 LOAD_FAST 0 (self)
198 LOAD_METHOD 1 (transpose_for_scores)
200 LOAD_FAST 0 (self)
202 LOAD_METHOD 3 (value)
204 LOAD_FAST 1 (hidden_states)
206 CALL_METHOD 1
208 CALL_METHOD 1
210 STORE_FAST 11 (value_layer)
314 >> 212 LOAD_FAST 0 (self)
214 LOAD_METHOD 1 (transpose_for_scores)
216 LOAD_FAST 8 (mixed_query_layer)
218 CALL_METHOD 1
220 STORE_FAST 12 (query_layer)
316 222 LOAD_FAST 0 (self)
224 LOAD_ATTR 6 (is_decoder)
226 POP_JUMP_IF_FALSE 236
324 228 LOAD_FAST 10 (key_layer)
230 LOAD_FAST 11 (value_layer)
232 BUILD_TUPLE 2
234 STORE_FAST 6 (past_key_value)
327 >> 236 LOAD_GLOBAL 4 (torch)
238 LOAD_METHOD 7 (matmul)
240 LOAD_FAST 12 (query_layer)
242 LOAD_FAST 10 (key_layer)
244 LOAD_METHOD 8 (transpose)
246 LOAD_CONST 5 (-1)
248 LOAD_CONST 6 (-2)
250 CALL_METHOD 2
252 CALL_METHOD 2
254 STORE_FAST 13 (attention_scores)
329 256 LOAD_FAST 0 (self)
258 LOAD_ATTR 9 (position_embedding_type)
260 LOAD_CONST 7 ('relative_key')
262 COMPARE_OP 2 (==)
264 EXTENDED_ARG 1
266 POP_JUMP_IF_TRUE 280
268 LOAD_FAST 0 (self)
270 LOAD_ATTR 9 (position_embedding_type)
272 LOAD_CONST 8 ('relative_key_query')
274 COMPARE_OP 2 (==)
276 EXTENDED_ARG 1
278 POP_JUMP_IF_FALSE 478
330 >> 280 LOAD_FAST 1 (hidden_states)
282 LOAD_METHOD 10 (size)
284 CALL_METHOD 0
286 LOAD_CONST 2 (1)
288 BINARY_SUBSCR
290 STORE_FAST 14 (seq_length)
331 292 LOAD_GLOBAL 4 (torch)
294 LOAD_ATTR 11 (arange)
296 LOAD_FAST 14 (seq_length)
298 LOAD_GLOBAL 4 (torch)
300 LOAD_ATTR 12 (long)
302 LOAD_FAST 1 (hidden_states)
304 LOAD_ATTR 13 (device)
306 LOAD_CONST 9 (('dtype', 'device'))
308 CALL_FUNCTION_KW 3
310 LOAD_METHOD 14 (view)
312 LOAD_CONST 5 (-1)
314 LOAD_CONST 2 (1)
316 CALL_METHOD 2
318 STORE_FAST 15 (position_ids_l)
332 320 LOAD_GLOBAL 4 (torch)
322 LOAD_ATTR 11 (arange)
324 LOAD_FAST 14 (seq_length)
326 LOAD_GLOBAL 4 (torch)
328 LOAD_ATTR 12 (long)
330 LOAD_FAST 1 (hidden_states)
332 LOAD_ATTR 13 (device)
334 LOAD_CONST 9 (('dtype', 'device'))
336 CALL_FUNCTION_KW 3
338 LOAD_METHOD 14 (view)
340 LOAD_CONST 2 (1)
342 LOAD_CONST 5 (-1)
344 CALL_METHOD 2
346 STORE_FAST 16 (position_ids_r)
333 348 LOAD_FAST 15 (position_ids_l)
350 LOAD_FAST 16 (position_ids_r)
352 BINARY_SUBTRACT
354 STORE_FAST 17 (distance)
334 356 LOAD_FAST 0 (self)
358 LOAD_METHOD 15 (distance_embedding)
360 LOAD_FAST 17 (distance)
362 LOAD_FAST 0 (self)
364 LOAD_ATTR 16 (max_position_embeddings)
366 BINARY_ADD
368 LOAD_CONST 2 (1)
370 BINARY_SUBTRACT
372 CALL_METHOD 1
374 STORE_FAST 18 (positional_embedding)
335 376 LOAD_FAST 18 (positional_embedding)
378 LOAD_ATTR 17 (to)
380 LOAD_FAST 12 (query_layer)
382 LOAD_ATTR 18 (dtype)
384 LOAD_CONST 10 (('dtype',))
386 CALL_FUNCTION_KW 1
388 STORE_FAST 18 (positional_embedding)
337 390 LOAD_FAST 0 (self)
392 LOAD_ATTR 9 (position_embedding_type)
394 LOAD_CONST 7 ('relative_key')
396 COMPARE_OP 2 (==)
398 EXTENDED_ARG 1
400 POP_JUMP_IF_FALSE 426
338 402 LOAD_GLOBAL 4 (torch)
404 LOAD_METHOD 19 (einsum)
406 LOAD_CONST 11 ('bhld,lrd->bhlr')
408 LOAD_FAST 12 (query_layer)
410 LOAD_FAST 18 (positional_embedding)
412 CALL_METHOD 3
414 STORE_FAST 19 (relative_position_scores)
339 416 LOAD_FAST 13 (attention_scores)
418 LOAD_FAST 19 (relative_position_scores)
420 BINARY_ADD
422 STORE_FAST 13 (attention_scores)
424 JUMP_FORWARD 52 (to 478)
340 >> 426 LOAD_FAST 0 (self)
428 LOAD_ATTR 9 (position_embedding_type)
430 LOAD_CONST 8 ('relative_key_query')
432 COMPARE_OP 2 (==)
434 EXTENDED_ARG 1
436 POP_JUMP_IF_FALSE 478
341 438 LOAD_GLOBAL 4 (torch)
440 LOAD_METHOD 19 (einsum)
442 LOAD_CONST 11 ('bhld,lrd->bhlr')
444 LOAD_FAST 12 (query_layer)
446 LOAD_FAST 18 (positional_embedding)
448 CALL_METHOD 3
450 STORE_FAST 20 (relative_position_scores_query)
342 452 LOAD_GLOBAL 4 (torch)
454 LOAD_METHOD 19 (einsum)
456 LOAD_CONST 12 ('bhrd,lrd->bhlr')
458 LOAD_FAST 10 (key_layer)
460 LOAD_FAST 18 (positional_embedding)
462 CALL_METHOD 3
464 STORE_FAST 21 (relative_position_scores_key)
343 466 LOAD_FAST 13 (attention_scores)
468 LOAD_FAST 20 (relative_position_scores_query)
470 BINARY_ADD
472 LOAD_FAST 21 (relative_position_scores_key)
474 BINARY_ADD
476 STORE_FAST 13 (attention_scores)
345 >> 478 LOAD_FAST 13 (attention_scores)
480 LOAD_GLOBAL 20 (math)
482 LOAD_METHOD 21 (sqrt)
484 LOAD_FAST 0 (self)
486 LOAD_ATTR 22 (attention_head_size)
488 CALL_METHOD 1
490 BINARY_TRUE_DIVIDE
492 STORE_FAST 13 (attention_scores)
346 494 LOAD_FAST 2 (attention_mask)
496 LOAD_CONST 0 (None)
498 COMPARE_OP 9 (is not)
500 EXTENDED_ARG 2
502 POP_JUMP_IF_FALSE 512
348 504 LOAD_FAST 13 (attention_scores)
506 LOAD_FAST 2 (attention_mask)
508 BINARY_ADD
510 STORE_FAST 13 (attention_scores)
351 >> 512 LOAD_GLOBAL 23 (nn)
514 LOAD_ATTR 24 (functional)
516 LOAD_ATTR 25 (softmax)
518 LOAD_FAST 13 (attention_scores)
520 LOAD_CONST 5 (-1)
522 LOAD_CONST 4 (('dim',))
524 CALL_FUNCTION_KW 2
526 STORE_FAST 22 (attention_probs)
355 528 LOAD_FAST 0 (self)
530 LOAD_METHOD 26 (dropout)
532 LOAD_FAST 22 (attention_probs)
534 CALL_METHOD 1
536 STORE_FAST 22 (attention_probs)
358 538 LOAD_FAST 3 (head_mask)
540 LOAD_CONST 0 (None)
542 COMPARE_OP 9 (is not)
544 EXTENDED_ARG 2
546 POP_JUMP_IF_FALSE 556
359 548 LOAD_FAST 22 (attention_probs)
550 LOAD_FAST 3 (head_mask)
552 BINARY_MULTIPLY
554 STORE_FAST 22 (attention_probs)
361 >> 556 LOAD_GLOBAL 4 (torch)
558 LOAD_METHOD 7 (matmul)
560 LOAD_FAST 22 (attention_probs)
562 LOAD_FAST 11 (value_layer)
564 CALL_METHOD 2
566 STORE_FAST 23 (context_layer)
363 568 LOAD_FAST 23 (context_layer)
570 LOAD_METHOD 27 (permute)
572 LOAD_CONST 1 (0)
574 LOAD_CONST 3 (2)
576 LOAD_CONST 2 (1)
578 LOAD_CONST 13 (3)
580 CALL_METHOD 4
582 LOAD_METHOD 28 (contiguous)
584 CALL_METHOD 0
586 STORE_FAST 23 (context_layer)
364 588 LOAD_FAST 23 (context_layer)
590 LOAD_METHOD 10 (size)
592 CALL_METHOD 0
594 LOAD_CONST 0 (None)
596 LOAD_CONST 6 (-2)
598 BUILD_SLICE 2
600 BINARY_SUBSCR
602 LOAD_FAST 0 (self)
604 LOAD_ATTR 29 (all_head_size)
606 BUILD_TUPLE 1
608 BINARY_ADD
610 STORE_FAST 24 (new_context_layer_shape)
365 612 LOAD_FAST 23 (context_layer)
614 LOAD_METHOD 14 (view)
616 LOAD_FAST 24 (new_context_layer_shape)
618 CALL_METHOD 1
620 STORE_FAST 23 (context_layer)
367 622 LOAD_FAST 7 (output_attentions)
624 EXTENDED_ARG 2
626 POP_JUMP_IF_FALSE 636
628 LOAD_FAST 23 (context_layer)
630 LOAD_FAST 22 (attention_probs)
632 BUILD_TUPLE 2
634 JUMP_FORWARD 4 (to 640)
>> 636 LOAD_FAST 23 (context_layer)
638 BUILD_TUPLE 1
>> 640 STORE_FAST 25 (outputs)
369 642 LOAD_FAST 0 (self)
644 LOAD_ATTR 6 (is_decoder)
646 EXTENDED_ARG 2
648 POP_JUMP_IF_FALSE 660
370 650 LOAD_FAST 25 (outputs)
652 LOAD_FAST 6 (past_key_value)
654 BUILD_TUPLE 1
656 BINARY_ADD
658 STORE_FAST 25 (outputs)
371 >> 660 LOAD_FAST 25 (outputs)
662 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR query [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST mixed_query_layer [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST encoder_hidden_states []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP is not [ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST is_cross_attention [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST is_cross_attention []
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 52 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST is_cross_attention []
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 94 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST past_key_value []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP is not [ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 180 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR transpose_for_scores [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable())]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR key [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable()), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable()), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable()), NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable()), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object transpose_for_scores at 0x7f6a10140710, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 274>
275 0 LOAD_FAST 1 (x)
2 LOAD_METHOD 0 (size)
4 CALL_METHOD 0
6 LOAD_CONST 0 (None)
8 LOAD_CONST 1 (-1)
10 BUILD_SLICE 2
12 BINARY_SUBSCR
14 LOAD_FAST 0 (self)
16 LOAD_ATTR 1 (num_attention_heads)
18 LOAD_FAST 0 (self)
20 LOAD_ATTR 2 (attention_head_size)
22 BUILD_TUPLE 2
24 BINARY_ADD
26 STORE_FAST 2 (new_x_shape)
276 28 LOAD_FAST 1 (x)
30 LOAD_METHOD 3 (view)
32 LOAD_FAST 2 (new_x_shape)
34 CALL_METHOD 1
36 STORE_FAST 1 (x)
277 38 LOAD_FAST 1 (x)
40 LOAD_METHOD 4 (permute)
42 LOAD_CONST 2 (0)
44 LOAD_CONST 3 (2)
46 LOAD_CONST 4 (1)
48 LOAD_CONST 5 (3)
50 CALL_METHOD 4
52 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST x []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR size [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 0 [GetAttrVariable(TensorVariable(), size)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [SizeVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST -1 [SizeVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_SLICE 2 [SizeVariable(), ConstantVariable(NoneType), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_SUBSCR None [SizeVariable(), SliceVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [SizeVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR num_attention_heads [SizeVariable(), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [SizeVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR attention_head_size [SizeVariable(), ConstantVariable(int), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_TUPLE 2 [SizeVariable(), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_ADD None [SizeVariable(), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST new_x_shape [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST x []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR view [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST new_x_shape [GetAttrVariable(TensorVariable(), view)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [GetAttrVariable(TensorVariable(), view), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST x [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST x []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR permute [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 0 [GetAttrVariable(TensorVariable(), permute)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 2 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 1 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 3 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 4 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object transpose_for_scores at 0x7f6a10140710, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 274>
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST key_layer [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR transpose_for_scores [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable())]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR value [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable()), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable()), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable()), NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable()), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object transpose_for_scores at 0x7f6a10140710, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 274>
275 0 LOAD_FAST 1 (x)
2 LOAD_METHOD 0 (size)
4 CALL_METHOD 0
6 LOAD_CONST 0 (None)
8 LOAD_CONST 1 (-1)
10 BUILD_SLICE 2
12 BINARY_SUBSCR
14 LOAD_FAST 0 (self)
16 LOAD_ATTR 1 (num_attention_heads)
18 LOAD_FAST 0 (self)
20 LOAD_ATTR 2 (attention_head_size)
22 BUILD_TUPLE 2
24 BINARY_ADD
26 STORE_FAST 2 (new_x_shape)
276 28 LOAD_FAST 1 (x)
30 LOAD_METHOD 3 (view)
32 LOAD_FAST 2 (new_x_shape)
34 CALL_METHOD 1
36 STORE_FAST 1 (x)
277 38 LOAD_FAST 1 (x)
40 LOAD_METHOD 4 (permute)
42 LOAD_CONST 2 (0)
44 LOAD_CONST 3 (2)
46 LOAD_CONST 4 (1)
48 LOAD_CONST 5 (3)
50 CALL_METHOD 4
52 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST x []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR size [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 0 [GetAttrVariable(TensorVariable(), size)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [SizeVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST -1 [SizeVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_SLICE 2 [SizeVariable(), ConstantVariable(NoneType), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_SUBSCR None [SizeVariable(), SliceVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [SizeVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR num_attention_heads [SizeVariable(), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [SizeVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR attention_head_size [SizeVariable(), ConstantVariable(int), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_TUPLE 2 [SizeVariable(), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_ADD None [SizeVariable(), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST new_x_shape [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST x []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR view [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST new_x_shape [GetAttrVariable(TensorVariable(), view)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [GetAttrVariable(TensorVariable(), view), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST x [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST x []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR permute [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 0 [GetAttrVariable(TensorVariable(), permute)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 2 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 1 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 3 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 4 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object transpose_for_scores at 0x7f6a10140710, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 274>
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST value_layer [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR transpose_for_scores [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST mixed_query_layer [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable())]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable()), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object transpose_for_scores at 0x7f6a10140710, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 274>
275 0 LOAD_FAST 1 (x)
2 LOAD_METHOD 0 (size)
4 CALL_METHOD 0
6 LOAD_CONST 0 (None)
8 LOAD_CONST 1 (-1)
10 BUILD_SLICE 2
12 BINARY_SUBSCR
14 LOAD_FAST 0 (self)
16 LOAD_ATTR 1 (num_attention_heads)
18 LOAD_FAST 0 (self)
20 LOAD_ATTR 2 (attention_head_size)
22 BUILD_TUPLE 2
24 BINARY_ADD
26 STORE_FAST 2 (new_x_shape)
276 28 LOAD_FAST 1 (x)
30 LOAD_METHOD 3 (view)
32 LOAD_FAST 2 (new_x_shape)
34 CALL_METHOD 1
36 STORE_FAST 1 (x)
277 38 LOAD_FAST 1 (x)
40 LOAD_METHOD 4 (permute)
42 LOAD_CONST 2 (0)
44 LOAD_CONST 3 (2)
46 LOAD_CONST 4 (1)
48 LOAD_CONST 5 (3)
50 CALL_METHOD 4
52 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST x []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR size [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 0 [GetAttrVariable(TensorVariable(), size)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [SizeVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST -1 [SizeVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_SLICE 2 [SizeVariable(), ConstantVariable(NoneType), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_SUBSCR None [SizeVariable(), SliceVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [SizeVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR num_attention_heads [SizeVariable(), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [SizeVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR attention_head_size [SizeVariable(), ConstantVariable(int), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_TUPLE 2 [SizeVariable(), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_ADD None [SizeVariable(), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST new_x_shape [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST x []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR view [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST new_x_shape [GetAttrVariable(TensorVariable(), view)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [GetAttrVariable(TensorVariable(), view), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST x [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST x []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR permute [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 0 [GetAttrVariable(TensorVariable(), permute)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 2 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 1 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 3 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 4 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object transpose_for_scores at 0x7f6a10140710, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 274>
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST query_layer [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR is_decoder [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 236 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_GLOBAL torch []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR matmul [TorchVariable(<module 'torch' from '/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/torch/__init__.py'>)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST query_layer [TorchVariable(<built-in method matmul of type object at 0x7f6bb096cb40>)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST key_layer [TorchVariable(<built-in method matmul of type object at 0x7f6bb096cb40>), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR transpose [TorchVariable(<built-in method matmul of type object at 0x7f6bb096cb40>), TensorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST -1 [TorchVariable(<built-in method matmul of type object at 0x7f6bb096cb40>), TensorVariable(), GetAttrVariable(TensorVariable(), transpose)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST -2 [TorchVariable(<built-in method matmul of type object at 0x7f6bb096cb40>), TensorVariable(), GetAttrVariable(TensorVariable(), transpose), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 2 [TorchVariable(<built-in method matmul of type object at 0x7f6bb096cb40>), TensorVariable(), GetAttrVariable(TensorVariable(), transpose), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 2 [TorchVariable(<built-in method matmul of type object at 0x7f6bb096cb40>), TensorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST attention_scores [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR position_embedding_type [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST relative_key [ConstantVariable(str)]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP == [ConstantVariable(str), ConstantVariable(str)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_TRUE 280 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR position_embedding_type [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST relative_key_query [ConstantVariable(str)]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP == [ConstantVariable(str), ConstantVariable(str)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 478 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_scores []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_GLOBAL math [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR sqrt [TensorVariable(), TorchVariable(<module 'math' from '/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/lib-dynload/math.cpython-38-x86_64-linux-gnu.so'>)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [TensorVariable(), TorchVariable(<built-in function sqrt>)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR attention_head_size [TensorVariable(), TorchVariable(<built-in function sqrt>), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [TensorVariable(), TorchVariable(<built-in function sqrt>), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_TRUE_DIVIDE None [TensorVariable(), ConstantVariable(float)]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST attention_scores [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_mask []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP is not [TensorVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 512 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_scores []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_mask [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_ADD None [TensorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST attention_scores [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_GLOBAL nn []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR functional [TorchVariable(<module 'torch.nn' from '/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/torch/nn/__init__.py'>)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR softmax [TorchVariable(<module 'torch.nn.functional' from '/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/torch/nn/functional.py'>)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_scores [TorchVariable(<function softmax at 0x7f6a1d0fb5e0>)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST -1 [TorchVariable(<function softmax at 0x7f6a1d0fb5e0>), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST ('dim',) [TorchVariable(<function softmax at 0x7f6a1d0fb5e0>), TensorVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION_KW 2 [TorchVariable(<function softmax at 0x7f6a1d0fb5e0>), TensorVariable(), ConstantVariable(int), ConstantVariable(tuple)]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST attention_probs [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR dropout [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_probs [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST attention_probs [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST head_mask []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP is not [ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 556 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_GLOBAL torch []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR matmul [TorchVariable(<module 'torch' from '/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/torch/__init__.py'>)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_probs [TorchVariable(<built-in method matmul of type object at 0x7f6bb096cb40>)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST value_layer [TorchVariable(<built-in method matmul of type object at 0x7f6bb096cb40>), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 2 [TorchVariable(<built-in method matmul of type object at 0x7f6bb096cb40>), TensorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST context_layer [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST context_layer []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR permute [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 0 [GetAttrVariable(TensorVariable(), permute)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 2 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 1 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 3 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 4 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR contiguous [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 0 [GetAttrVariable(TensorVariable(), contiguous)]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST context_layer [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST context_layer []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR size [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 0 [GetAttrVariable(TensorVariable(), size)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [SizeVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST -2 [SizeVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_SLICE 2 [SizeVariable(), ConstantVariable(NoneType), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_SUBSCR None [SizeVariable(), SliceVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [SizeVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR all_head_size [SizeVariable(), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_TUPLE 1 [SizeVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_ADD None [SizeVariable(), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST new_context_layer_shape [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST context_layer []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR view [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST new_context_layer_shape [GetAttrVariable(TensorVariable(), view)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [GetAttrVariable(TensorVariable(), view), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST context_layer [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST output_attentions []
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 636 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST context_layer []
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_TUPLE 1 [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST outputs [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR is_decoder [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 660 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST outputs []
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object forward at 0x7f6a101407c0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 279>
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST self_outputs [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR output [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self_outputs [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 0 [NNModuleVariable(), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_SUBSCR None [NNModuleVariable(), TupleVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 2 [NNModuleVariable(), TensorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object forward at 0x7f6a101409d0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 381>
382 0 LOAD_FAST 0 (self)
2 LOAD_METHOD 0 (dense)
4 LOAD_FAST 1 (hidden_states)
6 CALL_METHOD 1
8 STORE_FAST 1 (hidden_states)
383 10 LOAD_FAST 0 (self)
12 LOAD_METHOD 1 (dropout)
14 LOAD_FAST 1 (hidden_states)
16 CALL_METHOD 1
18 STORE_FAST 1 (hidden_states)
384 20 LOAD_FAST 0 (self)
22 LOAD_METHOD 2 (LayerNorm)
24 LOAD_FAST 1 (hidden_states)
26 LOAD_FAST 2 (input_tensor)
28 BINARY_ADD
30 CALL_METHOD 1
32 STORE_FAST 1 (hidden_states)
385 34 LOAD_FAST 1 (hidden_states)
36 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR dense [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST hidden_states [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR dropout [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST hidden_states [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR LayerNorm [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST input_tensor [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_ADD None [NNModuleVariable(), TensorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST hidden_states [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states []
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object forward at 0x7f6a101409d0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 381>
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST attention_output [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_output []
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_TUPLE 1 [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self_outputs [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 1 [TupleVariable(), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [TupleVariable(), TupleVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_SLICE 2 [TupleVariable(), TupleVariable(), ConstantVariable(int), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_SUBSCR None [TupleVariable(), TupleVariable(), SliceVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_ADD None [TupleVariable(), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST outputs [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST outputs []
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object forward at 0x7f6a10140c90, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 413>
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST self_attention_outputs [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self_attention_outputs []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 0 [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_SUBSCR None [TupleVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST attention_output [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR is_decoder [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 80 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self_attention_outputs []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 1 [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [TupleVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_SLICE 2 [TupleVariable(), ConstantVariable(int), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_SUBSCR None [TupleVariable(), SliceVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST outputs [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None []
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST cross_attn_present_key_value [ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR is_decoder [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 222 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_GLOBAL apply_chunking_to_forward []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [UserFunctionVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR feed_forward_chunk [UserFunctionVariable(), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [UserFunctionVariable(), UserMethodVariable(<function BertLayer.feed_forward_chunk at 0x7f69ff5474c0>, NNModuleVariable())]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR chunk_size_feed_forward [UserFunctionVariable(), UserMethodVariable(<function BertLayer.feed_forward_chunk at 0x7f69ff5474c0>, NNModuleVariable()), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [UserFunctionVariable(), UserMethodVariable(<function BertLayer.feed_forward_chunk at 0x7f69ff5474c0>, NNModuleVariable()), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR seq_len_dim [UserFunctionVariable(), UserMethodVariable(<function BertLayer.feed_forward_chunk at 0x7f69ff5474c0>, NNModuleVariable()), ConstantVariable(int), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_output [UserFunctionVariable(), UserMethodVariable(<function BertLayer.feed_forward_chunk at 0x7f69ff5474c0>, NNModuleVariable()), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 4 [UserFunctionVariable(), UserMethodVariable(<function BertLayer.feed_forward_chunk at 0x7f69ff5474c0>, NNModuleVariable()), ConstantVariable(int), ConstantVariable(int), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object apply_chunking_to_forward at 0x7f6a10006030, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/pytorch_utils.py", line 169>
207 0 LOAD_GLOBAL 0 (len)
2 LOAD_FAST 3 (input_tensors)
4 CALL_FUNCTION 1
6 LOAD_CONST 1 (0)
8 COMPARE_OP 4 (>)
10 POP_JUMP_IF_TRUE 26
12 LOAD_GLOBAL 1 (AssertionError)
14 LOAD_FAST 3 (input_tensors)
16 FORMAT_VALUE 0
18 LOAD_CONST 2 (' has to be a tuple/list of tensors')
20 BUILD_STRING 2
22 CALL_FUNCTION 1
24 RAISE_VARARGS 1
210 >> 26 LOAD_GLOBAL 0 (len)
28 LOAD_GLOBAL 2 (inspect)
30 LOAD_METHOD 3 (signature)
32 LOAD_DEREF 1 (forward_fn)
34 CALL_METHOD 1
36 LOAD_ATTR 4 (parameters)
38 CALL_FUNCTION 1
40 STORE_FAST 4 (num_args_in_forward_chunk_fn)
211 42 LOAD_FAST 4 (num_args_in_forward_chunk_fn)
44 LOAD_GLOBAL 0 (len)
46 LOAD_FAST 3 (input_tensors)
48 CALL_FUNCTION 1
50 COMPARE_OP 3 (!=)
52 POP_JUMP_IF_FALSE 80
212 54 LOAD_GLOBAL 5 (ValueError)
213 56 LOAD_CONST 3 ('forward_chunk_fn expects ')
58 LOAD_FAST 4 (num_args_in_forward_chunk_fn)
60 FORMAT_VALUE 0
62 LOAD_CONST 4 (' arguments, but only ')
64 LOAD_GLOBAL 0 (len)
66 LOAD_FAST 3 (input_tensors)
68 CALL_FUNCTION 1
70 FORMAT_VALUE 0
72 LOAD_CONST 5 (' input tensors are given')
74 BUILD_STRING 5
212 76 CALL_FUNCTION 1
78 RAISE_VARARGS 1
217 >> 80 LOAD_FAST 1 (chunk_size)
82 LOAD_CONST 1 (0)
84 COMPARE_OP 4 (>)
86 EXTENDED_ARG 1
88 POP_JUMP_IF_FALSE 288
218 90 LOAD_FAST 3 (input_tensors)
92 LOAD_CONST 1 (0)
94 BINARY_SUBSCR
96 LOAD_ATTR 6 (shape)
98 LOAD_DEREF 0 (chunk_dim)
100 BINARY_SUBSCR
102 STORE_FAST 5 (tensor_shape)
219 104 LOAD_FAST 3 (input_tensors)
106 GET_ITER
>> 108 FOR_ITER 44 (to 154)
110 STORE_FAST 6 (input_tensor)
220 112 LOAD_FAST 6 (input_tensor)
114 LOAD_ATTR 6 (shape)
116 LOAD_DEREF 0 (chunk_dim)
118 BINARY_SUBSCR
120 LOAD_FAST 5 (tensor_shape)
122 COMPARE_OP 3 (!=)
124 POP_JUMP_IF_FALSE 108
221 126 LOAD_GLOBAL 5 (ValueError)
222 128 LOAD_CONST 6 ('All input tenors have to be of the same shape: ')
130 LOAD_FAST 5 (tensor_shape)
132 FORMAT_VALUE 0
134 LOAD_CONST 7 (', found shape ')
136 LOAD_FAST 6 (input_tensor)
138 LOAD_ATTR 6 (shape)
140 LOAD_DEREF 0 (chunk_dim)
142 BINARY_SUBSCR
144 FORMAT_VALUE 0
146 BUILD_STRING 4
221 148 CALL_FUNCTION 1
150 RAISE_VARARGS 1
152 JUMP_ABSOLUTE 108
226 >> 154 LOAD_FAST 3 (input_tensors)
156 LOAD_CONST 1 (0)
158 BINARY_SUBSCR
160 LOAD_ATTR 6 (shape)
162 LOAD_DEREF 0 (chunk_dim)
164 BINARY_SUBSCR
166 LOAD_FAST 1 (chunk_size)
168 BINARY_MODULO
170 LOAD_CONST 1 (0)
172 COMPARE_OP 3 (!=)
174 POP_JUMP_IF_FALSE 206
227 176 LOAD_GLOBAL 5 (ValueError)
228 178 LOAD_CONST 8 ('The dimension to be chunked ')
180 LOAD_FAST 3 (input_tensors)
182 LOAD_CONST 1 (0)
184 BINARY_SUBSCR
186 LOAD_ATTR 6 (shape)
188 LOAD_DEREF 0 (chunk_dim)
190 BINARY_SUBSCR
192 FORMAT_VALUE 0
194 LOAD_CONST 9 (' has to be a multiple of the chunk size ')
196 LOAD_FAST 1 (chunk_size)
198 FORMAT_VALUE 0
200 BUILD_STRING 4
227 202 CALL_FUNCTION 1
204 RAISE_VARARGS 1
232 >> 206 LOAD_FAST 3 (input_tensors)
208 LOAD_CONST 1 (0)
210 BINARY_SUBSCR
212 LOAD_ATTR 6 (shape)
214 LOAD_DEREF 0 (chunk_dim)
216 BINARY_SUBSCR
218 LOAD_FAST 1 (chunk_size)
220 BINARY_FLOOR_DIVIDE
222 STORE_DEREF 2 (num_chunks)
235 224 LOAD_GLOBAL 7 (tuple)
226 LOAD_CLOSURE 0 (chunk_dim)
228 LOAD_CLOSURE 2 (num_chunks)
230 BUILD_TUPLE 2
232 LOAD_CONST 10 (<code object <genexpr> at 0x7f6a10083ea0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/pytorch_utils.py", line 235>)
234 LOAD_CONST 11 ('apply_chunking_to_forward.<locals>.<genexpr>')
236 MAKE_FUNCTION 8 (closure)
238 LOAD_FAST 3 (input_tensors)
240 GET_ITER
242 CALL_FUNCTION 1
244 CALL_FUNCTION 1
246 STORE_FAST 7 (input_tensors_chunks)
237 248 LOAD_GLOBAL 7 (tuple)
250 LOAD_CLOSURE 1 (forward_fn)
252 BUILD_TUPLE 1
254 LOAD_CONST 12 (<code object <genexpr> at 0x7f6a10083f50, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/pytorch_utils.py", line 237>)
256 LOAD_CONST 11 ('apply_chunking_to_forward.<locals>.<genexpr>')
258 MAKE_FUNCTION 8 (closure)
260 LOAD_GLOBAL 8 (zip)
262 LOAD_FAST 7 (input_tensors_chunks)
264 CALL_FUNCTION_EX 0
266 GET_ITER
268 CALL_FUNCTION 1
270 CALL_FUNCTION 1
272 STORE_FAST 8 (output_chunks)
239 274 LOAD_GLOBAL 9 (torch)
276 LOAD_ATTR 10 (cat)
278 LOAD_FAST 8 (output_chunks)
280 LOAD_DEREF 0 (chunk_dim)
282 LOAD_CONST 13 (('dim',))
284 CALL_FUNCTION_KW 2
286 RETURN_VALUE
241 >> 288 LOAD_DEREF 1 (forward_fn)
290 LOAD_FAST 3 (input_tensors)
292 CALL_FUNCTION_EX 0
294 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_GLOBAL len []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST input_tensors [BuiltinVariable(len)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [BuiltinVariable(len), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 0 [ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP > [ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_TRUE 26 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_GLOBAL len []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_GLOBAL inspect [BuiltinVariable(len)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR signature [BuiltinVariable(len), PythonModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_DEREF forward_fn [BuiltinVariable(len), LambdaVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [BuiltinVariable(len), LambdaVariable(), UserMethodVariable(<function BertLayer.feed_forward_chunk at 0x7f69ff5474c0>, NNModuleVariable())]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR parameters [BuiltinVariable(len), InspectSignatureVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [BuiltinVariable(len), GetAttrVariable(InspectSignatureVariable(), parameters)]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST num_args_in_forward_chunk_fn [ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST num_args_in_forward_chunk_fn []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_GLOBAL len [ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST input_tensors [ConstantVariable(int), BuiltinVariable(len)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [ConstantVariable(int), BuiltinVariable(len), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP != [ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 80 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST chunk_size []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 0 [ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP > [ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 288 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_DEREF forward_fn []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST input_tensors [UserMethodVariable(<function BertLayer.feed_forward_chunk at 0x7f69ff5474c0>, NNModuleVariable())]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION_EX 0 [UserMethodVariable(<function BertLayer.feed_forward_chunk at 0x7f69ff5474c0>, NNModuleVariable()), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object feed_forward_chunk at 0x7f6a100c83a0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 546>
547 0 LOAD_FAST 0 (self)
2 LOAD_METHOD 0 (intermediate)
4 LOAD_FAST 1 (attention_output)
6 CALL_METHOD 1
8 STORE_FAST 2 (intermediate_output)
548 10 LOAD_FAST 0 (self)
12 LOAD_METHOD 1 (output)
14 LOAD_FAST 2 (intermediate_output)
16 LOAD_FAST 1 (attention_output)
18 CALL_METHOD 2
20 STORE_FAST 3 (layer_output)
549 22 LOAD_FAST 3 (layer_output)
24 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR intermediate [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_output [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object forward at 0x7f6a10140ea0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 446>
447 0 LOAD_FAST 0 (self)
2 LOAD_METHOD 0 (dense)
4 LOAD_FAST 1 (hidden_states)
6 CALL_METHOD 1
8 STORE_FAST 1 (hidden_states)
448 10 LOAD_FAST 0 (self)
12 LOAD_METHOD 1 (intermediate_act_fn)
14 LOAD_FAST 1 (hidden_states)
16 CALL_METHOD 1
18 STORE_FAST 1 (hidden_states)
449 20 LOAD_FAST 1 (hidden_states)
22 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR dense [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST hidden_states [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR intermediate_act_fn [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object forward at 0x7f6a100e12f0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/activations.py", line 55>
56 0 LOAD_FAST 0 (self)
2 LOAD_METHOD 0 (act)
4 LOAD_FAST 1 (input)
6 CALL_METHOD 1
8 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR act [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST input [TorchVariable(<built-in function gelu>)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [TorchVariable(<built-in function gelu>), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object forward at 0x7f6a100e12f0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/activations.py", line 55>
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST hidden_states [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states []
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object forward at 0x7f6a10140ea0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 446>
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST intermediate_output [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR output [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST intermediate_output [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_output [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 2 [NNModuleVariable(), TensorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object forward at 0x7f6a100c80e0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 459>
460 0 LOAD_FAST 0 (self)
2 LOAD_METHOD 0 (dense)
4 LOAD_FAST 1 (hidden_states)
6 CALL_METHOD 1
8 STORE_FAST 1 (hidden_states)
461 10 LOAD_FAST 0 (self)
12 LOAD_METHOD 1 (dropout)
14 LOAD_FAST 1 (hidden_states)
16 CALL_METHOD 1
18 STORE_FAST 1 (hidden_states)
462 20 LOAD_FAST 0 (self)
22 LOAD_METHOD 2 (LayerNorm)
24 LOAD_FAST 1 (hidden_states)
26 LOAD_FAST 2 (input_tensor)
28 BINARY_ADD
30 CALL_METHOD 1
32 STORE_FAST 1 (hidden_states)
463 34 LOAD_FAST 1 (hidden_states)
36 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR dense [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST hidden_states [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR dropout [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST hidden_states [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR LayerNorm [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST input_tensor [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_ADD None [NNModuleVariable(), TensorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST hidden_states [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states []
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object forward at 0x7f6a100c80e0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 459>
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST layer_output [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST layer_output []
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object feed_forward_chunk at 0x7f6a100c83a0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 546>
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object apply_chunking_to_forward at 0x7f6a10006030, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/pytorch_utils.py", line 169>
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST layer_output [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST layer_output []
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_TUPLE 1 [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST outputs [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_ADD None [TupleVariable(), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST outputs [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR is_decoder [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 270 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST outputs []
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object forward at 0x7f6a100c82f0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 481>
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST layer_outputs [ListIteratorVariable(), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST layer_outputs [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 0 [ListIteratorVariable(), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_SUBSCR None [ListIteratorVariable(), TupleVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST hidden_states [ListIteratorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST use_cache [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 248 [ListIteratorVariable(), ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_DEREF output_attentions [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 66 [ListIteratorVariable(), ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE FOR_ITER 290 [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE UNPACK_SEQUENCE 2 [ListIteratorVariable(), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST i [ListIteratorVariable(), NNModuleVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST layer_module [ListIteratorVariable(), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST output_hidden_states [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 88 [ListIteratorVariable(), ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST head_mask [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [ListIteratorVariable(), ListVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP is not [ListIteratorVariable(), ListVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 104 [ListIteratorVariable(), ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST head_mask [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST i [ListIteratorVariable(), ListVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_SUBSCR None [ListIteratorVariable(), ListVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE JUMP_FORWARD 106 [ListIteratorVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST layer_head_mask [ListIteratorVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST past_key_values [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [ListIteratorVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP is not [ListIteratorVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 124 [ListIteratorVariable(), ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_DEREF past_key_value [ListIteratorVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR gradient_checkpointing [ListIteratorVariable(), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 202 [ListIteratorVariable(), ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST layer_module [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [ListIteratorVariable(), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_mask [ListIteratorVariable(), NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST layer_head_mask [ListIteratorVariable(), NNModuleVariable(), TensorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST encoder_hidden_states [ListIteratorVariable(), NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST encoder_attention_mask [ListIteratorVariable(), NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_DEREF past_key_value [ListIteratorVariable(), NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_DEREF output_attentions [ListIteratorVariable(), NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 7 [ListIteratorVariable(), NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object forward at 0x7f6a100c82f0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 481>
492 0 LOAD_FAST 6 (past_key_value)
2 LOAD_CONST 0 (None)
4 COMPARE_OP 9 (is not)
6 POP_JUMP_IF_FALSE 20
8 LOAD_FAST 6 (past_key_value)
10 LOAD_CONST 0 (None)
12 LOAD_CONST 1 (2)
14 BUILD_SLICE 2
16 BINARY_SUBSCR
18 JUMP_FORWARD 2 (to 22)
>> 20 LOAD_CONST 0 (None)
>> 22 STORE_FAST 8 (self_attn_past_key_value)
493 24 LOAD_FAST 0 (self)
26 LOAD_ATTR 0 (attention)
494 28 LOAD_FAST 1 (hidden_states)
495 30 LOAD_FAST 2 (attention_mask)
496 32 LOAD_FAST 3 (head_mask)
497 34 LOAD_FAST 7 (output_attentions)
498 36 LOAD_FAST 8 (self_attn_past_key_value)
493 38 LOAD_CONST 2 (('output_attentions', 'past_key_value'))
40 CALL_FUNCTION_KW 5
42 STORE_FAST 9 (self_attention_outputs)
500 44 LOAD_FAST 9 (self_attention_outputs)
46 LOAD_CONST 3 (0)
48 BINARY_SUBSCR
50 STORE_FAST 10 (attention_output)
503 52 LOAD_FAST 0 (self)
54 LOAD_ATTR 1 (is_decoder)
56 POP_JUMP_IF_FALSE 80
504 58 LOAD_FAST 9 (self_attention_outputs)
60 LOAD_CONST 4 (1)
62 LOAD_CONST 5 (-1)
64 BUILD_SLICE 2
66 BINARY_SUBSCR
68 STORE_FAST 11 (outputs)
505 70 LOAD_FAST 9 (self_attention_outputs)
72 LOAD_CONST 5 (-1)
74 BINARY_SUBSCR
76 STORE_FAST 12 (present_key_value)
78 JUMP_FORWARD 12 (to 92)
507 >> 80 LOAD_FAST 9 (self_attention_outputs)
82 LOAD_CONST 4 (1)
84 LOAD_CONST 0 (None)
86 BUILD_SLICE 2
88 BINARY_SUBSCR
90 STORE_FAST 11 (outputs)
509 >> 92 LOAD_CONST 0 (None)
94 STORE_FAST 13 (cross_attn_present_key_value)
510 96 LOAD_FAST 0 (self)
98 LOAD_ATTR 1 (is_decoder)
100 POP_JUMP_IF_FALSE 222
102 LOAD_FAST 4 (encoder_hidden_states)
104 LOAD_CONST 0 (None)
106 COMPARE_OP 9 (is not)
108 POP_JUMP_IF_FALSE 222
511 110 LOAD_GLOBAL 2 (hasattr)
112 LOAD_FAST 0 (self)
114 LOAD_CONST 6 ('crossattention')
116 CALL_FUNCTION 2
118 POP_JUMP_IF_TRUE 136
512 120 LOAD_GLOBAL 3 (ValueError)
513 122 LOAD_CONST 7 ('If `encoder_hidden_states` are passed, ')
124 LOAD_FAST 0 (self)
126 FORMAT_VALUE 0
128 LOAD_CONST 8 (' has to be instantiated with cross-attention layers by setting `config.add_cross_attention=True`')
130 BUILD_STRING 3
512 132 CALL_FUNCTION 1
134 RAISE_VARARGS 1
518 >> 136 LOAD_FAST 6 (past_key_value)
138 LOAD_CONST 0 (None)
140 COMPARE_OP 9 (is not)
142 POP_JUMP_IF_FALSE 156
144 LOAD_FAST 6 (past_key_value)
146 LOAD_CONST 9 (-2)
148 LOAD_CONST 0 (None)
150 BUILD_SLICE 2
152 BINARY_SUBSCR
154 JUMP_FORWARD 2 (to 158)
>> 156 LOAD_CONST 0 (None)
>> 158 STORE_FAST 14 (cross_attn_past_key_value)
519 160 LOAD_FAST 0 (self)
162 LOAD_METHOD 4 (crossattention)
520 164 LOAD_FAST 10 (attention_output)
521 166 LOAD_FAST 2 (attention_mask)
522 168 LOAD_FAST 3 (head_mask)
523 170 LOAD_FAST 4 (encoder_hidden_states)
524 172 LOAD_FAST 5 (encoder_attention_mask)
525 174 LOAD_FAST 14 (cross_attn_past_key_value)
526 176 LOAD_FAST 7 (output_attentions)
519 178 CALL_METHOD 7
180 STORE_FAST 15 (cross_attention_outputs)
528 182 LOAD_FAST 15 (cross_attention_outputs)
184 LOAD_CONST 3 (0)
186 BINARY_SUBSCR
188 STORE_FAST 10 (attention_output)
529 190 LOAD_FAST 11 (outputs)
192 LOAD_FAST 15 (cross_attention_outputs)
194 LOAD_CONST 4 (1)
196 LOAD_CONST 5 (-1)
198 BUILD_SLICE 2
200 BINARY_SUBSCR
202 BINARY_ADD
204 STORE_FAST 11 (outputs)
532 206 LOAD_FAST 15 (cross_attention_outputs)
208 LOAD_CONST 5 (-1)
210 BINARY_SUBSCR
212 STORE_FAST 13 (cross_attn_present_key_value)
533 214 LOAD_FAST 12 (present_key_value)
216 LOAD_FAST 13 (cross_attn_present_key_value)
218 BINARY_ADD
220 STORE_FAST 12 (present_key_value)
535 >> 222 LOAD_GLOBAL 5 (apply_chunking_to_forward)
536 224 LOAD_FAST 0 (self)
226 LOAD_ATTR 6 (feed_forward_chunk)
228 LOAD_FAST 0 (self)
230 LOAD_ATTR 7 (chunk_size_feed_forward)
232 LOAD_FAST 0 (self)
234 LOAD_ATTR 8 (seq_len_dim)
236 LOAD_FAST 10 (attention_output)
535 238 CALL_FUNCTION 4
240 STORE_FAST 16 (layer_output)
538 242 LOAD_FAST 16 (layer_output)
244 BUILD_TUPLE 1
246 LOAD_FAST 11 (outputs)
248 BINARY_ADD
250 STORE_FAST 11 (outputs)
541 252 LOAD_FAST 0 (self)
254 LOAD_ATTR 1 (is_decoder)
256 EXTENDED_ARG 1
258 POP_JUMP_IF_FALSE 270
542 260 LOAD_FAST 11 (outputs)
262 LOAD_FAST 12 (present_key_value)
264 BUILD_TUPLE 1
266 BINARY_ADD
268 STORE_FAST 11 (outputs)
544 >> 270 LOAD_FAST 11 (outputs)
272 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST past_key_value []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP is not [ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 20 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None []
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST self_attn_past_key_value [ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR attention [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_mask [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST head_mask [NNModuleVariable(), TensorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST output_attentions [NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self_attn_past_key_value [NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST ('output_attentions', 'past_key_value') [NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(bool), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION_KW 5 [NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(bool), ConstantVariable(NoneType), ConstantVariable(tuple)]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object forward at 0x7f6a10140c90, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 413>
423 0 LOAD_FAST 0 (self)
2 LOAD_METHOD 0 (self)
424 4 LOAD_FAST 1 (hidden_states)
425 6 LOAD_FAST 2 (attention_mask)
426 8 LOAD_FAST 3 (head_mask)
427 10 LOAD_FAST 4 (encoder_hidden_states)
428 12 LOAD_FAST 5 (encoder_attention_mask)
429 14 LOAD_FAST 6 (past_key_value)
430 16 LOAD_FAST 7 (output_attentions)
423 18 CALL_METHOD 7
20 STORE_FAST 8 (self_outputs)
432 22 LOAD_FAST 0 (self)
24 LOAD_METHOD 1 (output)
26 LOAD_FAST 8 (self_outputs)
28 LOAD_CONST 1 (0)
30 BINARY_SUBSCR
32 LOAD_FAST 1 (hidden_states)
34 CALL_METHOD 2
36 STORE_FAST 9 (attention_output)
433 38 LOAD_FAST 9 (attention_output)
40 BUILD_TUPLE 1
42 LOAD_FAST 8 (self_outputs)
44 LOAD_CONST 2 (1)
46 LOAD_CONST 0 (None)
48 BUILD_SLICE 2
50 BINARY_SUBSCR
52 BINARY_ADD
54 STORE_FAST 10 (outputs)
434 56 LOAD_FAST 10 (outputs)
58 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR self [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_mask [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST head_mask [NNModuleVariable(), TensorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST encoder_hidden_states [NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST encoder_attention_mask [NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST past_key_value [NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST output_attentions [NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 7 [NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object forward at 0x7f6a101407c0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 279>
289 0 LOAD_FAST 0 (self)
2 LOAD_METHOD 0 (query)
4 LOAD_FAST 1 (hidden_states)
6 CALL_METHOD 1
8 STORE_FAST 8 (mixed_query_layer)
294 10 LOAD_FAST 4 (encoder_hidden_states)
12 LOAD_CONST 0 (None)
14 COMPARE_OP 9 (is not)
16 STORE_FAST 9 (is_cross_attention)
296 18 LOAD_FAST 9 (is_cross_attention)
20 POP_JUMP_IF_FALSE 52
22 LOAD_FAST 6 (past_key_value)
24 LOAD_CONST 0 (None)
26 COMPARE_OP 9 (is not)
28 POP_JUMP_IF_FALSE 52
298 30 LOAD_FAST 6 (past_key_value)
32 LOAD_CONST 1 (0)
34 BINARY_SUBSCR
36 STORE_FAST 10 (key_layer)
299 38 LOAD_FAST 6 (past_key_value)
40 LOAD_CONST 2 (1)
42 BINARY_SUBSCR
44 STORE_FAST 11 (value_layer)
300 46 LOAD_FAST 5 (encoder_attention_mask)
48 STORE_FAST 2 (attention_mask)
50 JUMP_FORWARD 160 (to 212)
301 >> 52 LOAD_FAST 9 (is_cross_attention)
54 POP_JUMP_IF_FALSE 94
302 56 LOAD_FAST 0 (self)
58 LOAD_METHOD 1 (transpose_for_scores)
60 LOAD_FAST 0 (self)
62 LOAD_METHOD 2 (key)
64 LOAD_FAST 4 (encoder_hidden_states)
66 CALL_METHOD 1
68 CALL_METHOD 1
70 STORE_FAST 10 (key_layer)
303 72 LOAD_FAST 0 (self)
74 LOAD_METHOD 1 (transpose_for_scores)
76 LOAD_FAST 0 (self)
78 LOAD_METHOD 3 (value)
80 LOAD_FAST 4 (encoder_hidden_states)
82 CALL_METHOD 1
84 CALL_METHOD 1
86 STORE_FAST 11 (value_layer)
304 88 LOAD_FAST 5 (encoder_attention_mask)
90 STORE_FAST 2 (attention_mask)
92 JUMP_FORWARD 118 (to 212)
305 >> 94 LOAD_FAST 6 (past_key_value)
96 LOAD_CONST 0 (None)
98 COMPARE_OP 9 (is not)
100 POP_JUMP_IF_FALSE 180
306 102 LOAD_FAST 0 (self)
104 LOAD_METHOD 1 (transpose_for_scores)
106 LOAD_FAST 0 (self)
108 LOAD_METHOD 2 (key)
110 LOAD_FAST 1 (hidden_states)
112 CALL_METHOD 1
114 CALL_METHOD 1
116 STORE_FAST 10 (key_layer)
307 118 LOAD_FAST 0 (self)
120 LOAD_METHOD 1 (transpose_for_scores)
122 LOAD_FAST 0 (self)
124 LOAD_METHOD 3 (value)
126 LOAD_FAST 1 (hidden_states)
128 CALL_METHOD 1
130 CALL_METHOD 1
132 STORE_FAST 11 (value_layer)
308 134 LOAD_GLOBAL 4 (torch)
136 LOAD_ATTR 5 (cat)
138 LOAD_FAST 6 (past_key_value)
140 LOAD_CONST 1 (0)
142 BINARY_SUBSCR
144 LOAD_FAST 10 (key_layer)
146 BUILD_LIST 2
148 LOAD_CONST 3 (2)
150 LOAD_CONST 4 (('dim',))
152 CALL_FUNCTION_KW 2
154 STORE_FAST 10 (key_layer)
309 156 LOAD_GLOBAL 4 (torch)
158 LOAD_ATTR 5 (cat)
160 LOAD_FAST 6 (past_key_value)
162 LOAD_CONST 2 (1)
164 BINARY_SUBSCR
166 LOAD_FAST 11 (value_layer)
168 BUILD_LIST 2
170 LOAD_CONST 3 (2)
172 LOAD_CONST 4 (('dim',))
174 CALL_FUNCTION_KW 2
176 STORE_FAST 11 (value_layer)
178 JUMP_FORWARD 32 (to 212)
311 >> 180 LOAD_FAST 0 (self)
182 LOAD_METHOD 1 (transpose_for_scores)
184 LOAD_FAST 0 (self)
186 LOAD_METHOD 2 (key)
188 LOAD_FAST 1 (hidden_states)
190 CALL_METHOD 1
192 CALL_METHOD 1
194 STORE_FAST 10 (key_layer)
312 196 LOAD_FAST 0 (self)
198 LOAD_METHOD 1 (transpose_for_scores)
200 LOAD_FAST 0 (self)
202 LOAD_METHOD 3 (value)
204 LOAD_FAST 1 (hidden_states)
206 CALL_METHOD 1
208 CALL_METHOD 1
210 STORE_FAST 11 (value_layer)
314 >> 212 LOAD_FAST 0 (self)
214 LOAD_METHOD 1 (transpose_for_scores)
216 LOAD_FAST 8 (mixed_query_layer)
218 CALL_METHOD 1
220 STORE_FAST 12 (query_layer)
316 222 LOAD_FAST 0 (self)
224 LOAD_ATTR 6 (is_decoder)
226 POP_JUMP_IF_FALSE 236
324 228 LOAD_FAST 10 (key_layer)
230 LOAD_FAST 11 (value_layer)
232 BUILD_TUPLE 2
234 STORE_FAST 6 (past_key_value)
327 >> 236 LOAD_GLOBAL 4 (torch)
238 LOAD_METHOD 7 (matmul)
240 LOAD_FAST 12 (query_layer)
242 LOAD_FAST 10 (key_layer)
244 LOAD_METHOD 8 (transpose)
246 LOAD_CONST 5 (-1)
248 LOAD_CONST 6 (-2)
250 CALL_METHOD 2
252 CALL_METHOD 2
254 STORE_FAST 13 (attention_scores)
329 256 LOAD_FAST 0 (self)
258 LOAD_ATTR 9 (position_embedding_type)
260 LOAD_CONST 7 ('relative_key')
262 COMPARE_OP 2 (==)
264 EXTENDED_ARG 1
266 POP_JUMP_IF_TRUE 280
268 LOAD_FAST 0 (self)
270 LOAD_ATTR 9 (position_embedding_type)
272 LOAD_CONST 8 ('relative_key_query')
274 COMPARE_OP 2 (==)
276 EXTENDED_ARG 1
278 POP_JUMP_IF_FALSE 478
330 >> 280 LOAD_FAST 1 (hidden_states)
282 LOAD_METHOD 10 (size)
284 CALL_METHOD 0
286 LOAD_CONST 2 (1)
288 BINARY_SUBSCR
290 STORE_FAST 14 (seq_length)
331 292 LOAD_GLOBAL 4 (torch)
294 LOAD_ATTR 11 (arange)
296 LOAD_FAST 14 (seq_length)
298 LOAD_GLOBAL 4 (torch)
300 LOAD_ATTR 12 (long)
302 LOAD_FAST 1 (hidden_states)
304 LOAD_ATTR 13 (device)
306 LOAD_CONST 9 (('dtype', 'device'))
308 CALL_FUNCTION_KW 3
310 LOAD_METHOD 14 (view)
312 LOAD_CONST 5 (-1)
314 LOAD_CONST 2 (1)
316 CALL_METHOD 2
318 STORE_FAST 15 (position_ids_l)
332 320 LOAD_GLOBAL 4 (torch)
322 LOAD_ATTR 11 (arange)
324 LOAD_FAST 14 (seq_length)
326 LOAD_GLOBAL 4 (torch)
328 LOAD_ATTR 12 (long)
330 LOAD_FAST 1 (hidden_states)
332 LOAD_ATTR 13 (device)
334 LOAD_CONST 9 (('dtype', 'device'))
336 CALL_FUNCTION_KW 3
338 LOAD_METHOD 14 (view)
340 LOAD_CONST 2 (1)
342 LOAD_CONST 5 (-1)
344 CALL_METHOD 2
346 STORE_FAST 16 (position_ids_r)
333 348 LOAD_FAST 15 (position_ids_l)
350 LOAD_FAST 16 (position_ids_r)
352 BINARY_SUBTRACT
354 STORE_FAST 17 (distance)
334 356 LOAD_FAST 0 (self)
358 LOAD_METHOD 15 (distance_embedding)
360 LOAD_FAST 17 (distance)
362 LOAD_FAST 0 (self)
364 LOAD_ATTR 16 (max_position_embeddings)
366 BINARY_ADD
368 LOAD_CONST 2 (1)
370 BINARY_SUBTRACT
372 CALL_METHOD 1
374 STORE_FAST 18 (positional_embedding)
335 376 LOAD_FAST 18 (positional_embedding)
378 LOAD_ATTR 17 (to)
380 LOAD_FAST 12 (query_layer)
382 LOAD_ATTR 18 (dtype)
384 LOAD_CONST 10 (('dtype',))
386 CALL_FUNCTION_KW 1
388 STORE_FAST 18 (positional_embedding)
337 390 LOAD_FAST 0 (self)
392 LOAD_ATTR 9 (position_embedding_type)
394 LOAD_CONST 7 ('relative_key')
396 COMPARE_OP 2 (==)
398 EXTENDED_ARG 1
400 POP_JUMP_IF_FALSE 426
338 402 LOAD_GLOBAL 4 (torch)
404 LOAD_METHOD 19 (einsum)
406 LOAD_CONST 11 ('bhld,lrd->bhlr')
408 LOAD_FAST 12 (query_layer)
410 LOAD_FAST 18 (positional_embedding)
412 CALL_METHOD 3
414 STORE_FAST 19 (relative_position_scores)
339 416 LOAD_FAST 13 (attention_scores)
418 LOAD_FAST 19 (relative_position_scores)
420 BINARY_ADD
422 STORE_FAST 13 (attention_scores)
424 JUMP_FORWARD 52 (to 478)
340 >> 426 LOAD_FAST 0 (self)
428 LOAD_ATTR 9 (position_embedding_type)
430 LOAD_CONST 8 ('relative_key_query')
432 COMPARE_OP 2 (==)
434 EXTENDED_ARG 1
436 POP_JUMP_IF_FALSE 478
341 438 LOAD_GLOBAL 4 (torch)
440 LOAD_METHOD 19 (einsum)
442 LOAD_CONST 11 ('bhld,lrd->bhlr')
444 LOAD_FAST 12 (query_layer)
446 LOAD_FAST 18 (positional_embedding)
448 CALL_METHOD 3
450 STORE_FAST 20 (relative_position_scores_query)
342 452 LOAD_GLOBAL 4 (torch)
454 LOAD_METHOD 19 (einsum)
456 LOAD_CONST 12 ('bhrd,lrd->bhlr')
458 LOAD_FAST 10 (key_layer)
460 LOAD_FAST 18 (positional_embedding)
462 CALL_METHOD 3
464 STORE_FAST 21 (relative_position_scores_key)
343 466 LOAD_FAST 13 (attention_scores)
468 LOAD_FAST 20 (relative_position_scores_query)
470 BINARY_ADD
472 LOAD_FAST 21 (relative_position_scores_key)
474 BINARY_ADD
476 STORE_FAST 13 (attention_scores)
345 >> 478 LOAD_FAST 13 (attention_scores)
480 LOAD_GLOBAL 20 (math)
482 LOAD_METHOD 21 (sqrt)
484 LOAD_FAST 0 (self)
486 LOAD_ATTR 22 (attention_head_size)
488 CALL_METHOD 1
490 BINARY_TRUE_DIVIDE
492 STORE_FAST 13 (attention_scores)
346 494 LOAD_FAST 2 (attention_mask)
496 LOAD_CONST 0 (None)
498 COMPARE_OP 9 (is not)
500 EXTENDED_ARG 2
502 POP_JUMP_IF_FALSE 512
348 504 LOAD_FAST 13 (attention_scores)
506 LOAD_FAST 2 (attention_mask)
508 BINARY_ADD
510 STORE_FAST 13 (attention_scores)
351 >> 512 LOAD_GLOBAL 23 (nn)
514 LOAD_ATTR 24 (functional)
516 LOAD_ATTR 25 (softmax)
518 LOAD_FAST 13 (attention_scores)
520 LOAD_CONST 5 (-1)
522 LOAD_CONST 4 (('dim',))
524 CALL_FUNCTION_KW 2
526 STORE_FAST 22 (attention_probs)
355 528 LOAD_FAST 0 (self)
530 LOAD_METHOD 26 (dropout)
532 LOAD_FAST 22 (attention_probs)
534 CALL_METHOD 1
536 STORE_FAST 22 (attention_probs)
358 538 LOAD_FAST 3 (head_mask)
540 LOAD_CONST 0 (None)
542 COMPARE_OP 9 (is not)
544 EXTENDED_ARG 2
546 POP_JUMP_IF_FALSE 556
359 548 LOAD_FAST 22 (attention_probs)
550 LOAD_FAST 3 (head_mask)
552 BINARY_MULTIPLY
554 STORE_FAST 22 (attention_probs)
361 >> 556 LOAD_GLOBAL 4 (torch)
558 LOAD_METHOD 7 (matmul)
560 LOAD_FAST 22 (attention_probs)
562 LOAD_FAST 11 (value_layer)
564 CALL_METHOD 2
566 STORE_FAST 23 (context_layer)
363 568 LOAD_FAST 23 (context_layer)
570 LOAD_METHOD 27 (permute)
572 LOAD_CONST 1 (0)
574 LOAD_CONST 3 (2)
576 LOAD_CONST 2 (1)
578 LOAD_CONST 13 (3)
580 CALL_METHOD 4
582 LOAD_METHOD 28 (contiguous)
584 CALL_METHOD 0
586 STORE_FAST 23 (context_layer)
364 588 LOAD_FAST 23 (context_layer)
590 LOAD_METHOD 10 (size)
592 CALL_METHOD 0
594 LOAD_CONST 0 (None)
596 LOAD_CONST 6 (-2)
598 BUILD_SLICE 2
600 BINARY_SUBSCR
602 LOAD_FAST 0 (self)
604 LOAD_ATTR 29 (all_head_size)
606 BUILD_TUPLE 1
608 BINARY_ADD
610 STORE_FAST 24 (new_context_layer_shape)
365 612 LOAD_FAST 23 (context_layer)
614 LOAD_METHOD 14 (view)
616 LOAD_FAST 24 (new_context_layer_shape)
618 CALL_METHOD 1
620 STORE_FAST 23 (context_layer)
367 622 LOAD_FAST 7 (output_attentions)
624 EXTENDED_ARG 2
626 POP_JUMP_IF_FALSE 636
628 LOAD_FAST 23 (context_layer)
630 LOAD_FAST 22 (attention_probs)
632 BUILD_TUPLE 2
634 JUMP_FORWARD 4 (to 640)
>> 636 LOAD_FAST 23 (context_layer)
638 BUILD_TUPLE 1
>> 640 STORE_FAST 25 (outputs)
369 642 LOAD_FAST 0 (self)
644 LOAD_ATTR 6 (is_decoder)
646 EXTENDED_ARG 2
648 POP_JUMP_IF_FALSE 660
370 650 LOAD_FAST 25 (outputs)
652 LOAD_FAST 6 (past_key_value)
654 BUILD_TUPLE 1
656 BINARY_ADD
658 STORE_FAST 25 (outputs)
371 >> 660 LOAD_FAST 25 (outputs)
662 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR query [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST mixed_query_layer [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST encoder_hidden_states []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP is not [ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST is_cross_attention [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST is_cross_attention []
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 52 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST is_cross_attention []
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 94 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST past_key_value []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP is not [ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 180 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR transpose_for_scores [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable())]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR key [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable()), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable()), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable()), NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable()), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object transpose_for_scores at 0x7f6a10140710, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 274>
275 0 LOAD_FAST 1 (x)
2 LOAD_METHOD 0 (size)
4 CALL_METHOD 0
6 LOAD_CONST 0 (None)
8 LOAD_CONST 1 (-1)
10 BUILD_SLICE 2
12 BINARY_SUBSCR
14 LOAD_FAST 0 (self)
16 LOAD_ATTR 1 (num_attention_heads)
18 LOAD_FAST 0 (self)
20 LOAD_ATTR 2 (attention_head_size)
22 BUILD_TUPLE 2
24 BINARY_ADD
26 STORE_FAST 2 (new_x_shape)
276 28 LOAD_FAST 1 (x)
30 LOAD_METHOD 3 (view)
32 LOAD_FAST 2 (new_x_shape)
34 CALL_METHOD 1
36 STORE_FAST 1 (x)
277 38 LOAD_FAST 1 (x)
40 LOAD_METHOD 4 (permute)
42 LOAD_CONST 2 (0)
44 LOAD_CONST 3 (2)
46 LOAD_CONST 4 (1)
48 LOAD_CONST 5 (3)
50 CALL_METHOD 4
52 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST x []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR size [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 0 [GetAttrVariable(TensorVariable(), size)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [SizeVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST -1 [SizeVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_SLICE 2 [SizeVariable(), ConstantVariable(NoneType), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_SUBSCR None [SizeVariable(), SliceVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [SizeVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR num_attention_heads [SizeVariable(), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [SizeVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR attention_head_size [SizeVariable(), ConstantVariable(int), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_TUPLE 2 [SizeVariable(), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_ADD None [SizeVariable(), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST new_x_shape [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST x []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR view [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST new_x_shape [GetAttrVariable(TensorVariable(), view)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [GetAttrVariable(TensorVariable(), view), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST x [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST x []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR permute [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 0 [GetAttrVariable(TensorVariable(), permute)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 2 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 1 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 3 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 4 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object transpose_for_scores at 0x7f6a10140710, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 274>
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST key_layer [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR transpose_for_scores [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable())]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR value [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable()), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable()), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable()), NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable()), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object transpose_for_scores at 0x7f6a10140710, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 274>
275 0 LOAD_FAST 1 (x)
2 LOAD_METHOD 0 (size)
4 CALL_METHOD 0
6 LOAD_CONST 0 (None)
8 LOAD_CONST 1 (-1)
10 BUILD_SLICE 2
12 BINARY_SUBSCR
14 LOAD_FAST 0 (self)
16 LOAD_ATTR 1 (num_attention_heads)
18 LOAD_FAST 0 (self)
20 LOAD_ATTR 2 (attention_head_size)
22 BUILD_TUPLE 2
24 BINARY_ADD
26 STORE_FAST 2 (new_x_shape)
276 28 LOAD_FAST 1 (x)
30 LOAD_METHOD 3 (view)
32 LOAD_FAST 2 (new_x_shape)
34 CALL_METHOD 1
36 STORE_FAST 1 (x)
277 38 LOAD_FAST 1 (x)
40 LOAD_METHOD 4 (permute)
42 LOAD_CONST 2 (0)
44 LOAD_CONST 3 (2)
46 LOAD_CONST 4 (1)
48 LOAD_CONST 5 (3)
50 CALL_METHOD 4
52 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST x []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR size [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 0 [GetAttrVariable(TensorVariable(), size)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [SizeVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST -1 [SizeVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_SLICE 2 [SizeVariable(), ConstantVariable(NoneType), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_SUBSCR None [SizeVariable(), SliceVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [SizeVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR num_attention_heads [SizeVariable(), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [SizeVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR attention_head_size [SizeVariable(), ConstantVariable(int), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_TUPLE 2 [SizeVariable(), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_ADD None [SizeVariable(), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST new_x_shape [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST x []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR view [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST new_x_shape [GetAttrVariable(TensorVariable(), view)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [GetAttrVariable(TensorVariable(), view), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST x [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST x []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR permute [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 0 [GetAttrVariable(TensorVariable(), permute)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 2 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 1 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 3 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 4 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object transpose_for_scores at 0x7f6a10140710, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 274>
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST value_layer [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR transpose_for_scores [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST mixed_query_layer [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable())]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable()), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object transpose_for_scores at 0x7f6a10140710, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 274>
275 0 LOAD_FAST 1 (x)
2 LOAD_METHOD 0 (size)
4 CALL_METHOD 0
6 LOAD_CONST 0 (None)
8 LOAD_CONST 1 (-1)
10 BUILD_SLICE 2
12 BINARY_SUBSCR
14 LOAD_FAST 0 (self)
16 LOAD_ATTR 1 (num_attention_heads)
18 LOAD_FAST 0 (self)
20 LOAD_ATTR 2 (attention_head_size)
22 BUILD_TUPLE 2
24 BINARY_ADD
26 STORE_FAST 2 (new_x_shape)
276 28 LOAD_FAST 1 (x)
30 LOAD_METHOD 3 (view)
32 LOAD_FAST 2 (new_x_shape)
34 CALL_METHOD 1
36 STORE_FAST 1 (x)
277 38 LOAD_FAST 1 (x)
40 LOAD_METHOD 4 (permute)
42 LOAD_CONST 2 (0)
44 LOAD_CONST 3 (2)
46 LOAD_CONST 4 (1)
48 LOAD_CONST 5 (3)
50 CALL_METHOD 4
52 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST x []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR size [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 0 [GetAttrVariable(TensorVariable(), size)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [SizeVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST -1 [SizeVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_SLICE 2 [SizeVariable(), ConstantVariable(NoneType), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_SUBSCR None [SizeVariable(), SliceVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [SizeVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR num_attention_heads [SizeVariable(), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [SizeVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR attention_head_size [SizeVariable(), ConstantVariable(int), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_TUPLE 2 [SizeVariable(), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_ADD None [SizeVariable(), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST new_x_shape [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST x []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR view [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST new_x_shape [GetAttrVariable(TensorVariable(), view)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [GetAttrVariable(TensorVariable(), view), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST x [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST x []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR permute [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 0 [GetAttrVariable(TensorVariable(), permute)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 2 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 1 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 3 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 4 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object transpose_for_scores at 0x7f6a10140710, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 274>
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST query_layer [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR is_decoder [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 236 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_GLOBAL torch []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR matmul [TorchVariable(<module 'torch' from '/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/torch/__init__.py'>)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST query_layer [TorchVariable(<built-in method matmul of type object at 0x7f6bb096cb40>)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST key_layer [TorchVariable(<built-in method matmul of type object at 0x7f6bb096cb40>), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR transpose [TorchVariable(<built-in method matmul of type object at 0x7f6bb096cb40>), TensorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST -1 [TorchVariable(<built-in method matmul of type object at 0x7f6bb096cb40>), TensorVariable(), GetAttrVariable(TensorVariable(), transpose)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST -2 [TorchVariable(<built-in method matmul of type object at 0x7f6bb096cb40>), TensorVariable(), GetAttrVariable(TensorVariable(), transpose), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 2 [TorchVariable(<built-in method matmul of type object at 0x7f6bb096cb40>), TensorVariable(), GetAttrVariable(TensorVariable(), transpose), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 2 [TorchVariable(<built-in method matmul of type object at 0x7f6bb096cb40>), TensorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST attention_scores [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR position_embedding_type [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST relative_key [ConstantVariable(str)]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP == [ConstantVariable(str), ConstantVariable(str)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_TRUE 280 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR position_embedding_type [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST relative_key_query [ConstantVariable(str)]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP == [ConstantVariable(str), ConstantVariable(str)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 478 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_scores []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_GLOBAL math [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR sqrt [TensorVariable(), TorchVariable(<module 'math' from '/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/lib-dynload/math.cpython-38-x86_64-linux-gnu.so'>)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [TensorVariable(), TorchVariable(<built-in function sqrt>)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR attention_head_size [TensorVariable(), TorchVariable(<built-in function sqrt>), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [TensorVariable(), TorchVariable(<built-in function sqrt>), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_TRUE_DIVIDE None [TensorVariable(), ConstantVariable(float)]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST attention_scores [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_mask []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP is not [TensorVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 512 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_scores []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_mask [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_ADD None [TensorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST attention_scores [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_GLOBAL nn []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR functional [TorchVariable(<module 'torch.nn' from '/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/torch/nn/__init__.py'>)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR softmax [TorchVariable(<module 'torch.nn.functional' from '/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/torch/nn/functional.py'>)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_scores [TorchVariable(<function softmax at 0x7f6a1d0fb5e0>)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST -1 [TorchVariable(<function softmax at 0x7f6a1d0fb5e0>), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST ('dim',) [TorchVariable(<function softmax at 0x7f6a1d0fb5e0>), TensorVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION_KW 2 [TorchVariable(<function softmax at 0x7f6a1d0fb5e0>), TensorVariable(), ConstantVariable(int), ConstantVariable(tuple)]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST attention_probs [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR dropout [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_probs [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST attention_probs [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST head_mask []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP is not [ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 556 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_GLOBAL torch []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR matmul [TorchVariable(<module 'torch' from '/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/torch/__init__.py'>)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_probs [TorchVariable(<built-in method matmul of type object at 0x7f6bb096cb40>)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST value_layer [TorchVariable(<built-in method matmul of type object at 0x7f6bb096cb40>), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 2 [TorchVariable(<built-in method matmul of type object at 0x7f6bb096cb40>), TensorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST context_layer [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST context_layer []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR permute [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 0 [GetAttrVariable(TensorVariable(), permute)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 2 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 1 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 3 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 4 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR contiguous [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 0 [GetAttrVariable(TensorVariable(), contiguous)]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST context_layer [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST context_layer []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR size [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 0 [GetAttrVariable(TensorVariable(), size)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [SizeVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST -2 [SizeVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_SLICE 2 [SizeVariable(), ConstantVariable(NoneType), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_SUBSCR None [SizeVariable(), SliceVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [SizeVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR all_head_size [SizeVariable(), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_TUPLE 1 [SizeVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_ADD None [SizeVariable(), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST new_context_layer_shape [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST context_layer []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR view [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST new_context_layer_shape [GetAttrVariable(TensorVariable(), view)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [GetAttrVariable(TensorVariable(), view), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST context_layer [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST output_attentions []
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 636 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST context_layer []
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_TUPLE 1 [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST outputs [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR is_decoder [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 660 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST outputs []
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object forward at 0x7f6a101407c0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 279>
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST self_outputs [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR output [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self_outputs [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 0 [NNModuleVariable(), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_SUBSCR None [NNModuleVariable(), TupleVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 2 [NNModuleVariable(), TensorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object forward at 0x7f6a101409d0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 381>
382 0 LOAD_FAST 0 (self)
2 LOAD_METHOD 0 (dense)
4 LOAD_FAST 1 (hidden_states)
6 CALL_METHOD 1
8 STORE_FAST 1 (hidden_states)
383 10 LOAD_FAST 0 (self)
12 LOAD_METHOD 1 (dropout)
14 LOAD_FAST 1 (hidden_states)
16 CALL_METHOD 1
18 STORE_FAST 1 (hidden_states)
384 20 LOAD_FAST 0 (self)
22 LOAD_METHOD 2 (LayerNorm)
24 LOAD_FAST 1 (hidden_states)
26 LOAD_FAST 2 (input_tensor)
28 BINARY_ADD
30 CALL_METHOD 1
32 STORE_FAST 1 (hidden_states)
385 34 LOAD_FAST 1 (hidden_states)
36 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR dense [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST hidden_states [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR dropout [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST hidden_states [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR LayerNorm [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST input_tensor [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_ADD None [NNModuleVariable(), TensorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST hidden_states [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states []
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object forward at 0x7f6a101409d0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 381>
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST attention_output [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_output []
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_TUPLE 1 [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self_outputs [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 1 [TupleVariable(), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [TupleVariable(), TupleVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_SLICE 2 [TupleVariable(), TupleVariable(), ConstantVariable(int), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_SUBSCR None [TupleVariable(), TupleVariable(), SliceVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_ADD None [TupleVariable(), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST outputs [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST outputs []
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object forward at 0x7f6a10140c90, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 413>
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST self_attention_outputs [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self_attention_outputs []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 0 [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_SUBSCR None [TupleVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST attention_output [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR is_decoder [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 80 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self_attention_outputs []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 1 [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [TupleVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_SLICE 2 [TupleVariable(), ConstantVariable(int), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_SUBSCR None [TupleVariable(), SliceVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST outputs [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None []
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST cross_attn_present_key_value [ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR is_decoder [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 222 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_GLOBAL apply_chunking_to_forward []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [UserFunctionVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR feed_forward_chunk [UserFunctionVariable(), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [UserFunctionVariable(), UserMethodVariable(<function BertLayer.feed_forward_chunk at 0x7f69ff5474c0>, NNModuleVariable())]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR chunk_size_feed_forward [UserFunctionVariable(), UserMethodVariable(<function BertLayer.feed_forward_chunk at 0x7f69ff5474c0>, NNModuleVariable()), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [UserFunctionVariable(), UserMethodVariable(<function BertLayer.feed_forward_chunk at 0x7f69ff5474c0>, NNModuleVariable()), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR seq_len_dim [UserFunctionVariable(), UserMethodVariable(<function BertLayer.feed_forward_chunk at 0x7f69ff5474c0>, NNModuleVariable()), ConstantVariable(int), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_output [UserFunctionVariable(), UserMethodVariable(<function BertLayer.feed_forward_chunk at 0x7f69ff5474c0>, NNModuleVariable()), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 4 [UserFunctionVariable(), UserMethodVariable(<function BertLayer.feed_forward_chunk at 0x7f69ff5474c0>, NNModuleVariable()), ConstantVariable(int), ConstantVariable(int), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object apply_chunking_to_forward at 0x7f6a10006030, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/pytorch_utils.py", line 169>
207 0 LOAD_GLOBAL 0 (len)
2 LOAD_FAST 3 (input_tensors)
4 CALL_FUNCTION 1
6 LOAD_CONST 1 (0)
8 COMPARE_OP 4 (>)
10 POP_JUMP_IF_TRUE 26
12 LOAD_GLOBAL 1 (AssertionError)
14 LOAD_FAST 3 (input_tensors)
16 FORMAT_VALUE 0
18 LOAD_CONST 2 (' has to be a tuple/list of tensors')
20 BUILD_STRING 2
22 CALL_FUNCTION 1
24 RAISE_VARARGS 1
210 >> 26 LOAD_GLOBAL 0 (len)
28 LOAD_GLOBAL 2 (inspect)
30 LOAD_METHOD 3 (signature)
32 LOAD_DEREF 1 (forward_fn)
34 CALL_METHOD 1
36 LOAD_ATTR 4 (parameters)
38 CALL_FUNCTION 1
40 STORE_FAST 4 (num_args_in_forward_chunk_fn)
211 42 LOAD_FAST 4 (num_args_in_forward_chunk_fn)
44 LOAD_GLOBAL 0 (len)
46 LOAD_FAST 3 (input_tensors)
48 CALL_FUNCTION 1
50 COMPARE_OP 3 (!=)
52 POP_JUMP_IF_FALSE 80
212 54 LOAD_GLOBAL 5 (ValueError)
213 56 LOAD_CONST 3 ('forward_chunk_fn expects ')
58 LOAD_FAST 4 (num_args_in_forward_chunk_fn)
60 FORMAT_VALUE 0
62 LOAD_CONST 4 (' arguments, but only ')
64 LOAD_GLOBAL 0 (len)
66 LOAD_FAST 3 (input_tensors)
68 CALL_FUNCTION 1
70 FORMAT_VALUE 0
72 LOAD_CONST 5 (' input tensors are given')
74 BUILD_STRING 5
212 76 CALL_FUNCTION 1
78 RAISE_VARARGS 1
217 >> 80 LOAD_FAST 1 (chunk_size)
82 LOAD_CONST 1 (0)
84 COMPARE_OP 4 (>)
86 EXTENDED_ARG 1
88 POP_JUMP_IF_FALSE 288
218 90 LOAD_FAST 3 (input_tensors)
92 LOAD_CONST 1 (0)
94 BINARY_SUBSCR
96 LOAD_ATTR 6 (shape)
98 LOAD_DEREF 0 (chunk_dim)
100 BINARY_SUBSCR
102 STORE_FAST 5 (tensor_shape)
219 104 LOAD_FAST 3 (input_tensors)
106 GET_ITER
>> 108 FOR_ITER 44 (to 154)
110 STORE_FAST 6 (input_tensor)
220 112 LOAD_FAST 6 (input_tensor)
114 LOAD_ATTR 6 (shape)
116 LOAD_DEREF 0 (chunk_dim)
118 BINARY_SUBSCR
120 LOAD_FAST 5 (tensor_shape)
122 COMPARE_OP 3 (!=)
124 POP_JUMP_IF_FALSE 108
221 126 LOAD_GLOBAL 5 (ValueError)
222 128 LOAD_CONST 6 ('All input tenors have to be of the same shape: ')
130 LOAD_FAST 5 (tensor_shape)
132 FORMAT_VALUE 0
134 LOAD_CONST 7 (', found shape ')
136 LOAD_FAST 6 (input_tensor)
138 LOAD_ATTR 6 (shape)
140 LOAD_DEREF 0 (chunk_dim)
142 BINARY_SUBSCR
144 FORMAT_VALUE 0
146 BUILD_STRING 4
221 148 CALL_FUNCTION 1
150 RAISE_VARARGS 1
152 JUMP_ABSOLUTE 108
226 >> 154 LOAD_FAST 3 (input_tensors)
156 LOAD_CONST 1 (0)
158 BINARY_SUBSCR
160 LOAD_ATTR 6 (shape)
162 LOAD_DEREF 0 (chunk_dim)
164 BINARY_SUBSCR
166 LOAD_FAST 1 (chunk_size)
168 BINARY_MODULO
170 LOAD_CONST 1 (0)
172 COMPARE_OP 3 (!=)
174 POP_JUMP_IF_FALSE 206
227 176 LOAD_GLOBAL 5 (ValueError)
228 178 LOAD_CONST 8 ('The dimension to be chunked ')
180 LOAD_FAST 3 (input_tensors)
182 LOAD_CONST 1 (0)
184 BINARY_SUBSCR
186 LOAD_ATTR 6 (shape)
188 LOAD_DEREF 0 (chunk_dim)
190 BINARY_SUBSCR
192 FORMAT_VALUE 0
194 LOAD_CONST 9 (' has to be a multiple of the chunk size ')
196 LOAD_FAST 1 (chunk_size)
198 FORMAT_VALUE 0
200 BUILD_STRING 4
227 202 CALL_FUNCTION 1
204 RAISE_VARARGS 1
232 >> 206 LOAD_FAST 3 (input_tensors)
208 LOAD_CONST 1 (0)
210 BINARY_SUBSCR
212 LOAD_ATTR 6 (shape)
214 LOAD_DEREF 0 (chunk_dim)
216 BINARY_SUBSCR
218 LOAD_FAST 1 (chunk_size)
220 BINARY_FLOOR_DIVIDE
222 STORE_DEREF 2 (num_chunks)
235 224 LOAD_GLOBAL 7 (tuple)
226 LOAD_CLOSURE 0 (chunk_dim)
228 LOAD_CLOSURE 2 (num_chunks)
230 BUILD_TUPLE 2
232 LOAD_CONST 10 (<code object <genexpr> at 0x7f6a10083ea0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/pytorch_utils.py", line 235>)
234 LOAD_CONST 11 ('apply_chunking_to_forward.<locals>.<genexpr>')
236 MAKE_FUNCTION 8 (closure)
238 LOAD_FAST 3 (input_tensors)
240 GET_ITER
242 CALL_FUNCTION 1
244 CALL_FUNCTION 1
246 STORE_FAST 7 (input_tensors_chunks)
237 248 LOAD_GLOBAL 7 (tuple)
250 LOAD_CLOSURE 1 (forward_fn)
252 BUILD_TUPLE 1
254 LOAD_CONST 12 (<code object <genexpr> at 0x7f6a10083f50, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/pytorch_utils.py", line 237>)
256 LOAD_CONST 11 ('apply_chunking_to_forward.<locals>.<genexpr>')
258 MAKE_FUNCTION 8 (closure)
260 LOAD_GLOBAL 8 (zip)
262 LOAD_FAST 7 (input_tensors_chunks)
264 CALL_FUNCTION_EX 0
266 GET_ITER
268 CALL_FUNCTION 1
270 CALL_FUNCTION 1
272 STORE_FAST 8 (output_chunks)
239 274 LOAD_GLOBAL 9 (torch)
276 LOAD_ATTR 10 (cat)
278 LOAD_FAST 8 (output_chunks)
280 LOAD_DEREF 0 (chunk_dim)
282 LOAD_CONST 13 (('dim',))
284 CALL_FUNCTION_KW 2
286 RETURN_VALUE
241 >> 288 LOAD_DEREF 1 (forward_fn)
290 LOAD_FAST 3 (input_tensors)
292 CALL_FUNCTION_EX 0
294 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_GLOBAL len []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST input_tensors [BuiltinVariable(len)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [BuiltinVariable(len), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 0 [ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP > [ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_TRUE 26 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_GLOBAL len []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_GLOBAL inspect [BuiltinVariable(len)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR signature [BuiltinVariable(len), PythonModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_DEREF forward_fn [BuiltinVariable(len), LambdaVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [BuiltinVariable(len), LambdaVariable(), UserMethodVariable(<function BertLayer.feed_forward_chunk at 0x7f69ff5474c0>, NNModuleVariable())]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR parameters [BuiltinVariable(len), InspectSignatureVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [BuiltinVariable(len), GetAttrVariable(InspectSignatureVariable(), parameters)]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST num_args_in_forward_chunk_fn [ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST num_args_in_forward_chunk_fn []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_GLOBAL len [ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST input_tensors [ConstantVariable(int), BuiltinVariable(len)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [ConstantVariable(int), BuiltinVariable(len), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP != [ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 80 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST chunk_size []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 0 [ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP > [ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 288 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_DEREF forward_fn []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST input_tensors [UserMethodVariable(<function BertLayer.feed_forward_chunk at 0x7f69ff5474c0>, NNModuleVariable())]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION_EX 0 [UserMethodVariable(<function BertLayer.feed_forward_chunk at 0x7f69ff5474c0>, NNModuleVariable()), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object feed_forward_chunk at 0x7f6a100c83a0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 546>
547 0 LOAD_FAST 0 (self)
2 LOAD_METHOD 0 (intermediate)
4 LOAD_FAST 1 (attention_output)
6 CALL_METHOD 1
8 STORE_FAST 2 (intermediate_output)
548 10 LOAD_FAST 0 (self)
12 LOAD_METHOD 1 (output)
14 LOAD_FAST 2 (intermediate_output)
16 LOAD_FAST 1 (attention_output)
18 CALL_METHOD 2
20 STORE_FAST 3 (layer_output)
549 22 LOAD_FAST 3 (layer_output)
24 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR intermediate [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_output [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object forward at 0x7f6a10140ea0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 446>
447 0 LOAD_FAST 0 (self)
2 LOAD_METHOD 0 (dense)
4 LOAD_FAST 1 (hidden_states)
6 CALL_METHOD 1
8 STORE_FAST 1 (hidden_states)
448 10 LOAD_FAST 0 (self)
12 LOAD_METHOD 1 (intermediate_act_fn)
14 LOAD_FAST 1 (hidden_states)
16 CALL_METHOD 1
18 STORE_FAST 1 (hidden_states)
449 20 LOAD_FAST 1 (hidden_states)
22 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR dense [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST hidden_states [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR intermediate_act_fn [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object forward at 0x7f6a100e12f0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/activations.py", line 55>
56 0 LOAD_FAST 0 (self)
2 LOAD_METHOD 0 (act)
4 LOAD_FAST 1 (input)
6 CALL_METHOD 1
8 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR act [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST input [TorchVariable(<built-in function gelu>)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [TorchVariable(<built-in function gelu>), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object forward at 0x7f6a100e12f0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/activations.py", line 55>
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST hidden_states [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states []
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object forward at 0x7f6a10140ea0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 446>
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST intermediate_output [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR output [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST intermediate_output [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_output [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 2 [NNModuleVariable(), TensorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object forward at 0x7f6a100c80e0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 459>
460 0 LOAD_FAST 0 (self)
2 LOAD_METHOD 0 (dense)
4 LOAD_FAST 1 (hidden_states)
6 CALL_METHOD 1
8 STORE_FAST 1 (hidden_states)
461 10 LOAD_FAST 0 (self)
12 LOAD_METHOD 1 (dropout)
14 LOAD_FAST 1 (hidden_states)
16 CALL_METHOD 1
18 STORE_FAST 1 (hidden_states)
462 20 LOAD_FAST 0 (self)
22 LOAD_METHOD 2 (LayerNorm)
24 LOAD_FAST 1 (hidden_states)
26 LOAD_FAST 2 (input_tensor)
28 BINARY_ADD
30 CALL_METHOD 1
32 STORE_FAST 1 (hidden_states)
463 34 LOAD_FAST 1 (hidden_states)
36 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR dense [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST hidden_states [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR dropout [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST hidden_states [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR LayerNorm [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST input_tensor [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_ADD None [NNModuleVariable(), TensorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST hidden_states [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states []
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object forward at 0x7f6a100c80e0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 459>
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST layer_output [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST layer_output []
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object feed_forward_chunk at 0x7f6a100c83a0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 546>
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object apply_chunking_to_forward at 0x7f6a10006030, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/pytorch_utils.py", line 169>
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST layer_output [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST layer_output []
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_TUPLE 1 [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST outputs [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_ADD None [TupleVariable(), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST outputs [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR is_decoder [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 270 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST outputs []
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object forward at 0x7f6a100c82f0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 481>
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST layer_outputs [ListIteratorVariable(), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST layer_outputs [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 0 [ListIteratorVariable(), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_SUBSCR None [ListIteratorVariable(), TupleVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST hidden_states [ListIteratorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST use_cache [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 248 [ListIteratorVariable(), ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_DEREF output_attentions [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 66 [ListIteratorVariable(), ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE FOR_ITER 290 [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE UNPACK_SEQUENCE 2 [ListIteratorVariable(), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST i [ListIteratorVariable(), NNModuleVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST layer_module [ListIteratorVariable(), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST output_hidden_states [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 88 [ListIteratorVariable(), ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST head_mask [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [ListIteratorVariable(), ListVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP is not [ListIteratorVariable(), ListVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 104 [ListIteratorVariable(), ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST head_mask [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST i [ListIteratorVariable(), ListVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_SUBSCR None [ListIteratorVariable(), ListVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE JUMP_FORWARD 106 [ListIteratorVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST layer_head_mask [ListIteratorVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST past_key_values [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [ListIteratorVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP is not [ListIteratorVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 124 [ListIteratorVariable(), ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_DEREF past_key_value [ListIteratorVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR gradient_checkpointing [ListIteratorVariable(), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 202 [ListIteratorVariable(), ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST layer_module [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [ListIteratorVariable(), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_mask [ListIteratorVariable(), NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST layer_head_mask [ListIteratorVariable(), NNModuleVariable(), TensorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST encoder_hidden_states [ListIteratorVariable(), NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST encoder_attention_mask [ListIteratorVariable(), NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_DEREF past_key_value [ListIteratorVariable(), NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_DEREF output_attentions [ListIteratorVariable(), NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 7 [ListIteratorVariable(), NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object forward at 0x7f6a100c82f0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 481>
492 0 LOAD_FAST 6 (past_key_value)
2 LOAD_CONST 0 (None)
4 COMPARE_OP 9 (is not)
6 POP_JUMP_IF_FALSE 20
8 LOAD_FAST 6 (past_key_value)
10 LOAD_CONST 0 (None)
12 LOAD_CONST 1 (2)
14 BUILD_SLICE 2
16 BINARY_SUBSCR
18 JUMP_FORWARD 2 (to 22)
>> 20 LOAD_CONST 0 (None)
>> 22 STORE_FAST 8 (self_attn_past_key_value)
493 24 LOAD_FAST 0 (self)
26 LOAD_ATTR 0 (attention)
494 28 LOAD_FAST 1 (hidden_states)
495 30 LOAD_FAST 2 (attention_mask)
496 32 LOAD_FAST 3 (head_mask)
497 34 LOAD_FAST 7 (output_attentions)
498 36 LOAD_FAST 8 (self_attn_past_key_value)
493 38 LOAD_CONST 2 (('output_attentions', 'past_key_value'))
40 CALL_FUNCTION_KW 5
42 STORE_FAST 9 (self_attention_outputs)
500 44 LOAD_FAST 9 (self_attention_outputs)
46 LOAD_CONST 3 (0)
48 BINARY_SUBSCR
50 STORE_FAST 10 (attention_output)
503 52 LOAD_FAST 0 (self)
54 LOAD_ATTR 1 (is_decoder)
56 POP_JUMP_IF_FALSE 80
504 58 LOAD_FAST 9 (self_attention_outputs)
60 LOAD_CONST 4 (1)
62 LOAD_CONST 5 (-1)
64 BUILD_SLICE 2
66 BINARY_SUBSCR
68 STORE_FAST 11 (outputs)
505 70 LOAD_FAST 9 (self_attention_outputs)
72 LOAD_CONST 5 (-1)
74 BINARY_SUBSCR
76 STORE_FAST 12 (present_key_value)
78 JUMP_FORWARD 12 (to 92)
507 >> 80 LOAD_FAST 9 (self_attention_outputs)
82 LOAD_CONST 4 (1)
84 LOAD_CONST 0 (None)
86 BUILD_SLICE 2
88 BINARY_SUBSCR
90 STORE_FAST 11 (outputs)
509 >> 92 LOAD_CONST 0 (None)
94 STORE_FAST 13 (cross_attn_present_key_value)
510 96 LOAD_FAST 0 (self)
98 LOAD_ATTR 1 (is_decoder)
100 POP_JUMP_IF_FALSE 222
102 LOAD_FAST 4 (encoder_hidden_states)
104 LOAD_CONST 0 (None)
106 COMPARE_OP 9 (is not)
108 POP_JUMP_IF_FALSE 222
511 110 LOAD_GLOBAL 2 (hasattr)
112 LOAD_FAST 0 (self)
114 LOAD_CONST 6 ('crossattention')
116 CALL_FUNCTION 2
118 POP_JUMP_IF_TRUE 136
512 120 LOAD_GLOBAL 3 (ValueError)
513 122 LOAD_CONST 7 ('If `encoder_hidden_states` are passed, ')
124 LOAD_FAST 0 (self)
126 FORMAT_VALUE 0
128 LOAD_CONST 8 (' has to be instantiated with cross-attention layers by setting `config.add_cross_attention=True`')
130 BUILD_STRING 3
512 132 CALL_FUNCTION 1
134 RAISE_VARARGS 1
518 >> 136 LOAD_FAST 6 (past_key_value)
138 LOAD_CONST 0 (None)
140 COMPARE_OP 9 (is not)
142 POP_JUMP_IF_FALSE 156
144 LOAD_FAST 6 (past_key_value)
146 LOAD_CONST 9 (-2)
148 LOAD_CONST 0 (None)
150 BUILD_SLICE 2
152 BINARY_SUBSCR
154 JUMP_FORWARD 2 (to 158)
>> 156 LOAD_CONST 0 (None)
>> 158 STORE_FAST 14 (cross_attn_past_key_value)
519 160 LOAD_FAST 0 (self)
162 LOAD_METHOD 4 (crossattention)
520 164 LOAD_FAST 10 (attention_output)
521 166 LOAD_FAST 2 (attention_mask)
522 168 LOAD_FAST 3 (head_mask)
523 170 LOAD_FAST 4 (encoder_hidden_states)
524 172 LOAD_FAST 5 (encoder_attention_mask)
525 174 LOAD_FAST 14 (cross_attn_past_key_value)
526 176 LOAD_FAST 7 (output_attentions)
519 178 CALL_METHOD 7
180 STORE_FAST 15 (cross_attention_outputs)
528 182 LOAD_FAST 15 (cross_attention_outputs)
184 LOAD_CONST 3 (0)
186 BINARY_SUBSCR
188 STORE_FAST 10 (attention_output)
529 190 LOAD_FAST 11 (outputs)
192 LOAD_FAST 15 (cross_attention_outputs)
194 LOAD_CONST 4 (1)
196 LOAD_CONST 5 (-1)
198 BUILD_SLICE 2
200 BINARY_SUBSCR
202 BINARY_ADD
204 STORE_FAST 11 (outputs)
532 206 LOAD_FAST 15 (cross_attention_outputs)
208 LOAD_CONST 5 (-1)
210 BINARY_SUBSCR
212 STORE_FAST 13 (cross_attn_present_key_value)
533 214 LOAD_FAST 12 (present_key_value)
216 LOAD_FAST 13 (cross_attn_present_key_value)
218 BINARY_ADD
220 STORE_FAST 12 (present_key_value)
535 >> 222 LOAD_GLOBAL 5 (apply_chunking_to_forward)
536 224 LOAD_FAST 0 (self)
226 LOAD_ATTR 6 (feed_forward_chunk)
228 LOAD_FAST 0 (self)
230 LOAD_ATTR 7 (chunk_size_feed_forward)
232 LOAD_FAST 0 (self)
234 LOAD_ATTR 8 (seq_len_dim)
236 LOAD_FAST 10 (attention_output)
535 238 CALL_FUNCTION 4
240 STORE_FAST 16 (layer_output)
538 242 LOAD_FAST 16 (layer_output)
244 BUILD_TUPLE 1
246 LOAD_FAST 11 (outputs)
248 BINARY_ADD
250 STORE_FAST 11 (outputs)
541 252 LOAD_FAST 0 (self)
254 LOAD_ATTR 1 (is_decoder)
256 EXTENDED_ARG 1
258 POP_JUMP_IF_FALSE 270
542 260 LOAD_FAST 11 (outputs)
262 LOAD_FAST 12 (present_key_value)
264 BUILD_TUPLE 1
266 BINARY_ADD
268 STORE_FAST 11 (outputs)
544 >> 270 LOAD_FAST 11 (outputs)
272 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST past_key_value []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP is not [ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 20 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None []
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST self_attn_past_key_value [ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR attention [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_mask [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST head_mask [NNModuleVariable(), TensorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST output_attentions [NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self_attn_past_key_value [NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST ('output_attentions', 'past_key_value') [NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(bool), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION_KW 5 [NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(bool), ConstantVariable(NoneType), ConstantVariable(tuple)]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object forward at 0x7f6a10140c90, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 413>
423 0 LOAD_FAST 0 (self)
2 LOAD_METHOD 0 (self)
424 4 LOAD_FAST 1 (hidden_states)
425 6 LOAD_FAST 2 (attention_mask)
426 8 LOAD_FAST 3 (head_mask)
427 10 LOAD_FAST 4 (encoder_hidden_states)
428 12 LOAD_FAST 5 (encoder_attention_mask)
429 14 LOAD_FAST 6 (past_key_value)
430 16 LOAD_FAST 7 (output_attentions)
423 18 CALL_METHOD 7
20 STORE_FAST 8 (self_outputs)
432 22 LOAD_FAST 0 (self)
24 LOAD_METHOD 1 (output)
26 LOAD_FAST 8 (self_outputs)
28 LOAD_CONST 1 (0)
30 BINARY_SUBSCR
32 LOAD_FAST 1 (hidden_states)
34 CALL_METHOD 2
36 STORE_FAST 9 (attention_output)
433 38 LOAD_FAST 9 (attention_output)
40 BUILD_TUPLE 1
42 LOAD_FAST 8 (self_outputs)
44 LOAD_CONST 2 (1)
46 LOAD_CONST 0 (None)
48 BUILD_SLICE 2
50 BINARY_SUBSCR
52 BINARY_ADD
54 STORE_FAST 10 (outputs)
434 56 LOAD_FAST 10 (outputs)
58 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR self [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_mask [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST head_mask [NNModuleVariable(), TensorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST encoder_hidden_states [NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST encoder_attention_mask [NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST past_key_value [NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST output_attentions [NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 7 [NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object forward at 0x7f6a101407c0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 279>
289 0 LOAD_FAST 0 (self)
2 LOAD_METHOD 0 (query)
4 LOAD_FAST 1 (hidden_states)
6 CALL_METHOD 1
8 STORE_FAST 8 (mixed_query_layer)
294 10 LOAD_FAST 4 (encoder_hidden_states)
12 LOAD_CONST 0 (None)
14 COMPARE_OP 9 (is not)
16 STORE_FAST 9 (is_cross_attention)
296 18 LOAD_FAST 9 (is_cross_attention)
20 POP_JUMP_IF_FALSE 52
22 LOAD_FAST 6 (past_key_value)
24 LOAD_CONST 0 (None)
26 COMPARE_OP 9 (is not)
28 POP_JUMP_IF_FALSE 52
298 30 LOAD_FAST 6 (past_key_value)
32 LOAD_CONST 1 (0)
34 BINARY_SUBSCR
36 STORE_FAST 10 (key_layer)
299 38 LOAD_FAST 6 (past_key_value)
40 LOAD_CONST 2 (1)
42 BINARY_SUBSCR
44 STORE_FAST 11 (value_layer)
300 46 LOAD_FAST 5 (encoder_attention_mask)
48 STORE_FAST 2 (attention_mask)
50 JUMP_FORWARD 160 (to 212)
301 >> 52 LOAD_FAST 9 (is_cross_attention)
54 POP_JUMP_IF_FALSE 94
302 56 LOAD_FAST 0 (self)
58 LOAD_METHOD 1 (transpose_for_scores)
60 LOAD_FAST 0 (self)
62 LOAD_METHOD 2 (key)
64 LOAD_FAST 4 (encoder_hidden_states)
66 CALL_METHOD 1
68 CALL_METHOD 1
70 STORE_FAST 10 (key_layer)
303 72 LOAD_FAST 0 (self)
74 LOAD_METHOD 1 (transpose_for_scores)
76 LOAD_FAST 0 (self)
78 LOAD_METHOD 3 (value)
80 LOAD_FAST 4 (encoder_hidden_states)
82 CALL_METHOD 1
84 CALL_METHOD 1
86 STORE_FAST 11 (value_layer)
304 88 LOAD_FAST 5 (encoder_attention_mask)
90 STORE_FAST 2 (attention_mask)
92 JUMP_FORWARD 118 (to 212)
305 >> 94 LOAD_FAST 6 (past_key_value)
96 LOAD_CONST 0 (None)
98 COMPARE_OP 9 (is not)
100 POP_JUMP_IF_FALSE 180
306 102 LOAD_FAST 0 (self)
104 LOAD_METHOD 1 (transpose_for_scores)
106 LOAD_FAST 0 (self)
108 LOAD_METHOD 2 (key)
110 LOAD_FAST 1 (hidden_states)
112 CALL_METHOD 1
114 CALL_METHOD 1
116 STORE_FAST 10 (key_layer)
307 118 LOAD_FAST 0 (self)
120 LOAD_METHOD 1 (transpose_for_scores)
122 LOAD_FAST 0 (self)
124 LOAD_METHOD 3 (value)
126 LOAD_FAST 1 (hidden_states)
128 CALL_METHOD 1
130 CALL_METHOD 1
132 STORE_FAST 11 (value_layer)
308 134 LOAD_GLOBAL 4 (torch)
136 LOAD_ATTR 5 (cat)
138 LOAD_FAST 6 (past_key_value)
140 LOAD_CONST 1 (0)
142 BINARY_SUBSCR
144 LOAD_FAST 10 (key_layer)
146 BUILD_LIST 2
148 LOAD_CONST 3 (2)
150 LOAD_CONST 4 (('dim',))
152 CALL_FUNCTION_KW 2
154 STORE_FAST 10 (key_layer)
309 156 LOAD_GLOBAL 4 (torch)
158 LOAD_ATTR 5 (cat)
160 LOAD_FAST 6 (past_key_value)
162 LOAD_CONST 2 (1)
164 BINARY_SUBSCR
166 LOAD_FAST 11 (value_layer)
168 BUILD_LIST 2
170 LOAD_CONST 3 (2)
172 LOAD_CONST 4 (('dim',))
174 CALL_FUNCTION_KW 2
176 STORE_FAST 11 (value_layer)
178 JUMP_FORWARD 32 (to 212)
311 >> 180 LOAD_FAST 0 (self)
182 LOAD_METHOD 1 (transpose_for_scores)
184 LOAD_FAST 0 (self)
186 LOAD_METHOD 2 (key)
188 LOAD_FAST 1 (hidden_states)
190 CALL_METHOD 1
192 CALL_METHOD 1
194 STORE_FAST 10 (key_layer)
312 196 LOAD_FAST 0 (self)
198 LOAD_METHOD 1 (transpose_for_scores)
200 LOAD_FAST 0 (self)
202 LOAD_METHOD 3 (value)
204 LOAD_FAST 1 (hidden_states)
206 CALL_METHOD 1
208 CALL_METHOD 1
210 STORE_FAST 11 (value_layer)
314 >> 212 LOAD_FAST 0 (self)
214 LOAD_METHOD 1 (transpose_for_scores)
216 LOAD_FAST 8 (mixed_query_layer)
218 CALL_METHOD 1
220 STORE_FAST 12 (query_layer)
316 222 LOAD_FAST 0 (self)
224 LOAD_ATTR 6 (is_decoder)
226 POP_JUMP_IF_FALSE 236
324 228 LOAD_FAST 10 (key_layer)
230 LOAD_FAST 11 (value_layer)
232 BUILD_TUPLE 2
234 STORE_FAST 6 (past_key_value)
327 >> 236 LOAD_GLOBAL 4 (torch)
238 LOAD_METHOD 7 (matmul)
240 LOAD_FAST 12 (query_layer)
242 LOAD_FAST 10 (key_layer)
244 LOAD_METHOD 8 (transpose)
246 LOAD_CONST 5 (-1)
248 LOAD_CONST 6 (-2)
250 CALL_METHOD 2
252 CALL_METHOD 2
254 STORE_FAST 13 (attention_scores)
329 256 LOAD_FAST 0 (self)
258 LOAD_ATTR 9 (position_embedding_type)
260 LOAD_CONST 7 ('relative_key')
262 COMPARE_OP 2 (==)
264 EXTENDED_ARG 1
266 POP_JUMP_IF_TRUE 280
268 LOAD_FAST 0 (self)
270 LOAD_ATTR 9 (position_embedding_type)
272 LOAD_CONST 8 ('relative_key_query')
274 COMPARE_OP 2 (==)
276 EXTENDED_ARG 1
278 POP_JUMP_IF_FALSE 478
330 >> 280 LOAD_FAST 1 (hidden_states)
282 LOAD_METHOD 10 (size)
284 CALL_METHOD 0
286 LOAD_CONST 2 (1)
288 BINARY_SUBSCR
290 STORE_FAST 14 (seq_length)
331 292 LOAD_GLOBAL 4 (torch)
294 LOAD_ATTR 11 (arange)
296 LOAD_FAST 14 (seq_length)
298 LOAD_GLOBAL 4 (torch)
300 LOAD_ATTR 12 (long)
302 LOAD_FAST 1 (hidden_states)
304 LOAD_ATTR 13 (device)
306 LOAD_CONST 9 (('dtype', 'device'))
308 CALL_FUNCTION_KW 3
310 LOAD_METHOD 14 (view)
312 LOAD_CONST 5 (-1)
314 LOAD_CONST 2 (1)
316 CALL_METHOD 2
318 STORE_FAST 15 (position_ids_l)
332 320 LOAD_GLOBAL 4 (torch)
322 LOAD_ATTR 11 (arange)
324 LOAD_FAST 14 (seq_length)
326 LOAD_GLOBAL 4 (torch)
328 LOAD_ATTR 12 (long)
330 LOAD_FAST 1 (hidden_states)
332 LOAD_ATTR 13 (device)
334 LOAD_CONST 9 (('dtype', 'device'))
336 CALL_FUNCTION_KW 3
338 LOAD_METHOD 14 (view)
340 LOAD_CONST 2 (1)
342 LOAD_CONST 5 (-1)
344 CALL_METHOD 2
346 STORE_FAST 16 (position_ids_r)
333 348 LOAD_FAST 15 (position_ids_l)
350 LOAD_FAST 16 (position_ids_r)
352 BINARY_SUBTRACT
354 STORE_FAST 17 (distance)
334 356 LOAD_FAST 0 (self)
358 LOAD_METHOD 15 (distance_embedding)
360 LOAD_FAST 17 (distance)
362 LOAD_FAST 0 (self)
364 LOAD_ATTR 16 (max_position_embeddings)
366 BINARY_ADD
368 LOAD_CONST 2 (1)
370 BINARY_SUBTRACT
372 CALL_METHOD 1
374 STORE_FAST 18 (positional_embedding)
335 376 LOAD_FAST 18 (positional_embedding)
378 LOAD_ATTR 17 (to)
380 LOAD_FAST 12 (query_layer)
382 LOAD_ATTR 18 (dtype)
384 LOAD_CONST 10 (('dtype',))
386 CALL_FUNCTION_KW 1
388 STORE_FAST 18 (positional_embedding)
337 390 LOAD_FAST 0 (self)
392 LOAD_ATTR 9 (position_embedding_type)
394 LOAD_CONST 7 ('relative_key')
396 COMPARE_OP 2 (==)
398 EXTENDED_ARG 1
400 POP_JUMP_IF_FALSE 426
338 402 LOAD_GLOBAL 4 (torch)
404 LOAD_METHOD 19 (einsum)
406 LOAD_CONST 11 ('bhld,lrd->bhlr')
408 LOAD_FAST 12 (query_layer)
410 LOAD_FAST 18 (positional_embedding)
412 CALL_METHOD 3
414 STORE_FAST 19 (relative_position_scores)
339 416 LOAD_FAST 13 (attention_scores)
418 LOAD_FAST 19 (relative_position_scores)
420 BINARY_ADD
422 STORE_FAST 13 (attention_scores)
424 JUMP_FORWARD 52 (to 478)
340 >> 426 LOAD_FAST 0 (self)
428 LOAD_ATTR 9 (position_embedding_type)
430 LOAD_CONST 8 ('relative_key_query')
432 COMPARE_OP 2 (==)
434 EXTENDED_ARG 1
436 POP_JUMP_IF_FALSE 478
341 438 LOAD_GLOBAL 4 (torch)
440 LOAD_METHOD 19 (einsum)
442 LOAD_CONST 11 ('bhld,lrd->bhlr')
444 LOAD_FAST 12 (query_layer)
446 LOAD_FAST 18 (positional_embedding)
448 CALL_METHOD 3
450 STORE_FAST 20 (relative_position_scores_query)
342 452 LOAD_GLOBAL 4 (torch)
454 LOAD_METHOD 19 (einsum)
456 LOAD_CONST 12 ('bhrd,lrd->bhlr')
458 LOAD_FAST 10 (key_layer)
460 LOAD_FAST 18 (positional_embedding)
462 CALL_METHOD 3
464 STORE_FAST 21 (relative_position_scores_key)
343 466 LOAD_FAST 13 (attention_scores)
468 LOAD_FAST 20 (relative_position_scores_query)
470 BINARY_ADD
472 LOAD_FAST 21 (relative_position_scores_key)
474 BINARY_ADD
476 STORE_FAST 13 (attention_scores)
345 >> 478 LOAD_FAST 13 (attention_scores)
480 LOAD_GLOBAL 20 (math)
482 LOAD_METHOD 21 (sqrt)
484 LOAD_FAST 0 (self)
486 LOAD_ATTR 22 (attention_head_size)
488 CALL_METHOD 1
490 BINARY_TRUE_DIVIDE
492 STORE_FAST 13 (attention_scores)
346 494 LOAD_FAST 2 (attention_mask)
496 LOAD_CONST 0 (None)
498 COMPARE_OP 9 (is not)
500 EXTENDED_ARG 2
502 POP_JUMP_IF_FALSE 512
348 504 LOAD_FAST 13 (attention_scores)
506 LOAD_FAST 2 (attention_mask)
508 BINARY_ADD
510 STORE_FAST 13 (attention_scores)
351 >> 512 LOAD_GLOBAL 23 (nn)
514 LOAD_ATTR 24 (functional)
516 LOAD_ATTR 25 (softmax)
518 LOAD_FAST 13 (attention_scores)
520 LOAD_CONST 5 (-1)
522 LOAD_CONST 4 (('dim',))
524 CALL_FUNCTION_KW 2
526 STORE_FAST 22 (attention_probs)
355 528 LOAD_FAST 0 (self)
530 LOAD_METHOD 26 (dropout)
532 LOAD_FAST 22 (attention_probs)
534 CALL_METHOD 1
536 STORE_FAST 22 (attention_probs)
358 538 LOAD_FAST 3 (head_mask)
540 LOAD_CONST 0 (None)
542 COMPARE_OP 9 (is not)
544 EXTENDED_ARG 2
546 POP_JUMP_IF_FALSE 556
359 548 LOAD_FAST 22 (attention_probs)
550 LOAD_FAST 3 (head_mask)
552 BINARY_MULTIPLY
554 STORE_FAST 22 (attention_probs)
361 >> 556 LOAD_GLOBAL 4 (torch)
558 LOAD_METHOD 7 (matmul)
560 LOAD_FAST 22 (attention_probs)
562 LOAD_FAST 11 (value_layer)
564 CALL_METHOD 2
566 STORE_FAST 23 (context_layer)
363 568 LOAD_FAST 23 (context_layer)
570 LOAD_METHOD 27 (permute)
572 LOAD_CONST 1 (0)
574 LOAD_CONST 3 (2)
576 LOAD_CONST 2 (1)
578 LOAD_CONST 13 (3)
580 CALL_METHOD 4
582 LOAD_METHOD 28 (contiguous)
584 CALL_METHOD 0
586 STORE_FAST 23 (context_layer)
364 588 LOAD_FAST 23 (context_layer)
590 LOAD_METHOD 10 (size)
592 CALL_METHOD 0
594 LOAD_CONST 0 (None)
596 LOAD_CONST 6 (-2)
598 BUILD_SLICE 2
600 BINARY_SUBSCR
602 LOAD_FAST 0 (self)
604 LOAD_ATTR 29 (all_head_size)
606 BUILD_TUPLE 1
608 BINARY_ADD
610 STORE_FAST 24 (new_context_layer_shape)
365 612 LOAD_FAST 23 (context_layer)
614 LOAD_METHOD 14 (view)
616 LOAD_FAST 24 (new_context_layer_shape)
618 CALL_METHOD 1
620 STORE_FAST 23 (context_layer)
367 622 LOAD_FAST 7 (output_attentions)
624 EXTENDED_ARG 2
626 POP_JUMP_IF_FALSE 636
628 LOAD_FAST 23 (context_layer)
630 LOAD_FAST 22 (attention_probs)
632 BUILD_TUPLE 2
634 JUMP_FORWARD 4 (to 640)
>> 636 LOAD_FAST 23 (context_layer)
638 BUILD_TUPLE 1
>> 640 STORE_FAST 25 (outputs)
369 642 LOAD_FAST 0 (self)
644 LOAD_ATTR 6 (is_decoder)
646 EXTENDED_ARG 2
648 POP_JUMP_IF_FALSE 660
370 650 LOAD_FAST 25 (outputs)
652 LOAD_FAST 6 (past_key_value)
654 BUILD_TUPLE 1
656 BINARY_ADD
658 STORE_FAST 25 (outputs)
371 >> 660 LOAD_FAST 25 (outputs)
662 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR query [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST mixed_query_layer [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST encoder_hidden_states []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP is not [ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST is_cross_attention [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST is_cross_attention []
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 52 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST is_cross_attention []
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 94 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST past_key_value []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP is not [ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 180 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR transpose_for_scores [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable())]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR key [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable()), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable()), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable()), NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable()), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object transpose_for_scores at 0x7f6a10140710, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 274>
275 0 LOAD_FAST 1 (x)
2 LOAD_METHOD 0 (size)
4 CALL_METHOD 0
6 LOAD_CONST 0 (None)
8 LOAD_CONST 1 (-1)
10 BUILD_SLICE 2
12 BINARY_SUBSCR
14 LOAD_FAST 0 (self)
16 LOAD_ATTR 1 (num_attention_heads)
18 LOAD_FAST 0 (self)
20 LOAD_ATTR 2 (attention_head_size)
22 BUILD_TUPLE 2
24 BINARY_ADD
26 STORE_FAST 2 (new_x_shape)
276 28 LOAD_FAST 1 (x)
30 LOAD_METHOD 3 (view)
32 LOAD_FAST 2 (new_x_shape)
34 CALL_METHOD 1
36 STORE_FAST 1 (x)
277 38 LOAD_FAST 1 (x)
40 LOAD_METHOD 4 (permute)
42 LOAD_CONST 2 (0)
44 LOAD_CONST 3 (2)
46 LOAD_CONST 4 (1)
48 LOAD_CONST 5 (3)
50 CALL_METHOD 4
52 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST x []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR size [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 0 [GetAttrVariable(TensorVariable(), size)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [SizeVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST -1 [SizeVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_SLICE 2 [SizeVariable(), ConstantVariable(NoneType), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_SUBSCR None [SizeVariable(), SliceVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [SizeVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR num_attention_heads [SizeVariable(), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [SizeVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR attention_head_size [SizeVariable(), ConstantVariable(int), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_TUPLE 2 [SizeVariable(), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_ADD None [SizeVariable(), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST new_x_shape [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST x []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR view [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST new_x_shape [GetAttrVariable(TensorVariable(), view)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [GetAttrVariable(TensorVariable(), view), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST x [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST x []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR permute [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 0 [GetAttrVariable(TensorVariable(), permute)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 2 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 1 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 3 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 4 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object transpose_for_scores at 0x7f6a10140710, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 274>
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST key_layer [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR transpose_for_scores [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable())]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR value [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable()), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable()), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable()), NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable()), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object transpose_for_scores at 0x7f6a10140710, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 274>
275 0 LOAD_FAST 1 (x)
2 LOAD_METHOD 0 (size)
4 CALL_METHOD 0
6 LOAD_CONST 0 (None)
8 LOAD_CONST 1 (-1)
10 BUILD_SLICE 2
12 BINARY_SUBSCR
14 LOAD_FAST 0 (self)
16 LOAD_ATTR 1 (num_attention_heads)
18 LOAD_FAST 0 (self)
20 LOAD_ATTR 2 (attention_head_size)
22 BUILD_TUPLE 2
24 BINARY_ADD
26 STORE_FAST 2 (new_x_shape)
276 28 LOAD_FAST 1 (x)
30 LOAD_METHOD 3 (view)
32 LOAD_FAST 2 (new_x_shape)
34 CALL_METHOD 1
36 STORE_FAST 1 (x)
277 38 LOAD_FAST 1 (x)
40 LOAD_METHOD 4 (permute)
42 LOAD_CONST 2 (0)
44 LOAD_CONST 3 (2)
46 LOAD_CONST 4 (1)
48 LOAD_CONST 5 (3)
50 CALL_METHOD 4
52 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST x []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR size [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 0 [GetAttrVariable(TensorVariable(), size)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [SizeVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST -1 [SizeVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_SLICE 2 [SizeVariable(), ConstantVariable(NoneType), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_SUBSCR None [SizeVariable(), SliceVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [SizeVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR num_attention_heads [SizeVariable(), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [SizeVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR attention_head_size [SizeVariable(), ConstantVariable(int), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_TUPLE 2 [SizeVariable(), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_ADD None [SizeVariable(), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST new_x_shape [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST x []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR view [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST new_x_shape [GetAttrVariable(TensorVariable(), view)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [GetAttrVariable(TensorVariable(), view), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST x [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST x []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR permute [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 0 [GetAttrVariable(TensorVariable(), permute)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 2 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 1 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 3 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 4 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object transpose_for_scores at 0x7f6a10140710, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 274>
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST value_layer [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR transpose_for_scores [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST mixed_query_layer [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable())]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable()), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object transpose_for_scores at 0x7f6a10140710, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 274>
275 0 LOAD_FAST 1 (x)
2 LOAD_METHOD 0 (size)
4 CALL_METHOD 0
6 LOAD_CONST 0 (None)
8 LOAD_CONST 1 (-1)
10 BUILD_SLICE 2
12 BINARY_SUBSCR
14 LOAD_FAST 0 (self)
16 LOAD_ATTR 1 (num_attention_heads)
18 LOAD_FAST 0 (self)
20 LOAD_ATTR 2 (attention_head_size)
22 BUILD_TUPLE 2
24 BINARY_ADD
26 STORE_FAST 2 (new_x_shape)
276 28 LOAD_FAST 1 (x)
30 LOAD_METHOD 3 (view)
32 LOAD_FAST 2 (new_x_shape)
34 CALL_METHOD 1
36 STORE_FAST 1 (x)
277 38 LOAD_FAST 1 (x)
40 LOAD_METHOD 4 (permute)
42 LOAD_CONST 2 (0)
44 LOAD_CONST 3 (2)
46 LOAD_CONST 4 (1)
48 LOAD_CONST 5 (3)
50 CALL_METHOD 4
52 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST x []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR size [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 0 [GetAttrVariable(TensorVariable(), size)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [SizeVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST -1 [SizeVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_SLICE 2 [SizeVariable(), ConstantVariable(NoneType), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_SUBSCR None [SizeVariable(), SliceVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [SizeVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR num_attention_heads [SizeVariable(), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [SizeVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR attention_head_size [SizeVariable(), ConstantVariable(int), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_TUPLE 2 [SizeVariable(), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_ADD None [SizeVariable(), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST new_x_shape [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST x []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR view [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST new_x_shape [GetAttrVariable(TensorVariable(), view)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [GetAttrVariable(TensorVariable(), view), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST x [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST x []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR permute [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 0 [GetAttrVariable(TensorVariable(), permute)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 2 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 1 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 3 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 4 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object transpose_for_scores at 0x7f6a10140710, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 274>
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST query_layer [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR is_decoder [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 236 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_GLOBAL torch []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR matmul [TorchVariable(<module 'torch' from '/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/torch/__init__.py'>)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST query_layer [TorchVariable(<built-in method matmul of type object at 0x7f6bb096cb40>)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST key_layer [TorchVariable(<built-in method matmul of type object at 0x7f6bb096cb40>), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR transpose [TorchVariable(<built-in method matmul of type object at 0x7f6bb096cb40>), TensorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST -1 [TorchVariable(<built-in method matmul of type object at 0x7f6bb096cb40>), TensorVariable(), GetAttrVariable(TensorVariable(), transpose)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST -2 [TorchVariable(<built-in method matmul of type object at 0x7f6bb096cb40>), TensorVariable(), GetAttrVariable(TensorVariable(), transpose), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 2 [TorchVariable(<built-in method matmul of type object at 0x7f6bb096cb40>), TensorVariable(), GetAttrVariable(TensorVariable(), transpose), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 2 [TorchVariable(<built-in method matmul of type object at 0x7f6bb096cb40>), TensorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST attention_scores [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR position_embedding_type [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST relative_key [ConstantVariable(str)]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP == [ConstantVariable(str), ConstantVariable(str)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_TRUE 280 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR position_embedding_type [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST relative_key_query [ConstantVariable(str)]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP == [ConstantVariable(str), ConstantVariable(str)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 478 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_scores []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_GLOBAL math [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR sqrt [TensorVariable(), TorchVariable(<module 'math' from '/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/lib-dynload/math.cpython-38-x86_64-linux-gnu.so'>)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [TensorVariable(), TorchVariable(<built-in function sqrt>)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR attention_head_size [TensorVariable(), TorchVariable(<built-in function sqrt>), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [TensorVariable(), TorchVariable(<built-in function sqrt>), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_TRUE_DIVIDE None [TensorVariable(), ConstantVariable(float)]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST attention_scores [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_mask []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP is not [TensorVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 512 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_scores []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_mask [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_ADD None [TensorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST attention_scores [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_GLOBAL nn []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR functional [TorchVariable(<module 'torch.nn' from '/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/torch/nn/__init__.py'>)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR softmax [TorchVariable(<module 'torch.nn.functional' from '/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/torch/nn/functional.py'>)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_scores [TorchVariable(<function softmax at 0x7f6a1d0fb5e0>)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST -1 [TorchVariable(<function softmax at 0x7f6a1d0fb5e0>), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST ('dim',) [TorchVariable(<function softmax at 0x7f6a1d0fb5e0>), TensorVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION_KW 2 [TorchVariable(<function softmax at 0x7f6a1d0fb5e0>), TensorVariable(), ConstantVariable(int), ConstantVariable(tuple)]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST attention_probs [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR dropout [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_probs [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST attention_probs [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST head_mask []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP is not [ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 556 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_GLOBAL torch []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR matmul [TorchVariable(<module 'torch' from '/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/torch/__init__.py'>)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_probs [TorchVariable(<built-in method matmul of type object at 0x7f6bb096cb40>)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST value_layer [TorchVariable(<built-in method matmul of type object at 0x7f6bb096cb40>), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 2 [TorchVariable(<built-in method matmul of type object at 0x7f6bb096cb40>), TensorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST context_layer [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST context_layer []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR permute [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 0 [GetAttrVariable(TensorVariable(), permute)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 2 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 1 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 3 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 4 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR contiguous [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 0 [GetAttrVariable(TensorVariable(), contiguous)]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST context_layer [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST context_layer []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR size [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 0 [GetAttrVariable(TensorVariable(), size)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [SizeVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST -2 [SizeVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_SLICE 2 [SizeVariable(), ConstantVariable(NoneType), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_SUBSCR None [SizeVariable(), SliceVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [SizeVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR all_head_size [SizeVariable(), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_TUPLE 1 [SizeVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_ADD None [SizeVariable(), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST new_context_layer_shape [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST context_layer []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR view [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST new_context_layer_shape [GetAttrVariable(TensorVariable(), view)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [GetAttrVariable(TensorVariable(), view), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST context_layer [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST output_attentions []
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 636 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST context_layer []
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_TUPLE 1 [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST outputs [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR is_decoder [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 660 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST outputs []
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object forward at 0x7f6a101407c0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 279>
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST self_outputs [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR output [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self_outputs [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 0 [NNModuleVariable(), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_SUBSCR None [NNModuleVariable(), TupleVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 2 [NNModuleVariable(), TensorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object forward at 0x7f6a101409d0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 381>
382 0 LOAD_FAST 0 (self)
2 LOAD_METHOD 0 (dense)
4 LOAD_FAST 1 (hidden_states)
6 CALL_METHOD 1
8 STORE_FAST 1 (hidden_states)
383 10 LOAD_FAST 0 (self)
12 LOAD_METHOD 1 (dropout)
14 LOAD_FAST 1 (hidden_states)
16 CALL_METHOD 1
18 STORE_FAST 1 (hidden_states)
384 20 LOAD_FAST 0 (self)
22 LOAD_METHOD 2 (LayerNorm)
24 LOAD_FAST 1 (hidden_states)
26 LOAD_FAST 2 (input_tensor)
28 BINARY_ADD
30 CALL_METHOD 1
32 STORE_FAST 1 (hidden_states)
385 34 LOAD_FAST 1 (hidden_states)
36 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR dense [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST hidden_states [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR dropout [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST hidden_states [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR LayerNorm [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST input_tensor [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_ADD None [NNModuleVariable(), TensorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST hidden_states [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states []
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object forward at 0x7f6a101409d0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 381>
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST attention_output [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_output []
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_TUPLE 1 [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self_outputs [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 1 [TupleVariable(), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [TupleVariable(), TupleVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_SLICE 2 [TupleVariable(), TupleVariable(), ConstantVariable(int), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_SUBSCR None [TupleVariable(), TupleVariable(), SliceVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_ADD None [TupleVariable(), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST outputs [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST outputs []
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object forward at 0x7f6a10140c90, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 413>
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST self_attention_outputs [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self_attention_outputs []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 0 [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_SUBSCR None [TupleVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST attention_output [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR is_decoder [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 80 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self_attention_outputs []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 1 [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [TupleVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_SLICE 2 [TupleVariable(), ConstantVariable(int), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_SUBSCR None [TupleVariable(), SliceVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST outputs [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None []
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST cross_attn_present_key_value [ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR is_decoder [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 222 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_GLOBAL apply_chunking_to_forward []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [UserFunctionVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR feed_forward_chunk [UserFunctionVariable(), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [UserFunctionVariable(), UserMethodVariable(<function BertLayer.feed_forward_chunk at 0x7f69ff5474c0>, NNModuleVariable())]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR chunk_size_feed_forward [UserFunctionVariable(), UserMethodVariable(<function BertLayer.feed_forward_chunk at 0x7f69ff5474c0>, NNModuleVariable()), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [UserFunctionVariable(), UserMethodVariable(<function BertLayer.feed_forward_chunk at 0x7f69ff5474c0>, NNModuleVariable()), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR seq_len_dim [UserFunctionVariable(), UserMethodVariable(<function BertLayer.feed_forward_chunk at 0x7f69ff5474c0>, NNModuleVariable()), ConstantVariable(int), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_output [UserFunctionVariable(), UserMethodVariable(<function BertLayer.feed_forward_chunk at 0x7f69ff5474c0>, NNModuleVariable()), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 4 [UserFunctionVariable(), UserMethodVariable(<function BertLayer.feed_forward_chunk at 0x7f69ff5474c0>, NNModuleVariable()), ConstantVariable(int), ConstantVariable(int), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object apply_chunking_to_forward at 0x7f6a10006030, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/pytorch_utils.py", line 169>
207 0 LOAD_GLOBAL 0 (len)
2 LOAD_FAST 3 (input_tensors)
4 CALL_FUNCTION 1
6 LOAD_CONST 1 (0)
8 COMPARE_OP 4 (>)
10 POP_JUMP_IF_TRUE 26
12 LOAD_GLOBAL 1 (AssertionError)
14 LOAD_FAST 3 (input_tensors)
16 FORMAT_VALUE 0
18 LOAD_CONST 2 (' has to be a tuple/list of tensors')
20 BUILD_STRING 2
22 CALL_FUNCTION 1
24 RAISE_VARARGS 1
210 >> 26 LOAD_GLOBAL 0 (len)
28 LOAD_GLOBAL 2 (inspect)
30 LOAD_METHOD 3 (signature)
32 LOAD_DEREF 1 (forward_fn)
34 CALL_METHOD 1
36 LOAD_ATTR 4 (parameters)
38 CALL_FUNCTION 1
40 STORE_FAST 4 (num_args_in_forward_chunk_fn)
211 42 LOAD_FAST 4 (num_args_in_forward_chunk_fn)
44 LOAD_GLOBAL 0 (len)
46 LOAD_FAST 3 (input_tensors)
48 CALL_FUNCTION 1
50 COMPARE_OP 3 (!=)
52 POP_JUMP_IF_FALSE 80
212 54 LOAD_GLOBAL 5 (ValueError)
213 56 LOAD_CONST 3 ('forward_chunk_fn expects ')
58 LOAD_FAST 4 (num_args_in_forward_chunk_fn)
60 FORMAT_VALUE 0
62 LOAD_CONST 4 (' arguments, but only ')
64 LOAD_GLOBAL 0 (len)
66 LOAD_FAST 3 (input_tensors)
68 CALL_FUNCTION 1
70 FORMAT_VALUE 0
72 LOAD_CONST 5 (' input tensors are given')
74 BUILD_STRING 5
212 76 CALL_FUNCTION 1
78 RAISE_VARARGS 1
217 >> 80 LOAD_FAST 1 (chunk_size)
82 LOAD_CONST 1 (0)
84 COMPARE_OP 4 (>)
86 EXTENDED_ARG 1
88 POP_JUMP_IF_FALSE 288
218 90 LOAD_FAST 3 (input_tensors)
92 LOAD_CONST 1 (0)
94 BINARY_SUBSCR
96 LOAD_ATTR 6 (shape)
98 LOAD_DEREF 0 (chunk_dim)
100 BINARY_SUBSCR
102 STORE_FAST 5 (tensor_shape)
219 104 LOAD_FAST 3 (input_tensors)
106 GET_ITER
>> 108 FOR_ITER 44 (to 154)
110 STORE_FAST 6 (input_tensor)
220 112 LOAD_FAST 6 (input_tensor)
114 LOAD_ATTR 6 (shape)
116 LOAD_DEREF 0 (chunk_dim)
118 BINARY_SUBSCR
120 LOAD_FAST 5 (tensor_shape)
122 COMPARE_OP 3 (!=)
124 POP_JUMP_IF_FALSE 108
221 126 LOAD_GLOBAL 5 (ValueError)
222 128 LOAD_CONST 6 ('All input tenors have to be of the same shape: ')
130 LOAD_FAST 5 (tensor_shape)
132 FORMAT_VALUE 0
134 LOAD_CONST 7 (', found shape ')
136 LOAD_FAST 6 (input_tensor)
138 LOAD_ATTR 6 (shape)
140 LOAD_DEREF 0 (chunk_dim)
142 BINARY_SUBSCR
144 FORMAT_VALUE 0
146 BUILD_STRING 4
221 148 CALL_FUNCTION 1
150 RAISE_VARARGS 1
152 JUMP_ABSOLUTE 108
226 >> 154 LOAD_FAST 3 (input_tensors)
156 LOAD_CONST 1 (0)
158 BINARY_SUBSCR
160 LOAD_ATTR 6 (shape)
162 LOAD_DEREF 0 (chunk_dim)
164 BINARY_SUBSCR
166 LOAD_FAST 1 (chunk_size)
168 BINARY_MODULO
170 LOAD_CONST 1 (0)
172 COMPARE_OP 3 (!=)
174 POP_JUMP_IF_FALSE 206
227 176 LOAD_GLOBAL 5 (ValueError)
228 178 LOAD_CONST 8 ('The dimension to be chunked ')
180 LOAD_FAST 3 (input_tensors)
182 LOAD_CONST 1 (0)
184 BINARY_SUBSCR
186 LOAD_ATTR 6 (shape)
188 LOAD_DEREF 0 (chunk_dim)
190 BINARY_SUBSCR
192 FORMAT_VALUE 0
194 LOAD_CONST 9 (' has to be a multiple of the chunk size ')
196 LOAD_FAST 1 (chunk_size)
198 FORMAT_VALUE 0
200 BUILD_STRING 4
227 202 CALL_FUNCTION 1
204 RAISE_VARARGS 1
232 >> 206 LOAD_FAST 3 (input_tensors)
208 LOAD_CONST 1 (0)
210 BINARY_SUBSCR
212 LOAD_ATTR 6 (shape)
214 LOAD_DEREF 0 (chunk_dim)
216 BINARY_SUBSCR
218 LOAD_FAST 1 (chunk_size)
220 BINARY_FLOOR_DIVIDE
222 STORE_DEREF 2 (num_chunks)
235 224 LOAD_GLOBAL 7 (tuple)
226 LOAD_CLOSURE 0 (chunk_dim)
228 LOAD_CLOSURE 2 (num_chunks)
230 BUILD_TUPLE 2
232 LOAD_CONST 10 (<code object <genexpr> at 0x7f6a10083ea0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/pytorch_utils.py", line 235>)
234 LOAD_CONST 11 ('apply_chunking_to_forward.<locals>.<genexpr>')
236 MAKE_FUNCTION 8 (closure)
238 LOAD_FAST 3 (input_tensors)
240 GET_ITER
242 CALL_FUNCTION 1
244 CALL_FUNCTION 1
246 STORE_FAST 7 (input_tensors_chunks)
237 248 LOAD_GLOBAL 7 (tuple)
250 LOAD_CLOSURE 1 (forward_fn)
252 BUILD_TUPLE 1
254 LOAD_CONST 12 (<code object <genexpr> at 0x7f6a10083f50, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/pytorch_utils.py", line 237>)
256 LOAD_CONST 11 ('apply_chunking_to_forward.<locals>.<genexpr>')
258 MAKE_FUNCTION 8 (closure)
260 LOAD_GLOBAL 8 (zip)
262 LOAD_FAST 7 (input_tensors_chunks)
264 CALL_FUNCTION_EX 0
266 GET_ITER
268 CALL_FUNCTION 1
270 CALL_FUNCTION 1
272 STORE_FAST 8 (output_chunks)
239 274 LOAD_GLOBAL 9 (torch)
276 LOAD_ATTR 10 (cat)
278 LOAD_FAST 8 (output_chunks)
280 LOAD_DEREF 0 (chunk_dim)
282 LOAD_CONST 13 (('dim',))
284 CALL_FUNCTION_KW 2
286 RETURN_VALUE
241 >> 288 LOAD_DEREF 1 (forward_fn)
290 LOAD_FAST 3 (input_tensors)
292 CALL_FUNCTION_EX 0
294 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_GLOBAL len []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST input_tensors [BuiltinVariable(len)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [BuiltinVariable(len), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 0 [ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP > [ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_TRUE 26 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_GLOBAL len []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_GLOBAL inspect [BuiltinVariable(len)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR signature [BuiltinVariable(len), PythonModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_DEREF forward_fn [BuiltinVariable(len), LambdaVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [BuiltinVariable(len), LambdaVariable(), UserMethodVariable(<function BertLayer.feed_forward_chunk at 0x7f69ff5474c0>, NNModuleVariable())]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR parameters [BuiltinVariable(len), InspectSignatureVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [BuiltinVariable(len), GetAttrVariable(InspectSignatureVariable(), parameters)]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST num_args_in_forward_chunk_fn [ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST num_args_in_forward_chunk_fn []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_GLOBAL len [ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST input_tensors [ConstantVariable(int), BuiltinVariable(len)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [ConstantVariable(int), BuiltinVariable(len), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP != [ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 80 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST chunk_size []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 0 [ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP > [ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 288 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_DEREF forward_fn []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST input_tensors [UserMethodVariable(<function BertLayer.feed_forward_chunk at 0x7f69ff5474c0>, NNModuleVariable())]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION_EX 0 [UserMethodVariable(<function BertLayer.feed_forward_chunk at 0x7f69ff5474c0>, NNModuleVariable()), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object feed_forward_chunk at 0x7f6a100c83a0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 546>
547 0 LOAD_FAST 0 (self)
2 LOAD_METHOD 0 (intermediate)
4 LOAD_FAST 1 (attention_output)
6 CALL_METHOD 1
8 STORE_FAST 2 (intermediate_output)
548 10 LOAD_FAST 0 (self)
12 LOAD_METHOD 1 (output)
14 LOAD_FAST 2 (intermediate_output)
16 LOAD_FAST 1 (attention_output)
18 CALL_METHOD 2
20 STORE_FAST 3 (layer_output)
549 22 LOAD_FAST 3 (layer_output)
24 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR intermediate [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_output [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object forward at 0x7f6a10140ea0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 446>
447 0 LOAD_FAST 0 (self)
2 LOAD_METHOD 0 (dense)
4 LOAD_FAST 1 (hidden_states)
6 CALL_METHOD 1
8 STORE_FAST 1 (hidden_states)
448 10 LOAD_FAST 0 (self)
12 LOAD_METHOD 1 (intermediate_act_fn)
14 LOAD_FAST 1 (hidden_states)
16 CALL_METHOD 1
18 STORE_FAST 1 (hidden_states)
449 20 LOAD_FAST 1 (hidden_states)
22 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR dense [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST hidden_states [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR intermediate_act_fn [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object forward at 0x7f6a100e12f0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/activations.py", line 55>
56 0 LOAD_FAST 0 (self)
2 LOAD_METHOD 0 (act)
4 LOAD_FAST 1 (input)
6 CALL_METHOD 1
8 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR act [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST input [TorchVariable(<built-in function gelu>)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [TorchVariable(<built-in function gelu>), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object forward at 0x7f6a100e12f0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/activations.py", line 55>
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST hidden_states [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states []
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object forward at 0x7f6a10140ea0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 446>
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST intermediate_output [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR output [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST intermediate_output [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_output [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 2 [NNModuleVariable(), TensorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object forward at 0x7f6a100c80e0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 459>
460 0 LOAD_FAST 0 (self)
2 LOAD_METHOD 0 (dense)
4 LOAD_FAST 1 (hidden_states)
6 CALL_METHOD 1
8 STORE_FAST 1 (hidden_states)
461 10 LOAD_FAST 0 (self)
12 LOAD_METHOD 1 (dropout)
14 LOAD_FAST 1 (hidden_states)
16 CALL_METHOD 1
18 STORE_FAST 1 (hidden_states)
462 20 LOAD_FAST 0 (self)
22 LOAD_METHOD 2 (LayerNorm)
24 LOAD_FAST 1 (hidden_states)
26 LOAD_FAST 2 (input_tensor)
28 BINARY_ADD
30 CALL_METHOD 1
32 STORE_FAST 1 (hidden_states)
463 34 LOAD_FAST 1 (hidden_states)
36 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR dense [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST hidden_states [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR dropout [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST hidden_states [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR LayerNorm [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST input_tensor [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_ADD None [NNModuleVariable(), TensorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST hidden_states [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states []
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object forward at 0x7f6a100c80e0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 459>
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST layer_output [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST layer_output []
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object feed_forward_chunk at 0x7f6a100c83a0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 546>
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object apply_chunking_to_forward at 0x7f6a10006030, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/pytorch_utils.py", line 169>
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST layer_output [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST layer_output []
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_TUPLE 1 [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST outputs [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_ADD None [TupleVariable(), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST outputs [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR is_decoder [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 270 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST outputs []
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object forward at 0x7f6a100c82f0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 481>
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST layer_outputs [ListIteratorVariable(), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST layer_outputs [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 0 [ListIteratorVariable(), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_SUBSCR None [ListIteratorVariable(), TupleVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST hidden_states [ListIteratorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST use_cache [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 248 [ListIteratorVariable(), ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_DEREF output_attentions [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 66 [ListIteratorVariable(), ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE FOR_ITER 290 [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE UNPACK_SEQUENCE 2 [ListIteratorVariable(), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST i [ListIteratorVariable(), NNModuleVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST layer_module [ListIteratorVariable(), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST output_hidden_states [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 88 [ListIteratorVariable(), ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST head_mask [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [ListIteratorVariable(), ListVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP is not [ListIteratorVariable(), ListVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 104 [ListIteratorVariable(), ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST head_mask [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST i [ListIteratorVariable(), ListVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_SUBSCR None [ListIteratorVariable(), ListVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE JUMP_FORWARD 106 [ListIteratorVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST layer_head_mask [ListIteratorVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST past_key_values [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [ListIteratorVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP is not [ListIteratorVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 124 [ListIteratorVariable(), ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_DEREF past_key_value [ListIteratorVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR gradient_checkpointing [ListIteratorVariable(), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 202 [ListIteratorVariable(), ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST layer_module [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [ListIteratorVariable(), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_mask [ListIteratorVariable(), NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST layer_head_mask [ListIteratorVariable(), NNModuleVariable(), TensorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST encoder_hidden_states [ListIteratorVariable(), NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST encoder_attention_mask [ListIteratorVariable(), NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_DEREF past_key_value [ListIteratorVariable(), NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_DEREF output_attentions [ListIteratorVariable(), NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 7 [ListIteratorVariable(), NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object forward at 0x7f6a100c82f0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 481>
492 0 LOAD_FAST 6 (past_key_value)
2 LOAD_CONST 0 (None)
4 COMPARE_OP 9 (is not)
6 POP_JUMP_IF_FALSE 20
8 LOAD_FAST 6 (past_key_value)
10 LOAD_CONST 0 (None)
12 LOAD_CONST 1 (2)
14 BUILD_SLICE 2
16 BINARY_SUBSCR
18 JUMP_FORWARD 2 (to 22)
>> 20 LOAD_CONST 0 (None)
>> 22 STORE_FAST 8 (self_attn_past_key_value)
493 24 LOAD_FAST 0 (self)
26 LOAD_ATTR 0 (attention)
494 28 LOAD_FAST 1 (hidden_states)
495 30 LOAD_FAST 2 (attention_mask)
496 32 LOAD_FAST 3 (head_mask)
497 34 LOAD_FAST 7 (output_attentions)
498 36 LOAD_FAST 8 (self_attn_past_key_value)
493 38 LOAD_CONST 2 (('output_attentions', 'past_key_value'))
40 CALL_FUNCTION_KW 5
42 STORE_FAST 9 (self_attention_outputs)
500 44 LOAD_FAST 9 (self_attention_outputs)
46 LOAD_CONST 3 (0)
48 BINARY_SUBSCR
50 STORE_FAST 10 (attention_output)
503 52 LOAD_FAST 0 (self)
54 LOAD_ATTR 1 (is_decoder)
56 POP_JUMP_IF_FALSE 80
504 58 LOAD_FAST 9 (self_attention_outputs)
60 LOAD_CONST 4 (1)
62 LOAD_CONST 5 (-1)
64 BUILD_SLICE 2
66 BINARY_SUBSCR
68 STORE_FAST 11 (outputs)
505 70 LOAD_FAST 9 (self_attention_outputs)
72 LOAD_CONST 5 (-1)
74 BINARY_SUBSCR
76 STORE_FAST 12 (present_key_value)
78 JUMP_FORWARD 12 (to 92)
507 >> 80 LOAD_FAST 9 (self_attention_outputs)
82 LOAD_CONST 4 (1)
84 LOAD_CONST 0 (None)
86 BUILD_SLICE 2
88 BINARY_SUBSCR
90 STORE_FAST 11 (outputs)
509 >> 92 LOAD_CONST 0 (None)
94 STORE_FAST 13 (cross_attn_present_key_value)
510 96 LOAD_FAST 0 (self)
98 LOAD_ATTR 1 (is_decoder)
100 POP_JUMP_IF_FALSE 222
102 LOAD_FAST 4 (encoder_hidden_states)
104 LOAD_CONST 0 (None)
106 COMPARE_OP 9 (is not)
108 POP_JUMP_IF_FALSE 222
511 110 LOAD_GLOBAL 2 (hasattr)
112 LOAD_FAST 0 (self)
114 LOAD_CONST 6 ('crossattention')
116 CALL_FUNCTION 2
118 POP_JUMP_IF_TRUE 136
512 120 LOAD_GLOBAL 3 (ValueError)
513 122 LOAD_CONST 7 ('If `encoder_hidden_states` are passed, ')
124 LOAD_FAST 0 (self)
126 FORMAT_VALUE 0
128 LOAD_CONST 8 (' has to be instantiated with cross-attention layers by setting `config.add_cross_attention=True`')
130 BUILD_STRING 3
512 132 CALL_FUNCTION 1
134 RAISE_VARARGS 1
518 >> 136 LOAD_FAST 6 (past_key_value)
138 LOAD_CONST 0 (None)
140 COMPARE_OP 9 (is not)
142 POP_JUMP_IF_FALSE 156
144 LOAD_FAST 6 (past_key_value)
146 LOAD_CONST 9 (-2)
148 LOAD_CONST 0 (None)
150 BUILD_SLICE 2
152 BINARY_SUBSCR
154 JUMP_FORWARD 2 (to 158)
>> 156 LOAD_CONST 0 (None)
>> 158 STORE_FAST 14 (cross_attn_past_key_value)
519 160 LOAD_FAST 0 (self)
162 LOAD_METHOD 4 (crossattention)
520 164 LOAD_FAST 10 (attention_output)
521 166 LOAD_FAST 2 (attention_mask)
522 168 LOAD_FAST 3 (head_mask)
523 170 LOAD_FAST 4 (encoder_hidden_states)
524 172 LOAD_FAST 5 (encoder_attention_mask)
525 174 LOAD_FAST 14 (cross_attn_past_key_value)
526 176 LOAD_FAST 7 (output_attentions)
519 178 CALL_METHOD 7
180 STORE_FAST 15 (cross_attention_outputs)
528 182 LOAD_FAST 15 (cross_attention_outputs)
184 LOAD_CONST 3 (0)
186 BINARY_SUBSCR
188 STORE_FAST 10 (attention_output)
529 190 LOAD_FAST 11 (outputs)
192 LOAD_FAST 15 (cross_attention_outputs)
194 LOAD_CONST 4 (1)
196 LOAD_CONST 5 (-1)
198 BUILD_SLICE 2
200 BINARY_SUBSCR
202 BINARY_ADD
204 STORE_FAST 11 (outputs)
532 206 LOAD_FAST 15 (cross_attention_outputs)
208 LOAD_CONST 5 (-1)
210 BINARY_SUBSCR
212 STORE_FAST 13 (cross_attn_present_key_value)
533 214 LOAD_FAST 12 (present_key_value)
216 LOAD_FAST 13 (cross_attn_present_key_value)
218 BINARY_ADD
220 STORE_FAST 12 (present_key_value)
535 >> 222 LOAD_GLOBAL 5 (apply_chunking_to_forward)
536 224 LOAD_FAST 0 (self)
226 LOAD_ATTR 6 (feed_forward_chunk)
228 LOAD_FAST 0 (self)
230 LOAD_ATTR 7 (chunk_size_feed_forward)
232 LOAD_FAST 0 (self)
234 LOAD_ATTR 8 (seq_len_dim)
236 LOAD_FAST 10 (attention_output)
535 238 CALL_FUNCTION 4
240 STORE_FAST 16 (layer_output)
538 242 LOAD_FAST 16 (layer_output)
244 BUILD_TUPLE 1
246 LOAD_FAST 11 (outputs)
248 BINARY_ADD
250 STORE_FAST 11 (outputs)
541 252 LOAD_FAST 0 (self)
254 LOAD_ATTR 1 (is_decoder)
256 EXTENDED_ARG 1
258 POP_JUMP_IF_FALSE 270
542 260 LOAD_FAST 11 (outputs)
262 LOAD_FAST 12 (present_key_value)
264 BUILD_TUPLE 1
266 BINARY_ADD
268 STORE_FAST 11 (outputs)
544 >> 270 LOAD_FAST 11 (outputs)
272 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST past_key_value []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP is not [ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 20 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None []
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST self_attn_past_key_value [ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR attention [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_mask [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST head_mask [NNModuleVariable(), TensorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST output_attentions [NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self_attn_past_key_value [NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST ('output_attentions', 'past_key_value') [NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(bool), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION_KW 5 [NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(bool), ConstantVariable(NoneType), ConstantVariable(tuple)]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object forward at 0x7f6a10140c90, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 413>
423 0 LOAD_FAST 0 (self)
2 LOAD_METHOD 0 (self)
424 4 LOAD_FAST 1 (hidden_states)
425 6 LOAD_FAST 2 (attention_mask)
426 8 LOAD_FAST 3 (head_mask)
427 10 LOAD_FAST 4 (encoder_hidden_states)
428 12 LOAD_FAST 5 (encoder_attention_mask)
429 14 LOAD_FAST 6 (past_key_value)
430 16 LOAD_FAST 7 (output_attentions)
423 18 CALL_METHOD 7
20 STORE_FAST 8 (self_outputs)
432 22 LOAD_FAST 0 (self)
24 LOAD_METHOD 1 (output)
26 LOAD_FAST 8 (self_outputs)
28 LOAD_CONST 1 (0)
30 BINARY_SUBSCR
32 LOAD_FAST 1 (hidden_states)
34 CALL_METHOD 2
36 STORE_FAST 9 (attention_output)
433 38 LOAD_FAST 9 (attention_output)
40 BUILD_TUPLE 1
42 LOAD_FAST 8 (self_outputs)
44 LOAD_CONST 2 (1)
46 LOAD_CONST 0 (None)
48 BUILD_SLICE 2
50 BINARY_SUBSCR
52 BINARY_ADD
54 STORE_FAST 10 (outputs)
434 56 LOAD_FAST 10 (outputs)
58 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR self [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_mask [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST head_mask [NNModuleVariable(), TensorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST encoder_hidden_states [NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST encoder_attention_mask [NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST past_key_value [NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST output_attentions [NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 7 [NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object forward at 0x7f6a101407c0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 279>
289 0 LOAD_FAST 0 (self)
2 LOAD_METHOD 0 (query)
4 LOAD_FAST 1 (hidden_states)
6 CALL_METHOD 1
8 STORE_FAST 8 (mixed_query_layer)
294 10 LOAD_FAST 4 (encoder_hidden_states)
12 LOAD_CONST 0 (None)
14 COMPARE_OP 9 (is not)
16 STORE_FAST 9 (is_cross_attention)
296 18 LOAD_FAST 9 (is_cross_attention)
20 POP_JUMP_IF_FALSE 52
22 LOAD_FAST 6 (past_key_value)
24 LOAD_CONST 0 (None)
26 COMPARE_OP 9 (is not)
28 POP_JUMP_IF_FALSE 52
298 30 LOAD_FAST 6 (past_key_value)
32 LOAD_CONST 1 (0)
34 BINARY_SUBSCR
36 STORE_FAST 10 (key_layer)
299 38 LOAD_FAST 6 (past_key_value)
40 LOAD_CONST 2 (1)
42 BINARY_SUBSCR
44 STORE_FAST 11 (value_layer)
300 46 LOAD_FAST 5 (encoder_attention_mask)
48 STORE_FAST 2 (attention_mask)
50 JUMP_FORWARD 160 (to 212)
301 >> 52 LOAD_FAST 9 (is_cross_attention)
54 POP_JUMP_IF_FALSE 94
302 56 LOAD_FAST 0 (self)
58 LOAD_METHOD 1 (transpose_for_scores)
60 LOAD_FAST 0 (self)
62 LOAD_METHOD 2 (key)
64 LOAD_FAST 4 (encoder_hidden_states)
66 CALL_METHOD 1
68 CALL_METHOD 1
70 STORE_FAST 10 (key_layer)
303 72 LOAD_FAST 0 (self)
74 LOAD_METHOD 1 (transpose_for_scores)
76 LOAD_FAST 0 (self)
78 LOAD_METHOD 3 (value)
80 LOAD_FAST 4 (encoder_hidden_states)
82 CALL_METHOD 1
84 CALL_METHOD 1
86 STORE_FAST 11 (value_layer)
304 88 LOAD_FAST 5 (encoder_attention_mask)
90 STORE_FAST 2 (attention_mask)
92 JUMP_FORWARD 118 (to 212)
305 >> 94 LOAD_FAST 6 (past_key_value)
96 LOAD_CONST 0 (None)
98 COMPARE_OP 9 (is not)
100 POP_JUMP_IF_FALSE 180
306 102 LOAD_FAST 0 (self)
104 LOAD_METHOD 1 (transpose_for_scores)
106 LOAD_FAST 0 (self)
108 LOAD_METHOD 2 (key)
110 LOAD_FAST 1 (hidden_states)
112 CALL_METHOD 1
114 CALL_METHOD 1
116 STORE_FAST 10 (key_layer)
307 118 LOAD_FAST 0 (self)
120 LOAD_METHOD 1 (transpose_for_scores)
122 LOAD_FAST 0 (self)
124 LOAD_METHOD 3 (value)
126 LOAD_FAST 1 (hidden_states)
128 CALL_METHOD 1
130 CALL_METHOD 1
132 STORE_FAST 11 (value_layer)
308 134 LOAD_GLOBAL 4 (torch)
136 LOAD_ATTR 5 (cat)
138 LOAD_FAST 6 (past_key_value)
140 LOAD_CONST 1 (0)
142 BINARY_SUBSCR
144 LOAD_FAST 10 (key_layer)
146 BUILD_LIST 2
148 LOAD_CONST 3 (2)
150 LOAD_CONST 4 (('dim',))
152 CALL_FUNCTION_KW 2
154 STORE_FAST 10 (key_layer)
309 156 LOAD_GLOBAL 4 (torch)
158 LOAD_ATTR 5 (cat)
160 LOAD_FAST 6 (past_key_value)
162 LOAD_CONST 2 (1)
164 BINARY_SUBSCR
166 LOAD_FAST 11 (value_layer)
168 BUILD_LIST 2
170 LOAD_CONST 3 (2)
172 LOAD_CONST 4 (('dim',))
174 CALL_FUNCTION_KW 2
176 STORE_FAST 11 (value_layer)
178 JUMP_FORWARD 32 (to 212)
311 >> 180 LOAD_FAST 0 (self)
182 LOAD_METHOD 1 (transpose_for_scores)
184 LOAD_FAST 0 (self)
186 LOAD_METHOD 2 (key)
188 LOAD_FAST 1 (hidden_states)
190 CALL_METHOD 1
192 CALL_METHOD 1
194 STORE_FAST 10 (key_layer)
312 196 LOAD_FAST 0 (self)
198 LOAD_METHOD 1 (transpose_for_scores)
200 LOAD_FAST 0 (self)
202 LOAD_METHOD 3 (value)
204 LOAD_FAST 1 (hidden_states)
206 CALL_METHOD 1
208 CALL_METHOD 1
210 STORE_FAST 11 (value_layer)
314 >> 212 LOAD_FAST 0 (self)
214 LOAD_METHOD 1 (transpose_for_scores)
216 LOAD_FAST 8 (mixed_query_layer)
218 CALL_METHOD 1
220 STORE_FAST 12 (query_layer)
316 222 LOAD_FAST 0 (self)
224 LOAD_ATTR 6 (is_decoder)
226 POP_JUMP_IF_FALSE 236
324 228 LOAD_FAST 10 (key_layer)
230 LOAD_FAST 11 (value_layer)
232 BUILD_TUPLE 2
234 STORE_FAST 6 (past_key_value)
327 >> 236 LOAD_GLOBAL 4 (torch)
238 LOAD_METHOD 7 (matmul)
240 LOAD_FAST 12 (query_layer)
242 LOAD_FAST 10 (key_layer)
244 LOAD_METHOD 8 (transpose)
246 LOAD_CONST 5 (-1)
248 LOAD_CONST 6 (-2)
250 CALL_METHOD 2
252 CALL_METHOD 2
254 STORE_FAST 13 (attention_scores)
329 256 LOAD_FAST 0 (self)
258 LOAD_ATTR 9 (position_embedding_type)
260 LOAD_CONST 7 ('relative_key')
262 COMPARE_OP 2 (==)
264 EXTENDED_ARG 1
266 POP_JUMP_IF_TRUE 280
268 LOAD_FAST 0 (self)
270 LOAD_ATTR 9 (position_embedding_type)
272 LOAD_CONST 8 ('relative_key_query')
274 COMPARE_OP 2 (==)
276 EXTENDED_ARG 1
278 POP_JUMP_IF_FALSE 478
330 >> 280 LOAD_FAST 1 (hidden_states)
282 LOAD_METHOD 10 (size)
284 CALL_METHOD 0
286 LOAD_CONST 2 (1)
288 BINARY_SUBSCR
290 STORE_FAST 14 (seq_length)
331 292 LOAD_GLOBAL 4 (torch)
294 LOAD_ATTR 11 (arange)
296 LOAD_FAST 14 (seq_length)
298 LOAD_GLOBAL 4 (torch)
300 LOAD_ATTR 12 (long)
302 LOAD_FAST 1 (hidden_states)
304 LOAD_ATTR 13 (device)
306 LOAD_CONST 9 (('dtype', 'device'))
308 CALL_FUNCTION_KW 3
310 LOAD_METHOD 14 (view)
312 LOAD_CONST 5 (-1)
314 LOAD_CONST 2 (1)
316 CALL_METHOD 2
318 STORE_FAST 15 (position_ids_l)
332 320 LOAD_GLOBAL 4 (torch)
322 LOAD_ATTR 11 (arange)
324 LOAD_FAST 14 (seq_length)
326 LOAD_GLOBAL 4 (torch)
328 LOAD_ATTR 12 (long)
330 LOAD_FAST 1 (hidden_states)
332 LOAD_ATTR 13 (device)
334 LOAD_CONST 9 (('dtype', 'device'))
336 CALL_FUNCTION_KW 3
338 LOAD_METHOD 14 (view)
340 LOAD_CONST 2 (1)
342 LOAD_CONST 5 (-1)
344 CALL_METHOD 2
346 STORE_FAST 16 (position_ids_r)
333 348 LOAD_FAST 15 (position_ids_l)
350 LOAD_FAST 16 (position_ids_r)
352 BINARY_SUBTRACT
354 STORE_FAST 17 (distance)
334 356 LOAD_FAST 0 (self)
358 LOAD_METHOD 15 (distance_embedding)
360 LOAD_FAST 17 (distance)
362 LOAD_FAST 0 (self)
364 LOAD_ATTR 16 (max_position_embeddings)
366 BINARY_ADD
368 LOAD_CONST 2 (1)
370 BINARY_SUBTRACT
372 CALL_METHOD 1
374 STORE_FAST 18 (positional_embedding)
335 376 LOAD_FAST 18 (positional_embedding)
378 LOAD_ATTR 17 (to)
380 LOAD_FAST 12 (query_layer)
382 LOAD_ATTR 18 (dtype)
384 LOAD_CONST 10 (('dtype',))
386 CALL_FUNCTION_KW 1
388 STORE_FAST 18 (positional_embedding)
337 390 LOAD_FAST 0 (self)
392 LOAD_ATTR 9 (position_embedding_type)
394 LOAD_CONST 7 ('relative_key')
396 COMPARE_OP 2 (==)
398 EXTENDED_ARG 1
400 POP_JUMP_IF_FALSE 426
338 402 LOAD_GLOBAL 4 (torch)
404 LOAD_METHOD 19 (einsum)
406 LOAD_CONST 11 ('bhld,lrd->bhlr')
408 LOAD_FAST 12 (query_layer)
410 LOAD_FAST 18 (positional_embedding)
412 CALL_METHOD 3
414 STORE_FAST 19 (relative_position_scores)
339 416 LOAD_FAST 13 (attention_scores)
418 LOAD_FAST 19 (relative_position_scores)
420 BINARY_ADD
422 STORE_FAST 13 (attention_scores)
424 JUMP_FORWARD 52 (to 478)
340 >> 426 LOAD_FAST 0 (self)
428 LOAD_ATTR 9 (position_embedding_type)
430 LOAD_CONST 8 ('relative_key_query')
432 COMPARE_OP 2 (==)
434 EXTENDED_ARG 1
436 POP_JUMP_IF_FALSE 478
341 438 LOAD_GLOBAL 4 (torch)
440 LOAD_METHOD 19 (einsum)
442 LOAD_CONST 11 ('bhld,lrd->bhlr')
444 LOAD_FAST 12 (query_layer)
446 LOAD_FAST 18 (positional_embedding)
448 CALL_METHOD 3
450 STORE_FAST 20 (relative_position_scores_query)
342 452 LOAD_GLOBAL 4 (torch)
454 LOAD_METHOD 19 (einsum)
456 LOAD_CONST 12 ('bhrd,lrd->bhlr')
458 LOAD_FAST 10 (key_layer)
460 LOAD_FAST 18 (positional_embedding)
462 CALL_METHOD 3
464 STORE_FAST 21 (relative_position_scores_key)
343 466 LOAD_FAST 13 (attention_scores)
468 LOAD_FAST 20 (relative_position_scores_query)
470 BINARY_ADD
472 LOAD_FAST 21 (relative_position_scores_key)
474 BINARY_ADD
476 STORE_FAST 13 (attention_scores)
345 >> 478 LOAD_FAST 13 (attention_scores)
480 LOAD_GLOBAL 20 (math)
482 LOAD_METHOD 21 (sqrt)
484 LOAD_FAST 0 (self)
486 LOAD_ATTR 22 (attention_head_size)
488 CALL_METHOD 1
490 BINARY_TRUE_DIVIDE
492 STORE_FAST 13 (attention_scores)
346 494 LOAD_FAST 2 (attention_mask)
496 LOAD_CONST 0 (None)
498 COMPARE_OP 9 (is not)
500 EXTENDED_ARG 2
502 POP_JUMP_IF_FALSE 512
348 504 LOAD_FAST 13 (attention_scores)
506 LOAD_FAST 2 (attention_mask)
508 BINARY_ADD
510 STORE_FAST 13 (attention_scores)
351 >> 512 LOAD_GLOBAL 23 (nn)
514 LOAD_ATTR 24 (functional)
516 LOAD_ATTR 25 (softmax)
518 LOAD_FAST 13 (attention_scores)
520 LOAD_CONST 5 (-1)
522 LOAD_CONST 4 (('dim',))
524 CALL_FUNCTION_KW 2
526 STORE_FAST 22 (attention_probs)
355 528 LOAD_FAST 0 (self)
530 LOAD_METHOD 26 (dropout)
532 LOAD_FAST 22 (attention_probs)
534 CALL_METHOD 1
536 STORE_FAST 22 (attention_probs)
358 538 LOAD_FAST 3 (head_mask)
540 LOAD_CONST 0 (None)
542 COMPARE_OP 9 (is not)
544 EXTENDED_ARG 2
546 POP_JUMP_IF_FALSE 556
359 548 LOAD_FAST 22 (attention_probs)
550 LOAD_FAST 3 (head_mask)
552 BINARY_MULTIPLY
554 STORE_FAST 22 (attention_probs)
361 >> 556 LOAD_GLOBAL 4 (torch)
558 LOAD_METHOD 7 (matmul)
560 LOAD_FAST 22 (attention_probs)
562 LOAD_FAST 11 (value_layer)
564 CALL_METHOD 2
566 STORE_FAST 23 (context_layer)
363 568 LOAD_FAST 23 (context_layer)
570 LOAD_METHOD 27 (permute)
572 LOAD_CONST 1 (0)
574 LOAD_CONST 3 (2)
576 LOAD_CONST 2 (1)
578 LOAD_CONST 13 (3)
580 CALL_METHOD 4
582 LOAD_METHOD 28 (contiguous)
584 CALL_METHOD 0
586 STORE_FAST 23 (context_layer)
364 588 LOAD_FAST 23 (context_layer)
590 LOAD_METHOD 10 (size)
592 CALL_METHOD 0
594 LOAD_CONST 0 (None)
596 LOAD_CONST 6 (-2)
598 BUILD_SLICE 2
600 BINARY_SUBSCR
602 LOAD_FAST 0 (self)
604 LOAD_ATTR 29 (all_head_size)
606 BUILD_TUPLE 1
608 BINARY_ADD
610 STORE_FAST 24 (new_context_layer_shape)
365 612 LOAD_FAST 23 (context_layer)
614 LOAD_METHOD 14 (view)
616 LOAD_FAST 24 (new_context_layer_shape)
618 CALL_METHOD 1
620 STORE_FAST 23 (context_layer)
367 622 LOAD_FAST 7 (output_attentions)
624 EXTENDED_ARG 2
626 POP_JUMP_IF_FALSE 636
628 LOAD_FAST 23 (context_layer)
630 LOAD_FAST 22 (attention_probs)
632 BUILD_TUPLE 2
634 JUMP_FORWARD 4 (to 640)
>> 636 LOAD_FAST 23 (context_layer)
638 BUILD_TUPLE 1
>> 640 STORE_FAST 25 (outputs)
369 642 LOAD_FAST 0 (self)
644 LOAD_ATTR 6 (is_decoder)
646 EXTENDED_ARG 2
648 POP_JUMP_IF_FALSE 660
370 650 LOAD_FAST 25 (outputs)
652 LOAD_FAST 6 (past_key_value)
654 BUILD_TUPLE 1
656 BINARY_ADD
658 STORE_FAST 25 (outputs)
371 >> 660 LOAD_FAST 25 (outputs)
662 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR query [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST mixed_query_layer [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST encoder_hidden_states []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP is not [ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST is_cross_attention [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST is_cross_attention []
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 52 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST is_cross_attention []
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 94 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST past_key_value []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP is not [ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 180 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR transpose_for_scores [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable())]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR key [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable()), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable()), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable()), NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable()), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object transpose_for_scores at 0x7f6a10140710, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 274>
275 0 LOAD_FAST 1 (x)
2 LOAD_METHOD 0 (size)
4 CALL_METHOD 0
6 LOAD_CONST 0 (None)
8 LOAD_CONST 1 (-1)
10 BUILD_SLICE 2
12 BINARY_SUBSCR
14 LOAD_FAST 0 (self)
16 LOAD_ATTR 1 (num_attention_heads)
18 LOAD_FAST 0 (self)
20 LOAD_ATTR 2 (attention_head_size)
22 BUILD_TUPLE 2
24 BINARY_ADD
26 STORE_FAST 2 (new_x_shape)
276 28 LOAD_FAST 1 (x)
30 LOAD_METHOD 3 (view)
32 LOAD_FAST 2 (new_x_shape)
34 CALL_METHOD 1
36 STORE_FAST 1 (x)
277 38 LOAD_FAST 1 (x)
40 LOAD_METHOD 4 (permute)
42 LOAD_CONST 2 (0)
44 LOAD_CONST 3 (2)
46 LOAD_CONST 4 (1)
48 LOAD_CONST 5 (3)
50 CALL_METHOD 4
52 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST x []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR size [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 0 [GetAttrVariable(TensorVariable(), size)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [SizeVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST -1 [SizeVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_SLICE 2 [SizeVariable(), ConstantVariable(NoneType), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_SUBSCR None [SizeVariable(), SliceVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [SizeVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR num_attention_heads [SizeVariable(), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [SizeVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR attention_head_size [SizeVariable(), ConstantVariable(int), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_TUPLE 2 [SizeVariable(), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_ADD None [SizeVariable(), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST new_x_shape [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST x []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR view [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST new_x_shape [GetAttrVariable(TensorVariable(), view)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [GetAttrVariable(TensorVariable(), view), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST x [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST x []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR permute [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 0 [GetAttrVariable(TensorVariable(), permute)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 2 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 1 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 3 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 4 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object transpose_for_scores at 0x7f6a10140710, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 274>
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST key_layer [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR transpose_for_scores [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable())]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR value [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable()), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable()), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable()), NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable()), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object transpose_for_scores at 0x7f6a10140710, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 274>
275 0 LOAD_FAST 1 (x)
2 LOAD_METHOD 0 (size)
4 CALL_METHOD 0
6 LOAD_CONST 0 (None)
8 LOAD_CONST 1 (-1)
10 BUILD_SLICE 2
12 BINARY_SUBSCR
14 LOAD_FAST 0 (self)
16 LOAD_ATTR 1 (num_attention_heads)
18 LOAD_FAST 0 (self)
20 LOAD_ATTR 2 (attention_head_size)
22 BUILD_TUPLE 2
24 BINARY_ADD
26 STORE_FAST 2 (new_x_shape)
276 28 LOAD_FAST 1 (x)
30 LOAD_METHOD 3 (view)
32 LOAD_FAST 2 (new_x_shape)
34 CALL_METHOD 1
36 STORE_FAST 1 (x)
277 38 LOAD_FAST 1 (x)
40 LOAD_METHOD 4 (permute)
42 LOAD_CONST 2 (0)
44 LOAD_CONST 3 (2)
46 LOAD_CONST 4 (1)
48 LOAD_CONST 5 (3)
50 CALL_METHOD 4
52 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST x []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR size [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 0 [GetAttrVariable(TensorVariable(), size)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [SizeVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST -1 [SizeVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_SLICE 2 [SizeVariable(), ConstantVariable(NoneType), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_SUBSCR None [SizeVariable(), SliceVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [SizeVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR num_attention_heads [SizeVariable(), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [SizeVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR attention_head_size [SizeVariable(), ConstantVariable(int), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_TUPLE 2 [SizeVariable(), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_ADD None [SizeVariable(), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST new_x_shape [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST x []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR view [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST new_x_shape [GetAttrVariable(TensorVariable(), view)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [GetAttrVariable(TensorVariable(), view), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST x [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST x []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR permute [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 0 [GetAttrVariable(TensorVariable(), permute)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 2 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 1 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 3 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 4 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object transpose_for_scores at 0x7f6a10140710, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 274>
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST value_layer [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR transpose_for_scores [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST mixed_query_layer [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable())]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable()), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object transpose_for_scores at 0x7f6a10140710, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 274>
275 0 LOAD_FAST 1 (x)
2 LOAD_METHOD 0 (size)
4 CALL_METHOD 0
6 LOAD_CONST 0 (None)
8 LOAD_CONST 1 (-1)
10 BUILD_SLICE 2
12 BINARY_SUBSCR
14 LOAD_FAST 0 (self)
16 LOAD_ATTR 1 (num_attention_heads)
18 LOAD_FAST 0 (self)
20 LOAD_ATTR 2 (attention_head_size)
22 BUILD_TUPLE 2
24 BINARY_ADD
26 STORE_FAST 2 (new_x_shape)
276 28 LOAD_FAST 1 (x)
30 LOAD_METHOD 3 (view)
32 LOAD_FAST 2 (new_x_shape)
34 CALL_METHOD 1
36 STORE_FAST 1 (x)
277 38 LOAD_FAST 1 (x)
40 LOAD_METHOD 4 (permute)
42 LOAD_CONST 2 (0)
44 LOAD_CONST 3 (2)
46 LOAD_CONST 4 (1)
48 LOAD_CONST 5 (3)
50 CALL_METHOD 4
52 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST x []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR size [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 0 [GetAttrVariable(TensorVariable(), size)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [SizeVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST -1 [SizeVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_SLICE 2 [SizeVariable(), ConstantVariable(NoneType), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_SUBSCR None [SizeVariable(), SliceVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [SizeVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR num_attention_heads [SizeVariable(), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [SizeVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR attention_head_size [SizeVariable(), ConstantVariable(int), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_TUPLE 2 [SizeVariable(), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_ADD None [SizeVariable(), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST new_x_shape [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST x []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR view [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST new_x_shape [GetAttrVariable(TensorVariable(), view)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [GetAttrVariable(TensorVariable(), view), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST x [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST x []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR permute [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 0 [GetAttrVariable(TensorVariable(), permute)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 2 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 1 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 3 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 4 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object transpose_for_scores at 0x7f6a10140710, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 274>
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST query_layer [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR is_decoder [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 236 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_GLOBAL torch []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR matmul [TorchVariable(<module 'torch' from '/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/torch/__init__.py'>)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST query_layer [TorchVariable(<built-in method matmul of type object at 0x7f6bb096cb40>)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST key_layer [TorchVariable(<built-in method matmul of type object at 0x7f6bb096cb40>), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR transpose [TorchVariable(<built-in method matmul of type object at 0x7f6bb096cb40>), TensorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST -1 [TorchVariable(<built-in method matmul of type object at 0x7f6bb096cb40>), TensorVariable(), GetAttrVariable(TensorVariable(), transpose)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST -2 [TorchVariable(<built-in method matmul of type object at 0x7f6bb096cb40>), TensorVariable(), GetAttrVariable(TensorVariable(), transpose), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 2 [TorchVariable(<built-in method matmul of type object at 0x7f6bb096cb40>), TensorVariable(), GetAttrVariable(TensorVariable(), transpose), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 2 [TorchVariable(<built-in method matmul of type object at 0x7f6bb096cb40>), TensorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST attention_scores [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR position_embedding_type [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST relative_key [ConstantVariable(str)]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP == [ConstantVariable(str), ConstantVariable(str)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_TRUE 280 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR position_embedding_type [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST relative_key_query [ConstantVariable(str)]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP == [ConstantVariable(str), ConstantVariable(str)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 478 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_scores []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_GLOBAL math [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR sqrt [TensorVariable(), TorchVariable(<module 'math' from '/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/lib-dynload/math.cpython-38-x86_64-linux-gnu.so'>)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [TensorVariable(), TorchVariable(<built-in function sqrt>)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR attention_head_size [TensorVariable(), TorchVariable(<built-in function sqrt>), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [TensorVariable(), TorchVariable(<built-in function sqrt>), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_TRUE_DIVIDE None [TensorVariable(), ConstantVariable(float)]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST attention_scores [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_mask []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP is not [TensorVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 512 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_scores []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_mask [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_ADD None [TensorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST attention_scores [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_GLOBAL nn []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR functional [TorchVariable(<module 'torch.nn' from '/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/torch/nn/__init__.py'>)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR softmax [TorchVariable(<module 'torch.nn.functional' from '/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/torch/nn/functional.py'>)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_scores [TorchVariable(<function softmax at 0x7f6a1d0fb5e0>)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST -1 [TorchVariable(<function softmax at 0x7f6a1d0fb5e0>), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST ('dim',) [TorchVariable(<function softmax at 0x7f6a1d0fb5e0>), TensorVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION_KW 2 [TorchVariable(<function softmax at 0x7f6a1d0fb5e0>), TensorVariable(), ConstantVariable(int), ConstantVariable(tuple)]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST attention_probs [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR dropout [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_probs [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST attention_probs [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST head_mask []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP is not [ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 556 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_GLOBAL torch []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR matmul [TorchVariable(<module 'torch' from '/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/torch/__init__.py'>)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_probs [TorchVariable(<built-in method matmul of type object at 0x7f6bb096cb40>)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST value_layer [TorchVariable(<built-in method matmul of type object at 0x7f6bb096cb40>), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 2 [TorchVariable(<built-in method matmul of type object at 0x7f6bb096cb40>), TensorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST context_layer [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST context_layer []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR permute [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 0 [GetAttrVariable(TensorVariable(), permute)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 2 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 1 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 3 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 4 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR contiguous [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 0 [GetAttrVariable(TensorVariable(), contiguous)]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST context_layer [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST context_layer []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR size [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 0 [GetAttrVariable(TensorVariable(), size)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [SizeVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST -2 [SizeVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_SLICE 2 [SizeVariable(), ConstantVariable(NoneType), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_SUBSCR None [SizeVariable(), SliceVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [SizeVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR all_head_size [SizeVariable(), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_TUPLE 1 [SizeVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_ADD None [SizeVariable(), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST new_context_layer_shape [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST context_layer []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR view [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST new_context_layer_shape [GetAttrVariable(TensorVariable(), view)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [GetAttrVariable(TensorVariable(), view), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST context_layer [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST output_attentions []
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 636 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST context_layer []
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_TUPLE 1 [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST outputs [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR is_decoder [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 660 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST outputs []
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object forward at 0x7f6a101407c0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 279>
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST self_outputs [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR output [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self_outputs [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 0 [NNModuleVariable(), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_SUBSCR None [NNModuleVariable(), TupleVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 2 [NNModuleVariable(), TensorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object forward at 0x7f6a101409d0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 381>
382 0 LOAD_FAST 0 (self)
2 LOAD_METHOD 0 (dense)
4 LOAD_FAST 1 (hidden_states)
6 CALL_METHOD 1
8 STORE_FAST 1 (hidden_states)
383 10 LOAD_FAST 0 (self)
12 LOAD_METHOD 1 (dropout)
14 LOAD_FAST 1 (hidden_states)
16 CALL_METHOD 1
18 STORE_FAST 1 (hidden_states)
384 20 LOAD_FAST 0 (self)
22 LOAD_METHOD 2 (LayerNorm)
24 LOAD_FAST 1 (hidden_states)
26 LOAD_FAST 2 (input_tensor)
28 BINARY_ADD
30 CALL_METHOD 1
32 STORE_FAST 1 (hidden_states)
385 34 LOAD_FAST 1 (hidden_states)
36 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR dense [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST hidden_states [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR dropout [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST hidden_states [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR LayerNorm [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST input_tensor [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_ADD None [NNModuleVariable(), TensorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST hidden_states [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states []
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object forward at 0x7f6a101409d0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 381>
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST attention_output [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_output []
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_TUPLE 1 [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self_outputs [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 1 [TupleVariable(), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [TupleVariable(), TupleVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_SLICE 2 [TupleVariable(), TupleVariable(), ConstantVariable(int), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_SUBSCR None [TupleVariable(), TupleVariable(), SliceVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_ADD None [TupleVariable(), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST outputs [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST outputs []
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object forward at 0x7f6a10140c90, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 413>
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST self_attention_outputs [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self_attention_outputs []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 0 [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_SUBSCR None [TupleVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST attention_output [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR is_decoder [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 80 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self_attention_outputs []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 1 [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [TupleVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_SLICE 2 [TupleVariable(), ConstantVariable(int), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_SUBSCR None [TupleVariable(), SliceVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST outputs [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None []
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST cross_attn_present_key_value [ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR is_decoder [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 222 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_GLOBAL apply_chunking_to_forward []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [UserFunctionVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR feed_forward_chunk [UserFunctionVariable(), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [UserFunctionVariable(), UserMethodVariable(<function BertLayer.feed_forward_chunk at 0x7f69ff5474c0>, NNModuleVariable())]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR chunk_size_feed_forward [UserFunctionVariable(), UserMethodVariable(<function BertLayer.feed_forward_chunk at 0x7f69ff5474c0>, NNModuleVariable()), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [UserFunctionVariable(), UserMethodVariable(<function BertLayer.feed_forward_chunk at 0x7f69ff5474c0>, NNModuleVariable()), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR seq_len_dim [UserFunctionVariable(), UserMethodVariable(<function BertLayer.feed_forward_chunk at 0x7f69ff5474c0>, NNModuleVariable()), ConstantVariable(int), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_output [UserFunctionVariable(), UserMethodVariable(<function BertLayer.feed_forward_chunk at 0x7f69ff5474c0>, NNModuleVariable()), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 4 [UserFunctionVariable(), UserMethodVariable(<function BertLayer.feed_forward_chunk at 0x7f69ff5474c0>, NNModuleVariable()), ConstantVariable(int), ConstantVariable(int), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object apply_chunking_to_forward at 0x7f6a10006030, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/pytorch_utils.py", line 169>
207 0 LOAD_GLOBAL 0 (len)
2 LOAD_FAST 3 (input_tensors)
4 CALL_FUNCTION 1
6 LOAD_CONST 1 (0)
8 COMPARE_OP 4 (>)
10 POP_JUMP_IF_TRUE 26
12 LOAD_GLOBAL 1 (AssertionError)
14 LOAD_FAST 3 (input_tensors)
16 FORMAT_VALUE 0
18 LOAD_CONST 2 (' has to be a tuple/list of tensors')
20 BUILD_STRING 2
22 CALL_FUNCTION 1
24 RAISE_VARARGS 1
210 >> 26 LOAD_GLOBAL 0 (len)
28 LOAD_GLOBAL 2 (inspect)
30 LOAD_METHOD 3 (signature)
32 LOAD_DEREF 1 (forward_fn)
34 CALL_METHOD 1
36 LOAD_ATTR 4 (parameters)
38 CALL_FUNCTION 1
40 STORE_FAST 4 (num_args_in_forward_chunk_fn)
211 42 LOAD_FAST 4 (num_args_in_forward_chunk_fn)
44 LOAD_GLOBAL 0 (len)
46 LOAD_FAST 3 (input_tensors)
48 CALL_FUNCTION 1
50 COMPARE_OP 3 (!=)
52 POP_JUMP_IF_FALSE 80
212 54 LOAD_GLOBAL 5 (ValueError)
213 56 LOAD_CONST 3 ('forward_chunk_fn expects ')
58 LOAD_FAST 4 (num_args_in_forward_chunk_fn)
60 FORMAT_VALUE 0
62 LOAD_CONST 4 (' arguments, but only ')
64 LOAD_GLOBAL 0 (len)
66 LOAD_FAST 3 (input_tensors)
68 CALL_FUNCTION 1
70 FORMAT_VALUE 0
72 LOAD_CONST 5 (' input tensors are given')
74 BUILD_STRING 5
212 76 CALL_FUNCTION 1
78 RAISE_VARARGS 1
217 >> 80 LOAD_FAST 1 (chunk_size)
82 LOAD_CONST 1 (0)
84 COMPARE_OP 4 (>)
86 EXTENDED_ARG 1
88 POP_JUMP_IF_FALSE 288
218 90 LOAD_FAST 3 (input_tensors)
92 LOAD_CONST 1 (0)
94 BINARY_SUBSCR
96 LOAD_ATTR 6 (shape)
98 LOAD_DEREF 0 (chunk_dim)
100 BINARY_SUBSCR
102 STORE_FAST 5 (tensor_shape)
219 104 LOAD_FAST 3 (input_tensors)
106 GET_ITER
>> 108 FOR_ITER 44 (to 154)
110 STORE_FAST 6 (input_tensor)
220 112 LOAD_FAST 6 (input_tensor)
114 LOAD_ATTR 6 (shape)
116 LOAD_DEREF 0 (chunk_dim)
118 BINARY_SUBSCR
120 LOAD_FAST 5 (tensor_shape)
122 COMPARE_OP 3 (!=)
124 POP_JUMP_IF_FALSE 108
221 126 LOAD_GLOBAL 5 (ValueError)
222 128 LOAD_CONST 6 ('All input tenors have to be of the same shape: ')
130 LOAD_FAST 5 (tensor_shape)
132 FORMAT_VALUE 0
134 LOAD_CONST 7 (', found shape ')
136 LOAD_FAST 6 (input_tensor)
138 LOAD_ATTR 6 (shape)
140 LOAD_DEREF 0 (chunk_dim)
142 BINARY_SUBSCR
144 FORMAT_VALUE 0
146 BUILD_STRING 4
221 148 CALL_FUNCTION 1
150 RAISE_VARARGS 1
152 JUMP_ABSOLUTE 108
226 >> 154 LOAD_FAST 3 (input_tensors)
156 LOAD_CONST 1 (0)
158 BINARY_SUBSCR
160 LOAD_ATTR 6 (shape)
162 LOAD_DEREF 0 (chunk_dim)
164 BINARY_SUBSCR
166 LOAD_FAST 1 (chunk_size)
168 BINARY_MODULO
170 LOAD_CONST 1 (0)
172 COMPARE_OP 3 (!=)
174 POP_JUMP_IF_FALSE 206
227 176 LOAD_GLOBAL 5 (ValueError)
228 178 LOAD_CONST 8 ('The dimension to be chunked ')
180 LOAD_FAST 3 (input_tensors)
182 LOAD_CONST 1 (0)
184 BINARY_SUBSCR
186 LOAD_ATTR 6 (shape)
188 LOAD_DEREF 0 (chunk_dim)
190 BINARY_SUBSCR
192 FORMAT_VALUE 0
194 LOAD_CONST 9 (' has to be a multiple of the chunk size ')
196 LOAD_FAST 1 (chunk_size)
198 FORMAT_VALUE 0
200 BUILD_STRING 4
227 202 CALL_FUNCTION 1
204 RAISE_VARARGS 1
232 >> 206 LOAD_FAST 3 (input_tensors)
208 LOAD_CONST 1 (0)
210 BINARY_SUBSCR
212 LOAD_ATTR 6 (shape)
214 LOAD_DEREF 0 (chunk_dim)
216 BINARY_SUBSCR
218 LOAD_FAST 1 (chunk_size)
220 BINARY_FLOOR_DIVIDE
222 STORE_DEREF 2 (num_chunks)
235 224 LOAD_GLOBAL 7 (tuple)
226 LOAD_CLOSURE 0 (chunk_dim)
228 LOAD_CLOSURE 2 (num_chunks)
230 BUILD_TUPLE 2
232 LOAD_CONST 10 (<code object <genexpr> at 0x7f6a10083ea0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/pytorch_utils.py", line 235>)
234 LOAD_CONST 11 ('apply_chunking_to_forward.<locals>.<genexpr>')
236 MAKE_FUNCTION 8 (closure)
238 LOAD_FAST 3 (input_tensors)
240 GET_ITER
242 CALL_FUNCTION 1
244 CALL_FUNCTION 1
246 STORE_FAST 7 (input_tensors_chunks)
237 248 LOAD_GLOBAL 7 (tuple)
250 LOAD_CLOSURE 1 (forward_fn)
252 BUILD_TUPLE 1
254 LOAD_CONST 12 (<code object <genexpr> at 0x7f6a10083f50, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/pytorch_utils.py", line 237>)
256 LOAD_CONST 11 ('apply_chunking_to_forward.<locals>.<genexpr>')
258 MAKE_FUNCTION 8 (closure)
260 LOAD_GLOBAL 8 (zip)
262 LOAD_FAST 7 (input_tensors_chunks)
264 CALL_FUNCTION_EX 0
266 GET_ITER
268 CALL_FUNCTION 1
270 CALL_FUNCTION 1
272 STORE_FAST 8 (output_chunks)
239 274 LOAD_GLOBAL 9 (torch)
276 LOAD_ATTR 10 (cat)
278 LOAD_FAST 8 (output_chunks)
280 LOAD_DEREF 0 (chunk_dim)
282 LOAD_CONST 13 (('dim',))
284 CALL_FUNCTION_KW 2
286 RETURN_VALUE
241 >> 288 LOAD_DEREF 1 (forward_fn)
290 LOAD_FAST 3 (input_tensors)
292 CALL_FUNCTION_EX 0
294 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_GLOBAL len []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST input_tensors [BuiltinVariable(len)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [BuiltinVariable(len), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 0 [ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP > [ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_TRUE 26 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_GLOBAL len []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_GLOBAL inspect [BuiltinVariable(len)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR signature [BuiltinVariable(len), PythonModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_DEREF forward_fn [BuiltinVariable(len), LambdaVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [BuiltinVariable(len), LambdaVariable(), UserMethodVariable(<function BertLayer.feed_forward_chunk at 0x7f69ff5474c0>, NNModuleVariable())]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR parameters [BuiltinVariable(len), InspectSignatureVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [BuiltinVariable(len), GetAttrVariable(InspectSignatureVariable(), parameters)]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST num_args_in_forward_chunk_fn [ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST num_args_in_forward_chunk_fn []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_GLOBAL len [ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST input_tensors [ConstantVariable(int), BuiltinVariable(len)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [ConstantVariable(int), BuiltinVariable(len), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP != [ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 80 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST chunk_size []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 0 [ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP > [ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 288 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_DEREF forward_fn []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST input_tensors [UserMethodVariable(<function BertLayer.feed_forward_chunk at 0x7f69ff5474c0>, NNModuleVariable())]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION_EX 0 [UserMethodVariable(<function BertLayer.feed_forward_chunk at 0x7f69ff5474c0>, NNModuleVariable()), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object feed_forward_chunk at 0x7f6a100c83a0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 546>
547 0 LOAD_FAST 0 (self)
2 LOAD_METHOD 0 (intermediate)
4 LOAD_FAST 1 (attention_output)
6 CALL_METHOD 1
8 STORE_FAST 2 (intermediate_output)
548 10 LOAD_FAST 0 (self)
12 LOAD_METHOD 1 (output)
14 LOAD_FAST 2 (intermediate_output)
16 LOAD_FAST 1 (attention_output)
18 CALL_METHOD 2
20 STORE_FAST 3 (layer_output)
549 22 LOAD_FAST 3 (layer_output)
24 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR intermediate [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_output [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object forward at 0x7f6a10140ea0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 446>
447 0 LOAD_FAST 0 (self)
2 LOAD_METHOD 0 (dense)
4 LOAD_FAST 1 (hidden_states)
6 CALL_METHOD 1
8 STORE_FAST 1 (hidden_states)
448 10 LOAD_FAST 0 (self)
12 LOAD_METHOD 1 (intermediate_act_fn)
14 LOAD_FAST 1 (hidden_states)
16 CALL_METHOD 1
18 STORE_FAST 1 (hidden_states)
449 20 LOAD_FAST 1 (hidden_states)
22 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR dense [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST hidden_states [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR intermediate_act_fn [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object forward at 0x7f6a100e12f0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/activations.py", line 55>
56 0 LOAD_FAST 0 (self)
2 LOAD_METHOD 0 (act)
4 LOAD_FAST 1 (input)
6 CALL_METHOD 1
8 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR act [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST input [TorchVariable(<built-in function gelu>)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [TorchVariable(<built-in function gelu>), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object forward at 0x7f6a100e12f0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/activations.py", line 55>
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST hidden_states [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states []
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object forward at 0x7f6a10140ea0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 446>
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST intermediate_output [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR output [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST intermediate_output [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_output [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 2 [NNModuleVariable(), TensorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object forward at 0x7f6a100c80e0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 459>
460 0 LOAD_FAST 0 (self)
2 LOAD_METHOD 0 (dense)
4 LOAD_FAST 1 (hidden_states)
6 CALL_METHOD 1
8 STORE_FAST 1 (hidden_states)
461 10 LOAD_FAST 0 (self)
12 LOAD_METHOD 1 (dropout)
14 LOAD_FAST 1 (hidden_states)
16 CALL_METHOD 1
18 STORE_FAST 1 (hidden_states)
462 20 LOAD_FAST 0 (self)
22 LOAD_METHOD 2 (LayerNorm)
24 LOAD_FAST 1 (hidden_states)
26 LOAD_FAST 2 (input_tensor)
28 BINARY_ADD
30 CALL_METHOD 1
32 STORE_FAST 1 (hidden_states)
463 34 LOAD_FAST 1 (hidden_states)
36 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR dense [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST hidden_states [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR dropout [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST hidden_states [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR LayerNorm [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST input_tensor [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_ADD None [NNModuleVariable(), TensorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST hidden_states [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states []
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object forward at 0x7f6a100c80e0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 459>
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST layer_output [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST layer_output []
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object feed_forward_chunk at 0x7f6a100c83a0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 546>
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object apply_chunking_to_forward at 0x7f6a10006030, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/pytorch_utils.py", line 169>
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST layer_output [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST layer_output []
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_TUPLE 1 [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST outputs [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_ADD None [TupleVariable(), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST outputs [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR is_decoder [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 270 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST outputs []
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object forward at 0x7f6a100c82f0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 481>
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST layer_outputs [ListIteratorVariable(), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST layer_outputs [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 0 [ListIteratorVariable(), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_SUBSCR None [ListIteratorVariable(), TupleVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST hidden_states [ListIteratorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST use_cache [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 248 [ListIteratorVariable(), ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_DEREF output_attentions [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 66 [ListIteratorVariable(), ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE FOR_ITER 290 [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE UNPACK_SEQUENCE 2 [ListIteratorVariable(), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST i [ListIteratorVariable(), NNModuleVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST layer_module [ListIteratorVariable(), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST output_hidden_states [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 88 [ListIteratorVariable(), ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST head_mask [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [ListIteratorVariable(), ListVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP is not [ListIteratorVariable(), ListVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 104 [ListIteratorVariable(), ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST head_mask [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST i [ListIteratorVariable(), ListVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_SUBSCR None [ListIteratorVariable(), ListVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE JUMP_FORWARD 106 [ListIteratorVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST layer_head_mask [ListIteratorVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST past_key_values [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [ListIteratorVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP is not [ListIteratorVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 124 [ListIteratorVariable(), ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_DEREF past_key_value [ListIteratorVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR gradient_checkpointing [ListIteratorVariable(), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 202 [ListIteratorVariable(), ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST layer_module [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [ListIteratorVariable(), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_mask [ListIteratorVariable(), NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST layer_head_mask [ListIteratorVariable(), NNModuleVariable(), TensorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST encoder_hidden_states [ListIteratorVariable(), NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST encoder_attention_mask [ListIteratorVariable(), NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_DEREF past_key_value [ListIteratorVariable(), NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_DEREF output_attentions [ListIteratorVariable(), NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 7 [ListIteratorVariable(), NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object forward at 0x7f6a100c82f0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 481>
492 0 LOAD_FAST 6 (past_key_value)
2 LOAD_CONST 0 (None)
4 COMPARE_OP 9 (is not)
6 POP_JUMP_IF_FALSE 20
8 LOAD_FAST 6 (past_key_value)
10 LOAD_CONST 0 (None)
12 LOAD_CONST 1 (2)
14 BUILD_SLICE 2
16 BINARY_SUBSCR
18 JUMP_FORWARD 2 (to 22)
>> 20 LOAD_CONST 0 (None)
>> 22 STORE_FAST 8 (self_attn_past_key_value)
493 24 LOAD_FAST 0 (self)
26 LOAD_ATTR 0 (attention)
494 28 LOAD_FAST 1 (hidden_states)
495 30 LOAD_FAST 2 (attention_mask)
496 32 LOAD_FAST 3 (head_mask)
497 34 LOAD_FAST 7 (output_attentions)
498 36 LOAD_FAST 8 (self_attn_past_key_value)
493 38 LOAD_CONST 2 (('output_attentions', 'past_key_value'))
40 CALL_FUNCTION_KW 5
42 STORE_FAST 9 (self_attention_outputs)
500 44 LOAD_FAST 9 (self_attention_outputs)
46 LOAD_CONST 3 (0)
48 BINARY_SUBSCR
50 STORE_FAST 10 (attention_output)
503 52 LOAD_FAST 0 (self)
54 LOAD_ATTR 1 (is_decoder)
56 POP_JUMP_IF_FALSE 80
504 58 LOAD_FAST 9 (self_attention_outputs)
60 LOAD_CONST 4 (1)
62 LOAD_CONST 5 (-1)
64 BUILD_SLICE 2
66 BINARY_SUBSCR
68 STORE_FAST 11 (outputs)
505 70 LOAD_FAST 9 (self_attention_outputs)
72 LOAD_CONST 5 (-1)
74 BINARY_SUBSCR
76 STORE_FAST 12 (present_key_value)
78 JUMP_FORWARD 12 (to 92)
507 >> 80 LOAD_FAST 9 (self_attention_outputs)
82 LOAD_CONST 4 (1)
84 LOAD_CONST 0 (None)
86 BUILD_SLICE 2
88 BINARY_SUBSCR
90 STORE_FAST 11 (outputs)
509 >> 92 LOAD_CONST 0 (None)
94 STORE_FAST 13 (cross_attn_present_key_value)
510 96 LOAD_FAST 0 (self)
98 LOAD_ATTR 1 (is_decoder)
100 POP_JUMP_IF_FALSE 222
102 LOAD_FAST 4 (encoder_hidden_states)
104 LOAD_CONST 0 (None)
106 COMPARE_OP 9 (is not)
108 POP_JUMP_IF_FALSE 222
511 110 LOAD_GLOBAL 2 (hasattr)
112 LOAD_FAST 0 (self)
114 LOAD_CONST 6 ('crossattention')
116 CALL_FUNCTION 2
118 POP_JUMP_IF_TRUE 136
512 120 LOAD_GLOBAL 3 (ValueError)
513 122 LOAD_CONST 7 ('If `encoder_hidden_states` are passed, ')
124 LOAD_FAST 0 (self)
126 FORMAT_VALUE 0
128 LOAD_CONST 8 (' has to be instantiated with cross-attention layers by setting `config.add_cross_attention=True`')
130 BUILD_STRING 3
512 132 CALL_FUNCTION 1
134 RAISE_VARARGS 1
518 >> 136 LOAD_FAST 6 (past_key_value)
138 LOAD_CONST 0 (None)
140 COMPARE_OP 9 (is not)
142 POP_JUMP_IF_FALSE 156
144 LOAD_FAST 6 (past_key_value)
146 LOAD_CONST 9 (-2)
148 LOAD_CONST 0 (None)
150 BUILD_SLICE 2
152 BINARY_SUBSCR
154 JUMP_FORWARD 2 (to 158)
>> 156 LOAD_CONST 0 (None)
>> 158 STORE_FAST 14 (cross_attn_past_key_value)
519 160 LOAD_FAST 0 (self)
162 LOAD_METHOD 4 (crossattention)
520 164 LOAD_FAST 10 (attention_output)
521 166 LOAD_FAST 2 (attention_mask)
522 168 LOAD_FAST 3 (head_mask)
523 170 LOAD_FAST 4 (encoder_hidden_states)
524 172 LOAD_FAST 5 (encoder_attention_mask)
525 174 LOAD_FAST 14 (cross_attn_past_key_value)
526 176 LOAD_FAST 7 (output_attentions)
519 178 CALL_METHOD 7
180 STORE_FAST 15 (cross_attention_outputs)
528 182 LOAD_FAST 15 (cross_attention_outputs)
184 LOAD_CONST 3 (0)
186 BINARY_SUBSCR
188 STORE_FAST 10 (attention_output)
529 190 LOAD_FAST 11 (outputs)
192 LOAD_FAST 15 (cross_attention_outputs)
194 LOAD_CONST 4 (1)
196 LOAD_CONST 5 (-1)
198 BUILD_SLICE 2
200 BINARY_SUBSCR
202 BINARY_ADD
204 STORE_FAST 11 (outputs)
532 206 LOAD_FAST 15 (cross_attention_outputs)
208 LOAD_CONST 5 (-1)
210 BINARY_SUBSCR
212 STORE_FAST 13 (cross_attn_present_key_value)
533 214 LOAD_FAST 12 (present_key_value)
216 LOAD_FAST 13 (cross_attn_present_key_value)
218 BINARY_ADD
220 STORE_FAST 12 (present_key_value)
535 >> 222 LOAD_GLOBAL 5 (apply_chunking_to_forward)
536 224 LOAD_FAST 0 (self)
226 LOAD_ATTR 6 (feed_forward_chunk)
228 LOAD_FAST 0 (self)
230 LOAD_ATTR 7 (chunk_size_feed_forward)
232 LOAD_FAST 0 (self)
234 LOAD_ATTR 8 (seq_len_dim)
236 LOAD_FAST 10 (attention_output)
535 238 CALL_FUNCTION 4
240 STORE_FAST 16 (layer_output)
538 242 LOAD_FAST 16 (layer_output)
244 BUILD_TUPLE 1
246 LOAD_FAST 11 (outputs)
248 BINARY_ADD
250 STORE_FAST 11 (outputs)
541 252 LOAD_FAST 0 (self)
254 LOAD_ATTR 1 (is_decoder)
256 EXTENDED_ARG 1
258 POP_JUMP_IF_FALSE 270
542 260 LOAD_FAST 11 (outputs)
262 LOAD_FAST 12 (present_key_value)
264 BUILD_TUPLE 1
266 BINARY_ADD
268 STORE_FAST 11 (outputs)
544 >> 270 LOAD_FAST 11 (outputs)
272 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST past_key_value []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP is not [ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 20 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None []
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST self_attn_past_key_value [ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR attention [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_mask [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST head_mask [NNModuleVariable(), TensorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST output_attentions [NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self_attn_past_key_value [NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST ('output_attentions', 'past_key_value') [NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(bool), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION_KW 5 [NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(bool), ConstantVariable(NoneType), ConstantVariable(tuple)]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object forward at 0x7f6a10140c90, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 413>
423 0 LOAD_FAST 0 (self)
2 LOAD_METHOD 0 (self)
424 4 LOAD_FAST 1 (hidden_states)
425 6 LOAD_FAST 2 (attention_mask)
426 8 LOAD_FAST 3 (head_mask)
427 10 LOAD_FAST 4 (encoder_hidden_states)
428 12 LOAD_FAST 5 (encoder_attention_mask)
429 14 LOAD_FAST 6 (past_key_value)
430 16 LOAD_FAST 7 (output_attentions)
423 18 CALL_METHOD 7
20 STORE_FAST 8 (self_outputs)
432 22 LOAD_FAST 0 (self)
24 LOAD_METHOD 1 (output)
26 LOAD_FAST 8 (self_outputs)
28 LOAD_CONST 1 (0)
30 BINARY_SUBSCR
32 LOAD_FAST 1 (hidden_states)
34 CALL_METHOD 2
36 STORE_FAST 9 (attention_output)
433 38 LOAD_FAST 9 (attention_output)
40 BUILD_TUPLE 1
42 LOAD_FAST 8 (self_outputs)
44 LOAD_CONST 2 (1)
46 LOAD_CONST 0 (None)
48 BUILD_SLICE 2
50 BINARY_SUBSCR
52 BINARY_ADD
54 STORE_FAST 10 (outputs)
434 56 LOAD_FAST 10 (outputs)
58 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR self [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_mask [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST head_mask [NNModuleVariable(), TensorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST encoder_hidden_states [NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST encoder_attention_mask [NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST past_key_value [NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST output_attentions [NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 7 [NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object forward at 0x7f6a101407c0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 279>
289 0 LOAD_FAST 0 (self)
2 LOAD_METHOD 0 (query)
4 LOAD_FAST 1 (hidden_states)
6 CALL_METHOD 1
8 STORE_FAST 8 (mixed_query_layer)
294 10 LOAD_FAST 4 (encoder_hidden_states)
12 LOAD_CONST 0 (None)
14 COMPARE_OP 9 (is not)
16 STORE_FAST 9 (is_cross_attention)
296 18 LOAD_FAST 9 (is_cross_attention)
20 POP_JUMP_IF_FALSE 52
22 LOAD_FAST 6 (past_key_value)
24 LOAD_CONST 0 (None)
26 COMPARE_OP 9 (is not)
28 POP_JUMP_IF_FALSE 52
298 30 LOAD_FAST 6 (past_key_value)
32 LOAD_CONST 1 (0)
34 BINARY_SUBSCR
36 STORE_FAST 10 (key_layer)
299 38 LOAD_FAST 6 (past_key_value)
40 LOAD_CONST 2 (1)
42 BINARY_SUBSCR
44 STORE_FAST 11 (value_layer)
300 46 LOAD_FAST 5 (encoder_attention_mask)
48 STORE_FAST 2 (attention_mask)
50 JUMP_FORWARD 160 (to 212)
301 >> 52 LOAD_FAST 9 (is_cross_attention)
54 POP_JUMP_IF_FALSE 94
302 56 LOAD_FAST 0 (self)
58 LOAD_METHOD 1 (transpose_for_scores)
60 LOAD_FAST 0 (self)
62 LOAD_METHOD 2 (key)
64 LOAD_FAST 4 (encoder_hidden_states)
66 CALL_METHOD 1
68 CALL_METHOD 1
70 STORE_FAST 10 (key_layer)
303 72 LOAD_FAST 0 (self)
74 LOAD_METHOD 1 (transpose_for_scores)
76 LOAD_FAST 0 (self)
78 LOAD_METHOD 3 (value)
80 LOAD_FAST 4 (encoder_hidden_states)
82 CALL_METHOD 1
84 CALL_METHOD 1
86 STORE_FAST 11 (value_layer)
304 88 LOAD_FAST 5 (encoder_attention_mask)
90 STORE_FAST 2 (attention_mask)
92 JUMP_FORWARD 118 (to 212)
305 >> 94 LOAD_FAST 6 (past_key_value)
96 LOAD_CONST 0 (None)
98 COMPARE_OP 9 (is not)
100 POP_JUMP_IF_FALSE 180
306 102 LOAD_FAST 0 (self)
104 LOAD_METHOD 1 (transpose_for_scores)
106 LOAD_FAST 0 (self)
108 LOAD_METHOD 2 (key)
110 LOAD_FAST 1 (hidden_states)
112 CALL_METHOD 1
114 CALL_METHOD 1
116 STORE_FAST 10 (key_layer)
307 118 LOAD_FAST 0 (self)
120 LOAD_METHOD 1 (transpose_for_scores)
122 LOAD_FAST 0 (self)
124 LOAD_METHOD 3 (value)
126 LOAD_FAST 1 (hidden_states)
128 CALL_METHOD 1
130 CALL_METHOD 1
132 STORE_FAST 11 (value_layer)
308 134 LOAD_GLOBAL 4 (torch)
136 LOAD_ATTR 5 (cat)
138 LOAD_FAST 6 (past_key_value)
140 LOAD_CONST 1 (0)
142 BINARY_SUBSCR
144 LOAD_FAST 10 (key_layer)
146 BUILD_LIST 2
148 LOAD_CONST 3 (2)
150 LOAD_CONST 4 (('dim',))
152 CALL_FUNCTION_KW 2
154 STORE_FAST 10 (key_layer)
309 156 LOAD_GLOBAL 4 (torch)
158 LOAD_ATTR 5 (cat)
160 LOAD_FAST 6 (past_key_value)
162 LOAD_CONST 2 (1)
164 BINARY_SUBSCR
166 LOAD_FAST 11 (value_layer)
168 BUILD_LIST 2
170 LOAD_CONST 3 (2)
172 LOAD_CONST 4 (('dim',))
174 CALL_FUNCTION_KW 2
176 STORE_FAST 11 (value_layer)
178 JUMP_FORWARD 32 (to 212)
311 >> 180 LOAD_FAST 0 (self)
182 LOAD_METHOD 1 (transpose_for_scores)
184 LOAD_FAST 0 (self)
186 LOAD_METHOD 2 (key)
188 LOAD_FAST 1 (hidden_states)
190 CALL_METHOD 1
192 CALL_METHOD 1
194 STORE_FAST 10 (key_layer)
312 196 LOAD_FAST 0 (self)
198 LOAD_METHOD 1 (transpose_for_scores)
200 LOAD_FAST 0 (self)
202 LOAD_METHOD 3 (value)
204 LOAD_FAST 1 (hidden_states)
206 CALL_METHOD 1
208 CALL_METHOD 1
210 STORE_FAST 11 (value_layer)
314 >> 212 LOAD_FAST 0 (self)
214 LOAD_METHOD 1 (transpose_for_scores)
216 LOAD_FAST 8 (mixed_query_layer)
218 CALL_METHOD 1
220 STORE_FAST 12 (query_layer)
316 222 LOAD_FAST 0 (self)
224 LOAD_ATTR 6 (is_decoder)
226 POP_JUMP_IF_FALSE 236
324 228 LOAD_FAST 10 (key_layer)
230 LOAD_FAST 11 (value_layer)
232 BUILD_TUPLE 2
234 STORE_FAST 6 (past_key_value)
327 >> 236 LOAD_GLOBAL 4 (torch)
238 LOAD_METHOD 7 (matmul)
240 LOAD_FAST 12 (query_layer)
242 LOAD_FAST 10 (key_layer)
244 LOAD_METHOD 8 (transpose)
246 LOAD_CONST 5 (-1)
248 LOAD_CONST 6 (-2)
250 CALL_METHOD 2
252 CALL_METHOD 2
254 STORE_FAST 13 (attention_scores)
329 256 LOAD_FAST 0 (self)
258 LOAD_ATTR 9 (position_embedding_type)
260 LOAD_CONST 7 ('relative_key')
262 COMPARE_OP 2 (==)
264 EXTENDED_ARG 1
266 POP_JUMP_IF_TRUE 280
268 LOAD_FAST 0 (self)
270 LOAD_ATTR 9 (position_embedding_type)
272 LOAD_CONST 8 ('relative_key_query')
274 COMPARE_OP 2 (==)
276 EXTENDED_ARG 1
278 POP_JUMP_IF_FALSE 478
330 >> 280 LOAD_FAST 1 (hidden_states)
282 LOAD_METHOD 10 (size)
284 CALL_METHOD 0
286 LOAD_CONST 2 (1)
288 BINARY_SUBSCR
290 STORE_FAST 14 (seq_length)
331 292 LOAD_GLOBAL 4 (torch)
294 LOAD_ATTR 11 (arange)
296 LOAD_FAST 14 (seq_length)
298 LOAD_GLOBAL 4 (torch)
300 LOAD_ATTR 12 (long)
302 LOAD_FAST 1 (hidden_states)
304 LOAD_ATTR 13 (device)
306 LOAD_CONST 9 (('dtype', 'device'))
308 CALL_FUNCTION_KW 3
310 LOAD_METHOD 14 (view)
312 LOAD_CONST 5 (-1)
314 LOAD_CONST 2 (1)
316 CALL_METHOD 2
318 STORE_FAST 15 (position_ids_l)
332 320 LOAD_GLOBAL 4 (torch)
322 LOAD_ATTR 11 (arange)
324 LOAD_FAST 14 (seq_length)
326 LOAD_GLOBAL 4 (torch)
328 LOAD_ATTR 12 (long)
330 LOAD_FAST 1 (hidden_states)
332 LOAD_ATTR 13 (device)
334 LOAD_CONST 9 (('dtype', 'device'))
336 CALL_FUNCTION_KW 3
338 LOAD_METHOD 14 (view)
340 LOAD_CONST 2 (1)
342 LOAD_CONST 5 (-1)
344 CALL_METHOD 2
346 STORE_FAST 16 (position_ids_r)
333 348 LOAD_FAST 15 (position_ids_l)
350 LOAD_FAST 16 (position_ids_r)
352 BINARY_SUBTRACT
354 STORE_FAST 17 (distance)
334 356 LOAD_FAST 0 (self)
358 LOAD_METHOD 15 (distance_embedding)
360 LOAD_FAST 17 (distance)
362 LOAD_FAST 0 (self)
364 LOAD_ATTR 16 (max_position_embeddings)
366 BINARY_ADD
368 LOAD_CONST 2 (1)
370 BINARY_SUBTRACT
372 CALL_METHOD 1
374 STORE_FAST 18 (positional_embedding)
335 376 LOAD_FAST 18 (positional_embedding)
378 LOAD_ATTR 17 (to)
380 LOAD_FAST 12 (query_layer)
382 LOAD_ATTR 18 (dtype)
384 LOAD_CONST 10 (('dtype',))
386 CALL_FUNCTION_KW 1
388 STORE_FAST 18 (positional_embedding)
337 390 LOAD_FAST 0 (self)
392 LOAD_ATTR 9 (position_embedding_type)
394 LOAD_CONST 7 ('relative_key')
396 COMPARE_OP 2 (==)
398 EXTENDED_ARG 1
400 POP_JUMP_IF_FALSE 426
338 402 LOAD_GLOBAL 4 (torch)
404 LOAD_METHOD 19 (einsum)
406 LOAD_CONST 11 ('bhld,lrd->bhlr')
408 LOAD_FAST 12 (query_layer)
410 LOAD_FAST 18 (positional_embedding)
412 CALL_METHOD 3
414 STORE_FAST 19 (relative_position_scores)
339 416 LOAD_FAST 13 (attention_scores)
418 LOAD_FAST 19 (relative_position_scores)
420 BINARY_ADD
422 STORE_FAST 13 (attention_scores)
424 JUMP_FORWARD 52 (to 478)
340 >> 426 LOAD_FAST 0 (self)
428 LOAD_ATTR 9 (position_embedding_type)
430 LOAD_CONST 8 ('relative_key_query')
432 COMPARE_OP 2 (==)
434 EXTENDED_ARG 1
436 POP_JUMP_IF_FALSE 478
341 438 LOAD_GLOBAL 4 (torch)
440 LOAD_METHOD 19 (einsum)
442 LOAD_CONST 11 ('bhld,lrd->bhlr')
444 LOAD_FAST 12 (query_layer)
446 LOAD_FAST 18 (positional_embedding)
448 CALL_METHOD 3
450 STORE_FAST 20 (relative_position_scores_query)
342 452 LOAD_GLOBAL 4 (torch)
454 LOAD_METHOD 19 (einsum)
456 LOAD_CONST 12 ('bhrd,lrd->bhlr')
458 LOAD_FAST 10 (key_layer)
460 LOAD_FAST 18 (positional_embedding)
462 CALL_METHOD 3
464 STORE_FAST 21 (relative_position_scores_key)
343 466 LOAD_FAST 13 (attention_scores)
468 LOAD_FAST 20 (relative_position_scores_query)
470 BINARY_ADD
472 LOAD_FAST 21 (relative_position_scores_key)
474 BINARY_ADD
476 STORE_FAST 13 (attention_scores)
345 >> 478 LOAD_FAST 13 (attention_scores)
480 LOAD_GLOBAL 20 (math)
482 LOAD_METHOD 21 (sqrt)
484 LOAD_FAST 0 (self)
486 LOAD_ATTR 22 (attention_head_size)
488 CALL_METHOD 1
490 BINARY_TRUE_DIVIDE
492 STORE_FAST 13 (attention_scores)
346 494 LOAD_FAST 2 (attention_mask)
496 LOAD_CONST 0 (None)
498 COMPARE_OP 9 (is not)
500 EXTENDED_ARG 2
502 POP_JUMP_IF_FALSE 512
348 504 LOAD_FAST 13 (attention_scores)
506 LOAD_FAST 2 (attention_mask)
508 BINARY_ADD
510 STORE_FAST 13 (attention_scores)
351 >> 512 LOAD_GLOBAL 23 (nn)
514 LOAD_ATTR 24 (functional)
516 LOAD_ATTR 25 (softmax)
518 LOAD_FAST 13 (attention_scores)
520 LOAD_CONST 5 (-1)
522 LOAD_CONST 4 (('dim',))
524 CALL_FUNCTION_KW 2
526 STORE_FAST 22 (attention_probs)
355 528 LOAD_FAST 0 (self)
530 LOAD_METHOD 26 (dropout)
532 LOAD_FAST 22 (attention_probs)
534 CALL_METHOD 1
536 STORE_FAST 22 (attention_probs)
358 538 LOAD_FAST 3 (head_mask)
540 LOAD_CONST 0 (None)
542 COMPARE_OP 9 (is not)
544 EXTENDED_ARG 2
546 POP_JUMP_IF_FALSE 556
359 548 LOAD_FAST 22 (attention_probs)
550 LOAD_FAST 3 (head_mask)
552 BINARY_MULTIPLY
554 STORE_FAST 22 (attention_probs)
361 >> 556 LOAD_GLOBAL 4 (torch)
558 LOAD_METHOD 7 (matmul)
560 LOAD_FAST 22 (attention_probs)
562 LOAD_FAST 11 (value_layer)
564 CALL_METHOD 2
566 STORE_FAST 23 (context_layer)
363 568 LOAD_FAST 23 (context_layer)
570 LOAD_METHOD 27 (permute)
572 LOAD_CONST 1 (0)
574 LOAD_CONST 3 (2)
576 LOAD_CONST 2 (1)
578 LOAD_CONST 13 (3)
580 CALL_METHOD 4
582 LOAD_METHOD 28 (contiguous)
584 CALL_METHOD 0
586 STORE_FAST 23 (context_layer)
364 588 LOAD_FAST 23 (context_layer)
590 LOAD_METHOD 10 (size)
592 CALL_METHOD 0
594 LOAD_CONST 0 (None)
596 LOAD_CONST 6 (-2)
598 BUILD_SLICE 2
600 BINARY_SUBSCR
602 LOAD_FAST 0 (self)
604 LOAD_ATTR 29 (all_head_size)
606 BUILD_TUPLE 1
608 BINARY_ADD
610 STORE_FAST 24 (new_context_layer_shape)
365 612 LOAD_FAST 23 (context_layer)
614 LOAD_METHOD 14 (view)
616 LOAD_FAST 24 (new_context_layer_shape)
618 CALL_METHOD 1
620 STORE_FAST 23 (context_layer)
367 622 LOAD_FAST 7 (output_attentions)
624 EXTENDED_ARG 2
626 POP_JUMP_IF_FALSE 636
628 LOAD_FAST 23 (context_layer)
630 LOAD_FAST 22 (attention_probs)
632 BUILD_TUPLE 2
634 JUMP_FORWARD 4 (to 640)
>> 636 LOAD_FAST 23 (context_layer)
638 BUILD_TUPLE 1
>> 640 STORE_FAST 25 (outputs)
369 642 LOAD_FAST 0 (self)
644 LOAD_ATTR 6 (is_decoder)
646 EXTENDED_ARG 2
648 POP_JUMP_IF_FALSE 660
370 650 LOAD_FAST 25 (outputs)
652 LOAD_FAST 6 (past_key_value)
654 BUILD_TUPLE 1
656 BINARY_ADD
658 STORE_FAST 25 (outputs)
371 >> 660 LOAD_FAST 25 (outputs)
662 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR query [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST mixed_query_layer [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST encoder_hidden_states []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP is not [ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST is_cross_attention [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST is_cross_attention []
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 52 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST is_cross_attention []
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 94 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST past_key_value []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP is not [ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 180 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR transpose_for_scores [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable())]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR key [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable()), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable()), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable()), NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable()), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object transpose_for_scores at 0x7f6a10140710, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 274>
275 0 LOAD_FAST 1 (x)
2 LOAD_METHOD 0 (size)
4 CALL_METHOD 0
6 LOAD_CONST 0 (None)
8 LOAD_CONST 1 (-1)
10 BUILD_SLICE 2
12 BINARY_SUBSCR
14 LOAD_FAST 0 (self)
16 LOAD_ATTR 1 (num_attention_heads)
18 LOAD_FAST 0 (self)
20 LOAD_ATTR 2 (attention_head_size)
22 BUILD_TUPLE 2
24 BINARY_ADD
26 STORE_FAST 2 (new_x_shape)
276 28 LOAD_FAST 1 (x)
30 LOAD_METHOD 3 (view)
32 LOAD_FAST 2 (new_x_shape)
34 CALL_METHOD 1
36 STORE_FAST 1 (x)
277 38 LOAD_FAST 1 (x)
40 LOAD_METHOD 4 (permute)
42 LOAD_CONST 2 (0)
44 LOAD_CONST 3 (2)
46 LOAD_CONST 4 (1)
48 LOAD_CONST 5 (3)
50 CALL_METHOD 4
52 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST x []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR size [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 0 [GetAttrVariable(TensorVariable(), size)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [SizeVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST -1 [SizeVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_SLICE 2 [SizeVariable(), ConstantVariable(NoneType), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_SUBSCR None [SizeVariable(), SliceVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [SizeVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR num_attention_heads [SizeVariable(), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [SizeVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR attention_head_size [SizeVariable(), ConstantVariable(int), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_TUPLE 2 [SizeVariable(), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_ADD None [SizeVariable(), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST new_x_shape [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST x []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR view [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST new_x_shape [GetAttrVariable(TensorVariable(), view)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [GetAttrVariable(TensorVariable(), view), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST x [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST x []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR permute [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 0 [GetAttrVariable(TensorVariable(), permute)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 2 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 1 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 3 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 4 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object transpose_for_scores at 0x7f6a10140710, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 274>
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST key_layer [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR transpose_for_scores [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable())]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR value [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable()), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable()), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable()), NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable()), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object transpose_for_scores at 0x7f6a10140710, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 274>
275 0 LOAD_FAST 1 (x)
2 LOAD_METHOD 0 (size)
4 CALL_METHOD 0
6 LOAD_CONST 0 (None)
8 LOAD_CONST 1 (-1)
10 BUILD_SLICE 2
12 BINARY_SUBSCR
14 LOAD_FAST 0 (self)
16 LOAD_ATTR 1 (num_attention_heads)
18 LOAD_FAST 0 (self)
20 LOAD_ATTR 2 (attention_head_size)
22 BUILD_TUPLE 2
24 BINARY_ADD
26 STORE_FAST 2 (new_x_shape)
276 28 LOAD_FAST 1 (x)
30 LOAD_METHOD 3 (view)
32 LOAD_FAST 2 (new_x_shape)
34 CALL_METHOD 1
36 STORE_FAST 1 (x)
277 38 LOAD_FAST 1 (x)
40 LOAD_METHOD 4 (permute)
42 LOAD_CONST 2 (0)
44 LOAD_CONST 3 (2)
46 LOAD_CONST 4 (1)
48 LOAD_CONST 5 (3)
50 CALL_METHOD 4
52 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST x []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR size [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 0 [GetAttrVariable(TensorVariable(), size)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [SizeVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST -1 [SizeVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_SLICE 2 [SizeVariable(), ConstantVariable(NoneType), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_SUBSCR None [SizeVariable(), SliceVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [SizeVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR num_attention_heads [SizeVariable(), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [SizeVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR attention_head_size [SizeVariable(), ConstantVariable(int), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_TUPLE 2 [SizeVariable(), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_ADD None [SizeVariable(), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST new_x_shape [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST x []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR view [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST new_x_shape [GetAttrVariable(TensorVariable(), view)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [GetAttrVariable(TensorVariable(), view), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST x [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST x []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR permute [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 0 [GetAttrVariable(TensorVariable(), permute)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 2 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 1 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 3 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 4 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object transpose_for_scores at 0x7f6a10140710, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 274>
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST value_layer [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR transpose_for_scores [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST mixed_query_layer [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable())]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable()), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object transpose_for_scores at 0x7f6a10140710, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 274>
275 0 LOAD_FAST 1 (x)
2 LOAD_METHOD 0 (size)
4 CALL_METHOD 0
6 LOAD_CONST 0 (None)
8 LOAD_CONST 1 (-1)
10 BUILD_SLICE 2
12 BINARY_SUBSCR
14 LOAD_FAST 0 (self)
16 LOAD_ATTR 1 (num_attention_heads)
18 LOAD_FAST 0 (self)
20 LOAD_ATTR 2 (attention_head_size)
22 BUILD_TUPLE 2
24 BINARY_ADD
26 STORE_FAST 2 (new_x_shape)
276 28 LOAD_FAST 1 (x)
30 LOAD_METHOD 3 (view)
32 LOAD_FAST 2 (new_x_shape)
34 CALL_METHOD 1
36 STORE_FAST 1 (x)
277 38 LOAD_FAST 1 (x)
40 LOAD_METHOD 4 (permute)
42 LOAD_CONST 2 (0)
44 LOAD_CONST 3 (2)
46 LOAD_CONST 4 (1)
48 LOAD_CONST 5 (3)
50 CALL_METHOD 4
52 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST x []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR size [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 0 [GetAttrVariable(TensorVariable(), size)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [SizeVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST -1 [SizeVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_SLICE 2 [SizeVariable(), ConstantVariable(NoneType), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_SUBSCR None [SizeVariable(), SliceVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [SizeVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR num_attention_heads [SizeVariable(), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [SizeVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR attention_head_size [SizeVariable(), ConstantVariable(int), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_TUPLE 2 [SizeVariable(), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_ADD None [SizeVariable(), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST new_x_shape [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST x []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR view [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST new_x_shape [GetAttrVariable(TensorVariable(), view)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [GetAttrVariable(TensorVariable(), view), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST x [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST x []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR permute [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 0 [GetAttrVariable(TensorVariable(), permute)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 2 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 1 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 3 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 4 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object transpose_for_scores at 0x7f6a10140710, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 274>
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST query_layer [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR is_decoder [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 236 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_GLOBAL torch []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR matmul [TorchVariable(<module 'torch' from '/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/torch/__init__.py'>)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST query_layer [TorchVariable(<built-in method matmul of type object at 0x7f6bb096cb40>)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST key_layer [TorchVariable(<built-in method matmul of type object at 0x7f6bb096cb40>), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR transpose [TorchVariable(<built-in method matmul of type object at 0x7f6bb096cb40>), TensorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST -1 [TorchVariable(<built-in method matmul of type object at 0x7f6bb096cb40>), TensorVariable(), GetAttrVariable(TensorVariable(), transpose)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST -2 [TorchVariable(<built-in method matmul of type object at 0x7f6bb096cb40>), TensorVariable(), GetAttrVariable(TensorVariable(), transpose), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 2 [TorchVariable(<built-in method matmul of type object at 0x7f6bb096cb40>), TensorVariable(), GetAttrVariable(TensorVariable(), transpose), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 2 [TorchVariable(<built-in method matmul of type object at 0x7f6bb096cb40>), TensorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST attention_scores [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR position_embedding_type [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST relative_key [ConstantVariable(str)]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP == [ConstantVariable(str), ConstantVariable(str)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_TRUE 280 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR position_embedding_type [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST relative_key_query [ConstantVariable(str)]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP == [ConstantVariable(str), ConstantVariable(str)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 478 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_scores []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_GLOBAL math [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR sqrt [TensorVariable(), TorchVariable(<module 'math' from '/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/lib-dynload/math.cpython-38-x86_64-linux-gnu.so'>)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [TensorVariable(), TorchVariable(<built-in function sqrt>)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR attention_head_size [TensorVariable(), TorchVariable(<built-in function sqrt>), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [TensorVariable(), TorchVariable(<built-in function sqrt>), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_TRUE_DIVIDE None [TensorVariable(), ConstantVariable(float)]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST attention_scores [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_mask []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP is not [TensorVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 512 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_scores []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_mask [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_ADD None [TensorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST attention_scores [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_GLOBAL nn []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR functional [TorchVariable(<module 'torch.nn' from '/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/torch/nn/__init__.py'>)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR softmax [TorchVariable(<module 'torch.nn.functional' from '/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/torch/nn/functional.py'>)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_scores [TorchVariable(<function softmax at 0x7f6a1d0fb5e0>)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST -1 [TorchVariable(<function softmax at 0x7f6a1d0fb5e0>), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST ('dim',) [TorchVariable(<function softmax at 0x7f6a1d0fb5e0>), TensorVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION_KW 2 [TorchVariable(<function softmax at 0x7f6a1d0fb5e0>), TensorVariable(), ConstantVariable(int), ConstantVariable(tuple)]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST attention_probs [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR dropout [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_probs [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST attention_probs [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST head_mask []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP is not [ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 556 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_GLOBAL torch []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR matmul [TorchVariable(<module 'torch' from '/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/torch/__init__.py'>)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_probs [TorchVariable(<built-in method matmul of type object at 0x7f6bb096cb40>)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST value_layer [TorchVariable(<built-in method matmul of type object at 0x7f6bb096cb40>), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 2 [TorchVariable(<built-in method matmul of type object at 0x7f6bb096cb40>), TensorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST context_layer [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST context_layer []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR permute [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 0 [GetAttrVariable(TensorVariable(), permute)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 2 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 1 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 3 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 4 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR contiguous [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 0 [GetAttrVariable(TensorVariable(), contiguous)]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST context_layer [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST context_layer []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR size [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 0 [GetAttrVariable(TensorVariable(), size)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [SizeVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST -2 [SizeVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_SLICE 2 [SizeVariable(), ConstantVariable(NoneType), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_SUBSCR None [SizeVariable(), SliceVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [SizeVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR all_head_size [SizeVariable(), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_TUPLE 1 [SizeVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_ADD None [SizeVariable(), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST new_context_layer_shape [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST context_layer []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR view [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST new_context_layer_shape [GetAttrVariable(TensorVariable(), view)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [GetAttrVariable(TensorVariable(), view), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST context_layer [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST output_attentions []
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 636 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST context_layer []
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_TUPLE 1 [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST outputs [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR is_decoder [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 660 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST outputs []
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object forward at 0x7f6a101407c0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 279>
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST self_outputs [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR output [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self_outputs [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 0 [NNModuleVariable(), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_SUBSCR None [NNModuleVariable(), TupleVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 2 [NNModuleVariable(), TensorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object forward at 0x7f6a101409d0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 381>
382 0 LOAD_FAST 0 (self)
2 LOAD_METHOD 0 (dense)
4 LOAD_FAST 1 (hidden_states)
6 CALL_METHOD 1
8 STORE_FAST 1 (hidden_states)
383 10 LOAD_FAST 0 (self)
12 LOAD_METHOD 1 (dropout)
14 LOAD_FAST 1 (hidden_states)
16 CALL_METHOD 1
18 STORE_FAST 1 (hidden_states)
384 20 LOAD_FAST 0 (self)
22 LOAD_METHOD 2 (LayerNorm)
24 LOAD_FAST 1 (hidden_states)
26 LOAD_FAST 2 (input_tensor)
28 BINARY_ADD
30 CALL_METHOD 1
32 STORE_FAST 1 (hidden_states)
385 34 LOAD_FAST 1 (hidden_states)
36 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR dense [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST hidden_states [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR dropout [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST hidden_states [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR LayerNorm [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST input_tensor [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_ADD None [NNModuleVariable(), TensorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST hidden_states [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states []
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object forward at 0x7f6a101409d0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 381>
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST attention_output [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_output []
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_TUPLE 1 [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self_outputs [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 1 [TupleVariable(), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [TupleVariable(), TupleVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_SLICE 2 [TupleVariable(), TupleVariable(), ConstantVariable(int), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_SUBSCR None [TupleVariable(), TupleVariable(), SliceVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_ADD None [TupleVariable(), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST outputs [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST outputs []
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object forward at 0x7f6a10140c90, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 413>
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST self_attention_outputs [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self_attention_outputs []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 0 [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_SUBSCR None [TupleVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST attention_output [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR is_decoder [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 80 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self_attention_outputs []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 1 [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [TupleVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_SLICE 2 [TupleVariable(), ConstantVariable(int), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_SUBSCR None [TupleVariable(), SliceVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST outputs [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None []
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST cross_attn_present_key_value [ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR is_decoder [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 222 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_GLOBAL apply_chunking_to_forward []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [UserFunctionVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR feed_forward_chunk [UserFunctionVariable(), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [UserFunctionVariable(), UserMethodVariable(<function BertLayer.feed_forward_chunk at 0x7f69ff5474c0>, NNModuleVariable())]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR chunk_size_feed_forward [UserFunctionVariable(), UserMethodVariable(<function BertLayer.feed_forward_chunk at 0x7f69ff5474c0>, NNModuleVariable()), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [UserFunctionVariable(), UserMethodVariable(<function BertLayer.feed_forward_chunk at 0x7f69ff5474c0>, NNModuleVariable()), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR seq_len_dim [UserFunctionVariable(), UserMethodVariable(<function BertLayer.feed_forward_chunk at 0x7f69ff5474c0>, NNModuleVariable()), ConstantVariable(int), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_output [UserFunctionVariable(), UserMethodVariable(<function BertLayer.feed_forward_chunk at 0x7f69ff5474c0>, NNModuleVariable()), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 4 [UserFunctionVariable(), UserMethodVariable(<function BertLayer.feed_forward_chunk at 0x7f69ff5474c0>, NNModuleVariable()), ConstantVariable(int), ConstantVariable(int), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object apply_chunking_to_forward at 0x7f6a10006030, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/pytorch_utils.py", line 169>
207 0 LOAD_GLOBAL 0 (len)
2 LOAD_FAST 3 (input_tensors)
4 CALL_FUNCTION 1
6 LOAD_CONST 1 (0)
8 COMPARE_OP 4 (>)
10 POP_JUMP_IF_TRUE 26
12 LOAD_GLOBAL 1 (AssertionError)
14 LOAD_FAST 3 (input_tensors)
16 FORMAT_VALUE 0
18 LOAD_CONST 2 (' has to be a tuple/list of tensors')
20 BUILD_STRING 2
22 CALL_FUNCTION 1
24 RAISE_VARARGS 1
210 >> 26 LOAD_GLOBAL 0 (len)
28 LOAD_GLOBAL 2 (inspect)
30 LOAD_METHOD 3 (signature)
32 LOAD_DEREF 1 (forward_fn)
34 CALL_METHOD 1
36 LOAD_ATTR 4 (parameters)
38 CALL_FUNCTION 1
40 STORE_FAST 4 (num_args_in_forward_chunk_fn)
211 42 LOAD_FAST 4 (num_args_in_forward_chunk_fn)
44 LOAD_GLOBAL 0 (len)
46 LOAD_FAST 3 (input_tensors)
48 CALL_FUNCTION 1
50 COMPARE_OP 3 (!=)
52 POP_JUMP_IF_FALSE 80
212 54 LOAD_GLOBAL 5 (ValueError)
213 56 LOAD_CONST 3 ('forward_chunk_fn expects ')
58 LOAD_FAST 4 (num_args_in_forward_chunk_fn)
60 FORMAT_VALUE 0
62 LOAD_CONST 4 (' arguments, but only ')
64 LOAD_GLOBAL 0 (len)
66 LOAD_FAST 3 (input_tensors)
68 CALL_FUNCTION 1
70 FORMAT_VALUE 0
72 LOAD_CONST 5 (' input tensors are given')
74 BUILD_STRING 5
212 76 CALL_FUNCTION 1
78 RAISE_VARARGS 1
217 >> 80 LOAD_FAST 1 (chunk_size)
82 LOAD_CONST 1 (0)
84 COMPARE_OP 4 (>)
86 EXTENDED_ARG 1
88 POP_JUMP_IF_FALSE 288
218 90 LOAD_FAST 3 (input_tensors)
92 LOAD_CONST 1 (0)
94 BINARY_SUBSCR
96 LOAD_ATTR 6 (shape)
98 LOAD_DEREF 0 (chunk_dim)
100 BINARY_SUBSCR
102 STORE_FAST 5 (tensor_shape)
219 104 LOAD_FAST 3 (input_tensors)
106 GET_ITER
>> 108 FOR_ITER 44 (to 154)
110 STORE_FAST 6 (input_tensor)
220 112 LOAD_FAST 6 (input_tensor)
114 LOAD_ATTR 6 (shape)
116 LOAD_DEREF 0 (chunk_dim)
118 BINARY_SUBSCR
120 LOAD_FAST 5 (tensor_shape)
122 COMPARE_OP 3 (!=)
124 POP_JUMP_IF_FALSE 108
221 126 LOAD_GLOBAL 5 (ValueError)
222 128 LOAD_CONST 6 ('All input tenors have to be of the same shape: ')
130 LOAD_FAST 5 (tensor_shape)
132 FORMAT_VALUE 0
134 LOAD_CONST 7 (', found shape ')
136 LOAD_FAST 6 (input_tensor)
138 LOAD_ATTR 6 (shape)
140 LOAD_DEREF 0 (chunk_dim)
142 BINARY_SUBSCR
144 FORMAT_VALUE 0
146 BUILD_STRING 4
221 148 CALL_FUNCTION 1
150 RAISE_VARARGS 1
152 JUMP_ABSOLUTE 108
226 >> 154 LOAD_FAST 3 (input_tensors)
156 LOAD_CONST 1 (0)
158 BINARY_SUBSCR
160 LOAD_ATTR 6 (shape)
162 LOAD_DEREF 0 (chunk_dim)
164 BINARY_SUBSCR
166 LOAD_FAST 1 (chunk_size)
168 BINARY_MODULO
170 LOAD_CONST 1 (0)
172 COMPARE_OP 3 (!=)
174 POP_JUMP_IF_FALSE 206
227 176 LOAD_GLOBAL 5 (ValueError)
228 178 LOAD_CONST 8 ('The dimension to be chunked ')
180 LOAD_FAST 3 (input_tensors)
182 LOAD_CONST 1 (0)
184 BINARY_SUBSCR
186 LOAD_ATTR 6 (shape)
188 LOAD_DEREF 0 (chunk_dim)
190 BINARY_SUBSCR
192 FORMAT_VALUE 0
194 LOAD_CONST 9 (' has to be a multiple of the chunk size ')
196 LOAD_FAST 1 (chunk_size)
198 FORMAT_VALUE 0
200 BUILD_STRING 4
227 202 CALL_FUNCTION 1
204 RAISE_VARARGS 1
232 >> 206 LOAD_FAST 3 (input_tensors)
208 LOAD_CONST 1 (0)
210 BINARY_SUBSCR
212 LOAD_ATTR 6 (shape)
214 LOAD_DEREF 0 (chunk_dim)
216 BINARY_SUBSCR
218 LOAD_FAST 1 (chunk_size)
220 BINARY_FLOOR_DIVIDE
222 STORE_DEREF 2 (num_chunks)
235 224 LOAD_GLOBAL 7 (tuple)
226 LOAD_CLOSURE 0 (chunk_dim)
228 LOAD_CLOSURE 2 (num_chunks)
230 BUILD_TUPLE 2
232 LOAD_CONST 10 (<code object <genexpr> at 0x7f6a10083ea0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/pytorch_utils.py", line 235>)
234 LOAD_CONST 11 ('apply_chunking_to_forward.<locals>.<genexpr>')
236 MAKE_FUNCTION 8 (closure)
238 LOAD_FAST 3 (input_tensors)
240 GET_ITER
242 CALL_FUNCTION 1
244 CALL_FUNCTION 1
246 STORE_FAST 7 (input_tensors_chunks)
237 248 LOAD_GLOBAL 7 (tuple)
250 LOAD_CLOSURE 1 (forward_fn)
252 BUILD_TUPLE 1
254 LOAD_CONST 12 (<code object <genexpr> at 0x7f6a10083f50, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/pytorch_utils.py", line 237>)
256 LOAD_CONST 11 ('apply_chunking_to_forward.<locals>.<genexpr>')
258 MAKE_FUNCTION 8 (closure)
260 LOAD_GLOBAL 8 (zip)
262 LOAD_FAST 7 (input_tensors_chunks)
264 CALL_FUNCTION_EX 0
266 GET_ITER
268 CALL_FUNCTION 1
270 CALL_FUNCTION 1
272 STORE_FAST 8 (output_chunks)
239 274 LOAD_GLOBAL 9 (torch)
276 LOAD_ATTR 10 (cat)
278 LOAD_FAST 8 (output_chunks)
280 LOAD_DEREF 0 (chunk_dim)
282 LOAD_CONST 13 (('dim',))
284 CALL_FUNCTION_KW 2
286 RETURN_VALUE
241 >> 288 LOAD_DEREF 1 (forward_fn)
290 LOAD_FAST 3 (input_tensors)
292 CALL_FUNCTION_EX 0
294 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_GLOBAL len []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST input_tensors [BuiltinVariable(len)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [BuiltinVariable(len), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 0 [ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP > [ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_TRUE 26 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_GLOBAL len []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_GLOBAL inspect [BuiltinVariable(len)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR signature [BuiltinVariable(len), PythonModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_DEREF forward_fn [BuiltinVariable(len), LambdaVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [BuiltinVariable(len), LambdaVariable(), UserMethodVariable(<function BertLayer.feed_forward_chunk at 0x7f69ff5474c0>, NNModuleVariable())]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR parameters [BuiltinVariable(len), InspectSignatureVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [BuiltinVariable(len), GetAttrVariable(InspectSignatureVariable(), parameters)]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST num_args_in_forward_chunk_fn [ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST num_args_in_forward_chunk_fn []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_GLOBAL len [ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST input_tensors [ConstantVariable(int), BuiltinVariable(len)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [ConstantVariable(int), BuiltinVariable(len), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP != [ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 80 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST chunk_size []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 0 [ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP > [ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 288 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_DEREF forward_fn []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST input_tensors [UserMethodVariable(<function BertLayer.feed_forward_chunk at 0x7f69ff5474c0>, NNModuleVariable())]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION_EX 0 [UserMethodVariable(<function BertLayer.feed_forward_chunk at 0x7f69ff5474c0>, NNModuleVariable()), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object feed_forward_chunk at 0x7f6a100c83a0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 546>
547 0 LOAD_FAST 0 (self)
2 LOAD_METHOD 0 (intermediate)
4 LOAD_FAST 1 (attention_output)
6 CALL_METHOD 1
8 STORE_FAST 2 (intermediate_output)
548 10 LOAD_FAST 0 (self)
12 LOAD_METHOD 1 (output)
14 LOAD_FAST 2 (intermediate_output)
16 LOAD_FAST 1 (attention_output)
18 CALL_METHOD 2
20 STORE_FAST 3 (layer_output)
549 22 LOAD_FAST 3 (layer_output)
24 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR intermediate [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_output [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object forward at 0x7f6a10140ea0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 446>
447 0 LOAD_FAST 0 (self)
2 LOAD_METHOD 0 (dense)
4 LOAD_FAST 1 (hidden_states)
6 CALL_METHOD 1
8 STORE_FAST 1 (hidden_states)
448 10 LOAD_FAST 0 (self)
12 LOAD_METHOD 1 (intermediate_act_fn)
14 LOAD_FAST 1 (hidden_states)
16 CALL_METHOD 1
18 STORE_FAST 1 (hidden_states)
449 20 LOAD_FAST 1 (hidden_states)
22 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR dense [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST hidden_states [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR intermediate_act_fn [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object forward at 0x7f6a100e12f0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/activations.py", line 55>
56 0 LOAD_FAST 0 (self)
2 LOAD_METHOD 0 (act)
4 LOAD_FAST 1 (input)
6 CALL_METHOD 1
8 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR act [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST input [TorchVariable(<built-in function gelu>)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [TorchVariable(<built-in function gelu>), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object forward at 0x7f6a100e12f0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/activations.py", line 55>
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST hidden_states [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states []
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object forward at 0x7f6a10140ea0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 446>
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST intermediate_output [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR output [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST intermediate_output [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_output [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 2 [NNModuleVariable(), TensorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object forward at 0x7f6a100c80e0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 459>
460 0 LOAD_FAST 0 (self)
2 LOAD_METHOD 0 (dense)
4 LOAD_FAST 1 (hidden_states)
6 CALL_METHOD 1
8 STORE_FAST 1 (hidden_states)
461 10 LOAD_FAST 0 (self)
12 LOAD_METHOD 1 (dropout)
14 LOAD_FAST 1 (hidden_states)
16 CALL_METHOD 1
18 STORE_FAST 1 (hidden_states)
462 20 LOAD_FAST 0 (self)
22 LOAD_METHOD 2 (LayerNorm)
24 LOAD_FAST 1 (hidden_states)
26 LOAD_FAST 2 (input_tensor)
28 BINARY_ADD
30 CALL_METHOD 1
32 STORE_FAST 1 (hidden_states)
463 34 LOAD_FAST 1 (hidden_states)
36 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR dense [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST hidden_states [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR dropout [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST hidden_states [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR LayerNorm [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST input_tensor [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_ADD None [NNModuleVariable(), TensorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST hidden_states [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states []
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object forward at 0x7f6a100c80e0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 459>
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST layer_output [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST layer_output []
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object feed_forward_chunk at 0x7f6a100c83a0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 546>
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object apply_chunking_to_forward at 0x7f6a10006030, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/pytorch_utils.py", line 169>
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST layer_output [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST layer_output []
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_TUPLE 1 [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST outputs [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_ADD None [TupleVariable(), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST outputs [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR is_decoder [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 270 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST outputs []
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object forward at 0x7f6a100c82f0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 481>
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST layer_outputs [ListIteratorVariable(), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST layer_outputs [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 0 [ListIteratorVariable(), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_SUBSCR None [ListIteratorVariable(), TupleVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST hidden_states [ListIteratorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST use_cache [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 248 [ListIteratorVariable(), ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_DEREF output_attentions [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 66 [ListIteratorVariable(), ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE FOR_ITER 290 [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE UNPACK_SEQUENCE 2 [ListIteratorVariable(), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST i [ListIteratorVariable(), NNModuleVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST layer_module [ListIteratorVariable(), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST output_hidden_states [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 88 [ListIteratorVariable(), ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST head_mask [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [ListIteratorVariable(), ListVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP is not [ListIteratorVariable(), ListVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 104 [ListIteratorVariable(), ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST head_mask [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST i [ListIteratorVariable(), ListVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_SUBSCR None [ListIteratorVariable(), ListVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE JUMP_FORWARD 106 [ListIteratorVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST layer_head_mask [ListIteratorVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST past_key_values [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [ListIteratorVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP is not [ListIteratorVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 124 [ListIteratorVariable(), ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_DEREF past_key_value [ListIteratorVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR gradient_checkpointing [ListIteratorVariable(), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 202 [ListIteratorVariable(), ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST layer_module [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [ListIteratorVariable(), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_mask [ListIteratorVariable(), NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST layer_head_mask [ListIteratorVariable(), NNModuleVariable(), TensorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST encoder_hidden_states [ListIteratorVariable(), NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST encoder_attention_mask [ListIteratorVariable(), NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_DEREF past_key_value [ListIteratorVariable(), NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_DEREF output_attentions [ListIteratorVariable(), NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 7 [ListIteratorVariable(), NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object forward at 0x7f6a100c82f0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 481>
492 0 LOAD_FAST 6 (past_key_value)
2 LOAD_CONST 0 (None)
4 COMPARE_OP 9 (is not)
6 POP_JUMP_IF_FALSE 20
8 LOAD_FAST 6 (past_key_value)
10 LOAD_CONST 0 (None)
12 LOAD_CONST 1 (2)
14 BUILD_SLICE 2
16 BINARY_SUBSCR
18 JUMP_FORWARD 2 (to 22)
>> 20 LOAD_CONST 0 (None)
>> 22 STORE_FAST 8 (self_attn_past_key_value)
493 24 LOAD_FAST 0 (self)
26 LOAD_ATTR 0 (attention)
494 28 LOAD_FAST 1 (hidden_states)
495 30 LOAD_FAST 2 (attention_mask)
496 32 LOAD_FAST 3 (head_mask)
497 34 LOAD_FAST 7 (output_attentions)
498 36 LOAD_FAST 8 (self_attn_past_key_value)
493 38 LOAD_CONST 2 (('output_attentions', 'past_key_value'))
40 CALL_FUNCTION_KW 5
42 STORE_FAST 9 (self_attention_outputs)
500 44 LOAD_FAST 9 (self_attention_outputs)
46 LOAD_CONST 3 (0)
48 BINARY_SUBSCR
50 STORE_FAST 10 (attention_output)
503 52 LOAD_FAST 0 (self)
54 LOAD_ATTR 1 (is_decoder)
56 POP_JUMP_IF_FALSE 80
504 58 LOAD_FAST 9 (self_attention_outputs)
60 LOAD_CONST 4 (1)
62 LOAD_CONST 5 (-1)
64 BUILD_SLICE 2
66 BINARY_SUBSCR
68 STORE_FAST 11 (outputs)
505 70 LOAD_FAST 9 (self_attention_outputs)
72 LOAD_CONST 5 (-1)
74 BINARY_SUBSCR
76 STORE_FAST 12 (present_key_value)
78 JUMP_FORWARD 12 (to 92)
507 >> 80 LOAD_FAST 9 (self_attention_outputs)
82 LOAD_CONST 4 (1)
84 LOAD_CONST 0 (None)
86 BUILD_SLICE 2
88 BINARY_SUBSCR
90 STORE_FAST 11 (outputs)
509 >> 92 LOAD_CONST 0 (None)
94 STORE_FAST 13 (cross_attn_present_key_value)
510 96 LOAD_FAST 0 (self)
98 LOAD_ATTR 1 (is_decoder)
100 POP_JUMP_IF_FALSE 222
102 LOAD_FAST 4 (encoder_hidden_states)
104 LOAD_CONST 0 (None)
106 COMPARE_OP 9 (is not)
108 POP_JUMP_IF_FALSE 222
511 110 LOAD_GLOBAL 2 (hasattr)
112 LOAD_FAST 0 (self)
114 LOAD_CONST 6 ('crossattention')
116 CALL_FUNCTION 2
118 POP_JUMP_IF_TRUE 136
512 120 LOAD_GLOBAL 3 (ValueError)
513 122 LOAD_CONST 7 ('If `encoder_hidden_states` are passed, ')
124 LOAD_FAST 0 (self)
126 FORMAT_VALUE 0
128 LOAD_CONST 8 (' has to be instantiated with cross-attention layers by setting `config.add_cross_attention=True`')
130 BUILD_STRING 3
512 132 CALL_FUNCTION 1
134 RAISE_VARARGS 1
518 >> 136 LOAD_FAST 6 (past_key_value)
138 LOAD_CONST 0 (None)
140 COMPARE_OP 9 (is not)
142 POP_JUMP_IF_FALSE 156
144 LOAD_FAST 6 (past_key_value)
146 LOAD_CONST 9 (-2)
148 LOAD_CONST 0 (None)
150 BUILD_SLICE 2
152 BINARY_SUBSCR
154 JUMP_FORWARD 2 (to 158)
>> 156 LOAD_CONST 0 (None)
>> 158 STORE_FAST 14 (cross_attn_past_key_value)
519 160 LOAD_FAST 0 (self)
162 LOAD_METHOD 4 (crossattention)
520 164 LOAD_FAST 10 (attention_output)
521 166 LOAD_FAST 2 (attention_mask)
522 168 LOAD_FAST 3 (head_mask)
523 170 LOAD_FAST 4 (encoder_hidden_states)
524 172 LOAD_FAST 5 (encoder_attention_mask)
525 174 LOAD_FAST 14 (cross_attn_past_key_value)
526 176 LOAD_FAST 7 (output_attentions)
519 178 CALL_METHOD 7
180 STORE_FAST 15 (cross_attention_outputs)
528 182 LOAD_FAST 15 (cross_attention_outputs)
184 LOAD_CONST 3 (0)
186 BINARY_SUBSCR
188 STORE_FAST 10 (attention_output)
529 190 LOAD_FAST 11 (outputs)
192 LOAD_FAST 15 (cross_attention_outputs)
194 LOAD_CONST 4 (1)
196 LOAD_CONST 5 (-1)
198 BUILD_SLICE 2
200 BINARY_SUBSCR
202 BINARY_ADD
204 STORE_FAST 11 (outputs)
532 206 LOAD_FAST 15 (cross_attention_outputs)
208 LOAD_CONST 5 (-1)
210 BINARY_SUBSCR
212 STORE_FAST 13 (cross_attn_present_key_value)
533 214 LOAD_FAST 12 (present_key_value)
216 LOAD_FAST 13 (cross_attn_present_key_value)
218 BINARY_ADD
220 STORE_FAST 12 (present_key_value)
535 >> 222 LOAD_GLOBAL 5 (apply_chunking_to_forward)
536 224 LOAD_FAST 0 (self)
226 LOAD_ATTR 6 (feed_forward_chunk)
228 LOAD_FAST 0 (self)
230 LOAD_ATTR 7 (chunk_size_feed_forward)
232 LOAD_FAST 0 (self)
234 LOAD_ATTR 8 (seq_len_dim)
236 LOAD_FAST 10 (attention_output)
535 238 CALL_FUNCTION 4
240 STORE_FAST 16 (layer_output)
538 242 LOAD_FAST 16 (layer_output)
244 BUILD_TUPLE 1
246 LOAD_FAST 11 (outputs)
248 BINARY_ADD
250 STORE_FAST 11 (outputs)
541 252 LOAD_FAST 0 (self)
254 LOAD_ATTR 1 (is_decoder)
256 EXTENDED_ARG 1
258 POP_JUMP_IF_FALSE 270
542 260 LOAD_FAST 11 (outputs)
262 LOAD_FAST 12 (present_key_value)
264 BUILD_TUPLE 1
266 BINARY_ADD
268 STORE_FAST 11 (outputs)
544 >> 270 LOAD_FAST 11 (outputs)
272 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST past_key_value []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP is not [ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 20 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None []
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST self_attn_past_key_value [ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR attention [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_mask [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST head_mask [NNModuleVariable(), TensorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST output_attentions [NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self_attn_past_key_value [NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST ('output_attentions', 'past_key_value') [NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(bool), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION_KW 5 [NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(bool), ConstantVariable(NoneType), ConstantVariable(tuple)]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object forward at 0x7f6a10140c90, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 413>
423 0 LOAD_FAST 0 (self)
2 LOAD_METHOD 0 (self)
424 4 LOAD_FAST 1 (hidden_states)
425 6 LOAD_FAST 2 (attention_mask)
426 8 LOAD_FAST 3 (head_mask)
427 10 LOAD_FAST 4 (encoder_hidden_states)
428 12 LOAD_FAST 5 (encoder_attention_mask)
429 14 LOAD_FAST 6 (past_key_value)
430 16 LOAD_FAST 7 (output_attentions)
423 18 CALL_METHOD 7
20 STORE_FAST 8 (self_outputs)
432 22 LOAD_FAST 0 (self)
24 LOAD_METHOD 1 (output)
26 LOAD_FAST 8 (self_outputs)
28 LOAD_CONST 1 (0)
30 BINARY_SUBSCR
32 LOAD_FAST 1 (hidden_states)
34 CALL_METHOD 2
36 STORE_FAST 9 (attention_output)
433 38 LOAD_FAST 9 (attention_output)
40 BUILD_TUPLE 1
42 LOAD_FAST 8 (self_outputs)
44 LOAD_CONST 2 (1)
46 LOAD_CONST 0 (None)
48 BUILD_SLICE 2
50 BINARY_SUBSCR
52 BINARY_ADD
54 STORE_FAST 10 (outputs)
434 56 LOAD_FAST 10 (outputs)
58 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR self [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_mask [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST head_mask [NNModuleVariable(), TensorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST encoder_hidden_states [NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST encoder_attention_mask [NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST past_key_value [NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST output_attentions [NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 7 [NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object forward at 0x7f6a101407c0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 279>
289 0 LOAD_FAST 0 (self)
2 LOAD_METHOD 0 (query)
4 LOAD_FAST 1 (hidden_states)
6 CALL_METHOD 1
8 STORE_FAST 8 (mixed_query_layer)
294 10 LOAD_FAST 4 (encoder_hidden_states)
12 LOAD_CONST 0 (None)
14 COMPARE_OP 9 (is not)
16 STORE_FAST 9 (is_cross_attention)
296 18 LOAD_FAST 9 (is_cross_attention)
20 POP_JUMP_IF_FALSE 52
22 LOAD_FAST 6 (past_key_value)
24 LOAD_CONST 0 (None)
26 COMPARE_OP 9 (is not)
28 POP_JUMP_IF_FALSE 52
298 30 LOAD_FAST 6 (past_key_value)
32 LOAD_CONST 1 (0)
34 BINARY_SUBSCR
36 STORE_FAST 10 (key_layer)
299 38 LOAD_FAST 6 (past_key_value)
40 LOAD_CONST 2 (1)
42 BINARY_SUBSCR
44 STORE_FAST 11 (value_layer)
300 46 LOAD_FAST 5 (encoder_attention_mask)
48 STORE_FAST 2 (attention_mask)
50 JUMP_FORWARD 160 (to 212)
301 >> 52 LOAD_FAST 9 (is_cross_attention)
54 POP_JUMP_IF_FALSE 94
302 56 LOAD_FAST 0 (self)
58 LOAD_METHOD 1 (transpose_for_scores)
60 LOAD_FAST 0 (self)
62 LOAD_METHOD 2 (key)
64 LOAD_FAST 4 (encoder_hidden_states)
66 CALL_METHOD 1
68 CALL_METHOD 1
70 STORE_FAST 10 (key_layer)
303 72 LOAD_FAST 0 (self)
74 LOAD_METHOD 1 (transpose_for_scores)
76 LOAD_FAST 0 (self)
78 LOAD_METHOD 3 (value)
80 LOAD_FAST 4 (encoder_hidden_states)
82 CALL_METHOD 1
84 CALL_METHOD 1
86 STORE_FAST 11 (value_layer)
304 88 LOAD_FAST 5 (encoder_attention_mask)
90 STORE_FAST 2 (attention_mask)
92 JUMP_FORWARD 118 (to 212)
305 >> 94 LOAD_FAST 6 (past_key_value)
96 LOAD_CONST 0 (None)
98 COMPARE_OP 9 (is not)
100 POP_JUMP_IF_FALSE 180
306 102 LOAD_FAST 0 (self)
104 LOAD_METHOD 1 (transpose_for_scores)
106 LOAD_FAST 0 (self)
108 LOAD_METHOD 2 (key)
110 LOAD_FAST 1 (hidden_states)
112 CALL_METHOD 1
114 CALL_METHOD 1
116 STORE_FAST 10 (key_layer)
307 118 LOAD_FAST 0 (self)
120 LOAD_METHOD 1 (transpose_for_scores)
122 LOAD_FAST 0 (self)
124 LOAD_METHOD 3 (value)
126 LOAD_FAST 1 (hidden_states)
128 CALL_METHOD 1
130 CALL_METHOD 1
132 STORE_FAST 11 (value_layer)
308 134 LOAD_GLOBAL 4 (torch)
136 LOAD_ATTR 5 (cat)
138 LOAD_FAST 6 (past_key_value)
140 LOAD_CONST 1 (0)
142 BINARY_SUBSCR
144 LOAD_FAST 10 (key_layer)
146 BUILD_LIST 2
148 LOAD_CONST 3 (2)
150 LOAD_CONST 4 (('dim',))
152 CALL_FUNCTION_KW 2
154 STORE_FAST 10 (key_layer)
309 156 LOAD_GLOBAL 4 (torch)
158 LOAD_ATTR 5 (cat)
160 LOAD_FAST 6 (past_key_value)
162 LOAD_CONST 2 (1)
164 BINARY_SUBSCR
166 LOAD_FAST 11 (value_layer)
168 BUILD_LIST 2
170 LOAD_CONST 3 (2)
172 LOAD_CONST 4 (('dim',))
174 CALL_FUNCTION_KW 2
176 STORE_FAST 11 (value_layer)
178 JUMP_FORWARD 32 (to 212)
311 >> 180 LOAD_FAST 0 (self)
182 LOAD_METHOD 1 (transpose_for_scores)
184 LOAD_FAST 0 (self)
186 LOAD_METHOD 2 (key)
188 LOAD_FAST 1 (hidden_states)
190 CALL_METHOD 1
192 CALL_METHOD 1
194 STORE_FAST 10 (key_layer)
312 196 LOAD_FAST 0 (self)
198 LOAD_METHOD 1 (transpose_for_scores)
200 LOAD_FAST 0 (self)
202 LOAD_METHOD 3 (value)
204 LOAD_FAST 1 (hidden_states)
206 CALL_METHOD 1
208 CALL_METHOD 1
210 STORE_FAST 11 (value_layer)
314 >> 212 LOAD_FAST 0 (self)
214 LOAD_METHOD 1 (transpose_for_scores)
216 LOAD_FAST 8 (mixed_query_layer)
218 CALL_METHOD 1
220 STORE_FAST 12 (query_layer)
316 222 LOAD_FAST 0 (self)
224 LOAD_ATTR 6 (is_decoder)
226 POP_JUMP_IF_FALSE 236
324 228 LOAD_FAST 10 (key_layer)
230 LOAD_FAST 11 (value_layer)
232 BUILD_TUPLE 2
234 STORE_FAST 6 (past_key_value)
327 >> 236 LOAD_GLOBAL 4 (torch)
238 LOAD_METHOD 7 (matmul)
240 LOAD_FAST 12 (query_layer)
242 LOAD_FAST 10 (key_layer)
244 LOAD_METHOD 8 (transpose)
246 LOAD_CONST 5 (-1)
248 LOAD_CONST 6 (-2)
250 CALL_METHOD 2
252 CALL_METHOD 2
254 STORE_FAST 13 (attention_scores)
329 256 LOAD_FAST 0 (self)
258 LOAD_ATTR 9 (position_embedding_type)
260 LOAD_CONST 7 ('relative_key')
262 COMPARE_OP 2 (==)
264 EXTENDED_ARG 1
266 POP_JUMP_IF_TRUE 280
268 LOAD_FAST 0 (self)
270 LOAD_ATTR 9 (position_embedding_type)
272 LOAD_CONST 8 ('relative_key_query')
274 COMPARE_OP 2 (==)
276 EXTENDED_ARG 1
278 POP_JUMP_IF_FALSE 478
330 >> 280 LOAD_FAST 1 (hidden_states)
282 LOAD_METHOD 10 (size)
284 CALL_METHOD 0
286 LOAD_CONST 2 (1)
288 BINARY_SUBSCR
290 STORE_FAST 14 (seq_length)
331 292 LOAD_GLOBAL 4 (torch)
294 LOAD_ATTR 11 (arange)
296 LOAD_FAST 14 (seq_length)
298 LOAD_GLOBAL 4 (torch)
300 LOAD_ATTR 12 (long)
302 LOAD_FAST 1 (hidden_states)
304 LOAD_ATTR 13 (device)
306 LOAD_CONST 9 (('dtype', 'device'))
308 CALL_FUNCTION_KW 3
310 LOAD_METHOD 14 (view)
312 LOAD_CONST 5 (-1)
314 LOAD_CONST 2 (1)
316 CALL_METHOD 2
318 STORE_FAST 15 (position_ids_l)
332 320 LOAD_GLOBAL 4 (torch)
322 LOAD_ATTR 11 (arange)
324 LOAD_FAST 14 (seq_length)
326 LOAD_GLOBAL 4 (torch)
328 LOAD_ATTR 12 (long)
330 LOAD_FAST 1 (hidden_states)
332 LOAD_ATTR 13 (device)
334 LOAD_CONST 9 (('dtype', 'device'))
336 CALL_FUNCTION_KW 3
338 LOAD_METHOD 14 (view)
340 LOAD_CONST 2 (1)
342 LOAD_CONST 5 (-1)
344 CALL_METHOD 2
346 STORE_FAST 16 (position_ids_r)
333 348 LOAD_FAST 15 (position_ids_l)
350 LOAD_FAST 16 (position_ids_r)
352 BINARY_SUBTRACT
354 STORE_FAST 17 (distance)
334 356 LOAD_FAST 0 (self)
358 LOAD_METHOD 15 (distance_embedding)
360 LOAD_FAST 17 (distance)
362 LOAD_FAST 0 (self)
364 LOAD_ATTR 16 (max_position_embeddings)
366 BINARY_ADD
368 LOAD_CONST 2 (1)
370 BINARY_SUBTRACT
372 CALL_METHOD 1
374 STORE_FAST 18 (positional_embedding)
335 376 LOAD_FAST 18 (positional_embedding)
378 LOAD_ATTR 17 (to)
380 LOAD_FAST 12 (query_layer)
382 LOAD_ATTR 18 (dtype)
384 LOAD_CONST 10 (('dtype',))
386 CALL_FUNCTION_KW 1
388 STORE_FAST 18 (positional_embedding)
337 390 LOAD_FAST 0 (self)
392 LOAD_ATTR 9 (position_embedding_type)
394 LOAD_CONST 7 ('relative_key')
396 COMPARE_OP 2 (==)
398 EXTENDED_ARG 1
400 POP_JUMP_IF_FALSE 426
338 402 LOAD_GLOBAL 4 (torch)
404 LOAD_METHOD 19 (einsum)
406 LOAD_CONST 11 ('bhld,lrd->bhlr')
408 LOAD_FAST 12 (query_layer)
410 LOAD_FAST 18 (positional_embedding)
412 CALL_METHOD 3
414 STORE_FAST 19 (relative_position_scores)
339 416 LOAD_FAST 13 (attention_scores)
418 LOAD_FAST 19 (relative_position_scores)
420 BINARY_ADD
422 STORE_FAST 13 (attention_scores)
424 JUMP_FORWARD 52 (to 478)
340 >> 426 LOAD_FAST 0 (self)
428 LOAD_ATTR 9 (position_embedding_type)
430 LOAD_CONST 8 ('relative_key_query')
432 COMPARE_OP 2 (==)
434 EXTENDED_ARG 1
436 POP_JUMP_IF_FALSE 478
341 438 LOAD_GLOBAL 4 (torch)
440 LOAD_METHOD 19 (einsum)
442 LOAD_CONST 11 ('bhld,lrd->bhlr')
444 LOAD_FAST 12 (query_layer)
446 LOAD_FAST 18 (positional_embedding)
448 CALL_METHOD 3
450 STORE_FAST 20 (relative_position_scores_query)
342 452 LOAD_GLOBAL 4 (torch)
454 LOAD_METHOD 19 (einsum)
456 LOAD_CONST 12 ('bhrd,lrd->bhlr')
458 LOAD_FAST 10 (key_layer)
460 LOAD_FAST 18 (positional_embedding)
462 CALL_METHOD 3
464 STORE_FAST 21 (relative_position_scores_key)
343 466 LOAD_FAST 13 (attention_scores)
468 LOAD_FAST 20 (relative_position_scores_query)
470 BINARY_ADD
472 LOAD_FAST 21 (relative_position_scores_key)
474 BINARY_ADD
476 STORE_FAST 13 (attention_scores)
345 >> 478 LOAD_FAST 13 (attention_scores)
480 LOAD_GLOBAL 20 (math)
482 LOAD_METHOD 21 (sqrt)
484 LOAD_FAST 0 (self)
486 LOAD_ATTR 22 (attention_head_size)
488 CALL_METHOD 1
490 BINARY_TRUE_DIVIDE
492 STORE_FAST 13 (attention_scores)
346 494 LOAD_FAST 2 (attention_mask)
496 LOAD_CONST 0 (None)
498 COMPARE_OP 9 (is not)
500 EXTENDED_ARG 2
502 POP_JUMP_IF_FALSE 512
348 504 LOAD_FAST 13 (attention_scores)
506 LOAD_FAST 2 (attention_mask)
508 BINARY_ADD
510 STORE_FAST 13 (attention_scores)
351 >> 512 LOAD_GLOBAL 23 (nn)
514 LOAD_ATTR 24 (functional)
516 LOAD_ATTR 25 (softmax)
518 LOAD_FAST 13 (attention_scores)
520 LOAD_CONST 5 (-1)
522 LOAD_CONST 4 (('dim',))
524 CALL_FUNCTION_KW 2
526 STORE_FAST 22 (attention_probs)
355 528 LOAD_FAST 0 (self)
530 LOAD_METHOD 26 (dropout)
532 LOAD_FAST 22 (attention_probs)
534 CALL_METHOD 1
536 STORE_FAST 22 (attention_probs)
358 538 LOAD_FAST 3 (head_mask)
540 LOAD_CONST 0 (None)
542 COMPARE_OP 9 (is not)
544 EXTENDED_ARG 2
546 POP_JUMP_IF_FALSE 556
359 548 LOAD_FAST 22 (attention_probs)
550 LOAD_FAST 3 (head_mask)
552 BINARY_MULTIPLY
554 STORE_FAST 22 (attention_probs)
361 >> 556 LOAD_GLOBAL 4 (torch)
558 LOAD_METHOD 7 (matmul)
560 LOAD_FAST 22 (attention_probs)
562 LOAD_FAST 11 (value_layer)
564 CALL_METHOD 2
566 STORE_FAST 23 (context_layer)
363 568 LOAD_FAST 23 (context_layer)
570 LOAD_METHOD 27 (permute)
572 LOAD_CONST 1 (0)
574 LOAD_CONST 3 (2)
576 LOAD_CONST 2 (1)
578 LOAD_CONST 13 (3)
580 CALL_METHOD 4
582 LOAD_METHOD 28 (contiguous)
584 CALL_METHOD 0
586 STORE_FAST 23 (context_layer)
364 588 LOAD_FAST 23 (context_layer)
590 LOAD_METHOD 10 (size)
592 CALL_METHOD 0
594 LOAD_CONST 0 (None)
596 LOAD_CONST 6 (-2)
598 BUILD_SLICE 2
600 BINARY_SUBSCR
602 LOAD_FAST 0 (self)
604 LOAD_ATTR 29 (all_head_size)
606 BUILD_TUPLE 1
608 BINARY_ADD
610 STORE_FAST 24 (new_context_layer_shape)
365 612 LOAD_FAST 23 (context_layer)
614 LOAD_METHOD 14 (view)
616 LOAD_FAST 24 (new_context_layer_shape)
618 CALL_METHOD 1
620 STORE_FAST 23 (context_layer)
367 622 LOAD_FAST 7 (output_attentions)
624 EXTENDED_ARG 2
626 POP_JUMP_IF_FALSE 636
628 LOAD_FAST 23 (context_layer)
630 LOAD_FAST 22 (attention_probs)
632 BUILD_TUPLE 2
634 JUMP_FORWARD 4 (to 640)
>> 636 LOAD_FAST 23 (context_layer)
638 BUILD_TUPLE 1
>> 640 STORE_FAST 25 (outputs)
369 642 LOAD_FAST 0 (self)
644 LOAD_ATTR 6 (is_decoder)
646 EXTENDED_ARG 2
648 POP_JUMP_IF_FALSE 660
370 650 LOAD_FAST 25 (outputs)
652 LOAD_FAST 6 (past_key_value)
654 BUILD_TUPLE 1
656 BINARY_ADD
658 STORE_FAST 25 (outputs)
371 >> 660 LOAD_FAST 25 (outputs)
662 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR query [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST mixed_query_layer [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST encoder_hidden_states []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP is not [ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST is_cross_attention [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST is_cross_attention []
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 52 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST is_cross_attention []
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 94 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST past_key_value []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP is not [ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 180 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR transpose_for_scores [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable())]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR key [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable()), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable()), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable()), NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable()), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object transpose_for_scores at 0x7f6a10140710, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 274>
275 0 LOAD_FAST 1 (x)
2 LOAD_METHOD 0 (size)
4 CALL_METHOD 0
6 LOAD_CONST 0 (None)
8 LOAD_CONST 1 (-1)
10 BUILD_SLICE 2
12 BINARY_SUBSCR
14 LOAD_FAST 0 (self)
16 LOAD_ATTR 1 (num_attention_heads)
18 LOAD_FAST 0 (self)
20 LOAD_ATTR 2 (attention_head_size)
22 BUILD_TUPLE 2
24 BINARY_ADD
26 STORE_FAST 2 (new_x_shape)
276 28 LOAD_FAST 1 (x)
30 LOAD_METHOD 3 (view)
32 LOAD_FAST 2 (new_x_shape)
34 CALL_METHOD 1
36 STORE_FAST 1 (x)
277 38 LOAD_FAST 1 (x)
40 LOAD_METHOD 4 (permute)
42 LOAD_CONST 2 (0)
44 LOAD_CONST 3 (2)
46 LOAD_CONST 4 (1)
48 LOAD_CONST 5 (3)
50 CALL_METHOD 4
52 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST x []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR size [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 0 [GetAttrVariable(TensorVariable(), size)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [SizeVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST -1 [SizeVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_SLICE 2 [SizeVariable(), ConstantVariable(NoneType), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_SUBSCR None [SizeVariable(), SliceVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [SizeVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR num_attention_heads [SizeVariable(), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [SizeVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR attention_head_size [SizeVariable(), ConstantVariable(int), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_TUPLE 2 [SizeVariable(), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_ADD None [SizeVariable(), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST new_x_shape [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST x []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR view [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST new_x_shape [GetAttrVariable(TensorVariable(), view)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [GetAttrVariable(TensorVariable(), view), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST x [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST x []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR permute [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 0 [GetAttrVariable(TensorVariable(), permute)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 2 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 1 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 3 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 4 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object transpose_for_scores at 0x7f6a10140710, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 274>
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST key_layer [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR transpose_for_scores [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable())]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR value [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable()), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable()), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable()), NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable()), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object transpose_for_scores at 0x7f6a10140710, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 274>
275 0 LOAD_FAST 1 (x)
2 LOAD_METHOD 0 (size)
4 CALL_METHOD 0
6 LOAD_CONST 0 (None)
8 LOAD_CONST 1 (-1)
10 BUILD_SLICE 2
12 BINARY_SUBSCR
14 LOAD_FAST 0 (self)
16 LOAD_ATTR 1 (num_attention_heads)
18 LOAD_FAST 0 (self)
20 LOAD_ATTR 2 (attention_head_size)
22 BUILD_TUPLE 2
24 BINARY_ADD
26 STORE_FAST 2 (new_x_shape)
276 28 LOAD_FAST 1 (x)
30 LOAD_METHOD 3 (view)
32 LOAD_FAST 2 (new_x_shape)
34 CALL_METHOD 1
36 STORE_FAST 1 (x)
277 38 LOAD_FAST 1 (x)
40 LOAD_METHOD 4 (permute)
42 LOAD_CONST 2 (0)
44 LOAD_CONST 3 (2)
46 LOAD_CONST 4 (1)
48 LOAD_CONST 5 (3)
50 CALL_METHOD 4
52 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST x []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR size [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 0 [GetAttrVariable(TensorVariable(), size)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [SizeVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST -1 [SizeVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_SLICE 2 [SizeVariable(), ConstantVariable(NoneType), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_SUBSCR None [SizeVariable(), SliceVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [SizeVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR num_attention_heads [SizeVariable(), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [SizeVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR attention_head_size [SizeVariable(), ConstantVariable(int), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_TUPLE 2 [SizeVariable(), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_ADD None [SizeVariable(), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST new_x_shape [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST x []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR view [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST new_x_shape [GetAttrVariable(TensorVariable(), view)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [GetAttrVariable(TensorVariable(), view), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST x [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST x []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR permute [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 0 [GetAttrVariable(TensorVariable(), permute)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 2 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 1 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 3 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 4 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object transpose_for_scores at 0x7f6a10140710, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 274>
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST value_layer [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR transpose_for_scores [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST mixed_query_layer [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable())]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable()), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object transpose_for_scores at 0x7f6a10140710, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 274>
275 0 LOAD_FAST 1 (x)
2 LOAD_METHOD 0 (size)
4 CALL_METHOD 0
6 LOAD_CONST 0 (None)
8 LOAD_CONST 1 (-1)
10 BUILD_SLICE 2
12 BINARY_SUBSCR
14 LOAD_FAST 0 (self)
16 LOAD_ATTR 1 (num_attention_heads)
18 LOAD_FAST 0 (self)
20 LOAD_ATTR 2 (attention_head_size)
22 BUILD_TUPLE 2
24 BINARY_ADD
26 STORE_FAST 2 (new_x_shape)
276 28 LOAD_FAST 1 (x)
30 LOAD_METHOD 3 (view)
32 LOAD_FAST 2 (new_x_shape)
34 CALL_METHOD 1
36 STORE_FAST 1 (x)
277 38 LOAD_FAST 1 (x)
40 LOAD_METHOD 4 (permute)
42 LOAD_CONST 2 (0)
44 LOAD_CONST 3 (2)
46 LOAD_CONST 4 (1)
48 LOAD_CONST 5 (3)
50 CALL_METHOD 4
52 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST x []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR size [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 0 [GetAttrVariable(TensorVariable(), size)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [SizeVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST -1 [SizeVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_SLICE 2 [SizeVariable(), ConstantVariable(NoneType), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_SUBSCR None [SizeVariable(), SliceVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [SizeVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR num_attention_heads [SizeVariable(), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [SizeVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR attention_head_size [SizeVariable(), ConstantVariable(int), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_TUPLE 2 [SizeVariable(), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_ADD None [SizeVariable(), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST new_x_shape [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST x []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR view [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST new_x_shape [GetAttrVariable(TensorVariable(), view)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [GetAttrVariable(TensorVariable(), view), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST x [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST x []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR permute [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 0 [GetAttrVariable(TensorVariable(), permute)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 2 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 1 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 3 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 4 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object transpose_for_scores at 0x7f6a10140710, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 274>
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST query_layer [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR is_decoder [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 236 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_GLOBAL torch []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR matmul [TorchVariable(<module 'torch' from '/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/torch/__init__.py'>)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST query_layer [TorchVariable(<built-in method matmul of type object at 0x7f6bb096cb40>)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST key_layer [TorchVariable(<built-in method matmul of type object at 0x7f6bb096cb40>), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR transpose [TorchVariable(<built-in method matmul of type object at 0x7f6bb096cb40>), TensorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST -1 [TorchVariable(<built-in method matmul of type object at 0x7f6bb096cb40>), TensorVariable(), GetAttrVariable(TensorVariable(), transpose)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST -2 [TorchVariable(<built-in method matmul of type object at 0x7f6bb096cb40>), TensorVariable(), GetAttrVariable(TensorVariable(), transpose), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 2 [TorchVariable(<built-in method matmul of type object at 0x7f6bb096cb40>), TensorVariable(), GetAttrVariable(TensorVariable(), transpose), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 2 [TorchVariable(<built-in method matmul of type object at 0x7f6bb096cb40>), TensorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST attention_scores [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR position_embedding_type [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST relative_key [ConstantVariable(str)]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP == [ConstantVariable(str), ConstantVariable(str)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_TRUE 280 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR position_embedding_type [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST relative_key_query [ConstantVariable(str)]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP == [ConstantVariable(str), ConstantVariable(str)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 478 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_scores []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_GLOBAL math [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR sqrt [TensorVariable(), TorchVariable(<module 'math' from '/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/lib-dynload/math.cpython-38-x86_64-linux-gnu.so'>)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [TensorVariable(), TorchVariable(<built-in function sqrt>)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR attention_head_size [TensorVariable(), TorchVariable(<built-in function sqrt>), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [TensorVariable(), TorchVariable(<built-in function sqrt>), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_TRUE_DIVIDE None [TensorVariable(), ConstantVariable(float)]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST attention_scores [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_mask []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP is not [TensorVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 512 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_scores []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_mask [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_ADD None [TensorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST attention_scores [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_GLOBAL nn []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR functional [TorchVariable(<module 'torch.nn' from '/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/torch/nn/__init__.py'>)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR softmax [TorchVariable(<module 'torch.nn.functional' from '/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/torch/nn/functional.py'>)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_scores [TorchVariable(<function softmax at 0x7f6a1d0fb5e0>)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST -1 [TorchVariable(<function softmax at 0x7f6a1d0fb5e0>), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST ('dim',) [TorchVariable(<function softmax at 0x7f6a1d0fb5e0>), TensorVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION_KW 2 [TorchVariable(<function softmax at 0x7f6a1d0fb5e0>), TensorVariable(), ConstantVariable(int), ConstantVariable(tuple)]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST attention_probs [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR dropout [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_probs [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST attention_probs [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST head_mask []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP is not [ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 556 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_GLOBAL torch []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR matmul [TorchVariable(<module 'torch' from '/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/torch/__init__.py'>)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_probs [TorchVariable(<built-in method matmul of type object at 0x7f6bb096cb40>)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST value_layer [TorchVariable(<built-in method matmul of type object at 0x7f6bb096cb40>), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 2 [TorchVariable(<built-in method matmul of type object at 0x7f6bb096cb40>), TensorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST context_layer [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST context_layer []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR permute [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 0 [GetAttrVariable(TensorVariable(), permute)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 2 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 1 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 3 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 4 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR contiguous [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 0 [GetAttrVariable(TensorVariable(), contiguous)]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST context_layer [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST context_layer []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR size [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 0 [GetAttrVariable(TensorVariable(), size)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [SizeVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST -2 [SizeVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_SLICE 2 [SizeVariable(), ConstantVariable(NoneType), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_SUBSCR None [SizeVariable(), SliceVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [SizeVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR all_head_size [SizeVariable(), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_TUPLE 1 [SizeVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_ADD None [SizeVariable(), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST new_context_layer_shape [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST context_layer []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR view [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST new_context_layer_shape [GetAttrVariable(TensorVariable(), view)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [GetAttrVariable(TensorVariable(), view), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST context_layer [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST output_attentions []
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 636 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST context_layer []
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_TUPLE 1 [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST outputs [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR is_decoder [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 660 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST outputs []
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object forward at 0x7f6a101407c0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 279>
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST self_outputs [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR output [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self_outputs [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 0 [NNModuleVariable(), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_SUBSCR None [NNModuleVariable(), TupleVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 2 [NNModuleVariable(), TensorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object forward at 0x7f6a101409d0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 381>
382 0 LOAD_FAST 0 (self)
2 LOAD_METHOD 0 (dense)
4 LOAD_FAST 1 (hidden_states)
6 CALL_METHOD 1
8 STORE_FAST 1 (hidden_states)
383 10 LOAD_FAST 0 (self)
12 LOAD_METHOD 1 (dropout)
14 LOAD_FAST 1 (hidden_states)
16 CALL_METHOD 1
18 STORE_FAST 1 (hidden_states)
384 20 LOAD_FAST 0 (self)
22 LOAD_METHOD 2 (LayerNorm)
24 LOAD_FAST 1 (hidden_states)
26 LOAD_FAST 2 (input_tensor)
28 BINARY_ADD
30 CALL_METHOD 1
32 STORE_FAST 1 (hidden_states)
385 34 LOAD_FAST 1 (hidden_states)
36 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR dense [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST hidden_states [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR dropout [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST hidden_states [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR LayerNorm [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST input_tensor [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_ADD None [NNModuleVariable(), TensorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST hidden_states [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states []
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object forward at 0x7f6a101409d0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 381>
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST attention_output [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_output []
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_TUPLE 1 [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self_outputs [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 1 [TupleVariable(), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [TupleVariable(), TupleVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_SLICE 2 [TupleVariable(), TupleVariable(), ConstantVariable(int), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_SUBSCR None [TupleVariable(), TupleVariable(), SliceVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_ADD None [TupleVariable(), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST outputs [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST outputs []
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object forward at 0x7f6a10140c90, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 413>
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST self_attention_outputs [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self_attention_outputs []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 0 [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_SUBSCR None [TupleVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST attention_output [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR is_decoder [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 80 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self_attention_outputs []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 1 [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [TupleVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_SLICE 2 [TupleVariable(), ConstantVariable(int), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_SUBSCR None [TupleVariable(), SliceVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST outputs [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None []
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST cross_attn_present_key_value [ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR is_decoder [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 222 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_GLOBAL apply_chunking_to_forward []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [UserFunctionVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR feed_forward_chunk [UserFunctionVariable(), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [UserFunctionVariable(), UserMethodVariable(<function BertLayer.feed_forward_chunk at 0x7f69ff5474c0>, NNModuleVariable())]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR chunk_size_feed_forward [UserFunctionVariable(), UserMethodVariable(<function BertLayer.feed_forward_chunk at 0x7f69ff5474c0>, NNModuleVariable()), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [UserFunctionVariable(), UserMethodVariable(<function BertLayer.feed_forward_chunk at 0x7f69ff5474c0>, NNModuleVariable()), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR seq_len_dim [UserFunctionVariable(), UserMethodVariable(<function BertLayer.feed_forward_chunk at 0x7f69ff5474c0>, NNModuleVariable()), ConstantVariable(int), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_output [UserFunctionVariable(), UserMethodVariable(<function BertLayer.feed_forward_chunk at 0x7f69ff5474c0>, NNModuleVariable()), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 4 [UserFunctionVariable(), UserMethodVariable(<function BertLayer.feed_forward_chunk at 0x7f69ff5474c0>, NNModuleVariable()), ConstantVariable(int), ConstantVariable(int), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object apply_chunking_to_forward at 0x7f6a10006030, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/pytorch_utils.py", line 169>
207 0 LOAD_GLOBAL 0 (len)
2 LOAD_FAST 3 (input_tensors)
4 CALL_FUNCTION 1
6 LOAD_CONST 1 (0)
8 COMPARE_OP 4 (>)
10 POP_JUMP_IF_TRUE 26
12 LOAD_GLOBAL 1 (AssertionError)
14 LOAD_FAST 3 (input_tensors)
16 FORMAT_VALUE 0
18 LOAD_CONST 2 (' has to be a tuple/list of tensors')
20 BUILD_STRING 2
22 CALL_FUNCTION 1
24 RAISE_VARARGS 1
210 >> 26 LOAD_GLOBAL 0 (len)
28 LOAD_GLOBAL 2 (inspect)
30 LOAD_METHOD 3 (signature)
32 LOAD_DEREF 1 (forward_fn)
34 CALL_METHOD 1
36 LOAD_ATTR 4 (parameters)
38 CALL_FUNCTION 1
40 STORE_FAST 4 (num_args_in_forward_chunk_fn)
211 42 LOAD_FAST 4 (num_args_in_forward_chunk_fn)
44 LOAD_GLOBAL 0 (len)
46 LOAD_FAST 3 (input_tensors)
48 CALL_FUNCTION 1
50 COMPARE_OP 3 (!=)
52 POP_JUMP_IF_FALSE 80
212 54 LOAD_GLOBAL 5 (ValueError)
213 56 LOAD_CONST 3 ('forward_chunk_fn expects ')
58 LOAD_FAST 4 (num_args_in_forward_chunk_fn)
60 FORMAT_VALUE 0
62 LOAD_CONST 4 (' arguments, but only ')
64 LOAD_GLOBAL 0 (len)
66 LOAD_FAST 3 (input_tensors)
68 CALL_FUNCTION 1
70 FORMAT_VALUE 0
72 LOAD_CONST 5 (' input tensors are given')
74 BUILD_STRING 5
212 76 CALL_FUNCTION 1
78 RAISE_VARARGS 1
217 >> 80 LOAD_FAST 1 (chunk_size)
82 LOAD_CONST 1 (0)
84 COMPARE_OP 4 (>)
86 EXTENDED_ARG 1
88 POP_JUMP_IF_FALSE 288
218 90 LOAD_FAST 3 (input_tensors)
92 LOAD_CONST 1 (0)
94 BINARY_SUBSCR
96 LOAD_ATTR 6 (shape)
98 LOAD_DEREF 0 (chunk_dim)
100 BINARY_SUBSCR
102 STORE_FAST 5 (tensor_shape)
219 104 LOAD_FAST 3 (input_tensors)
106 GET_ITER
>> 108 FOR_ITER 44 (to 154)
110 STORE_FAST 6 (input_tensor)
220 112 LOAD_FAST 6 (input_tensor)
114 LOAD_ATTR 6 (shape)
116 LOAD_DEREF 0 (chunk_dim)
118 BINARY_SUBSCR
120 LOAD_FAST 5 (tensor_shape)
122 COMPARE_OP 3 (!=)
124 POP_JUMP_IF_FALSE 108
221 126 LOAD_GLOBAL 5 (ValueError)
222 128 LOAD_CONST 6 ('All input tenors have to be of the same shape: ')
130 LOAD_FAST 5 (tensor_shape)
132 FORMAT_VALUE 0
134 LOAD_CONST 7 (', found shape ')
136 LOAD_FAST 6 (input_tensor)
138 LOAD_ATTR 6 (shape)
140 LOAD_DEREF 0 (chunk_dim)
142 BINARY_SUBSCR
144 FORMAT_VALUE 0
146 BUILD_STRING 4
221 148 CALL_FUNCTION 1
150 RAISE_VARARGS 1
152 JUMP_ABSOLUTE 108
226 >> 154 LOAD_FAST 3 (input_tensors)
156 LOAD_CONST 1 (0)
158 BINARY_SUBSCR
160 LOAD_ATTR 6 (shape)
162 LOAD_DEREF 0 (chunk_dim)
164 BINARY_SUBSCR
166 LOAD_FAST 1 (chunk_size)
168 BINARY_MODULO
170 LOAD_CONST 1 (0)
172 COMPARE_OP 3 (!=)
174 POP_JUMP_IF_FALSE 206
227 176 LOAD_GLOBAL 5 (ValueError)
228 178 LOAD_CONST 8 ('The dimension to be chunked ')
180 LOAD_FAST 3 (input_tensors)
182 LOAD_CONST 1 (0)
184 BINARY_SUBSCR
186 LOAD_ATTR 6 (shape)
188 LOAD_DEREF 0 (chunk_dim)
190 BINARY_SUBSCR
192 FORMAT_VALUE 0
194 LOAD_CONST 9 (' has to be a multiple of the chunk size ')
196 LOAD_FAST 1 (chunk_size)
198 FORMAT_VALUE 0
200 BUILD_STRING 4
227 202 CALL_FUNCTION 1
204 RAISE_VARARGS 1
232 >> 206 LOAD_FAST 3 (input_tensors)
208 LOAD_CONST 1 (0)
210 BINARY_SUBSCR
212 LOAD_ATTR 6 (shape)
214 LOAD_DEREF 0 (chunk_dim)
216 BINARY_SUBSCR
218 LOAD_FAST 1 (chunk_size)
220 BINARY_FLOOR_DIVIDE
222 STORE_DEREF 2 (num_chunks)
235 224 LOAD_GLOBAL 7 (tuple)
226 LOAD_CLOSURE 0 (chunk_dim)
228 LOAD_CLOSURE 2 (num_chunks)
230 BUILD_TUPLE 2
232 LOAD_CONST 10 (<code object <genexpr> at 0x7f6a10083ea0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/pytorch_utils.py", line 235>)
234 LOAD_CONST 11 ('apply_chunking_to_forward.<locals>.<genexpr>')
236 MAKE_FUNCTION 8 (closure)
238 LOAD_FAST 3 (input_tensors)
240 GET_ITER
242 CALL_FUNCTION 1
244 CALL_FUNCTION 1
246 STORE_FAST 7 (input_tensors_chunks)
237 248 LOAD_GLOBAL 7 (tuple)
250 LOAD_CLOSURE 1 (forward_fn)
252 BUILD_TUPLE 1
254 LOAD_CONST 12 (<code object <genexpr> at 0x7f6a10083f50, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/pytorch_utils.py", line 237>)
256 LOAD_CONST 11 ('apply_chunking_to_forward.<locals>.<genexpr>')
258 MAKE_FUNCTION 8 (closure)
260 LOAD_GLOBAL 8 (zip)
262 LOAD_FAST 7 (input_tensors_chunks)
264 CALL_FUNCTION_EX 0
266 GET_ITER
268 CALL_FUNCTION 1
270 CALL_FUNCTION 1
272 STORE_FAST 8 (output_chunks)
239 274 LOAD_GLOBAL 9 (torch)
276 LOAD_ATTR 10 (cat)
278 LOAD_FAST 8 (output_chunks)
280 LOAD_DEREF 0 (chunk_dim)
282 LOAD_CONST 13 (('dim',))
284 CALL_FUNCTION_KW 2
286 RETURN_VALUE
241 >> 288 LOAD_DEREF 1 (forward_fn)
290 LOAD_FAST 3 (input_tensors)
292 CALL_FUNCTION_EX 0
294 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_GLOBAL len []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST input_tensors [BuiltinVariable(len)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [BuiltinVariable(len), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 0 [ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP > [ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_TRUE 26 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_GLOBAL len []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_GLOBAL inspect [BuiltinVariable(len)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR signature [BuiltinVariable(len), PythonModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_DEREF forward_fn [BuiltinVariable(len), LambdaVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [BuiltinVariable(len), LambdaVariable(), UserMethodVariable(<function BertLayer.feed_forward_chunk at 0x7f69ff5474c0>, NNModuleVariable())]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR parameters [BuiltinVariable(len), InspectSignatureVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [BuiltinVariable(len), GetAttrVariable(InspectSignatureVariable(), parameters)]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST num_args_in_forward_chunk_fn [ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST num_args_in_forward_chunk_fn []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_GLOBAL len [ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST input_tensors [ConstantVariable(int), BuiltinVariable(len)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [ConstantVariable(int), BuiltinVariable(len), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP != [ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 80 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST chunk_size []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 0 [ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP > [ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 288 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_DEREF forward_fn []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST input_tensors [UserMethodVariable(<function BertLayer.feed_forward_chunk at 0x7f69ff5474c0>, NNModuleVariable())]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION_EX 0 [UserMethodVariable(<function BertLayer.feed_forward_chunk at 0x7f69ff5474c0>, NNModuleVariable()), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object feed_forward_chunk at 0x7f6a100c83a0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 546>
547 0 LOAD_FAST 0 (self)
2 LOAD_METHOD 0 (intermediate)
4 LOAD_FAST 1 (attention_output)
6 CALL_METHOD 1
8 STORE_FAST 2 (intermediate_output)
548 10 LOAD_FAST 0 (self)
12 LOAD_METHOD 1 (output)
14 LOAD_FAST 2 (intermediate_output)
16 LOAD_FAST 1 (attention_output)
18 CALL_METHOD 2
20 STORE_FAST 3 (layer_output)
549 22 LOAD_FAST 3 (layer_output)
24 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR intermediate [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_output [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object forward at 0x7f6a10140ea0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 446>
447 0 LOAD_FAST 0 (self)
2 LOAD_METHOD 0 (dense)
4 LOAD_FAST 1 (hidden_states)
6 CALL_METHOD 1
8 STORE_FAST 1 (hidden_states)
448 10 LOAD_FAST 0 (self)
12 LOAD_METHOD 1 (intermediate_act_fn)
14 LOAD_FAST 1 (hidden_states)
16 CALL_METHOD 1
18 STORE_FAST 1 (hidden_states)
449 20 LOAD_FAST 1 (hidden_states)
22 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR dense [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST hidden_states [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR intermediate_act_fn [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object forward at 0x7f6a100e12f0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/activations.py", line 55>
56 0 LOAD_FAST 0 (self)
2 LOAD_METHOD 0 (act)
4 LOAD_FAST 1 (input)
6 CALL_METHOD 1
8 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR act [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST input [TorchVariable(<built-in function gelu>)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [TorchVariable(<built-in function gelu>), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object forward at 0x7f6a100e12f0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/activations.py", line 55>
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST hidden_states [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states []
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object forward at 0x7f6a10140ea0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 446>
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST intermediate_output [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR output [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST intermediate_output [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_output [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 2 [NNModuleVariable(), TensorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object forward at 0x7f6a100c80e0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 459>
460 0 LOAD_FAST 0 (self)
2 LOAD_METHOD 0 (dense)
4 LOAD_FAST 1 (hidden_states)
6 CALL_METHOD 1
8 STORE_FAST 1 (hidden_states)
461 10 LOAD_FAST 0 (self)
12 LOAD_METHOD 1 (dropout)
14 LOAD_FAST 1 (hidden_states)
16 CALL_METHOD 1
18 STORE_FAST 1 (hidden_states)
462 20 LOAD_FAST 0 (self)
22 LOAD_METHOD 2 (LayerNorm)
24 LOAD_FAST 1 (hidden_states)
26 LOAD_FAST 2 (input_tensor)
28 BINARY_ADD
30 CALL_METHOD 1
32 STORE_FAST 1 (hidden_states)
463 34 LOAD_FAST 1 (hidden_states)
36 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR dense [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST hidden_states [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR dropout [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST hidden_states [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR LayerNorm [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST input_tensor [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_ADD None [NNModuleVariable(), TensorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST hidden_states [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states []
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object forward at 0x7f6a100c80e0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 459>
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST layer_output [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST layer_output []
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object feed_forward_chunk at 0x7f6a100c83a0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 546>
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object apply_chunking_to_forward at 0x7f6a10006030, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/pytorch_utils.py", line 169>
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST layer_output [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST layer_output []
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_TUPLE 1 [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST outputs [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_ADD None [TupleVariable(), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST outputs [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR is_decoder [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 270 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST outputs []
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object forward at 0x7f6a100c82f0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 481>
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST layer_outputs [ListIteratorVariable(), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST layer_outputs [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 0 [ListIteratorVariable(), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_SUBSCR None [ListIteratorVariable(), TupleVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST hidden_states [ListIteratorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST use_cache [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 248 [ListIteratorVariable(), ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_DEREF output_attentions [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 66 [ListIteratorVariable(), ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE FOR_ITER 290 [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE UNPACK_SEQUENCE 2 [ListIteratorVariable(), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST i [ListIteratorVariable(), NNModuleVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST layer_module [ListIteratorVariable(), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST output_hidden_states [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 88 [ListIteratorVariable(), ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST head_mask [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [ListIteratorVariable(), ListVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP is not [ListIteratorVariable(), ListVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 104 [ListIteratorVariable(), ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST head_mask [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST i [ListIteratorVariable(), ListVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_SUBSCR None [ListIteratorVariable(), ListVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE JUMP_FORWARD 106 [ListIteratorVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST layer_head_mask [ListIteratorVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST past_key_values [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [ListIteratorVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP is not [ListIteratorVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 124 [ListIteratorVariable(), ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_DEREF past_key_value [ListIteratorVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR gradient_checkpointing [ListIteratorVariable(), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 202 [ListIteratorVariable(), ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST layer_module [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [ListIteratorVariable(), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_mask [ListIteratorVariable(), NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST layer_head_mask [ListIteratorVariable(), NNModuleVariable(), TensorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST encoder_hidden_states [ListIteratorVariable(), NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST encoder_attention_mask [ListIteratorVariable(), NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_DEREF past_key_value [ListIteratorVariable(), NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_DEREF output_attentions [ListIteratorVariable(), NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 7 [ListIteratorVariable(), NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object forward at 0x7f6a100c82f0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 481>
492 0 LOAD_FAST 6 (past_key_value)
2 LOAD_CONST 0 (None)
4 COMPARE_OP 9 (is not)
6 POP_JUMP_IF_FALSE 20
8 LOAD_FAST 6 (past_key_value)
10 LOAD_CONST 0 (None)
12 LOAD_CONST 1 (2)
14 BUILD_SLICE 2
16 BINARY_SUBSCR
18 JUMP_FORWARD 2 (to 22)
>> 20 LOAD_CONST 0 (None)
>> 22 STORE_FAST 8 (self_attn_past_key_value)
493 24 LOAD_FAST 0 (self)
26 LOAD_ATTR 0 (attention)
494 28 LOAD_FAST 1 (hidden_states)
495 30 LOAD_FAST 2 (attention_mask)
496 32 LOAD_FAST 3 (head_mask)
497 34 LOAD_FAST 7 (output_attentions)
498 36 LOAD_FAST 8 (self_attn_past_key_value)
493 38 LOAD_CONST 2 (('output_attentions', 'past_key_value'))
40 CALL_FUNCTION_KW 5
42 STORE_FAST 9 (self_attention_outputs)
500 44 LOAD_FAST 9 (self_attention_outputs)
46 LOAD_CONST 3 (0)
48 BINARY_SUBSCR
50 STORE_FAST 10 (attention_output)
503 52 LOAD_FAST 0 (self)
54 LOAD_ATTR 1 (is_decoder)
56 POP_JUMP_IF_FALSE 80
504 58 LOAD_FAST 9 (self_attention_outputs)
60 LOAD_CONST 4 (1)
62 LOAD_CONST 5 (-1)
64 BUILD_SLICE 2
66 BINARY_SUBSCR
68 STORE_FAST 11 (outputs)
505 70 LOAD_FAST 9 (self_attention_outputs)
72 LOAD_CONST 5 (-1)
74 BINARY_SUBSCR
76 STORE_FAST 12 (present_key_value)
78 JUMP_FORWARD 12 (to 92)
507 >> 80 LOAD_FAST 9 (self_attention_outputs)
82 LOAD_CONST 4 (1)
84 LOAD_CONST 0 (None)
86 BUILD_SLICE 2
88 BINARY_SUBSCR
90 STORE_FAST 11 (outputs)
509 >> 92 LOAD_CONST 0 (None)
94 STORE_FAST 13 (cross_attn_present_key_value)
510 96 LOAD_FAST 0 (self)
98 LOAD_ATTR 1 (is_decoder)
100 POP_JUMP_IF_FALSE 222
102 LOAD_FAST 4 (encoder_hidden_states)
104 LOAD_CONST 0 (None)
106 COMPARE_OP 9 (is not)
108 POP_JUMP_IF_FALSE 222
511 110 LOAD_GLOBAL 2 (hasattr)
112 LOAD_FAST 0 (self)
114 LOAD_CONST 6 ('crossattention')
116 CALL_FUNCTION 2
118 POP_JUMP_IF_TRUE 136
512 120 LOAD_GLOBAL 3 (ValueError)
513 122 LOAD_CONST 7 ('If `encoder_hidden_states` are passed, ')
124 LOAD_FAST 0 (self)
126 FORMAT_VALUE 0
128 LOAD_CONST 8 (' has to be instantiated with cross-attention layers by setting `config.add_cross_attention=True`')
130 BUILD_STRING 3
512 132 CALL_FUNCTION 1
134 RAISE_VARARGS 1
518 >> 136 LOAD_FAST 6 (past_key_value)
138 LOAD_CONST 0 (None)
140 COMPARE_OP 9 (is not)
142 POP_JUMP_IF_FALSE 156
144 LOAD_FAST 6 (past_key_value)
146 LOAD_CONST 9 (-2)
148 LOAD_CONST 0 (None)
150 BUILD_SLICE 2
152 BINARY_SUBSCR
154 JUMP_FORWARD 2 (to 158)
>> 156 LOAD_CONST 0 (None)
>> 158 STORE_FAST 14 (cross_attn_past_key_value)
519 160 LOAD_FAST 0 (self)
162 LOAD_METHOD 4 (crossattention)
520 164 LOAD_FAST 10 (attention_output)
521 166 LOAD_FAST 2 (attention_mask)
522 168 LOAD_FAST 3 (head_mask)
523 170 LOAD_FAST 4 (encoder_hidden_states)
524 172 LOAD_FAST 5 (encoder_attention_mask)
525 174 LOAD_FAST 14 (cross_attn_past_key_value)
526 176 LOAD_FAST 7 (output_attentions)
519 178 CALL_METHOD 7
180 STORE_FAST 15 (cross_attention_outputs)
528 182 LOAD_FAST 15 (cross_attention_outputs)
184 LOAD_CONST 3 (0)
186 BINARY_SUBSCR
188 STORE_FAST 10 (attention_output)
529 190 LOAD_FAST 11 (outputs)
192 LOAD_FAST 15 (cross_attention_outputs)
194 LOAD_CONST 4 (1)
196 LOAD_CONST 5 (-1)
198 BUILD_SLICE 2
200 BINARY_SUBSCR
202 BINARY_ADD
204 STORE_FAST 11 (outputs)
532 206 LOAD_FAST 15 (cross_attention_outputs)
208 LOAD_CONST 5 (-1)
210 BINARY_SUBSCR
212 STORE_FAST 13 (cross_attn_present_key_value)
533 214 LOAD_FAST 12 (present_key_value)
216 LOAD_FAST 13 (cross_attn_present_key_value)
218 BINARY_ADD
220 STORE_FAST 12 (present_key_value)
535 >> 222 LOAD_GLOBAL 5 (apply_chunking_to_forward)
536 224 LOAD_FAST 0 (self)
226 LOAD_ATTR 6 (feed_forward_chunk)
228 LOAD_FAST 0 (self)
230 LOAD_ATTR 7 (chunk_size_feed_forward)
232 LOAD_FAST 0 (self)
234 LOAD_ATTR 8 (seq_len_dim)
236 LOAD_FAST 10 (attention_output)
535 238 CALL_FUNCTION 4
240 STORE_FAST 16 (layer_output)
538 242 LOAD_FAST 16 (layer_output)
244 BUILD_TUPLE 1
246 LOAD_FAST 11 (outputs)
248 BINARY_ADD
250 STORE_FAST 11 (outputs)
541 252 LOAD_FAST 0 (self)
254 LOAD_ATTR 1 (is_decoder)
256 EXTENDED_ARG 1
258 POP_JUMP_IF_FALSE 270
542 260 LOAD_FAST 11 (outputs)
262 LOAD_FAST 12 (present_key_value)
264 BUILD_TUPLE 1
266 BINARY_ADD
268 STORE_FAST 11 (outputs)
544 >> 270 LOAD_FAST 11 (outputs)
272 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST past_key_value []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP is not [ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 20 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None []
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST self_attn_past_key_value [ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR attention [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_mask [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST head_mask [NNModuleVariable(), TensorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST output_attentions [NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self_attn_past_key_value [NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST ('output_attentions', 'past_key_value') [NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(bool), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION_KW 5 [NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(bool), ConstantVariable(NoneType), ConstantVariable(tuple)]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object forward at 0x7f6a10140c90, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 413>
423 0 LOAD_FAST 0 (self)
2 LOAD_METHOD 0 (self)
424 4 LOAD_FAST 1 (hidden_states)
425 6 LOAD_FAST 2 (attention_mask)
426 8 LOAD_FAST 3 (head_mask)
427 10 LOAD_FAST 4 (encoder_hidden_states)
428 12 LOAD_FAST 5 (encoder_attention_mask)
429 14 LOAD_FAST 6 (past_key_value)
430 16 LOAD_FAST 7 (output_attentions)
423 18 CALL_METHOD 7
20 STORE_FAST 8 (self_outputs)
432 22 LOAD_FAST 0 (self)
24 LOAD_METHOD 1 (output)
26 LOAD_FAST 8 (self_outputs)
28 LOAD_CONST 1 (0)
30 BINARY_SUBSCR
32 LOAD_FAST 1 (hidden_states)
34 CALL_METHOD 2
36 STORE_FAST 9 (attention_output)
433 38 LOAD_FAST 9 (attention_output)
40 BUILD_TUPLE 1
42 LOAD_FAST 8 (self_outputs)
44 LOAD_CONST 2 (1)
46 LOAD_CONST 0 (None)
48 BUILD_SLICE 2
50 BINARY_SUBSCR
52 BINARY_ADD
54 STORE_FAST 10 (outputs)
434 56 LOAD_FAST 10 (outputs)
58 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR self [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_mask [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST head_mask [NNModuleVariable(), TensorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST encoder_hidden_states [NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST encoder_attention_mask [NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST past_key_value [NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST output_attentions [NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 7 [NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object forward at 0x7f6a101407c0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 279>
289 0 LOAD_FAST 0 (self)
2 LOAD_METHOD 0 (query)
4 LOAD_FAST 1 (hidden_states)
6 CALL_METHOD 1
8 STORE_FAST 8 (mixed_query_layer)
294 10 LOAD_FAST 4 (encoder_hidden_states)
12 LOAD_CONST 0 (None)
14 COMPARE_OP 9 (is not)
16 STORE_FAST 9 (is_cross_attention)
296 18 LOAD_FAST 9 (is_cross_attention)
20 POP_JUMP_IF_FALSE 52
22 LOAD_FAST 6 (past_key_value)
24 LOAD_CONST 0 (None)
26 COMPARE_OP 9 (is not)
28 POP_JUMP_IF_FALSE 52
298 30 LOAD_FAST 6 (past_key_value)
32 LOAD_CONST 1 (0)
34 BINARY_SUBSCR
36 STORE_FAST 10 (key_layer)
299 38 LOAD_FAST 6 (past_key_value)
40 LOAD_CONST 2 (1)
42 BINARY_SUBSCR
44 STORE_FAST 11 (value_layer)
300 46 LOAD_FAST 5 (encoder_attention_mask)
48 STORE_FAST 2 (attention_mask)
50 JUMP_FORWARD 160 (to 212)
301 >> 52 LOAD_FAST 9 (is_cross_attention)
54 POP_JUMP_IF_FALSE 94
302 56 LOAD_FAST 0 (self)
58 LOAD_METHOD 1 (transpose_for_scores)
60 LOAD_FAST 0 (self)
62 LOAD_METHOD 2 (key)
64 LOAD_FAST 4 (encoder_hidden_states)
66 CALL_METHOD 1
68 CALL_METHOD 1
70 STORE_FAST 10 (key_layer)
303 72 LOAD_FAST 0 (self)
74 LOAD_METHOD 1 (transpose_for_scores)
76 LOAD_FAST 0 (self)
78 LOAD_METHOD 3 (value)
80 LOAD_FAST 4 (encoder_hidden_states)
82 CALL_METHOD 1
84 CALL_METHOD 1
86 STORE_FAST 11 (value_layer)
304 88 LOAD_FAST 5 (encoder_attention_mask)
90 STORE_FAST 2 (attention_mask)
92 JUMP_FORWARD 118 (to 212)
305 >> 94 LOAD_FAST 6 (past_key_value)
96 LOAD_CONST 0 (None)
98 COMPARE_OP 9 (is not)
100 POP_JUMP_IF_FALSE 180
306 102 LOAD_FAST 0 (self)
104 LOAD_METHOD 1 (transpose_for_scores)
106 LOAD_FAST 0 (self)
108 LOAD_METHOD 2 (key)
110 LOAD_FAST 1 (hidden_states)
112 CALL_METHOD 1
114 CALL_METHOD 1
116 STORE_FAST 10 (key_layer)
307 118 LOAD_FAST 0 (self)
120 LOAD_METHOD 1 (transpose_for_scores)
122 LOAD_FAST 0 (self)
124 LOAD_METHOD 3 (value)
126 LOAD_FAST 1 (hidden_states)
128 CALL_METHOD 1
130 CALL_METHOD 1
132 STORE_FAST 11 (value_layer)
308 134 LOAD_GLOBAL 4 (torch)
136 LOAD_ATTR 5 (cat)
138 LOAD_FAST 6 (past_key_value)
140 LOAD_CONST 1 (0)
142 BINARY_SUBSCR
144 LOAD_FAST 10 (key_layer)
146 BUILD_LIST 2
148 LOAD_CONST 3 (2)
150 LOAD_CONST 4 (('dim',))
152 CALL_FUNCTION_KW 2
154 STORE_FAST 10 (key_layer)
309 156 LOAD_GLOBAL 4 (torch)
158 LOAD_ATTR 5 (cat)
160 LOAD_FAST 6 (past_key_value)
162 LOAD_CONST 2 (1)
164 BINARY_SUBSCR
166 LOAD_FAST 11 (value_layer)
168 BUILD_LIST 2
170 LOAD_CONST 3 (2)
172 LOAD_CONST 4 (('dim',))
174 CALL_FUNCTION_KW 2
176 STORE_FAST 11 (value_layer)
178 JUMP_FORWARD 32 (to 212)
311 >> 180 LOAD_FAST 0 (self)
182 LOAD_METHOD 1 (transpose_for_scores)
184 LOAD_FAST 0 (self)
186 LOAD_METHOD 2 (key)
188 LOAD_FAST 1 (hidden_states)
190 CALL_METHOD 1
192 CALL_METHOD 1
194 STORE_FAST 10 (key_layer)
312 196 LOAD_FAST 0 (self)
198 LOAD_METHOD 1 (transpose_for_scores)
200 LOAD_FAST 0 (self)
202 LOAD_METHOD 3 (value)
204 LOAD_FAST 1 (hidden_states)
206 CALL_METHOD 1
208 CALL_METHOD 1
210 STORE_FAST 11 (value_layer)
314 >> 212 LOAD_FAST 0 (self)
214 LOAD_METHOD 1 (transpose_for_scores)
216 LOAD_FAST 8 (mixed_query_layer)
218 CALL_METHOD 1
220 STORE_FAST 12 (query_layer)
316 222 LOAD_FAST 0 (self)
224 LOAD_ATTR 6 (is_decoder)
226 POP_JUMP_IF_FALSE 236
324 228 LOAD_FAST 10 (key_layer)
230 LOAD_FAST 11 (value_layer)
232 BUILD_TUPLE 2
234 STORE_FAST 6 (past_key_value)
327 >> 236 LOAD_GLOBAL 4 (torch)
238 LOAD_METHOD 7 (matmul)
240 LOAD_FAST 12 (query_layer)
242 LOAD_FAST 10 (key_layer)
244 LOAD_METHOD 8 (transpose)
246 LOAD_CONST 5 (-1)
248 LOAD_CONST 6 (-2)
250 CALL_METHOD 2
252 CALL_METHOD 2
254 STORE_FAST 13 (attention_scores)
329 256 LOAD_FAST 0 (self)
258 LOAD_ATTR 9 (position_embedding_type)
260 LOAD_CONST 7 ('relative_key')
262 COMPARE_OP 2 (==)
264 EXTENDED_ARG 1
266 POP_JUMP_IF_TRUE 280
268 LOAD_FAST 0 (self)
270 LOAD_ATTR 9 (position_embedding_type)
272 LOAD_CONST 8 ('relative_key_query')
274 COMPARE_OP 2 (==)
276 EXTENDED_ARG 1
278 POP_JUMP_IF_FALSE 478
330 >> 280 LOAD_FAST 1 (hidden_states)
282 LOAD_METHOD 10 (size)
284 CALL_METHOD 0
286 LOAD_CONST 2 (1)
288 BINARY_SUBSCR
290 STORE_FAST 14 (seq_length)
331 292 LOAD_GLOBAL 4 (torch)
294 LOAD_ATTR 11 (arange)
296 LOAD_FAST 14 (seq_length)
298 LOAD_GLOBAL 4 (torch)
300 LOAD_ATTR 12 (long)
302 LOAD_FAST 1 (hidden_states)
304 LOAD_ATTR 13 (device)
306 LOAD_CONST 9 (('dtype', 'device'))
308 CALL_FUNCTION_KW 3
310 LOAD_METHOD 14 (view)
312 LOAD_CONST 5 (-1)
314 LOAD_CONST 2 (1)
316 CALL_METHOD 2
318 STORE_FAST 15 (position_ids_l)
332 320 LOAD_GLOBAL 4 (torch)
322 LOAD_ATTR 11 (arange)
324 LOAD_FAST 14 (seq_length)
326 LOAD_GLOBAL 4 (torch)
328 LOAD_ATTR 12 (long)
330 LOAD_FAST 1 (hidden_states)
332 LOAD_ATTR 13 (device)
334 LOAD_CONST 9 (('dtype', 'device'))
336 CALL_FUNCTION_KW 3
338 LOAD_METHOD 14 (view)
340 LOAD_CONST 2 (1)
342 LOAD_CONST 5 (-1)
344 CALL_METHOD 2
346 STORE_FAST 16 (position_ids_r)
333 348 LOAD_FAST 15 (position_ids_l)
350 LOAD_FAST 16 (position_ids_r)
352 BINARY_SUBTRACT
354 STORE_FAST 17 (distance)
334 356 LOAD_FAST 0 (self)
358 LOAD_METHOD 15 (distance_embedding)
360 LOAD_FAST 17 (distance)
362 LOAD_FAST 0 (self)
364 LOAD_ATTR 16 (max_position_embeddings)
366 BINARY_ADD
368 LOAD_CONST 2 (1)
370 BINARY_SUBTRACT
372 CALL_METHOD 1
374 STORE_FAST 18 (positional_embedding)
335 376 LOAD_FAST 18 (positional_embedding)
378 LOAD_ATTR 17 (to)
380 LOAD_FAST 12 (query_layer)
382 LOAD_ATTR 18 (dtype)
384 LOAD_CONST 10 (('dtype',))
386 CALL_FUNCTION_KW 1
388 STORE_FAST 18 (positional_embedding)
337 390 LOAD_FAST 0 (self)
392 LOAD_ATTR 9 (position_embedding_type)
394 LOAD_CONST 7 ('relative_key')
396 COMPARE_OP 2 (==)
398 EXTENDED_ARG 1
400 POP_JUMP_IF_FALSE 426
338 402 LOAD_GLOBAL 4 (torch)
404 LOAD_METHOD 19 (einsum)
406 LOAD_CONST 11 ('bhld,lrd->bhlr')
408 LOAD_FAST 12 (query_layer)
410 LOAD_FAST 18 (positional_embedding)
412 CALL_METHOD 3
414 STORE_FAST 19 (relative_position_scores)
339 416 LOAD_FAST 13 (attention_scores)
418 LOAD_FAST 19 (relative_position_scores)
420 BINARY_ADD
422 STORE_FAST 13 (attention_scores)
424 JUMP_FORWARD 52 (to 478)
340 >> 426 LOAD_FAST 0 (self)
428 LOAD_ATTR 9 (position_embedding_type)
430 LOAD_CONST 8 ('relative_key_query')
432 COMPARE_OP 2 (==)
434 EXTENDED_ARG 1
436 POP_JUMP_IF_FALSE 478
341 438 LOAD_GLOBAL 4 (torch)
440 LOAD_METHOD 19 (einsum)
442 LOAD_CONST 11 ('bhld,lrd->bhlr')
444 LOAD_FAST 12 (query_layer)
446 LOAD_FAST 18 (positional_embedding)
448 CALL_METHOD 3
450 STORE_FAST 20 (relative_position_scores_query)
342 452 LOAD_GLOBAL 4 (torch)
454 LOAD_METHOD 19 (einsum)
456 LOAD_CONST 12 ('bhrd,lrd->bhlr')
458 LOAD_FAST 10 (key_layer)
460 LOAD_FAST 18 (positional_embedding)
462 CALL_METHOD 3
464 STORE_FAST 21 (relative_position_scores_key)
343 466 LOAD_FAST 13 (attention_scores)
468 LOAD_FAST 20 (relative_position_scores_query)
470 BINARY_ADD
472 LOAD_FAST 21 (relative_position_scores_key)
474 BINARY_ADD
476 STORE_FAST 13 (attention_scores)
345 >> 478 LOAD_FAST 13 (attention_scores)
480 LOAD_GLOBAL 20 (math)
482 LOAD_METHOD 21 (sqrt)
484 LOAD_FAST 0 (self)
486 LOAD_ATTR 22 (attention_head_size)
488 CALL_METHOD 1
490 BINARY_TRUE_DIVIDE
492 STORE_FAST 13 (attention_scores)
346 494 LOAD_FAST 2 (attention_mask)
496 LOAD_CONST 0 (None)
498 COMPARE_OP 9 (is not)
500 EXTENDED_ARG 2
502 POP_JUMP_IF_FALSE 512
348 504 LOAD_FAST 13 (attention_scores)
506 LOAD_FAST 2 (attention_mask)
508 BINARY_ADD
510 STORE_FAST 13 (attention_scores)
351 >> 512 LOAD_GLOBAL 23 (nn)
514 LOAD_ATTR 24 (functional)
516 LOAD_ATTR 25 (softmax)
518 LOAD_FAST 13 (attention_scores)
520 LOAD_CONST 5 (-1)
522 LOAD_CONST 4 (('dim',))
524 CALL_FUNCTION_KW 2
526 STORE_FAST 22 (attention_probs)
355 528 LOAD_FAST 0 (self)
530 LOAD_METHOD 26 (dropout)
532 LOAD_FAST 22 (attention_probs)
534 CALL_METHOD 1
536 STORE_FAST 22 (attention_probs)
358 538 LOAD_FAST 3 (head_mask)
540 LOAD_CONST 0 (None)
542 COMPARE_OP 9 (is not)
544 EXTENDED_ARG 2
546 POP_JUMP_IF_FALSE 556
359 548 LOAD_FAST 22 (attention_probs)
550 LOAD_FAST 3 (head_mask)
552 BINARY_MULTIPLY
554 STORE_FAST 22 (attention_probs)
361 >> 556 LOAD_GLOBAL 4 (torch)
558 LOAD_METHOD 7 (matmul)
560 LOAD_FAST 22 (attention_probs)
562 LOAD_FAST 11 (value_layer)
564 CALL_METHOD 2
566 STORE_FAST 23 (context_layer)
363 568 LOAD_FAST 23 (context_layer)
570 LOAD_METHOD 27 (permute)
572 LOAD_CONST 1 (0)
574 LOAD_CONST 3 (2)
576 LOAD_CONST 2 (1)
578 LOAD_CONST 13 (3)
580 CALL_METHOD 4
582 LOAD_METHOD 28 (contiguous)
584 CALL_METHOD 0
586 STORE_FAST 23 (context_layer)
364 588 LOAD_FAST 23 (context_layer)
590 LOAD_METHOD 10 (size)
592 CALL_METHOD 0
594 LOAD_CONST 0 (None)
596 LOAD_CONST 6 (-2)
598 BUILD_SLICE 2
600 BINARY_SUBSCR
602 LOAD_FAST 0 (self)
604 LOAD_ATTR 29 (all_head_size)
606 BUILD_TUPLE 1
608 BINARY_ADD
610 STORE_FAST 24 (new_context_layer_shape)
365 612 LOAD_FAST 23 (context_layer)
614 LOAD_METHOD 14 (view)
616 LOAD_FAST 24 (new_context_layer_shape)
618 CALL_METHOD 1
620 STORE_FAST 23 (context_layer)
367 622 LOAD_FAST 7 (output_attentions)
624 EXTENDED_ARG 2
626 POP_JUMP_IF_FALSE 636
628 LOAD_FAST 23 (context_layer)
630 LOAD_FAST 22 (attention_probs)
632 BUILD_TUPLE 2
634 JUMP_FORWARD 4 (to 640)
>> 636 LOAD_FAST 23 (context_layer)
638 BUILD_TUPLE 1
>> 640 STORE_FAST 25 (outputs)
369 642 LOAD_FAST 0 (self)
644 LOAD_ATTR 6 (is_decoder)
646 EXTENDED_ARG 2
648 POP_JUMP_IF_FALSE 660
370 650 LOAD_FAST 25 (outputs)
652 LOAD_FAST 6 (past_key_value)
654 BUILD_TUPLE 1
656 BINARY_ADD
658 STORE_FAST 25 (outputs)
371 >> 660 LOAD_FAST 25 (outputs)
662 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR query [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST mixed_query_layer [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST encoder_hidden_states []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP is not [ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST is_cross_attention [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST is_cross_attention []
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 52 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST is_cross_attention []
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 94 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST past_key_value []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP is not [ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 180 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR transpose_for_scores [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable())]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR key [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable()), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable()), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable()), NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable()), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object transpose_for_scores at 0x7f6a10140710, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 274>
275 0 LOAD_FAST 1 (x)
2 LOAD_METHOD 0 (size)
4 CALL_METHOD 0
6 LOAD_CONST 0 (None)
8 LOAD_CONST 1 (-1)
10 BUILD_SLICE 2
12 BINARY_SUBSCR
14 LOAD_FAST 0 (self)
16 LOAD_ATTR 1 (num_attention_heads)
18 LOAD_FAST 0 (self)
20 LOAD_ATTR 2 (attention_head_size)
22 BUILD_TUPLE 2
24 BINARY_ADD
26 STORE_FAST 2 (new_x_shape)
276 28 LOAD_FAST 1 (x)
30 LOAD_METHOD 3 (view)
32 LOAD_FAST 2 (new_x_shape)
34 CALL_METHOD 1
36 STORE_FAST 1 (x)
277 38 LOAD_FAST 1 (x)
40 LOAD_METHOD 4 (permute)
42 LOAD_CONST 2 (0)
44 LOAD_CONST 3 (2)
46 LOAD_CONST 4 (1)
48 LOAD_CONST 5 (3)
50 CALL_METHOD 4
52 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST x []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR size [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 0 [GetAttrVariable(TensorVariable(), size)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [SizeVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST -1 [SizeVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_SLICE 2 [SizeVariable(), ConstantVariable(NoneType), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_SUBSCR None [SizeVariable(), SliceVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [SizeVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR num_attention_heads [SizeVariable(), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [SizeVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR attention_head_size [SizeVariable(), ConstantVariable(int), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_TUPLE 2 [SizeVariable(), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_ADD None [SizeVariable(), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST new_x_shape [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST x []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR view [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST new_x_shape [GetAttrVariable(TensorVariable(), view)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [GetAttrVariable(TensorVariable(), view), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST x [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST x []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR permute [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 0 [GetAttrVariable(TensorVariable(), permute)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 2 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 1 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 3 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 4 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object transpose_for_scores at 0x7f6a10140710, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 274>
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST key_layer [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR transpose_for_scores [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable())]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR value [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable()), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable()), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable()), NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable()), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object transpose_for_scores at 0x7f6a10140710, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 274>
275 0 LOAD_FAST 1 (x)
2 LOAD_METHOD 0 (size)
4 CALL_METHOD 0
6 LOAD_CONST 0 (None)
8 LOAD_CONST 1 (-1)
10 BUILD_SLICE 2
12 BINARY_SUBSCR
14 LOAD_FAST 0 (self)
16 LOAD_ATTR 1 (num_attention_heads)
18 LOAD_FAST 0 (self)
20 LOAD_ATTR 2 (attention_head_size)
22 BUILD_TUPLE 2
24 BINARY_ADD
26 STORE_FAST 2 (new_x_shape)
276 28 LOAD_FAST 1 (x)
30 LOAD_METHOD 3 (view)
32 LOAD_FAST 2 (new_x_shape)
34 CALL_METHOD 1
36 STORE_FAST 1 (x)
277 38 LOAD_FAST 1 (x)
40 LOAD_METHOD 4 (permute)
42 LOAD_CONST 2 (0)
44 LOAD_CONST 3 (2)
46 LOAD_CONST 4 (1)
48 LOAD_CONST 5 (3)
50 CALL_METHOD 4
52 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST x []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR size [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 0 [GetAttrVariable(TensorVariable(), size)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [SizeVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST -1 [SizeVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_SLICE 2 [SizeVariable(), ConstantVariable(NoneType), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_SUBSCR None [SizeVariable(), SliceVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [SizeVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR num_attention_heads [SizeVariable(), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [SizeVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR attention_head_size [SizeVariable(), ConstantVariable(int), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_TUPLE 2 [SizeVariable(), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_ADD None [SizeVariable(), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST new_x_shape [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST x []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR view [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST new_x_shape [GetAttrVariable(TensorVariable(), view)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [GetAttrVariable(TensorVariable(), view), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST x [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST x []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR permute [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 0 [GetAttrVariable(TensorVariable(), permute)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 2 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 1 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 3 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 4 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object transpose_for_scores at 0x7f6a10140710, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 274>
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST value_layer [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR transpose_for_scores [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST mixed_query_layer [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable())]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable()), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object transpose_for_scores at 0x7f6a10140710, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 274>
275 0 LOAD_FAST 1 (x)
2 LOAD_METHOD 0 (size)
4 CALL_METHOD 0
6 LOAD_CONST 0 (None)
8 LOAD_CONST 1 (-1)
10 BUILD_SLICE 2
12 BINARY_SUBSCR
14 LOAD_FAST 0 (self)
16 LOAD_ATTR 1 (num_attention_heads)
18 LOAD_FAST 0 (self)
20 LOAD_ATTR 2 (attention_head_size)
22 BUILD_TUPLE 2
24 BINARY_ADD
26 STORE_FAST 2 (new_x_shape)
276 28 LOAD_FAST 1 (x)
30 LOAD_METHOD 3 (view)
32 LOAD_FAST 2 (new_x_shape)
34 CALL_METHOD 1
36 STORE_FAST 1 (x)
277 38 LOAD_FAST 1 (x)
40 LOAD_METHOD 4 (permute)
42 LOAD_CONST 2 (0)
44 LOAD_CONST 3 (2)
46 LOAD_CONST 4 (1)
48 LOAD_CONST 5 (3)
50 CALL_METHOD 4
52 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST x []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR size [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 0 [GetAttrVariable(TensorVariable(), size)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [SizeVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST -1 [SizeVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_SLICE 2 [SizeVariable(), ConstantVariable(NoneType), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_SUBSCR None [SizeVariable(), SliceVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [SizeVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR num_attention_heads [SizeVariable(), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [SizeVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR attention_head_size [SizeVariable(), ConstantVariable(int), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_TUPLE 2 [SizeVariable(), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_ADD None [SizeVariable(), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST new_x_shape [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST x []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR view [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST new_x_shape [GetAttrVariable(TensorVariable(), view)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [GetAttrVariable(TensorVariable(), view), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST x [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST x []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR permute [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 0 [GetAttrVariable(TensorVariable(), permute)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 2 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 1 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 3 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 4 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object transpose_for_scores at 0x7f6a10140710, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 274>
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST query_layer [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR is_decoder [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 236 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_GLOBAL torch []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR matmul [TorchVariable(<module 'torch' from '/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/torch/__init__.py'>)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST query_layer [TorchVariable(<built-in method matmul of type object at 0x7f6bb096cb40>)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST key_layer [TorchVariable(<built-in method matmul of type object at 0x7f6bb096cb40>), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR transpose [TorchVariable(<built-in method matmul of type object at 0x7f6bb096cb40>), TensorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST -1 [TorchVariable(<built-in method matmul of type object at 0x7f6bb096cb40>), TensorVariable(), GetAttrVariable(TensorVariable(), transpose)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST -2 [TorchVariable(<built-in method matmul of type object at 0x7f6bb096cb40>), TensorVariable(), GetAttrVariable(TensorVariable(), transpose), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 2 [TorchVariable(<built-in method matmul of type object at 0x7f6bb096cb40>), TensorVariable(), GetAttrVariable(TensorVariable(), transpose), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 2 [TorchVariable(<built-in method matmul of type object at 0x7f6bb096cb40>), TensorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST attention_scores [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR position_embedding_type [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST relative_key [ConstantVariable(str)]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP == [ConstantVariable(str), ConstantVariable(str)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_TRUE 280 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR position_embedding_type [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST relative_key_query [ConstantVariable(str)]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP == [ConstantVariable(str), ConstantVariable(str)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 478 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_scores []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_GLOBAL math [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR sqrt [TensorVariable(), TorchVariable(<module 'math' from '/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/lib-dynload/math.cpython-38-x86_64-linux-gnu.so'>)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [TensorVariable(), TorchVariable(<built-in function sqrt>)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR attention_head_size [TensorVariable(), TorchVariable(<built-in function sqrt>), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [TensorVariable(), TorchVariable(<built-in function sqrt>), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_TRUE_DIVIDE None [TensorVariable(), ConstantVariable(float)]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST attention_scores [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_mask []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP is not [TensorVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 512 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_scores []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_mask [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_ADD None [TensorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST attention_scores [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_GLOBAL nn []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR functional [TorchVariable(<module 'torch.nn' from '/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/torch/nn/__init__.py'>)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR softmax [TorchVariable(<module 'torch.nn.functional' from '/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/torch/nn/functional.py'>)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_scores [TorchVariable(<function softmax at 0x7f6a1d0fb5e0>)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST -1 [TorchVariable(<function softmax at 0x7f6a1d0fb5e0>), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST ('dim',) [TorchVariable(<function softmax at 0x7f6a1d0fb5e0>), TensorVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION_KW 2 [TorchVariable(<function softmax at 0x7f6a1d0fb5e0>), TensorVariable(), ConstantVariable(int), ConstantVariable(tuple)]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST attention_probs [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR dropout [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_probs [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST attention_probs [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST head_mask []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP is not [ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 556 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_GLOBAL torch []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR matmul [TorchVariable(<module 'torch' from '/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/torch/__init__.py'>)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_probs [TorchVariable(<built-in method matmul of type object at 0x7f6bb096cb40>)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST value_layer [TorchVariable(<built-in method matmul of type object at 0x7f6bb096cb40>), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 2 [TorchVariable(<built-in method matmul of type object at 0x7f6bb096cb40>), TensorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST context_layer [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST context_layer []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR permute [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 0 [GetAttrVariable(TensorVariable(), permute)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 2 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 1 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 3 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 4 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR contiguous [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 0 [GetAttrVariable(TensorVariable(), contiguous)]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST context_layer [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST context_layer []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR size [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 0 [GetAttrVariable(TensorVariable(), size)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [SizeVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST -2 [SizeVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_SLICE 2 [SizeVariable(), ConstantVariable(NoneType), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_SUBSCR None [SizeVariable(), SliceVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [SizeVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR all_head_size [SizeVariable(), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_TUPLE 1 [SizeVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_ADD None [SizeVariable(), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST new_context_layer_shape [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST context_layer []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR view [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST new_context_layer_shape [GetAttrVariable(TensorVariable(), view)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [GetAttrVariable(TensorVariable(), view), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST context_layer [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST output_attentions []
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 636 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST context_layer []
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_TUPLE 1 [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST outputs [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR is_decoder [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 660 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST outputs []
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object forward at 0x7f6a101407c0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 279>
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST self_outputs [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR output [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self_outputs [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 0 [NNModuleVariable(), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_SUBSCR None [NNModuleVariable(), TupleVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 2 [NNModuleVariable(), TensorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object forward at 0x7f6a101409d0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 381>
382 0 LOAD_FAST 0 (self)
2 LOAD_METHOD 0 (dense)
4 LOAD_FAST 1 (hidden_states)
6 CALL_METHOD 1
8 STORE_FAST 1 (hidden_states)
383 10 LOAD_FAST 0 (self)
12 LOAD_METHOD 1 (dropout)
14 LOAD_FAST 1 (hidden_states)
16 CALL_METHOD 1
18 STORE_FAST 1 (hidden_states)
384 20 LOAD_FAST 0 (self)
22 LOAD_METHOD 2 (LayerNorm)
24 LOAD_FAST 1 (hidden_states)
26 LOAD_FAST 2 (input_tensor)
28 BINARY_ADD
30 CALL_METHOD 1
32 STORE_FAST 1 (hidden_states)
385 34 LOAD_FAST 1 (hidden_states)
36 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR dense [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST hidden_states [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR dropout [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST hidden_states [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR LayerNorm [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST input_tensor [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_ADD None [NNModuleVariable(), TensorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST hidden_states [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states []
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object forward at 0x7f6a101409d0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 381>
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST attention_output [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_output []
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_TUPLE 1 [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self_outputs [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 1 [TupleVariable(), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [TupleVariable(), TupleVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_SLICE 2 [TupleVariable(), TupleVariable(), ConstantVariable(int), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_SUBSCR None [TupleVariable(), TupleVariable(), SliceVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_ADD None [TupleVariable(), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST outputs [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST outputs []
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object forward at 0x7f6a10140c90, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 413>
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST self_attention_outputs [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self_attention_outputs []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 0 [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_SUBSCR None [TupleVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST attention_output [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR is_decoder [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 80 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self_attention_outputs []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 1 [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [TupleVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_SLICE 2 [TupleVariable(), ConstantVariable(int), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_SUBSCR None [TupleVariable(), SliceVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST outputs [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None []
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST cross_attn_present_key_value [ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR is_decoder [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 222 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_GLOBAL apply_chunking_to_forward []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [UserFunctionVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR feed_forward_chunk [UserFunctionVariable(), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [UserFunctionVariable(), UserMethodVariable(<function BertLayer.feed_forward_chunk at 0x7f69ff5474c0>, NNModuleVariable())]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR chunk_size_feed_forward [UserFunctionVariable(), UserMethodVariable(<function BertLayer.feed_forward_chunk at 0x7f69ff5474c0>, NNModuleVariable()), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [UserFunctionVariable(), UserMethodVariable(<function BertLayer.feed_forward_chunk at 0x7f69ff5474c0>, NNModuleVariable()), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR seq_len_dim [UserFunctionVariable(), UserMethodVariable(<function BertLayer.feed_forward_chunk at 0x7f69ff5474c0>, NNModuleVariable()), ConstantVariable(int), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_output [UserFunctionVariable(), UserMethodVariable(<function BertLayer.feed_forward_chunk at 0x7f69ff5474c0>, NNModuleVariable()), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 4 [UserFunctionVariable(), UserMethodVariable(<function BertLayer.feed_forward_chunk at 0x7f69ff5474c0>, NNModuleVariable()), ConstantVariable(int), ConstantVariable(int), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object apply_chunking_to_forward at 0x7f6a10006030, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/pytorch_utils.py", line 169>
207 0 LOAD_GLOBAL 0 (len)
2 LOAD_FAST 3 (input_tensors)
4 CALL_FUNCTION 1
6 LOAD_CONST 1 (0)
8 COMPARE_OP 4 (>)
10 POP_JUMP_IF_TRUE 26
12 LOAD_GLOBAL 1 (AssertionError)
14 LOAD_FAST 3 (input_tensors)
16 FORMAT_VALUE 0
18 LOAD_CONST 2 (' has to be a tuple/list of tensors')
20 BUILD_STRING 2
22 CALL_FUNCTION 1
24 RAISE_VARARGS 1
210 >> 26 LOAD_GLOBAL 0 (len)
28 LOAD_GLOBAL 2 (inspect)
30 LOAD_METHOD 3 (signature)
32 LOAD_DEREF 1 (forward_fn)
34 CALL_METHOD 1
36 LOAD_ATTR 4 (parameters)
38 CALL_FUNCTION 1
40 STORE_FAST 4 (num_args_in_forward_chunk_fn)
211 42 LOAD_FAST 4 (num_args_in_forward_chunk_fn)
44 LOAD_GLOBAL 0 (len)
46 LOAD_FAST 3 (input_tensors)
48 CALL_FUNCTION 1
50 COMPARE_OP 3 (!=)
52 POP_JUMP_IF_FALSE 80
212 54 LOAD_GLOBAL 5 (ValueError)
213 56 LOAD_CONST 3 ('forward_chunk_fn expects ')
58 LOAD_FAST 4 (num_args_in_forward_chunk_fn)
60 FORMAT_VALUE 0
62 LOAD_CONST 4 (' arguments, but only ')
64 LOAD_GLOBAL 0 (len)
66 LOAD_FAST 3 (input_tensors)
68 CALL_FUNCTION 1
70 FORMAT_VALUE 0
72 LOAD_CONST 5 (' input tensors are given')
74 BUILD_STRING 5
212 76 CALL_FUNCTION 1
78 RAISE_VARARGS 1
217 >> 80 LOAD_FAST 1 (chunk_size)
82 LOAD_CONST 1 (0)
84 COMPARE_OP 4 (>)
86 EXTENDED_ARG 1
88 POP_JUMP_IF_FALSE 288
218 90 LOAD_FAST 3 (input_tensors)
92 LOAD_CONST 1 (0)
94 BINARY_SUBSCR
96 LOAD_ATTR 6 (shape)
98 LOAD_DEREF 0 (chunk_dim)
100 BINARY_SUBSCR
102 STORE_FAST 5 (tensor_shape)
219 104 LOAD_FAST 3 (input_tensors)
106 GET_ITER
>> 108 FOR_ITER 44 (to 154)
110 STORE_FAST 6 (input_tensor)
220 112 LOAD_FAST 6 (input_tensor)
114 LOAD_ATTR 6 (shape)
116 LOAD_DEREF 0 (chunk_dim)
118 BINARY_SUBSCR
120 LOAD_FAST 5 (tensor_shape)
122 COMPARE_OP 3 (!=)
124 POP_JUMP_IF_FALSE 108
221 126 LOAD_GLOBAL 5 (ValueError)
222 128 LOAD_CONST 6 ('All input tenors have to be of the same shape: ')
130 LOAD_FAST 5 (tensor_shape)
132 FORMAT_VALUE 0
134 LOAD_CONST 7 (', found shape ')
136 LOAD_FAST 6 (input_tensor)
138 LOAD_ATTR 6 (shape)
140 LOAD_DEREF 0 (chunk_dim)
142 BINARY_SUBSCR
144 FORMAT_VALUE 0
146 BUILD_STRING 4
221 148 CALL_FUNCTION 1
150 RAISE_VARARGS 1
152 JUMP_ABSOLUTE 108
226 >> 154 LOAD_FAST 3 (input_tensors)
156 LOAD_CONST 1 (0)
158 BINARY_SUBSCR
160 LOAD_ATTR 6 (shape)
162 LOAD_DEREF 0 (chunk_dim)
164 BINARY_SUBSCR
166 LOAD_FAST 1 (chunk_size)
168 BINARY_MODULO
170 LOAD_CONST 1 (0)
172 COMPARE_OP 3 (!=)
174 POP_JUMP_IF_FALSE 206
227 176 LOAD_GLOBAL 5 (ValueError)
228 178 LOAD_CONST 8 ('The dimension to be chunked ')
180 LOAD_FAST 3 (input_tensors)
182 LOAD_CONST 1 (0)
184 BINARY_SUBSCR
186 LOAD_ATTR 6 (shape)
188 LOAD_DEREF 0 (chunk_dim)
190 BINARY_SUBSCR
192 FORMAT_VALUE 0
194 LOAD_CONST 9 (' has to be a multiple of the chunk size ')
196 LOAD_FAST 1 (chunk_size)
198 FORMAT_VALUE 0
200 BUILD_STRING 4
227 202 CALL_FUNCTION 1
204 RAISE_VARARGS 1
232 >> 206 LOAD_FAST 3 (input_tensors)
208 LOAD_CONST 1 (0)
210 BINARY_SUBSCR
212 LOAD_ATTR 6 (shape)
214 LOAD_DEREF 0 (chunk_dim)
216 BINARY_SUBSCR
218 LOAD_FAST 1 (chunk_size)
220 BINARY_FLOOR_DIVIDE
222 STORE_DEREF 2 (num_chunks)
235 224 LOAD_GLOBAL 7 (tuple)
226 LOAD_CLOSURE 0 (chunk_dim)
228 LOAD_CLOSURE 2 (num_chunks)
230 BUILD_TUPLE 2
232 LOAD_CONST 10 (<code object <genexpr> at 0x7f6a10083ea0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/pytorch_utils.py", line 235>)
234 LOAD_CONST 11 ('apply_chunking_to_forward.<locals>.<genexpr>')
236 MAKE_FUNCTION 8 (closure)
238 LOAD_FAST 3 (input_tensors)
240 GET_ITER
242 CALL_FUNCTION 1
244 CALL_FUNCTION 1
246 STORE_FAST 7 (input_tensors_chunks)
237 248 LOAD_GLOBAL 7 (tuple)
250 LOAD_CLOSURE 1 (forward_fn)
252 BUILD_TUPLE 1
254 LOAD_CONST 12 (<code object <genexpr> at 0x7f6a10083f50, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/pytorch_utils.py", line 237>)
256 LOAD_CONST 11 ('apply_chunking_to_forward.<locals>.<genexpr>')
258 MAKE_FUNCTION 8 (closure)
260 LOAD_GLOBAL 8 (zip)
262 LOAD_FAST 7 (input_tensors_chunks)
264 CALL_FUNCTION_EX 0
266 GET_ITER
268 CALL_FUNCTION 1
270 CALL_FUNCTION 1
272 STORE_FAST 8 (output_chunks)
239 274 LOAD_GLOBAL 9 (torch)
276 LOAD_ATTR 10 (cat)
278 LOAD_FAST 8 (output_chunks)
280 LOAD_DEREF 0 (chunk_dim)
282 LOAD_CONST 13 (('dim',))
284 CALL_FUNCTION_KW 2
286 RETURN_VALUE
241 >> 288 LOAD_DEREF 1 (forward_fn)
290 LOAD_FAST 3 (input_tensors)
292 CALL_FUNCTION_EX 0
294 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_GLOBAL len []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST input_tensors [BuiltinVariable(len)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [BuiltinVariable(len), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 0 [ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP > [ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_TRUE 26 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_GLOBAL len []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_GLOBAL inspect [BuiltinVariable(len)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR signature [BuiltinVariable(len), PythonModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_DEREF forward_fn [BuiltinVariable(len), LambdaVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [BuiltinVariable(len), LambdaVariable(), UserMethodVariable(<function BertLayer.feed_forward_chunk at 0x7f69ff5474c0>, NNModuleVariable())]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR parameters [BuiltinVariable(len), InspectSignatureVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [BuiltinVariable(len), GetAttrVariable(InspectSignatureVariable(), parameters)]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST num_args_in_forward_chunk_fn [ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST num_args_in_forward_chunk_fn []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_GLOBAL len [ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST input_tensors [ConstantVariable(int), BuiltinVariable(len)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [ConstantVariable(int), BuiltinVariable(len), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP != [ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 80 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST chunk_size []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 0 [ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP > [ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 288 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_DEREF forward_fn []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST input_tensors [UserMethodVariable(<function BertLayer.feed_forward_chunk at 0x7f69ff5474c0>, NNModuleVariable())]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION_EX 0 [UserMethodVariable(<function BertLayer.feed_forward_chunk at 0x7f69ff5474c0>, NNModuleVariable()), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object feed_forward_chunk at 0x7f6a100c83a0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 546>
547 0 LOAD_FAST 0 (self)
2 LOAD_METHOD 0 (intermediate)
4 LOAD_FAST 1 (attention_output)
6 CALL_METHOD 1
8 STORE_FAST 2 (intermediate_output)
548 10 LOAD_FAST 0 (self)
12 LOAD_METHOD 1 (output)
14 LOAD_FAST 2 (intermediate_output)
16 LOAD_FAST 1 (attention_output)
18 CALL_METHOD 2
20 STORE_FAST 3 (layer_output)
549 22 LOAD_FAST 3 (layer_output)
24 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR intermediate [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_output [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object forward at 0x7f6a10140ea0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 446>
447 0 LOAD_FAST 0 (self)
2 LOAD_METHOD 0 (dense)
4 LOAD_FAST 1 (hidden_states)
6 CALL_METHOD 1
8 STORE_FAST 1 (hidden_states)
448 10 LOAD_FAST 0 (self)
12 LOAD_METHOD 1 (intermediate_act_fn)
14 LOAD_FAST 1 (hidden_states)
16 CALL_METHOD 1
18 STORE_FAST 1 (hidden_states)
449 20 LOAD_FAST 1 (hidden_states)
22 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR dense [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST hidden_states [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR intermediate_act_fn [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object forward at 0x7f6a100e12f0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/activations.py", line 55>
56 0 LOAD_FAST 0 (self)
2 LOAD_METHOD 0 (act)
4 LOAD_FAST 1 (input)
6 CALL_METHOD 1
8 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR act [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST input [TorchVariable(<built-in function gelu>)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [TorchVariable(<built-in function gelu>), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object forward at 0x7f6a100e12f0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/activations.py", line 55>
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST hidden_states [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states []
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object forward at 0x7f6a10140ea0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 446>
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST intermediate_output [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR output [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST intermediate_output [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_output [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 2 [NNModuleVariable(), TensorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object forward at 0x7f6a100c80e0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 459>
460 0 LOAD_FAST 0 (self)
2 LOAD_METHOD 0 (dense)
4 LOAD_FAST 1 (hidden_states)
6 CALL_METHOD 1
8 STORE_FAST 1 (hidden_states)
461 10 LOAD_FAST 0 (self)
12 LOAD_METHOD 1 (dropout)
14 LOAD_FAST 1 (hidden_states)
16 CALL_METHOD 1
18 STORE_FAST 1 (hidden_states)
462 20 LOAD_FAST 0 (self)
22 LOAD_METHOD 2 (LayerNorm)
24 LOAD_FAST 1 (hidden_states)
26 LOAD_FAST 2 (input_tensor)
28 BINARY_ADD
30 CALL_METHOD 1
32 STORE_FAST 1 (hidden_states)
463 34 LOAD_FAST 1 (hidden_states)
36 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR dense [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST hidden_states [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR dropout [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST hidden_states [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR LayerNorm [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST input_tensor [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_ADD None [NNModuleVariable(), TensorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST hidden_states [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states []
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object forward at 0x7f6a100c80e0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 459>
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST layer_output [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST layer_output []
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object feed_forward_chunk at 0x7f6a100c83a0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 546>
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object apply_chunking_to_forward at 0x7f6a10006030, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/pytorch_utils.py", line 169>
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST layer_output [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST layer_output []
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_TUPLE 1 [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST outputs [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_ADD None [TupleVariable(), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST outputs [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR is_decoder [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 270 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST outputs []
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object forward at 0x7f6a100c82f0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 481>
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST layer_outputs [ListIteratorVariable(), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST layer_outputs [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 0 [ListIteratorVariable(), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_SUBSCR None [ListIteratorVariable(), TupleVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST hidden_states [ListIteratorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST use_cache [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 248 [ListIteratorVariable(), ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_DEREF output_attentions [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 66 [ListIteratorVariable(), ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE FOR_ITER 290 [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE UNPACK_SEQUENCE 2 [ListIteratorVariable(), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST i [ListIteratorVariable(), NNModuleVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST layer_module [ListIteratorVariable(), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST output_hidden_states [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 88 [ListIteratorVariable(), ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST head_mask [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [ListIteratorVariable(), ListVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP is not [ListIteratorVariable(), ListVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 104 [ListIteratorVariable(), ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST head_mask [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST i [ListIteratorVariable(), ListVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_SUBSCR None [ListIteratorVariable(), ListVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE JUMP_FORWARD 106 [ListIteratorVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST layer_head_mask [ListIteratorVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST past_key_values [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [ListIteratorVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP is not [ListIteratorVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 124 [ListIteratorVariable(), ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_DEREF past_key_value [ListIteratorVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR gradient_checkpointing [ListIteratorVariable(), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 202 [ListIteratorVariable(), ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST layer_module [ListIteratorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [ListIteratorVariable(), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_mask [ListIteratorVariable(), NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST layer_head_mask [ListIteratorVariable(), NNModuleVariable(), TensorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST encoder_hidden_states [ListIteratorVariable(), NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST encoder_attention_mask [ListIteratorVariable(), NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_DEREF past_key_value [ListIteratorVariable(), NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_DEREF output_attentions [ListIteratorVariable(), NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 7 [ListIteratorVariable(), NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object forward at 0x7f6a100c82f0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 481>
492 0 LOAD_FAST 6 (past_key_value)
2 LOAD_CONST 0 (None)
4 COMPARE_OP 9 (is not)
6 POP_JUMP_IF_FALSE 20
8 LOAD_FAST 6 (past_key_value)
10 LOAD_CONST 0 (None)
12 LOAD_CONST 1 (2)
14 BUILD_SLICE 2
16 BINARY_SUBSCR
18 JUMP_FORWARD 2 (to 22)
>> 20 LOAD_CONST 0 (None)
>> 22 STORE_FAST 8 (self_attn_past_key_value)
493 24 LOAD_FAST 0 (self)
26 LOAD_ATTR 0 (attention)
494 28 LOAD_FAST 1 (hidden_states)
495 30 LOAD_FAST 2 (attention_mask)
496 32 LOAD_FAST 3 (head_mask)
497 34 LOAD_FAST 7 (output_attentions)
498 36 LOAD_FAST 8 (self_attn_past_key_value)
493 38 LOAD_CONST 2 (('output_attentions', 'past_key_value'))
40 CALL_FUNCTION_KW 5
42 STORE_FAST 9 (self_attention_outputs)
500 44 LOAD_FAST 9 (self_attention_outputs)
46 LOAD_CONST 3 (0)
48 BINARY_SUBSCR
50 STORE_FAST 10 (attention_output)
503 52 LOAD_FAST 0 (self)
54 LOAD_ATTR 1 (is_decoder)
56 POP_JUMP_IF_FALSE 80
504 58 LOAD_FAST 9 (self_attention_outputs)
60 LOAD_CONST 4 (1)
62 LOAD_CONST 5 (-1)
64 BUILD_SLICE 2
66 BINARY_SUBSCR
68 STORE_FAST 11 (outputs)
505 70 LOAD_FAST 9 (self_attention_outputs)
72 LOAD_CONST 5 (-1)
74 BINARY_SUBSCR
76 STORE_FAST 12 (present_key_value)
78 JUMP_FORWARD 12 (to 92)
507 >> 80 LOAD_FAST 9 (self_attention_outputs)
82 LOAD_CONST 4 (1)
84 LOAD_CONST 0 (None)
86 BUILD_SLICE 2
88 BINARY_SUBSCR
90 STORE_FAST 11 (outputs)
509 >> 92 LOAD_CONST 0 (None)
94 STORE_FAST 13 (cross_attn_present_key_value)
510 96 LOAD_FAST 0 (self)
98 LOAD_ATTR 1 (is_decoder)
100 POP_JUMP_IF_FALSE 222
102 LOAD_FAST 4 (encoder_hidden_states)
104 LOAD_CONST 0 (None)
106 COMPARE_OP 9 (is not)
108 POP_JUMP_IF_FALSE 222
511 110 LOAD_GLOBAL 2 (hasattr)
112 LOAD_FAST 0 (self)
114 LOAD_CONST 6 ('crossattention')
116 CALL_FUNCTION 2
118 POP_JUMP_IF_TRUE 136
512 120 LOAD_GLOBAL 3 (ValueError)
513 122 LOAD_CONST 7 ('If `encoder_hidden_states` are passed, ')
124 LOAD_FAST 0 (self)
126 FORMAT_VALUE 0
128 LOAD_CONST 8 (' has to be instantiated with cross-attention layers by setting `config.add_cross_attention=True`')
130 BUILD_STRING 3
512 132 CALL_FUNCTION 1
134 RAISE_VARARGS 1
518 >> 136 LOAD_FAST 6 (past_key_value)
138 LOAD_CONST 0 (None)
140 COMPARE_OP 9 (is not)
142 POP_JUMP_IF_FALSE 156
144 LOAD_FAST 6 (past_key_value)
146 LOAD_CONST 9 (-2)
148 LOAD_CONST 0 (None)
150 BUILD_SLICE 2
152 BINARY_SUBSCR
154 JUMP_FORWARD 2 (to 158)
>> 156 LOAD_CONST 0 (None)
>> 158 STORE_FAST 14 (cross_attn_past_key_value)
519 160 LOAD_FAST 0 (self)
162 LOAD_METHOD 4 (crossattention)
520 164 LOAD_FAST 10 (attention_output)
521 166 LOAD_FAST 2 (attention_mask)
522 168 LOAD_FAST 3 (head_mask)
523 170 LOAD_FAST 4 (encoder_hidden_states)
524 172 LOAD_FAST 5 (encoder_attention_mask)
525 174 LOAD_FAST 14 (cross_attn_past_key_value)
526 176 LOAD_FAST 7 (output_attentions)
519 178 CALL_METHOD 7
180 STORE_FAST 15 (cross_attention_outputs)
528 182 LOAD_FAST 15 (cross_attention_outputs)
184 LOAD_CONST 3 (0)
186 BINARY_SUBSCR
188 STORE_FAST 10 (attention_output)
529 190 LOAD_FAST 11 (outputs)
192 LOAD_FAST 15 (cross_attention_outputs)
194 LOAD_CONST 4 (1)
196 LOAD_CONST 5 (-1)
198 BUILD_SLICE 2
200 BINARY_SUBSCR
202 BINARY_ADD
204 STORE_FAST 11 (outputs)
532 206 LOAD_FAST 15 (cross_attention_outputs)
208 LOAD_CONST 5 (-1)
210 BINARY_SUBSCR
212 STORE_FAST 13 (cross_attn_present_key_value)
533 214 LOAD_FAST 12 (present_key_value)
216 LOAD_FAST 13 (cross_attn_present_key_value)
218 BINARY_ADD
220 STORE_FAST 12 (present_key_value)
535 >> 222 LOAD_GLOBAL 5 (apply_chunking_to_forward)
536 224 LOAD_FAST 0 (self)
226 LOAD_ATTR 6 (feed_forward_chunk)
228 LOAD_FAST 0 (self)
230 LOAD_ATTR 7 (chunk_size_feed_forward)
232 LOAD_FAST 0 (self)
234 LOAD_ATTR 8 (seq_len_dim)
236 LOAD_FAST 10 (attention_output)
535 238 CALL_FUNCTION 4
240 STORE_FAST 16 (layer_output)
538 242 LOAD_FAST 16 (layer_output)
244 BUILD_TUPLE 1
246 LOAD_FAST 11 (outputs)
248 BINARY_ADD
250 STORE_FAST 11 (outputs)
541 252 LOAD_FAST 0 (self)
254 LOAD_ATTR 1 (is_decoder)
256 EXTENDED_ARG 1
258 POP_JUMP_IF_FALSE 270
542 260 LOAD_FAST 11 (outputs)
262 LOAD_FAST 12 (present_key_value)
264 BUILD_TUPLE 1
266 BINARY_ADD
268 STORE_FAST 11 (outputs)
544 >> 270 LOAD_FAST 11 (outputs)
272 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST past_key_value []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP is not [ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 20 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None []
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST self_attn_past_key_value [ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR attention [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_mask [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST head_mask [NNModuleVariable(), TensorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST output_attentions [NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self_attn_past_key_value [NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST ('output_attentions', 'past_key_value') [NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(bool), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION_KW 5 [NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(bool), ConstantVariable(NoneType), ConstantVariable(tuple)]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object forward at 0x7f6a10140c90, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 413>
423 0 LOAD_FAST 0 (self)
2 LOAD_METHOD 0 (self)
424 4 LOAD_FAST 1 (hidden_states)
425 6 LOAD_FAST 2 (attention_mask)
426 8 LOAD_FAST 3 (head_mask)
427 10 LOAD_FAST 4 (encoder_hidden_states)
428 12 LOAD_FAST 5 (encoder_attention_mask)
429 14 LOAD_FAST 6 (past_key_value)
430 16 LOAD_FAST 7 (output_attentions)
423 18 CALL_METHOD 7
20 STORE_FAST 8 (self_outputs)
432 22 LOAD_FAST 0 (self)
24 LOAD_METHOD 1 (output)
26 LOAD_FAST 8 (self_outputs)
28 LOAD_CONST 1 (0)
30 BINARY_SUBSCR
32 LOAD_FAST 1 (hidden_states)
34 CALL_METHOD 2
36 STORE_FAST 9 (attention_output)
433 38 LOAD_FAST 9 (attention_output)
40 BUILD_TUPLE 1
42 LOAD_FAST 8 (self_outputs)
44 LOAD_CONST 2 (1)
46 LOAD_CONST 0 (None)
48 BUILD_SLICE 2
50 BINARY_SUBSCR
52 BINARY_ADD
54 STORE_FAST 10 (outputs)
434 56 LOAD_FAST 10 (outputs)
58 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR self [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST attention_mask [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST head_mask [NNModuleVariable(), TensorVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST encoder_hidden_states [NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST encoder_attention_mask [NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST past_key_value [NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST output_attentions [NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 7 [NNModuleVariable(), TensorVariable(), TensorVariable(), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(NoneType), ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object forward at 0x7f6a101407c0, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 279>
289 0 LOAD_FAST 0 (self)
2 LOAD_METHOD 0 (query)
4 LOAD_FAST 1 (hidden_states)
6 CALL_METHOD 1
8 STORE_FAST 8 (mixed_query_layer)
294 10 LOAD_FAST 4 (encoder_hidden_states)
12 LOAD_CONST 0 (None)
14 COMPARE_OP 9 (is not)
16 STORE_FAST 9 (is_cross_attention)
296 18 LOAD_FAST 9 (is_cross_attention)
20 POP_JUMP_IF_FALSE 52
22 LOAD_FAST 6 (past_key_value)
24 LOAD_CONST 0 (None)
26 COMPARE_OP 9 (is not)
28 POP_JUMP_IF_FALSE 52
298 30 LOAD_FAST 6 (past_key_value)
32 LOAD_CONST 1 (0)
34 BINARY_SUBSCR
36 STORE_FAST 10 (key_layer)
299 38 LOAD_FAST 6 (past_key_value)
40 LOAD_CONST 2 (1)
42 BINARY_SUBSCR
44 STORE_FAST 11 (value_layer)
300 46 LOAD_FAST 5 (encoder_attention_mask)
48 STORE_FAST 2 (attention_mask)
50 JUMP_FORWARD 160 (to 212)
301 >> 52 LOAD_FAST 9 (is_cross_attention)
54 POP_JUMP_IF_FALSE 94
302 56 LOAD_FAST 0 (self)
58 LOAD_METHOD 1 (transpose_for_scores)
60 LOAD_FAST 0 (self)
62 LOAD_METHOD 2 (key)
64 LOAD_FAST 4 (encoder_hidden_states)
66 CALL_METHOD 1
68 CALL_METHOD 1
70 STORE_FAST 10 (key_layer)
303 72 LOAD_FAST 0 (self)
74 LOAD_METHOD 1 (transpose_for_scores)
76 LOAD_FAST 0 (self)
78 LOAD_METHOD 3 (value)
80 LOAD_FAST 4 (encoder_hidden_states)
82 CALL_METHOD 1
84 CALL_METHOD 1
86 STORE_FAST 11 (value_layer)
304 88 LOAD_FAST 5 (encoder_attention_mask)
90 STORE_FAST 2 (attention_mask)
92 JUMP_FORWARD 118 (to 212)
305 >> 94 LOAD_FAST 6 (past_key_value)
96 LOAD_CONST 0 (None)
98 COMPARE_OP 9 (is not)
100 POP_JUMP_IF_FALSE 180
306 102 LOAD_FAST 0 (self)
104 LOAD_METHOD 1 (transpose_for_scores)
106 LOAD_FAST 0 (self)
108 LOAD_METHOD 2 (key)
110 LOAD_FAST 1 (hidden_states)
112 CALL_METHOD 1
114 CALL_METHOD 1
116 STORE_FAST 10 (key_layer)
307 118 LOAD_FAST 0 (self)
120 LOAD_METHOD 1 (transpose_for_scores)
122 LOAD_FAST 0 (self)
124 LOAD_METHOD 3 (value)
126 LOAD_FAST 1 (hidden_states)
128 CALL_METHOD 1
130 CALL_METHOD 1
132 STORE_FAST 11 (value_layer)
308 134 LOAD_GLOBAL 4 (torch)
136 LOAD_ATTR 5 (cat)
138 LOAD_FAST 6 (past_key_value)
140 LOAD_CONST 1 (0)
142 BINARY_SUBSCR
144 LOAD_FAST 10 (key_layer)
146 BUILD_LIST 2
148 LOAD_CONST 3 (2)
150 LOAD_CONST 4 (('dim',))
152 CALL_FUNCTION_KW 2
154 STORE_FAST 10 (key_layer)
309 156 LOAD_GLOBAL 4 (torch)
158 LOAD_ATTR 5 (cat)
160 LOAD_FAST 6 (past_key_value)
162 LOAD_CONST 2 (1)
164 BINARY_SUBSCR
166 LOAD_FAST 11 (value_layer)
168 BUILD_LIST 2
170 LOAD_CONST 3 (2)
172 LOAD_CONST 4 (('dim',))
174 CALL_FUNCTION_KW 2
176 STORE_FAST 11 (value_layer)
178 JUMP_FORWARD 32 (to 212)
311 >> 180 LOAD_FAST 0 (self)
182 LOAD_METHOD 1 (transpose_for_scores)
184 LOAD_FAST 0 (self)
186 LOAD_METHOD 2 (key)
188 LOAD_FAST 1 (hidden_states)
190 CALL_METHOD 1
192 CALL_METHOD 1
194 STORE_FAST 10 (key_layer)
312 196 LOAD_FAST 0 (self)
198 LOAD_METHOD 1 (transpose_for_scores)
200 LOAD_FAST 0 (self)
202 LOAD_METHOD 3 (value)
204 LOAD_FAST 1 (hidden_states)
206 CALL_METHOD 1
208 CALL_METHOD 1
210 STORE_FAST 11 (value_layer)
314 >> 212 LOAD_FAST 0 (self)
214 LOAD_METHOD 1 (transpose_for_scores)
216 LOAD_FAST 8 (mixed_query_layer)
218 CALL_METHOD 1
220 STORE_FAST 12 (query_layer)
316 222 LOAD_FAST 0 (self)
224 LOAD_ATTR 6 (is_decoder)
226 POP_JUMP_IF_FALSE 236
324 228 LOAD_FAST 10 (key_layer)
230 LOAD_FAST 11 (value_layer)
232 BUILD_TUPLE 2
234 STORE_FAST 6 (past_key_value)
327 >> 236 LOAD_GLOBAL 4 (torch)
238 LOAD_METHOD 7 (matmul)
240 LOAD_FAST 12 (query_layer)
242 LOAD_FAST 10 (key_layer)
244 LOAD_METHOD 8 (transpose)
246 LOAD_CONST 5 (-1)
248 LOAD_CONST 6 (-2)
250 CALL_METHOD 2
252 CALL_METHOD 2
254 STORE_FAST 13 (attention_scores)
329 256 LOAD_FAST 0 (self)
258 LOAD_ATTR 9 (position_embedding_type)
260 LOAD_CONST 7 ('relative_key')
262 COMPARE_OP 2 (==)
264 EXTENDED_ARG 1
266 POP_JUMP_IF_TRUE 280
268 LOAD_FAST 0 (self)
270 LOAD_ATTR 9 (position_embedding_type)
272 LOAD_CONST 8 ('relative_key_query')
274 COMPARE_OP 2 (==)
276 EXTENDED_ARG 1
278 POP_JUMP_IF_FALSE 478
330 >> 280 LOAD_FAST 1 (hidden_states)
282 LOAD_METHOD 10 (size)
284 CALL_METHOD 0
286 LOAD_CONST 2 (1)
288 BINARY_SUBSCR
290 STORE_FAST 14 (seq_length)
331 292 LOAD_GLOBAL 4 (torch)
294 LOAD_ATTR 11 (arange)
296 LOAD_FAST 14 (seq_length)
298 LOAD_GLOBAL 4 (torch)
300 LOAD_ATTR 12 (long)
302 LOAD_FAST 1 (hidden_states)
304 LOAD_ATTR 13 (device)
306 LOAD_CONST 9 (('dtype', 'device'))
308 CALL_FUNCTION_KW 3
310 LOAD_METHOD 14 (view)
312 LOAD_CONST 5 (-1)
314 LOAD_CONST 2 (1)
316 CALL_METHOD 2
318 STORE_FAST 15 (position_ids_l)
332 320 LOAD_GLOBAL 4 (torch)
322 LOAD_ATTR 11 (arange)
324 LOAD_FAST 14 (seq_length)
326 LOAD_GLOBAL 4 (torch)
328 LOAD_ATTR 12 (long)
330 LOAD_FAST 1 (hidden_states)
332 LOAD_ATTR 13 (device)
334 LOAD_CONST 9 (('dtype', 'device'))
336 CALL_FUNCTION_KW 3
338 LOAD_METHOD 14 (view)
340 LOAD_CONST 2 (1)
342 LOAD_CONST 5 (-1)
344 CALL_METHOD 2
346 STORE_FAST 16 (position_ids_r)
333 348 LOAD_FAST 15 (position_ids_l)
350 LOAD_FAST 16 (position_ids_r)
352 BINARY_SUBTRACT
354 STORE_FAST 17 (distance)
334 356 LOAD_FAST 0 (self)
358 LOAD_METHOD 15 (distance_embedding)
360 LOAD_FAST 17 (distance)
362 LOAD_FAST 0 (self)
364 LOAD_ATTR 16 (max_position_embeddings)
366 BINARY_ADD
368 LOAD_CONST 2 (1)
370 BINARY_SUBTRACT
372 CALL_METHOD 1
374 STORE_FAST 18 (positional_embedding)
335 376 LOAD_FAST 18 (positional_embedding)
378 LOAD_ATTR 17 (to)
380 LOAD_FAST 12 (query_layer)
382 LOAD_ATTR 18 (dtype)
384 LOAD_CONST 10 (('dtype',))
386 CALL_FUNCTION_KW 1
388 STORE_FAST 18 (positional_embedding)
337 390 LOAD_FAST 0 (self)
392 LOAD_ATTR 9 (position_embedding_type)
394 LOAD_CONST 7 ('relative_key')
396 COMPARE_OP 2 (==)
398 EXTENDED_ARG 1
400 POP_JUMP_IF_FALSE 426
338 402 LOAD_GLOBAL 4 (torch)
404 LOAD_METHOD 19 (einsum)
406 LOAD_CONST 11 ('bhld,lrd->bhlr')
408 LOAD_FAST 12 (query_layer)
410 LOAD_FAST 18 (positional_embedding)
412 CALL_METHOD 3
414 STORE_FAST 19 (relative_position_scores)
339 416 LOAD_FAST 13 (attention_scores)
418 LOAD_FAST 19 (relative_position_scores)
420 BINARY_ADD
422 STORE_FAST 13 (attention_scores)
424 JUMP_FORWARD 52 (to 478)
340 >> 426 LOAD_FAST 0 (self)
428 LOAD_ATTR 9 (position_embedding_type)
430 LOAD_CONST 8 ('relative_key_query')
432 COMPARE_OP 2 (==)
434 EXTENDED_ARG 1
436 POP_JUMP_IF_FALSE 478
341 438 LOAD_GLOBAL 4 (torch)
440 LOAD_METHOD 19 (einsum)
442 LOAD_CONST 11 ('bhld,lrd->bhlr')
444 LOAD_FAST 12 (query_layer)
446 LOAD_FAST 18 (positional_embedding)
448 CALL_METHOD 3
450 STORE_FAST 20 (relative_position_scores_query)
342 452 LOAD_GLOBAL 4 (torch)
454 LOAD_METHOD 19 (einsum)
456 LOAD_CONST 12 ('bhrd,lrd->bhlr')
458 LOAD_FAST 10 (key_layer)
460 LOAD_FAST 18 (positional_embedding)
462 CALL_METHOD 3
464 STORE_FAST 21 (relative_position_scores_key)
343 466 LOAD_FAST 13 (attention_scores)
468 LOAD_FAST 20 (relative_position_scores_query)
470 BINARY_ADD
472 LOAD_FAST 21 (relative_position_scores_key)
474 BINARY_ADD
476 STORE_FAST 13 (attention_scores)
345 >> 478 LOAD_FAST 13 (attention_scores)
480 LOAD_GLOBAL 20 (math)
482 LOAD_METHOD 21 (sqrt)
484 LOAD_FAST 0 (self)
486 LOAD_ATTR 22 (attention_head_size)
488 CALL_METHOD 1
490 BINARY_TRUE_DIVIDE
492 STORE_FAST 13 (attention_scores)
346 494 LOAD_FAST 2 (attention_mask)
496 LOAD_CONST 0 (None)
498 COMPARE_OP 9 (is not)
500 EXTENDED_ARG 2
502 POP_JUMP_IF_FALSE 512
348 504 LOAD_FAST 13 (attention_scores)
506 LOAD_FAST 2 (attention_mask)
508 BINARY_ADD
510 STORE_FAST 13 (attention_scores)
351 >> 512 LOAD_GLOBAL 23 (nn)
514 LOAD_ATTR 24 (functional)
516 LOAD_ATTR 25 (softmax)
518 LOAD_FAST 13 (attention_scores)
520 LOAD_CONST 5 (-1)
522 LOAD_CONST 4 (('dim',))
524 CALL_FUNCTION_KW 2
526 STORE_FAST 22 (attention_probs)
355 528 LOAD_FAST 0 (self)
530 LOAD_METHOD 26 (dropout)
532 LOAD_FAST 22 (attention_probs)
534 CALL_METHOD 1
536 STORE_FAST 22 (attention_probs)
358 538 LOAD_FAST 3 (head_mask)
540 LOAD_CONST 0 (None)
542 COMPARE_OP 9 (is not)
544 EXTENDED_ARG 2
546 POP_JUMP_IF_FALSE 556
359 548 LOAD_FAST 22 (attention_probs)
550 LOAD_FAST 3 (head_mask)
552 BINARY_MULTIPLY
554 STORE_FAST 22 (attention_probs)
361 >> 556 LOAD_GLOBAL 4 (torch)
558 LOAD_METHOD 7 (matmul)
560 LOAD_FAST 22 (attention_probs)
562 LOAD_FAST 11 (value_layer)
564 CALL_METHOD 2
566 STORE_FAST 23 (context_layer)
363 568 LOAD_FAST 23 (context_layer)
570 LOAD_METHOD 27 (permute)
572 LOAD_CONST 1 (0)
574 LOAD_CONST 3 (2)
576 LOAD_CONST 2 (1)
578 LOAD_CONST 13 (3)
580 CALL_METHOD 4
582 LOAD_METHOD 28 (contiguous)
584 CALL_METHOD 0
586 STORE_FAST 23 (context_layer)
364 588 LOAD_FAST 23 (context_layer)
590 LOAD_METHOD 10 (size)
592 CALL_METHOD 0
594 LOAD_CONST 0 (None)
596 LOAD_CONST 6 (-2)
598 BUILD_SLICE 2
600 BINARY_SUBSCR
602 LOAD_FAST 0 (self)
604 LOAD_ATTR 29 (all_head_size)
606 BUILD_TUPLE 1
608 BINARY_ADD
610 STORE_FAST 24 (new_context_layer_shape)
365 612 LOAD_FAST 23 (context_layer)
614 LOAD_METHOD 14 (view)
616 LOAD_FAST 24 (new_context_layer_shape)
618 CALL_METHOD 1
620 STORE_FAST 23 (context_layer)
367 622 LOAD_FAST 7 (output_attentions)
624 EXTENDED_ARG 2
626 POP_JUMP_IF_FALSE 636
628 LOAD_FAST 23 (context_layer)
630 LOAD_FAST 22 (attention_probs)
632 BUILD_TUPLE 2
634 JUMP_FORWARD 4 (to 640)
>> 636 LOAD_FAST 23 (context_layer)
638 BUILD_TUPLE 1
>> 640 STORE_FAST 25 (outputs)
369 642 LOAD_FAST 0 (self)
644 LOAD_ATTR 6 (is_decoder)
646 EXTENDED_ARG 2
648 POP_JUMP_IF_FALSE 660
370 650 LOAD_FAST 25 (outputs)
652 LOAD_FAST 6 (past_key_value)
654 BUILD_TUPLE 1
656 BINARY_ADD
658 STORE_FAST 25 (outputs)
371 >> 660 LOAD_FAST 25 (outputs)
662 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR query [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST mixed_query_layer [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST encoder_hidden_states []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP is not [ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST is_cross_attention [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST is_cross_attention []
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 52 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST is_cross_attention []
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 94 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST past_key_value []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE COMPARE_OP is not [ConstantVariable(NoneType), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 180 [ConstantVariable(bool)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR transpose_for_scores [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable())]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR key [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable()), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable()), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable()), NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable()), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object transpose_for_scores at 0x7f6a10140710, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 274>
275 0 LOAD_FAST 1 (x)
2 LOAD_METHOD 0 (size)
4 CALL_METHOD 0
6 LOAD_CONST 0 (None)
8 LOAD_CONST 1 (-1)
10 BUILD_SLICE 2
12 BINARY_SUBSCR
14 LOAD_FAST 0 (self)
16 LOAD_ATTR 1 (num_attention_heads)
18 LOAD_FAST 0 (self)
20 LOAD_ATTR 2 (attention_head_size)
22 BUILD_TUPLE 2
24 BINARY_ADD
26 STORE_FAST 2 (new_x_shape)
276 28 LOAD_FAST 1 (x)
30 LOAD_METHOD 3 (view)
32 LOAD_FAST 2 (new_x_shape)
34 CALL_METHOD 1
36 STORE_FAST 1 (x)
277 38 LOAD_FAST 1 (x)
40 LOAD_METHOD 4 (permute)
42 LOAD_CONST 2 (0)
44 LOAD_CONST 3 (2)
46 LOAD_CONST 4 (1)
48 LOAD_CONST 5 (3)
50 CALL_METHOD 4
52 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST x []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR size [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 0 [GetAttrVariable(TensorVariable(), size)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [SizeVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST -1 [SizeVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_SLICE 2 [SizeVariable(), ConstantVariable(NoneType), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_SUBSCR None [SizeVariable(), SliceVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [SizeVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR num_attention_heads [SizeVariable(), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [SizeVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR attention_head_size [SizeVariable(), ConstantVariable(int), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_TUPLE 2 [SizeVariable(), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_ADD None [SizeVariable(), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST new_x_shape [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST x []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR view [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST new_x_shape [GetAttrVariable(TensorVariable(), view)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [GetAttrVariable(TensorVariable(), view), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST x [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST x []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR permute [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 0 [GetAttrVariable(TensorVariable(), permute)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 2 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 1 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 3 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 4 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object transpose_for_scores at 0x7f6a10140710, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 274>
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST key_layer [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR transpose_for_scores [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable())]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR value [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable()), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST hidden_states [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable()), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable()), NNModuleVariable(), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable()), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object transpose_for_scores at 0x7f6a10140710, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 274>
275 0 LOAD_FAST 1 (x)
2 LOAD_METHOD 0 (size)
4 CALL_METHOD 0
6 LOAD_CONST 0 (None)
8 LOAD_CONST 1 (-1)
10 BUILD_SLICE 2
12 BINARY_SUBSCR
14 LOAD_FAST 0 (self)
16 LOAD_ATTR 1 (num_attention_heads)
18 LOAD_FAST 0 (self)
20 LOAD_ATTR 2 (attention_head_size)
22 BUILD_TUPLE 2
24 BINARY_ADD
26 STORE_FAST 2 (new_x_shape)
276 28 LOAD_FAST 1 (x)
30 LOAD_METHOD 3 (view)
32 LOAD_FAST 2 (new_x_shape)
34 CALL_METHOD 1
36 STORE_FAST 1 (x)
277 38 LOAD_FAST 1 (x)
40 LOAD_METHOD 4 (permute)
42 LOAD_CONST 2 (0)
44 LOAD_CONST 3 (2)
46 LOAD_CONST 4 (1)
48 LOAD_CONST 5 (3)
50 CALL_METHOD 4
52 RETURN_VALUE
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST x []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR size [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 0 [GetAttrVariable(TensorVariable(), size)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST None [SizeVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST -1 [SizeVariable(), ConstantVariable(NoneType)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_SLICE 2 [SizeVariable(), ConstantVariable(NoneType), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_SUBSCR None [SizeVariable(), SliceVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [SizeVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR num_attention_heads [SizeVariable(), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self [SizeVariable(), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR attention_head_size [SizeVariable(), ConstantVariable(int), NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE BUILD_TUPLE 2 [SizeVariable(), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE BINARY_ADD None [SizeVariable(), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST new_x_shape [TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST x []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR view [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST new_x_shape [GetAttrVariable(TensorVariable(), view)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [GetAttrVariable(TensorVariable(), view), TupleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST x [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST x []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR permute [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 0 [GetAttrVariable(TensorVariable(), permute)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 2 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 1 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 3 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 4 [GetAttrVariable(TensorVariable(), permute), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int), ConstantVariable(int)]
torchdynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] DONE INLINING <code object transpose_for_scores at 0x7f6a10140710, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 274>
torchdynamo.symbolic_convert: [DEBUG] TRACE STORE_FAST value_layer [TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST self []
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR transpose_for_scores [NNModuleVariable()]
torchdynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST mixed_query_layer [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable())]
torchdynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 1 [UserMethodVariable(<function BertSelfAttention.transpose_for_scores at 0x7f69ff541d30>, NNModuleVariable()), TensorVariable()]
torchdynamo.symbolic_convert: [DEBUG] INLINING <code object transpose_for_scores at 0x7f6a10140710, file "/data/home/dberard/miniconda/envs/bench-fast/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 274>
275 0 LOAD_FAST 1 (x)
2 LOAD_METHOD 0 (size)
4 CALL_METHOD 0
6 LOAD_CONST 0 (None)
8 LOAD_CONST 1 (-1)
10 BUILD_SLICE 2
12 BINARY_SUBSCR
14 LOAD_FAST 0 (self)
16 LOAD_ATTR 1 (num_attention_heads)
18 LOAD_FAST 0 (self)
20 LOAD_ATTR 2 (attention_head_size)
22 BUILD_TUPLE 2
24 BINARY_ADD
26 STORE_FAST 2 (new_x_shape)
276 28 LOAD_FAST 1 (x)
30 LOAD_METHOD 3 (view)
32 LOAD_FAST