@neel04
Created August 26, 2022 23:11
gpu-st-p4d-24xlarge-44:28065:28065 [0] NCCL INFO Bootstrap : Using eth0:172.31.231.78<0>
gpu-st-p4d-24xlarge-44:28065:28065 [0] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-44:28065:28065 [0] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-44:28065:28065 [0] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
gpu-st-p4d-24xlarge-44:28065:28065 [0] NCCL INFO P2P plugin IBext
gpu-st-p4d-24xlarge-44:28065:28065 [0] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-44:28065:28065 [0] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-44:28065:28065 [0] NCCL INFO NET/Socket : Using [0]eth0:172.31.231.78<0> [1]eth1:172.31.225.87<0> [2]eth2:172.31.235.21<0> [3]eth3:172.31.235.87<0>
gpu-st-p4d-24xlarge-44:28065:28065 [0] NCCL INFO Using network Socket
NCCL version 2.12.12+cuda11.7
gpu-st-p4d-24xlarge-49:25778:25778 [0] NCCL INFO Bootstrap : Using eth0:172.31.237.132<0>
gpu-st-p4d-24xlarge-49:25778:25778 [0] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-49:25778:25778 [0] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-49:25778:25778 [0] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
gpu-st-p4d-24xlarge-49:25778:25778 [0] NCCL INFO P2P plugin IBext
gpu-st-p4d-24xlarge-49:25778:25778 [0] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-49:25778:25778 [0] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-49:25778:25778 [0] NCCL INFO NET/Socket : Using [0]eth0:172.31.237.132<0> [1]eth1:172.31.227.76<0> [2]eth2:172.31.232.61<0> [3]eth3:172.31.230.241<0>
gpu-st-p4d-24xlarge-49:25778:25778 [0] NCCL INFO Using network Socket
gpu-st-p4d-24xlarge-49:25781:25781 [3] NCCL INFO Bootstrap : Using eth0:172.31.237.132<0>
gpu-st-p4d-24xlarge-49:25781:25781 [3] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-49:25781:25781 [3] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-49:25781:25781 [3] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
gpu-st-p4d-24xlarge-49:25781:25781 [3] NCCL INFO P2P plugin IBext
gpu-st-p4d-24xlarge-49:25781:25781 [3] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-49:25781:25781 [3] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-49:25781:25781 [3] NCCL INFO NET/Socket : Using [0]eth0:172.31.237.132<0> [1]eth1:172.31.227.76<0> [2]eth2:172.31.232.61<0> [3]eth3:172.31.230.241<0>
gpu-st-p4d-24xlarge-49:25781:25781 [3] NCCL INFO Using network Socket
gpu-st-p4d-24xlarge-49:25777:25777 [7] NCCL INFO Bootstrap : Using eth0:172.31.237.132<0>
gpu-st-p4d-24xlarge-49:25777:25777 [7] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-49:25777:25777 [7] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-49:25777:25777 [7] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
gpu-st-p4d-24xlarge-49:25777:25777 [7] NCCL INFO P2P plugin IBext
gpu-st-p4d-24xlarge-49:25777:25777 [7] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-49:25777:25777 [7] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-49:25777:25777 [7] NCCL INFO NET/Socket : Using [0]eth0:172.31.237.132<0> [1]eth1:172.31.227.76<0> [2]eth2:172.31.232.61<0> [3]eth3:172.31.230.241<0>
gpu-st-p4d-24xlarge-49:25777:25777 [7] NCCL INFO Using network Socket
gpu-st-p4d-24xlarge-44:28068:28068 [3] NCCL INFO Bootstrap : Using eth0:172.31.231.78<0>
gpu-st-p4d-24xlarge-44:28068:28068 [3] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-44:28068:28068 [3] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-44:28068:28068 [3] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
gpu-st-p4d-24xlarge-44:28068:28068 [3] NCCL INFO P2P plugin IBext
gpu-st-p4d-24xlarge-44:28068:28068 [3] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-44:28068:28068 [3] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-44:28068:28068 [3] NCCL INFO NET/Socket : Using [0]eth0:172.31.231.78<0> [1]eth1:172.31.225.87<0> [2]eth2:172.31.235.21<0> [3]eth3:172.31.235.87<0>
gpu-st-p4d-24xlarge-44:28068:28068 [3] NCCL INFO Using network Socket
gpu-st-p4d-24xlarge-49:25776:25776 [6] NCCL INFO Bootstrap : Using eth0:172.31.237.132<0>
gpu-st-p4d-24xlarge-49:25776:25776 [6] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-49:25776:25776 [6] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-49:25776:25776 [6] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
gpu-st-p4d-24xlarge-49:25776:25776 [6] NCCL INFO P2P plugin IBext
gpu-st-p4d-24xlarge-49:25776:25776 [6] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-49:25776:25776 [6] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-49:25776:25776 [6] NCCL INFO NET/Socket : Using [0]eth0:172.31.237.132<0> [1]eth1:172.31.227.76<0> [2]eth2:172.31.232.61<0> [3]eth3:172.31.230.241<0>
gpu-st-p4d-24xlarge-49:25776:25776 [6] NCCL INFO Using network Socket
gpu-st-p4d-24xlarge-49:25779:25779 [1] NCCL INFO Bootstrap : Using eth0:172.31.237.132<0>
gpu-st-p4d-24xlarge-49:25779:25779 [1] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-49:25779:25779 [1] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-49:25779:25779 [1] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
gpu-st-p4d-24xlarge-49:25779:25779 [1] NCCL INFO P2P plugin IBext
gpu-st-p4d-24xlarge-49:25779:25779 [1] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-49:25779:25779 [1] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-49:25779:25779 [1] NCCL INFO NET/Socket : Using [0]eth0:172.31.237.132<0> [1]eth1:172.31.227.76<0> [2]eth2:172.31.232.61<0> [3]eth3:172.31.230.241<0>
gpu-st-p4d-24xlarge-49:25779:25779 [1] NCCL INFO Using network Socket
gpu-st-p4d-24xlarge-49:25780:25780 [2] NCCL INFO Bootstrap : Using eth0:172.31.237.132<0>
gpu-st-p4d-24xlarge-49:25780:25780 [2] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-49:25780:25780 [2] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-49:25780:25780 [2] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
gpu-st-p4d-24xlarge-49:25780:25780 [2] NCCL INFO P2P plugin IBext
gpu-st-p4d-24xlarge-49:25780:25780 [2] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-57:27794:27794 [1] NCCL INFO Bootstrap : Using eth0:172.31.229.205<0>
gpu-st-p4d-24xlarge-49:25780:25780 [2] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-49:25780:25780 [2] NCCL INFO NET/Socket : Using [0]eth0:172.31.237.132<0> [1]eth1:172.31.227.76<0> [2]eth2:172.31.232.61<0> [3]eth3:172.31.230.241<0>
gpu-st-p4d-24xlarge-49:25780:25780 [2] NCCL INFO Using network Socket
gpu-st-p4d-24xlarge-57:27793:27793 [0] NCCL INFO Bootstrap : Using eth0:172.31.229.205<0>
gpu-st-p4d-24xlarge-57:27796:27796 [3] NCCL INFO Bootstrap : Using eth0:172.31.229.205<0>
gpu-st-p4d-24xlarge-57:27792:27792 [7] NCCL INFO Bootstrap : Using eth0:172.31.229.205<0>
gpu-st-p4d-24xlarge-57:27795:27795 [2] NCCL INFO Bootstrap : Using eth0:172.31.229.205<0>
gpu-st-p4d-24xlarge-57:27791:27791 [6] NCCL INFO Bootstrap : Using eth0:172.31.229.205<0>
gpu-st-p4d-24xlarge-54:29518:29518 [0] NCCL INFO Bootstrap : Using eth0:172.31.229.154<0>
gpu-st-p4d-24xlarge-54:29519:29519 [1] NCCL INFO Bootstrap : Using eth0:172.31.229.154<0>
gpu-st-p4d-24xlarge-54:29517:29517 [7] NCCL INFO Bootstrap : Using eth0:172.31.229.154<0>
gpu-st-p4d-24xlarge-54:29515:29515 [5] NCCL INFO Bootstrap : Using eth0:172.31.229.154<0>
gpu-st-p4d-24xlarge-54:29516:29516 [6] NCCL INFO Bootstrap : Using eth0:172.31.229.154<0>
gpu-st-p4d-24xlarge-54:29514:29514 [4] NCCL INFO Bootstrap : Using eth0:172.31.229.154<0>
gpu-st-p4d-24xlarge-53:26988:26988 [0] NCCL INFO Bootstrap : Using eth0:172.31.227.198<0>
gpu-st-p4d-24xlarge-53:26991:26991 [3] NCCL INFO Bootstrap : Using eth0:172.31.227.198<0>
gpu-st-p4d-24xlarge-53:26990:26990 [2] NCCL INFO Bootstrap : Using eth0:172.31.227.198<0>
gpu-st-p4d-24xlarge-53:26987:26987 [7] NCCL INFO Bootstrap : Using eth0:172.31.227.198<0>
gpu-st-p4d-24xlarge-53:26989:26989 [1] NCCL INFO Bootstrap : Using eth0:172.31.227.198<0>
gpu-st-p4d-24xlarge-53:26986:26986 [6] NCCL INFO Bootstrap : Using eth0:172.31.227.198<0>
gpu-st-p4d-24xlarge-45:30139:30139 [2] NCCL INFO Bootstrap : Using eth0:172.31.235.246<0>
gpu-st-p4d-24xlarge-57:27791:27791 [6] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-57:27793:27793 [0] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-57:27796:27796 [3] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-57:27795:27795 [2] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-57:27792:27792 [7] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-57:27791:27791 [6] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-57:27793:27793 [0] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-57:27796:27796 [3] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-57:27795:27795 [2] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-57:27792:27792 [7] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-57:27791:27791 [6] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
gpu-st-p4d-24xlarge-57:27793:27793 [0] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
gpu-st-p4d-24xlarge-57:27791:27791 [6] NCCL INFO P2P plugin IBext
gpu-st-p4d-24xlarge-57:27795:27795 [2] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
gpu-st-p4d-24xlarge-57:27796:27796 [3] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
gpu-st-p4d-24xlarge-57:27792:27792 [7] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
gpu-st-p4d-24xlarge-57:27793:27793 [0] NCCL INFO P2P plugin IBext
gpu-st-p4d-24xlarge-57:27795:27795 [2] NCCL INFO P2P plugin IBext
gpu-st-p4d-24xlarge-57:27796:27796 [3] NCCL INFO P2P plugin IBext
gpu-st-p4d-24xlarge-57:27792:27792 [7] NCCL INFO P2P plugin IBext
gpu-st-p4d-24xlarge-45:30137:30137 [0] NCCL INFO Bootstrap : Using eth0:172.31.235.246<0>
gpu-st-p4d-24xlarge-45:30135:30135 [6] NCCL INFO Bootstrap : Using eth0:172.31.235.246<0>
gpu-st-p4d-24xlarge-57:27794:27794 [1] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-57:27794:27794 [1] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-57:27794:27794 [1] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
gpu-st-p4d-24xlarge-57:27794:27794 [1] NCCL INFO P2P plugin IBext
gpu-st-p4d-24xlarge-45:30145:30145 [3] NCCL INFO Bootstrap : Using eth0:172.31.235.246<0>
gpu-st-p4d-24xlarge-45:30138:30138 [1] NCCL INFO Bootstrap : Using eth0:172.31.235.246<0>
gpu-st-p4d-24xlarge-45:30136:30136 [7] NCCL INFO Bootstrap : Using eth0:172.31.235.246<0>
gpu-st-p4d-24xlarge-57:27794:27794 [1] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-57:27792:27792 [7] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-57:27796:27796 [3] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-57:27793:27793 [0] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-57:27795:27795 [2] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-57:27791:27791 [6] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-57:27795:27795 [2] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-57:27792:27792 [7] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-57:27794:27794 [1] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-57:27796:27796 [3] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-57:27791:27791 [6] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-57:27793:27793 [0] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-57:27795:27795 [2] NCCL INFO NET/Socket : Using [0]eth0:172.31.229.205<0> [1]eth1:172.31.224.87<0> [2]eth2:172.31.237.20<0> [3]eth3:172.31.226.214<0>
gpu-st-p4d-24xlarge-57:27795:27795 [2] NCCL INFO Using network Socket
gpu-st-p4d-24xlarge-57:27792:27792 [7] NCCL INFO NET/Socket : Using [0]eth0:172.31.229.205<0> [1]eth1:172.31.224.87<0> [2]eth2:172.31.237.20<0> [3]eth3:172.31.226.214<0>
gpu-st-p4d-24xlarge-57:27792:27792 [7] NCCL INFO Using network Socket
gpu-st-p4d-24xlarge-57:27794:27794 [1] NCCL INFO NET/Socket : Using [0]eth0:172.31.229.205<0> [1]eth1:172.31.224.87<0> [2]eth2:172.31.237.20<0> [3]eth3:172.31.226.214<0>
gpu-st-p4d-24xlarge-57:27794:27794 [1] NCCL INFO Using network Socket
gpu-st-p4d-24xlarge-57:27796:27796 [3] NCCL INFO NET/Socket : Using [0]eth0:172.31.229.205<0> [1]eth1:172.31.224.87<0> [2]eth2:172.31.237.20<0> [3]eth3:172.31.226.214<0>
gpu-st-p4d-24xlarge-57:27796:27796 [3] NCCL INFO Using network Socket
gpu-st-p4d-24xlarge-57:27791:27791 [6] NCCL INFO NET/Socket : Using [0]eth0:172.31.229.205<0> [1]eth1:172.31.224.87<0> [2]eth2:172.31.237.20<0> [3]eth3:172.31.226.214<0>
gpu-st-p4d-24xlarge-57:27791:27791 [6] NCCL INFO Using network Socket
gpu-st-p4d-24xlarge-57:27793:27793 [0] NCCL INFO NET/Socket : Using [0]eth0:172.31.229.205<0> [1]eth1:172.31.224.87<0> [2]eth2:172.31.237.20<0> [3]eth3:172.31.226.214<0>
gpu-st-p4d-24xlarge-57:27793:27793 [0] NCCL INFO Using network Socket
gpu-st-p4d-24xlarge-46:30085:30085 [0] NCCL INFO Bootstrap : Using eth0:172.31.226.192<0>
gpu-st-p4d-24xlarge-46:30086:30086 [1] NCCL INFO Bootstrap : Using eth0:172.31.226.192<0>
gpu-st-p4d-24xlarge-46:30084:30084 [7] NCCL INFO Bootstrap : Using eth0:172.31.226.192<0>
gpu-st-p4d-24xlarge-46:30082:30082 [5] NCCL INFO Bootstrap : Using eth0:172.31.226.192<0>
gpu-st-p4d-24xlarge-46:30083:30083 [6] NCCL INFO Bootstrap : Using eth0:172.31.226.192<0>
gpu-st-p4d-24xlarge-46:30081:30081 [4] NCCL INFO Bootstrap : Using eth0:172.31.226.192<0>
gpu-st-p4d-24xlarge-54:29519:29519 [1] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-54:29519:29519 [1] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-54:29519:29519 [1] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
gpu-st-p4d-24xlarge-54:29519:29519 [1] NCCL INFO P2P plugin IBext
gpu-st-p4d-24xlarge-54:29515:29515 [5] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-54:29515:29515 [5] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-54:29515:29515 [5] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
gpu-st-p4d-24xlarge-54:29515:29515 [5] NCCL INFO P2P plugin IBext
gpu-st-p4d-24xlarge-54:29514:29514 [4] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-54:29514:29514 [4] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-54:29514:29514 [4] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
gpu-st-p4d-24xlarge-54:29514:29514 [4] NCCL INFO P2P plugin IBext
gpu-st-p4d-24xlarge-54:29517:29517 [7] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-54:29517:29517 [7] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-54:29517:29517 [7] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
gpu-st-p4d-24xlarge-54:29517:29517 [7] NCCL INFO P2P plugin IBext
gpu-st-p4d-24xlarge-54:29514:29514 [4] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-54:29515:29515 [5] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-54:29519:29519 [1] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-54:29517:29517 [7] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-54:29515:29515 [5] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-54:29519:29519 [1] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-54:29517:29517 [7] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-54:29514:29514 [4] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-54:29515:29515 [5] NCCL INFO NET/Socket : Using [0]eth0:172.31.229.154<0> [1]eth1:172.31.239.92<0> [2]eth2:172.31.234.27<0> [3]eth3:172.31.229.218<0>
gpu-st-p4d-24xlarge-54:29515:29515 [5] NCCL INFO Using network Socket
gpu-st-p4d-24xlarge-54:29519:29519 [1] NCCL INFO NET/Socket : Using [0]eth0:172.31.229.154<0> [1]eth1:172.31.239.92<0> [2]eth2:172.31.234.27<0> [3]eth3:172.31.229.218<0>
gpu-st-p4d-24xlarge-54:29519:29519 [1] NCCL INFO Using network Socket
gpu-st-p4d-24xlarge-54:29517:29517 [7] NCCL INFO NET/Socket : Using [0]eth0:172.31.229.154<0> [1]eth1:172.31.239.92<0> [2]eth2:172.31.234.27<0> [3]eth3:172.31.229.218<0>
gpu-st-p4d-24xlarge-54:29514:29514 [4] NCCL INFO NET/Socket : Using [0]eth0:172.31.229.154<0> [1]eth1:172.31.239.92<0> [2]eth2:172.31.234.27<0> [3]eth3:172.31.229.218<0>
gpu-st-p4d-24xlarge-54:29517:29517 [7] NCCL INFO Using network Socket
gpu-st-p4d-24xlarge-54:29514:29514 [4] NCCL INFO Using network Socket
gpu-st-p4d-24xlarge-54:29518:29518 [0] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-54:29516:29516 [6] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-54:29518:29518 [0] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-54:29516:29516 [6] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-54:29518:29518 [0] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
gpu-st-p4d-24xlarge-54:29516:29516 [6] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
gpu-st-p4d-24xlarge-54:29518:29518 [0] NCCL INFO P2P plugin IBext
gpu-st-p4d-24xlarge-54:29516:29516 [6] NCCL INFO P2P plugin IBext
gpu-st-p4d-24xlarge-54:29518:29518 [0] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-54:29516:29516 [6] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-54:29518:29518 [0] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-54:29516:29516 [6] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-54:29518:29518 [0] NCCL INFO NET/Socket : Using [0]eth0:172.31.229.154<0> [1]eth1:172.31.239.92<0> [2]eth2:172.31.234.27<0> [3]eth3:172.31.229.218<0>
gpu-st-p4d-24xlarge-54:29518:29518 [0] NCCL INFO Using network Socket
gpu-st-p4d-24xlarge-54:29516:29516 [6] NCCL INFO NET/Socket : Using [0]eth0:172.31.229.154<0> [1]eth1:172.31.239.92<0> [2]eth2:172.31.234.27<0> [3]eth3:172.31.229.218<0>
gpu-st-p4d-24xlarge-54:29516:29516 [6] NCCL INFO Using network Socket
gpu-st-p4d-24xlarge-44:28066:28066 [1] NCCL INFO Bootstrap : Using eth0:172.31.231.78<0>
gpu-st-p4d-24xlarge-44:28066:28066 [1] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-44:28066:28066 [1] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-44:28066:28066 [1] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
gpu-st-p4d-24xlarge-44:28066:28066 [1] NCCL INFO P2P plugin IBext
gpu-st-p4d-24xlarge-44:28066:28066 [1] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-44:28066:28066 [1] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-44:28066:28066 [1] NCCL INFO NET/Socket : Using [0]eth0:172.31.231.78<0> [1]eth1:172.31.225.87<0> [2]eth2:172.31.235.21<0> [3]eth3:172.31.235.87<0>
gpu-st-p4d-24xlarge-44:28066:28066 [1] NCCL INFO Using network Socket
gpu-st-p4d-24xlarge-53:26990:26990 [2] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-53:26990:26990 [2] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-53:26990:26990 [2] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
gpu-st-p4d-24xlarge-53:26990:26990 [2] NCCL INFO P2P plugin IBext
gpu-st-p4d-24xlarge-52:39401:39401 [4] NCCL INFO Bootstrap : Using eth0:172.31.230.245<0>
gpu-st-p4d-24xlarge-52:39402:39402 [5] NCCL INFO Bootstrap : Using eth0:172.31.230.245<0>
gpu-st-p4d-24xlarge-52:39400:39400 [3] NCCL INFO Bootstrap : Using eth0:172.31.230.245<0>
gpu-st-p4d-24xlarge-44:28069:28069 [4] NCCL INFO Bootstrap : Using eth0:172.31.231.78<0>
gpu-st-p4d-24xlarge-53:26990:26990 [2] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-52:39399:39399 [2] NCCL INFO Bootstrap : Using eth0:172.31.230.245<0>
gpu-st-p4d-24xlarge-52:39398:39398 [1] NCCL INFO Bootstrap : Using eth0:172.31.230.245<0>
gpu-st-p4d-24xlarge-53:26990:26990 [2] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-53:26990:26990 [2] NCCL INFO NET/Socket : Using [0]eth0:172.31.227.198<0> [1]eth1:172.31.227.208<0> [2]eth2:172.31.224.139<0> [3]eth3:172.31.229.184<0>
gpu-st-p4d-24xlarge-53:26990:26990 [2] NCCL INFO Using network Socket
gpu-st-p4d-24xlarge-52:39397:39397 [0] NCCL INFO Bootstrap : Using eth0:172.31.230.245<0>
gpu-st-p4d-24xlarge-44:28069:28069 [4] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-44:28069:28069 [4] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-44:28069:28069 [4] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
gpu-st-p4d-24xlarge-44:28069:28069 [4] NCCL INFO P2P plugin IBext
gpu-st-p4d-24xlarge-44:28070:28070 [5] NCCL INFO Bootstrap : Using eth0:172.31.231.78<0>
gpu-st-p4d-24xlarge-44:28069:28069 [4] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-53:26986:26986 [6] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-53:26988:26988 [0] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-53:26989:26989 [1] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-53:26986:26986 [6] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-53:26991:26991 [3] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-53:26988:26988 [0] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-53:26987:26987 [7] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-53:26989:26989 [1] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-53:26991:26991 [3] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-53:26987:26987 [7] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-53:26986:26986 [6] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
gpu-st-p4d-24xlarge-53:26988:26988 [0] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
gpu-st-p4d-24xlarge-53:26989:26989 [1] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
gpu-st-p4d-24xlarge-53:26986:26986 [6] NCCL INFO P2P plugin IBext
gpu-st-p4d-24xlarge-53:26988:26988 [0] NCCL INFO P2P plugin IBext
gpu-st-p4d-24xlarge-53:26989:26989 [1] NCCL INFO P2P plugin IBext
gpu-st-p4d-24xlarge-53:26987:26987 [7] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
gpu-st-p4d-24xlarge-53:26991:26991 [3] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
gpu-st-p4d-24xlarge-53:26987:26987 [7] NCCL INFO P2P plugin IBext
gpu-st-p4d-24xlarge-53:26991:26991 [3] NCCL INFO P2P plugin IBext
gpu-st-p4d-24xlarge-44:28069:28069 [4] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-44:28069:28069 [4] NCCL INFO NET/Socket : Using [0]eth0:172.31.231.78<0> [1]eth1:172.31.225.87<0> [2]eth2:172.31.235.21<0> [3]eth3:172.31.235.87<0>
gpu-st-p4d-24xlarge-44:28069:28069 [4] NCCL INFO Using network Socket
gpu-st-p4d-24xlarge-44:28070:28070 [5] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-44:28070:28070 [5] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-44:28070:28070 [5] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
gpu-st-p4d-24xlarge-44:28070:28070 [5] NCCL INFO P2P plugin IBext
gpu-st-p4d-24xlarge-53:26986:26986 [6] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-53:26987:26987 [7] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-53:26988:26988 [0] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-53:26989:26989 [1] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-53:26991:26991 [3] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-53:26987:26987 [7] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-53:26987:26987 [7] NCCL INFO NET/Socket : Using [0]eth0:172.31.227.198<0> [1]eth1:172.31.227.208<0> [2]eth2:172.31.224.139<0> [3]eth3:172.31.229.184<0>
gpu-st-p4d-24xlarge-53:26987:26987 [7] NCCL INFO Using network Socket
gpu-st-p4d-24xlarge-53:26989:26989 [1] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-53:26988:26988 [0] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-53:26986:26986 [6] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-53:26991:26991 [3] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-53:26989:26989 [1] NCCL INFO NET/Socket : Using [0]eth0:172.31.227.198<0> [1]eth1:172.31.227.208<0> [2]eth2:172.31.224.139<0> [3]eth3:172.31.229.184<0>
gpu-st-p4d-24xlarge-53:26989:26989 [1] NCCL INFO Using network Socket
gpu-st-p4d-24xlarge-53:26986:26986 [6] NCCL INFO NET/Socket : Using [0]eth0:172.31.227.198<0> [1]eth1:172.31.227.208<0> [2]eth2:172.31.224.139<0> [3]eth3:172.31.229.184<0>
gpu-st-p4d-24xlarge-53:26991:26991 [3] NCCL INFO NET/Socket : Using [0]eth0:172.31.227.198<0> [1]eth1:172.31.227.208<0> [2]eth2:172.31.224.139<0> [3]eth3:172.31.229.184<0>
gpu-st-p4d-24xlarge-53:26988:26988 [0] NCCL INFO NET/Socket : Using [0]eth0:172.31.227.198<0> [1]eth1:172.31.227.208<0> [2]eth2:172.31.224.139<0> [3]eth3:172.31.229.184<0>
gpu-st-p4d-24xlarge-53:26988:26988 [0] NCCL INFO Using network Socket
gpu-st-p4d-24xlarge-53:26991:26991 [3] NCCL INFO Using network Socket
gpu-st-p4d-24xlarge-53:26986:26986 [6] NCCL INFO Using network Socket
gpu-st-p4d-24xlarge-44:28070:28070 [5] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-46:30084:30084 [7] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-46:30081:30081 [4] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-46:30082:30082 [5] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-46:30086:30086 [1] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-46:30083:30083 [6] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-46:30085:30085 [0] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-44:28070:28070 [5] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-46:30084:30084 [7] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-46:30081:30081 [4] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-46:30082:30082 [5] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-46:30086:30086 [1] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-46:30083:30083 [6] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-46:30085:30085 [0] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-46:30084:30084 [7] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
gpu-st-p4d-24xlarge-46:30081:30081 [4] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
gpu-st-p4d-24xlarge-46:30082:30082 [5] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
gpu-st-p4d-24xlarge-46:30084:30084 [7] NCCL INFO P2P plugin IBext
gpu-st-p4d-24xlarge-46:30081:30081 [4] NCCL INFO P2P plugin IBext
gpu-st-p4d-24xlarge-46:30082:30082 [5] NCCL INFO P2P plugin IBext
gpu-st-p4d-24xlarge-46:30085:30085 [0] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
gpu-st-p4d-24xlarge-46:30086:30086 [1] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
gpu-st-p4d-24xlarge-46:30083:30083 [6] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
gpu-st-p4d-24xlarge-46:30085:30085 [0] NCCL INFO P2P plugin IBext
gpu-st-p4d-24xlarge-46:30086:30086 [1] NCCL INFO P2P plugin IBext
gpu-st-p4d-24xlarge-46:30083:30083 [6] NCCL INFO P2P plugin IBext
gpu-st-p4d-24xlarge-44:28070:28070 [5] NCCL INFO NET/Socket : Using [0]eth0:172.31.231.78<0> [1]eth1:172.31.225.87<0> [2]eth2:172.31.235.21<0> [3]eth3:172.31.235.87<0>
gpu-st-p4d-24xlarge-44:28070:28070 [5] NCCL INFO Using network Socket
gpu-st-p4d-24xlarge-45:30135:30135 [6] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-45:30136:30136 [7] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-45:30145:30145 [3] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-45:30135:30135 [6] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-45:30136:30136 [7] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-45:30145:30145 [3] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-45:30135:30135 [6] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
gpu-st-p4d-24xlarge-45:30136:30136 [7] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
gpu-st-p4d-24xlarge-45:30145:30145 [3] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
gpu-st-p4d-24xlarge-45:30135:30135 [6] NCCL INFO P2P plugin IBext
gpu-st-p4d-24xlarge-45:30136:30136 [7] NCCL INFO P2P plugin IBext
gpu-st-p4d-24xlarge-45:30145:30145 [3] NCCL INFO P2P plugin IBext
gpu-st-p4d-24xlarge-46:30082:30082 [5] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-46:30081:30081 [4] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-46:30086:30086 [1] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-46:30083:30083 [6] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-46:30085:30085 [0] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-46:30084:30084 [7] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-46:30086:30086 [1] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-46:30083:30083 [6] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-46:30082:30082 [5] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-46:30081:30081 [4] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-46:30084:30084 [7] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-46:30085:30085 [0] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-46:30086:30086 [1] NCCL INFO NET/Socket : Using [0]eth0:172.31.226.192<0> [1]eth1:172.31.231.65<0> [2]eth2:172.31.235.241<0> [3]eth3:172.31.237.240<0>
gpu-st-p4d-24xlarge-46:30086:30086 [1] NCCL INFO Using network Socket
gpu-st-p4d-24xlarge-45:30139:30139 [2] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-45:30138:30138 [1] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-45:30139:30139 [2] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-45:30138:30138 [1] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-45:30137:30137 [0] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-45:30139:30139 [2] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
gpu-st-p4d-24xlarge-45:30139:30139 [2] NCCL INFO P2P plugin IBext
gpu-st-p4d-24xlarge-45:30137:30137 [0] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-45:30138:30138 [1] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
gpu-st-p4d-24xlarge-45:30138:30138 [1] NCCL INFO P2P plugin IBext
gpu-st-p4d-24xlarge-45:30137:30137 [0] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
gpu-st-p4d-24xlarge-45:30137:30137 [0] NCCL INFO P2P plugin IBext
gpu-st-p4d-24xlarge-46:30081:30081 [4] NCCL INFO NET/Socket : Using [0]eth0:172.31.226.192<0> [1]eth1:172.31.231.65<0> [2]eth2:172.31.235.241<0> [3]eth3:172.31.237.240<0>
gpu-st-p4d-24xlarge-46:30084:30084 [7] NCCL INFO NET/Socket : Using [0]eth0:172.31.226.192<0> [1]eth1:172.31.231.65<0> [2]eth2:172.31.235.241<0> [3]eth3:172.31.237.240<0>
gpu-st-p4d-24xlarge-46:30081:30081 [4] NCCL INFO Using network Socket
gpu-st-p4d-24xlarge-46:30084:30084 [7] NCCL INFO Using network Socket
gpu-st-p4d-24xlarge-46:30083:30083 [6] NCCL INFO NET/Socket : Using [0]eth0:172.31.226.192<0> [1]eth1:172.31.231.65<0> [2]eth2:172.31.235.241<0> [3]eth3:172.31.237.240<0>
gpu-st-p4d-24xlarge-46:30083:30083 [6] NCCL INFO Using network Socket
gpu-st-p4d-24xlarge-46:30082:30082 [5] NCCL INFO NET/Socket : Using [0]eth0:172.31.226.192<0> [1]eth1:172.31.231.65<0> [2]eth2:172.31.235.241<0> [3]eth3:172.31.237.240<0>
gpu-st-p4d-24xlarge-46:30085:30085 [0] NCCL INFO NET/Socket : Using [0]eth0:172.31.226.192<0> [1]eth1:172.31.231.65<0> [2]eth2:172.31.235.241<0> [3]eth3:172.31.237.240<0>
gpu-st-p4d-24xlarge-46:30082:30082 [5] NCCL INFO Using network Socket
gpu-st-p4d-24xlarge-46:30085:30085 [0] NCCL INFO Using network Socket
gpu-st-p4d-24xlarge-45:30135:30135 [6] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-45:30138:30138 [1] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-45:30145:30145 [3] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-45:30137:30137 [0] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-45:30136:30136 [7] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-45:30139:30139 [2] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-45:30135:30135 [6] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-45:30145:30145 [3] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-45:30139:30139 [2] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-45:30136:30136 [7] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-45:30138:30138 [1] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-45:30137:30137 [0] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-45:30135:30135 [6] NCCL INFO NET/Socket : Using [0]eth0:172.31.235.246<0> [1]eth1:172.31.233.8<0> [2]eth2:172.31.239.73<0> [3]eth3:172.31.232.185<0>
gpu-st-p4d-24xlarge-45:30135:30135 [6] NCCL INFO Using network Socket
gpu-st-p4d-24xlarge-45:30145:30145 [3] NCCL INFO NET/Socket : Using [0]eth0:172.31.235.246<0> [1]eth1:172.31.233.8<0> [2]eth2:172.31.239.73<0> [3]eth3:172.31.232.185<0>
gpu-st-p4d-24xlarge-45:30145:30145 [3] NCCL INFO Using network Socket
gpu-st-p4d-24xlarge-45:30136:30136 [7] NCCL INFO NET/Socket : Using [0]eth0:172.31.235.246<0> [1]eth1:172.31.233.8<0> [2]eth2:172.31.239.73<0> [3]eth3:172.31.232.185<0>
gpu-st-p4d-24xlarge-45:30136:30136 [7] NCCL INFO Using network Socket
gpu-st-p4d-24xlarge-45:30138:30138 [1] NCCL INFO NET/Socket : Using [0]eth0:172.31.235.246<0> [1]eth1:172.31.233.8<0> [2]eth2:172.31.239.73<0> [3]eth3:172.31.232.185<0>
gpu-st-p4d-24xlarge-45:30139:30139 [2] NCCL INFO NET/Socket : Using [0]eth0:172.31.235.246<0> [1]eth1:172.31.233.8<0> [2]eth2:172.31.239.73<0> [3]eth3:172.31.232.185<0>
gpu-st-p4d-24xlarge-45:30138:30138 [1] NCCL INFO Using network Socket
gpu-st-p4d-24xlarge-45:30137:30137 [0] NCCL INFO NET/Socket : Using [0]eth0:172.31.235.246<0> [1]eth1:172.31.233.8<0> [2]eth2:172.31.239.73<0> [3]eth3:172.31.232.185<0>
gpu-st-p4d-24xlarge-45:30139:30139 [2] NCCL INFO Using network Socket
gpu-st-p4d-24xlarge-45:30137:30137 [0] NCCL INFO Using network Socket
gpu-st-p4d-24xlarge-44:28067:28067 [2] NCCL INFO Bootstrap : Using eth0:172.31.231.78<0>
gpu-st-p4d-24xlarge-44:28067:28067 [2] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-44:28067:28067 [2] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-44:28067:28067 [2] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
gpu-st-p4d-24xlarge-44:28067:28067 [2] NCCL INFO P2P plugin IBext
gpu-st-p4d-24xlarge-44:28067:28067 [2] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-44:28067:28067 [2] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-44:28067:28067 [2] NCCL INFO NET/Socket : Using [0]eth0:172.31.231.78<0> [1]eth1:172.31.225.87<0> [2]eth2:172.31.235.21<0> [3]eth3:172.31.235.87<0>
gpu-st-p4d-24xlarge-44:28067:28067 [2] NCCL INFO Using network Socket
gpu-st-p4d-24xlarge-56:28171:28171 [0] NCCL INFO Bootstrap : Using eth0:172.31.227.130<0>
gpu-st-p4d-24xlarge-56:28174:28174 [3] NCCL INFO Bootstrap : Using eth0:172.31.227.130<0>
gpu-st-p4d-24xlarge-56:28172:28172 [1] NCCL INFO Bootstrap : Using eth0:172.31.227.130<0>
gpu-st-p4d-24xlarge-56:28173:28173 [2] NCCL INFO Bootstrap : Using eth0:172.31.227.130<0>
gpu-st-p4d-24xlarge-56:28175:28175 [4] NCCL INFO Bootstrap : Using eth0:172.31.227.130<0>
gpu-st-p4d-24xlarge-56:28176:28176 [5] NCCL INFO Bootstrap : Using eth0:172.31.227.130<0>
gpu-st-p4d-24xlarge-58:27723:27723 [0] NCCL INFO Bootstrap : Using eth0:172.31.237.192<0>
gpu-st-p4d-24xlarge-59:27918:27918 [2] NCCL INFO Bootstrap : Using eth0:172.31.233.13<0>
gpu-st-p4d-24xlarge-58:27722:27722 [7] NCCL INFO Bootstrap : Using eth0:172.31.237.192<0>
gpu-st-p4d-24xlarge-59:27921:27921 [5] NCCL INFO Bootstrap : Using eth0:172.31.233.13<0>
gpu-st-p4d-24xlarge-52:39398:39398 [1] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-52:39398:39398 [1] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-52:39398:39398 [1] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
gpu-st-p4d-24xlarge-52:39398:39398 [1] NCCL INFO P2P plugin IBext
gpu-st-p4d-24xlarge-59:27922:27922 [6] NCCL INFO Bootstrap : Using eth0:172.31.233.13<0>
gpu-st-p4d-24xlarge-58:27720:27720 [5] NCCL INFO Bootstrap : Using eth0:172.31.237.192<0>
gpu-st-p4d-24xlarge-52:39399:39399 [2] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-52:39399:39399 [2] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-52:39399:39399 [2] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
gpu-st-p4d-24xlarge-52:39399:39399 [2] NCCL INFO P2P plugin IBext
gpu-st-p4d-24xlarge-52:39402:39402 [5] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-52:39402:39402 [5] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-52:39402:39402 [5] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
gpu-st-p4d-24xlarge-52:39402:39402 [5] NCCL INFO P2P plugin IBext
gpu-st-p4d-24xlarge-58:27724:27724 [1] NCCL INFO Bootstrap : Using eth0:172.31.237.192<0>
gpu-st-p4d-24xlarge-58:27721:27721 [6] NCCL INFO Bootstrap : Using eth0:172.31.237.192<0>
gpu-st-p4d-24xlarge-59:27923:27923 [7] NCCL INFO Bootstrap : Using eth0:172.31.233.13<0>
gpu-st-p4d-24xlarge-59:27920:27920 [4] NCCL INFO Bootstrap : Using eth0:172.31.233.13<0>
gpu-st-p4d-24xlarge-58:27719:27719 [4] NCCL INFO Bootstrap : Using eth0:172.31.237.192<0>
gpu-st-p4d-24xlarge-59:27919:27919 [3] NCCL INFO Bootstrap : Using eth0:172.31.233.13<0>
gpu-st-p4d-24xlarge-52:39398:39398 [1] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-52:39399:39399 [2] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-52:39402:39402 [5] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-52:39398:39398 [1] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-52:39399:39399 [2] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-52:39402:39402 [5] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-52:39398:39398 [1] NCCL INFO NET/Socket : Using [0]eth0:172.31.230.245<0> [1]eth1:172.31.237.231<0> [2]eth2:172.31.225.111<0> [3]eth3:172.31.227.43<0>
gpu-st-p4d-24xlarge-52:39399:39399 [2] NCCL INFO NET/Socket : Using [0]eth0:172.31.230.245<0> [1]eth1:172.31.237.231<0> [2]eth2:172.31.225.111<0> [3]eth3:172.31.227.43<0>
gpu-st-p4d-24xlarge-52:39399:39399 [2] NCCL INFO Using network Socket
gpu-st-p4d-24xlarge-52:39398:39398 [1] NCCL INFO Using network Socket
gpu-st-p4d-24xlarge-52:39402:39402 [5] NCCL INFO NET/Socket : Using [0]eth0:172.31.230.245<0> [1]eth1:172.31.237.231<0> [2]eth2:172.31.225.111<0> [3]eth3:172.31.227.43<0>
gpu-st-p4d-24xlarge-52:39402:39402 [5] NCCL INFO Using network Socket
gpu-st-p4d-24xlarge-52:39397:39397 [0] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-52:39401:39401 [4] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-52:39397:39397 [0] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-52:39401:39401 [4] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-52:39397:39397 [0] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
gpu-st-p4d-24xlarge-52:39397:39397 [0] NCCL INFO P2P plugin IBext
gpu-st-p4d-24xlarge-52:39401:39401 [4] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
gpu-st-p4d-24xlarge-52:39401:39401 [4] NCCL INFO P2P plugin IBext
gpu-st-p4d-24xlarge-52:39397:39397 [0] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-52:39401:39401 [4] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-52:39397:39397 [0] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-52:39397:39397 [0] NCCL INFO NET/Socket : Using [0]eth0:172.31.230.245<0> [1]eth1:172.31.237.231<0> [2]eth2:172.31.225.111<0> [3]eth3:172.31.227.43<0>
gpu-st-p4d-24xlarge-52:39397:39397 [0] NCCL INFO Using network Socket
gpu-st-p4d-24xlarge-52:39401:39401 [4] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-52:39400:39400 [3] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-52:39400:39400 [3] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-52:39400:39400 [3] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
gpu-st-p4d-24xlarge-52:39400:39400 [3] NCCL INFO P2P plugin IBext
gpu-st-p4d-24xlarge-52:39401:39401 [4] NCCL INFO NET/Socket : Using [0]eth0:172.31.230.245<0> [1]eth1:172.31.237.231<0> [2]eth2:172.31.225.111<0> [3]eth3:172.31.227.43<0>
gpu-st-p4d-24xlarge-52:39401:39401 [4] NCCL INFO Using network Socket
gpu-st-p4d-24xlarge-52:39400:39400 [3] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-52:39400:39400 [3] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-52:39400:39400 [3] NCCL INFO NET/Socket : Using [0]eth0:172.31.230.245<0> [1]eth1:172.31.237.231<0> [2]eth2:172.31.225.111<0> [3]eth3:172.31.227.43<0>
gpu-st-p4d-24xlarge-52:39400:39400 [3] NCCL INFO Using network Socket
gpu-st-p4d-24xlarge-51:30343:30343 [3] NCCL INFO Bootstrap : Using eth0:172.31.233.244<0>
gpu-st-p4d-24xlarge-51:30346:30346 [6] NCCL INFO Bootstrap : Using eth0:172.31.233.244<0>
gpu-st-p4d-24xlarge-51:30345:30345 [5] NCCL INFO Bootstrap : Using eth0:172.31.233.244<0>
gpu-st-p4d-24xlarge-51:30347:30347 [7] NCCL INFO Bootstrap : Using eth0:172.31.233.244<0>
gpu-st-p4d-24xlarge-51:30342:30342 [2] NCCL INFO Bootstrap : Using eth0:172.31.233.244<0>
gpu-st-p4d-24xlarge-56:28173:28173 [2] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-56:28173:28173 [2] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-56:28173:28173 [2] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
gpu-st-p4d-24xlarge-56:28173:28173 [2] NCCL INFO P2P plugin IBext
gpu-st-p4d-24xlarge-51:30344:30344 [4] NCCL INFO Bootstrap : Using eth0:172.31.233.244<0>
gpu-st-p4d-24xlarge-56:28173:28173 [2] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-56:28173:28173 [2] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-56:28173:28173 [2] NCCL INFO NET/Socket : Using [0]eth0:172.31.227.130<0> [1]eth1:172.31.233.129<0> [2]eth2:172.31.237.125<0> [3]eth3:172.31.227.64<0>
gpu-st-p4d-24xlarge-56:28173:28173 [2] NCCL INFO Using network Socket
gpu-st-p4d-24xlarge-56:28172:28172 [1] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-56:28172:28172 [1] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-56:28172:28172 [1] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
gpu-st-p4d-24xlarge-56:28172:28172 [1] NCCL INFO P2P plugin IBext
gpu-st-p4d-24xlarge-56:28172:28172 [1] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-56:28172:28172 [1] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-56:28172:28172 [1] NCCL INFO NET/Socket : Using [0]eth0:172.31.227.130<0> [1]eth1:172.31.233.129<0> [2]eth2:172.31.237.125<0> [3]eth3:172.31.227.64<0>
gpu-st-p4d-24xlarge-56:28172:28172 [1] NCCL INFO Using network Socket
gpu-st-p4d-24xlarge-56:28175:28175 [4] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-56:28176:28176 [5] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-56:28175:28175 [4] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-56:28174:28174 [3] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-56:28176:28176 [5] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-56:28171:28171 [0] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-56:28174:28174 [3] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-56:28175:28175 [4] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
gpu-st-p4d-24xlarge-56:28175:28175 [4] NCCL INFO P2P plugin IBext
gpu-st-p4d-24xlarge-56:28171:28171 [0] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-56:28176:28176 [5] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
gpu-st-p4d-24xlarge-56:28176:28176 [5] NCCL INFO P2P plugin IBext
gpu-st-p4d-24xlarge-56:28174:28174 [3] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
gpu-st-p4d-24xlarge-56:28174:28174 [3] NCCL INFO P2P plugin IBext
gpu-st-p4d-24xlarge-56:28171:28171 [0] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
gpu-st-p4d-24xlarge-56:28171:28171 [0] NCCL INFO P2P plugin IBext
gpu-st-p4d-24xlarge-56:28174:28174 [3] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-56:28175:28175 [4] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-56:28171:28171 [0] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-56:28176:28176 [5] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-56:28174:28174 [3] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-56:28174:28174 [3] NCCL INFO NET/Socket : Using [0]eth0:172.31.227.130<0> [1]eth1:172.31.233.129<0> [2]eth2:172.31.237.125<0> [3]eth3:172.31.227.64<0>
gpu-st-p4d-24xlarge-56:28174:28174 [3] NCCL INFO Using network Socket
gpu-st-p4d-24xlarge-56:28176:28176 [5] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-56:28171:28171 [0] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-56:28175:28175 [4] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-56:28176:28176 [5] NCCL INFO NET/Socket : Using [0]eth0:172.31.227.130<0> [1]eth1:172.31.233.129<0> [2]eth2:172.31.237.125<0> [3]eth3:172.31.227.64<0>
gpu-st-p4d-24xlarge-56:28176:28176 [5] NCCL INFO Using network Socket
gpu-st-p4d-24xlarge-56:28175:28175 [4] NCCL INFO NET/Socket : Using [0]eth0:172.31.227.130<0> [1]eth1:172.31.233.129<0> [2]eth2:172.31.237.125<0> [3]eth3:172.31.227.64<0>
gpu-st-p4d-24xlarge-56:28171:28171 [0] NCCL INFO NET/Socket : Using [0]eth0:172.31.227.130<0> [1]eth1:172.31.233.129<0> [2]eth2:172.31.237.125<0> [3]eth3:172.31.227.64<0>
gpu-st-p4d-24xlarge-56:28175:28175 [4] NCCL INFO Using network Socket
gpu-st-p4d-24xlarge-56:28171:28171 [0] NCCL INFO Using network Socket
gpu-st-p4d-24xlarge-55:29157:29157 [5] NCCL INFO Bootstrap : Using eth0:172.31.231.150<0>
gpu-st-p4d-24xlarge-55:29155:29155 [3] NCCL INFO Bootstrap : Using eth0:172.31.231.150<0>
gpu-st-p4d-24xlarge-55:29159:29159 [7] NCCL INFO Bootstrap : Using eth0:172.31.231.150<0>
gpu-st-p4d-24xlarge-55:29156:29156 [4] NCCL INFO Bootstrap : Using eth0:172.31.231.150<0>
gpu-st-p4d-24xlarge-55:29158:29158 [6] NCCL INFO Bootstrap : Using eth0:172.31.231.150<0>
gpu-st-p4d-24xlarge-55:29154:29154 [2] NCCL INFO Bootstrap : Using eth0:172.31.231.150<0>
gpu-st-p4d-24xlarge-59:27923:27923 [7] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-59:27919:27919 [3] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-59:27921:27921 [5] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-59:27923:27923 [7] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-59:27920:27920 [4] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-59:27922:27922 [6] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-59:27919:27919 [3] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-59:27921:27921 [5] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-59:27920:27920 [4] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-59:27922:27922 [6] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-59:27918:27918 [2] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-59:27923:27923 [7] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
gpu-st-p4d-24xlarge-59:27923:27923 [7] NCCL INFO P2P plugin IBext
gpu-st-p4d-24xlarge-59:27921:27921 [5] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
gpu-st-p4d-24xlarge-59:27919:27919 [3] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
gpu-st-p4d-24xlarge-59:27918:27918 [2] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-59:27922:27922 [6] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
gpu-st-p4d-24xlarge-59:27921:27921 [5] NCCL INFO P2P plugin IBext
gpu-st-p4d-24xlarge-59:27919:27919 [3] NCCL INFO P2P plugin IBext
gpu-st-p4d-24xlarge-59:27920:27920 [4] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
gpu-st-p4d-24xlarge-59:27922:27922 [6] NCCL INFO P2P plugin IBext
gpu-st-p4d-24xlarge-59:27920:27920 [4] NCCL INFO P2P plugin IBext
gpu-st-p4d-24xlarge-59:27918:27918 [2] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
gpu-st-p4d-24xlarge-59:27918:27918 [2] NCCL INFO P2P plugin IBext
gpu-st-p4d-24xlarge-58:27719:27719 [4] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-58:27720:27720 [5] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-58:27721:27721 [6] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-58:27723:27723 [0] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-58:27724:27724 [1] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-58:27722:27722 [7] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-58:27719:27719 [4] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-58:27722:27722 [7] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-58:27724:27724 [1] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-58:27721:27721 [6] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-58:27720:27720 [5] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-58:27723:27723 [0] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-58:27719:27719 [4] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
gpu-st-p4d-24xlarge-58:27724:27724 [1] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
gpu-st-p4d-24xlarge-58:27722:27722 [7] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
gpu-st-p4d-24xlarge-58:27719:27719 [4] NCCL INFO P2P plugin IBext
gpu-st-p4d-24xlarge-58:27720:27720 [5] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
gpu-st-p4d-24xlarge-58:27723:27723 [0] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
gpu-st-p4d-24xlarge-58:27724:27724 [1] NCCL INFO P2P plugin IBext
gpu-st-p4d-24xlarge-58:27722:27722 [7] NCCL INFO P2P plugin IBext
gpu-st-p4d-24xlarge-58:27721:27721 [6] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
gpu-st-p4d-24xlarge-58:27720:27720 [5] NCCL INFO P2P plugin IBext
gpu-st-p4d-24xlarge-58:27723:27723 [0] NCCL INFO P2P plugin IBext
gpu-st-p4d-24xlarge-58:27721:27721 [6] NCCL INFO P2P plugin IBext
gpu-st-p4d-24xlarge-59:27919:27919 [3] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-59:27923:27923 [7] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-59:27918:27918 [2] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-59:27922:27922 [6] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-59:27920:27920 [4] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-59:27921:27921 [5] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-58:27720:27720 [5] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-58:27722:27722 [7] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-58:27723:27723 [0] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-58:27724:27724 [1] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-58:27721:27721 [6] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-58:27719:27719 [4] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-59:27922:27922 [6] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-59:27918:27918 [2] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-59:27920:27920 [4] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-59:27921:27921 [5] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-59:27923:27923 [7] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-59:27919:27919 [3] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-59:27922:27922 [6] NCCL INFO NET/Socket : Using [0]eth0:172.31.233.13<0> [1]eth1:172.31.239.86<0> [2]eth2:172.31.227.84<0> [3]eth3:172.31.237.21<0>
gpu-st-p4d-24xlarge-59:27922:27922 [6] NCCL INFO Using network Socket
gpu-st-p4d-24xlarge-59:27918:27918 [2] NCCL INFO NET/Socket : Using [0]eth0:172.31.233.13<0> [1]eth1:172.31.239.86<0> [2]eth2:172.31.227.84<0> [3]eth3:172.31.237.21<0>
gpu-st-p4d-24xlarge-59:27918:27918 [2] NCCL INFO Using network Socket
gpu-st-p4d-24xlarge-59:27923:27923 [7] NCCL INFO NET/Socket : Using [0]eth0:172.31.233.13<0> [1]eth1:172.31.239.86<0> [2]eth2:172.31.227.84<0> [3]eth3:172.31.237.21<0>
gpu-st-p4d-24xlarge-59:27920:27920 [4] NCCL INFO NET/Socket : Using [0]eth0:172.31.233.13<0> [1]eth1:172.31.239.86<0> [2]eth2:172.31.227.84<0> [3]eth3:172.31.237.21<0>
gpu-st-p4d-24xlarge-59:27921:27921 [5] NCCL INFO NET/Socket : Using [0]eth0:172.31.233.13<0> [1]eth1:172.31.239.86<0> [2]eth2:172.31.227.84<0> [3]eth3:172.31.237.21<0>
gpu-st-p4d-24xlarge-59:27923:27923 [7] NCCL INFO Using network Socket
gpu-st-p4d-24xlarge-59:27919:27919 [3] NCCL INFO NET/Socket : Using [0]eth0:172.31.233.13<0> [1]eth1:172.31.239.86<0> [2]eth2:172.31.227.84<0> [3]eth3:172.31.237.21<0>
gpu-st-p4d-24xlarge-59:27920:27920 [4] NCCL INFO Using network Socket
gpu-st-p4d-24xlarge-59:27921:27921 [5] NCCL INFO Using network Socket
gpu-st-p4d-24xlarge-59:27919:27919 [3] NCCL INFO Using network Socket
gpu-st-p4d-24xlarge-58:27720:27720 [5] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-58:27722:27722 [7] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-58:27723:27723 [0] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-58:27719:27719 [4] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-58:27724:27724 [1] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-58:27721:27721 [6] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-58:27720:27720 [5] NCCL INFO NET/Socket : Using [0]eth0:172.31.237.192<0> [1]eth1:172.31.225.254<0> [2]eth2:172.31.234.186<0> [3]eth3:172.31.234.61<0>
gpu-st-p4d-24xlarge-58:27720:27720 [5] NCCL INFO Using network Socket
gpu-st-p4d-24xlarge-58:27723:27723 [0] NCCL INFO NET/Socket : Using [0]eth0:172.31.237.192<0> [1]eth1:172.31.225.254<0> [2]eth2:172.31.234.186<0> [3]eth3:172.31.234.61<0>
gpu-st-p4d-24xlarge-58:27722:27722 [7] NCCL INFO NET/Socket : Using [0]eth0:172.31.237.192<0> [1]eth1:172.31.225.254<0> [2]eth2:172.31.234.186<0> [3]eth3:172.31.234.61<0>
gpu-st-p4d-24xlarge-58:27721:27721 [6] NCCL INFO NET/Socket : Using [0]eth0:172.31.237.192<0> [1]eth1:172.31.225.254<0> [2]eth2:172.31.234.186<0> [3]eth3:172.31.234.61<0>
gpu-st-p4d-24xlarge-58:27722:27722 [7] NCCL INFO Using network Socket
gpu-st-p4d-24xlarge-58:27723:27723 [0] NCCL INFO Using network Socket
gpu-st-p4d-24xlarge-58:27721:27721 [6] NCCL INFO Using network Socket
gpu-st-p4d-24xlarge-58:27724:27724 [1] NCCL INFO NET/Socket : Using [0]eth0:172.31.237.192<0> [1]eth1:172.31.225.254<0> [2]eth2:172.31.234.186<0> [3]eth3:172.31.234.61<0>
gpu-st-p4d-24xlarge-58:27719:27719 [4] NCCL INFO NET/Socket : Using [0]eth0:172.31.237.192<0> [1]eth1:172.31.225.254<0> [2]eth2:172.31.234.186<0> [3]eth3:172.31.234.61<0>
gpu-st-p4d-24xlarge-58:27724:27724 [1] NCCL INFO Using network Socket
gpu-st-p4d-24xlarge-58:27719:27719 [4] NCCL INFO Using network Socket
gpu-st-p4d-24xlarge-48:30191:30191 [0] NCCL INFO Bootstrap : Using eth0:172.31.234.184<0>
gpu-st-p4d-24xlarge-50:29215:29215 [0] NCCL INFO Bootstrap : Using eth0:172.31.231.152<0>
gpu-st-p4d-24xlarge-51:30344:30344 [4] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-51:30343:30343 [3] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-51:30342:30342 [2] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-51:30347:30347 [7] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-51:30344:30344 [4] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-51:30343:30343 [3] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-51:30342:30342 [2] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-51:30347:30347 [7] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-51:30343:30343 [3] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
gpu-st-p4d-24xlarge-51:30344:30344 [4] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
gpu-st-p4d-24xlarge-51:30342:30342 [2] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
gpu-st-p4d-24xlarge-51:30343:30343 [3] NCCL INFO P2P plugin IBext
gpu-st-p4d-24xlarge-51:30344:30344 [4] NCCL INFO P2P plugin IBext
gpu-st-p4d-24xlarge-51:30342:30342 [2] NCCL INFO P2P plugin IBext
gpu-st-p4d-24xlarge-51:30347:30347 [7] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
gpu-st-p4d-24xlarge-51:30347:30347 [7] NCCL INFO P2P plugin IBext
gpu-st-p4d-24xlarge-48:30193:30193 [2] NCCL INFO Bootstrap : Using eth0:172.31.234.184<0>
gpu-st-p4d-24xlarge-48:30192:30192 [1] NCCL INFO Bootstrap : Using eth0:172.31.234.184<0>
gpu-st-p4d-24xlarge-48:30196:30196 [5] NCCL INFO Bootstrap : Using eth0:172.31.234.184<0>
gpu-st-p4d-24xlarge-48:30195:30195 [4] NCCL INFO Bootstrap : Using eth0:172.31.234.184<0>
gpu-st-p4d-24xlarge-51:30345:30345 [5] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-51:30346:30346 [6] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-51:30346:30346 [6] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-51:30345:30345 [5] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-51:30345:30345 [5] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
gpu-st-p4d-24xlarge-51:30346:30346 [6] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
gpu-st-p4d-24xlarge-51:30345:30345 [5] NCCL INFO P2P plugin IBext
gpu-st-p4d-24xlarge-51:30346:30346 [6] NCCL INFO P2P plugin IBext
gpu-st-p4d-24xlarge-48:30194:30194 [3] NCCL INFO Bootstrap : Using eth0:172.31.234.184<0>
gpu-st-p4d-24xlarge-50:29211:29211 [4] NCCL INFO Bootstrap : Using eth0:172.31.231.152<0>
gpu-st-p4d-24xlarge-50:29212:29212 [5] NCCL INFO Bootstrap : Using eth0:172.31.231.152<0>
gpu-st-p4d-24xlarge-50:29214:29214 [7] NCCL INFO Bootstrap : Using eth0:172.31.231.152<0>
gpu-st-p4d-24xlarge-50:29216:29216 [1] NCCL INFO Bootstrap : Using eth0:172.31.231.152<0>
gpu-st-p4d-24xlarge-51:30342:30342 [2] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-51:30347:30347 [7] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-51:30346:30346 [6] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-51:30343:30343 [3] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-51:30345:30345 [5] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-51:30344:30344 [4] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-51:30343:30343 [3] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-51:30342:30342 [2] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-51:30344:30344 [4] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-51:30346:30346 [6] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-51:30347:30347 [7] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-51:30345:30345 [5] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-51:30343:30343 [3] NCCL INFO NET/Socket : Using [0]eth0:172.31.233.244<0> [1]eth1:172.31.226.133<0> [2]eth2:172.31.227.24<0> [3]eth3:172.31.234.3<0>
gpu-st-p4d-24xlarge-51:30343:30343 [3] NCCL INFO Using network Socket
gpu-st-p4d-24xlarge-51:30342:30342 [2] NCCL INFO NET/Socket : Using [0]eth0:172.31.233.244<0> [1]eth1:172.31.226.133<0> [2]eth2:172.31.227.24<0> [3]eth3:172.31.234.3<0>
gpu-st-p4d-24xlarge-51:30342:30342 [2] NCCL INFO Using network Socket
gpu-st-p4d-24xlarge-50:29213:29213 [6] NCCL INFO Bootstrap : Using eth0:172.31.231.152<0>
gpu-st-p4d-24xlarge-51:30344:30344 [4] NCCL INFO NET/Socket : Using [0]eth0:172.31.233.244<0> [1]eth1:172.31.226.133<0> [2]eth2:172.31.227.24<0> [3]eth3:172.31.234.3<0>
gpu-st-p4d-24xlarge-51:30344:30344 [4] NCCL INFO Using network Socket
gpu-st-p4d-24xlarge-51:30346:30346 [6] NCCL INFO NET/Socket : Using [0]eth0:172.31.233.244<0> [1]eth1:172.31.226.133<0> [2]eth2:172.31.227.24<0> [3]eth3:172.31.234.3<0>
gpu-st-p4d-24xlarge-51:30346:30346 [6] NCCL INFO Using network Socket
gpu-st-p4d-24xlarge-51:30347:30347 [7] NCCL INFO NET/Socket : Using [0]eth0:172.31.233.244<0> [1]eth1:172.31.226.133<0> [2]eth2:172.31.227.24<0> [3]eth3:172.31.234.3<0>
gpu-st-p4d-24xlarge-51:30347:30347 [7] NCCL INFO Using network Socket
gpu-st-p4d-24xlarge-51:30345:30345 [5] NCCL INFO NET/Socket : Using [0]eth0:172.31.233.244<0> [1]eth1:172.31.226.133<0> [2]eth2:172.31.227.24<0> [3]eth3:172.31.234.3<0>
gpu-st-p4d-24xlarge-51:30345:30345 [5] NCCL INFO Using network Socket
gpu-st-p4d-24xlarge-55:29156:29156 [4] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-55:29158:29158 [6] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-55:29157:29157 [5] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-55:29159:29159 [7] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-55:29156:29156 [4] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-55:29158:29158 [6] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-55:29157:29157 [5] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-55:29159:29159 [7] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-55:29155:29155 [3] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-55:29156:29156 [4] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
gpu-st-p4d-24xlarge-55:29158:29158 [6] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
gpu-st-p4d-24xlarge-55:29156:29156 [4] NCCL INFO P2P plugin IBext
gpu-st-p4d-24xlarge-55:29155:29155 [3] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-55:29158:29158 [6] NCCL INFO P2P plugin IBext
gpu-st-p4d-24xlarge-55:29157:29157 [5] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
gpu-st-p4d-24xlarge-55:29159:29159 [7] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
gpu-st-p4d-24xlarge-55:29157:29157 [5] NCCL INFO P2P plugin IBext
gpu-st-p4d-24xlarge-55:29159:29159 [7] NCCL INFO P2P plugin IBext
gpu-st-p4d-24xlarge-55:29155:29155 [3] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
gpu-st-p4d-24xlarge-55:29155:29155 [3] NCCL INFO P2P plugin IBext
gpu-st-p4d-24xlarge-55:29154:29154 [2] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-55:29154:29154 [2] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-55:29154:29154 [2] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
gpu-st-p4d-24xlarge-55:29154:29154 [2] NCCL INFO P2P plugin IBext
gpu-st-p4d-24xlarge-55:29157:29157 [5] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-55:29158:29158 [6] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-55:29155:29155 [3] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-55:29154:29154 [2] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-55:29156:29156 [4] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-55:29159:29159 [7] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-55:29155:29155 [3] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-55:29157:29157 [5] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-55:29154:29154 [2] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-55:29156:29156 [4] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-55:29155:29155 [3] NCCL INFO NET/Socket : Using [0]eth0:172.31.231.150<0> [1]eth1:172.31.226.208<0> [2]eth2:172.31.239.214<0> [3]eth3:172.31.234.152<0>
gpu-st-p4d-24xlarge-55:29159:29159 [7] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-55:29158:29158 [6] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-55:29155:29155 [3] NCCL INFO Using network Socket
gpu-st-p4d-24xlarge-55:29157:29157 [5] NCCL INFO NET/Socket : Using [0]eth0:172.31.231.150<0> [1]eth1:172.31.226.208<0> [2]eth2:172.31.239.214<0> [3]eth3:172.31.234.152<0>
gpu-st-p4d-24xlarge-55:29157:29157 [5] NCCL INFO Using network Socket
gpu-st-p4d-24xlarge-55:29156:29156 [4] NCCL INFO NET/Socket : Using [0]eth0:172.31.231.150<0> [1]eth1:172.31.226.208<0> [2]eth2:172.31.239.214<0> [3]eth3:172.31.234.152<0>
gpu-st-p4d-24xlarge-55:29156:29156 [4] NCCL INFO Using network Socket
gpu-st-p4d-24xlarge-55:29158:29158 [6] NCCL INFO NET/Socket : Using [0]eth0:172.31.231.150<0> [1]eth1:172.31.226.208<0> [2]eth2:172.31.239.214<0> [3]eth3:172.31.234.152<0>
gpu-st-p4d-24xlarge-55:29159:29159 [7] NCCL INFO NET/Socket : Using [0]eth0:172.31.231.150<0> [1]eth1:172.31.226.208<0> [2]eth2:172.31.239.214<0> [3]eth3:172.31.234.152<0>
gpu-st-p4d-24xlarge-55:29158:29158 [6] NCCL INFO Using network Socket
gpu-st-p4d-24xlarge-55:29159:29159 [7] NCCL INFO Using network Socket
gpu-st-p4d-24xlarge-55:29154:29154 [2] NCCL INFO NET/Socket : Using [0]eth0:172.31.231.150<0> [1]eth1:172.31.226.208<0> [2]eth2:172.31.239.214<0> [3]eth3:172.31.234.152<0>
gpu-st-p4d-24xlarge-55:29154:29154 [2] NCCL INFO Using network Socket
gpu-st-p4d-24xlarge-48:30193:30193 [2] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-48:30193:30193 [2] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-48:30193:30193 [2] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
gpu-st-p4d-24xlarge-48:30193:30193 [2] NCCL INFO P2P plugin IBext
gpu-st-p4d-24xlarge-48:30191:30191 [0] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-48:30191:30191 [0] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-48:30191:30191 [0] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
gpu-st-p4d-24xlarge-48:30191:30191 [0] NCCL INFO P2P plugin IBext
gpu-st-p4d-24xlarge-48:30194:30194 [3] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-48:30196:30196 [5] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-48:30194:30194 [3] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-48:30196:30196 [5] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-48:30194:30194 [3] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
gpu-st-p4d-24xlarge-48:30194:30194 [3] NCCL INFO P2P plugin IBext
gpu-st-p4d-24xlarge-48:30196:30196 [5] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
gpu-st-p4d-24xlarge-48:30196:30196 [5] NCCL INFO P2P plugin IBext
gpu-st-p4d-24xlarge-48:30192:30192 [1] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-47:30193:30193 [7] NCCL INFO Bootstrap : Using eth0:172.31.226.226<0>
gpu-st-p4d-24xlarge-48:30192:30192 [1] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-48:30192:30192 [1] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
gpu-st-p4d-24xlarge-48:30192:30192 [1] NCCL INFO P2P plugin IBext
gpu-st-p4d-24xlarge-47:30189:30189 [3] NCCL INFO Bootstrap : Using eth0:172.31.226.226<0>
gpu-st-p4d-24xlarge-47:30188:30188 [2] NCCL INFO Bootstrap : Using eth0:172.31.226.226<0>
gpu-st-p4d-24xlarge-47:30192:30192 [6] NCCL INFO Bootstrap : Using eth0:172.31.226.226<0>
gpu-st-p4d-24xlarge-47:30190:30190 [4] NCCL INFO Bootstrap : Using eth0:172.31.226.226<0>
gpu-st-p4d-24xlarge-47:30191:30191 [5] NCCL INFO Bootstrap : Using eth0:172.31.226.226<0>
gpu-st-p4d-24xlarge-48:30192:30192 [1] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-48:30191:30191 [0] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-48:30196:30196 [5] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-48:30194:30194 [3] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-48:30193:30193 [2] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-48:30195:30195 [4] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-48:30195:30195 [4] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-48:30195:30195 [4] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
gpu-st-p4d-24xlarge-48:30195:30195 [4] NCCL INFO P2P plugin IBext
gpu-st-p4d-24xlarge-48:30191:30191 [0] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-48:30196:30196 [5] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-48:30194:30194 [3] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-48:30192:30192 [1] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-48:30193:30193 [2] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-48:30191:30191 [0] NCCL INFO NET/Socket : Using [0]eth0:172.31.234.184<0> [1]eth1:172.31.234.169<0> [2]eth2:172.31.237.157<0> [3]eth3:172.31.230.169<0>
gpu-st-p4d-24xlarge-48:30191:30191 [0] NCCL INFO Using network Socket
gpu-st-p4d-24xlarge-48:30194:30194 [3] NCCL INFO NET/Socket : Using [0]eth0:172.31.234.184<0> [1]eth1:172.31.234.169<0> [2]eth2:172.31.237.157<0> [3]eth3:172.31.230.169<0>
gpu-st-p4d-24xlarge-48:30194:30194 [3] NCCL INFO Using network Socket
gpu-st-p4d-24xlarge-48:30192:30192 [1] NCCL INFO NET/Socket : Using [0]eth0:172.31.234.184<0> [1]eth1:172.31.234.169<0> [2]eth2:172.31.237.157<0> [3]eth3:172.31.230.169<0>
gpu-st-p4d-24xlarge-48:30192:30192 [1] NCCL INFO Using network Socket
gpu-st-p4d-24xlarge-48:30193:30193 [2] NCCL INFO NET/Socket : Using [0]eth0:172.31.234.184<0> [1]eth1:172.31.234.169<0> [2]eth2:172.31.237.157<0> [3]eth3:172.31.230.169<0>
gpu-st-p4d-24xlarge-48:30196:30196 [5] NCCL INFO NET/Socket : Using [0]eth0:172.31.234.184<0> [1]eth1:172.31.234.169<0> [2]eth2:172.31.237.157<0> [3]eth3:172.31.230.169<0>
gpu-st-p4d-24xlarge-48:30193:30193 [2] NCCL INFO Using network Socket
gpu-st-p4d-24xlarge-48:30196:30196 [5] NCCL INFO Using network Socket
gpu-st-p4d-24xlarge-48:30195:30195 [4] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-48:30195:30195 [4] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-48:30195:30195 [4] NCCL INFO NET/Socket : Using [0]eth0:172.31.234.184<0> [1]eth1:172.31.234.169<0> [2]eth2:172.31.237.157<0> [3]eth3:172.31.230.169<0>
gpu-st-p4d-24xlarge-48:30195:30195 [4] NCCL INFO Using network Socket
gpu-st-p4d-24xlarge-50:29213:29213 [6] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-50:29216:29216 [1] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-50:29212:29212 [5] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-50:29211:29211 [4] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-50:29216:29216 [1] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-50:29213:29213 [6] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-50:29212:29212 [5] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-50:29211:29211 [4] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-50:29216:29216 [1] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
gpu-st-p4d-24xlarge-50:29213:29213 [6] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
gpu-st-p4d-24xlarge-50:29212:29212 [5] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
gpu-st-p4d-24xlarge-50:29216:29216 [1] NCCL INFO P2P plugin IBext
gpu-st-p4d-24xlarge-50:29211:29211 [4] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
gpu-st-p4d-24xlarge-50:29213:29213 [6] NCCL INFO P2P plugin IBext
gpu-st-p4d-24xlarge-50:29212:29212 [5] NCCL INFO P2P plugin IBext
gpu-st-p4d-24xlarge-50:29211:29211 [4] NCCL INFO P2P plugin IBext
gpu-st-p4d-24xlarge-50:29215:29215 [0] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-50:29214:29214 [7] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-50:29215:29215 [0] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-50:29214:29214 [7] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-50:29215:29215 [0] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
gpu-st-p4d-24xlarge-50:29214:29214 [7] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
gpu-st-p4d-24xlarge-50:29215:29215 [0] NCCL INFO P2P plugin IBext
gpu-st-p4d-24xlarge-50:29214:29214 [7] NCCL INFO P2P plugin IBext
gpu-st-p4d-24xlarge-50:29211:29211 [4] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-50:29216:29216 [1] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-50:29214:29214 [7] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-50:29213:29213 [6] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-50:29212:29212 [5] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-50:29215:29215 [0] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-50:29211:29211 [4] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-50:29216:29216 [1] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-50:29213:29213 [6] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-50:29212:29212 [5] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-50:29215:29215 [0] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-50:29214:29214 [7] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-50:29211:29211 [4] NCCL INFO NET/Socket : Using [0]eth0:172.31.231.152<0> [1]eth1:172.31.239.148<0> [2]eth2:172.31.238.214<0> [3]eth3:172.31.234.88<0>
gpu-st-p4d-24xlarge-50:29211:29211 [4] NCCL INFO Using network Socket
gpu-st-p4d-24xlarge-50:29213:29213 [6] NCCL INFO NET/Socket : Using [0]eth0:172.31.231.152<0> [1]eth1:172.31.239.148<0> [2]eth2:172.31.238.214<0> [3]eth3:172.31.234.88<0>
gpu-st-p4d-24xlarge-50:29213:29213 [6] NCCL INFO Using network Socket
gpu-st-p4d-24xlarge-50:29215:29215 [0] NCCL INFO NET/Socket : Using [0]eth0:172.31.231.152<0> [1]eth1:172.31.239.148<0> [2]eth2:172.31.238.214<0> [3]eth3:172.31.234.88<0>
gpu-st-p4d-24xlarge-50:29212:29212 [5] NCCL INFO NET/Socket : Using [0]eth0:172.31.231.152<0> [1]eth1:172.31.239.148<0> [2]eth2:172.31.238.214<0> [3]eth3:172.31.234.88<0>
gpu-st-p4d-24xlarge-50:29215:29215 [0] NCCL INFO Using network Socket
gpu-st-p4d-24xlarge-50:29212:29212 [5] NCCL INFO Using network Socket
gpu-st-p4d-24xlarge-50:29214:29214 [7] NCCL INFO NET/Socket : Using [0]eth0:172.31.231.152<0> [1]eth1:172.31.239.148<0> [2]eth2:172.31.238.214<0> [3]eth3:172.31.234.88<0>
gpu-st-p4d-24xlarge-50:29214:29214 [7] NCCL INFO Using network Socket
gpu-st-p4d-24xlarge-50:29216:29216 [1] NCCL INFO NET/Socket : Using [0]eth0:172.31.231.152<0> [1]eth1:172.31.239.148<0> [2]eth2:172.31.238.214<0> [3]eth3:172.31.234.88<0>
gpu-st-p4d-24xlarge-50:29216:29216 [1] NCCL INFO Using network Socket
gpu-st-p4d-24xlarge-47:30192:30192 [6] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-47:30192:30192 [6] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-47:30192:30192 [6] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
gpu-st-p4d-24xlarge-47:30192:30192 [6] NCCL INFO P2P plugin IBext
gpu-st-p4d-24xlarge-47:30192:30192 [6] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-47:30192:30192 [6] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-47:30192:30192 [6] NCCL INFO NET/Socket : Using [0]eth0:172.31.226.226<0> [1]eth1:172.31.235.228<0> [2]eth2:172.31.224.181<0> [3]eth3:172.31.227.177<0>
gpu-st-p4d-24xlarge-47:30192:30192 [6] NCCL INFO Using network Socket
gpu-st-p4d-24xlarge-47:30191:30191 [5] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-47:30191:30191 [5] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-47:30191:30191 [5] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
gpu-st-p4d-24xlarge-47:30191:30191 [5] NCCL INFO P2P plugin IBext
gpu-st-p4d-24xlarge-47:30191:30191 [5] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-47:30191:30191 [5] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-47:30191:30191 [5] NCCL INFO NET/Socket : Using [0]eth0:172.31.226.226<0> [1]eth1:172.31.235.228<0> [2]eth2:172.31.224.181<0> [3]eth3:172.31.227.177<0>
gpu-st-p4d-24xlarge-47:30191:30191 [5] NCCL INFO Using network Socket
gpu-st-p4d-24xlarge-47:30190:30190 [4] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-47:30189:30189 [3] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-47:30188:30188 [2] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-47:30193:30193 [7] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-47:30189:30189 [3] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-47:30190:30190 [4] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-47:30188:30188 [2] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-47:30193:30193 [7] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v5 symbol.
gpu-st-p4d-24xlarge-47:30190:30190 [4] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
gpu-st-p4d-24xlarge-47:30189:30189 [3] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
gpu-st-p4d-24xlarge-47:30188:30188 [2] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
gpu-st-p4d-24xlarge-47:30190:30190 [4] NCCL INFO P2P plugin IBext
gpu-st-p4d-24xlarge-47:30193:30193 [7] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
gpu-st-p4d-24xlarge-47:30189:30189 [3] NCCL INFO P2P plugin IBext
gpu-st-p4d-24xlarge-47:30188:30188 [2] NCCL INFO P2P plugin IBext
gpu-st-p4d-24xlarge-47:30193:30193 [7] NCCL INFO P2P plugin IBext
gpu-st-p4d-24xlarge-47:30188:30188 [2] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-47:30190:30190 [4] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-47:30189:30189 [3] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-47:30193:30193 [7] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-47:30189:30189 [3] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-47:30193:30193 [7] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-47:30188:30188 [2] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-47:30190:30190 [4] NCCL INFO NET/IB : No device found.
gpu-st-p4d-24xlarge-47:30189:30189 [3] NCCL INFO NET/Socket : Using [0]eth0:172.31.226.226<0> [1]eth1:172.31.235.228<0> [2]eth2:172.31.224.181<0> [3]eth3:172.31.227.177<0>
gpu-st-p4d-24xlarge-47:30189:30189 [3] NCCL INFO Using network Socket
gpu-st-p4d-24xlarge-47:30188:30188 [2] NCCL INFO NET/Socket : Using [0]eth0:172.31.226.226<0> [1]eth1:172.31.235.228<0> [2]eth2:172.31.224.181<0> [3]eth3:172.31.227.177<0>
gpu-st-p4d-24xlarge-47:30188:30188 [2] NCCL INFO Using network Socket
gpu-st-p4d-24xlarge-47:30193:30193 [7] NCCL INFO NET/Socket : Using [0]eth0:172.31.226.226<0> [1]eth1:172.31.235.228<0> [2]eth2:172.31.224.181<0> [3]eth3:172.31.227.177<0>
gpu-st-p4d-24xlarge-47:30190:30190 [4] NCCL INFO NET/Socket : Using [0]eth0:172.31.226.226<0> [1]eth1:172.31.235.228<0> [2]eth2:172.31.224.181<0> [3]eth3:172.31.227.177<0>
gpu-st-p4d-24xlarge-47:30193:30193 [7] NCCL INFO Using network Socket
gpu-st-p4d-24xlarge-47:30190:30190 [4] NCCL INFO Using network Socket
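Up to this point every rank on every node loads the IBext P2P plugin but then reports "NET/IB : No device found", so NCCL falls back to plain TCP sockets across the four eth* interfaces. On p4d.24xlarge the fast interconnect is EFA, normally reached through the aws-ofi-nccl libfabric plugin rather than IB verbs, so this fallback suggests the EFA stack was not picked up. A minimal sketch of the environment knobs involved, assuming a PyTorch job launched with torchrun; the values are illustrative assumptions, not taken from this log:

import os

# Assumed configuration sketch: set before the NCCL communicator is created.
os.environ.setdefault("NCCL_DEBUG", "INFO")         # reproduces logs like the ones above
os.environ.setdefault("NCCL_SOCKET_IFNAME", "eth")  # limit the socket fallback to the eth* ENIs
os.environ.setdefault("FI_PROVIDER", "efa")         # ask libfabric (aws-ofi-nccl) for the EFA provider
# os.environ["NCCL_IB_DISABLE"] = "1"               # optional: skip the failing IB probe entirely

import torch
import torch.distributed as dist

dist.init_process_group(backend="nccl")             # under torchrun, rank/world size come from env
torch.cuda.set_device(dist.get_rank() % torch.cuda.device_count())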
gpu-st-p4d-24xlarge-59:27920:28519 [4] NCCL INFO Setting affinity for GPU 4 to ffff,ff000000
gpu-st-p4d-24xlarge-52:39400:39922 [3] NCCL INFO Setting affinity for GPU 3 to ffffff
gpu-st-p4d-24xlarge-48:30196:30702 [5] NCCL INFO Setting affinity for GPU 5 to ffff,ff000000
gpu-st-p4d-24xlarge-52:39398:39917 [1] NCCL INFO Setting affinity for GPU 1 to ffffff
gpu-st-p4d-24xlarge-48:30194:30699 [3] NCCL INFO Setting affinity for GPU 3 to ffffff
gpu-st-p4d-24xlarge-59:27919:28520 [3] NCCL INFO Setting affinity for GPU 3 to ffffff
gpu-st-p4d-24xlarge-46:30082:30636 [5] NCCL INFO Setting affinity for GPU 5 to ffff,ff000000
gpu-st-p4d-24xlarge-49:25777:26283 [7] NCCL INFO Setting affinity for GPU 7 to ffff,ff000000
gpu-st-p4d-24xlarge-58:27719:28237 [4] NCCL INFO Setting affinity for GPU 4 to ffff,ff000000
gpu-st-p4d-24xlarge-46:30086:30632 [1] NCCL INFO Setting affinity for GPU 1 to ffffff
gpu-st-p4d-24xlarge-54:29516:30071 [6] NCCL INFO Setting affinity for GPU 6 to ffff,ff000000
gpu-st-p4d-24xlarge-56:28173:28967 [2] NCCL INFO Setting affinity for GPU 2 to ffffff
gpu-st-p4d-24xlarge-58:27720:28232 [5] NCCL INFO Setting affinity for GPU 5 to ffff,ff000000
gpu-st-p4d-24xlarge-54:29517:30068 [7] NCCL INFO Setting affinity for GPU 7 to ffff,ff000000
gpu-st-p4d-24xlarge-56:28175:28972 [4] NCCL INFO Setting affinity for GPU 4 to ffff,ff000000
gpu-st-p4d-24xlarge-49:25780:26286 [2] NCCL INFO Setting affinity for GPU 2 to ffffff
gpu-st-p4d-24xlarge-54:29514:30069 [4] NCCL INFO Setting affinity for GPU 4 to ffff,ff000000
gpu-st-p4d-24xlarge-58:27724:28235 [1] NCCL INFO Setting affinity for GPU 1 to ffffff
gpu-st-p4d-24xlarge-49:25779:26285 [1] NCCL INFO Setting affinity for GPU 1 to ffffff
gpu-st-p4d-24xlarge-56:28176:28971 [5] NCCL INFO Setting affinity for GPU 5 to ffff,ff000000
gpu-st-p4d-24xlarge-56:28172:28968 [1] NCCL INFO Setting affinity for GPU 1 to ffffff
gpu-st-p4d-24xlarge-49:25781:26281 [3] NCCL INFO Setting affinity for GPU 3 to ffffff
gpu-st-p4d-24xlarge-56:28174:28970 [3] NCCL INFO Setting affinity for GPU 3 to ffffff
gpu-st-p4d-24xlarge-54:29515:30066 [5] NCCL INFO Setting affinity for GPU 5 to ffff,ff000000
gpu-st-p4d-24xlarge-45:30145:30643 [3] NCCL INFO Setting affinity for GPU 3 to ffffff
gpu-st-p4d-24xlarge-44:28066:28906 [1] NCCL INFO Setting affinity for GPU 1 to ffffff
gpu-st-p4d-24xlarge-59:27921:28521 [5] NCCL INFO Setting affinity for GPU 5 to ffff,ff000000
gpu-st-p4d-24xlarge-45:30135:30642 [6] NCCL INFO Setting affinity for GPU 6 to ffff,ff000000
gpu-st-p4d-24xlarge-53:26989:27495 [1] NCCL INFO Setting affinity for GPU 1 to ffffff
gpu-st-p4d-24xlarge-49:25776:26284 [6] NCCL INFO Setting affinity for GPU 6 to ffff,ff000000
gpu-st-p4d-24xlarge-59:27922:28516 [6] NCCL INFO Setting affinity for GPU 6 to ffff,ff000000
gpu-st-p4d-24xlarge-59:27923:28518 [7] NCCL INFO Setting affinity for GPU 7 to ffff,ff000000
gpu-st-p4d-24xlarge-45:30138:30645 [1] NCCL INFO Setting affinity for GPU 1 to ffffff
gpu-st-p4d-24xlarge-59:27918:28517 [2] NCCL INFO Setting affinity for GPU 2 to ffffff
gpu-st-p4d-24xlarge-58:27721:28234 [6] NCCL INFO Setting affinity for GPU 6 to ffff,ff000000
gpu-st-p4d-24xlarge-54:29519:30067 [1] NCCL INFO Setting affinity for GPU 1 to ffffff
gpu-st-p4d-24xlarge-53:26986:27498 [6] NCCL INFO Setting affinity for GPU 6 to ffff,ff000000
gpu-st-p4d-24xlarge-44:28067:28911 [2] NCCL INFO Setting affinity for GPU 2 to ffffff
gpu-st-p4d-24xlarge-51:30345:30854 [5] NCCL INFO Setting affinity for GPU 5 to ffff,ff000000
gpu-st-p4d-24xlarge-52:39399:39918 [2] NCCL INFO Setting affinity for GPU 2 to ffffff
gpu-st-p4d-24xlarge-57:27791:28308 [6] NCCL INFO Setting affinity for GPU 6 to ffff,ff000000
gpu-st-p4d-24xlarge-56:28171:28973 [0] NCCL INFO Setting affinity for GPU 0 to ffffff
gpu-st-p4d-24xlarge-45:30136:30644 [7] NCCL INFO Setting affinity for GPU 7 to ffff,ff000000
gpu-st-p4d-24xlarge-58:27722:28233 [7] NCCL INFO Setting affinity for GPU 7 to ffff,ff000000
gpu-st-p4d-24xlarge-48:30195:30703 [4] NCCL INFO Setting affinity for GPU 4 to ffff,ff000000
gpu-st-p4d-24xlarge-51:30347:30853 [7] NCCL INFO Setting affinity for GPU 7 to ffff,ff000000
gpu-st-p4d-24xlarge-54:29518:30070 [0] NCCL INFO Setting affinity for GPU 0 to ffffff
gpu-st-p4d-24xlarge-52:39402:39919 [5] NCCL INFO Setting affinity for GPU 5 to ffff,ff000000
gpu-st-p4d-24xlarge-52:39401:39921 [4] NCCL INFO Setting affinity for GPU 4 to ffff,ff000000
gpu-st-p4d-24xlarge-51:30344:30851 [4] NCCL INFO Setting affinity for GPU 4 to ffff,ff000000
gpu-st-p4d-24xlarge-52:39397:39920 [0] NCCL INFO Setting affinity for GPU 0 to ffffff
gpu-st-p4d-24xlarge-51:30342:30850 [2] NCCL INFO Setting affinity for GPU 2 to ffffff
gpu-st-p4d-24xlarge-58:27723:28236 [0] NCCL INFO Setting affinity for GPU 0 to ffffff
gpu-st-p4d-24xlarge-48:30193:30701 [2] NCCL INFO Setting affinity for GPU 2 to ffffff
gpu-st-p4d-24xlarge-51:30346:30852 [6] NCCL INFO Setting affinity for GPU 6 to ffff,ff000000
gpu-st-p4d-24xlarge-50:29214:29723 [7] NCCL INFO Setting affinity for GPU 7 to ffff,ff000000
gpu-st-p4d-24xlarge-45:30139:30646 [2] NCCL INFO Setting affinity for GPU 2 to ffffff
gpu-st-p4d-24xlarge-48:30192:30700 [1] NCCL INFO Setting affinity for GPU 1 to ffffff
gpu-st-p4d-24xlarge-44:28065:28903 [0] NCCL INFO Setting affinity for GPU 0 to ffffff
gpu-st-p4d-24xlarge-51:30343:30849 [3] NCCL INFO Setting affinity for GPU 3 to ffffff
gpu-st-p4d-24xlarge-44:28069:28907 [4] NCCL INFO Setting affinity for GPU 4 to ffff,ff000000
gpu-st-p4d-24xlarge-48:30191:30698 [0] NCCL INFO Setting affinity for GPU 0 to ffffff
gpu-st-p4d-24xlarge-44:28068:28905 [3] NCCL INFO Setting affinity for GPU 3 to ffffff
gpu-st-p4d-24xlarge-57:27794:28306 [1] NCCL INFO Setting affinity for GPU 1 to ffffff
gpu-st-p4d-24xlarge-47:30193:30702 [7] NCCL INFO Setting affinity for GPU 7 to ffff,ff000000
gpu-st-p4d-24xlarge-44:28070:28908 [5] NCCL INFO Setting affinity for GPU 5 to ffff,ff000000
gpu-st-p4d-24xlarge-46:30084:30634 [7] NCCL INFO Setting affinity for GPU 7 to ffff,ff000000
gpu-st-p4d-24xlarge-55:29155:29662 [3] NCCL INFO Setting affinity for GPU 3 to ffffff
gpu-st-p4d-24xlarge-47:30190:30703 [4] NCCL INFO Setting affinity for GPU 4 to ffff,ff000000
gpu-st-p4d-24xlarge-47:30189:30700 [3] NCCL INFO Setting affinity for GPU 3 to ffffff
gpu-st-p4d-24xlarge-45:30137:30647 [0] NCCL INFO Setting affinity for GPU 0 to ffffff
gpu-st-p4d-24xlarge-47:30191:30699 [5] NCCL INFO Setting affinity for GPU 5 to ffff,ff000000
gpu-st-p4d-24xlarge-55:29157:29663 [5] NCCL INFO Setting affinity for GPU 5 to ffff,ff000000
gpu-st-p4d-24xlarge-47:30188:30701 [2] NCCL INFO Setting affinity for GPU 2 to ffffff
gpu-st-p4d-24xlarge-47:30192:30698 [6] NCCL INFO Setting affinity for GPU 6 to ffff,ff000000
gpu-st-p4d-24xlarge-55:29154:29667 [2] NCCL INFO Setting affinity for GPU 2 to ffffff
gpu-st-p4d-24xlarge-55:29159:29666 [7] NCCL INFO Setting affinity for GPU 7 to ffff,ff000000
gpu-st-p4d-24xlarge-50:29213:29720 [6] NCCL INFO Setting affinity for GPU 6 to ffff,ff000000
gpu-st-p4d-24xlarge-46:30081:30633 [4] NCCL INFO Setting affinity for GPU 4 to ffff,ff000000
gpu-st-p4d-24xlarge-46:30083:30635 [6] NCCL INFO Setting affinity for GPU 6 to ffff,ff000000
gpu-st-p4d-24xlarge-55:29156:29664 [4] NCCL INFO Setting affinity for GPU 4 to ffff,ff000000
gpu-st-p4d-24xlarge-46:30085:30637 [0] NCCL INFO Setting affinity for GPU 0 to ffffff
gpu-st-p4d-24xlarge-53:26987:27494 [7] NCCL INFO Setting affinity for GPU 7 to ffff,ff000000
gpu-st-p4d-24xlarge-55:29158:29665 [6] NCCL INFO Setting affinity for GPU 6 to ffff,ff000000
gpu-st-p4d-24xlarge-50:29211:29719 [4] NCCL INFO Setting affinity for GPU 4 to ffff,ff000000
gpu-st-p4d-24xlarge-53:26990:27493 [2] NCCL INFO Setting affinity for GPU 2 to ffffff
gpu-st-p4d-24xlarge-49:25778:26280 [0] NCCL INFO Setting affinity for GPU 0 to ffffff
gpu-st-p4d-24xlarge-53:26991:27496 [3] NCCL INFO Setting affinity for GPU 3 to ffffff
gpu-st-p4d-24xlarge-57:27792:28305 [7] NCCL INFO Setting affinity for GPU 7 to ffff,ff000000
gpu-st-p4d-24xlarge-53:26988:27497 [0] NCCL INFO Setting affinity for GPU 0 to ffffff
gpu-st-p4d-24xlarge-57:27796:28307 [3] NCCL INFO Setting affinity for GPU 3 to ffffff
gpu-st-p4d-24xlarge-50:29216:29724 [1] NCCL INFO Setting affinity for GPU 1 to ffffff
gpu-st-p4d-24xlarge-57:27795:28304 [2] NCCL INFO Setting affinity for GPU 2 to ffffff
gpu-st-p4d-24xlarge-50:29212:29721 [5] NCCL INFO Setting affinity for GPU 5 to ffff,ff000000
gpu-st-p4d-24xlarge-50:29215:29722 [0] NCCL INFO Setting affinity for GPU 0 to ffffff
gpu-st-p4d-24xlarge-57:27793:28309 [0] NCCL INFO Setting affinity for GPU 0 to ffffff
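The affinity masks above are Linux cpuset bitmasks printed as comma-separated 32-bit hex words, most significant word first: ffffff pins a rank to cores 0-23 (GPUs 0-3, the first NUMA node on p4d.24xlarge) and ffff,ff000000 to cores 24-47 (GPUs 4-7). A small decoder for that format, assuming the word order just described, which is consistent with the two masks in this log:

def cpumask_to_cores(mask: str) -> list[int]:
    # Comma-separated 32-bit hex words, most significant first,
    # e.g. "ffff,ff000000" -> cores 24..47.
    value = 0
    for word in mask.split(","):
        value = (value << 32) | int(word, 16)
    return [i for i in range(value.bit_length()) if (value >> i) & 1]

print(cpumask_to_cores("ffffff"))         # [0, 1, ..., 23]
print(cpumask_to_cores("ffff,ff000000"))  # [24, 25, ..., 47]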
gpu-st-p4d-24xlarge-57:27793:28309 [0] NCCL INFO Trees [0] 81/-1/-1->80->79 [1] 81/-1/-1->80->79 [2] 81/-1/-1->80->89 [3] 81/-1/-1->80->89 [4] 81/-1/-1->80->79 [5] 81/-1/-1->80->79 [6] 81/76/-1->80->69 [7] 81/76/-1->80->69
gpu-st-p4d-24xlarge-57:27796:28307 [3] NCCL INFO Trees [0] -1/-1/-1->83->82 [1] -1/-1/-1->83->82 [2] 78/-1/-1->83->82 [3] 78/-1/-1->83->82 [4] -1/-1/-1->83->82 [5] -1/-1/-1->83->82 [6] 78/-1/-1->83->82 [7] 78/-1/-1->83->82
gpu-st-p4d-24xlarge-57:27794:28306 [1] NCCL INFO Trees [0] 82/-1/-1->81->80 [1] 82/-1/-1->81->80 [2] 82/-1/-1->81->80 [3] 82/-1/-1->81->80 [4] 82/-1/-1->81->80 [5] 82/-1/-1->81->80 [6] 82/88/-1->81->80 [7] 82/88/-1->81->80
gpu-st-p4d-24xlarge-57:27795:28304 [2] NCCL INFO Trees [0] 83/-1/-1->82->81 [1] 83/-1/-1->82->81 [2] 83/-1/-1->82->81 [3] 83/-1/-1->82->81 [4] 83/-1/-1->82->81 [5] 83/-1/-1->82->81 [6] 83/-1/-1->82->81 [7] 83/-1/-1->82->81
gpu-st-p4d-24xlarge-58:27719:28237 [4] NCCL INFO Trees [0] 85/90/-1->84->72 [1] 85/90/-1->84->72 [2] 85/-1/-1->84->89 [3] 85/-1/-1->84->89 [4] 85/-1/-1->84->79 [5] 85/-1/-1->84->79 [6] 85/-1/-1->84->89 [7] 85/-1/-1->84->89
gpu-st-p4d-24xlarge-58:27721:28234 [6] NCCL INFO Trees [0] 87/-1/-1->86->85 [1] 87/-1/-1->86->85 [2] 87/-1/-1->86->85 [3] 87/-1/-1->86->85 [4] 87/-1/-1->86->85 [5] 87/-1/-1->86->85 [6] 87/-1/-1->86->85 [7] 87/-1/-1->86->85
gpu-st-p4d-24xlarge-58:27720:28232 [5] NCCL INFO Trees [0] 86/78/-1->85->84 [1] 86/78/-1->85->84 [2] 86/-1/-1->85->84 [3] 86/-1/-1->85->84 [4] 86/-1/-1->85->84 [5] 86/-1/-1->85->84 [6] 86/-1/-1->85->84 [7] 86/-1/-1->85->84
gpu-st-p4d-24xlarge-58:27722:28233 [7] NCCL INFO Trees [0] 88/-1/-1->87->86 [1] 88/-1/-1->87->86 [2] -1/-1/-1->87->86 [3] -1/-1/-1->87->86 [4] 88/-1/-1->87->86 [5] 88/-1/-1->87->86 [6] -1/-1/-1->87->86 [7] -1/-1/-1->87->86
gpu-st-p4d-24xlarge-58:27723:28236 [0] NCCL INFO Trees [0] 89/-1/-1->88->87 [1] 89/-1/-1->88->87 [2] 89/92/-1->88->76 [3] 89/92/-1->88->76 [4] 89/-1/-1->88->87 [5] 89/-1/-1->88->87 [6] 89/-1/-1->88->81 [7] 89/-1/-1->88->81
gpu-st-p4d-24xlarge-58:27724:28235 [1] NCCL INFO Trees [0] -1/-1/-1->89->88 [1] -1/-1/-1->89->88 [2] 84/80/-1->89->88 [3] 84/80/-1->89->88 [4] -1/-1/-1->89->88 [5] -1/-1/-1->89->88 [6] 84/-1/-1->89->88 [7] 84/-1/-1->89->88
gpu-st-p4d-24xlarge-59:27918:28517 [2] NCCL INFO Trees [0] 91/-1/-1->90->84 [1] 91/-1/-1->90->84 [2] 91/-1/-1->90->95 [3] 91/-1/-1->90->95 [4] 91/42/-1->90->-1 [5] 91/42/-1->90->-1 [6] 91/-1/-1->90->95 [7] 91/-1/-1->90->95
gpu-st-p4d-24xlarge-56:28176:28971 [5] NCCL INFO Trees [0] -1/-1/-1->77->76 [1] -1/-1/-1->77->76 [2] 72/64/-1->77->76 [3] 72/64/-1->77->76 [4] -1/-1/-1->77->76 [5] -1/-1/-1->77->76 [6] 72/-1/-1->77->76 [7] 72/-1/-1->77->76
gpu-st-p4d-24xlarge-44:28065:28903 [0] NCCL INFO Channel 00/08 : 0 5 4 3 2 1 6 11 10 9 8 7 12 17 16 15 14 13 18 23
gpu-st-p4d-24xlarge-44:28065:28903 [0] NCCL INFO Channel 01/08 : 0 5 4 3 2 1 6 11 10 9 8 7 12 17 16 15 14 13 18 23
gpu-st-p4d-24xlarge-57:27792:28305 [7] NCCL INFO Trees [0] 80/-1/-1->79->78 [1] 80/-1/-1->79->78 [2] -1/-1/-1->79->78 [3] -1/-1/-1->79->78 [4] 80/84/-1->79->78 [5] 80/84/-1->79->78 [6] -1/-1/-1->79->78 [7] -1/-1/-1->79->78
gpu-st-p4d-24xlarge-44:28065:28903 [0] NCCL INFO Channel 02/08 : 0 3 2 5 8 7 6 11 10 9 16 13 12 15 14 17 20 19 18 23
gpu-st-p4d-24xlarge-44:28065:28903 [0] NCCL INFO Channel 03/08 : 0 3 2 5 8 7 6 11 10 9 16 13 12 15 14 17 20 19 18 23
gpu-st-p4d-24xlarge-44:28067:28911 [2] NCCL INFO Trees [0] 3/-1/-1->2->1 [1] 3/-1/-1->2->1 [2] 3/-1/-1->2->1 [3] 3/-1/-1->2->1 [4] 3/-1/-1->2->1 [5] 3/-1/-1->2->1 [6] 3/-1/-1->2->1 [7] 3/-1/-1->2->1
gpu-st-p4d-24xlarge-59:27923:28518 [7] NCCL INFO Trees [0] -1/-1/-1->95->94 [1] -1/-1/-1->95->94 [2] 90/-1/-1->95->94 [3] 90/-1/-1->95->94 [4] -1/-1/-1->95->94 [5] -1/-1/-1->95->94 [6] 90/-1/-1->95->94 [7] 90/-1/-1->95->94
gpu-st-p4d-24xlarge-59:27920:28519 [4] NCCL INFO Trees [0] 93/-1/-1->92->91 [1] 93/-1/-1->92->91 [2] 93/-1/-1->92->88 [3] 93/-1/-1->92->88 [4] 93/-1/-1->92->91 [5] 93/-1/-1->92->91 [6] 93/44/-1->92->-1 [7] 93/44/-1->92->-1
gpu-st-p4d-24xlarge-56:28175:28972 [4] NCCL INFO Trees [0] 77/-1/-1->76->75 [1] 77/-1/-1->76->75 [2] 77/88/-1->76->52 [3] 77/88/-1->76->52 [4] 77/-1/-1->76->75 [5] 77/-1/-1->76->75 [6] 77/-1/-1->76->80 [7] 77/-1/-1->76->80
gpu-st-p4d-24xlarge-44:28066:28906 [1] NCCL INFO Trees [0] 2/-1/-1->1->0 [1] 2/-1/-1->1->0 [2] 2/-1/-1->1->0 [3] 2/-1/-1->1->0 [4] 2/-1/-1->1->0 [5] 2/-1/-1->1->0 [6] 2/-1/-1->1->0 [7] 2/-1/-1->1->0
gpu-st-p4d-24xlarge-44:28065:28903 [0] NCCL INFO Channel 04/08 : 0 5 4 3 2 1 6 11 10 9 8 7 12 17 16 15 14 13 18 23
gpu-st-p4d-24xlarge-44:28065:28903 [0] NCCL INFO Channel 05/08 : 0 5 4 3 2 1 6 11 10 9 8 7 12 17 16 15 14 13 18 23
gpu-st-p4d-24xlarge-57:27791:28308 [6] NCCL INFO Trees [0] 79/-1/-1->78->85 [1] 79/-1/-1->78->85 [2] 79/-1/-1->78->83 [3] 79/-1/-1->78->83 [4] 79/72/-1->78->67 [5] 79/72/-1->78->67 [6] 79/-1/-1->78->83 [7] 79/-1/-1->78->83
gpu-st-p4d-24xlarge-56:28174:28970 [3] NCCL INFO Trees [0] 76/-1/-1->75->74 [1] 76/-1/-1->75->74 [2] -1/-1/-1->75->74 [3] -1/-1/-1->75->74 [4] 76/-1/-1->75->74 [5] 76/-1/-1->75->74 [6] -1/-1/-1->75->74 [7] -1/-1/-1->75->74
gpu-st-p4d-24xlarge-44:28068:28905 [3] NCCL INFO Trees [0] 4/-1/-1->3->2 [1] 4/-1/-1->3->2 [2] -1/-1/-1->3->2 [3] -1/-1/-1->3->2 [4] 4/-1/-1->3->2 [5] 4/-1/-1->3->2 [6] -1/-1/-1->3->2 [7] -1/-1/-1->3->2
gpu-st-p4d-24xlarge-44:28070:28908 [5] NCCL INFO Trees [0] -1/-1/-1->5->4 [1] -1/-1/-1->5->4 [2] 0/-1/-1->5->4 [3] 0/-1/-1->5->4 [4] -1/-1/-1->5->4 [5] -1/-1/-1->5->4 [6] 0/-1/-1->5->4 [7] 0/-1/-1->5->4
gpu-st-p4d-24xlarge-44:28065:28903 [0] NCCL INFO Channel 06/08 : 0 3 2 5 8 7 6 11 10 9 16 13 12 15 14 17 20 19 18 23
gpu-st-p4d-24xlarge-44:28069:28907 [4] NCCL INFO Trees [0] 5/-1/-1->4->3 [1] 5/-1/-1->4->3 [2] 5/52/-1->4->-1 [3] 5/52/-1->4->-1 [4] 5/-1/-1->4->3 [5] 5/-1/-1->4->3 [6] 5/-1/-1->4->8 [7] 5/-1/-1->4->8
gpu-st-p4d-24xlarge-45:30136:30644 [7] NCCL INFO Trees [0] 8/-1/-1->7->6 [1] 8/-1/-1->7->6 [2] -1/-1/-1->7->6 [3] -1/-1/-1->7->6 [4] 8/12/-1->7->6 [5] 8/12/-1->7->6 [6] -1/-1/-1->7->6 [7] -1/-1/-1->7->6
gpu-st-p4d-24xlarge-45:30137:30647 [0] NCCL INFO Trees [0] 9/-1/-1->8->7 [1] 9/-1/-1->8->7 [2] 9/-1/-1->8->17 [3] 9/-1/-1->8->17 [4] 9/-1/-1->8->7 [5] 9/-1/-1->8->7 [6] 9/4/-1->8->20 [7] 9/4/-1->8->20
gpu-st-p4d-24xlarge-55:29159:29666 [7] NCCL INFO Trees [0] -1/-1/-1->71->70 [1] -1/-1/-1->71->70 [2] 66/-1/-1->71->70 [3] 66/-1/-1->71->70 [4] -1/-1/-1->71->70 [5] -1/-1/-1->71->70 [6] 66/-1/-1->71->70 [7] 66/-1/-1->71->70
gpu-st-p4d-24xlarge-55:29158:29665 [6] NCCL INFO Trees [0] 71/-1/-1->70->69 [1] 71/-1/-1->70->69 [2] 71/-1/-1->70->69 [3] 71/-1/-1->70->69 [4] 71/-1/-1->70->69 [5] 71/-1/-1->70->69 [6] 71/-1/-1->70->69 [7] 71/-1/-1->70->69
gpu-st-p4d-24xlarge-56:28173:28967 [2] NCCL INFO Trees [0] 75/-1/-1->74->73 [1] 75/-1/-1->74->73 [2] 75/-1/-1->74->73 [3] 75/-1/-1->74->73 [4] 75/-1/-1->74->73 [5] 75/-1/-1->74->73 [6] 75/-1/-1->74->73 [7] 75/-1/-1->74->73
gpu-st-p4d-24xlarge-44:28065:28903 [0] NCCL INFO Channel 07/08 : 0 3 2 5 8 7 6 11 10 9 16 13 12 15 14 17 20 19 18 23
gpu-st-p4d-24xlarge-44:28065:28903 [0] NCCL INFO Trees [0] 1/48/-1->0->-1 [1] 1/48/-1->0->-1 [2] 1/-1/-1->0->5 [3] 1/-1/-1->0->5 [4] 1/-1/-1->0->6 [5] 1/-1/-1->0->6 [6] 1/-1/-1->0->5 [7] 1/-1/-1->0->5
gpu-st-p4d-24xlarge-54:29518:30070 [0] NCCL INFO Trees [0] 65/-1/-1->64->63 [1] 65/-1/-1->64->63 [2] 65/68/-1->64->77 [3] 65/68/-1->64->77 [4] 65/-1/-1->64->63 [5] 65/-1/-1->64->63 [6] 65/-1/-1->64->57 [7] 65/-1/-1->64->57
gpu-st-p4d-24xlarge-54:29519:30067 [1] NCCL INFO Trees [0] -1/-1/-1->65->64 [1] -1/-1/-1->65->64 [2] 60/56/-1->65->64 [3] 60/56/-1->65->64 [4] -1/-1/-1->65->64 [5] -1/-1/-1->65->64 [6] 60/-1/-1->65->64 [7] 60/-1/-1->65->64
gpu-st-p4d-24xlarge-46:30085:30637 [0] NCCL INFO Trees [0] 17/-1/-1->16->15 [1] 17/-1/-1->16->15 [2] 17/20/-1->16->29 [3] 17/20/-1->16->29 [4] 17/-1/-1->16->15 [5] 17/-1/-1->16->15 [6] 17/-1/-1->16->9 [7] 17/-1/-1->16->9
gpu-st-p4d-24xlarge-59:27922:28516 [6] NCCL INFO Trees [0] 95/-1/-1->94->93 [1] 95/-1/-1->94->93 [2] 95/-1/-1->94->93 [3] 95/-1/-1->94->93 [4] 95/-1/-1->94->93 [5] 95/-1/-1->94->93 [6] 95/-1/-1->94->93 [7] 95/-1/-1->94->93
gpu-st-p4d-24xlarge-59:27921:28521 [5] NCCL INFO Trees [0] 94/-1/-1->93->92 [1] 94/-1/-1->93->92 [2] 94/-1/-1->93->92 [3] 94/-1/-1->93->92 [4] 94/-1/-1->93->92 [5] 94/-1/-1->93->92 [6] 94/-1/-1->93->92 [7] 94/-1/-1->93->92
gpu-st-p4d-24xlarge-59:27919:28520 [3] NCCL INFO Trees [0] 92/-1/-1->91->90 [1] 92/-1/-1->91->90 [2] -1/-1/-1->91->90 [3] -1/-1/-1->91->90 [4] 92/-1/-1->91->90 [5] 92/-1/-1->91->90 [6] -1/-1/-1->91->90 [7] -1/-1/-1->91->90
gpu-st-p4d-24xlarge-56:28171:28973 [0] NCCL INFO Trees [0] 73/84/-1->72->48 [1] 73/84/-1->72->48 [2] 73/-1/-1->72->77 [3] 73/-1/-1->72->77 [4] 73/-1/-1->72->78 [5] 73/-1/-1->72->78 [6] 73/-1/-1->72->77 [7] 73/-1/-1->72->77
gpu-st-p4d-24xlarge-56:28172:28968 [1] NCCL INFO Trees [0] 74/60/-1->73->72 [1] 74/60/-1->73->72 [2] 74/-1/-1->73->72 [3] 74/-1/-1->73->72 [4] 74/-1/-1->73->72 [5] 74/-1/-1->73->72 [6] 74/-1/-1->73->72 [7] 74/-1/-1->73->72
gpu-st-p4d-24xlarge-45:30145:30643 [3] NCCL INFO Trees [0] -1/-1/-1->11->10 [1] -1/-1/-1->11->10 [2] 6/-1/-1->11->10 [3] 6/-1/-1->11->10 [4] -1/-1/-1->11->10 [5] -1/-1/-1->11->10 [6] 6/-1/-1->11->10 [7] 6/-1/-1->11->10
gpu-st-p4d-24xlarge-45:30139:30646 [2] NCCL INFO Trees [0] 11/-1/-1->10->9 [1] 11/-1/-1->10->9 [2] 11/-1/-1->10->9 [3] 11/-1/-1->10->9 [4] 11/-1/-1->10->9 [5] 11/-1/-1->10->9 [6] 11/-1/-1->10->9 [7] 11/-1/-1->10->9
gpu-st-p4d-24xlarge-45:30138:30645 [1] NCCL INFO Trees [0] 10/-1/-1->9->8 [1] 10/-1/-1->9->8 [2] 10/-1/-1->9->8 [3] 10/-1/-1->9->8 [4] 10/-1/-1->9->8 [5] 10/-1/-1->9->8 [6] 10/16/-1->9->8 [7] 10/16/-1->9->8
gpu-st-p4d-24xlarge-45:30135:30642 [6] NCCL INFO Trees [0] 7/-1/-1->6->13 [1] 7/-1/-1->6->13 [2] 7/-1/-1->6->11 [3] 7/-1/-1->6->11 [4] 7/0/-1->6->18 [5] 7/0/-1->6->18 [6] 7/-1/-1->6->11 [7] 7/-1/-1->6->11
gpu-st-p4d-24xlarge-53:26988:27497 [0] NCCL INFO Trees [0] 57/-1/-1->56->55 [1] 57/-1/-1->56->55 [2] 57/-1/-1->56->65 [3] 57/-1/-1->56->65 [4] 57/-1/-1->56->55 [5] 57/-1/-1->56->55 [6] 57/52/-1->56->68 [7] 57/52/-1->56->68
gpu-st-p4d-24xlarge-55:29157:29663 [5] NCCL INFO Trees [0] 70/-1/-1->69->68 [1] 70/-1/-1->69->68 [2] 70/-1/-1->69->68 [3] 70/-1/-1->69->68 [4] 70/-1/-1->69->68 [5] 70/-1/-1->69->68 [6] 70/80/-1->69->68 [7] 70/80/-1->69->68
gpu-st-p4d-24xlarge-52:39402:39919 [5] NCCL INFO Trees [0] -1/-1/-1->53->52 [1] -1/-1/-1->53->52 [2] 48/28/-1->53->52 [3] 48/28/-1->53->52 [4] -1/-1/-1->53->52 [5] -1/-1/-1->53->52 [6] 48/-1/-1->53->52 [7] 48/-1/-1->53->52
gpu-st-p4d-24xlarge-52:39401:39921 [4] NCCL INFO Trees [0] 53/-1/-1->52->51 [1] 53/-1/-1->52->51 [2] 53/76/-1->52->4 [3] 53/76/-1->52->4 [4] 53/-1/-1->52->51 [5] 53/-1/-1->52->51 [6] 53/-1/-1->52->56 [7] 53/-1/-1->52->56
gpu-st-p4d-24xlarge-54:29517:30068 [7] NCCL INFO Trees [0] 64/-1/-1->63->62 [1] 64/-1/-1->63->62 [2] -1/-1/-1->63->62 [3] -1/-1/-1->63->62 [4] 64/-1/-1->63->62 [5] 64/-1/-1->63->62 [6] -1/-1/-1->63->62 [7] -1/-1/-1->63->62
gpu-st-p4d-24xlarge-51:30347:30853 [7] NCCL INFO Trees [0] -1/-1/-1->47->46 [1] -1/-1/-1->47->46 [2] 42/-1/-1->47->46 [3] 42/-1/-1->47->46 [4] -1/-1/-1->47->46 [5] -1/-1/-1->47->46 [6] 42/-1/-1->47->46 [7] 42/-1/-1->47->46
gpu-st-p4d-24xlarge-51:30346:30852 [6] NCCL INFO Trees [0] 47/-1/-1->46->45 [1] 47/-1/-1->46->45 [2] 47/-1/-1->46->45 [3] 47/-1/-1->46->45 [4] 47/-1/-1->46->45 [5] 47/-1/-1->46->45 [6] 47/-1/-1->46->45 [7] 47/-1/-1->46->45
gpu-st-p4d-24xlarge-55:29156:29664 [4] NCCL INFO Trees [0] 69/-1/-1->68->67 [1] 69/-1/-1->68->67 [2] 69/-1/-1->68->64 [3] 69/-1/-1->68->64 [4] 69/-1/-1->68->67 [5] 69/-1/-1->68->67 [6] 69/56/-1->68->45 [7] 69/56/-1->68->45
gpu-st-p4d-24xlarge-46:30083:30635 [6] NCCL INFO Trees [0] 15/-1/-1->14->13 [1] 15/-1/-1->14->13 [2] 15/-1/-1->14->13 [3] 15/-1/-1->14->13 [4] 15/-1/-1->14->13 [5] 15/-1/-1->14->13 [6] 15/-1/-1->14->13 [7] 15/-1/-1->14->13
gpu-st-p4d-24xlarge-50:29215:29722 [0] NCCL INFO Trees [0] 41/-1/-1->40->39 [1] 41/-1/-1->40->39 [2] 41/44/-1->40->28 [3] 41/44/-1->40->28 [4] 41/-1/-1->40->39 [5] 41/-1/-1->40->39 [6] 41/-1/-1->40->33 [7] 41/-1/-1->40->33
gpu-st-p4d-24xlarge-47:30188:30701 [2] NCCL INFO Trees [0] 19/-1/-1->18->12 [1] 19/-1/-1->18->12 [2] 19/-1/-1->18->23 [3] 19/-1/-1->18->23 [4] 19/6/-1->18->42 [5] 19/6/-1->18->42 [6] 19/-1/-1->18->23 [7] 19/-1/-1->18->23
gpu-st-p4d-24xlarge-47:30189:30700 [3] NCCL INFO Trees [0] 20/-1/-1->19->18 [1] 20/-1/-1->19->18 [2] -1/-1/-1->19->18 [3] -1/-1/-1->19->18 [4] 20/30/-1->19->18 [5] 20/30/-1->19->18 [6] -1/-1/-1->19->18 [7] -1/-1/-1->19->18
gpu-st-p4d-24xlarge-47:30192:30698 [6] NCCL INFO Trees [0] 23/-1/-1->22->21 [1] 23/-1/-1->22->21 [2] 23/-1/-1->22->21 [3] 23/-1/-1->22->21 [4] 23/-1/-1->22->21 [5] 23/-1/-1->22->21 [6] 23/-1/-1->22->21 [7] 23/-1/-1->22->21
gpu-st-p4d-24xlarge-47:30193:30702 [7] NCCL INFO Trees [0] -1/-1/-1->23->22 [1] -1/-1/-1->23->22 [2] 18/-1/-1->23->22 [3] 18/-1/-1->23->22 [4] -1/-1/-1->23->22 [5] -1/-1/-1->23->22 [6] 18/-1/-1->23->22 [7] 18/-1/-1->23->22
gpu-st-p4d-24xlarge-55:29155:29662 [3] NCCL INFO Trees [0] 68/-1/-1->67->66 [1] 68/-1/-1->67->66 [2] -1/-1/-1->67->66 [3] -1/-1/-1->67->66 [4] 68/78/-1->67->66 [5] 68/78/-1->67->66 [6] -1/-1/-1->67->66 [7] -1/-1/-1->67->66
gpu-st-p4d-24xlarge-55:29154:29667 [2] NCCL INFO Trees [0] 67/-1/-1->66->60 [1] 67/-1/-1->66->60 [2] 67/-1/-1->66->71 [3] 67/-1/-1->66->71 [4] 67/54/-1->66->43 [5] 67/54/-1->66->43 [6] 67/-1/-1->66->71 [7] 67/-1/-1->66->71
gpu-st-p4d-24xlarge-46:30086:30632 [1] NCCL INFO Trees [0] -1/-1/-1->17->16 [1] -1/-1/-1->17->16 [2] 12/8/-1->17->16 [3] 12/8/-1->17->16 [4] -1/-1/-1->17->16 [5] -1/-1/-1->17->16 [6] 12/-1/-1->17->16 [7] 12/-1/-1->17->16
gpu-st-p4d-24xlarge-46:30081:30633 [4] NCCL INFO Trees [0] 13/18/-1->12->25 [1] 13/18/-1->12->25 [2] 13/-1/-1->12->17 [3] 13/-1/-1->12->17 [4] 13/-1/-1->12->7 [5] 13/-1/-1->12->7 [6] 13/-1/-1->12->17 [7] 13/-1/-1->12->17
gpu-st-p4d-24xlarge-46:30084:30634 [7] NCCL INFO Trees [0] 16/-1/-1->15->14 [1] 16/-1/-1->15->14 [2] -1/-1/-1->15->14 [3] -1/-1/-1->15->14 [4] 16/-1/-1->15->14 [5] 16/-1/-1->15->14 [6] -1/-1/-1->15->14 [7] -1/-1/-1->15->14
gpu-st-p4d-24xlarge-52:39400:39922 [3] NCCL INFO Trees [0] 52/-1/-1->51->50 [1] 52/-1/-1->51->50 [2] -1/-1/-1->51->50 [3] -1/-1/-1->51->50 [4] 52/-1/-1->51->50 [5] 52/-1/-1->51->50 [6] -1/-1/-1->51->50 [7] -1/-1/-1->51->50
gpu-st-p4d-24xlarge-52:39399:39918 [2] NCCL INFO Trees [0] 51/-1/-1->50->49 [1] 51/-1/-1->50->49 [2] 51/-1/-1->50->49 [3] 51/-1/-1->50->49 [4] 51/-1/-1->50->49 [5] 51/-1/-1->50->49 [6] 51/-1/-1->50->49 [7] 51/-1/-1->50->49
gpu-st-p4d-24xlarge-52:39397:39920 [0] NCCL INFO Trees [0] 49/72/-1->48->0 [1] 49/72/-1->48->0 [2] 49/-1/-1->48->53 [3] 49/-1/-1->48->53 [4] 49/-1/-1->48->54 [5] 49/-1/-1->48->54 [6] 49/-1/-1->48->53 [7] 49/-1/-1->48->53
gpu-st-p4d-24xlarge-53:26991:27496 [3] NCCL INFO Trees [0] -1/-1/-1->59->58 [1] -1/-1/-1->59->58 [2] 54/-1/-1->59->58 [3] 54/-1/-1->59->58 [4] -1/-1/-1->59->58 [5] -1/-1/-1->59->58 [6] 54/-1/-1->59->58 [7] 54/-1/-1->59->58
gpu-st-p4d-24xlarge-53:26990:27493 [2] NCCL INFO Trees [0] 59/-1/-1->58->57 [1] 59/-1/-1->58->57 [2] 59/-1/-1->58->57 [3] 59/-1/-1->58->57 [4] 59/-1/-1->58->57 [5] 59/-1/-1->58->57 [6] 59/-1/-1->58->57 [7] 59/-1/-1->58->57
gpu-st-p4d-24xlarge-53:26989:27495 [1] NCCL INFO Trees [0] 58/-1/-1->57->56 [1] 58/-1/-1->57->56 [2] 58/-1/-1->57->56 [3] 58/-1/-1->57->56 [4] 58/-1/-1->57->56 [5] 58/-1/-1->57->56 [6] 58/64/-1->57->56 [7] 58/64/-1->57->56
gpu-st-p4d-24xlarge-48:30191:30698 [0] NCCL INFO Trees [0] 25/36/-1->24->49 [1] 25/36/-1->24->49 [2] 25/-1/-1->24->29 [3] 25/-1/-1->24->29 [4] 25/-1/-1->24->30 [5] 25/-1/-1->24->30 [6] 25/-1/-1->24->29 [7] 25/-1/-1->24->29
gpu-st-p4d-24xlarge-48:30196:30702 [5] NCCL INFO Trees [0] -1/-1/-1->29->28 [1] -1/-1/-1->29->28 [2] 24/16/-1->29->28 [3] 24/16/-1->29->28 [4] -1/-1/-1->29->28 [5] -1/-1/-1->29->28 [6] 24/-1/-1->29->28 [7] 24/-1/-1->29->28
gpu-st-p4d-24xlarge-48:30195:30703 [4] NCCL INFO Trees [0] 29/-1/-1->28->27 [1] 29/-1/-1->28->27 [2] 29/40/-1->28->53 [3] 29/40/-1->28->53 [4] 29/-1/-1->28->27 [5] 29/-1/-1->28->27 [6] 29/-1/-1->28->32 [7] 29/-1/-1->28->32
gpu-st-p4d-24xlarge-54:29516:30071 [6] NCCL INFO Trees [0] 63/-1/-1->62->61 [1] 63/-1/-1->62->61 [2] 63/-1/-1->62->61 [3] 63/-1/-1->62->61 [4] 63/-1/-1->62->61 [5] 63/-1/-1->62->61 [6] 63/-1/-1->62->61 [7] 63/-1/-1->62->61
gpu-st-p4d-24xlarge-46:30082:30636 [5] NCCL INFO Trees [0] 14/6/-1->13->12 [1] 14/6/-1->13->12 [2] 14/-1/-1->13->12 [3] 14/-1/-1->13->12 [4] 14/-1/-1->13->12 [5] 14/-1/-1->13->12 [6] 14/-1/-1->13->12 [7] 14/-1/-1->13->12
gpu-st-p4d-24xlarge-52:39398:39917 [1] NCCL INFO Trees [0] 50/24/-1->49->48 [1] 50/24/-1->49->48 [2] 50/-1/-1->49->48 [3] 50/-1/-1->49->48 [4] 50/-1/-1->49->48 [5] 50/-1/-1->49->48 [6] 50/-1/-1->49->48 [7] 50/-1/-1->49->48
gpu-st-p4d-24xlarge-50:29216:29724 [1] NCCL INFO Trees [0] -1/-1/-1->41->40 [1] -1/-1/-1->41->40 [2] 36/32/-1->41->40 [3] 36/32/-1->41->40 [4] -1/-1/-1->41->40 [5] -1/-1/-1->41->40 [6] 36/-1/-1->41->40 [7] 36/-1/-1->41->40
gpu-st-p4d-24xlarge-53:26987:27494 [7] NCCL INFO Trees [0] 56/-1/-1->55->54 [1] 56/-1/-1->55->54 [2] -1/-1/-1->55->54 [3] -1/-1/-1->55->54 [4] 56/60/-1->55->54 [5] 56/60/-1->55->54 [6] -1/-1/-1->55->54 [7] -1/-1/-1->55->54
gpu-st-p4d-24xlarge-47:30190:30703 [4] NCCL INFO Trees [0] 21/-1/-1->20->19 [1] 21/-1/-1->20->19 [2] 21/-1/-1->20->16 [3] 21/-1/-1->20->16 [4] 21/-1/-1->20->19 [5] 21/-1/-1->20->19 [6] 21/8/-1->20->44 [7] 21/8/-1->20->44
gpu-st-p4d-24xlarge-47:30191:30699 [5] NCCL INFO Trees [0] 22/-1/-1->21->20 [1] 22/-1/-1->21->20 [2] 22/-1/-1->21->20 [3] 22/-1/-1->21->20 [4] 22/-1/-1->21->20 [5] 22/-1/-1->21->20 [6] 22/32/-1->21->20 [7] 22/32/-1->21->20
gpu-st-p4d-24xlarge-54:29515:30066 [5] NCCL INFO Trees [0] 62/54/-1->61->60 [1] 62/54/-1->61->60 [2] 62/-1/-1->61->60 [3] 62/-1/-1->61->60 [4] 62/-1/-1->61->60 [5] 62/-1/-1->61->60 [6] 62/-1/-1->61->60 [7] 62/-1/-1->61->60
gpu-st-p4d-24xlarge-54:29514:30069 [4] NCCL INFO Trees [0] 61/66/-1->60->73 [1] 61/66/-1->60->73 [2] 61/-1/-1->60->65 [3] 61/-1/-1->60->65 [4] 61/-1/-1->60->55 [5] 61/-1/-1->60->55 [6] 61/-1/-1->60->65 [7] 61/-1/-1->60->65
gpu-st-p4d-24xlarge-49:25781:26281 [3] NCCL INFO Trees [0] -1/-1/-1->35->34 [1] -1/-1/-1->35->34 [2] 30/-1/-1->35->34 [3] 30/-1/-1->35->34 [4] -1/-1/-1->35->34 [5] -1/-1/-1->35->34 [6] 30/-1/-1->35->34 [7] 30/-1/-1->35->34
gpu-st-p4d-24xlarge-49:25778:26280 [0] NCCL INFO Trees [0] 33/-1/-1->32->31 [1] 33/-1/-1->32->31 [2] 33/-1/-1->32->41 [3] 33/-1/-1->32->41 [4] 33/-1/-1->32->31 [5] 33/-1/-1->32->31 [6] 33/28/-1->32->21 [7] 33/28/-1->32->21
gpu-st-p4d-24xlarge-49:25780:26286 [2] NCCL INFO Trees [0] 35/-1/-1->34->33 [1] 35/-1/-1->34->33 [2] 35/-1/-1->34->33 [3] 35/-1/-1->34->33 [4] 35/-1/-1->34->33 [5] 35/-1/-1->34->33 [6] 35/-1/-1->34->33 [7] 35/-1/-1->34->33
gpu-st-p4d-24xlarge-51:30345:30854 [5] NCCL INFO Trees [0] 46/-1/-1->45->44 [1] 46/-1/-1->45->44 [2] 46/-1/-1->45->44 [3] 46/-1/-1->45->44 [4] 46/-1/-1->45->44 [5] 46/-1/-1->45->44 [6] 46/68/-1->45->44 [7] 46/68/-1->45->44
gpu-st-p4d-24xlarge-50:29214:29723 [7] NCCL INFO Trees [0] 40/-1/-1->39->38 [1] 40/-1/-1->39->38 [2] -1/-1/-1->39->38 [3] -1/-1/-1->39->38 [4] 40/-1/-1->39->38 [5] 40/-1/-1->39->38 [6] -1/-1/-1->39->38 [7] -1/-1/-1->39->38
gpu-st-p4d-24xlarge-53:26986:27498 [6] NCCL INFO Trees [0] 55/-1/-1->54->61 [1] 55/-1/-1->54->61 [2] 55/-1/-1->54->59 [3] 55/-1/-1->54->59 [4] 55/48/-1->54->66 [5] 55/48/-1->54->66 [6] 55/-1/-1->54->59 [7] 55/-1/-1->54->59
gpu-st-p4d-24xlarge-48:30194:30699 [3] NCCL INFO Trees [0] 28/-1/-1->27->26 [1] 28/-1/-1->27->26 [2] -1/-1/-1->27->26 [3] -1/-1/-1->27->26 [4] 28/-1/-1->27->26 [5] 28/-1/-1->27->26 [6] -1/-1/-1->27->26 [7] -1/-1/-1->27->26
gpu-st-p4d-24xlarge-48:30192:30700 [1] NCCL INFO Trees [0] 26/12/-1->25->24 [1] 26/12/-1->25->24 [2] 26/-1/-1->25->24 [3] 26/-1/-1->25->24 [4] 26/-1/-1->25->24 [5] 26/-1/-1->25->24 [6] 26/-1/-1->25->24 [7] 26/-1/-1->25->24
gpu-st-p4d-24xlarge-48:30193:30701 [2] NCCL INFO Trees [0] 27/-1/-1->26->25 [1] 27/-1/-1->26->25 [2] 27/-1/-1->26->25 [3] 27/-1/-1->26->25 [4] 27/-1/-1->26->25 [5] 27/-1/-1->26->25 [6] 27/-1/-1->26->25 [7] 27/-1/-1->26->25
gpu-st-p4d-24xlarge-49:25779:26285 [1] NCCL INFO Trees [0] 34/-1/-1->33->32 [1] 34/-1/-1->33->32 [2] 34/-1/-1->33->32 [3] 34/-1/-1->33->32 [4] 34/-1/-1->33->32 [5] 34/-1/-1->33->32 [6] 34/40/-1->33->32 [7] 34/40/-1->33->32
gpu-st-p4d-24xlarge-49:25776:26284 [6] NCCL INFO Trees [0] 31/-1/-1->30->37 [1] 31/-1/-1->30->37 [2] 31/-1/-1->30->35 [3] 31/-1/-1->30->35 [4] 31/24/-1->30->19 [5] 31/24/-1->30->19 [6] 31/-1/-1->30->35 [7] 31/-1/-1->30->35
gpu-st-p4d-24xlarge-51:30344:30851 [4] NCCL INFO Trees [0] 45/-1/-1->44->43 [1] 45/-1/-1->44->43 [2] 45/-1/-1->44->40 [3] 45/-1/-1->44->40 [4] 45/-1/-1->44->43 [5] 45/-1/-1->44->43 [6] 45/20/-1->44->92 [7] 45/20/-1->44->92
gpu-st-p4d-24xlarge-50:29213:29720 [6] NCCL INFO Trees [0] 39/-1/-1->38->37 [1] 39/-1/-1->38->37 [2] 39/-1/-1->38->37 [3] 39/-1/-1->38->37 [4] 39/-1/-1->38->37 [5] 39/-1/-1->38->37 [6] 39/-1/-1->38->37 [7] 39/-1/-1->38->37
gpu-st-p4d-24xlarge-49:25777:26283 [7] NCCL INFO Trees [0] 32/-1/-1->31->30 [1] 32/-1/-1->31->30 [2] -1/-1/-1->31->30 [3] -1/-1/-1->31->30 [4] 32/36/-1->31->30 [5] 32/36/-1->31->30 [6] -1/-1/-1->31->30 [7] -1/-1/-1->31->30
gpu-st-p4d-24xlarge-51:30343:30849 [3] NCCL INFO Trees [0] 44/-1/-1->43->42 [1] 44/-1/-1->43->42 [2] -1/-1/-1->43->42 [3] -1/-1/-1->43->42 [4] 44/66/-1->43->42 [5] 44/66/-1->43->42 [6] -1/-1/-1->43->42 [7] -1/-1/-1->43->42
gpu-st-p4d-24xlarge-51:30342:30850 [2] NCCL INFO Trees [0] 43/-1/-1->42->36 [1] 43/-1/-1->42->36 [2] 43/-1/-1->42->47 [3] 43/-1/-1->42->47 [4] 43/18/-1->42->90 [5] 43/18/-1->42->90 [6] 43/-1/-1->42->47 [7] 43/-1/-1->42->47
gpu-st-p4d-24xlarge-50:29212:29721 [5] NCCL INFO Trees [0] 38/30/-1->37->36 [1] 38/30/-1->37->36 [2] 38/-1/-1->37->36 [3] 38/-1/-1->37->36 [4] 38/-1/-1->37->36 [5] 38/-1/-1->37->36 [6] 38/-1/-1->37->36 [7] 38/-1/-1->37->36
gpu-st-p4d-24xlarge-50:29211:29719 [4] NCCL INFO Trees [0] 37/42/-1->36->24 [1] 37/42/-1->36->24 [2] 37/-1/-1->36->41 [3] 37/-1/-1->36->41 [4] 37/-1/-1->36->31 [5] 37/-1/-1->36->31 [6] 37/-1/-1->36->41 [7] 37/-1/-1->36->41
gpu-st-p4d-24xlarge-56:28171:28973 [0] NCCL INFO Channel 02 : 72[101c0] -> 75[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-44:28065:28903 [0] NCCL INFO Channel 02 : 0[101c0] -> 3[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-48:30191:30698 [0] NCCL INFO Channel 02 : 24[101c0] -> 27[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-52:39397:39920 [0] NCCL INFO Channel 02 : 48[101c0] -> 51[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-46:30085:30688 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-54:29518:30122 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-46:30085:30637 [0] NCCL INFO Channel 02/0 : 9[101d0] -> 16[101c0] [receive] via NET/Socket/0
gpu-st-p4d-24xlarge-54:29518:30070 [0] NCCL INFO Channel 02/0 : 57[101d0] -> 64[101c0] [receive] via NET/Socket/0
gpu-st-p4d-24xlarge-46:30085:30688 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-54:29518:30122 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-54:29518:30070 [0] NCCL INFO Channel 03/0 : 57[101d0] -> 64[101c0] [receive] via NET/Socket/1
gpu-st-p4d-24xlarge-46:30085:30637 [0] NCCL INFO Channel 03/0 : 9[101d0] -> 16[101c0] [receive] via NET/Socket/1
gpu-st-p4d-24xlarge-58:27723:28289 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-46:30085:30688 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-54:29518:30122 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-58:27723:28236 [0] NCCL INFO Channel 02/0 : 81[101d0] -> 88[101c0] [receive] via NET/Socket/0
gpu-st-p4d-24xlarge-58:27719:28237 [4] NCCL INFO Channel 02 : 84[901c0] -> 87[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-46:30085:30637 [0] NCCL INFO Channel 06/0 : 9[101d0] -> 16[101c0] [receive] via NET/Socket/0
gpu-st-p4d-24xlarge-54:29518:30070 [0] NCCL INFO Channel 06/0 : 57[101d0] -> 64[101c0] [receive] via NET/Socket/0
gpu-st-p4d-24xlarge-58:27723:28289 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-46:30085:30688 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-58:27723:28236 [0] NCCL INFO Channel 03/0 : 81[101d0] -> 88[101c0] [receive] via NET/Socket/1
gpu-st-p4d-24xlarge-54:29518:30122 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-46:30085:30637 [0] NCCL INFO Channel 07/0 : 9[101d0] -> 16[101c0] [receive] via NET/Socket/1
gpu-st-p4d-24xlarge-54:29518:30070 [0] NCCL INFO Channel 07/0 : 57[101d0] -> 64[101c0] [receive] via NET/Socket/1
gpu-st-p4d-24xlarge-58:27721:28234 [6] NCCL INFO Channel 02 : 86[a01c0] -> 89[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-58:27723:28289 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-58:27723:28236 [0] NCCL INFO Channel 06/0 : 81[101d0] -> 88[101c0] [receive] via NET/Socket/0
gpu-st-p4d-24xlarge-56:28171:28973 [0] NCCL INFO Channel 03 : 72[101c0] -> 75[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-58:27723:28289 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-58:27723:28236 [0] NCCL INFO Channel 07/0 : 81[101d0] -> 88[101c0] [receive] via NET/Socket/1
gpu-st-p4d-24xlarge-44:28065:28903 [0] NCCL INFO Channel 03 : 0[101c0] -> 3[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-53:26988:27563 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-53:26988:27497 [0] NCCL INFO Channel 02/0 : 53[901d0] -> 56[101c0] [receive] via NET/Socket/0
gpu-st-p4d-24xlarge-48:30191:30698 [0] NCCL INFO Channel 03 : 24[101c0] -> 27[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-53:26988:27563 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-57:27793:28455 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-46:30083:30635 [6] NCCL INFO Channel 02 : 14[a01c0] -> 17[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-53:26988:27497 [0] NCCL INFO Channel 03/0 : 53[901d0] -> 56[101c0] [receive] via NET/Socket/1
gpu-st-p4d-24xlarge-57:27793:28309 [0] NCCL INFO Channel 02/0 : 77[901d0] -> 80[101c0] [receive] via NET/Socket/0
gpu-st-p4d-24xlarge-53:26988:27563 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-57:27793:28455 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-57:27793:28309 [0] NCCL INFO Channel 03/0 : 77[901d0] -> 80[101c0] [receive] via NET/Socket/1
gpu-st-p4d-24xlarge-53:26988:27497 [0] NCCL INFO Channel 06/0 : 53[901d0] -> 56[101c0] [receive] via NET/Socket/0
gpu-st-p4d-24xlarge-57:27793:28455 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-53:26988:27563 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-57:27793:28309 [0] NCCL INFO Channel 06/0 : 77[901d0] -> 80[101c0] [receive] via NET/Socket/0
gpu-st-p4d-24xlarge-53:26988:27497 [0] NCCL INFO Channel 07/0 : 53[901d0] -> 56[101c0] [receive] via NET/Socket/1
gpu-st-p4d-24xlarge-52:39397:39920 [0] NCCL INFO Channel 03 : 48[101c0] -> 51[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-57:27793:28455 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-50:29215:29783 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-57:27793:28309 [0] NCCL INFO Channel 07/0 : 77[901d0] -> 80[101c0] [receive] via NET/Socket/1
gpu-st-p4d-24xlarge-45:30137:30701 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-44:28067:28911 [2] NCCL INFO Channel 02 : 2[201c0] -> 5[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-50:29215:29722 [0] NCCL INFO Channel 02/0 : 33[101d0] -> 40[101c0] [receive] via NET/Socket/0
gpu-st-p4d-24xlarge-45:30137:30647 [0] NCCL INFO Channel 02/0 : 5[901d0] -> 8[101c0] [receive] via NET/Socket/0
gpu-st-p4d-24xlarge-50:29215:29783 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-45:30137:30701 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-50:29215:29722 [0] NCCL INFO Channel 03/0 : 33[101d0] -> 40[101c0] [receive] via NET/Socket/1
gpu-st-p4d-24xlarge-45:30137:30647 [0] NCCL INFO Channel 03/0 : 5[901d0] -> 8[101c0] [receive] via NET/Socket/1
gpu-st-p4d-24xlarge-50:29215:29783 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-45:30137:30701 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-50:29215:29722 [0] NCCL INFO Channel 06/0 : 33[101d0] -> 40[101c0] [receive] via NET/Socket/0
gpu-st-p4d-24xlarge-45:30137:30647 [0] NCCL INFO Channel 06/0 : 5[901d0] -> 8[101c0] [receive] via NET/Socket/0
gpu-st-p4d-24xlarge-50:29215:29783 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-45:30137:30701 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-50:29215:29722 [0] NCCL INFO Channel 07/0 : 33[101d0] -> 40[101c0] [receive] via NET/Socket/1
gpu-st-p4d-24xlarge-45:30137:30647 [0] NCCL INFO Channel 07/0 : 5[901d0] -> 8[101c0] [receive] via NET/Socket/1
gpu-st-p4d-24xlarge-58:27719:28237 [4] NCCL INFO Channel 03 : 84[901c0] -> 87[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-48:30191:30698 [0] NCCL INFO Channel 06 : 24[101c0] -> 27[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-46:30081:30633 [4] NCCL INFO Channel 02 : 12[901c0] -> 15[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-58:27721:28234 [6] NCCL INFO Channel 03 : 86[a01c0] -> 89[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25778:26346 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-56:28171:28973 [0] NCCL INFO Channel 06 : 72[101c0] -> 75[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25778:26280 [0] NCCL INFO Channel 02/0 : 29[901d0] -> 32[101c0] [receive] via NET/Socket/0
gpu-st-p4d-24xlarge-49:25778:26346 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-49:25778:26280 [0] NCCL INFO Channel 03/0 : 29[901d0] -> 32[101c0] [receive] via NET/Socket/1
gpu-st-p4d-24xlarge-49:25778:26346 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-46:30083:30635 [6] NCCL INFO Channel 03 : 14[a01c0] -> 17[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-44:28065:28903 [0] NCCL INFO Channel 06 : 0[101c0] -> 3[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25778:26280 [0] NCCL INFO Channel 06/0 : 29[901d0] -> 32[101c0] [receive] via NET/Socket/0
gpu-st-p4d-24xlarge-52:39397:39920 [0] NCCL INFO Channel 06 : 48[101c0] -> 51[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-48:30193:30701 [2] NCCL INFO Channel 02 : 26[201c0] -> 29[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25778:26346 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-52:39399:39918 [2] NCCL INFO Channel 02 : 50[201c0] -> 53[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25778:26280 [0] NCCL INFO Channel 07/0 : 29[901d0] -> 32[101c0] [receive] via NET/Socket/1
gpu-st-p4d-24xlarge-50:29213:29720 [6] NCCL INFO Channel 02 : 38[a01c0] -> 41[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-54:29516:30071 [6] NCCL INFO Channel 02 : 62[a01c0] -> 65[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-56:28173:28967 [2] NCCL INFO Channel 02 : 74[201c0] -> 77[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-58:27719:28237 [4] NCCL INFO Channel 06 : 84[901c0] -> 87[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-44:28067:28911 [2] NCCL INFO Channel 03 : 2[201c0] -> 5[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-54:29514:30069 [4] NCCL INFO Channel 02 : 60[901c0] -> 63[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-45:30136:30644 [7] NCCL INFO Channel 00/0 : 7[a01d0] -> 12[901c0] [send] via NET/Socket/2
gpu-st-p4d-24xlarge-58:27721:28234 [6] NCCL INFO Channel 06 : 86[a01c0] -> 89[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-45:30136:30644 [7] NCCL INFO Channel 01/0 : 7[a01d0] -> 12[901c0] [send] via NET/Socket/3
gpu-st-p4d-24xlarge-45:30136:30644 [7] NCCL INFO Channel 04/0 : 7[a01d0] -> 12[901c0] [send] via NET/Socket/2
gpu-st-p4d-24xlarge-45:30136:30644 [7] NCCL INFO Channel 05/0 : 7[a01d0] -> 12[901c0] [send] via NET/Socket/3
gpu-st-p4d-24xlarge-56:28175:29019 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-56:28175:28972 [4] NCCL INFO Channel 02/0 : 69[901d0] -> 76[901c0] [receive] via NET/Socket/2
gpu-st-p4d-24xlarge-46:30081:30633 [4] NCCL INFO Channel 03 : 12[901c0] -> 15[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-56:28171:28973 [0] NCCL INFO Channel 07 : 72[101c0] -> 75[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-45:30135:30697 [6] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-56:28175:29019 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-56:28175:28972 [4] NCCL INFO Channel 03/0 : 69[901d0] -> 76[901c0] [receive] via NET/Socket/3
gpu-st-p4d-24xlarge-45:30135:30642 [6] NCCL INFO Channel 00/0 : 1[101d0] -> 6[a01c0] [receive] via NET/Socket/2
gpu-st-p4d-24xlarge-56:28175:29019 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-56:28175:28972 [4] NCCL INFO Channel 06/0 : 69[901d0] -> 76[901c0] [receive] via NET/Socket/2
gpu-st-p4d-24xlarge-46:30083:30635 [6] NCCL INFO Channel 06 : 14[a01c0] -> 17[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-45:30135:30697 [6] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-54:29516:30071 [6] NCCL INFO Channel 03 : 62[a01c0] -> 65[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-56:28175:29019 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-45:30135:30642 [6] NCCL INFO Channel 01/0 : 1[101d0] -> 6[a01c0] [receive] via NET/Socket/3
gpu-st-p4d-24xlarge-52:39401:39973 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-56:28175:28972 [4] NCCL INFO Channel 07/0 : 69[901d0] -> 76[901c0] [receive] via NET/Socket/3
gpu-st-p4d-24xlarge-58:27720:28232 [5] NCCL INFO Channel 00/0 : 85[901d0] -> 90[201c0] [send] via NET/Socket/2
gpu-st-p4d-24xlarge-45:30135:30697 [6] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-52:39401:39921 [4] NCCL INFO Channel 02/0 : 45[901d0] -> 52[901c0] [receive] via NET/Socket/2
gpu-st-p4d-24xlarge-44:28065:28903 [0] NCCL INFO Channel 07 : 0[101c0] -> 3[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-58:27720:28232 [5] NCCL INFO Channel 01/0 : 85[901d0] -> 90[201c0] [send] via NET/Socket/3
gpu-st-p4d-24xlarge-45:30135:30642 [6] NCCL INFO Channel 04/0 : 1[101d0] -> 6[a01c0] [receive] via NET/Socket/2
gpu-st-p4d-24xlarge-52:39401:39973 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-58:27720:28232 [5] NCCL INFO Channel 04/0 : 85[901d0] -> 90[201c0] [send] via NET/Socket/2
gpu-st-p4d-24xlarge-45:30135:30697 [6] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-52:39397:39920 [0] NCCL INFO Channel 07 : 48[101c0] -> 51[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-52:39401:39921 [4] NCCL INFO Channel 03/0 : 45[901d0] -> 52[901c0] [receive] via NET/Socket/3
gpu-st-p4d-24xlarge-59:27918:28693 [2] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-45:30135:30642 [6] NCCL INFO Channel 05/0 : 1[101d0] -> 6[a01c0] [receive] via NET/Socket/3
gpu-st-p4d-24xlarge-44:28066:28906 [1] NCCL INFO Channel 00/0 : 1[101d0] -> 6[a01c0] [send] via NET/Socket/0
gpu-st-p4d-24xlarge-54:29515:30066 [5] NCCL INFO Channel 00/0 : 61[901d0] -> 66[201c0] [send] via NET/Socket/2
gpu-st-p4d-24xlarge-52:39399:39918 [2] NCCL INFO Channel 03 : 50[201c0] -> 53[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-59:27918:28517 [2] NCCL INFO Channel 00/0 : 85[901d0] -> 90[201c0] [receive] via NET/Socket/0
gpu-st-p4d-24xlarge-58:27720:28232 [5] NCCL INFO Channel 05/0 : 85[901d0] -> 90[201c0] [send] via NET/Socket/3
gpu-st-p4d-24xlarge-45:30135:30642 [6] NCCL INFO Channel 00 : 6[a01c0] -> 11[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-48:30191:30698 [0] NCCL INFO Channel 07 : 24[101c0] -> 27[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-44:28066:28906 [1] NCCL INFO Channel 01/0 : 1[101d0] -> 6[a01c0] [send] via NET/Socket/1
gpu-st-p4d-24xlarge-54:29515:30066 [5] NCCL INFO Channel 01/0 : 61[901d0] -> 66[201c0] [send] via NET/Socket/3
gpu-st-p4d-24xlarge-52:39401:39973 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-59:27918:28693 [2] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-52:39401:39921 [4] NCCL INFO Channel 06/0 : 45[901d0] -> 52[901c0] [receive] via NET/Socket/2
gpu-st-p4d-24xlarge-58:27719:28237 [4] NCCL INFO Channel 07 : 84[901c0] -> 87[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-59:27918:28517 [2] NCCL INFO Channel 01/0 : 85[901d0] -> 90[201c0] [receive] via NET/Socket/1
gpu-st-p4d-24xlarge-44:28066:28906 [1] NCCL INFO Channel 04/0 : 1[101d0] -> 6[a01c0] [send] via NET/Socket/0
gpu-st-p4d-24xlarge-54:29515:30066 [5] NCCL INFO Channel 04/0 : 61[901d0] -> 66[201c0] [send] via NET/Socket/2
gpu-st-p4d-24xlarge-52:39401:39973 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-44:28066:28906 [1] NCCL INFO Channel 05/0 : 1[101d0] -> 6[a01c0] [send] via NET/Socket/1
gpu-st-p4d-24xlarge-52:39401:39921 [4] NCCL INFO Channel 07/0 : 45[901d0] -> 52[901c0] [receive] via NET/Socket/3
gpu-st-p4d-24xlarge-59:27918:28693 [2] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-54:29515:30066 [5] NCCL INFO Channel 05/0 : 61[901d0] -> 66[201c0] [send] via NET/Socket/3
gpu-st-p4d-24xlarge-59:27918:28517 [2] NCCL INFO Channel 04/0 : 85[901d0] -> 90[201c0] [receive] via NET/Socket/0
gpu-st-p4d-24xlarge-56:28173:28967 [2] NCCL INFO Channel 03 : 74[201c0] -> 77[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-50:29213:29720 [6] NCCL INFO Channel 03 : 38[a01c0] -> 41[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-59:27918:28693 [2] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-58:27721:28234 [6] NCCL INFO Channel 07 : 86[a01c0] -> 89[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-59:27918:28517 [2] NCCL INFO Channel 05/0 : 85[901d0] -> 90[201c0] [receive] via NET/Socket/1
gpu-st-p4d-24xlarge-59:27918:28517 [2] NCCL INFO Channel 00 : 90[201c0] -> 95[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-44:28067:28911 [2] NCCL INFO Channel 06 : 2[201c0] -> 5[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-54:29514:30069 [4] NCCL INFO Channel 03 : 60[901c0] -> 63[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-48:30193:30701 [2] NCCL INFO Channel 03 : 26[201c0] -> 29[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-57:27794:28306 [1] NCCL INFO Channel 02/0 : 81[101d0] -> 88[101c0] [send] via NET/Socket/0
gpu-st-p4d-24xlarge-46:30081:30633 [4] NCCL INFO Channel 06 : 12[901c0] -> 15[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-57:27794:28306 [1] NCCL INFO Channel 03/0 : 81[101d0] -> 88[101c0] [send] via NET/Socket/1
gpu-st-p4d-24xlarge-57:27794:28306 [1] NCCL INFO Channel 06/0 : 81[101d0] -> 88[101c0] [send] via NET/Socket/0
gpu-st-p4d-24xlarge-57:27794:28306 [1] NCCL INFO Channel 07/0 : 81[101d0] -> 88[101c0] [send] via NET/Socket/1
gpu-st-p4d-24xlarge-45:30135:30642 [6] NCCL INFO Channel 01 : 6[a01c0] -> 11[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-45:30138:30645 [1] NCCL INFO Channel 02/0 : 9[101d0] -> 16[101c0] [send] via NET/Socket/0
gpu-st-p4d-24xlarge-54:29516:30071 [6] NCCL INFO Channel 06 : 62[a01c0] -> 65[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-46:30083:30635 [6] NCCL INFO Channel 07 : 14[a01c0] -> 17[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-45:30138:30645 [1] NCCL INFO Channel 03/0 : 9[101d0] -> 16[101c0] [send] via NET/Socket/1
gpu-st-p4d-24xlarge-59:27920:28621 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-45:30138:30645 [1] NCCL INFO Channel 06/0 : 9[101d0] -> 16[101c0] [send] via NET/Socket/0
gpu-st-p4d-24xlarge-59:27920:28519 [4] NCCL INFO Channel 02/0 : 89[101d0] -> 92[901c0] [receive] via NET/Socket/2
gpu-st-p4d-24xlarge-45:30138:30645 [1] NCCL INFO Channel 07/0 : 9[101d0] -> 16[101c0] [send] via NET/Socket/1
gpu-st-p4d-24xlarge-59:27920:28621 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-59:27920:28519 [4] NCCL INFO Channel 03/0 : 89[101d0] -> 92[901c0] [receive] via NET/Socket/3
gpu-st-p4d-24xlarge-59:27920:28621 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-52:39399:39918 [2] NCCL INFO Channel 06 : 50[201c0] -> 53[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-59:27920:28519 [4] NCCL INFO Channel 06/0 : 89[101d0] -> 92[901c0] [receive] via NET/Socket/2
gpu-st-p4d-24xlarge-59:27918:28517 [2] NCCL INFO Channel 01 : 90[201c0] -> 95[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-59:27920:28621 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-56:28173:28967 [2] NCCL INFO Channel 06 : 74[201c0] -> 77[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-59:27920:28519 [4] NCCL INFO Channel 07/0 : 89[101d0] -> 92[901c0] [receive] via NET/Socket/3
gpu-st-p4d-24xlarge-59:27919:28520 [3] NCCL INFO Channel 00/0 : 91[201d0] -> 0[101c0] [send] via NET/Socket/0
gpu-st-p4d-24xlarge-59:27919:28520 [3] NCCL INFO Channel 01/0 : 91[201d0] -> 0[101c0] [send] via NET/Socket/1
gpu-st-p4d-24xlarge-50:29211:29719 [4] NCCL INFO Channel 02 : 36[901c0] -> 39[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-44:28067:28911 [2] NCCL INFO Channel 07 : 2[201c0] -> 5[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-59:27919:28520 [3] NCCL INFO Channel 04/0 : 91[201d0] -> 0[101c0] [send] via NET/Socket/0
gpu-st-p4d-24xlarge-51:30343:30849 [3] NCCL INFO Channel 00/0 : 43[201d0] -> 48[101c0] [send] via NET/Socket/0
gpu-st-p4d-24xlarge-59:27919:28520 [3] NCCL INFO Channel 05/0 : 91[201d0] -> 0[101c0] [send] via NET/Socket/1
gpu-st-p4d-24xlarge-51:30343:30849 [3] NCCL INFO Channel 01/0 : 43[201d0] -> 48[101c0] [send] via NET/Socket/1
gpu-st-p4d-24xlarge-54:29514:30069 [4] NCCL INFO Channel 06 : 60[901c0] -> 63[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-51:30343:30849 [3] NCCL INFO Channel 04/0 : 43[201d0] -> 48[101c0] [send] via NET/Socket/0
gpu-st-p4d-24xlarge-51:30343:30849 [3] NCCL INFO Channel 05/0 : 43[201d0] -> 48[101c0] [send] via NET/Socket/1
gpu-st-p4d-24xlarge-48:30193:30701 [2] NCCL INFO Channel 06 : 26[201c0] -> 29[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-45:30135:30642 [6] NCCL INFO Channel 02 : 6[a01c0] -> 11[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-56:28172:28968 [1] NCCL INFO Channel 00/0 : 73[101d0] -> 78[a01c0] [send] via NET/Socket/0
gpu-st-p4d-24xlarge-44:28069:28963 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-46:30082:30636 [5] NCCL INFO Channel 00/0 : 13[901d0] -> 18[201c0] [send] via NET/Socket/2
gpu-st-p4d-24xlarge-44:28069:28907 [4] NCCL INFO Channel 02/0 : 93[901d0] -> 4[901c0] [receive] via NET/Socket/2
gpu-st-p4d-24xlarge-56:28172:28968 [1] NCCL INFO Channel 01/0 : 73[101d0] -> 78[a01c0] [send] via NET/Socket/1
gpu-st-p4d-24xlarge-46:30081:30633 [4] NCCL INFO Channel 07 : 12[901c0] -> 15[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-46:30082:30636 [5] NCCL INFO Channel 01/0 : 13[901d0] -> 18[201c0] [send] via NET/Socket/3
gpu-st-p4d-24xlarge-56:28172:28968 [1] NCCL INFO Channel 04/0 : 73[101d0] -> 78[a01c0] [send] via NET/Socket/0
gpu-st-p4d-24xlarge-44:28069:28963 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-44:28069:28907 [4] NCCL INFO Channel 03/0 : 93[901d0] -> 4[901c0] [receive] via NET/Socket/3
gpu-st-p4d-24xlarge-54:29516:30071 [6] NCCL INFO Channel 07 : 62[a01c0] -> 65[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-46:30082:30636 [5] NCCL INFO Channel 04/0 : 13[901d0] -> 18[201c0] [send] via NET/Socket/2
gpu-st-p4d-24xlarge-56:28172:28968 [1] NCCL INFO Channel 05/0 : 73[101d0] -> 78[a01c0] [send] via NET/Socket/1
gpu-st-p4d-24xlarge-46:30082:30636 [5] NCCL INFO Channel 05/0 : 13[901d0] -> 18[201c0] [send] via NET/Socket/3
gpu-st-p4d-24xlarge-44:28069:28963 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-44:28069:28907 [4] NCCL INFO Channel 06/0 : 93[901d0] -> 4[901c0] [receive] via NET/Socket/2
gpu-st-p4d-24xlarge-52:39399:39918 [2] NCCL INFO Channel 07 : 50[201c0] -> 53[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-59:27921:28521 [5] NCCL INFO Channel 02/0 : 93[901d0] -> 4[901c0] [send] via NET/Socket/2
gpu-st-p4d-24xlarge-59:27921:28521 [5] NCCL INFO Channel 03/0 : 93[901d0] -> 4[901c0] [send] via NET/Socket/3
gpu-st-p4d-24xlarge-44:28069:28963 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-59:27918:28517 [2] NCCL INFO Channel 02 : 90[201c0] -> 95[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-56:28173:28967 [2] NCCL INFO Channel 07 : 74[201c0] -> 77[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-44:28069:28907 [4] NCCL INFO Channel 07/0 : 93[901d0] -> 4[901c0] [receive] via NET/Socket/3
gpu-st-p4d-24xlarge-54:29514:30069 [4] NCCL INFO Channel 07 : 60[901c0] -> 63[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-50:29213:29720 [6] NCCL INFO Channel 06 : 38[a01c0] -> 41[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-59:27921:28521 [5] NCCL INFO Channel 06/0 : 93[901d0] -> 4[901c0] [send] via NET/Socket/2
gpu-st-p4d-24xlarge-59:27921:28521 [5] NCCL INFO Channel 07/0 : 93[901d0] -> 4[901c0] [send] via NET/Socket/3
gpu-st-p4d-24xlarge-52:39398:39917 [1] NCCL INFO Channel 00/0 : 49[101d0] -> 54[a01c0] [send] via NET/Socket/0
gpu-st-p4d-24xlarge-48:30193:30701 [2] NCCL INFO Channel 07 : 26[201c0] -> 29[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-52:39398:39917 [1] NCCL INFO Channel 01/0 : 49[101d0] -> 54[a01c0] [send] via NET/Socket/1
gpu-st-p4d-24xlarge-45:30135:30642 [6] NCCL INFO Channel 03 : 6[a01c0] -> 11[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-52:39398:39917 [1] NCCL INFO Channel 04/0 : 49[101d0] -> 54[a01c0] [send] via NET/Socket/0
gpu-st-p4d-24xlarge-52:39398:39917 [1] NCCL INFO Channel 05/0 : 49[101d0] -> 54[a01c0] [send] via NET/Socket/1
gpu-st-p4d-24xlarge-53:26987:27494 [7] NCCL INFO Channel 00/0 : 55[a01d0] -> 60[901c0] [send] via NET/Socket/2
gpu-st-p4d-24xlarge-51:30344:30902 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-53:26987:27494 [7] NCCL INFO Channel 01/0 : 55[a01d0] -> 60[901c0] [send] via NET/Socket/3
gpu-st-p4d-24xlarge-51:30344:30851 [4] NCCL INFO Channel 02/0 : 41[101d0] -> 44[901c0] [receive] via NET/Socket/2
gpu-st-p4d-24xlarge-51:30342:30903 [2] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-53:26987:27494 [7] NCCL INFO Channel 04/0 : 55[a01d0] -> 60[901c0] [send] via NET/Socket/2
gpu-st-p4d-24xlarge-48:30192:30700 [1] NCCL INFO Channel 00/0 : 25[101d0] -> 30[a01c0] [send] via NET/Socket/0
gpu-st-p4d-24xlarge-51:30342:30850 [2] NCCL INFO Channel 00/0 : 37[901d0] -> 42[201c0] [receive] via NET/Socket/0
gpu-st-p4d-24xlarge-53:26987:27494 [7] NCCL INFO Channel 05/0 : 55[a01d0] -> 60[901c0] [send] via NET/Socket/3
gpu-st-p4d-24xlarge-48:30195:30738 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-48:30195:30703 [4] NCCL INFO Channel 02/0 : 21[901d0] -> 28[901c0] [receive] via NET/Socket/2
gpu-st-p4d-24xlarge-51:30344:30902 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-51:30344:30851 [4] NCCL INFO Channel 03/0 : 41[101d0] -> 44[901c0] [receive] via NET/Socket/3
gpu-st-p4d-24xlarge-48:30192:30700 [1] NCCL INFO Channel 01/0 : 25[101d0] -> 30[a01c0] [send] via NET/Socket/1
gpu-st-p4d-24xlarge-48:30192:30700 [1] NCCL INFO Channel 04/0 : 25[101d0] -> 30[a01c0] [send] via NET/Socket/0
gpu-st-p4d-24xlarge-51:30342:30903 [2] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-51:30342:30850 [2] NCCL INFO Channel 01/0 : 37[901d0] -> 42[201c0] [receive] via NET/Socket/1
gpu-st-p4d-24xlarge-48:30195:30738 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-48:30195:30703 [4] NCCL INFO Channel 03/0 : 21[901d0] -> 28[901c0] [receive] via NET/Socket/3
gpu-st-p4d-24xlarge-48:30192:30700 [1] NCCL INFO Channel 05/0 : 25[101d0] -> 30[a01c0] [send] via NET/Socket/1
gpu-st-p4d-24xlarge-51:30344:30902 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-51:30344:30851 [4] NCCL INFO Channel 06/0 : 41[101d0] -> 44[901c0] [receive] via NET/Socket/2
gpu-st-p4d-24xlarge-51:30342:30903 [2] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-51:30342:30850 [2] NCCL INFO Channel 04/0 : 37[901d0] -> 42[201c0] [receive] via NET/Socket/0
gpu-st-p4d-24xlarge-48:30195:30738 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-59:27918:28517 [2] NCCL INFO Channel 03 : 90[201c0] -> 95[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-48:30195:30703 [4] NCCL INFO Channel 06/0 : 21[901d0] -> 28[901c0] [receive] via NET/Socket/2
gpu-st-p4d-24xlarge-50:29211:29719 [4] NCCL INFO Channel 03 : 36[901c0] -> 39[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-47:30188:30760 [2] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-51:30344:30902 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-51:30344:30851 [4] NCCL INFO Channel 07/0 : 41[101d0] -> 44[901c0] [receive] via NET/Socket/3
gpu-st-p4d-24xlarge-48:30195:30738 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-51:30342:30903 [2] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-47:30188:30701 [2] NCCL INFO Channel 00/0 : 13[901d0] -> 18[201c0] [receive] via NET/Socket/0
gpu-st-p4d-24xlarge-51:30342:30850 [2] NCCL INFO Channel 05/0 : 37[901d0] -> 42[201c0] [receive] via NET/Socket/1
gpu-st-p4d-24xlarge-47:30188:30760 [2] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-48:30195:30703 [4] NCCL INFO Channel 07/0 : 21[901d0] -> 28[901c0] [receive] via NET/Socket/3
gpu-st-p4d-24xlarge-51:30342:30850 [2] NCCL INFO Channel 00 : 42[201c0] -> 47[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-47:30188:30701 [2] NCCL INFO Channel 01/0 : 13[901d0] -> 18[201c0] [receive] via NET/Socket/1
gpu-st-p4d-24xlarge-51:30345:30854 [5] NCCL INFO Channel 02/0 : 45[901d0] -> 52[901c0] [send] via NET/Socket/2
gpu-st-p4d-24xlarge-47:30188:30760 [2] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-47:30188:30701 [2] NCCL INFO Channel 04/0 : 13[901d0] -> 18[201c0] [receive] via NET/Socket/0
gpu-st-p4d-24xlarge-51:30345:30854 [5] NCCL INFO Channel 03/0 : 45[901d0] -> 52[901c0] [send] via NET/Socket/3
gpu-st-p4d-24xlarge-53:26989:27495 [1] NCCL INFO Channel 02/0 : 57[101d0] -> 64[101c0] [send] via NET/Socket/0
gpu-st-p4d-24xlarge-47:30188:30760 [2] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-51:30345:30854 [5] NCCL INFO Channel 06/0 : 45[901d0] -> 52[901c0] [send] via NET/Socket/2
gpu-st-p4d-24xlarge-47:30188:30701 [2] NCCL INFO Channel 05/0 : 13[901d0] -> 18[201c0] [receive] via NET/Socket/1
gpu-st-p4d-24xlarge-45:30135:30642 [6] NCCL INFO Channel 04 : 6[a01c0] -> 11[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-53:26989:27495 [1] NCCL INFO Channel 03/0 : 57[101d0] -> 64[101c0] [send] via NET/Socket/1
gpu-st-p4d-24xlarge-47:30188:30701 [2] NCCL INFO Channel 00 : 18[201c0] -> 23[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-51:30345:30854 [5] NCCL INFO Channel 07/0 : 45[901d0] -> 52[901c0] [send] via NET/Socket/3
gpu-st-p4d-24xlarge-53:26989:27495 [1] NCCL INFO Channel 06/0 : 57[101d0] -> 64[101c0] [send] via NET/Socket/0
gpu-st-p4d-24xlarge-53:26989:27495 [1] NCCL INFO Channel 07/0 : 57[101d0] -> 64[101c0] [send] via NET/Socket/1
gpu-st-p4d-24xlarge-50:29212:29721 [5] NCCL INFO Channel 00/0 : 37[901d0] -> 42[201c0] [send] via NET/Socket/2
gpu-st-p4d-24xlarge-53:26986:27559 [6] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-53:26986:27498 [6] NCCL INFO Channel 00/0 : 49[101d0] -> 54[a01c0] [receive] via NET/Socket/2
gpu-st-p4d-24xlarge-50:29212:29721 [5] NCCL INFO Channel 01/0 : 37[901d0] -> 42[201c0] [send] via NET/Socket/3
gpu-st-p4d-24xlarge-53:26986:27559 [6] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-50:29212:29721 [5] NCCL INFO Channel 04/0 : 37[901d0] -> 42[201c0] [send] via NET/Socket/2
gpu-st-p4d-24xlarge-53:26986:27498 [6] NCCL INFO Channel 01/0 : 49[101d0] -> 54[a01c0] [receive] via NET/Socket/3
gpu-st-p4d-24xlarge-50:29212:29721 [5] NCCL INFO Channel 05/0 : 37[901d0] -> 42[201c0] [send] via NET/Socket/3
gpu-st-p4d-24xlarge-53:26986:27559 [6] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-53:26986:27498 [6] NCCL INFO Channel 04/0 : 49[101d0] -> 54[a01c0] [receive] via NET/Socket/2
gpu-st-p4d-24xlarge-50:29213:29720 [6] NCCL INFO Channel 07 : 38[a01c0] -> 41[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-53:26986:27559 [6] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-59:27918:28517 [2] NCCL INFO Channel 04 : 90[201c0] -> 95[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-53:26986:27498 [6] NCCL INFO Channel 05/0 : 49[101d0] -> 54[a01c0] [receive] via NET/Socket/3
gpu-st-p4d-24xlarge-53:26986:27498 [6] NCCL INFO Channel 00 : 54[a01c0] -> 59[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25776:26344 [6] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-45:30139:30646 [2] NCCL INFO Channel 00 : 10[201c0] -> 9[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25776:26284 [6] NCCL INFO Channel 00/0 : 25[101d0] -> 30[a01c0] [receive] via NET/Socket/2
gpu-st-p4d-24xlarge-47:30190:30756 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-51:30342:30850 [2] NCCL INFO Channel 01 : 42[201c0] -> 47[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25776:26344 [6] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-47:30190:30703 [4] NCCL INFO Channel 02/0 : 17[101d0] -> 20[901c0] [receive] via NET/Socket/2
gpu-st-p4d-24xlarge-45:30135:30642 [6] NCCL INFO Channel 05 : 6[a01c0] -> 11[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25779:26285 [1] NCCL INFO Channel 02/0 : 33[101d0] -> 40[101c0] [send] via NET/Socket/0
gpu-st-p4d-24xlarge-49:25776:26284 [6] NCCL INFO Channel 01/0 : 25[101d0] -> 30[a01c0] [receive] via NET/Socket/3
gpu-st-p4d-24xlarge-47:30190:30756 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-47:30190:30703 [4] NCCL INFO Channel 03/0 : 17[101d0] -> 20[901c0] [receive] via NET/Socket/3
gpu-st-p4d-24xlarge-54:29519:30067 [1] NCCL INFO Channel 02/0 : 65[101d0] -> 68[901c0] [send] via NET/Socket/0
gpu-st-p4d-24xlarge-49:25779:26285 [1] NCCL INFO Channel 03/0 : 33[101d0] -> 40[101c0] [send] via NET/Socket/1
gpu-st-p4d-24xlarge-57:27792:28305 [7] NCCL INFO Channel 00/0 : 79[a01d0] -> 84[901c0] [send] via NET/Socket/2
gpu-st-p4d-24xlarge-47:30190:30756 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-54:29519:30067 [1] NCCL INFO Channel 03/0 : 65[101d0] -> 68[901c0] [send] via NET/Socket/1
gpu-st-p4d-24xlarge-49:25776:26344 [6] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-49:25776:26284 [6] NCCL INFO Channel 04/0 : 25[101d0] -> 30[a01c0] [receive] via NET/Socket/2
gpu-st-p4d-24xlarge-49:25779:26285 [1] NCCL INFO Channel 06/0 : 33[101d0] -> 40[101c0] [send] via NET/Socket/0
gpu-st-p4d-24xlarge-57:27792:28305 [7] NCCL INFO Channel 01/0 : 79[a01d0] -> 84[901c0] [send] via NET/Socket/3
gpu-st-p4d-24xlarge-47:30190:30703 [4] NCCL INFO Channel 06/0 : 17[101d0] -> 20[901c0] [receive] via NET/Socket/2
gpu-st-p4d-24xlarge-54:29519:30067 [1] NCCL INFO Channel 06/0 : 65[101d0] -> 68[901c0] [send] via NET/Socket/0
gpu-st-p4d-24xlarge-51:30346:30852 [6] NCCL INFO Channel 00 : 46[a01c0] -> 45[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25779:26285 [1] NCCL INFO Channel 07/0 : 33[101d0] -> 40[101c0] [send] via NET/Socket/1
gpu-st-p4d-24xlarge-49:25776:26344 [6] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-49:25776:26284 [6] NCCL INFO Channel 05/0 : 25[101d0] -> 30[a01c0] [receive] via NET/Socket/3
gpu-st-p4d-24xlarge-57:27792:28305 [7] NCCL INFO Channel 04/0 : 79[a01d0] -> 84[901c0] [send] via NET/Socket/2
gpu-st-p4d-24xlarge-47:30190:30756 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-54:29519:30067 [1] NCCL INFO Channel 07/0 : 65[101d0] -> 68[901c0] [send] via NET/Socket/1
gpu-st-p4d-24xlarge-47:30190:30703 [4] NCCL INFO Channel 07/0 : 17[101d0] -> 20[901c0] [receive] via NET/Socket/3
gpu-st-p4d-24xlarge-49:25776:26284 [6] NCCL INFO Channel 00 : 30[a01c0] -> 35[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-57:27792:28305 [7] NCCL INFO Channel 05/0 : 79[a01d0] -> 84[901c0] [send] via NET/Socket/3
gpu-st-p4d-24xlarge-49:25777:26283 [7] NCCL INFO Channel 00/0 : 31[a01d0] -> 36[901c0] [send] via NET/Socket/2
gpu-st-p4d-24xlarge-56:28176:28971 [5] NCCL INFO Channel 02/0 : 77[901d0] -> 80[101c0] [send] via NET/Socket/2
gpu-st-p4d-24xlarge-49:25777:26283 [7] NCCL INFO Channel 01/0 : 31[a01d0] -> 36[901c0] [send] via NET/Socket/3
gpu-st-p4d-24xlarge-45:30139:30646 [2] NCCL INFO Channel 01 : 10[201c0] -> 9[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-58:27724:28235 [1] NCCL INFO Channel 02/0 : 89[101d0] -> 92[901c0] [send] via NET/Socket/0
gpu-st-p4d-24xlarge-49:25777:26283 [7] NCCL INFO Channel 04/0 : 31[a01d0] -> 36[901c0] [send] via NET/Socket/2
gpu-st-p4d-24xlarge-58:27724:28235 [1] NCCL INFO Channel 03/0 : 89[101d0] -> 92[901c0] [send] via NET/Socket/1
gpu-st-p4d-24xlarge-47:30189:30700 [3] NCCL INFO Channel 00/0 : 19[201d0] -> 24[101c0] [send] via NET/Socket/0
gpu-st-p4d-24xlarge-49:25777:26283 [7] NCCL INFO Channel 05/0 : 31[a01d0] -> 36[901c0] [send] via NET/Socket/3
gpu-st-p4d-24xlarge-57:27791:28450 [6] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-58:27724:28235 [1] NCCL INFO Channel 06/0 : 89[101d0] -> 92[901c0] [send] via NET/Socket/0
gpu-st-p4d-24xlarge-47:30189:30700 [3] NCCL INFO Channel 01/0 : 19[201d0] -> 24[101c0] [send] via NET/Socket/1
gpu-st-p4d-24xlarge-45:30135:30642 [6] NCCL INFO Channel 06 : 6[a01c0] -> 11[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-59:27918:28517 [2] NCCL INFO Channel 05 : 90[201c0] -> 95[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-50:29211:29719 [4] NCCL INFO Channel 06 : 36[901c0] -> 39[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-47:30188:30701 [2] NCCL INFO Channel 01 : 18[201c0] -> 23[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-57:27791:28308 [6] NCCL INFO Channel 00/0 : 73[101d0] -> 78[a01c0] [receive] via NET/Socket/2
gpu-st-p4d-24xlarge-58:27724:28235 [1] NCCL INFO Channel 07/0 : 89[101d0] -> 92[901c0] [send] via NET/Socket/1
gpu-st-p4d-24xlarge-53:26986:27498 [6] NCCL INFO Channel 01 : 54[a01c0] -> 59[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-47:30189:30700 [3] NCCL INFO Channel 04/0 : 19[201d0] -> 24[101c0] [send] via NET/Socket/0
gpu-st-p4d-24xlarge-55:29157:29663 [5] NCCL INFO Channel 02/0 : 69[901d0] -> 76[901c0] [send] via NET/Socket/2
gpu-st-p4d-24xlarge-57:27791:28450 [6] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-55:29157:29663 [5] NCCL INFO Channel 03/0 : 69[901d0] -> 76[901c0] [send] via NET/Socket/3
gpu-st-p4d-24xlarge-47:30189:30700 [3] NCCL INFO Channel 05/0 : 19[201d0] -> 24[101c0] [send] via NET/Socket/1
gpu-st-p4d-24xlarge-57:27791:28308 [6] NCCL INFO Channel 01/0 : 73[101d0] -> 78[a01c0] [receive] via NET/Socket/3
gpu-st-p4d-24xlarge-55:29157:29663 [5] NCCL INFO Channel 06/0 : 69[901d0] -> 76[901c0] [send] via NET/Socket/2
gpu-st-p4d-24xlarge-57:27791:28450 [6] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-55:29157:29663 [5] NCCL INFO Channel 07/0 : 69[901d0] -> 76[901c0] [send] via NET/Socket/3
gpu-st-p4d-24xlarge-51:30342:30850 [2] NCCL INFO Channel 02 : 42[201c0] -> 47[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-53:26990:27493 [2] NCCL INFO Channel 00 : 58[201c0] -> 57[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-57:27791:28308 [6] NCCL INFO Channel 04/0 : 73[101d0] -> 78[a01c0] [receive] via NET/Socket/2
gpu-st-p4d-24xlarge-57:27791:28450 [6] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-57:27791:28308 [6] NCCL INFO Channel 05/0 : 73[101d0] -> 78[a01c0] [receive] via NET/Socket/3
gpu-st-p4d-24xlarge-51:30346:30852 [6] NCCL INFO Channel 01 : 46[a01c0] -> 45[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-57:27791:28308 [6] NCCL INFO Channel 00 : 78[a01c0] -> 83[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-47:30191:30699 [5] NCCL INFO Channel 02/0 : 21[901d0] -> 28[901c0] [send] via NET/Socket/2
gpu-st-p4d-24xlarge-45:30139:30646 [2] NCCL INFO Channel 02 : 10[201c0] -> 9[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-47:30191:30699 [5] NCCL INFO Channel 03/0 : 21[901d0] -> 28[901c0] [send] via NET/Socket/3
gpu-st-p4d-24xlarge-46:30086:30632 [1] NCCL INFO Channel 02/0 : 17[101d0] -> 20[901c0] [send] via NET/Socket/0
gpu-st-p4d-24xlarge-47:30191:30699 [5] NCCL INFO Channel 06/0 : 21[901d0] -> 28[901c0] [send] via NET/Socket/2
gpu-st-p4d-24xlarge-44:28070:28908 [5] NCCL INFO Channel 02/0 : 5[901d0] -> 8[101c0] [send] via NET/Socket/2
gpu-st-p4d-24xlarge-45:30135:30642 [6] NCCL INFO Channel 07 : 6[a01c0] -> 11[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-47:30191:30699 [5] NCCL INFO Channel 07/0 : 21[901d0] -> 28[901c0] [send] via NET/Socket/3
gpu-st-p4d-24xlarge-55:29154:29703 [2] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-59:27918:28517 [2] NCCL INFO Channel 06 : 90[201c0] -> 95[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-44:28070:28908 [5] NCCL INFO Channel 03/0 : 5[901d0] -> 8[101c0] [send] via NET/Socket/3
gpu-st-p4d-24xlarge-55:29154:29667 [2] NCCL INFO Channel 00/0 : 61[901d0] -> 66[201c0] [receive] via NET/Socket/0
gpu-st-p4d-24xlarge-44:28070:28908 [5] NCCL INFO Channel 06/0 : 5[901d0] -> 8[101c0] [send] via NET/Socket/2
gpu-st-p4d-24xlarge-55:29154:29703 [2] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-55:29154:29667 [2] NCCL INFO Channel 01/0 : 61[901d0] -> 66[201c0] [receive] via NET/Socket/1
gpu-st-p4d-24xlarge-59:27922:28516 [6] NCCL INFO Channel 00 : 94[a01c0] -> 93[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-44:28070:28908 [5] NCCL INFO Channel 07/0 : 5[901d0] -> 8[101c0] [send] via NET/Socket/3
gpu-st-p4d-24xlarge-55:29154:29703 [2] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-51:30342:30850 [2] NCCL INFO Channel 03 : 42[201c0] -> 47[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29154:29667 [2] NCCL INFO Channel 04/0 : 61[901d0] -> 66[201c0] [receive] via NET/Socket/0
gpu-st-p4d-24xlarge-53:26986:27498 [6] NCCL INFO Channel 02 : 54[a01c0] -> 59[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29154:29703 [2] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-51:30346:30852 [6] NCCL INFO Channel 02 : 46[a01c0] -> 45[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29154:29667 [2] NCCL INFO Channel 05/0 : 61[901d0] -> 66[201c0] [receive] via NET/Socket/1
gpu-st-p4d-24xlarge-55:29155:29662 [3] NCCL INFO Channel 00/0 : 67[201d0] -> 72[101c0] [send] via NET/Socket/0
gpu-st-p4d-24xlarge-45:30139:30646 [2] NCCL INFO Channel 03 : 10[201c0] -> 9[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25776:26284 [6] NCCL INFO Channel 01 : 30[a01c0] -> 35[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29154:29667 [2] NCCL INFO Channel 00 : 66[201c0] -> 71[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-53:26990:27493 [2] NCCL INFO Channel 01 : 58[201c0] -> 57[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-48:30196:30702 [5] NCCL INFO Channel 02/0 : 29[901d0] -> 32[101c0] [send] via NET/Socket/2
gpu-st-p4d-24xlarge-55:29155:29662 [3] NCCL INFO Channel 01/0 : 67[201d0] -> 72[101c0] [send] via NET/Socket/1
gpu-st-p4d-24xlarge-59:27918:28517 [2] NCCL INFO Channel 07 : 90[201c0] -> 95[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29155:29662 [3] NCCL INFO Channel 04/0 : 67[201d0] -> 72[101c0] [send] via NET/Socket/0
gpu-st-p4d-24xlarge-55:29156:29706 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-55:29156:29664 [4] NCCL INFO Channel 02/0 : 65[101d0] -> 68[901c0] [receive] via NET/Socket/2
gpu-st-p4d-24xlarge-55:29155:29662 [3] NCCL INFO Channel 05/0 : 67[201d0] -> 72[101c0] [send] via NET/Socket/1
gpu-st-p4d-24xlarge-52:39402:39919 [5] NCCL INFO Channel 02/0 : 53[901d0] -> 56[101c0] [send] via NET/Socket/2
gpu-st-p4d-24xlarge-55:29156:29706 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-55:29156:29664 [4] NCCL INFO Channel 03/0 : 65[101d0] -> 68[901c0] [receive] via NET/Socket/3
gpu-st-p4d-24xlarge-52:39402:39919 [5] NCCL INFO Channel 03/0 : 53[901d0] -> 56[101c0] [send] via NET/Socket/3
gpu-st-p4d-24xlarge-55:29156:29706 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-52:39402:39919 [5] NCCL INFO Channel 06/0 : 53[901d0] -> 56[101c0] [send] via NET/Socket/2
gpu-st-p4d-24xlarge-47:30188:30701 [2] NCCL INFO Channel 02 : 18[201c0] -> 23[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29156:29664 [4] NCCL INFO Channel 06/0 : 65[101d0] -> 68[901c0] [receive] via NET/Socket/2
gpu-st-p4d-24xlarge-52:39402:39919 [5] NCCL INFO Channel 07/0 : 53[901d0] -> 56[101c0] [send] via NET/Socket/3
gpu-st-p4d-24xlarge-59:27922:28516 [6] NCCL INFO Channel 01 : 94[a01c0] -> 93[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-51:30342:30850 [2] NCCL INFO Channel 04 : 42[201c0] -> 47[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-50:29211:29719 [4] NCCL INFO Channel 07 : 36[901c0] -> 39[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-57:27791:28308 [6] NCCL INFO Channel 01 : 78[a01c0] -> 83[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29156:29706 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-55:29156:29664 [4] NCCL INFO Channel 07/0 : 65[101d0] -> 68[901c0] [receive] via NET/Socket/3
gpu-st-p4d-24xlarge-53:26986:27498 [6] NCCL INFO Channel 03 : 54[a01c0] -> 59[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-51:30346:30852 [6] NCCL INFO Channel 03 : 46[a01c0] -> 45[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-53:26990:27493 [2] NCCL INFO Channel 02 : 58[201c0] -> 57[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-59:27922:28516 [6] NCCL INFO Channel 02 : 94[a01c0] -> 93[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-51:30342:30850 [2] NCCL INFO Channel 05 : 42[201c0] -> 47[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-53:26986:27498 [6] NCCL INFO Channel 04 : 54[a01c0] -> 59[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-51:30346:30852 [6] NCCL INFO Channel 04 : 46[a01c0] -> 45[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-53:26990:27493 [2] NCCL INFO Channel 03 : 58[201c0] -> 57[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29154:29667 [2] NCCL INFO Channel 01 : 66[201c0] -> 71[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25776:26284 [6] NCCL INFO Channel 02 : 30[a01c0] -> 35[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-47:30188:30701 [2] NCCL INFO Channel 03 : 18[201c0] -> 23[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-51:30342:30850 [2] NCCL INFO Channel 06 : 42[201c0] -> 47[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-53:26986:27498 [6] NCCL INFO Channel 05 : 54[a01c0] -> 59[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-45:30139:30646 [2] NCCL INFO Channel 04 : 10[201c0] -> 9[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-51:30346:30852 [6] NCCL INFO Channel 05 : 46[a01c0] -> 45[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-53:26990:27493 [2] NCCL INFO Channel 04 : 58[201c0] -> 57[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-57:27791:28308 [6] NCCL INFO Channel 02 : 78[a01c0] -> 83[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-57:27795:28304 [2] NCCL INFO Channel 00 : 82[201c0] -> 81[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-51:30342:30850 [2] NCCL INFO Channel 07 : 42[201c0] -> 47[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-53:26986:27498 [6] NCCL INFO Channel 06 : 54[a01c0] -> 59[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-51:30346:30852 [6] NCCL INFO Channel 06 : 46[a01c0] -> 45[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-56:28174:28970 [3] NCCL INFO Channel 00 : 75[201d0] -> 74[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-53:26990:27493 [2] NCCL INFO Channel 05 : 58[201c0] -> 57[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29154:29667 [2] NCCL INFO Channel 02 : 66[201c0] -> 71[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-56:28174:28970 [3] NCCL INFO Channel 01 : 75[201d0] -> 74[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25776:26284 [6] NCCL INFO Channel 03 : 30[a01c0] -> 35[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-47:30188:30701 [2] NCCL INFO Channel 04 : 18[201c0] -> 23[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-59:27922:28516 [6] NCCL INFO Channel 03 : 94[a01c0] -> 93[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-51:30346:30852 [6] NCCL INFO Channel 07 : 46[a01c0] -> 45[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-53:26986:27498 [6] NCCL INFO Channel 07 : 54[a01c0] -> 59[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-56:28174:28970 [3] NCCL INFO Channel 02 : 75[201d0] -> 74[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-50:29216:29724 [1] NCCL INFO Channel 02/0 : 41[101d0] -> 44[901c0] [send] via NET/Socket/0
gpu-st-p4d-24xlarge-57:27791:28308 [6] NCCL INFO Channel 03 : 78[a01c0] -> 83[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-57:27795:28304 [2] NCCL INFO Channel 01 : 82[201c0] -> 81[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-56:28174:28970 [3] NCCL INFO Channel 03 : 75[201d0] -> 74[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-53:26990:27493 [2] NCCL INFO Channel 06 : 58[201c0] -> 57[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-46:30084:30634 [7] NCCL INFO Channel 00 : 15[a01d0] -> 14[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-48:30194:30699 [3] NCCL INFO Channel 00 : 27[201d0] -> 26[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-56:28171:29023 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-56:28171:28973 [0] NCCL INFO Channel 00/0 : 67[201d0] -> 72[101c0] [receive] via NET/Socket/0
gpu-st-p4d-24xlarge-55:29154:29667 [2] NCCL INFO Channel 03 : 66[201c0] -> 71[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-58:27719:28283 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-56:28174:28970 [3] NCCL INFO Channel 04 : 75[201d0] -> 74[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-58:27719:28237 [4] NCCL INFO Channel 00/0 : 79[a01d0] -> 84[901c0] [receive] via NET/Socket/2
gpu-st-p4d-24xlarge-46:30084:30634 [7] NCCL INFO Channel 01 : 15[a01d0] -> 14[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-48:30194:30699 [3] NCCL INFO Channel 01 : 27[201d0] -> 26[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-47:30188:30701 [2] NCCL INFO Channel 05 : 18[201c0] -> 23[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-57:27791:28308 [6] NCCL INFO Channel 04 : 78[a01c0] -> 83[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29158:29665 [6] NCCL INFO Channel 00 : 70[a01c0] -> 69[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-56:28174:28970 [3] NCCL INFO Channel 05 : 75[201d0] -> 74[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-57:27795:28304 [2] NCCL INFO Channel 02 : 82[201c0] -> 81[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25776:26284 [6] NCCL INFO Channel 04 : 30[a01c0] -> 35[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-46:30084:30634 [7] NCCL INFO Channel 02 : 15[a01d0] -> 14[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-46:30081:30686 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-48:30194:30699 [3] NCCL INFO Channel 02 : 27[201d0] -> 26[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-47:30192:30698 [6] NCCL INFO Channel 00 : 22[a01c0] -> 21[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-46:30081:30633 [4] NCCL INFO Channel 00/0 : 7[a01d0] -> 12[901c0] [receive] via NET/Socket/2
gpu-st-p4d-24xlarge-53:26990:27493 [2] NCCL INFO Channel 07 : 58[201c0] -> 57[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-56:28174:28970 [3] NCCL INFO Channel 06 : 75[201d0] -> 74[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25780:26286 [2] NCCL INFO Channel 00 : 34[201c0] -> 33[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-46:30084:30634 [7] NCCL INFO Channel 03 : 15[a01d0] -> 14[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-48:30194:30699 [3] NCCL INFO Channel 03 : 27[201d0] -> 26[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-57:27791:28308 [6] NCCL INFO Channel 05 : 78[a01c0] -> 83[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-48:30191:30741 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-48:30191:30698 [0] NCCL INFO Channel 00/0 : 19[201d0] -> 24[101c0] [receive] via NET/Socket/0
gpu-st-p4d-24xlarge-57:27795:28304 [2] NCCL INFO Channel 03 : 82[201c0] -> 81[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-52:39397:39974 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-58:27722:28233 [7] NCCL INFO Channel 00 : 87[a01d0] -> 86[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-59:27922:28516 [6] NCCL INFO Channel 04 : 94[a01c0] -> 93[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-56:28174:28970 [3] NCCL INFO Channel 07 : 75[201d0] -> 74[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-46:30084:30634 [7] NCCL INFO Channel 04 : 15[a01d0] -> 14[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-52:39397:39920 [0] NCCL INFO Channel 00/0 : 43[201d0] -> 48[101c0] [receive] via NET/Socket/0
gpu-st-p4d-24xlarge-48:30194:30699 [3] NCCL INFO Channel 04 : 27[201d0] -> 26[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29154:29667 [2] NCCL INFO Channel 04 : 66[201c0] -> 71[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-54:29514:30119 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-54:29514:30069 [4] NCCL INFO Channel 00/0 : 55[a01d0] -> 60[901c0] [receive] via NET/Socket/2
gpu-st-p4d-24xlarge-47:30188:30701 [2] NCCL INFO Channel 06 : 18[201c0] -> 23[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-52:39400:39922 [3] NCCL INFO Channel 00 : 51[201d0] -> 50[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-44:28065:28962 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-54:29514:30119 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-44:28065:28903 [0] NCCL INFO Channel 00/0 : 91[201d0] -> 0[101c0] [receive] via NET/Socket/0
gpu-st-p4d-24xlarge-54:29514:30069 [4] NCCL INFO Channel 01/0 : 55[a01d0] -> 60[901c0] [receive] via NET/Socket/3
gpu-st-p4d-24xlarge-54:29514:30119 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-54:29514:30069 [4] NCCL INFO Channel 04/0 : 55[a01d0] -> 60[901c0] [receive] via NET/Socket/2
gpu-st-p4d-24xlarge-57:27791:28308 [6] NCCL INFO Channel 06 : 78[a01c0] -> 83[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-46:30084:30634 [7] NCCL INFO Channel 05 : 15[a01d0] -> 14[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-44:28065:28962 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-44:28065:28903 [0] NCCL INFO Channel 01/0 : 91[201d0] -> 0[101c0] [receive] via NET/Socket/1
gpu-st-p4d-24xlarge-55:29158:29665 [6] NCCL INFO Channel 01 : 70[a01c0] -> 69[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-48:30194:30699 [3] NCCL INFO Channel 05 : 27[201d0] -> 26[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-54:29514:30119 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-44:28065:28962 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-49:25776:26284 [6] NCCL INFO Channel 05 : 30[a01c0] -> 35[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-44:28065:28903 [0] NCCL INFO Channel 04/0 : 91[201d0] -> 0[101c0] [receive] via NET/Socket/0
gpu-st-p4d-24xlarge-57:27795:28304 [2] NCCL INFO Channel 04 : 82[201c0] -> 81[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-54:29514:30069 [4] NCCL INFO Channel 05/0 : 55[a01d0] -> 60[901c0] [receive] via NET/Socket/3
gpu-st-p4d-24xlarge-54:29514:30069 [4] NCCL INFO Channel 00 : 60[901c0] -> 65[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-44:28065:28962 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-44:28065:28903 [0] NCCL INFO Channel 05/0 : 91[201d0] -> 0[101c0] [receive] via NET/Socket/1
gpu-st-p4d-24xlarge-54:29517:30068 [7] NCCL INFO Channel 00 : 63[a01d0] -> 62[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-44:28065:28903 [0] NCCL INFO Channel 00 : 0[101c0] -> 5[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-44:28068:28905 [3] NCCL INFO Channel 00 : 3[201d0] -> 2[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-46:30084:30634 [7] NCCL INFO Channel 06 : 15[a01d0] -> 14[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-47:30192:30698 [6] NCCL INFO Channel 01 : 22[a01c0] -> 21[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-48:30194:30699 [3] NCCL INFO Channel 06 : 27[201d0] -> 26[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25780:26286 [2] NCCL INFO Channel 01 : 34[201c0] -> 33[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-45:30139:30646 [2] NCCL INFO Channel 05 : 10[201c0] -> 9[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-57:27791:28308 [6] NCCL INFO Channel 07 : 78[a01c0] -> 83[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-57:27795:28304 [2] NCCL INFO Channel 05 : 82[201c0] -> 81[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29154:29667 [2] NCCL INFO Channel 05 : 66[201c0] -> 71[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-46:30084:30634 [7] NCCL INFO Channel 07 : 15[a01d0] -> 14[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-48:30194:30699 [3] NCCL INFO Channel 07 : 27[201d0] -> 26[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-52:39400:39922 [3] NCCL INFO Channel 01 : 51[201d0] -> 50[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-47:30188:30701 [2] NCCL INFO Channel 07 : 18[201c0] -> 23[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25776:26284 [6] NCCL INFO Channel 06 : 30[a01c0] -> 35[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-44:28065:28903 [0] NCCL INFO Channel 01 : 0[101c0] -> 5[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-44:28068:28905 [3] NCCL INFO Channel 01 : 3[201d0] -> 2[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-47:30192:30698 [6] NCCL INFO Channel 02 : 22[a01c0] -> 21[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29154:29667 [2] NCCL INFO Channel 06 : 66[201c0] -> 71[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25780:26286 [2] NCCL INFO Channel 02 : 34[201c0] -> 33[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-52:39400:39922 [3] NCCL INFO Channel 02 : 51[201d0] -> 50[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-57:27795:28304 [2] NCCL INFO Channel 06 : 82[201c0] -> 81[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-44:28065:28903 [0] NCCL INFO Channel 04 : 0[101c0] -> 5[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25776:26284 [6] NCCL INFO Channel 07 : 30[a01c0] -> 35[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-44:28068:28905 [3] NCCL INFO Channel 02 : 3[201d0] -> 2[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29154:29667 [2] NCCL INFO Channel 07 : 66[201c0] -> 71[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-52:39400:39922 [3] NCCL INFO Channel 03 : 51[201d0] -> 50[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-45:30139:30646 [2] NCCL INFO Channel 06 : 10[201c0] -> 9[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25780:26286 [2] NCCL INFO Channel 03 : 34[201c0] -> 33[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-58:27722:28233 [7] NCCL INFO Channel 01 : 87[a01d0] -> 86[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-59:27922:28516 [6] NCCL INFO Channel 05 : 94[a01c0] -> 93[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-54:29514:30069 [4] NCCL INFO Channel 01 : 60[901c0] -> 65[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-54:29517:30068 [7] NCCL INFO Channel 01 : 63[a01d0] -> 62[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-54:29514:30069 [4] NCCL INFO Channel 04 : 60[901c0] -> 65[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-54:29517:30068 [7] NCCL INFO Channel 02 : 63[a01d0] -> 62[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-54:29514:30069 [4] NCCL INFO Channel 05 : 60[901c0] -> 65[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-47:30192:30698 [6] NCCL INFO Channel 03 : 22[a01c0] -> 21[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-54:29517:30068 [7] NCCL INFO Channel 03 : 63[a01d0] -> 62[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-52:39400:39922 [3] NCCL INFO Channel 04 : 51[201d0] -> 50[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-44:28065:28903 [0] NCCL INFO Channel 05 : 0[101c0] -> 5[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-44:28068:28905 [3] NCCL INFO Channel 03 : 3[201d0] -> 2[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-52:39400:39922 [3] NCCL INFO Channel 05 : 51[201d0] -> 50[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-54:29517:30068 [7] NCCL INFO Channel 04 : 63[a01d0] -> 62[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-58:27722:28233 [7] NCCL INFO Channel 02 : 87[a01d0] -> 86[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-52:39400:39922 [3] NCCL INFO Channel 06 : 51[201d0] -> 50[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25780:26286 [2] NCCL INFO Channel 04 : 34[201c0] -> 33[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29158:29665 [6] NCCL INFO Channel 02 : 70[a01c0] -> 69[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-54:29517:30068 [7] NCCL INFO Channel 05 : 63[a01d0] -> 62[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-52:39399:39918 [2] NCCL INFO Channel 00 : 50[201c0] -> 49[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-58:27722:28233 [7] NCCL INFO Channel 03 : 87[a01d0] -> 86[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-54:29516:30071 [6] NCCL INFO Channel 00 : 62[a01c0] -> 61[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-45:30139:30646 [2] NCCL INFO Channel 07 : 10[201c0] -> 9[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-50:29211:29780 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-50:29211:29719 [4] NCCL INFO Channel 00/0 : 31[a01d0] -> 36[901c0] [receive] via NET/Socket/2
gpu-st-p4d-24xlarge-52:39400:39922 [3] NCCL INFO Channel 07 : 51[201d0] -> 50[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-50:29211:29780 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-50:29211:29719 [4] NCCL INFO Channel 01/0 : 31[a01d0] -> 36[901c0] [receive] via NET/Socket/3
gpu-st-p4d-24xlarge-44:28068:28905 [3] NCCL INFO Channel 04 : 3[201d0] -> 2[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-50:29211:29780 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-50:29211:29719 [4] NCCL INFO Channel 04/0 : 31[a01d0] -> 36[901c0] [receive] via NET/Socket/2
gpu-st-p4d-24xlarge-50:29211:29780 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-47:30192:30698 [6] NCCL INFO Channel 04 : 22[a01c0] -> 21[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-54:29517:30068 [7] NCCL INFO Channel 06 : 63[a01d0] -> 62[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-50:29211:29719 [4] NCCL INFO Channel 05/0 : 31[a01d0] -> 36[901c0] [receive] via NET/Socket/3
gpu-st-p4d-24xlarge-50:29211:29719 [4] NCCL INFO Channel 00 : 36[901c0] -> 41[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-52:39399:39918 [2] NCCL INFO Channel 01 : 50[201c0] -> 49[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-58:27722:28233 [7] NCCL INFO Channel 04 : 87[a01d0] -> 86[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-54:29516:30071 [6] NCCL INFO Channel 01 : 62[a01c0] -> 61[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-57:27795:28304 [2] NCCL INFO Channel 07 : 82[201c0] -> 81[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-44:28068:28905 [3] NCCL INFO Channel 05 : 3[201d0] -> 2[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-52:39399:39918 [2] NCCL INFO Channel 04 : 50[201c0] -> 49[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-54:29517:30068 [7] NCCL INFO Channel 07 : 63[a01d0] -> 62[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-50:29214:29723 [7] NCCL INFO Channel 00 : 39[a01d0] -> 38[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-58:27722:28233 [7] NCCL INFO Channel 05 : 87[a01d0] -> 86[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-54:29516:30071 [6] NCCL INFO Channel 04 : 62[a01c0] -> 61[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-44:28067:28911 [2] NCCL INFO Channel 00 : 2[201c0] -> 1[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-52:39399:39918 [2] NCCL INFO Channel 05 : 50[201c0] -> 49[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-58:27721:28234 [6] NCCL INFO Channel 00 : 86[a01c0] -> 85[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-44:28068:28905 [3] NCCL INFO Channel 06 : 3[201d0] -> 2[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-50:29211:29719 [4] NCCL INFO Channel 01 : 36[901c0] -> 41[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-58:27722:28233 [7] NCCL INFO Channel 06 : 87[a01d0] -> 86[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-54:29516:30071 [6] NCCL INFO Channel 05 : 62[a01c0] -> 61[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25780:26286 [2] NCCL INFO Channel 05 : 34[201c0] -> 33[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-44:28067:28911 [2] NCCL INFO Channel 01 : 2[201c0] -> 1[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-58:27721:28234 [6] NCCL INFO Channel 01 : 86[a01c0] -> 85[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-51:30347:30853 [7] NCCL INFO Channel 00 : 47[a01d0] -> 46[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-50:29214:29723 [7] NCCL INFO Channel 01 : 39[a01d0] -> 38[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-44:28068:28905 [3] NCCL INFO Channel 07 : 3[201d0] -> 2[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-58:27722:28233 [7] NCCL INFO Channel 07 : 87[a01d0] -> 86[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29158:29665 [6] NCCL INFO Channel 03 : 70[a01c0] -> 69[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-51:30347:30853 [7] NCCL INFO Channel 01 : 47[a01d0] -> 46[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-50:29211:29719 [4] NCCL INFO Channel 04 : 36[901c0] -> 41[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-44:28067:28911 [2] NCCL INFO Channel 04 : 2[201c0] -> 1[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-58:27721:28234 [6] NCCL INFO Channel 04 : 86[a01c0] -> 85[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-51:30347:30853 [7] NCCL INFO Channel 02 : 47[a01d0] -> 46[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-45:30145:30643 [3] NCCL INFO Channel 00 : 11[201d0] -> 10[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-47:30192:30698 [6] NCCL INFO Channel 05 : 22[a01c0] -> 21[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-50:29214:29723 [7] NCCL INFO Channel 02 : 39[a01d0] -> 38[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-44:28067:28911 [2] NCCL INFO Channel 05 : 2[201c0] -> 1[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-58:27721:28234 [6] NCCL INFO Channel 05 : 86[a01c0] -> 85[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-51:30347:30853 [7] NCCL INFO Channel 03 : 47[a01d0] -> 46[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-50:29211:29719 [4] NCCL INFO Channel 05 : 36[901c0] -> 41[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-45:30145:30643 [3] NCCL INFO Channel 01 : 11[201d0] -> 10[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-51:30347:30853 [7] NCCL INFO Channel 04 : 47[a01d0] -> 46[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-45:30145:30643 [3] NCCL INFO Channel 02 : 11[201d0] -> 10[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-50:29214:29723 [7] NCCL INFO Channel 03 : 39[a01d0] -> 38[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-51:30347:30853 [7] NCCL INFO Channel 05 : 47[a01d0] -> 46[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-45:30145:30643 [3] NCCL INFO Channel 03 : 11[201d0] -> 10[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29158:29665 [6] NCCL INFO Channel 04 : 70[a01c0] -> 69[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25780:26286 [2] NCCL INFO Channel 06 : 34[201c0] -> 33[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-51:30347:30853 [7] NCCL INFO Channel 06 : 47[a01d0] -> 46[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-45:30145:30643 [3] NCCL INFO Channel 04 : 11[201d0] -> 10[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-51:30347:30853 [7] NCCL INFO Channel 07 : 47[a01d0] -> 46[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-50:29214:29723 [7] NCCL INFO Channel 04 : 39[a01d0] -> 38[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-45:30145:30643 [3] NCCL INFO Channel 05 : 11[201d0] -> 10[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-47:30192:30698 [6] NCCL INFO Channel 06 : 22[a01c0] -> 21[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-45:30145:30643 [3] NCCL INFO Channel 06 : 11[201d0] -> 10[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-50:29214:29723 [7] NCCL INFO Channel 05 : 39[a01d0] -> 38[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-45:30145:30643 [3] NCCL INFO Channel 07 : 11[201d0] -> 10[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-50:29214:29723 [7] NCCL INFO Channel 06 : 39[a01d0] -> 38[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29158:29665 [6] NCCL INFO Channel 05 : 70[a01c0] -> 69[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25780:26286 [2] NCCL INFO Channel 07 : 34[201c0] -> 33[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-50:29214:29723 [7] NCCL INFO Channel 07 : 39[a01d0] -> 38[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-47:30192:30698 [6] NCCL INFO Channel 07 : 22[a01c0] -> 21[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-59:27922:28516 [6] NCCL INFO Channel 06 : 94[a01c0] -> 93[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29158:29665 [6] NCCL INFO Channel 06 : 70[a01c0] -> 69[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-47:30193:30702 [7] NCCL INFO Channel 00 : 23[a01d0] -> 22[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-53:26991:27496 [3] NCCL INFO Channel 00 : 59[201d0] -> 58[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-53:26991:27496 [3] NCCL INFO Channel 01 : 59[201d0] -> 58[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-53:26991:27496 [3] NCCL INFO Channel 02 : 59[201d0] -> 58[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-53:26991:27496 [3] NCCL INFO Channel 03 : 59[201d0] -> 58[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-53:26991:27496 [3] NCCL INFO Channel 04 : 59[201d0] -> 58[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-53:26991:27496 [3] NCCL INFO Channel 05 : 59[201d0] -> 58[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-53:26991:27496 [3] NCCL INFO Channel 06 : 59[201d0] -> 58[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-53:26991:27496 [3] NCCL INFO Channel 07 : 59[201d0] -> 58[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-59:27922:28516 [6] NCCL INFO Channel 07 : 94[a01c0] -> 93[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29159:29666 [7] NCCL INFO Channel 00 : 71[a01d0] -> 70[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-51:30347:30853 [7] NCCL INFO Connected all rings
gpu-st-p4d-24xlarge-47:30193:30702 [7] NCCL INFO Channel 01 : 23[a01d0] -> 22[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-59:27923:28518 [7] NCCL INFO Channel 00 : 95[a01d0] -> 94[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-45:30145:30643 [3] NCCL INFO Connected all rings
gpu-st-p4d-24xlarge-49:25781:26281 [3] NCCL INFO Channel 00 : 35[201d0] -> 34[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29158:29665 [6] NCCL INFO Channel 07 : 70[a01c0] -> 69[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-47:30193:30702 [7] NCCL INFO Channel 02 : 23[a01d0] -> 22[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-53:26991:27496 [3] NCCL INFO Connected all rings
gpu-st-p4d-24xlarge-59:27923:28518 [7] NCCL INFO Channel 01 : 95[a01d0] -> 94[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29159:29666 [7] NCCL INFO Channel 01 : 71[a01d0] -> 70[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-47:30193:30702 [7] NCCL INFO Channel 03 : 23[a01d0] -> 22[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25781:26281 [3] NCCL INFO Channel 01 : 35[201d0] -> 34[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-59:27923:28518 [7] NCCL INFO Channel 02 : 95[a01d0] -> 94[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29159:29666 [7] NCCL INFO Channel 02 : 71[a01d0] -> 70[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-47:30193:30702 [7] NCCL INFO Channel 04 : 23[a01d0] -> 22[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-57:27796:28307 [3] NCCL INFO Channel 00 : 83[201d0] -> 82[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-59:27923:28518 [7] NCCL INFO Channel 03 : 95[a01d0] -> 94[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25781:26281 [3] NCCL INFO Channel 02 : 35[201d0] -> 34[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29159:29666 [7] NCCL INFO Channel 03 : 71[a01d0] -> 70[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-47:30193:30702 [7] NCCL INFO Channel 05 : 23[a01d0] -> 22[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-59:27923:28518 [7] NCCL INFO Channel 04 : 95[a01d0] -> 94[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25781:26281 [3] NCCL INFO Channel 03 : 35[201d0] -> 34[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29159:29666 [7] NCCL INFO Channel 04 : 71[a01d0] -> 70[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-57:27796:28307 [3] NCCL INFO Channel 01 : 83[201d0] -> 82[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25781:26281 [3] NCCL INFO Channel 04 : 35[201d0] -> 34[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-47:30193:30702 [7] NCCL INFO Channel 06 : 23[a01d0] -> 22[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29159:29666 [7] NCCL INFO Channel 05 : 71[a01d0] -> 70[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-57:27796:28307 [3] NCCL INFO Channel 02 : 83[201d0] -> 82[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25781:26281 [3] NCCL INFO Channel 05 : 35[201d0] -> 34[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-59:27923:28518 [7] NCCL INFO Channel 05 : 95[a01d0] -> 94[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25781:26281 [3] NCCL INFO Channel 06 : 35[201d0] -> 34[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29159:29666 [7] NCCL INFO Channel 06 : 71[a01d0] -> 70[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25781:26281 [3] NCCL INFO Channel 07 : 35[201d0] -> 34[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29159:29666 [7] NCCL INFO Channel 07 : 71[a01d0] -> 70[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-57:27796:28307 [3] NCCL INFO Channel 03 : 83[201d0] -> 82[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-57:27796:28307 [3] NCCL INFO Channel 04 : 83[201d0] -> 82[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-57:27796:28307 [3] NCCL INFO Channel 05 : 83[201d0] -> 82[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-57:27796:28307 [3] NCCL INFO Channel 06 : 83[201d0] -> 82[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-47:30193:30702 [7] NCCL INFO Channel 07 : 23[a01d0] -> 22[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-59:27923:28518 [7] NCCL INFO Channel 06 : 95[a01d0] -> 94[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-57:27796:28307 [3] NCCL INFO Channel 07 : 83[201d0] -> 82[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-59:27923:28518 [7] NCCL INFO Channel 07 : 95[a01d0] -> 94[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-57:27796:28307 [3] NCCL INFO Connected all rings
gpu-st-p4d-24xlarge-55:29159:29666 [7] NCCL INFO Connected all rings
gpu-st-p4d-24xlarge-49:25781:26281 [3] NCCL INFO Connected all rings
gpu-st-p4d-24xlarge-47:30193:30702 [7] NCCL INFO Connected all rings
gpu-st-p4d-24xlarge-59:27923:28518 [7] NCCL INFO Connected all rings
gpu-st-p4d-24xlarge-56:28176:28971 [5] NCCL INFO Channel 03/0 : 77[901d0] -> 80[101c0] [send] via NET/Socket/3
gpu-st-p4d-24xlarge-56:28176:28971 [5] NCCL INFO Channel 06/0 : 77[901d0] -> 80[101c0] [send] via NET/Socket/2
gpu-st-p4d-24xlarge-56:28171:29023 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-56:28171:28973 [0] NCCL INFO Channel 01/0 : 67[201d0] -> 72[101c0] [receive] via NET/Socket/1
gpu-st-p4d-24xlarge-56:28176:28971 [5] NCCL INFO Channel 07/0 : 77[901d0] -> 80[101c0] [send] via NET/Socket/3
gpu-st-p4d-24xlarge-56:28171:29023 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-56:28171:28973 [0] NCCL INFO Channel 04/0 : 67[201d0] -> 72[101c0] [receive] via NET/Socket/0
gpu-st-p4d-24xlarge-56:28171:29023 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-56:28171:28973 [0] NCCL INFO Channel 05/0 : 67[201d0] -> 72[101c0] [receive] via NET/Socket/1
gpu-st-p4d-24xlarge-56:28171:28973 [0] NCCL INFO Channel 00 : 72[101c0] -> 77[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-56:28171:28973 [0] NCCL INFO Channel 01 : 72[101c0] -> 77[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-46:30086:30632 [1] NCCL INFO Channel 03/0 : 17[101d0] -> 20[901c0] [send] via NET/Socket/1
gpu-st-p4d-24xlarge-56:28171:28973 [0] NCCL INFO Channel 04 : 72[101c0] -> 77[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-46:30081:30686 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-46:30081:30633 [4] NCCL INFO Channel 01/0 : 7[a01d0] -> 12[901c0] [receive] via NET/Socket/3
gpu-st-p4d-24xlarge-46:30086:30632 [1] NCCL INFO Channel 06/0 : 17[101d0] -> 20[901c0] [send] via NET/Socket/0
gpu-st-p4d-24xlarge-46:30086:30632 [1] NCCL INFO Channel 07/0 : 17[101d0] -> 20[901c0] [send] via NET/Socket/1
gpu-st-p4d-24xlarge-46:30081:30686 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-46:30081:30633 [4] NCCL INFO Channel 04/0 : 7[a01d0] -> 12[901c0] [receive] via NET/Socket/2
gpu-st-p4d-24xlarge-48:30196:30702 [5] NCCL INFO Channel 03/0 : 29[901d0] -> 32[101c0] [send] via NET/Socket/3
gpu-st-p4d-24xlarge-46:30081:30686 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-48:30196:30702 [5] NCCL INFO Channel 06/0 : 29[901d0] -> 32[101c0] [send] via NET/Socket/2
gpu-st-p4d-24xlarge-48:30191:30741 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-48:30191:30698 [0] NCCL INFO Channel 01/0 : 19[201d0] -> 24[101c0] [receive] via NET/Socket/1
gpu-st-p4d-24xlarge-46:30081:30633 [4] NCCL INFO Channel 05/0 : 7[a01d0] -> 12[901c0] [receive] via NET/Socket/3
gpu-st-p4d-24xlarge-46:30081:30633 [4] NCCL INFO Channel 00 : 12[901c0] -> 17[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-48:30196:30702 [5] NCCL INFO Channel 07/0 : 29[901d0] -> 32[101c0] [send] via NET/Socket/3
gpu-st-p4d-24xlarge-56:28171:28973 [0] NCCL INFO Channel 05 : 72[101c0] -> 77[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-48:30191:30741 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-48:30191:30698 [0] NCCL INFO Channel 04/0 : 19[201d0] -> 24[101c0] [receive] via NET/Socket/0
gpu-st-p4d-24xlarge-48:30191:30741 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-48:30191:30698 [0] NCCL INFO Channel 05/0 : 19[201d0] -> 24[101c0] [receive] via NET/Socket/1
gpu-st-p4d-24xlarge-48:30191:30698 [0] NCCL INFO Channel 00 : 24[101c0] -> 29[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-52:39397:39974 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-52:39397:39920 [0] NCCL INFO Channel 01/0 : 43[201d0] -> 48[101c0] [receive] via NET/Socket/1
gpu-st-p4d-24xlarge-46:30081:30633 [4] NCCL INFO Channel 01 : 12[901c0] -> 17[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-52:39397:39974 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-52:39397:39920 [0] NCCL INFO Channel 04/0 : 43[201d0] -> 48[101c0] [receive] via NET/Socket/0
gpu-st-p4d-24xlarge-52:39397:39974 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-48:30191:30698 [0] NCCL INFO Channel 01 : 24[101c0] -> 29[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-52:39397:39920 [0] NCCL INFO Channel 05/0 : 43[201d0] -> 48[101c0] [receive] via NET/Socket/1
gpu-st-p4d-24xlarge-52:39397:39920 [0] NCCL INFO Channel 00 : 48[101c0] -> 53[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-52:39397:39920 [0] NCCL INFO Channel 01 : 48[101c0] -> 53[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-58:27719:28283 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-58:27719:28237 [4] NCCL INFO Channel 01/0 : 79[a01d0] -> 84[901c0] [receive] via NET/Socket/3
gpu-st-p4d-24xlarge-58:27719:28283 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-58:27719:28237 [4] NCCL INFO Channel 04/0 : 79[a01d0] -> 84[901c0] [receive] via NET/Socket/2
gpu-st-p4d-24xlarge-58:27719:28283 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-58:27719:28237 [4] NCCL INFO Channel 05/0 : 79[a01d0] -> 84[901c0] [receive] via NET/Socket/3
gpu-st-p4d-24xlarge-58:27719:28237 [4] NCCL INFO Channel 00 : 84[901c0] -> 89[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-46:30081:30633 [4] NCCL INFO Channel 04 : 12[901c0] -> 17[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-52:39397:39920 [0] NCCL INFO Channel 04 : 48[101c0] -> 53[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-48:30191:30698 [0] NCCL INFO Channel 04 : 24[101c0] -> 29[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-50:29216:29724 [1] NCCL INFO Channel 03/0 : 41[101d0] -> 44[901c0] [send] via NET/Socket/1
gpu-st-p4d-24xlarge-58:27719:28237 [4] NCCL INFO Channel 01 : 84[901c0] -> 89[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-50:29216:29724 [1] NCCL INFO Channel 06/0 : 41[101d0] -> 44[901c0] [send] via NET/Socket/0
gpu-st-p4d-24xlarge-52:39397:39920 [0] NCCL INFO Channel 05 : 48[101c0] -> 53[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-50:29216:29724 [1] NCCL INFO Channel 07/0 : 41[101d0] -> 44[901c0] [send] via NET/Socket/1
gpu-st-p4d-24xlarge-58:27719:28237 [4] NCCL INFO Channel 04 : 84[901c0] -> 89[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-58:27719:28237 [4] NCCL INFO Channel 05 : 84[901c0] -> 89[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-48:30191:30698 [0] NCCL INFO Channel 05 : 24[101c0] -> 29[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-46:30081:30633 [4] NCCL INFO Channel 05 : 12[901c0] -> 17[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-46:30083:30635 [6] NCCL INFO Channel 00 : 14[a01c0] -> 13[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-46:30083:30635 [6] NCCL INFO Channel 01 : 14[a01c0] -> 13[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-46:30083:30635 [6] NCCL INFO Channel 04 : 14[a01c0] -> 13[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-46:30083:30635 [6] NCCL INFO Channel 05 : 14[a01c0] -> 13[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-48:30193:30701 [2] NCCL INFO Channel 00 : 26[201c0] -> 25[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-48:30193:30701 [2] NCCL INFO Channel 01 : 26[201c0] -> 25[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-48:30193:30701 [2] NCCL INFO Channel 04 : 26[201c0] -> 25[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-48:30193:30701 [2] NCCL INFO Channel 05 : 26[201c0] -> 25[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-50:29213:29720 [6] NCCL INFO Channel 00 : 38[a01c0] -> 37[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-50:29213:29720 [6] NCCL INFO Channel 01 : 38[a01c0] -> 37[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-50:29213:29720 [6] NCCL INFO Channel 04 : 38[a01c0] -> 37[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-50:29213:29720 [6] NCCL INFO Channel 05 : 38[a01c0] -> 37[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-56:28173:28967 [2] NCCL INFO Channel 00 : 74[201c0] -> 73[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-56:28173:28967 [2] NCCL INFO Channel 01 : 74[201c0] -> 73[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-56:28173:28967 [2] NCCL INFO Channel 04 : 74[201c0] -> 73[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-56:28173:28967 [2] NCCL INFO Channel 05 : 74[201c0] -> 73[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-45:30138:30645 [1] NCCL INFO Channel 00 : 9[101d0] -> 8[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-45:30138:30645 [1] NCCL INFO Channel 01 : 9[101d0] -> 8[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-45:30138:30645 [1] NCCL INFO Channel 04 : 9[101d0] -> 8[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-45:30138:30645 [1] NCCL INFO Channel 05 : 9[101d0] -> 8[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-44:28070:28908 [5] NCCL INFO Channel 00 : 5[901d0] -> 4[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-44:28070:28908 [5] NCCL INFO Channel 01 : 5[901d0] -> 4[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-44:28070:28908 [5] NCCL INFO Channel 04 : 5[901d0] -> 4[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-44:28070:28908 [5] NCCL INFO Channel 05 : 5[901d0] -> 4[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-52:39401:39921 [4] NCCL INFO Channel 02 : 52[901c0] -> 49[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-52:39401:39921 [4] NCCL INFO Channel 03 : 52[901c0] -> 49[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-52:39401:39921 [4] NCCL INFO Channel 06 : 52[901c0] -> 49[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-52:39401:39921 [4] NCCL INFO Channel 07 : 52[901c0] -> 49[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-51:30345:30854 [5] NCCL INFO Channel 00 : 45[901d0] -> 44[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-51:30345:30854 [5] NCCL INFO Channel 01 : 45[901d0] -> 44[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-51:30345:30854 [5] NCCL INFO Channel 04 : 45[901d0] -> 44[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-51:30345:30854 [5] NCCL INFO Channel 05 : 45[901d0] -> 44[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-45:30139:30646 [2] NCCL INFO Connected all rings
gpu-st-p4d-24xlarge-45:30139:30646 [2] NCCL INFO Channel 00 : 10[201c0] -> 11[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-45:30139:30646 [2] NCCL INFO Channel 01 : 10[201c0] -> 11[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-45:30139:30646 [2] NCCL INFO Channel 02 : 10[201c0] -> 11[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-57:27794:28306 [1] NCCL INFO Channel 00 : 81[101d0] -> 80[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-45:30139:30646 [2] NCCL INFO Channel 03 : 10[201c0] -> 11[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-45:30139:30646 [2] NCCL INFO Channel 04 : 10[201c0] -> 11[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-45:30139:30646 [2] NCCL INFO Channel 05 : 10[201c0] -> 11[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-45:30139:30646 [2] NCCL INFO Channel 06 : 10[201c0] -> 11[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-46:30085:30637 [0] NCCL INFO Channel 02 : 16[101c0] -> 13[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-45:30139:30646 [2] NCCL INFO Channel 07 : 10[201c0] -> 11[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-46:30085:30637 [0] NCCL INFO Channel 03 : 16[101c0] -> 13[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-46:30085:30637 [0] NCCL INFO Channel 06 : 16[101c0] -> 13[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-57:27794:28306 [1] NCCL INFO Channel 01 : 81[101d0] -> 80[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-46:30085:30637 [0] NCCL INFO Channel 07 : 16[101c0] -> 13[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-57:27794:28306 [1] NCCL INFO Channel 04 : 81[101d0] -> 80[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-57:27794:28306 [1] NCCL INFO Channel 05 : 81[101d0] -> 80[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-53:26989:27495 [1] NCCL INFO Channel 00 : 57[101d0] -> 56[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-51:30346:30852 [6] NCCL INFO Connected all rings
gpu-st-p4d-24xlarge-53:26989:27495 [1] NCCL INFO Channel 01 : 57[101d0] -> 56[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-45:30145:30643 [3] NCCL INFO Channel 02 : 11[201d0] -> 6[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-53:26989:27495 [1] NCCL INFO Channel 04 : 57[101d0] -> 56[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-45:30145:30643 [3] NCCL INFO Channel 03 : 11[201d0] -> 6[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-53:26989:27495 [1] NCCL INFO Channel 05 : 57[101d0] -> 56[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-45:30145:30643 [3] NCCL INFO Channel 06 : 11[201d0] -> 6[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-45:30145:30643 [3] NCCL INFO Channel 07 : 11[201d0] -> 6[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-51:30346:30852 [6] NCCL INFO Channel 00 : 46[a01c0] -> 47[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-59:27920:28519 [4] NCCL INFO Channel 00 : 92[901c0] -> 91[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-51:30346:30852 [6] NCCL INFO Channel 01 : 46[a01c0] -> 47[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-51:30346:30852 [6] NCCL INFO Channel 02 : 46[a01c0] -> 47[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-59:27920:28519 [4] NCCL INFO Channel 01 : 92[901c0] -> 91[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-51:30346:30852 [6] NCCL INFO Channel 03 : 46[a01c0] -> 47[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-51:30346:30852 [6] NCCL INFO Channel 04 : 46[a01c0] -> 47[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-51:30346:30852 [6] NCCL INFO Channel 05 : 46[a01c0] -> 47[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-59:27920:28519 [4] NCCL INFO Channel 02 : 92[901c0] -> 91[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-58:27723:28236 [0] NCCL INFO Channel 02 : 88[101c0] -> 85[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-51:30346:30852 [6] NCCL INFO Channel 06 : 46[a01c0] -> 47[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-52:39402:39919 [5] NCCL INFO Channel 00 : 53[901d0] -> 52[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-51:30346:30852 [6] NCCL INFO Channel 07 : 46[a01c0] -> 47[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-52:39402:39919 [5] NCCL INFO Channel 01 : 53[901d0] -> 52[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-59:27920:28519 [4] NCCL INFO Channel 03 : 92[901c0] -> 91[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-58:27723:28236 [0] NCCL INFO Channel 03 : 88[101c0] -> 85[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-52:39402:39919 [5] NCCL INFO Channel 04 : 53[901d0] -> 52[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-52:39402:39919 [5] NCCL INFO Channel 05 : 53[901d0] -> 52[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-59:27920:28519 [4] NCCL INFO Channel 04 : 92[901c0] -> 91[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-58:27723:28236 [0] NCCL INFO Channel 06 : 88[101c0] -> 85[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-59:27920:28519 [4] NCCL INFO Channel 05 : 92[901c0] -> 91[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-57:27795:28304 [2] NCCL INFO Connected all rings
gpu-st-p4d-24xlarge-59:27920:28519 [4] NCCL INFO Channel 06 : 92[901c0] -> 91[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-58:27723:28236 [0] NCCL INFO Channel 07 : 88[101c0] -> 85[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-59:27920:28519 [4] NCCL INFO Channel 07 : 92[901c0] -> 91[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-51:30347:30853 [7] NCCL INFO Channel 02 : 47[a01d0] -> 42[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-51:30347:30853 [7] NCCL INFO Channel 03 : 47[a01d0] -> 42[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-51:30347:30853 [7] NCCL INFO Channel 06 : 47[a01d0] -> 42[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-51:30347:30853 [7] NCCL INFO Channel 07 : 47[a01d0] -> 42[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-44:28069:28907 [4] NCCL INFO Channel 02 : 4[901c0] -> 1[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-44:28069:28907 [4] NCCL INFO Channel 03 : 4[901c0] -> 1[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-57:27795:28304 [2] NCCL INFO Channel 00 : 82[201c0] -> 83[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-44:28069:28907 [4] NCCL INFO Channel 06 : 4[901c0] -> 1[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-44:28069:28907 [4] NCCL INFO Channel 07 : 4[901c0] -> 1[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-57:27795:28304 [2] NCCL INFO Channel 01 : 82[201c0] -> 83[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-46:30085:30637 [0] NCCL INFO Channel 00 : 16[101c0] -> 15[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-46:30082:30636 [5] NCCL INFO Channel 02 : 13[901d0] -> 12[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-57:27795:28304 [2] NCCL INFO Channel 02 : 82[201c0] -> 83[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-46:30085:30637 [0] NCCL INFO Channel 01 : 16[101c0] -> 15[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-46:30082:30636 [5] NCCL INFO Channel 03 : 13[901d0] -> 12[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-57:27795:28304 [2] NCCL INFO Channel 03 : 82[201c0] -> 83[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-46:30085:30637 [0] NCCL INFO Channel 04 : 16[101c0] -> 15[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-46:30082:30636 [5] NCCL INFO Channel 06 : 13[901d0] -> 12[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-46:30085:30637 [0] NCCL INFO Channel 05 : 16[101c0] -> 15[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25779:26285 [1] NCCL INFO Channel 00 : 33[101d0] -> 32[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-46:30082:30636 [5] NCCL INFO Channel 07 : 13[901d0] -> 12[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-57:27795:28304 [2] NCCL INFO Channel 04 : 82[201c0] -> 83[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25779:26285 [1] NCCL INFO Channel 01 : 33[101d0] -> 32[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-57:27795:28304 [2] NCCL INFO Channel 05 : 82[201c0] -> 83[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25779:26285 [1] NCCL INFO Channel 04 : 33[101d0] -> 32[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-57:27795:28304 [2] NCCL INFO Channel 06 : 82[201c0] -> 83[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-58:27724:28235 [1] NCCL INFO Channel 00 : 89[101d0] -> 88[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25779:26285 [1] NCCL INFO Channel 05 : 33[101d0] -> 32[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-59:27919:28520 [3] NCCL INFO Channel 02 : 91[201d0] -> 90[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-59:27919:28520 [3] NCCL INFO Channel 03 : 91[201d0] -> 90[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-59:27919:28520 [3] NCCL INFO Channel 06 : 91[201d0] -> 90[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-58:27724:28235 [1] NCCL INFO Channel 01 : 89[101d0] -> 88[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-59:27921:28521 [5] NCCL INFO Channel 00 : 93[901d0] -> 92[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-57:27795:28304 [2] NCCL INFO Channel 07 : 82[201c0] -> 83[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-59:27919:28520 [3] NCCL INFO Channel 07 : 91[201d0] -> 90[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-53:26990:27493 [2] NCCL INFO Connected all rings
gpu-st-p4d-24xlarge-53:26988:27497 [0] NCCL INFO Channel 00 : 56[101c0] -> 55[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-59:27921:28521 [5] NCCL INFO Channel 01 : 93[901d0] -> 92[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-54:29518:30070 [0] NCCL INFO Channel 02 : 64[101c0] -> 61[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-53:26988:27497 [0] NCCL INFO Channel 01 : 56[101c0] -> 55[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-58:27724:28235 [1] NCCL INFO Channel 04 : 89[101d0] -> 88[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-59:27921:28521 [5] NCCL INFO Channel 04 : 93[901d0] -> 92[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-54:29518:30070 [0] NCCL INFO Channel 03 : 64[101c0] -> 61[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-53:26988:27497 [0] NCCL INFO Channel 02 : 56[101c0] -> 55[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-54:29518:30070 [0] NCCL INFO Channel 06 : 64[101c0] -> 61[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-59:27921:28521 [5] NCCL INFO Channel 05 : 93[901d0] -> 92[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-58:27724:28235 [1] NCCL INFO Channel 05 : 89[101d0] -> 88[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-53:26988:27497 [0] NCCL INFO Channel 03 : 56[101c0] -> 55[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-54:29518:30070 [0] NCCL INFO Channel 07 : 64[101c0] -> 61[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-53:26988:27497 [0] NCCL INFO Channel 04 : 56[101c0] -> 55[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-57:27796:28307 [3] NCCL INFO Channel 02 : 83[201d0] -> 78[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-58:27723:28236 [0] NCCL INFO Channel 00 : 88[101c0] -> 87[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-53:26988:27497 [0] NCCL INFO Channel 05 : 56[101c0] -> 55[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-57:27796:28307 [3] NCCL INFO Channel 03 : 83[201d0] -> 78[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-58:27723:28236 [0] NCCL INFO Channel 01 : 88[101c0] -> 87[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-57:27796:28307 [3] NCCL INFO Channel 06 : 83[201d0] -> 78[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-53:26988:27497 [0] NCCL INFO Channel 06 : 56[101c0] -> 55[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-58:27723:28236 [0] NCCL INFO Channel 04 : 88[101c0] -> 87[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29156:29664 [4] NCCL INFO Channel 00 : 68[901c0] -> 67[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-57:27796:28307 [3] NCCL INFO Channel 07 : 83[201d0] -> 78[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-53:26988:27497 [0] NCCL INFO Channel 07 : 56[101c0] -> 55[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-58:27723:28236 [0] NCCL INFO Channel 05 : 88[101c0] -> 87[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29156:29664 [4] NCCL INFO Channel 01 : 68[901c0] -> 67[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-53:26990:27493 [2] NCCL INFO Channel 00 : 58[201c0] -> 59[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-53:26990:27493 [2] NCCL INFO Channel 01 : 58[201c0] -> 59[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29156:29664 [4] NCCL INFO Channel 02 : 68[901c0] -> 67[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-53:26990:27493 [2] NCCL INFO Channel 02 : 58[201c0] -> 59[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29156:29664 [4] NCCL INFO Channel 03 : 68[901c0] -> 67[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29155:29662 [3] NCCL INFO Channel 02 : 67[201d0] -> 66[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29156:29664 [4] NCCL INFO Channel 04 : 68[901c0] -> 67[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29155:29662 [3] NCCL INFO Channel 03 : 67[201d0] -> 66[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29156:29664 [4] NCCL INFO Channel 05 : 68[901c0] -> 67[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-58:27720:28232 [5] NCCL INFO Channel 02 : 85[901d0] -> 84[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29155:29662 [3] NCCL INFO Channel 06 : 67[201d0] -> 66[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-52:39398:39917 [1] NCCL INFO Channel 02 : 49[101d0] -> 48[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-53:26990:27493 [2] NCCL INFO Channel 03 : 58[201c0] -> 59[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-44:28069:28907 [4] NCCL INFO Channel 00 : 4[901c0] -> 3[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-50:29216:29724 [1] NCCL INFO Channel 00 : 41[101d0] -> 40[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-44:28066:28906 [1] NCCL INFO Channel 02 : 1[101d0] -> 0[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-52:39398:39917 [1] NCCL INFO Channel 03 : 49[101d0] -> 48[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-44:28069:28907 [4] NCCL INFO Channel 01 : 4[901c0] -> 3[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-44:28066:28906 [1] NCCL INFO Channel 03 : 1[101d0] -> 0[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-46:30084:30634 [7] NCCL INFO Connected all rings
gpu-st-p4d-24xlarge-54:29519:30067 [1] NCCL INFO Channel 00 : 65[101d0] -> 64[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-44:28069:28907 [4] NCCL INFO Channel 04 : 4[901c0] -> 3[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-50:29216:29724 [1] NCCL INFO Channel 01 : 41[101d0] -> 40[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-52:39398:39917 [1] NCCL INFO Channel 06 : 49[101d0] -> 48[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-44:28066:28906 [1] NCCL INFO Channel 06 : 1[101d0] -> 0[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-58:27724:28235 [1] NCCL INFO Connected all rings
gpu-st-p4d-24xlarge-50:29216:29724 [1] NCCL INFO Channel 04 : 41[101d0] -> 40[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-53:26987:27494 [7] NCCL INFO Channel 02 : 55[a01d0] -> 54[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-44:28069:28907 [4] NCCL INFO Channel 05 : 4[901c0] -> 3[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-52:39398:39917 [1] NCCL INFO Channel 07 : 49[101d0] -> 48[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-54:29519:30067 [1] NCCL INFO Channel 01 : 65[101d0] -> 64[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-44:28066:28906 [1] NCCL INFO Channel 07 : 1[101d0] -> 0[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29156:29664 [4] NCCL INFO Channel 06 : 68[901c0] -> 67[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-50:29216:29724 [1] NCCL INFO Channel 05 : 41[101d0] -> 40[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-54:29518:30070 [0] NCCL INFO Channel 00 : 64[101c0] -> 63[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-58:27720:28232 [5] NCCL INFO Channel 03 : 85[901d0] -> 84[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-54:29515:30066 [5] NCCL INFO Channel 02 : 61[901d0] -> 60[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29155:29662 [3] NCCL INFO Channel 07 : 67[201d0] -> 66[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-53:26990:27493 [2] NCCL INFO Channel 04 : 58[201c0] -> 59[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-54:29519:30067 [1] NCCL INFO Channel 04 : 65[101d0] -> 64[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25780:26286 [2] NCCL INFO Connected all rings
gpu-st-p4d-24xlarge-54:29515:30066 [5] NCCL INFO Channel 03 : 61[901d0] -> 60[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-54:29518:30070 [0] NCCL INFO Channel 01 : 64[101c0] -> 63[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29156:29664 [4] NCCL INFO Channel 07 : 68[901c0] -> 67[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-53:26987:27494 [7] NCCL INFO Channel 03 : 55[a01d0] -> 54[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-54:29519:30067 [1] NCCL INFO Channel 05 : 65[101d0] -> 64[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-54:29515:30066 [5] NCCL INFO Channel 06 : 61[901d0] -> 60[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-54:29518:30070 [0] NCCL INFO Channel 04 : 64[101c0] -> 63[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-53:26990:27493 [2] NCCL INFO Channel 05 : 58[201c0] -> 59[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-53:26987:27494 [7] NCCL INFO Channel 06 : 55[a01d0] -> 54[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-54:29518:30070 [0] NCCL INFO Channel 05 : 64[101c0] -> 63[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-50:29215:29722 [0] NCCL INFO Channel 02 : 40[101c0] -> 37[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-58:27720:28232 [5] NCCL INFO Channel 06 : 85[901d0] -> 84[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-53:26990:27493 [2] NCCL INFO Channel 06 : 58[201c0] -> 59[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-50:29215:29722 [0] NCCL INFO Channel 03 : 40[101c0] -> 37[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-46:30083:30635 [6] NCCL INFO Connected all rings
gpu-st-p4d-24xlarge-50:29215:29722 [0] NCCL INFO Channel 06 : 40[101c0] -> 37[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-53:26987:27494 [7] NCCL INFO Channel 07 : 55[a01d0] -> 54[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-58:27720:28232 [5] NCCL INFO Channel 07 : 85[901d0] -> 84[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-47:30189:30700 [3] NCCL INFO Channel 02 : 19[201d0] -> 18[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-54:29515:30066 [5] NCCL INFO Channel 07 : 61[901d0] -> 60[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-50:29215:29722 [0] NCCL INFO Channel 07 : 40[101c0] -> 37[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-53:26990:27493 [2] NCCL INFO Channel 07 : 58[201c0] -> 59[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-47:30189:30700 [3] NCCL INFO Channel 03 : 19[201d0] -> 18[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-52:39401:39921 [4] NCCL INFO Channel 00 : 52[901c0] -> 51[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-58:27723:28236 [0] NCCL INFO Connected all rings
gpu-st-p4d-24xlarge-47:30189:30700 [3] NCCL INFO Channel 06 : 19[201d0] -> 18[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29154:29667 [2] NCCL INFO Connected all rings
gpu-st-p4d-24xlarge-55:29154:29667 [2] NCCL INFO Channel 00 : 66[201c0] -> 67[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-59:27918:28517 [2] NCCL INFO Connected all rings
gpu-st-p4d-24xlarge-59:27918:28517 [2] NCCL INFO Channel 00 : 90[201c0] -> 91[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-52:39401:39921 [4] NCCL INFO Channel 01 : 52[901c0] -> 51[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-47:30189:30700 [3] NCCL INFO Channel 07 : 19[201d0] -> 18[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29154:29667 [2] NCCL INFO Channel 01 : 66[201c0] -> 67[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-52:39401:39921 [4] NCCL INFO Channel 04 : 52[901c0] -> 51[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-52:39399:39918 [2] NCCL INFO Connected all rings
gpu-st-p4d-24xlarge-52:39401:39921 [4] NCCL INFO Channel 05 : 52[901c0] -> 51[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-46:30084:30634 [7] NCCL INFO Channel 00 : 15[a01d0] -> 16[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29154:29667 [2] NCCL INFO Channel 02 : 66[201c0] -> 67[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-46:30084:30634 [7] NCCL INFO Channel 01 : 15[a01d0] -> 16[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25780:26286 [2] NCCL INFO Channel 00 : 34[201c0] -> 35[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-54:29519:30067 [1] NCCL INFO Connected all rings
gpu-st-p4d-24xlarge-55:29154:29667 [2] NCCL INFO Channel 03 : 66[201c0] -> 67[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-46:30084:30634 [7] NCCL INFO Channel 04 : 15[a01d0] -> 16[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25780:26286 [2] NCCL INFO Channel 01 : 34[201c0] -> 35[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-46:30084:30634 [7] NCCL INFO Channel 05 : 15[a01d0] -> 16[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25777:26283 [7] NCCL INFO Channel 02 : 31[a01d0] -> 30[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25780:26286 [2] NCCL INFO Channel 02 : 34[201c0] -> 35[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29154:29667 [2] NCCL INFO Channel 04 : 66[201c0] -> 67[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-46:30083:30635 [6] NCCL INFO Channel 00 : 14[a01c0] -> 15[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25777:26283 [7] NCCL INFO Channel 03 : 31[a01d0] -> 30[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25780:26286 [2] NCCL INFO Channel 03 : 34[201c0] -> 35[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-47:30188:30701 [2] NCCL INFO Connected all rings
gpu-st-p4d-24xlarge-47:30188:30701 [2] NCCL INFO Channel 00 : 18[201c0] -> 19[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-46:30083:30635 [6] NCCL INFO Channel 01 : 14[a01c0] -> 15[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25777:26283 [7] NCCL INFO Channel 06 : 31[a01d0] -> 30[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25780:26286 [2] NCCL INFO Channel 04 : 34[201c0] -> 35[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25777:26283 [7] NCCL INFO Channel 07 : 31[a01d0] -> 30[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-50:29215:29722 [0] NCCL INFO Channel 00 : 40[101c0] -> 39[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25780:26286 [2] NCCL INFO Channel 05 : 34[201c0] -> 35[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29155:29662 [3] NCCL INFO Connected all rings
gpu-st-p4d-24xlarge-46:30083:30635 [6] NCCL INFO Channel 02 : 14[a01c0] -> 15[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-47:30188:30701 [2] NCCL INFO Channel 01 : 18[201c0] -> 19[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25780:26286 [2] NCCL INFO Channel 06 : 34[201c0] -> 35[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-50:29215:29722 [0] NCCL INFO Channel 01 : 40[101c0] -> 39[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-58:27722:28233 [7] NCCL INFO Connected all rings
gpu-st-p4d-24xlarge-56:28175:28972 [4] NCCL INFO Channel 02 : 76[901c0] -> 73[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-50:29215:29722 [0] NCCL INFO Channel 04 : 40[101c0] -> 39[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25780:26286 [2] NCCL INFO Channel 07 : 34[201c0] -> 35[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29154:29667 [2] NCCL INFO Channel 05 : 66[201c0] -> 67[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-56:28176:28971 [5] NCCL INFO Channel 00 : 77[901d0] -> 76[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-50:29215:29722 [0] NCCL INFO Channel 05 : 40[101c0] -> 39[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-53:26989:27495 [1] NCCL INFO Connected all rings
gpu-st-p4d-24xlarge-56:28175:28972 [4] NCCL INFO Channel 03 : 76[901c0] -> 73[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-56:28176:28971 [5] NCCL INFO Channel 01 : 77[901d0] -> 76[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29154:29667 [2] NCCL INFO Channel 06 : 66[201c0] -> 67[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-56:28175:28972 [4] NCCL INFO Channel 06 : 76[901c0] -> 73[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-59:27918:28517 [2] NCCL INFO Channel 01 : 90[201c0] -> 91[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-56:28176:28971 [5] NCCL INFO Channel 04 : 77[901d0] -> 76[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-56:28175:28972 [4] NCCL INFO Channel 07 : 76[901c0] -> 73[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-54:29518:30070 [0] NCCL INFO Connected all rings
gpu-st-p4d-24xlarge-56:28176:28971 [5] NCCL INFO Channel 05 : 77[901d0] -> 76[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29154:29667 [2] NCCL INFO Channel 07 : 66[201c0] -> 67[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-58:27723:28236 [0] NCCL INFO Channel 00 : 88[101c0] -> 89[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-45:30137:30647 [0] NCCL INFO Channel 00 : 8[101c0] -> 7[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-46:30083:30635 [6] NCCL INFO Channel 03 : 14[a01c0] -> 15[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-45:30137:30647 [0] NCCL INFO Channel 01 : 8[101c0] -> 7[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-52:39402:39919 [5] NCCL INFO Connected all rings
gpu-st-p4d-24xlarge-45:30137:30647 [0] NCCL INFO Channel 02 : 8[101c0] -> 7[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-46:30086:30632 [1] NCCL INFO Channel 00 : 17[101d0] -> 16[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-58:27723:28236 [0] NCCL INFO Channel 01 : 88[101c0] -> 89[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-45:30137:30647 [0] NCCL INFO Channel 03 : 8[101c0] -> 7[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-46:30083:30635 [6] NCCL INFO Channel 04 : 14[a01c0] -> 15[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-45:30137:30647 [0] NCCL INFO Channel 04 : 8[101c0] -> 7[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-50:29212:29721 [5] NCCL INFO Channel 02 : 37[901d0] -> 36[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-53:26986:27498 [6] NCCL INFO Connected all rings
gpu-st-p4d-24xlarge-53:26986:27498 [6] NCCL INFO Channel 00 : 54[a01c0] -> 55[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29157:29663 [5] NCCL INFO Channel 00 : 69[901d0] -> 68[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-46:30086:30632 [1] NCCL INFO Channel 01 : 17[101d0] -> 16[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-59:27918:28517 [2] NCCL INFO Channel 02 : 90[201c0] -> 91[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-45:30137:30647 [0] NCCL INFO Channel 05 : 8[101c0] -> 7[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-59:27919:28520 [3] NCCL INFO Connected all rings
gpu-st-p4d-24xlarge-46:30083:30635 [6] NCCL INFO Channel 05 : 14[a01c0] -> 15[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-45:30137:30647 [0] NCCL INFO Channel 06 : 8[101c0] -> 7[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-46:30086:30632 [1] NCCL INFO Channel 04 : 17[101d0] -> 16[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25776:26284 [6] NCCL INFO Connected all rings
gpu-st-p4d-24xlarge-49:25776:26284 [6] NCCL INFO Channel 00 : 30[a01c0] -> 31[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29157:29663 [5] NCCL INFO Channel 01 : 69[901d0] -> 68[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-45:30137:30647 [0] NCCL INFO Channel 07 : 8[101c0] -> 7[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-46:30083:30635 [6] NCCL INFO Channel 06 : 14[a01c0] -> 15[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-58:27723:28236 [0] NCCL INFO Channel 02 : 88[101c0] -> 89[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-46:30086:30632 [1] NCCL INFO Channel 05 : 17[101d0] -> 16[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-47:30188:30701 [2] NCCL INFO Channel 02 : 18[201c0] -> 19[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29155:29662 [3] NCCL INFO Channel 00 : 67[201d0] -> 68[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-46:30083:30635 [6] NCCL INFO Channel 07 : 14[a01c0] -> 15[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-50:29216:29724 [1] NCCL INFO Connected all rings
gpu-st-p4d-24xlarge-55:29157:29663 [5] NCCL INFO Channel 04 : 69[901d0] -> 68[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-47:30188:30701 [2] NCCL INFO Channel 03 : 18[201c0] -> 19[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29155:29662 [3] NCCL INFO Channel 01 : 67[201d0] -> 68[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-54:29517:30068 [7] NCCL INFO Connected all rings
gpu-st-p4d-24xlarge-58:27723:28236 [0] NCCL INFO Channel 03 : 88[101c0] -> 89[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-47:30188:30701 [2] NCCL INFO Channel 04 : 18[201c0] -> 19[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-50:29212:29721 [5] NCCL INFO Channel 03 : 37[901d0] -> 36[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29157:29663 [5] NCCL INFO Channel 05 : 69[901d0] -> 68[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-53:26986:27498 [6] NCCL INFO Channel 01 : 54[a01c0] -> 55[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-51:30343:30849 [3] NCCL INFO Channel 02 : 43[201d0] -> 42[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-47:30188:30701 [2] NCCL INFO Channel 05 : 18[201c0] -> 19[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29155:29662 [3] NCCL INFO Channel 04 : 67[201d0] -> 68[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-59:27918:28517 [2] NCCL INFO Channel 03 : 90[201c0] -> 91[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-47:30188:30701 [2] NCCL INFO Channel 06 : 18[201c0] -> 19[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-51:30343:30849 [3] NCCL INFO Channel 03 : 43[201d0] -> 42[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-51:30343:30849 [3] NCCL INFO Channel 06 : 43[201d0] -> 42[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-47:30188:30701 [2] NCCL INFO Channel 07 : 18[201c0] -> 19[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-48:30196:30702 [5] NCCL INFO Channel 00 : 29[901d0] -> 28[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-56:28172:28968 [1] NCCL INFO Channel 02 : 73[101d0] -> 72[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29155:29662 [3] NCCL INFO Channel 05 : 67[201d0] -> 68[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-58:27723:28236 [0] NCCL INFO Channel 04 : 88[101c0] -> 89[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-51:30343:30849 [3] NCCL INFO Channel 07 : 43[201d0] -> 42[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-54:29518:30070 [0] NCCL INFO Channel 00 : 64[101c0] -> 65[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-56:28175:28972 [4] NCCL INFO Channel 00 : 76[901c0] -> 75[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-50:29212:29721 [5] NCCL INFO Channel 06 : 37[901d0] -> 36[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-48:30196:30702 [5] NCCL INFO Channel 01 : 29[901d0] -> 28[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-56:28172:28968 [1] NCCL INFO Channel 03 : 73[101d0] -> 72[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-54:29516:30071 [6] NCCL INFO Connected all rings
gpu-st-p4d-24xlarge-48:30196:30702 [5] NCCL INFO Channel 04 : 29[901d0] -> 28[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-58:27721:28234 [6] NCCL INFO Connected all rings
gpu-st-p4d-24xlarge-56:28175:28972 [4] NCCL INFO Channel 01 : 76[901c0] -> 75[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-47:30190:30703 [4] NCCL INFO Channel 00 : 20[901c0] -> 19[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-56:28172:28968 [1] NCCL INFO Channel 06 : 73[101d0] -> 72[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-44:28070:28908 [5] NCCL INFO Connected all rings
gpu-st-p4d-24xlarge-48:30196:30702 [5] NCCL INFO Channel 05 : 29[901d0] -> 28[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-56:28175:28972 [4] NCCL INFO Channel 04 : 76[901c0] -> 75[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-54:29518:30070 [0] NCCL INFO Channel 01 : 64[101c0] -> 65[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-56:28172:28968 [1] NCCL INFO Channel 07 : 73[101d0] -> 72[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25776:26284 [6] NCCL INFO Channel 01 : 30[a01c0] -> 31[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-58:27723:28236 [0] NCCL INFO Channel 05 : 88[101c0] -> 89[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-59:27918:28517 [2] NCCL INFO Channel 04 : 90[201c0] -> 91[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-56:28175:28972 [4] NCCL INFO Channel 05 : 76[901c0] -> 75[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-47:30190:30703 [4] NCCL INFO Channel 01 : 20[901c0] -> 19[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-54:29518:30070 [0] NCCL INFO Channel 02 : 64[101c0] -> 65[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-44:28065:28903 [0] NCCL INFO Connected all rings
gpu-st-p4d-24xlarge-44:28065:28903 [0] NCCL INFO Channel 00 : 0[101c0] -> 1[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-45:30138:30645 [1] NCCL INFO Connected all rings
gpu-st-p4d-24xlarge-58:27723:28236 [0] NCCL INFO Channel 06 : 88[101c0] -> 89[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-50:29212:29721 [5] NCCL INFO Channel 07 : 37[901d0] -> 36[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-51:30344:30851 [4] NCCL INFO Channel 00 : 44[901c0] -> 43[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-47:30190:30703 [4] NCCL INFO Channel 02 : 20[901c0] -> 19[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-57:27793:28309 [0] NCCL INFO Channel 00 : 80[101c0] -> 79[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-59:27920:28519 [4] NCCL INFO Connected all rings
gpu-st-p4d-24xlarge-54:29518:30070 [0] NCCL INFO Channel 03 : 64[101c0] -> 65[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25776:26284 [6] NCCL INFO Channel 02 : 30[a01c0] -> 31[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-48:30195:30703 [4] NCCL INFO Channel 02 : 28[901c0] -> 25[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-51:30342:30850 [2] NCCL INFO Connected all rings
gpu-st-p4d-24xlarge-59:27921:28521 [5] NCCL INFO Connected all rings
gpu-st-p4d-24xlarge-51:30342:30850 [2] NCCL INFO Channel 00 : 42[201c0] -> 43[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-58:27722:28233 [7] NCCL INFO Channel 00 : 87[a01d0] -> 88[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-50:29215:29722 [0] NCCL INFO Connected all rings
gpu-st-p4d-24xlarge-52:39401:39921 [4] NCCL INFO Connected all rings
gpu-st-p4d-24xlarge-51:30344:30851 [4] NCCL INFO Channel 01 : 44[901c0] -> 43[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-57:27793:28309 [0] NCCL INFO Channel 01 : 80[101c0] -> 79[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-47:30190:30703 [4] NCCL INFO Channel 03 : 20[901c0] -> 19[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-58:27723:28236 [0] NCCL INFO Channel 07 : 88[101c0] -> 89[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-53:26991:27496 [3] NCCL INFO Channel 02 : 59[201d0] -> 54[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-51:30342:30850 [2] NCCL INFO Channel 01 : 42[201c0] -> 43[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-53:26986:27498 [6] NCCL INFO Channel 02 : 54[a01c0] -> 55[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-54:29518:30070 [0] NCCL INFO Channel 04 : 64[101c0] -> 65[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-51:30344:30851 [4] NCCL INFO Channel 02 : 44[901c0] -> 43[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-46:30086:30632 [1] NCCL INFO Connected all rings
gpu-st-p4d-24xlarge-57:27793:28309 [0] NCCL INFO Channel 02 : 80[101c0] -> 79[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-59:27918:28517 [2] NCCL INFO Channel 05 : 90[201c0] -> 91[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-58:27722:28233 [7] NCCL INFO Channel 01 : 87[a01d0] -> 88[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-47:30190:30703 [4] NCCL INFO Channel 04 : 20[901c0] -> 19[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-51:30342:30850 [2] NCCL INFO Channel 02 : 42[201c0] -> 43[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-54:29518:30070 [0] NCCL INFO Channel 05 : 64[101c0] -> 65[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-51:30344:30851 [4] NCCL INFO Channel 03 : 44[901c0] -> 43[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25776:26284 [6] NCCL INFO Channel 03 : 30[a01c0] -> 31[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-57:27793:28309 [0] NCCL INFO Channel 03 : 80[101c0] -> 79[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-51:30342:30850 [2] NCCL INFO Channel 03 : 42[201c0] -> 43[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-52:39399:39918 [2] NCCL INFO Channel 00 : 50[201c0] -> 51[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-51:30344:30851 [4] NCCL INFO Channel 04 : 44[901c0] -> 43[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-54:29518:30070 [0] NCCL INFO Channel 06 : 64[101c0] -> 65[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-47:30190:30703 [4] NCCL INFO Channel 05 : 20[901c0] -> 19[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-57:27793:28309 [0] NCCL INFO Channel 04 : 80[101c0] -> 79[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-51:30342:30850 [2] NCCL INFO Channel 04 : 42[201c0] -> 43[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-51:30344:30851 [4] NCCL INFO Channel 05 : 44[901c0] -> 43[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-54:29518:30070 [0] NCCL INFO Channel 07 : 64[101c0] -> 65[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-57:27793:28309 [0] NCCL INFO Channel 05 : 80[101c0] -> 79[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-45:30138:30645 [1] NCCL INFO Channel 00 : 9[101d0] -> 10[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-51:30342:30850 [2] NCCL INFO Channel 05 : 42[201c0] -> 43[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-47:30190:30703 [4] NCCL INFO Channel 06 : 20[901c0] -> 19[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-53:26986:27498 [6] NCCL INFO Channel 03 : 54[a01c0] -> 55[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-53:26991:27496 [3] NCCL INFO Channel 03 : 59[201d0] -> 54[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-44:28065:28903 [0] NCCL INFO Channel 01 : 0[101c0] -> 1[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25776:26284 [6] NCCL INFO Channel 04 : 30[a01c0] -> 31[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-51:30344:30851 [4] NCCL INFO Channel 06 : 44[901c0] -> 43[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-45:30138:30645 [1] NCCL INFO Channel 01 : 9[101d0] -> 10[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-57:27793:28309 [0] NCCL INFO Channel 06 : 80[101c0] -> 79[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-51:30342:30850 [2] NCCL INFO Channel 06 : 42[201c0] -> 43[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-59:27918:28517 [2] NCCL INFO Channel 06 : 90[201c0] -> 91[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25781:26281 [3] NCCL INFO Channel 02 : 35[201d0] -> 30[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-52:39399:39918 [2] NCCL INFO Channel 01 : 50[201c0] -> 51[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-45:30138:30645 [1] NCCL INFO Channel 02 : 9[101d0] -> 10[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-51:30344:30851 [4] NCCL INFO Channel 07 : 44[901c0] -> 43[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-54:29516:30071 [6] NCCL INFO Channel 00 : 62[a01c0] -> 63[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-46:30085:30637 [0] NCCL INFO Connected all rings
gpu-st-p4d-24xlarge-49:25778:26280 [0] NCCL INFO Channel 00 : 32[101c0] -> 31[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-47:30190:30703 [4] NCCL INFO Channel 07 : 20[901c0] -> 19[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-45:30138:30645 [1] NCCL INFO Channel 03 : 9[101d0] -> 10[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-52:39400:39922 [3] NCCL INFO Connected all rings
gpu-st-p4d-24xlarge-57:27793:28309 [0] NCCL INFO Channel 07 : 80[101c0] -> 79[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-51:30342:30850 [2] NCCL INFO Channel 07 : 42[201c0] -> 43[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-45:30138:30645 [1] NCCL INFO Channel 04 : 9[101d0] -> 10[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-45:30138:30645 [1] NCCL INFO Channel 05 : 9[101d0] -> 10[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-52:39399:39918 [2] NCCL INFO Channel 02 : 50[201c0] -> 51[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25778:26280 [0] NCCL INFO Channel 01 : 32[101c0] -> 31[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-54:29517:30068 [7] NCCL INFO Channel 00 : 63[a01d0] -> 64[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-45:30138:30645 [1] NCCL INFO Channel 06 : 9[101d0] -> 10[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-45:30138:30645 [1] NCCL INFO Channel 07 : 9[101d0] -> 10[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-58:27722:28233 [7] NCCL INFO Channel 04 : 87[a01d0] -> 88[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-58:27722:28233 [7] NCCL INFO Channel 05 : 87[a01d0] -> 88[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-53:26986:27498 [6] NCCL INFO Channel 04 : 54[a01c0] -> 55[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25778:26280 [0] NCCL INFO Channel 02 : 32[101c0] -> 31[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-53:26991:27496 [3] NCCL INFO Channel 06 : 59[201d0] -> 54[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-44:28069:28907 [4] NCCL INFO Connected all rings
gpu-st-p4d-24xlarge-59:27918:28517 [2] NCCL INFO Channel 07 : 90[201c0] -> 91[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-59:27922:28516 [6] NCCL INFO Connected all rings
gpu-st-p4d-24xlarge-52:39401:39921 [4] NCCL INFO Channel 00 : 52[901c0] -> 53[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-46:30085:30637 [0] NCCL INFO Channel 00 : 16[101c0] -> 17[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-48:30195:30703 [4] NCCL INFO Channel 03 : 28[901c0] -> 25[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25778:26280 [0] NCCL INFO Channel 03 : 32[101c0] -> 31[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-54:29517:30068 [7] NCCL INFO Channel 01 : 63[a01d0] -> 64[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-52:39399:39918 [2] NCCL INFO Channel 03 : 50[201c0] -> 51[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25781:26281 [3] NCCL INFO Channel 03 : 35[201d0] -> 30[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-54:29516:30071 [6] NCCL INFO Channel 01 : 62[a01c0] -> 63[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-46:30085:30637 [0] NCCL INFO Channel 01 : 16[101c0] -> 17[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-48:30195:30703 [4] NCCL INFO Channel 06 : 28[901c0] -> 25[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25776:26284 [6] NCCL INFO Channel 05 : 30[a01c0] -> 31[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-44:28065:28903 [0] NCCL INFO Channel 02 : 0[101c0] -> 1[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-46:30085:30637 [0] NCCL INFO Channel 02 : 16[101c0] -> 17[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-48:30195:30703 [4] NCCL INFO Channel 07 : 28[901c0] -> 25[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-52:39401:39921 [4] NCCL INFO Channel 01 : 52[901c0] -> 53[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25778:26280 [0] NCCL INFO Channel 04 : 32[101c0] -> 31[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-46:30085:30637 [0] NCCL INFO Channel 03 : 16[101c0] -> 17[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-53:26988:27497 [0] NCCL INFO Connected all rings
gpu-st-p4d-24xlarge-52:39399:39918 [2] NCCL INFO Channel 04 : 50[201c0] -> 51[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-54:29517:30068 [7] NCCL INFO Channel 04 : 63[a01d0] -> 64[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25781:26281 [3] NCCL INFO Channel 06 : 35[201d0] -> 30[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-46:30085:30637 [0] NCCL INFO Channel 04 : 16[101c0] -> 17[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-54:29516:30071 [6] NCCL INFO Channel 02 : 62[a01c0] -> 63[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25776:26284 [6] NCCL INFO Channel 06 : 30[a01c0] -> 31[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-46:30085:30637 [0] NCCL INFO Channel 05 : 16[101c0] -> 17[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-52:39401:39921 [4] NCCL INFO Channel 02 : 52[901c0] -> 53[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-59:27920:28519 [4] NCCL INFO Channel 00 : 92[901c0] -> 93[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25778:26280 [0] NCCL INFO Channel 05 : 32[101c0] -> 31[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-46:30085:30637 [0] NCCL INFO Channel 06 : 16[101c0] -> 17[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-52:39399:39918 [2] NCCL INFO Channel 05 : 50[201c0] -> 51[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-53:26991:27496 [3] NCCL INFO Channel 07 : 59[201d0] -> 54[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-47:30191:30699 [5] NCCL INFO Channel 00 : 21[901d0] -> 20[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-53:26986:27498 [6] NCCL INFO Channel 05 : 54[a01c0] -> 55[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-46:30085:30637 [0] NCCL INFO Channel 07 : 16[101c0] -> 17[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25781:26281 [3] NCCL INFO Channel 07 : 35[201d0] -> 30[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-54:29517:30068 [7] NCCL INFO Channel 05 : 63[a01d0] -> 64[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-53:26987:27494 [7] NCCL INFO Connected all rings
gpu-st-p4d-24xlarge-54:29516:30071 [6] NCCL INFO Channel 03 : 62[a01c0] -> 63[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-44:28066:28906 [1] NCCL INFO Connected all rings
gpu-st-p4d-24xlarge-49:25776:26284 [6] NCCL INFO Channel 07 : 30[a01c0] -> 31[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-52:39401:39921 [4] NCCL INFO Channel 03 : 52[901c0] -> 53[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-52:39399:39918 [2] NCCL INFO Channel 06 : 50[201c0] -> 51[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25778:26280 [0] NCCL INFO Channel 06 : 32[101c0] -> 31[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-44:28065:28903 [0] NCCL INFO Channel 03 : 0[101c0] -> 1[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-59:27920:28519 [4] NCCL INFO Channel 01 : 92[901c0] -> 93[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-47:30191:30699 [5] NCCL INFO Channel 01 : 21[901d0] -> 20[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-53:26986:27498 [6] NCCL INFO Channel 06 : 54[a01c0] -> 55[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-52:39401:39921 [4] NCCL INFO Channel 04 : 52[901c0] -> 53[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25778:26280 [0] NCCL INFO Channel 07 : 32[101c0] -> 31[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-59:27919:28520 [3] NCCL INFO Channel 00 : 91[201d0] -> 92[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-52:39399:39918 [2] NCCL INFO Channel 07 : 50[201c0] -> 51[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-54:29516:30071 [6] NCCL INFO Channel 04 : 62[a01c0] -> 63[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-58:27721:28234 [6] NCCL INFO Channel 00 : 86[a01c0] -> 87[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-53:26989:27495 [1] NCCL INFO Channel 00 : 57[101d0] -> 58[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-59:27920:28519 [4] NCCL INFO Channel 02 : 92[901c0] -> 93[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-50:29214:29723 [7] NCCL INFO Connected all rings
gpu-st-p4d-24xlarge-52:39401:39921 [4] NCCL INFO Channel 05 : 52[901c0] -> 53[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-47:30191:30699 [5] NCCL INFO Channel 04 : 21[901d0] -> 20[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-53:26986:27498 [6] NCCL INFO Channel 07 : 54[a01c0] -> 55[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-56:28176:28971 [5] NCCL INFO Connected all rings
gpu-st-p4d-24xlarge-50:29215:29722 [0] NCCL INFO Channel 00 : 40[101c0] -> 41[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-59:27919:28520 [3] NCCL INFO Channel 01 : 91[201d0] -> 92[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-53:26989:27495 [1] NCCL INFO Channel 01 : 57[101d0] -> 58[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-54:29516:30071 [6] NCCL INFO Channel 05 : 62[a01c0] -> 63[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-44:28065:28903 [0] NCCL INFO Channel 04 : 0[101c0] -> 1[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-52:39401:39921 [4] NCCL INFO Channel 06 : 52[901c0] -> 53[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-47:30191:30699 [5] NCCL INFO Channel 05 : 21[901d0] -> 20[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-45:30139:30646 [2] NCCL INFO Connected all trees
gpu-st-p4d-24xlarge-44:28068:28905 [3] NCCL INFO Connected all rings
gpu-st-p4d-24xlarge-59:27920:28519 [4] NCCL INFO Channel 03 : 92[901c0] -> 93[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-45:30139:30646 [2] NCCL INFO threadThresholds 8/8/64 | 768/8/64 | 8/8/512
gpu-st-p4d-24xlarge-45:30139:30646 [2] NCCL INFO 8 coll channels, 8 p2p channels, 2 p2p channels per peer
gpu-st-p4d-24xlarge-53:26988:27497 [0] NCCL INFO Channel 00 : 56[101c0] -> 57[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-52:39400:39922 [3] NCCL INFO Channel 00 : 51[201d0] -> 52[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-58:27721:28234 [6] NCCL INFO Channel 01 : 86[a01c0] -> 87[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-53:26989:27495 [1] NCCL INFO Channel 02 : 57[101d0] -> 58[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-52:39401:39921 [4] NCCL INFO Channel 07 : 52[901c0] -> 53[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-44:28067:28911 [2] NCCL INFO Connected all rings
gpu-st-p4d-24xlarge-54:29516:30071 [6] NCCL INFO Channel 06 : 62[a01c0] -> 63[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-59:27919:28520 [3] NCCL INFO Channel 04 : 91[201d0] -> 92[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-48:30195:30703 [4] NCCL INFO Channel 00 : 28[901c0] -> 27[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-53:26988:27497 [0] NCCL INFO Channel 01 : 56[101c0] -> 57[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-52:39400:39922 [3] NCCL INFO Channel 01 : 51[201d0] -> 52[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-53:26989:27495 [1] NCCL INFO Channel 03 : 57[101d0] -> 58[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-48:30192:30700 [1] NCCL INFO Channel 02 : 25[101d0] -> 24[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-48:30195:30703 [4] NCCL INFO Channel 01 : 28[901c0] -> 27[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-59:27920:28519 [4] NCCL INFO Channel 04 : 92[901c0] -> 93[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-59:27921:28521 [5] NCCL INFO Channel 00 : 93[901d0] -> 94[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-44:28065:28903 [0] NCCL INFO Channel 05 : 0[101c0] -> 1[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-48:30192:30700 [1] NCCL INFO Channel 03 : 25[101d0] -> 24[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-53:26988:27497 [0] NCCL INFO Channel 02 : 56[101c0] -> 57[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-53:26989:27495 [1] NCCL INFO Channel 04 : 57[101d0] -> 58[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-48:30195:30703 [4] NCCL INFO Channel 04 : 28[901c0] -> 27[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-52:39400:39922 [3] NCCL INFO Channel 04 : 51[201d0] -> 52[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-58:27721:28234 [6] NCCL INFO Channel 02 : 86[a01c0] -> 87[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-59:27919:28520 [3] NCCL INFO Channel 05 : 91[201d0] -> 92[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-48:30192:30700 [1] NCCL INFO Channel 06 : 25[101d0] -> 24[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-48:30195:30703 [4] NCCL INFO Channel 05 : 28[901c0] -> 27[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-53:26988:27497 [0] NCCL INFO Channel 03 : 56[101c0] -> 57[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-48:30192:30700 [1] NCCL INFO Channel 07 : 25[101d0] -> 24[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-53:26989:27495 [1] NCCL INFO Channel 05 : 57[101d0] -> 58[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-44:28069:28907 [4] NCCL INFO Channel 00 : 4[901c0] -> 5[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-59:27920:28519 [4] NCCL INFO Channel 05 : 92[901c0] -> 93[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-59:27921:28521 [5] NCCL INFO Channel 01 : 93[901d0] -> 94[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-54:29516:30071 [6] NCCL INFO Channel 07 : 62[a01c0] -> 63[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-44:28065:28903 [0] NCCL INFO Channel 06 : 0[101c0] -> 1[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-53:26988:27497 [0] NCCL INFO Channel 04 : 56[101c0] -> 57[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-57:27794:28306 [1] NCCL INFO Connected all rings
gpu-st-p4d-24xlarge-53:26989:27495 [1] NCCL INFO Channel 06 : 57[101d0] -> 58[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-53:26987:27494 [7] NCCL INFO Channel 00 : 55[a01d0] -> 56[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-58:27721:28234 [6] NCCL INFO Channel 03 : 86[a01c0] -> 87[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-53:26988:27497 [0] NCCL INFO Channel 05 : 56[101c0] -> 57[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-53:26989:27495 [1] NCCL INFO Channel 07 : 57[101d0] -> 58[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-59:27920:28519 [4] NCCL INFO Channel 06 : 92[901c0] -> 93[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-59:27921:28521 [5] NCCL INFO Channel 02 : 93[901d0] -> 94[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-52:39400:39922 [3] NCCL INFO Channel 05 : 51[201d0] -> 52[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-44:28069:28907 [4] NCCL INFO Channel 01 : 4[901c0] -> 5[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-53:26987:27494 [7] NCCL INFO Channel 01 : 55[a01d0] -> 56[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-44:28065:28903 [0] NCCL INFO Channel 07 : 0[101c0] -> 1[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-53:26988:27497 [0] NCCL INFO Channel 06 : 56[101c0] -> 57[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-59:27922:28516 [6] NCCL INFO Channel 00 : 94[a01c0] -> 95[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-53:26987:27494 [7] NCCL INFO Channel 04 : 55[a01d0] -> 56[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-58:27721:28234 [6] NCCL INFO Channel 04 : 86[a01c0] -> 87[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-50:29215:29722 [0] NCCL INFO Channel 01 : 40[101c0] -> 41[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-53:26988:27497 [0] NCCL INFO Channel 07 : 56[101c0] -> 57[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-44:28069:28907 [4] NCCL INFO Channel 02 : 4[901c0] -> 5[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-59:27920:28519 [4] NCCL INFO Channel 07 : 92[901c0] -> 93[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-59:27921:28521 [5] NCCL INFO Channel 03 : 93[901d0] -> 94[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-57:27792:28305 [7] NCCL INFO Channel 02 : 79[a01d0] -> 78[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-59:27922:28516 [6] NCCL INFO Channel 01 : 94[a01c0] -> 95[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-44:28069:28907 [4] NCCL INFO Channel 03 : 4[901c0] -> 5[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-57:27792:28305 [7] NCCL INFO Channel 03 : 79[a01d0] -> 78[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-50:29215:29722 [0] NCCL INFO Channel 02 : 40[101c0] -> 41[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-47:30189:30700 [3] NCCL INFO Connected all rings
gpu-st-p4d-24xlarge-59:27921:28521 [5] NCCL INFO Channel 04 : 93[901d0] -> 94[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-52:39397:39920 [0] NCCL INFO Connected all rings
gpu-st-p4d-24xlarge-52:39397:39920 [0] NCCL INFO Channel 00 : 48[101c0] -> 49[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-44:28069:28907 [4] NCCL INFO Channel 04 : 4[901c0] -> 5[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-57:27792:28305 [7] NCCL INFO Channel 06 : 79[a01d0] -> 78[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-59:27922:28516 [6] NCCL INFO Channel 02 : 94[a01c0] -> 95[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-56:28175:28972 [4] NCCL INFO Connected all rings
gpu-st-p4d-24xlarge-44:28066:28906 [1] NCCL INFO Channel 00 : 1[101d0] -> 2[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-44:28069:28907 [4] NCCL INFO Channel 05 : 4[901c0] -> 5[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-57:27792:28305 [7] NCCL INFO Channel 07 : 79[a01d0] -> 78[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-50:29215:29722 [0] NCCL INFO Channel 03 : 40[101c0] -> 41[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-53:26987:27494 [7] NCCL INFO Channel 05 : 55[a01d0] -> 56[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-59:27921:28521 [5] NCCL INFO Channel 05 : 93[901d0] -> 94[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-44:28066:28906 [1] NCCL INFO Channel 01 : 1[101d0] -> 2[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-44:28069:28907 [4] NCCL INFO Channel 06 : 4[901c0] -> 5[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-58:27721:28234 [6] NCCL INFO Channel 05 : 86[a01c0] -> 87[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-57:27794:28306 [1] NCCL INFO Channel 00 : 81[101d0] -> 82[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-59:27922:28516 [6] NCCL INFO Channel 03 : 94[a01c0] -> 95[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-44:28068:28905 [3] NCCL INFO Channel 00 : 3[201d0] -> 4[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-44:28067:28911 [2] NCCL INFO Channel 00 : 2[201c0] -> 3[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-44:28069:28907 [4] NCCL INFO Channel 07 : 4[901c0] -> 5[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-44:28066:28906 [1] NCCL INFO Channel 02 : 1[101d0] -> 2[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-44:28068:28905 [3] NCCL INFO Channel 01 : 3[201d0] -> 4[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-44:28067:28911 [2] NCCL INFO Channel 01 : 2[201c0] -> 3[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-59:27921:28521 [5] NCCL INFO Channel 06 : 93[901d0] -> 94[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-58:27721:28234 [6] NCCL INFO Channel 06 : 86[a01c0] -> 87[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-44:28066:28906 [1] NCCL INFO Channel 03 : 1[101d0] -> 2[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-51:30345:30854 [5] NCCL INFO Connected all rings
gpu-st-p4d-24xlarge-44:28068:28905 [3] NCCL INFO Channel 04 : 3[201d0] -> 4[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-59:27922:28516 [6] NCCL INFO Channel 04 : 94[a01c0] -> 95[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-51:30343:30849 [3] NCCL INFO Connected all rings
gpu-st-p4d-24xlarge-44:28067:28911 [2] NCCL INFO Channel 02 : 2[201c0] -> 3[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-51:30344:30851 [4] NCCL INFO Connected all rings
gpu-st-p4d-24xlarge-52:39397:39920 [0] NCCL INFO Channel 01 : 48[101c0] -> 49[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-59:27921:28521 [5] NCCL INFO Channel 07 : 93[901d0] -> 94[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-44:28066:28906 [1] NCCL INFO Channel 04 : 1[101d0] -> 2[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-56:28173:28967 [2] NCCL INFO Connected all rings
gpu-st-p4d-24xlarge-50:29215:29722 [0] NCCL INFO Channel 04 : 40[101c0] -> 41[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-44:28068:28905 [3] NCCL INFO Channel 05 : 3[201d0] -> 4[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-59:27922:28516 [6] NCCL INFO Channel 05 : 94[a01c0] -> 95[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-44:28067:28911 [2] NCCL INFO Channel 03 : 2[201c0] -> 3[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29157:29663 [5] NCCL INFO Connected all rings
gpu-st-p4d-24xlarge-56:28174:28970 [3] NCCL INFO Connected all rings
gpu-st-p4d-24xlarge-48:30196:30702 [5] NCCL INFO Connected all rings
gpu-st-p4d-24xlarge-44:28066:28906 [1] NCCL INFO Channel 05 : 1[101d0] -> 2[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-50:29213:29720 [6] NCCL INFO Connected all rings
gpu-st-p4d-24xlarge-59:27922:28516 [6] NCCL INFO Channel 06 : 94[a01c0] -> 95[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-50:29215:29722 [0] NCCL INFO Channel 05 : 40[101c0] -> 41[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-44:28067:28911 [2] NCCL INFO Channel 04 : 2[201c0] -> 3[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-57:27794:28306 [1] NCCL INFO Channel 01 : 81[101d0] -> 82[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-51:30344:30851 [4] NCCL INFO Channel 00 : 44[901c0] -> 45[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-44:28066:28906 [1] NCCL INFO Channel 06 : 1[101d0] -> 2[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-56:28175:28972 [4] NCCL INFO Channel 00 : 76[901c0] -> 77[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-51:30344:30851 [4] NCCL INFO Channel 01 : 44[901c0] -> 45[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-50:29215:29722 [0] NCCL INFO Channel 06 : 40[101c0] -> 41[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-46:30084:30634 [7] NCCL INFO Connected all trees
gpu-st-p4d-24xlarge-46:30084:30634 [7] NCCL INFO threadThresholds 8/8/64 | 768/8/64 | 8/8/512
gpu-st-p4d-24xlarge-46:30084:30634 [7] NCCL INFO 8 coll channels, 8 p2p channels, 2 p2p channels per peer
gpu-st-p4d-24xlarge-44:28067:28911 [2] NCCL INFO Channel 05 : 2[201c0] -> 3[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29156:29664 [4] NCCL INFO Connected all rings
gpu-st-p4d-24xlarge-52:39397:39920 [0] NCCL INFO Channel 02 : 48[101c0] -> 49[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-59:27922:28516 [6] NCCL INFO Channel 07 : 94[a01c0] -> 95[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-56:28175:28972 [4] NCCL INFO Channel 01 : 76[901c0] -> 77[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-46:30085:30637 [0] NCCL INFO Channel 02/0 : 16[101c0] -> 20[901c0] [send] via NET/Socket/0
gpu-st-p4d-24xlarge-46:30085:30637 [0] NCCL INFO Channel 03/0 : 16[101c0] -> 20[901c0] [send] via NET/Socket/1
gpu-st-p4d-24xlarge-51:30344:30851 [4] NCCL INFO Channel 02 : 44[901c0] -> 45[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-58:27719:28237 [4] NCCL INFO Connected all rings
gpu-st-p4d-24xlarge-58:27719:28237 [4] NCCL INFO Channel 00 : 84[901c0] -> 85[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-58:27721:28234 [6] NCCL INFO Channel 07 : 86[a01c0] -> 87[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-44:28066:28906 [1] NCCL INFO Channel 07 : 1[101d0] -> 2[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-50:29215:29722 [0] NCCL INFO Channel 07 : 40[101c0] -> 41[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-51:30344:30851 [4] NCCL INFO Channel 03 : 44[901c0] -> 45[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-56:28175:28972 [4] NCCL INFO Channel 02 : 76[901c0] -> 77[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-54:29514:30069 [4] NCCL INFO Connected all rings
gpu-st-p4d-24xlarge-54:29514:30069 [4] NCCL INFO Channel 00 : 60[901c0] -> 61[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-57:27794:28306 [1] NCCL INFO Channel 02 : 81[101d0] -> 82[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-51:30343:30849 [3] NCCL INFO Channel 00 : 43[201d0] -> 44[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-51:30345:30854 [5] NCCL INFO Channel 00 : 45[901d0] -> 46[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-51:30344:30851 [4] NCCL INFO Channel 04 : 44[901c0] -> 45[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-52:39398:39917 [1] NCCL INFO Connected all rings
gpu-st-p4d-24xlarge-45:30136:30644 [7] NCCL INFO Channel 02 : 7[a01d0] -> 6[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-44:28067:28911 [2] NCCL INFO Channel 06 : 2[201c0] -> 3[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-58:27719:28237 [4] NCCL INFO Channel 01 : 84[901c0] -> 85[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-56:28175:28972 [4] NCCL INFO Channel 03 : 76[901c0] -> 77[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-45:30136:30644 [7] NCCL INFO Channel 03 : 7[a01d0] -> 6[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-51:30343:30849 [3] NCCL INFO Channel 01 : 43[201d0] -> 44[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-50:29214:29723 [7] NCCL INFO Channel 00 : 39[a01d0] -> 40[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-51:30345:30854 [5] NCCL INFO Channel 01 : 45[901d0] -> 46[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-52:39397:39920 [0] NCCL INFO Channel 03 : 48[101c0] -> 49[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-45:30136:30644 [7] NCCL INFO Channel 06 : 7[a01d0] -> 6[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-51:30344:30851 [4] NCCL INFO Channel 05 : 44[901c0] -> 45[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-57:27791:28308 [6] NCCL INFO Connected all rings
gpu-st-p4d-24xlarge-57:27791:28308 [6] NCCL INFO Channel 00 : 78[a01c0] -> 79[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-45:30136:30644 [7] NCCL INFO Channel 07 : 7[a01d0] -> 6[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-54:29514:30069 [4] NCCL INFO Channel 01 : 60[901c0] -> 61[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-51:30343:30849 [3] NCCL INFO Channel 04 : 43[201d0] -> 44[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-51:30345:30854 [5] NCCL INFO Channel 02 : 45[901d0] -> 46[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29158:29665 [6] NCCL INFO Connected all rings
gpu-st-p4d-24xlarge-56:28175:28972 [4] NCCL INFO Channel 04 : 76[901c0] -> 77[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-51:30344:30851 [4] NCCL INFO Channel 06 : 44[901c0] -> 45[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-58:27719:28237 [4] NCCL INFO Channel 02 : 84[901c0] -> 85[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-44:28067:28911 [2] NCCL INFO Channel 07 : 2[201c0] -> 3[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-51:30343:30849 [3] NCCL INFO Channel 05 : 43[201d0] -> 44[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-57:27794:28306 [1] NCCL INFO Channel 03 : 81[101d0] -> 82[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25779:26285 [1] NCCL INFO Connected all rings
gpu-st-p4d-24xlarge-51:30345:30854 [5] NCCL INFO Channel 03 : 45[901d0] -> 46[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25777:26283 [7] NCCL INFO Connected all rings
gpu-st-p4d-24xlarge-51:30344:30851 [4] NCCL INFO Channel 07 : 44[901c0] -> 45[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-56:28175:28972 [4] NCCL INFO Channel 05 : 76[901c0] -> 77[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25778:26280 [0] NCCL INFO Connected all rings
gpu-st-p4d-24xlarge-50:29214:29723 [7] NCCL INFO Channel 01 : 39[a01d0] -> 40[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-51:30345:30854 [5] NCCL INFO Channel 04 : 45[901d0] -> 46[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-52:39397:39920 [0] NCCL INFO Channel 04 : 48[101c0] -> 49[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-59:27919:28520 [3] NCCL INFO Channel 00 : 91[201d0] -> 90[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-56:28173:28967 [2] NCCL INFO Channel 00 : 74[201c0] -> 75[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29156:29664 [4] NCCL INFO Channel 00 : 68[901c0] -> 69[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-51:30345:30854 [5] NCCL INFO Channel 05 : 45[901d0] -> 46[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-56:28175:28972 [4] NCCL INFO Channel 06 : 76[901c0] -> 77[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-56:28174:28970 [3] NCCL INFO Channel 00 : 75[201d0] -> 76[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-56:28173:28967 [2] NCCL INFO Channel 01 : 74[201c0] -> 75[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29156:29664 [4] NCCL INFO Channel 01 : 68[901c0] -> 69[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-56:28175:28972 [4] NCCL INFO Channel 07 : 76[901c0] -> 77[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-51:30345:30854 [5] NCCL INFO Channel 06 : 45[901d0] -> 46[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-54:29517:30068 [7] NCCL INFO Connected all trees
gpu-st-p4d-24xlarge-54:29517:30068 [7] NCCL INFO threadThresholds 8/8/64 | 768/8/64 | 8/8/512
gpu-st-p4d-24xlarge-54:29517:30068 [7] NCCL INFO 8 coll channels, 8 p2p channels, 2 p2p channels per peer
gpu-st-p4d-24xlarge-52:39397:39920 [0] NCCL INFO Channel 05 : 48[101c0] -> 49[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-56:28174:28970 [3] NCCL INFO Channel 01 : 75[201d0] -> 76[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-56:28173:28967 [2] NCCL INFO Channel 02 : 74[201c0] -> 75[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29156:29664 [4] NCCL INFO Channel 02 : 68[901c0] -> 69[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-54:29514:30069 [4] NCCL INFO Channel 02 : 60[901c0] -> 61[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-50:29214:29723 [7] NCCL INFO Channel 04 : 39[a01d0] -> 40[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-57:27791:28308 [6] NCCL INFO Channel 01 : 78[a01c0] -> 79[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-48:30195:30703 [4] NCCL INFO Connected all rings
gpu-st-p4d-24xlarge-55:29157:29663 [5] NCCL INFO Channel 00 : 69[901d0] -> 70[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-51:30345:30854 [5] NCCL INFO Channel 07 : 45[901d0] -> 46[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-54:29515:30066 [5] NCCL INFO Connected all rings
gpu-st-p4d-24xlarge-54:29514:30069 [4] NCCL INFO Channel 03 : 60[901c0] -> 61[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29156:29664 [4] NCCL INFO Channel 03 : 68[901c0] -> 69[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-56:28174:28970 [3] NCCL INFO Channel 04 : 75[201d0] -> 76[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-59:27919:28520 [3] NCCL INFO Channel 01 : 91[201d0] -> 90[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-54:29514:30069 [4] NCCL INFO Channel 04 : 60[901c0] -> 61[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25778:26280 [0] NCCL INFO Channel 00 : 32[101c0] -> 33[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-57:27794:28306 [1] NCCL INFO Channel 04 : 81[101d0] -> 82[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29157:29663 [5] NCCL INFO Channel 01 : 69[901d0] -> 70[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-56:28173:28967 [2] NCCL INFO Channel 03 : 74[201c0] -> 75[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-54:29514:30069 [4] NCCL INFO Channel 05 : 60[901c0] -> 61[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29156:29664 [4] NCCL INFO Channel 04 : 68[901c0] -> 69[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-57:27791:28308 [6] NCCL INFO Channel 02 : 78[a01c0] -> 79[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-54:29514:30069 [4] NCCL INFO Channel 06 : 60[901c0] -> 61[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25778:26280 [0] NCCL INFO Channel 01 : 32[101c0] -> 33[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29157:29663 [5] NCCL INFO Channel 02 : 69[901d0] -> 70[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-54:29514:30069 [4] NCCL INFO Channel 07 : 60[901c0] -> 61[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29156:29664 [4] NCCL INFO Channel 05 : 68[901c0] -> 69[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-52:39397:39920 [0] NCCL INFO Channel 06 : 48[101c0] -> 49[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-56:28174:28970 [3] NCCL INFO Channel 05 : 75[201d0] -> 76[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29157:29663 [5] NCCL INFO Channel 03 : 69[901d0] -> 70[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-58:27720:28232 [5] NCCL INFO Connected all rings
gpu-st-p4d-24xlarge-57:27794:28306 [1] NCCL INFO Channel 05 : 81[101d0] -> 82[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29156:29664 [4] NCCL INFO Channel 06 : 68[901c0] -> 69[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25778:26280 [0] NCCL INFO Channel 02 : 32[101c0] -> 33[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25778:26280 [0] NCCL INFO Channel 03 : 32[101c0] -> 33[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-58:27719:28237 [4] NCCL INFO Channel 03 : 84[901c0] -> 85[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-54:29515:30066 [5] NCCL INFO Channel 00 : 61[901d0] -> 62[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29158:29665 [6] NCCL INFO Channel 00 : 70[a01c0] -> 71[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-54:29515:30066 [5] NCCL INFO Channel 01 : 61[901d0] -> 62[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29157:29663 [5] NCCL INFO Channel 04 : 69[901d0] -> 70[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-57:27791:28308 [6] NCCL INFO Channel 03 : 78[a01c0] -> 79[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-54:29515:30066 [5] NCCL INFO Channel 02 : 61[901d0] -> 62[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29156:29664 [4] NCCL INFO Channel 07 : 68[901c0] -> 69[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-54:29515:30066 [5] NCCL INFO Channel 03 : 61[901d0] -> 62[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-52:39397:39920 [0] NCCL INFO Channel 07 : 48[101c0] -> 49[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-52:39400:39922 [3] NCCL INFO Connected all trees
gpu-st-p4d-24xlarge-52:39400:39922 [3] NCCL INFO threadThresholds 8/8/64 | 768/8/64 | 8/8/512
gpu-st-p4d-24xlarge-52:39400:39922 [3] NCCL INFO 8 coll channels, 8 p2p channels, 2 p2p channels per peer
gpu-st-p4d-24xlarge-55:29158:29665 [6] NCCL INFO Channel 01 : 70[a01c0] -> 71[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-59:27919:28520 [3] NCCL INFO Channel 04 : 91[201d0] -> 90[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-54:29515:30066 [5] NCCL INFO Channel 04 : 61[901d0] -> 62[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29157:29663 [5] NCCL INFO Channel 05 : 69[901d0] -> 70[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-58:27719:28237 [4] NCCL INFO Channel 04 : 84[901c0] -> 85[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-48:30194:30699 [3] NCCL INFO Connected all rings
gpu-st-p4d-24xlarge-49:25778:26280 [0] NCCL INFO Channel 04 : 32[101c0] -> 33[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25778:26280 [0] NCCL INFO Channel 05 : 32[101c0] -> 33[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25778:26280 [0] NCCL INFO Channel 06 : 32[101c0] -> 33[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-57:27794:28306 [1] NCCL INFO Channel 06 : 81[101d0] -> 82[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-54:29515:30066 [5] NCCL INFO Channel 05 : 61[901d0] -> 62[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25778:26280 [0] NCCL INFO Channel 07 : 32[101c0] -> 33[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-54:29515:30066 [5] NCCL INFO Channel 06 : 61[901d0] -> 62[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29158:29665 [6] NCCL INFO Channel 02 : 70[a01c0] -> 71[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-48:30193:30701 [2] NCCL INFO Connected all rings
gpu-st-p4d-24xlarge-45:30135:30642 [6] NCCL INFO Connected all rings
gpu-st-p4d-24xlarge-54:29515:30066 [5] NCCL INFO Channel 07 : 61[901d0] -> 62[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-56:28173:28967 [2] NCCL INFO Channel 04 : 74[201c0] -> 75[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-45:30135:30642 [6] NCCL INFO Channel 00 : 6[a01c0] -> 7[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-50:29211:29719 [4] NCCL INFO Connected all rings
gpu-st-p4d-24xlarge-50:29211:29719 [4] NCCL INFO Channel 00 : 36[901c0] -> 37[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-52:39398:39917 [1] NCCL INFO Channel 00 : 49[101d0] -> 50[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29157:29663 [5] NCCL INFO Channel 06 : 69[901d0] -> 70[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-50:29214:29723 [7] NCCL INFO Channel 05 : 39[a01d0] -> 40[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-57:27791:28308 [6] NCCL INFO Channel 04 : 78[a01c0] -> 79[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-52:39398:39917 [1] NCCL INFO Channel 01 : 49[101d0] -> 50[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-52:39398:39917 [1] NCCL INFO Channel 02 : 49[101d0] -> 50[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-52:39398:39917 [1] NCCL INFO Channel 03 : 49[101d0] -> 50[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29158:29665 [6] NCCL INFO Channel 03 : 70[a01c0] -> 71[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-52:39398:39917 [1] NCCL INFO Channel 04 : 49[101d0] -> 50[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-48:30195:30703 [4] NCCL INFO Channel 00 : 28[901c0] -> 29[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-57:27794:28306 [1] NCCL INFO Channel 07 : 81[101d0] -> 82[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-45:30135:30642 [6] NCCL INFO Channel 01 : 6[a01c0] -> 7[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-52:39398:39917 [1] NCCL INFO Channel 05 : 49[101d0] -> 50[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29157:29663 [5] NCCL INFO Channel 07 : 69[901d0] -> 70[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25779:26285 [1] NCCL INFO Channel 00 : 33[101d0] -> 34[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-47:30190:30703 [4] NCCL INFO Connected all rings
gpu-st-p4d-24xlarge-52:39398:39917 [1] NCCL INFO Channel 06 : 49[101d0] -> 50[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-59:27919:28520 [3] NCCL INFO Channel 05 : 91[201d0] -> 90[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-47:30191:30699 [5] NCCL INFO Connected all rings
gpu-st-p4d-24xlarge-56:28173:28967 [2] NCCL INFO Channel 05 : 74[201c0] -> 75[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-52:39398:39917 [1] NCCL INFO Channel 07 : 49[101d0] -> 50[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25777:26283 [7] NCCL INFO Channel 00 : 31[a01d0] -> 32[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-57:27793:28309 [0] NCCL INFO Connected all rings
gpu-st-p4d-24xlarge-46:30081:30633 [4] NCCL INFO Connected all rings
gpu-st-p4d-24xlarge-46:30081:30633 [4] NCCL INFO Channel 00 : 12[901c0] -> 13[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-57:27791:28308 [6] NCCL INFO Channel 05 : 78[a01c0] -> 79[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25779:26285 [1] NCCL INFO Channel 01 : 33[101d0] -> 34[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-56:28171:28973 [0] NCCL INFO Connected all rings
gpu-st-p4d-24xlarge-56:28171:28973 [0] NCCL INFO Channel 00 : 72[101c0] -> 73[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-48:30195:30703 [4] NCCL INFO Channel 01 : 28[901c0] -> 29[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-57:27792:28305 [7] NCCL INFO Connected all rings
gpu-st-p4d-24xlarge-44:28065:28903 [0] NCCL INFO Channel 02 : 0[101c0] -> 5[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25777:26283 [7] NCCL INFO Channel 01 : 31[a01d0] -> 32[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-51:30342:30903 [2] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-45:30135:30642 [6] NCCL INFO Channel 02 : 6[a01c0] -> 7[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-51:30342:30850 [2] NCCL INFO Channel 00/0 : 36[901c0] -> 42[201c0] [receive] via NET/Socket/0
gpu-st-p4d-24xlarge-49:25779:26285 [1] NCCL INFO Channel 02 : 33[101d0] -> 34[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-51:30342:30903 [2] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-51:30342:30850 [2] NCCL INFO Channel 01/0 : 36[901c0] -> 42[201c0] [receive] via NET/Socket/1
gpu-st-p4d-24xlarge-46:30081:30633 [4] NCCL INFO Channel 01 : 12[901c0] -> 13[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25777:26283 [7] NCCL INFO Channel 04 : 31[a01d0] -> 32[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-57:27791:28308 [6] NCCL INFO Channel 06 : 78[a01c0] -> 79[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25779:26285 [1] NCCL INFO Channel 03 : 33[101d0] -> 34[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-46:30081:30633 [4] NCCL INFO Channel 02 : 12[901c0] -> 13[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-58:27719:28237 [4] NCCL INFO Channel 05 : 84[901c0] -> 85[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25777:26283 [7] NCCL INFO Channel 05 : 31[a01d0] -> 32[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-57:27791:28308 [6] NCCL INFO Channel 07 : 78[a01c0] -> 79[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25779:26285 [1] NCCL INFO Channel 04 : 33[101d0] -> 34[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-50:29211:29719 [4] NCCL INFO Channel 01 : 36[901c0] -> 37[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-44:28065:28903 [0] NCCL INFO Channel 03 : 0[101c0] -> 5[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-48:30195:30703 [4] NCCL INFO Channel 02 : 28[901c0] -> 29[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-46:30082:30636 [5] NCCL INFO Connected all rings
gpu-st-p4d-24xlarge-46:30081:30633 [4] NCCL INFO Channel 03 : 12[901c0] -> 13[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-56:28173:28967 [2] NCCL INFO Channel 06 : 74[201c0] -> 75[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-47:30189:30700 [3] NCCL INFO Channel 00 : 19[201d0] -> 20[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25779:26285 [1] NCCL INFO Channel 05 : 33[101d0] -> 34[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-58:27719:28237 [4] NCCL INFO Channel 06 : 84[901c0] -> 85[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-56:28171:28973 [0] NCCL INFO Channel 01 : 72[101c0] -> 73[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29154:29703 [2] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-55:29154:29667 [2] NCCL INFO Channel 00/0 : 60[901c0] -> 66[201c0] [receive] via NET/Socket/0
gpu-st-p4d-24xlarge-55:29154:29703 [2] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-46:30081:30633 [4] NCCL INFO Channel 04 : 12[901c0] -> 13[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29154:29667 [2] NCCL INFO Channel 01/0 : 60[901c0] -> 66[201c0] [receive] via NET/Socket/1
gpu-st-p4d-24xlarge-55:29158:29665 [6] NCCL INFO Channel 04 : 70[a01c0] -> 71[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-59:27923:28518 [7] NCCL INFO Channel 02 : 95[a01d0] -> 90[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-46:30081:30633 [4] NCCL INFO Channel 05 : 12[901c0] -> 13[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25779:26285 [1] NCCL INFO Channel 06 : 33[101d0] -> 34[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-47:30192:30698 [6] NCCL INFO Connected all rings
gpu-st-p4d-24xlarge-46:30081:30633 [4] NCCL INFO Channel 06 : 12[901c0] -> 13[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-58:27719:28237 [4] NCCL INFO Channel 07 : 84[901c0] -> 85[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-53:26990:27493 [2] NCCL INFO Connected all trees
gpu-st-p4d-24xlarge-53:26990:27493 [2] NCCL INFO threadThresholds 8/8/64 | 768/8/64 | 8/8/512
gpu-st-p4d-24xlarge-53:26990:27493 [2] NCCL INFO 8 coll channels, 8 p2p channels, 2 p2p channels per peer
gpu-st-p4d-24xlarge-47:30189:30700 [3] NCCL INFO Channel 01 : 19[201d0] -> 20[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-50:29213:29720 [6] NCCL INFO Channel 00 : 38[a01c0] -> 39[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29158:29665 [6] NCCL INFO Channel 05 : 70[a01c0] -> 71[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-48:30195:30703 [4] NCCL INFO Channel 03 : 28[901c0] -> 29[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-44:28065:28903 [0] NCCL INFO Channel 06 : 0[101c0] -> 5[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-46:30081:30633 [4] NCCL INFO Channel 07 : 12[901c0] -> 13[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-50:29211:29719 [4] NCCL INFO Channel 02 : 36[901c0] -> 37[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25779:26285 [1] NCCL INFO Channel 07 : 33[101d0] -> 34[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-45:30135:30642 [6] NCCL INFO Channel 03 : 6[a01c0] -> 7[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-59:27923:28518 [7] NCCL INFO Channel 03 : 95[a01d0] -> 90[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-48:30191:30698 [0] NCCL INFO Connected all rings
gpu-st-p4d-24xlarge-48:30191:30698 [0] NCCL INFO Channel 00 : 24[101c0] -> 25[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-54:29514:30069 [4] NCCL INFO Channel 02 : 60[901c0] -> 65[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-53:26989:27558 [1] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-53:26989:27495 [1] NCCL INFO Channel 06/0 : 64[101c0] -> 57[101d0] [receive] via NET/Socket/0
gpu-st-p4d-24xlarge-53:26989:27558 [1] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-53:26989:27495 [1] NCCL INFO Channel 07/0 : 64[101c0] -> 57[101d0] [receive] via NET/Socket/1
gpu-st-p4d-24xlarge-47:30189:30700 [3] NCCL INFO Channel 04 : 19[201d0] -> 20[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29158:29665 [6] NCCL INFO Channel 06 : 70[a01c0] -> 71[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-58:27722:28233 [7] NCCL INFO Connected all trees
gpu-st-p4d-24xlarge-58:27722:28233 [7] NCCL INFO threadThresholds 8/8/64 | 768/8/64 | 8/8/512
gpu-st-p4d-24xlarge-58:27722:28233 [7] NCCL INFO 8 coll channels, 8 p2p channels, 2 p2p channels per peer
gpu-st-p4d-24xlarge-47:30190:30703 [4] NCCL INFO Channel 00 : 20[901c0] -> 21[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-45:30135:30642 [6] NCCL INFO Channel 04 : 6[a01c0] -> 7[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-44:28065:28903 [0] NCCL INFO Channel 07 : 0[101c0] -> 5[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-50:29213:29720 [6] NCCL INFO Channel 01 : 38[a01c0] -> 39[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-54:29514:30069 [4] NCCL INFO Channel 03 : 60[901c0] -> 65[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-50:29211:29719 [4] NCCL INFO Channel 03 : 36[901c0] -> 37[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-58:27720:28232 [5] NCCL INFO Channel 00 : 85[901d0] -> 86[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-47:30189:30700 [3] NCCL INFO Channel 05 : 19[201d0] -> 20[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-47:30190:30703 [4] NCCL INFO Channel 01 : 20[901c0] -> 21[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-50:29212:29721 [5] NCCL INFO Connected all rings
gpu-st-p4d-24xlarge-52:39397:39920 [0] NCCL INFO Channel 02 : 48[101c0] -> 53[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-58:27720:28232 [5] NCCL INFO Channel 01 : 85[901d0] -> 86[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29158:29665 [6] NCCL INFO Channel 07 : 70[a01c0] -> 71[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-45:30137:30647 [0] NCCL INFO Connected all rings
gpu-st-p4d-24xlarge-54:29518:30070 [0] NCCL INFO Channel 02/0 : 64[101c0] -> 68[901c0] [send] via NET/Socket/0
gpu-st-p4d-24xlarge-56:28173:28967 [2] NCCL INFO Channel 07 : 74[201c0] -> 75[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-54:29518:30070 [0] NCCL INFO Channel 03/0 : 64[101c0] -> 68[901c0] [send] via NET/Socket/1
gpu-st-p4d-24xlarge-56:28171:28973 [0] NCCL INFO Channel 02 : 72[101c0] -> 73[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-45:30135:30642 [6] NCCL INFO Channel 05 : 6[a01c0] -> 7[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-58:27720:28232 [5] NCCL INFO Channel 02 : 85[901d0] -> 86[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-47:30190:30703 [4] NCCL INFO Channel 02 : 20[901c0] -> 21[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-54:29514:30069 [4] NCCL INFO Channel 06 : 60[901c0] -> 65[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-45:30136:30644 [7] NCCL INFO Connected all rings
gpu-st-p4d-24xlarge-44:28068:28905 [3] NCCL INFO Connected all trees
gpu-st-p4d-24xlarge-44:28068:28905 [3] NCCL INFO threadThresholds 8/8/64 | 768/8/64 | 8/8/512
gpu-st-p4d-24xlarge-44:28068:28905 [3] NCCL INFO 8 coll channels, 8 p2p channels, 2 p2p channels per peer
gpu-st-p4d-24xlarge-59:27923:28518 [7] NCCL INFO Channel 06 : 95[a01d0] -> 90[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-50:29213:29720 [6] NCCL INFO Channel 02 : 38[a01c0] -> 39[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-50:29211:29719 [4] NCCL INFO Channel 04 : 36[901c0] -> 37[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-58:27720:28232 [5] NCCL INFO Channel 03 : 85[901d0] -> 86[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-48:30195:30703 [4] NCCL INFO Channel 04 : 28[901c0] -> 29[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-57:27792:28305 [7] NCCL INFO Channel 00 : 79[a01d0] -> 80[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-58:27723:28236 [0] NCCL INFO Channel 02/0 : 88[101c0] -> 92[901c0] [send] via NET/Socket/0
gpu-st-p4d-24xlarge-58:27720:28232 [5] NCCL INFO Channel 04 : 85[901d0] -> 86[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-54:29514:30069 [4] NCCL INFO Channel 07 : 60[901c0] -> 65[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-58:27723:28236 [0] NCCL INFO Channel 03/0 : 88[101c0] -> 92[901c0] [send] via NET/Socket/1
gpu-st-p4d-24xlarge-48:30191:30698 [0] NCCL INFO Channel 01 : 24[101c0] -> 25[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-47:30190:30703 [4] NCCL INFO Channel 03 : 20[901c0] -> 21[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-58:27720:28232 [5] NCCL INFO Channel 05 : 85[901d0] -> 86[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-46:30082:30636 [5] NCCL INFO Channel 00 : 13[901d0] -> 14[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-57:27793:28309 [0] NCCL INFO Channel 00 : 80[101c0] -> 81[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-56:28171:28973 [0] NCCL INFO Channel 03 : 72[101c0] -> 73[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-59:27922:28516 [6] NCCL INFO Connected all trees
gpu-st-p4d-24xlarge-59:27922:28516 [6] NCCL INFO threadThresholds 8/8/64 | 768/8/64 | 8/8/512
gpu-st-p4d-24xlarge-59:27922:28516 [6] NCCL INFO 8 coll channels, 8 p2p channels, 2 p2p channels per peer
gpu-st-p4d-24xlarge-56:28172:28968 [1] NCCL INFO Connected all rings
gpu-st-p4d-24xlarge-58:27720:28232 [5] NCCL INFO Channel 06 : 85[901d0] -> 86[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-48:30195:30703 [4] NCCL INFO Channel 05 : 28[901c0] -> 29[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-59:27923:28518 [7] NCCL INFO Channel 07 : 95[a01d0] -> 90[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-57:27792:28305 [7] NCCL INFO Channel 01 : 79[a01d0] -> 80[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-50:29213:29720 [6] NCCL INFO Channel 03 : 38[a01c0] -> 39[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-50:29211:29719 [4] NCCL INFO Channel 05 : 36[901c0] -> 37[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-58:27720:28232 [5] NCCL INFO Channel 07 : 85[901d0] -> 86[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-48:30191:30698 [0] NCCL INFO Channel 02 : 24[101c0] -> 25[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-59:27921:28521 [5] NCCL INFO Channel 02 : 93[901d0] -> 92[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-45:30136:30644 [7] NCCL INFO Channel 00 : 7[a01d0] -> 8[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-47:30191:30699 [5] NCCL INFO Channel 00 : 21[901d0] -> 22[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-47:30190:30703 [4] NCCL INFO Channel 04 : 20[901c0] -> 21[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-57:27793:28309 [0] NCCL INFO Channel 01 : 80[101c0] -> 81[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-59:27918:28693 [2] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-45:30136:30644 [7] NCCL INFO Channel 01 : 7[a01d0] -> 8[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-59:27918:28517 [2] NCCL INFO Channel 00/0 : 84[901c0] -> 90[201c0] [receive] via NET/Socket/0
gpu-st-p4d-24xlarge-59:27918:28693 [2] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-59:27918:28517 [2] NCCL INFO Channel 01/0 : 84[901c0] -> 90[201c0] [receive] via NET/Socket/1
gpu-st-p4d-24xlarge-59:27921:28521 [5] NCCL INFO Channel 03 : 93[901d0] -> 92[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-48:30195:30703 [4] NCCL INFO Channel 06 : 28[901c0] -> 29[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-54:29515:30120 [5] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-54:29515:30066 [5] NCCL INFO Channel 00/0 : 54[a01c0] -> 61[901d0] [receive] via NET/Socket/2
gpu-st-p4d-24xlarge-54:29515:30120 [5] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-57:27792:28305 [7] NCCL INFO Channel 04 : 79[a01d0] -> 80[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-54:29515:30066 [5] NCCL INFO Channel 01/0 : 54[a01c0] -> 61[901d0] [receive] via NET/Socket/3
gpu-st-p4d-24xlarge-52:39397:39920 [0] NCCL INFO Channel 03 : 48[101c0] -> 53[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-48:30192:30700 [1] NCCL INFO Connected all rings
gpu-st-p4d-24xlarge-46:30082:30636 [5] NCCL INFO Channel 01 : 13[901d0] -> 14[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-59:27921:28521 [5] NCCL INFO Channel 06 : 93[901d0] -> 92[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-56:28171:28973 [0] NCCL INFO Channel 04 : 72[101c0] -> 73[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-50:29211:29719 [4] NCCL INFO Channel 06 : 36[901c0] -> 37[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-45:30136:30644 [7] NCCL INFO Channel 04 : 7[a01d0] -> 8[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-45:30137:30647 [0] NCCL INFO Channel 00 : 8[101c0] -> 9[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-50:29213:29720 [6] NCCL INFO Channel 04 : 38[a01c0] -> 39[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-45:30136:30644 [7] NCCL INFO Channel 05 : 7[a01d0] -> 8[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29155:29662 [3] NCCL INFO Channel 04/0 : 67[201d0] -> 78[a01c0] [send] via NET/Socket/0
gpu-st-p4d-24xlarge-55:29155:29662 [3] NCCL INFO Channel 05/0 : 67[201d0] -> 78[a01c0] [send] via NET/Socket/1
gpu-st-p4d-24xlarge-48:30191:30698 [0] NCCL INFO Channel 03 : 24[101c0] -> 25[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-45:30137:30647 [0] NCCL INFO Channel 01 : 8[101c0] -> 9[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-48:30194:30699 [3] NCCL INFO Channel 00 : 27[201d0] -> 28[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-59:27921:28521 [5] NCCL INFO Channel 07 : 93[901d0] -> 92[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-47:30191:30699 [5] NCCL INFO Channel 01 : 21[901d0] -> 22[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-47:30190:30703 [4] NCCL INFO Channel 05 : 20[901c0] -> 21[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-57:27793:28309 [0] NCCL INFO Channel 02 : 80[101c0] -> 81[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-45:30137:30647 [0] NCCL INFO Channel 02 : 8[101c0] -> 9[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-48:30193:30701 [2] NCCL INFO Channel 00 : 26[201c0] -> 27[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-57:27792:28305 [7] NCCL INFO Channel 05 : 79[a01d0] -> 80[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-48:30195:30703 [4] NCCL INFO Channel 07 : 28[901c0] -> 29[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-45:30137:30647 [0] NCCL INFO Channel 03 : 8[101c0] -> 9[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-46:30082:30636 [5] NCCL INFO Channel 02 : 13[901d0] -> 14[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-46:30082:30636 [5] NCCL INFO Channel 03 : 13[901d0] -> 14[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-46:30082:30636 [5] NCCL INFO Channel 04 : 13[901d0] -> 14[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-46:30082:30636 [5] NCCL INFO Channel 05 : 13[901d0] -> 14[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-46:30082:30636 [5] NCCL INFO Channel 06 : 13[901d0] -> 14[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-46:30082:30636 [5] NCCL INFO Channel 07 : 13[901d0] -> 14[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-45:30137:30647 [0] NCCL INFO Channel 04 : 8[101c0] -> 9[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-52:39397:39920 [0] NCCL INFO Channel 06 : 48[101c0] -> 53[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-48:30191:30698 [0] NCCL INFO Channel 04 : 24[101c0] -> 25[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-48:30194:30699 [3] NCCL INFO Channel 01 : 27[201d0] -> 28[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-50:29211:29719 [4] NCCL INFO Channel 07 : 36[901c0] -> 37[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-45:30137:30647 [0] NCCL INFO Channel 05 : 8[101c0] -> 9[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-50:29213:29720 [6] NCCL INFO Channel 05 : 38[a01c0] -> 39[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-57:27793:28309 [0] NCCL INFO Channel 03 : 80[101c0] -> 81[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-45:30137:30647 [0] NCCL INFO Channel 06 : 8[101c0] -> 9[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-48:30193:30701 [2] NCCL INFO Channel 01 : 26[201c0] -> 27[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-45:30137:30647 [0] NCCL INFO Channel 07 : 8[101c0] -> 9[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-48:30194:30699 [3] NCCL INFO Channel 04 : 27[201d0] -> 28[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-48:30191:30698 [0] NCCL INFO Channel 05 : 24[101c0] -> 25[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-52:39397:39920 [0] NCCL INFO Channel 07 : 48[101c0] -> 53[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-50:29213:29720 [6] NCCL INFO Channel 06 : 38[a01c0] -> 39[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-56:28171:28973 [0] NCCL INFO Channel 05 : 72[101c0] -> 73[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-48:30193:30701 [2] NCCL INFO Channel 02 : 26[201c0] -> 27[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-47:30190:30703 [4] NCCL INFO Channel 06 : 20[901c0] -> 21[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-47:30191:30699 [5] NCCL INFO Channel 02 : 21[901d0] -> 22[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-44:28070:28908 [5] NCCL INFO Channel 02 : 5[901d0] -> 0[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-44:28067:28911 [2] NCCL INFO Channel 02 : 2[201c0] -> 1[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-48:30191:30698 [0] NCCL INFO Channel 06 : 24[101c0] -> 25[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-48:30194:30699 [3] NCCL INFO Channel 05 : 27[201d0] -> 28[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-54:29519:30121 [1] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-54:29519:30067 [1] NCCL INFO Channel 02/0 : 56[101c0] -> 65[101d0] [receive] via NET/Socket/0
gpu-st-p4d-24xlarge-47:30192:30698 [6] NCCL INFO Channel 00 : 22[a01c0] -> 23[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-54:29519:30121 [1] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-54:29519:30067 [1] NCCL INFO Channel 03/0 : 56[101c0] -> 65[101d0] [receive] via NET/Socket/1
gpu-st-p4d-24xlarge-47:30188:30760 [2] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-47:30188:30701 [2] NCCL INFO Channel 00/0 : 12[901c0] -> 18[201c0] [receive] via NET/Socket/0
gpu-st-p4d-24xlarge-47:30188:30760 [2] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-47:30188:30701 [2] NCCL INFO Channel 01/0 : 12[901c0] -> 18[201c0] [receive] via NET/Socket/1
gpu-st-p4d-24xlarge-48:30193:30701 [2] NCCL INFO Channel 03 : 26[201c0] -> 27[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-47:30190:30703 [4] NCCL INFO Channel 07 : 20[901c0] -> 21[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-47:30191:30699 [5] NCCL INFO Channel 03 : 21[901d0] -> 22[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-56:28171:28973 [0] NCCL INFO Channel 06 : 72[101c0] -> 73[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-44:28070:28908 [5] NCCL INFO Channel 03 : 5[901d0] -> 0[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-44:28067:28911 [2] NCCL INFO Channel 03 : 2[201c0] -> 1[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-54:29516:30071 [6] NCCL INFO Channel 02 : 62[a01c0] -> 61[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-50:29213:29720 [6] NCCL INFO Channel 07 : 38[a01c0] -> 39[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-47:30192:30698 [6] NCCL INFO Channel 01 : 22[a01c0] -> 23[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-54:29516:30071 [6] NCCL INFO Channel 03 : 62[a01c0] -> 61[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-51:30346:30852 [6] NCCL INFO Connected all trees
gpu-st-p4d-24xlarge-51:30346:30852 [6] NCCL INFO threadThresholds 8/8/64 | 768/8/64 | 8/8/512
gpu-st-p4d-24xlarge-48:30191:30698 [0] NCCL INFO Channel 07 : 24[101c0] -> 25[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-51:30346:30852 [6] NCCL INFO 8 coll channels, 8 p2p channels, 2 p2p channels per peer
gpu-st-p4d-24xlarge-57:27795:28304 [2] NCCL INFO Connected all trees
gpu-st-p4d-24xlarge-57:27795:28304 [2] NCCL INFO threadThresholds 8/8/64 | 768/8/64 | 8/8/512
gpu-st-p4d-24xlarge-57:27795:28304 [2] NCCL INFO 8 coll channels, 8 p2p channels, 2 p2p channels per peer
gpu-st-p4d-24xlarge-57:27793:28309 [0] NCCL INFO Channel 04 : 80[101c0] -> 81[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-54:29516:30071 [6] NCCL INFO Channel 06 : 62[a01c0] -> 61[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-44:28070:28908 [5] NCCL INFO Channel 06 : 5[901d0] -> 0[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-44:28067:28911 [2] NCCL INFO Channel 06 : 2[201c0] -> 1[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-47:30191:30699 [5] NCCL INFO Channel 04 : 21[901d0] -> 22[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-54:29516:30071 [6] NCCL INFO Channel 07 : 62[a01c0] -> 61[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-48:30193:30701 [2] NCCL INFO Channel 04 : 26[201c0] -> 27[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-47:30192:30698 [6] NCCL INFO Channel 02 : 22[a01c0] -> 23[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-57:27793:28309 [0] NCCL INFO Channel 05 : 80[101c0] -> 81[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-44:28067:28911 [2] NCCL INFO Channel 07 : 2[201c0] -> 1[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-56:28171:28973 [0] NCCL INFO Channel 07 : 72[101c0] -> 73[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-45:30135:30642 [6] NCCL INFO Channel 06 : 6[a01c0] -> 7[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-44:28066:28906 [1] NCCL INFO Channel 00 : 1[101d0] -> 0[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-47:30191:30699 [5] NCCL INFO Channel 05 : 21[901d0] -> 22[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-44:28066:28906 [1] NCCL INFO Channel 01 : 1[101d0] -> 0[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-57:27793:28309 [0] NCCL INFO Channel 06 : 80[101c0] -> 81[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-44:28066:28906 [1] NCCL INFO Channel 04 : 1[101d0] -> 0[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-44:28066:28906 [1] NCCL INFO Channel 05 : 1[101d0] -> 0[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-48:30193:30701 [2] NCCL INFO Channel 05 : 26[201c0] -> 27[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-45:30135:30642 [6] NCCL INFO Channel 07 : 6[a01c0] -> 7[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-47:30192:30698 [6] NCCL INFO Channel 03 : 22[a01c0] -> 23[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-47:30191:30699 [5] NCCL INFO Channel 06 : 21[901d0] -> 22[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-57:27793:28309 [0] NCCL INFO Channel 07 : 80[101c0] -> 81[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-44:28070:28908 [5] NCCL INFO Channel 07 : 5[901d0] -> 0[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-47:30192:30698 [6] NCCL INFO Channel 04 : 22[a01c0] -> 23[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-56:28174:28970 [3] NCCL INFO Connected all trees
gpu-st-p4d-24xlarge-56:28174:28970 [3] NCCL INFO threadThresholds 8/8/64 | 768/8/64 | 8/8/512
gpu-st-p4d-24xlarge-56:28174:28970 [3] NCCL INFO 8 coll channels, 8 p2p channels, 2 p2p channels per peer
gpu-st-p4d-24xlarge-56:28172:28968 [1] NCCL INFO Channel 00 : 73[101d0] -> 74[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-48:30193:30701 [2] NCCL INFO Channel 06 : 26[201c0] -> 27[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-47:30191:30699 [5] NCCL INFO Channel 07 : 21[901d0] -> 22[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-56:28172:28968 [1] NCCL INFO Channel 01 : 73[101d0] -> 74[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-44:28067:28911 [2] NCCL INFO Connected all trees
gpu-st-p4d-24xlarge-44:28067:28911 [2] NCCL INFO threadThresholds 8/8/64 | 768/8/64 | 8/8/512
gpu-st-p4d-24xlarge-44:28067:28911 [2] NCCL INFO 8 coll channels, 8 p2p channels, 2 p2p channels per peer
gpu-st-p4d-24xlarge-50:29212:29721 [5] NCCL INFO Channel 00 : 37[901d0] -> 38[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-56:28172:28968 [1] NCCL INFO Channel 02 : 73[101d0] -> 74[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-47:30192:30698 [6] NCCL INFO Channel 05 : 22[a01c0] -> 23[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-46:30081:30633 [4] NCCL INFO Channel 02 : 12[901c0] -> 17[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-56:28172:28968 [1] NCCL INFO Channel 03 : 73[101d0] -> 74[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-56:28172:28968 [1] NCCL INFO Channel 04 : 73[101d0] -> 74[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-58:27719:28237 [4] NCCL INFO Channel 02 : 84[901c0] -> 89[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-48:30193:30701 [2] NCCL INFO Channel 07 : 26[201c0] -> 27[201d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-56:28172:28968 [1] NCCL INFO Channel 05 : 73[101d0] -> 74[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-50:29212:29721 [5] NCCL INFO Channel 01 : 37[901d0] -> 38[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-48:30192:30700 [1] NCCL INFO Channel 00 : 25[101d0] -> 26[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-56:28172:28968 [1] NCCL INFO Channel 06 : 73[101d0] -> 74[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-52:39399:39918 [2] NCCL INFO Channel 02 : 50[201c0] -> 49[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-47:30192:30698 [6] NCCL INFO Channel 06 : 22[a01c0] -> 23[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-46:30081:30633 [4] NCCL INFO Channel 03 : 12[901c0] -> 17[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29159:29666 [7] NCCL INFO Channel 02 : 71[a01d0] -> 66[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-52:39399:39918 [2] NCCL INFO Channel 03 : 50[201c0] -> 49[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-50:29212:29721 [5] NCCL INFO Channel 02 : 37[901d0] -> 38[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-52:39399:39918 [2] NCCL INFO Channel 06 : 50[201c0] -> 49[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-58:27719:28237 [4] NCCL INFO Channel 03 : 84[901c0] -> 89[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-48:30192:30700 [1] NCCL INFO Channel 01 : 25[101d0] -> 26[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-52:39399:39918 [2] NCCL INFO Channel 07 : 50[201c0] -> 49[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-56:28172:28968 [1] NCCL INFO Channel 07 : 73[101d0] -> 74[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-52:39401:39921 [4] NCCL INFO Channel 06/0 : 52[901c0] -> 56[101c0] [send] via NET/Socket/2
gpu-st-p4d-24xlarge-52:39401:39921 [4] NCCL INFO Channel 07/0 : 52[901c0] -> 56[101c0] [send] via NET/Socket/3
gpu-st-p4d-24xlarge-47:30192:30698 [6] NCCL INFO Channel 07 : 22[a01c0] -> 23[a01d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-50:29212:29721 [5] NCCL INFO Channel 03 : 37[901d0] -> 38[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-45:30137:30701 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-45:30137:30647 [0] NCCL INFO Channel 06/0 : 4[901c0] -> 8[101c0] [receive] via NET/Socket/0
gpu-st-p4d-24xlarge-45:30137:30701 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-45:30137:30647 [0] NCCL INFO Channel 07/0 : 4[901c0] -> 8[101c0] [receive] via NET/Socket/1
gpu-st-p4d-24xlarge-55:29159:29666 [7] NCCL INFO Channel 03 : 71[a01d0] -> 66[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-50:29214:29723 [7] NCCL INFO Connected all trees
gpu-st-p4d-24xlarge-50:29214:29723 [7] NCCL INFO threadThresholds 8/8/64 | 768/8/64 | 8/8/512
gpu-st-p4d-24xlarge-50:29214:29723 [7] NCCL INFO 8 coll channels, 8 p2p channels, 2 p2p channels per peer
gpu-st-p4d-24xlarge-53:26986:27559 [6] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-53:26986:27498 [6] NCCL INFO Channel 04/0 : 48[101c0] -> 54[a01c0] [receive] via NET/Socket/2
gpu-st-p4d-24xlarge-49:25776:26344 [6] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-53:26986:27559 [6] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-48:30192:30700 [1] NCCL INFO Channel 02 : 25[101d0] -> 26[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25776:26284 [6] NCCL INFO Channel 04/0 : 24[101c0] -> 30[a01c0] [receive] via NET/Socket/2
gpu-st-p4d-24xlarge-53:26986:27498 [6] NCCL INFO Channel 05/0 : 48[101c0] -> 54[a01c0] [receive] via NET/Socket/3
gpu-st-p4d-24xlarge-49:25776:26344 [6] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-49:25776:26284 [6] NCCL INFO Channel 05/0 : 24[101c0] -> 30[a01c0] [receive] via NET/Socket/3
gpu-st-p4d-24xlarge-50:29212:29721 [5] NCCL INFO Channel 04 : 37[901d0] -> 38[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-46:30081:30633 [4] NCCL INFO Channel 06 : 12[901c0] -> 17[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-50:29212:29721 [5] NCCL INFO Channel 05 : 37[901d0] -> 38[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-50:29212:29721 [5] NCCL INFO Channel 06 : 37[901d0] -> 38[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29159:29666 [7] NCCL INFO Channel 06 : 71[a01d0] -> 66[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-46:30081:30633 [4] NCCL INFO Channel 07 : 12[901c0] -> 17[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-48:30192:30700 [1] NCCL INFO Channel 03 : 25[101d0] -> 26[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-50:29212:29721 [5] NCCL INFO Channel 07 : 37[901d0] -> 38[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-58:27719:28237 [4] NCCL INFO Channel 06 : 84[901c0] -> 89[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29159:29666 [7] NCCL INFO Channel 07 : 71[a01d0] -> 66[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-48:30192:30700 [1] NCCL INFO Channel 04 : 25[101d0] -> 26[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-44:28069:28907 [4] NCCL INFO Channel 06/0 : 4[901c0] -> 8[101c0] [send] via NET/Socket/2
gpu-st-p4d-24xlarge-44:28069:28907 [4] NCCL INFO Channel 07/0 : 4[901c0] -> 8[101c0] [send] via NET/Socket/3
gpu-st-p4d-24xlarge-45:30138:30698 [1] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-45:30138:30645 [1] NCCL INFO Channel 06/0 : 16[101c0] -> 9[101d0] [receive] via NET/Socket/0
gpu-st-p4d-24xlarge-45:30138:30698 [1] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-45:30138:30645 [1] NCCL INFO Channel 07/0 : 16[101c0] -> 9[101d0] [receive] via NET/Socket/1
gpu-st-p4d-24xlarge-58:27719:28237 [4] NCCL INFO Channel 07 : 84[901c0] -> 89[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-48:30192:30700 [1] NCCL INFO Channel 05 : 25[101d0] -> 26[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-58:27721:28234 [6] NCCL INFO Channel 02 : 86[a01c0] -> 85[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-59:27920:28621 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-58:27721:28234 [6] NCCL INFO Channel 03 : 86[a01c0] -> 85[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-59:27920:28519 [4] NCCL INFO Channel 02/0 : 88[101c0] -> 92[901c0] [receive] via NET/Socket/2
gpu-st-p4d-24xlarge-48:30192:30700 [1] NCCL INFO Channel 06 : 25[101d0] -> 26[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-46:30082:30683 [5] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-59:27920:28621 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-46:30082:30636 [5] NCCL INFO Channel 00/0 : 6[a01c0] -> 13[901d0] [receive] via NET/Socket/2
gpu-st-p4d-24xlarge-59:27920:28519 [4] NCCL INFO Channel 03/0 : 88[101c0] -> 92[901c0] [receive] via NET/Socket/3
gpu-st-p4d-24xlarge-46:30082:30683 [5] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-46:30082:30636 [5] NCCL INFO Channel 01/0 : 6[a01c0] -> 13[901d0] [receive] via NET/Socket/3
gpu-st-p4d-24xlarge-46:30083:30635 [6] NCCL INFO Channel 02 : 14[a01c0] -> 13[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-53:26988:27563 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-53:26988:27497 [0] NCCL INFO Channel 06/0 : 52[901c0] -> 56[101c0] [receive] via NET/Socket/0
gpu-st-p4d-24xlarge-53:26988:27563 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-53:26988:27497 [0] NCCL INFO Channel 07/0 : 52[901c0] -> 56[101c0] [receive] via NET/Socket/1
gpu-st-p4d-24xlarge-48:30192:30700 [1] NCCL INFO Channel 07 : 25[101d0] -> 26[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-58:27721:28234 [6] NCCL INFO Channel 06 : 86[a01c0] -> 85[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25778:26346 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-49:25778:26280 [0] NCCL INFO Channel 06/0 : 28[901c0] -> 32[101c0] [receive] via NET/Socket/0
gpu-st-p4d-24xlarge-49:25778:26346 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-49:25778:26280 [0] NCCL INFO Channel 07/0 : 28[901c0] -> 32[101c0] [receive] via NET/Socket/1
gpu-st-p4d-24xlarge-46:30083:30635 [6] NCCL INFO Channel 03 : 14[a01c0] -> 13[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-48:30194:30699 [3] NCCL INFO Connected all trees
gpu-st-p4d-24xlarge-48:30194:30699 [3] NCCL INFO threadThresholds 8/8/64 | 768/8/64 | 8/8/512
gpu-st-p4d-24xlarge-48:30194:30699 [3] NCCL INFO 8 coll channels, 8 p2p channels, 2 p2p channels per peer
gpu-st-p4d-24xlarge-51:30343:30849 [3] NCCL INFO Channel 04/0 : 43[201d0] -> 66[201c0] [send] via NET/Socket/0
gpu-st-p4d-24xlarge-51:30343:30849 [3] NCCL INFO Channel 05/0 : 43[201d0] -> 66[201c0] [send] via NET/Socket/1
gpu-st-p4d-24xlarge-58:27721:28234 [6] NCCL INFO Channel 07 : 86[a01c0] -> 85[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29157:29663 [5] NCCL INFO Channel 06/0 : 69[901d0] -> 80[101c0] [send] via NET/Socket/2
gpu-st-p4d-24xlarge-55:29157:29663 [5] NCCL INFO Channel 07/0 : 69[901d0] -> 80[101c0] [send] via NET/Socket/3
gpu-st-p4d-24xlarge-56:28171:28973 [0] NCCL INFO Channel 02 : 72[101c0] -> 77[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29158:29665 [6] NCCL INFO Connected all trees
gpu-st-p4d-24xlarge-55:29158:29665 [6] NCCL INFO threadThresholds 8/8/64 | 768/8/64 | 8/8/512
gpu-st-p4d-24xlarge-55:29158:29665 [6] NCCL INFO 8 coll channels, 8 p2p channels, 2 p2p channels per peer
gpu-st-p4d-24xlarge-58:27724:28285 [1] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-58:27724:28235 [1] NCCL INFO Channel 02/0 : 80[101c0] -> 89[101d0] [receive] via NET/Socket/0
gpu-st-p4d-24xlarge-58:27724:28285 [1] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-58:27724:28235 [1] NCCL INFO Channel 03/0 : 80[101c0] -> 89[101d0] [receive] via NET/Socket/1
gpu-st-p4d-24xlarge-56:28171:28973 [0] NCCL INFO Channel 03 : 72[101c0] -> 77[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-53:26987:27560 [7] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-53:26987:27494 [7] NCCL INFO Channel 04/0 : 60[901c0] -> 55[a01d0] [receive] via NET/Socket/2
gpu-st-p4d-24xlarge-53:26987:27560 [7] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-49:25780:26286 [2] NCCL INFO Connected all trees
gpu-st-p4d-24xlarge-53:26987:27494 [7] NCCL INFO Channel 05/0 : 60[901c0] -> 55[a01d0] [receive] via NET/Socket/3
gpu-st-p4d-24xlarge-49:25780:26286 [2] NCCL INFO threadThresholds 8/8/64 | 768/8/64 | 8/8/512
gpu-st-p4d-24xlarge-49:25780:26286 [2] NCCL INFO 8 coll channels, 8 p2p channels, 2 p2p channels per peer
gpu-st-p4d-24xlarge-56:28171:28973 [0] NCCL INFO Channel 06 : 72[101c0] -> 77[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-56:28171:28973 [0] NCCL INFO Channel 07 : 72[101c0] -> 77[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-51:30344:30902 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-51:30344:30851 [4] NCCL INFO Channel 02/0 : 40[101c0] -> 44[901c0] [receive] via NET/Socket/2
gpu-st-p4d-24xlarge-46:30083:30635 [6] NCCL INFO Channel 06 : 14[a01c0] -> 13[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-51:30344:30902 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-51:30344:30851 [4] NCCL INFO Channel 03/0 : 40[101c0] -> 44[901c0] [receive] via NET/Socket/3
gpu-st-p4d-24xlarge-46:30086:30684 [1] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-46:30086:30632 [1] NCCL INFO Channel 02/0 : 8[101c0] -> 17[101d0] [receive] via NET/Socket/0
gpu-st-p4d-24xlarge-46:30086:30684 [1] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-46:30086:30632 [1] NCCL INFO Channel 03/0 : 8[101c0] -> 17[101d0] [receive] via NET/Socket/1
gpu-st-p4d-24xlarge-46:30083:30635 [6] NCCL INFO Channel 07 : 14[a01c0] -> 13[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-50:29211:29719 [4] NCCL INFO Channel 02 : 36[901c0] -> 41[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-48:30191:30698 [0] NCCL INFO Channel 02 : 24[101c0] -> 29[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-54:29514:30069 [4] NCCL INFO Channel 00/0 : 60[901c0] -> 66[201c0] [send] via NET/Socket/2
gpu-st-p4d-24xlarge-54:29514:30069 [4] NCCL INFO Channel 01/0 : 60[901c0] -> 66[201c0] [send] via NET/Socket/3
gpu-st-p4d-24xlarge-50:29211:29719 [4] NCCL INFO Channel 03 : 36[901c0] -> 41[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-48:30191:30698 [0] NCCL INFO Channel 03 : 24[101c0] -> 29[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-56:28173:28967 [2] NCCL INFO Channel 02 : 74[201c0] -> 73[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-50:29211:29719 [4] NCCL INFO Channel 06 : 36[901c0] -> 41[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-48:30191:30698 [0] NCCL INFO Channel 06 : 24[101c0] -> 29[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-47:30193:30702 [7] NCCL INFO Channel 02 : 23[a01d0] -> 18[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-48:30191:30698 [0] NCCL INFO Channel 07 : 24[101c0] -> 29[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-56:28173:28967 [2] NCCL INFO Channel 03 : 74[201c0] -> 73[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-50:29215:29722 [0] NCCL INFO Channel 02/0 : 40[101c0] -> 44[901c0] [send] via NET/Socket/0
gpu-st-p4d-24xlarge-50:29215:29722 [0] NCCL INFO Channel 03/0 : 40[101c0] -> 44[901c0] [send] via NET/Socket/1
gpu-st-p4d-24xlarge-57:27794:28451 [1] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-57:27794:28306 [1] NCCL INFO Channel 06/0 : 88[101c0] -> 81[101d0] [receive] via NET/Socket/0
gpu-st-p4d-24xlarge-57:27794:28451 [1] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-44:28065:28903 [0] NCCL INFO Channel 04/0 : 0[101c0] -> 6[a01c0] [send] via NET/Socket/0
gpu-st-p4d-24xlarge-44:28065:28903 [0] NCCL INFO Channel 05/0 : 0[101c0] -> 6[a01c0] [send] via NET/Socket/1
gpu-st-p4d-24xlarge-57:27794:28306 [1] NCCL INFO Channel 07/0 : 88[101c0] -> 81[101d0] [receive] via NET/Socket/1
gpu-st-p4d-24xlarge-56:28173:28967 [2] NCCL INFO Channel 06 : 74[201c0] -> 73[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-47:30193:30702 [7] NCCL INFO Channel 03 : 23[a01d0] -> 18[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-56:28173:28967 [2] NCCL INFO Channel 07 : 74[201c0] -> 73[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-48:30193:30701 [2] NCCL INFO Channel 02 : 26[201c0] -> 25[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-50:29211:29719 [4] NCCL INFO Channel 07 : 36[901c0] -> 41[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-56:28175:28972 [4] NCCL INFO Channel 06/0 : 76[901c0] -> 80[101c0] [send] via NET/Socket/2
gpu-st-p4d-24xlarge-47:30193:30702 [7] NCCL INFO Channel 06 : 23[a01d0] -> 18[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-56:28175:28972 [4] NCCL INFO Channel 07/0 : 76[901c0] -> 80[101c0] [send] via NET/Socket/3
gpu-st-p4d-24xlarge-48:30193:30701 [2] NCCL INFO Channel 03 : 26[201c0] -> 25[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-51:30345:30854 [5] NCCL INFO Channel 06/0 : 45[901d0] -> 68[901c0] [send] via NET/Socket/2
gpu-st-p4d-24xlarge-51:30345:30854 [5] NCCL INFO Channel 07/0 : 45[901d0] -> 68[901c0] [send] via NET/Socket/3
gpu-st-p4d-24xlarge-48:30193:30701 [2] NCCL INFO Channel 06 : 26[201c0] -> 25[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-47:30193:30702 [7] NCCL INFO Channel 07 : 23[a01d0] -> 18[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29156:29706 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-55:29156:29664 [4] NCCL INFO Channel 02/0 : 64[101c0] -> 68[901c0] [receive] via NET/Socket/2
gpu-st-p4d-24xlarge-55:29156:29706 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-55:29156:29664 [4] NCCL INFO Channel 03/0 : 64[101c0] -> 68[901c0] [receive] via NET/Socket/3
gpu-st-p4d-24xlarge-52:39398:39970 [1] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-52:39398:39917 [1] NCCL INFO Channel 00/0 : 24[101c0] -> 49[101d0] [receive] via NET/Socket/0
gpu-st-p4d-24xlarge-52:39398:39970 [1] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-52:39398:39917 [1] NCCL INFO Channel 01/0 : 24[101c0] -> 49[101d0] [receive] via NET/Socket/1
gpu-st-p4d-24xlarge-57:27791:28450 [6] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-57:27791:28308 [6] NCCL INFO Channel 04/0 : 72[101c0] -> 78[a01c0] [receive] via NET/Socket/2
gpu-st-p4d-24xlarge-57:27791:28450 [6] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-57:27791:28308 [6] NCCL INFO Channel 05/0 : 72[101c0] -> 78[a01c0] [receive] via NET/Socket/3
gpu-st-p4d-24xlarge-48:30193:30701 [2] NCCL INFO Channel 07 : 26[201c0] -> 25[101d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-52:39397:39920 [0] NCCL INFO Channel 04/0 : 48[101c0] -> 54[a01c0] [send] via NET/Socket/0
gpu-st-p4d-24xlarge-52:39397:39920 [0] NCCL INFO Channel 05/0 : 48[101c0] -> 54[a01c0] [send] via NET/Socket/1
gpu-st-p4d-24xlarge-52:39402:39972 [5] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-52:39402:39919 [5] NCCL INFO Channel 02/0 : 28[901c0] -> 53[901d0] [receive] via NET/Socket/2
gpu-st-p4d-24xlarge-52:39402:39972 [5] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-52:39402:39919 [5] NCCL INFO Channel 03/0 : 28[901c0] -> 53[901d0] [receive] via NET/Socket/3
gpu-st-p4d-24xlarge-47:30192:30698 [6] NCCL INFO Connected all trees
gpu-st-p4d-24xlarge-47:30192:30698 [6] NCCL INFO threadThresholds 8/8/64 | 768/8/64 | 8/8/512
gpu-st-p4d-24xlarge-47:30192:30698 [6] NCCL INFO 8 coll channels, 8 p2p channels, 2 p2p channels per peer
gpu-st-p4d-24xlarge-50:29213:29720 [6] NCCL INFO Channel 02 : 38[a01c0] -> 37[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25777:26341 [7] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-49:25777:26283 [7] NCCL INFO Channel 04/0 : 36[901c0] -> 31[a01d0] [receive] via NET/Socket/2
gpu-st-p4d-24xlarge-49:25777:26341 [7] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-49:25777:26283 [7] NCCL INFO Channel 05/0 : 36[901c0] -> 31[a01d0] [receive] via NET/Socket/3
gpu-st-p4d-24xlarge-50:29213:29720 [6] NCCL INFO Channel 03 : 38[a01c0] -> 37[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-48:30195:30703 [4] NCCL INFO Channel 06/0 : 28[901c0] -> 32[101c0] [send] via NET/Socket/2
gpu-st-p4d-24xlarge-48:30195:30703 [4] NCCL INFO Channel 07/0 : 28[901c0] -> 32[101c0] [send] via NET/Socket/3
gpu-st-p4d-24xlarge-50:29216:29781 [1] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-50:29216:29724 [1] NCCL INFO Channel 02/0 : 32[101c0] -> 41[101d0] [receive] via NET/Socket/0
gpu-st-p4d-24xlarge-50:29213:29720 [6] NCCL INFO Channel 06 : 38[a01c0] -> 37[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-50:29216:29781 [1] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-50:29216:29724 [1] NCCL INFO Channel 03/0 : 32[101c0] -> 41[101d0] [receive] via NET/Socket/1
gpu-st-p4d-24xlarge-58:27720:28286 [5] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-58:27720:28232 [5] NCCL INFO Channel 00/0 : 78[a01c0] -> 85[901d0] [receive] via NET/Socket/2
gpu-st-p4d-24xlarge-58:27720:28286 [5] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-58:27720:28232 [5] NCCL INFO Channel 01/0 : 78[a01c0] -> 85[901d0] [receive] via NET/Socket/3
gpu-st-p4d-24xlarge-50:29213:29720 [6] NCCL INFO Channel 07 : 38[a01c0] -> 37[901d0] via P2P/IPC/read
gpu-st-p4d-24xlarge-45:30135:30697 [6] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-47:30189:30700 [3] NCCL INFO Channel 04/0 : 19[201d0] -> 30[a01c0] [send] via NET/Socket/0
gpu-st-p4d-24xlarge-47:30189:30700 [3] NCCL INFO Channel 05/0 : 19[201d0] -> 30[a01c0] [send] via NET/Socket/1
gpu-st-p4d-24xlarge-45:30135:30642 [6] NCCL INFO Channel 04/0 : 0[101c0] -> 6[a01c0] [receive] via NET/Socket/2
gpu-st-p4d-24xlarge-45:30135:30697 [6] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-45:30135:30642 [6] NCCL INFO Channel 05/0 : 0[101c0] -> 6[a01c0] [receive] via NET/Socket/3
gpu-st-p4d-24xlarge-58:27719:28237 [4] NCCL INFO Channel 00/0 : 84[901c0] -> 90[201c0] [send] via NET/Socket/2
gpu-st-p4d-24xlarge-58:27719:28237 [4] NCCL INFO Channel 01/0 : 84[901c0] -> 90[201c0] [send] via NET/Socket/3
gpu-st-p4d-24xlarge-57:27792:28452 [7] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-57:27792:28305 [7] NCCL INFO Channel 04/0 : 84[901c0] -> 79[a01d0] [receive] via NET/Socket/2
gpu-st-p4d-24xlarge-57:27792:28452 [7] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-57:27792:28305 [7] NCCL INFO Channel 05/0 : 84[901c0] -> 79[a01d0] [receive] via NET/Socket/3
gpu-st-p4d-24xlarge-45:30136:30699 [7] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-45:30136:30644 [7] NCCL INFO Channel 04/0 : 12[901c0] -> 7[a01d0] [receive] via NET/Socket/2
gpu-st-p4d-24xlarge-45:30136:30699 [7] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-45:30136:30644 [7] NCCL INFO Channel 05/0 : 12[901c0] -> 7[a01d0] [receive] via NET/Socket/3
gpu-st-p4d-24xlarge-49:25779:26345 [1] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-49:25779:26285 [1] NCCL INFO Channel 06/0 : 40[101c0] -> 33[101d0] [receive] via NET/Socket/0
gpu-st-p4d-24xlarge-49:25779:26345 [1] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-49:25779:26285 [1] NCCL INFO Channel 07/0 : 40[101c0] -> 33[101d0] [receive] via NET/Socket/1
gpu-st-p4d-24xlarge-46:30081:30633 [4] NCCL INFO Channel 00/0 : 12[901c0] -> 18[201c0] [send] via NET/Socket/2
gpu-st-p4d-24xlarge-46:30081:30633 [4] NCCL INFO Channel 01/0 : 12[901c0] -> 18[201c0] [send] via NET/Socket/3
gpu-st-p4d-24xlarge-57:27793:28455 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-57:27793:28309 [0] NCCL INFO Channel 06/0 : 76[901c0] -> 80[101c0] [receive] via NET/Socket/0
gpu-st-p4d-24xlarge-57:27793:28455 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-57:27793:28309 [0] NCCL INFO Channel 07/0 : 76[901c0] -> 80[101c0] [receive] via NET/Socket/1
gpu-st-p4d-24xlarge-58:27723:28289 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-58:27723:28236 [0] NCCL INFO Channel 02/0 : 76[901c0] -> 88[101c0] [receive] via NET/Socket/0
gpu-st-p4d-24xlarge-58:27723:28289 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-45:30137:30647 [0] NCCL INFO Channel 02/0 : 8[101c0] -> 17[101d0] [send] via NET/Socket/0
gpu-st-p4d-24xlarge-58:27723:28236 [0] NCCL INFO Channel 03/0 : 76[901c0] -> 88[101c0] [receive] via NET/Socket/1
gpu-st-p4d-24xlarge-45:30137:30647 [0] NCCL INFO Channel 03/0 : 8[101c0] -> 17[101d0] [send] via NET/Socket/1
gpu-st-p4d-24xlarge-56:28172:29021 [1] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-56:28172:28968 [1] NCCL INFO Channel 00/0 : 60[901c0] -> 73[101d0] [receive] via NET/Socket/0
gpu-st-p4d-24xlarge-56:28172:29021 [1] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-56:28172:28968 [1] NCCL INFO Channel 01/0 : 60[901c0] -> 73[101d0] [receive] via NET/Socket/1
gpu-st-p4d-24xlarge-52:39401:39921 [4] NCCL INFO Channel 02/0 : 52[901c0] -> 76[901c0] [send] via NET/Socket/2
gpu-st-p4d-24xlarge-59:27920:28621 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-52:39401:39921 [4] NCCL INFO Channel 03/0 : 52[901c0] -> 76[901c0] [send] via NET/Socket/3
gpu-st-p4d-24xlarge-59:27920:28519 [4] NCCL INFO Channel 06/0 : 44[901c0] -> 92[901c0] [receive] via NET/Socket/2
gpu-st-p4d-24xlarge-49:25778:26280 [0] NCCL INFO Channel 02/0 : 32[101c0] -> 41[101d0] [send] via NET/Socket/0
gpu-st-p4d-24xlarge-59:27920:28621 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-59:27920:28519 [4] NCCL INFO Channel 07/0 : 44[901c0] -> 92[901c0] [receive] via NET/Socket/3
gpu-st-p4d-24xlarge-49:25778:26280 [0] NCCL INFO Channel 03/0 : 32[101c0] -> 41[101d0] [send] via NET/Socket/1
gpu-st-p4d-24xlarge-59:27920:28519 [4] NCCL INFO Channel 06/0 : 92[901c0] -> 44[901c0] [send] via NET/Socket/2
gpu-st-p4d-24xlarge-59:27920:28519 [4] NCCL INFO Channel 07/0 : 92[901c0] -> 44[901c0] [send] via NET/Socket/3
gpu-st-p4d-24xlarge-44:28069:28963 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-44:28069:28907 [4] NCCL INFO Channel 02/0 : 52[901c0] -> 4[901c0] [receive] via NET/Socket/2
gpu-st-p4d-24xlarge-44:28069:28963 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-44:28069:28907 [4] NCCL INFO Channel 03/0 : 52[901c0] -> 4[901c0] [receive] via NET/Socket/3
gpu-st-p4d-24xlarge-44:28069:28907 [4] NCCL INFO Channel 02/0 : 4[901c0] -> 52[901c0] [send] via NET/Socket/2
gpu-st-p4d-24xlarge-44:28069:28907 [4] NCCL INFO Channel 03/0 : 4[901c0] -> 52[901c0] [send] via NET/Socket/3
gpu-st-p4d-24xlarge-47:30190:30756 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-47:30190:30703 [4] NCCL INFO Channel 02/0 : 16[101c0] -> 20[901c0] [receive] via NET/Socket/2
gpu-st-p4d-24xlarge-47:30190:30756 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-47:30190:30703 [4] NCCL INFO Channel 03/0 : 16[101c0] -> 20[901c0] [receive] via NET/Socket/3
gpu-st-p4d-24xlarge-53:26988:27497 [0] NCCL INFO Channel 02/0 : 56[101c0] -> 65[101d0] [send] via NET/Socket/0
gpu-st-p4d-24xlarge-53:26988:27497 [0] NCCL INFO Channel 03/0 : 56[101c0] -> 65[101d0] [send] via NET/Socket/1
gpu-st-p4d-24xlarge-56:28171:28973 [0] NCCL INFO Channel 04/0 : 72[101c0] -> 78[a01c0] [send] via NET/Socket/0
gpu-st-p4d-24xlarge-56:28171:28973 [0] NCCL INFO Channel 05/0 : 72[101c0] -> 78[a01c0] [send] via NET/Socket/1
gpu-st-p4d-24xlarge-56:28176:29020 [5] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-56:28176:28971 [5] NCCL INFO Channel 02/0 : 64[101c0] -> 77[901d0] [receive] via NET/Socket/2
gpu-st-p4d-24xlarge-56:28176:29020 [5] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-56:28176:28971 [5] NCCL INFO Channel 03/0 : 64[101c0] -> 77[901d0] [receive] via NET/Socket/3
gpu-st-p4d-24xlarge-48:30192:30740 [1] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-48:30192:30700 [1] NCCL INFO Channel 00/0 : 12[901c0] -> 25[101d0] [receive] via NET/Socket/0
gpu-st-p4d-24xlarge-48:30192:30740 [1] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-48:30192:30700 [1] NCCL INFO Channel 01/0 : 12[901c0] -> 25[101d0] [receive] via NET/Socket/1
gpu-st-p4d-24xlarge-47:30188:30760 [2] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-47:30188:30701 [2] NCCL INFO Channel 04/0 : 6[a01c0] -> 18[201c0] [receive] via NET/Socket/0
gpu-st-p4d-24xlarge-47:30188:30760 [2] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-47:30188:30701 [2] NCCL INFO Channel 05/0 : 6[a01c0] -> 18[201c0] [receive] via NET/Socket/1
gpu-st-p4d-24xlarge-48:30191:30698 [0] NCCL INFO Channel 04/0 : 24[101c0] -> 30[a01c0] [send] via NET/Socket/0
gpu-st-p4d-24xlarge-48:30191:30698 [0] NCCL INFO Channel 05/0 : 24[101c0] -> 30[a01c0] [send] via NET/Socket/1
gpu-st-p4d-24xlarge-48:30196:30734 [5] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-48:30196:30702 [5] NCCL INFO Channel 02/0 : 16[101c0] -> 29[901d0] [receive] via NET/Socket/2
gpu-st-p4d-24xlarge-48:30196:30734 [5] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-48:30196:30702 [5] NCCL INFO Channel 03/0 : 16[101c0] -> 29[901d0] [receive] via NET/Socket/3
gpu-st-p4d-24xlarge-50:29212:29782 [5] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-50:29212:29721 [5] NCCL INFO Channel 00/0 : 30[a01c0] -> 37[901d0] [receive] via NET/Socket/2
gpu-st-p4d-24xlarge-50:29212:29782 [5] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-50:29212:29721 [5] NCCL INFO Channel 01/0 : 30[a01c0] -> 37[901d0] [receive] via NET/Socket/3
gpu-st-p4d-24xlarge-50:29216:29724 [1] NCCL INFO Channel 02/0 : 41[101d0] -> 32[101c0] [send] via NET/Socket/0
gpu-st-p4d-24xlarge-50:29216:29724 [1] NCCL INFO Channel 03/0 : 41[101d0] -> 32[101c0] [send] via NET/Socket/1
gpu-st-p4d-24xlarge-47:30191:30699 [5] NCCL INFO Channel 06/0 : 21[901d0] -> 32[101c0] [send] via NET/Socket/2
gpu-st-p4d-24xlarge-54:29514:30069 [4] NCCL INFO Channel 00/0 : 60[901c0] -> 73[101d0] [send] via NET/Socket/2
gpu-st-p4d-24xlarge-54:29514:30069 [4] NCCL INFO Channel 01/0 : 60[901c0] -> 73[101d0] [send] via NET/Socket/3
gpu-st-p4d-24xlarge-46:30086:30632 [1] NCCL INFO Channel 02/0 : 17[101d0] -> 8[101c0] [send] via NET/Socket/0
gpu-st-p4d-24xlarge-46:30086:30632 [1] NCCL INFO Channel 03/0 : 17[101d0] -> 8[101c0] [send] via NET/Socket/1
gpu-st-p4d-24xlarge-47:30191:30699 [5] NCCL INFO Channel 07/0 : 21[901d0] -> 32[101c0] [send] via NET/Socket/3
gpu-st-p4d-24xlarge-50:29215:29783 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-50:29215:29722 [0] NCCL INFO Channel 02/0 : 28[901c0] -> 40[101c0] [receive] via NET/Socket/0
gpu-st-p4d-24xlarge-50:29215:29783 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-50:29215:29722 [0] NCCL INFO Channel 03/0 : 28[901c0] -> 40[101c0] [receive] via NET/Socket/1
gpu-st-p4d-24xlarge-51:30344:30902 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-51:30344:30851 [4] NCCL INFO Channel 06/0 : 20[901c0] -> 44[901c0] [receive] via NET/Socket/2
gpu-st-p4d-24xlarge-51:30344:30902 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-51:30344:30851 [4] NCCL INFO Channel 07/0 : 20[901c0] -> 44[901c0] [receive] via NET/Socket/3
gpu-st-p4d-24xlarge-55:29154:29703 [2] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-55:29154:29667 [2] NCCL INFO Channel 04/0 : 54[a01c0] -> 66[201c0] [receive] via NET/Socket/0
gpu-st-p4d-24xlarge-55:29154:29703 [2] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-55:29154:29667 [2] NCCL INFO Channel 05/0 : 54[a01c0] -> 66[201c0] [receive] via NET/Socket/1
gpu-st-p4d-24xlarge-50:29211:29719 [4] NCCL INFO Channel 00/0 : 36[901c0] -> 42[201c0] [send] via NET/Socket/2
gpu-st-p4d-24xlarge-50:29211:29719 [4] NCCL INFO Channel 01/0 : 36[901c0] -> 42[201c0] [send] via NET/Socket/3
gpu-st-p4d-24xlarge-54:29519:30067 [1] NCCL INFO Channel 02/0 : 65[101d0] -> 56[101c0] [send] via NET/Socket/0
gpu-st-p4d-24xlarge-54:29519:30067 [1] NCCL INFO Channel 03/0 : 65[101d0] -> 56[101c0] [send] via NET/Socket/1
gpu-st-p4d-24xlarge-53:26986:27498 [6] NCCL INFO Channel 00/0 : 54[a01c0] -> 61[901d0] [send] via NET/Socket/2
gpu-st-p4d-24xlarge-53:26986:27498 [6] NCCL INFO Channel 01/0 : 54[a01c0] -> 61[901d0] [send] via NET/Socket/3
gpu-st-p4d-24xlarge-49:25776:26284 [6] NCCL INFO Channel 00/0 : 30[a01c0] -> 37[901d0] [send] via NET/Socket/2
gpu-st-p4d-24xlarge-49:25776:26284 [6] NCCL INFO Channel 01/0 : 30[a01c0] -> 37[901d0] [send] via NET/Socket/3
gpu-st-p4d-24xlarge-54:29518:30070 [0] NCCL INFO Channel 02/0 : 64[101c0] -> 77[901d0] [send] via NET/Socket/0
gpu-st-p4d-24xlarge-54:29518:30070 [0] NCCL INFO Channel 03/0 : 64[101c0] -> 77[901d0] [send] via NET/Socket/1
gpu-st-p4d-24xlarge-52:39397:39920 [0] NCCL INFO Channel 00/0 : 48[101c0] -> 72[101c0] [send] via NET/Socket/0
gpu-st-p4d-24xlarge-52:39397:39920 [0] NCCL INFO Channel 01/0 : 48[101c0] -> 72[101c0] [send] via NET/Socket/1
gpu-st-p4d-24xlarge-55:29156:29706 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-55:29156:29664 [4] NCCL INFO Channel 06/0 : 56[101c0] -> 68[901c0] [receive] via NET/Socket/2
gpu-st-p4d-24xlarge-55:29156:29706 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-55:29156:29664 [4] NCCL INFO Channel 07/0 : 56[101c0] -> 68[901c0] [receive] via NET/Socket/3
gpu-st-p4d-24xlarge-45:30135:30642 [6] NCCL INFO Channel 00/0 : 6[a01c0] -> 13[901d0] [send] via NET/Socket/2
gpu-st-p4d-24xlarge-44:28065:28962 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-44:28065:28903 [0] NCCL INFO Channel 00/0 : 48[101c0] -> 0[101c0] [receive] via NET/Socket/0
gpu-st-p4d-24xlarge-44:28065:28962 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-44:28065:28903 [0] NCCL INFO Channel 01/0 : 48[101c0] -> 0[101c0] [receive] via NET/Socket/1
gpu-st-p4d-24xlarge-44:28065:28903 [0] NCCL INFO Channel 00/0 : 0[101c0] -> 48[101c0] [send] via NET/Socket/0
gpu-st-p4d-24xlarge-44:28065:28903 [0] NCCL INFO Channel 01/0 : 0[101c0] -> 48[101c0] [send] via NET/Socket/1
gpu-st-p4d-24xlarge-45:30135:30642 [6] NCCL INFO Channel 01/0 : 6[a01c0] -> 13[901d0] [send] via NET/Socket/3
gpu-st-p4d-24xlarge-48:30195:30703 [4] NCCL INFO Channel 02/0 : 28[901c0] -> 40[101c0] [send] via NET/Socket/2
gpu-st-p4d-24xlarge-48:30195:30703 [4] NCCL INFO Channel 03/0 : 28[901c0] -> 40[101c0] [send] via NET/Socket/3
gpu-st-p4d-24xlarge-59:27918:28693 [2] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-59:27918:28517 [2] NCCL INFO Channel 04/0 : 42[201c0] -> 90[201c0] [receive] via NET/Socket/0
gpu-st-p4d-24xlarge-59:27918:28693 [2] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-59:27918:28517 [2] NCCL INFO Channel 05/0 : 42[201c0] -> 90[201c0] [receive] via NET/Socket/1
gpu-st-p4d-24xlarge-59:27918:28517 [2] NCCL INFO Channel 04/0 : 90[201c0] -> 42[201c0] [send] via NET/Socket/0
gpu-st-p4d-24xlarge-59:27918:28517 [2] NCCL INFO Channel 05/0 : 90[201c0] -> 42[201c0] [send] via NET/Socket/1
gpu-st-p4d-24xlarge-58:27719:28283 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-58:27719:28237 [4] NCCL INFO Channel 00/0 : 72[101c0] -> 84[901c0] [receive] via NET/Socket/2
gpu-st-p4d-24xlarge-58:27719:28283 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-58:27719:28237 [4] NCCL INFO Channel 01/0 : 72[101c0] -> 84[901c0] [receive] via NET/Socket/3
gpu-st-p4d-24xlarge-46:30081:30633 [4] NCCL INFO Channel 00/0 : 12[901c0] -> 25[101d0] [send] via NET/Socket/2
gpu-st-p4d-24xlarge-46:30081:30633 [4] NCCL INFO Channel 01/0 : 12[901c0] -> 25[101d0] [send] via NET/Socket/3
gpu-st-p4d-24xlarge-45:30137:30647 [0] NCCL INFO Channel 06/0 : 8[101c0] -> 20[901c0] [send] via NET/Socket/0
gpu-st-p4d-24xlarge-45:30137:30647 [0] NCCL INFO Channel 07/0 : 8[101c0] -> 20[901c0] [send] via NET/Socket/1
gpu-st-p4d-24xlarge-56:28171:28973 [0] NCCL INFO Channel 00/0 : 72[101c0] -> 84[901c0] [send] via NET/Socket/0
gpu-st-p4d-24xlarge-56:28175:28972 [4] NCCL INFO Channel 02/0 : 76[901c0] -> 88[101c0] [send] via NET/Socket/2
gpu-st-p4d-24xlarge-56:28171:28973 [0] NCCL INFO Channel 01/0 : 72[101c0] -> 84[901c0] [send] via NET/Socket/1
gpu-st-p4d-24xlarge-56:28175:28972 [4] NCCL INFO Channel 03/0 : 76[901c0] -> 88[101c0] [send] via NET/Socket/3
gpu-st-p4d-24xlarge-46:30085:30637 [0] NCCL INFO Channel 02/0 : 16[101c0] -> 29[901d0] [send] via NET/Socket/0
gpu-st-p4d-24xlarge-46:30085:30637 [0] NCCL INFO Channel 03/0 : 16[101c0] -> 29[901d0] [send] via NET/Socket/1
gpu-st-p4d-24xlarge-49:25778:26346 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-49:25778:26280 [0] NCCL INFO Channel 06/0 : 21[901d0] -> 32[101c0] [receive] via NET/Socket/0
gpu-st-p4d-24xlarge-49:25778:26346 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-49:25778:26280 [0] NCCL INFO Channel 07/0 : 21[901d0] -> 32[101c0] [receive] via NET/Socket/1
gpu-st-p4d-24xlarge-57:27793:28309 [0] NCCL INFO Channel 02/0 : 80[101c0] -> 89[101d0] [send] via NET/Socket/0
gpu-st-p4d-24xlarge-57:27791:28308 [6] NCCL INFO Channel 00/0 : 78[a01c0] -> 85[901d0] [send] via NET/Socket/2
gpu-st-p4d-24xlarge-57:27793:28309 [0] NCCL INFO Channel 03/0 : 80[101c0] -> 89[101d0] [send] via NET/Socket/1
gpu-st-p4d-24xlarge-57:27791:28308 [6] NCCL INFO Channel 01/0 : 78[a01c0] -> 85[901d0] [send] via NET/Socket/3
gpu-st-p4d-24xlarge-47:30190:30756 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-48:30191:30698 [0] NCCL INFO Channel 00/0 : 24[101c0] -> 36[901c0] [send] via NET/Socket/0
gpu-st-p4d-24xlarge-48:30191:30698 [0] NCCL INFO Channel 01/0 : 24[101c0] -> 36[901c0] [send] via NET/Socket/1
gpu-st-p4d-24xlarge-47:30190:30703 [4] NCCL INFO Channel 06/0 : 8[101c0] -> 20[901c0] [receive] via NET/Socket/2
gpu-st-p4d-24xlarge-47:30190:30756 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-47:30190:30703 [4] NCCL INFO Channel 07/0 : 8[101c0] -> 20[901c0] [receive] via NET/Socket/3
gpu-st-p4d-24xlarge-53:26988:27497 [0] NCCL INFO Channel 06/0 : 56[101c0] -> 68[901c0] [send] via NET/Socket/0
gpu-st-p4d-24xlarge-53:26988:27497 [0] NCCL INFO Channel 07/0 : 56[101c0] -> 68[901c0] [send] via NET/Socket/1
gpu-st-p4d-24xlarge-54:29514:30119 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-54:29514:30069 [4] NCCL INFO Channel 00/0 : 73[101d0] -> 60[901c0] [receive] via NET/Socket/2
gpu-st-p4d-24xlarge-54:29514:30119 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-54:29514:30069 [4] NCCL INFO Channel 01/0 : 73[101d0] -> 60[901c0] [receive] via NET/Socket/3
gpu-st-p4d-24xlarge-58:27724:28235 [1] NCCL INFO Channel 02/0 : 89[101d0] -> 80[101c0] [send] via NET/Socket/0
gpu-st-p4d-24xlarge-58:27724:28235 [1] NCCL INFO Channel 03/0 : 89[101d0] -> 80[101c0] [send] via NET/Socket/1
gpu-st-p4d-24xlarge-56:28172:28968 [1] NCCL INFO Channel 00/0 : 73[101d0] -> 60[901c0] [send] via NET/Socket/0
gpu-st-p4d-24xlarge-56:28172:28968 [1] NCCL INFO Channel 01/0 : 73[101d0] -> 60[901c0] [send] via NET/Socket/1
gpu-st-p4d-24xlarge-50:29212:29721 [5] NCCL INFO Channel 00/0 : 37[901d0] -> 30[a01c0] [send] via NET/Socket/2
gpu-st-p4d-24xlarge-50:29212:29721 [5] NCCL INFO Channel 01/0 : 37[901d0] -> 30[a01c0] [send] via NET/Socket/3
gpu-st-p4d-24xlarge-50:29211:29780 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-50:29211:29719 [4] NCCL INFO Channel 00/0 : 24[101c0] -> 36[901c0] [receive] via NET/Socket/2
gpu-st-p4d-24xlarge-50:29211:29780 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-50:29211:29719 [4] NCCL INFO Channel 01/0 : 24[101c0] -> 36[901c0] [receive] via NET/Socket/3
gpu-st-p4d-24xlarge-49:25776:26344 [6] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-49:25776:26284 [6] NCCL INFO Channel 04/0 : 19[201d0] -> 30[a01c0] [receive] via NET/Socket/2
gpu-st-p4d-24xlarge-49:25776:26344 [6] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-49:25776:26284 [6] NCCL INFO Channel 05/0 : 19[201d0] -> 30[a01c0] [receive] via NET/Socket/3
gpu-st-p4d-24xlarge-53:26986:27498 [6] NCCL INFO Channel 04/0 : 54[a01c0] -> 66[201c0] [send] via NET/Socket/2
gpu-st-p4d-24xlarge-53:26986:27498 [6] NCCL INFO Channel 05/0 : 54[a01c0] -> 66[201c0] [send] via NET/Socket/3
gpu-st-p4d-24xlarge-51:30342:30903 [2] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-51:30342:30850 [2] NCCL INFO Channel 04/0 : 18[201c0] -> 42[201c0] [receive] via NET/Socket/0
gpu-st-p4d-24xlarge-51:30342:30903 [2] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-51:30342:30850 [2] NCCL INFO Channel 05/0 : 18[201c0] -> 42[201c0] [receive] via NET/Socket/1
gpu-st-p4d-24xlarge-56:28176:28971 [5] NCCL INFO Channel 02/0 : 77[901d0] -> 64[101c0] [send] via NET/Socket/2
gpu-st-p4d-24xlarge-56:28176:28971 [5] NCCL INFO Channel 03/0 : 77[901d0] -> 64[101c0] [send] via NET/Socket/3
gpu-st-p4d-24xlarge-54:29518:30122 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-54:29518:30070 [0] NCCL INFO Channel 02/0 : 77[901d0] -> 64[101c0] [receive] via NET/Socket/0
gpu-st-p4d-24xlarge-54:29518:30122 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-54:29518:30070 [0] NCCL INFO Channel 03/0 : 77[901d0] -> 64[101c0] [receive] via NET/Socket/1
gpu-st-p4d-24xlarge-54:29515:30066 [5] NCCL INFO Channel 00/0 : 61[901d0] -> 54[a01c0] [send] via NET/Socket/2
gpu-st-p4d-24xlarge-54:29515:30066 [5] NCCL INFO Channel 01/0 : 61[901d0] -> 54[a01c0] [send] via NET/Socket/3
gpu-st-p4d-24xlarge-56:28172:28968 [1] NCCL INFO Channel 00 : 73[101d0] -> 72[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-56:28172:28968 [1] NCCL INFO Channel 01 : 73[101d0] -> 72[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-56:28172:28968 [1] NCCL INFO Channel 04 : 73[101d0] -> 72[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-56:28172:28968 [1] NCCL INFO Channel 05 : 73[101d0] -> 72[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-46:30082:30636 [5] NCCL INFO Channel 00/0 : 13[901d0] -> 6[a01c0] [send] via NET/Socket/2
gpu-st-p4d-24xlarge-46:30082:30636 [5] NCCL INFO Channel 01/0 : 13[901d0] -> 6[a01c0] [send] via NET/Socket/3
gpu-st-p4d-24xlarge-45:30135:30642 [6] NCCL INFO Channel 04/0 : 6[a01c0] -> 18[201c0] [send] via NET/Socket/2
gpu-st-p4d-24xlarge-45:30135:30642 [6] NCCL INFO Channel 05/0 : 6[a01c0] -> 18[201c0] [send] via NET/Socket/3
gpu-st-p4d-24xlarge-56:28173:28967 [2] NCCL INFO Connected all trees
gpu-st-p4d-24xlarge-56:28173:28967 [2] NCCL INFO threadThresholds 8/8/64 | 768/8/64 | 8/8/512
gpu-st-p4d-24xlarge-48:30195:30703 [4] NCCL INFO Channel 02/0 : 28[901c0] -> 53[901d0] [send] via NET/Socket/2
gpu-st-p4d-24xlarge-48:30195:30703 [4] NCCL INFO Channel 03/0 : 28[901c0] -> 53[901d0] [send] via NET/Socket/3
gpu-st-p4d-24xlarge-56:28173:28967 [2] NCCL INFO 8 coll channels, 8 p2p channels, 2 p2p channels per peer
gpu-st-p4d-24xlarge-56:28176:28971 [5] NCCL INFO Channel 02 : 77[901d0] -> 72[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-56:28176:28971 [5] NCCL INFO Channel 03 : 77[901d0] -> 72[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-56:28176:28971 [5] NCCL INFO Channel 06 : 77[901d0] -> 72[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-56:28176:28971 [5] NCCL INFO Channel 07 : 77[901d0] -> 72[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-50:29215:29722 [0] NCCL INFO Channel 02/0 : 40[101c0] -> 28[901c0] [send] via NET/Socket/0
gpu-st-p4d-24xlarge-50:29215:29722 [0] NCCL INFO Channel 03/0 : 40[101c0] -> 28[901c0] [send] via NET/Socket/1
gpu-st-p4d-24xlarge-46:30081:30686 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-46:30081:30633 [4] NCCL INFO Channel 00/0 : 25[101d0] -> 12[901c0] [receive] via NET/Socket/2
gpu-st-p4d-24xlarge-46:30081:30686 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-48:30192:30700 [1] NCCL INFO Channel 00/0 : 25[101d0] -> 12[901c0] [send] via NET/Socket/0
gpu-st-p4d-24xlarge-46:30081:30633 [4] NCCL INFO Channel 01/0 : 25[101d0] -> 12[901c0] [receive] via NET/Socket/3
gpu-st-p4d-24xlarge-48:30192:30700 [1] NCCL INFO Channel 01/0 : 25[101d0] -> 12[901c0] [send] via NET/Socket/1
gpu-st-p4d-24xlarge-56:28175:29019 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-56:28175:28972 [4] NCCL INFO Channel 02/0 : 52[901c0] -> 76[901c0] [receive] via NET/Socket/2
gpu-st-p4d-24xlarge-56:28175:29019 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-56:28175:28972 [4] NCCL INFO Channel 03/0 : 52[901c0] -> 76[901c0] [receive] via NET/Socket/3
gpu-st-p4d-24xlarge-47:30191:30759 [5] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-47:30191:30699 [5] NCCL INFO Channel 06/0 : 32[101c0] -> 21[901d0] [receive] via NET/Socket/2
gpu-st-p4d-24xlarge-47:30191:30759 [5] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-47:30191:30699 [5] NCCL INFO Channel 07/0 : 32[101c0] -> 21[901d0] [receive] via NET/Socket/3
gpu-st-p4d-24xlarge-57:27791:28450 [6] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-57:27793:28455 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-57:27791:28308 [6] NCCL INFO Channel 04/0 : 67[201d0] -> 78[a01c0] [receive] via NET/Socket/2
gpu-st-p4d-24xlarge-57:27793:28309 [0] NCCL INFO Channel 06/0 : 69[901d0] -> 80[101c0] [receive] via NET/Socket/0
gpu-st-p4d-24xlarge-57:27791:28450 [6] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-57:27791:28308 [6] NCCL INFO Channel 05/0 : 67[201d0] -> 78[a01c0] [receive] via NET/Socket/3
gpu-st-p4d-24xlarge-57:27793:28455 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-57:27793:28309 [0] NCCL INFO Channel 07/0 : 69[901d0] -> 80[101c0] [receive] via NET/Socket/1
gpu-st-p4d-24xlarge-56:28171:29023 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-56:28171:28973 [0] NCCL INFO Channel 00/0 : 48[101c0] -> 72[101c0] [receive] via NET/Socket/0
gpu-st-p4d-24xlarge-56:28171:29023 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-56:28171:28973 [0] NCCL INFO Channel 01/0 : 48[101c0] -> 72[101c0] [receive] via NET/Socket/1
gpu-st-p4d-24xlarge-58:27719:28237 [4] NCCL INFO Channel 00/0 : 84[901c0] -> 72[101c0] [send] via NET/Socket/2
gpu-st-p4d-24xlarge-50:29211:29719 [4] NCCL INFO Channel 00/0 : 36[901c0] -> 24[101c0] [send] via NET/Socket/2
gpu-st-p4d-24xlarge-48:30196:30702 [5] NCCL INFO Channel 02/0 : 29[901d0] -> 16[101c0] [send] via NET/Socket/2
gpu-st-p4d-24xlarge-47:30189:30757 [3] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-55:29155:29701 [3] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-46:30085:30688 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-58:27719:28237 [4] NCCL INFO Channel 01/0 : 84[901c0] -> 72[101c0] [send] via NET/Socket/3
gpu-st-p4d-24xlarge-50:29211:29719 [4] NCCL INFO Channel 01/0 : 36[901c0] -> 24[101c0] [send] via NET/Socket/3
gpu-st-p4d-24xlarge-48:30196:30702 [5] NCCL INFO Channel 03/0 : 29[901d0] -> 16[101c0] [send] via NET/Socket/3
gpu-st-p4d-24xlarge-48:30192:30700 [1] NCCL INFO Channel 00 : 25[101d0] -> 24[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29155:29662 [3] NCCL INFO Channel 04/0 : 78[a01c0] -> 67[201d0] [receive] via NET/Socket/0
gpu-st-p4d-24xlarge-58:27723:28236 [0] NCCL INFO Channel 02/0 : 88[101c0] -> 76[901c0] [send] via NET/Socket/0
gpu-st-p4d-24xlarge-58:27720:28232 [5] NCCL INFO Channel 00/0 : 85[901d0] -> 78[a01c0] [send] via NET/Socket/2
gpu-st-p4d-24xlarge-48:30192:30700 [1] NCCL INFO Channel 01 : 25[101d0] -> 24[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29155:29701 [3] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-58:27723:28236 [0] NCCL INFO Channel 03/0 : 88[101c0] -> 76[901c0] [send] via NET/Socket/1
gpu-st-p4d-24xlarge-48:30192:30700 [1] NCCL INFO Channel 04 : 25[101d0] -> 24[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29155:29662 [3] NCCL INFO Channel 05/0 : 78[a01c0] -> 67[201d0] [receive] via NET/Socket/1
gpu-st-p4d-24xlarge-58:27720:28232 [5] NCCL INFO Channel 01/0 : 85[901d0] -> 78[a01c0] [send] via NET/Socket/3
gpu-st-p4d-24xlarge-48:30192:30700 [1] NCCL INFO Channel 05 : 25[101d0] -> 24[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29157:29702 [5] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-48:30193:30701 [2] NCCL INFO Connected all trees
gpu-st-p4d-24xlarge-55:29157:29663 [5] NCCL INFO Channel 06/0 : 80[101c0] -> 69[901d0] [receive] via NET/Socket/2
gpu-st-p4d-24xlarge-48:30193:30701 [2] NCCL INFO threadThresholds 8/8/64 | 768/8/64 | 8/8/512
gpu-st-p4d-24xlarge-55:29157:29702 [5] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-48:30193:30701 [2] NCCL INFO 8 coll channels, 8 p2p channels, 2 p2p channels per peer
gpu-st-p4d-24xlarge-55:29157:29663 [5] NCCL INFO Channel 07/0 : 80[101c0] -> 69[901d0] [receive] via NET/Socket/3
gpu-st-p4d-24xlarge-48:30196:30702 [5] NCCL INFO Channel 02 : 29[901d0] -> 24[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-48:30196:30702 [5] NCCL INFO Channel 03 : 29[901d0] -> 24[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-48:30196:30702 [5] NCCL INFO Channel 06 : 29[901d0] -> 24[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-48:30196:30702 [5] NCCL INFO Channel 07 : 29[901d0] -> 24[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-48:30191:30698 [0] NCCL INFO Channel 00/0 : 24[101c0] -> 49[101d0] [send] via NET/Socket/0
gpu-st-p4d-24xlarge-48:30191:30698 [0] NCCL INFO Channel 01/0 : 24[101c0] -> 49[101d0] [send] via NET/Socket/1
gpu-st-p4d-24xlarge-47:30189:30700 [3] NCCL INFO Channel 04/0 : 30[a01c0] -> 19[201d0] [receive] via NET/Socket/0
gpu-st-p4d-24xlarge-47:30189:30757 [3] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-47:30189:30700 [3] NCCL INFO Channel 05/0 : 30[a01c0] -> 19[201d0] [receive] via NET/Socket/1
gpu-st-p4d-24xlarge-46:30085:30637 [0] NCCL INFO Channel 02/0 : 29[901d0] -> 16[101c0] [receive] via NET/Socket/0
gpu-st-p4d-24xlarge-46:30085:30688 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-46:30085:30637 [0] NCCL INFO Channel 03/0 : 29[901d0] -> 16[101c0] [receive] via NET/Socket/1
gpu-st-p4d-24xlarge-45:30137:30701 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-45:30137:30647 [0] NCCL INFO Channel 06/0 : 20[901c0] -> 8[101c0] [receive] via NET/Socket/0
gpu-st-p4d-24xlarge-45:30137:30701 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-45:30137:30647 [0] NCCL INFO Channel 07/0 : 20[901c0] -> 8[101c0] [receive] via NET/Socket/1
gpu-st-p4d-24xlarge-49:25778:26280 [0] NCCL INFO Channel 06/0 : 32[101c0] -> 21[901d0] [send] via NET/Socket/0
gpu-st-p4d-24xlarge-53:26988:27563 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-47:30190:30703 [4] NCCL INFO Channel 06/0 : 20[901c0] -> 44[901c0] [send] via NET/Socket/2
gpu-st-p4d-24xlarge-49:25778:26280 [0] NCCL INFO Channel 07/0 : 32[101c0] -> 21[901d0] [send] via NET/Socket/1
gpu-st-p4d-24xlarge-47:30190:30703 [4] NCCL INFO Channel 07/0 : 20[901c0] -> 44[901c0] [send] via NET/Socket/3
gpu-st-p4d-24xlarge-53:26988:27497 [0] NCCL INFO Channel 06/0 : 68[901c0] -> 56[101c0] [receive] via NET/Socket/0
gpu-st-p4d-24xlarge-53:26988:27563 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-53:26988:27497 [0] NCCL INFO Channel 07/0 : 68[901c0] -> 56[101c0] [receive] via NET/Socket/1
gpu-st-p4d-24xlarge-53:26986:27559 [6] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-53:26986:27498 [6] NCCL INFO Channel 04/0 : 66[201c0] -> 54[a01c0] [receive] via NET/Socket/2
gpu-st-p4d-24xlarge-53:26986:27559 [6] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-53:26986:27498 [6] NCCL INFO Channel 05/0 : 66[201c0] -> 54[a01c0] [receive] via NET/Socket/3
gpu-st-p4d-24xlarge-55:29156:29706 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-55:29156:29664 [4] NCCL INFO Channel 06/0 : 45[901d0] -> 68[901c0] [receive] via NET/Socket/2
gpu-st-p4d-24xlarge-55:29156:29706 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-55:29156:29664 [4] NCCL INFO Channel 07/0 : 45[901d0] -> 68[901c0] [receive] via NET/Socket/3
gpu-st-p4d-24xlarge-52:39401:39973 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-52:39397:39974 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-52:39401:39921 [4] NCCL INFO Channel 02/0 : 4[901c0] -> 52[901c0] [receive] via NET/Socket/2
gpu-st-p4d-24xlarge-52:39397:39920 [0] NCCL INFO Channel 00/0 : 0[101c0] -> 48[101c0] [receive] via NET/Socket/0
gpu-st-p4d-24xlarge-52:39397:39974 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-52:39401:39973 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-52:39397:39920 [0] NCCL INFO Channel 01/0 : 0[101c0] -> 48[101c0] [receive] via NET/Socket/1
gpu-st-p4d-24xlarge-52:39401:39921 [4] NCCL INFO Channel 03/0 : 4[901c0] -> 52[901c0] [receive] via NET/Socket/3
gpu-st-p4d-24xlarge-52:39397:39920 [0] NCCL INFO Channel 00/0 : 48[101c0] -> 0[101c0] [send] via NET/Socket/0
gpu-st-p4d-24xlarge-52:39401:39921 [4] NCCL INFO Channel 02/0 : 52[901c0] -> 4[901c0] [send] via NET/Socket/2
gpu-st-p4d-24xlarge-52:39397:39920 [0] NCCL INFO Channel 01/0 : 48[101c0] -> 0[101c0] [send] via NET/Socket/1
gpu-st-p4d-24xlarge-52:39401:39921 [4] NCCL INFO Channel 03/0 : 52[901c0] -> 4[901c0] [send] via NET/Socket/3
gpu-st-p4d-24xlarge-55:29154:29703 [2] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-55:29154:29667 [2] NCCL INFO Channel 04/0 : 43[201d0] -> 66[201c0] [receive] via NET/Socket/0
gpu-st-p4d-24xlarge-55:29154:29703 [2] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-55:29154:29667 [2] NCCL INFO Channel 05/0 : 43[201d0] -> 66[201c0] [receive] via NET/Socket/1
gpu-st-p4d-24xlarge-47:30191:30699 [5] NCCL INFO Channel 02 : 21[901d0] -> 20[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25776:26284 [6] NCCL INFO Channel 04/0 : 30[a01c0] -> 19[201d0] [send] via NET/Socket/2
gpu-st-p4d-24xlarge-49:25776:26284 [6] NCCL INFO Channel 05/0 : 30[a01c0] -> 19[201d0] [send] via NET/Socket/3
gpu-st-p4d-24xlarge-47:30191:30699 [5] NCCL INFO Channel 03 : 21[901d0] -> 20[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-47:30191:30699 [5] NCCL INFO Channel 06 : 21[901d0] -> 20[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-47:30191:30699 [5] NCCL INFO Channel 07 : 21[901d0] -> 20[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-47:30188:30701 [2] NCCL INFO Channel 04/0 : 18[201c0] -> 42[201c0] [send] via NET/Socket/0
gpu-st-p4d-24xlarge-47:30188:30701 [2] NCCL INFO Channel 05/0 : 18[201c0] -> 42[201c0] [send] via NET/Socket/1
gpu-st-p4d-24xlarge-45:30135:30697 [6] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-45:30135:30642 [6] NCCL INFO Channel 04/0 : 18[201c0] -> 6[a01c0] [receive] via NET/Socket/2
gpu-st-p4d-24xlarge-45:30135:30697 [6] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-45:30135:30642 [6] NCCL INFO Channel 05/0 : 18[201c0] -> 6[a01c0] [receive] via NET/Socket/3
gpu-st-p4d-24xlarge-54:29514:30119 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-54:29514:30069 [4] NCCL INFO Channel 00/0 : 66[201c0] -> 60[901c0] [receive] via NET/Socket/2
gpu-st-p4d-24xlarge-54:29514:30119 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-54:29514:30069 [4] NCCL INFO Channel 01/0 : 66[201c0] -> 60[901c0] [receive] via NET/Socket/3
gpu-st-p4d-24xlarge-46:30081:30686 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-46:30081:30633 [4] NCCL INFO Channel 00/0 : 18[201c0] -> 12[901c0] [receive] via NET/Socket/2
gpu-st-p4d-24xlarge-54:29518:30070 [0] NCCL INFO Channel 06/0 : 64[101c0] -> 57[101d0] [send] via NET/Socket/0
gpu-st-p4d-24xlarge-46:30081:30686 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-46:30081:30633 [4] NCCL INFO Channel 01/0 : 18[201c0] -> 12[901c0] [receive] via NET/Socket/3
gpu-st-p4d-24xlarge-54:29518:30070 [0] NCCL INFO Channel 07/0 : 64[101c0] -> 57[101d0] [send] via NET/Socket/1
gpu-st-p4d-24xlarge-52:39402:39919 [5] NCCL INFO Channel 02/0 : 53[901d0] -> 28[901c0] [send] via NET/Socket/2
gpu-st-p4d-24xlarge-52:39402:39919 [5] NCCL INFO Channel 03/0 : 53[901d0] -> 28[901c0] [send] via NET/Socket/3
gpu-st-p4d-24xlarge-48:30195:30738 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-48:30195:30703 [4] NCCL INFO Channel 02/0 : 53[901d0] -> 28[901c0] [receive] via NET/Socket/2
gpu-st-p4d-24xlarge-48:30195:30738 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-48:30195:30703 [4] NCCL INFO Channel 03/0 : 53[901d0] -> 28[901c0] [receive] via NET/Socket/3
gpu-st-p4d-24xlarge-47:30189:30700 [3] NCCL INFO Channel 00 : 19[201d0] -> 18[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-47:30189:30700 [3] NCCL INFO Channel 01 : 19[201d0] -> 18[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-47:30189:30700 [3] NCCL INFO Channel 04 : 19[201d0] -> 18[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-47:30189:30700 [3] NCCL INFO Channel 05 : 19[201d0] -> 18[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-46:30085:30637 [0] NCCL INFO Channel 06/0 : 16[101c0] -> 9[101d0] [send] via NET/Socket/0
gpu-st-p4d-24xlarge-46:30085:30637 [0] NCCL INFO Channel 07/0 : 16[101c0] -> 9[101d0] [send] via NET/Socket/1
gpu-st-p4d-24xlarge-56:28175:28972 [4] NCCL INFO Channel 02/0 : 76[901c0] -> 52[901c0] [send] via NET/Socket/2
gpu-st-p4d-24xlarge-56:28175:28972 [4] NCCL INFO Channel 03/0 : 76[901c0] -> 52[901c0] [send] via NET/Socket/3
gpu-st-p4d-24xlarge-52:39402:39919 [5] NCCL INFO Channel 02 : 53[901d0] -> 48[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-52:39402:39919 [5] NCCL INFO Channel 03 : 53[901d0] -> 48[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-52:39402:39919 [5] NCCL INFO Channel 06 : 53[901d0] -> 48[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-52:39402:39919 [5] NCCL INFO Channel 07 : 53[901d0] -> 48[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-45:30138:30645 [1] NCCL INFO Channel 02 : 9[101d0] -> 8[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-45:30138:30645 [1] NCCL INFO Channel 03 : 9[101d0] -> 8[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-45:30138:30645 [1] NCCL INFO Channel 06 : 9[101d0] -> 8[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-45:30138:30645 [1] NCCL INFO Channel 07 : 9[101d0] -> 8[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-53:26989:27495 [1] NCCL INFO Channel 02 : 57[101d0] -> 56[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-53:26989:27495 [1] NCCL INFO Channel 03 : 57[101d0] -> 56[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-53:26989:27495 [1] NCCL INFO Channel 06 : 57[101d0] -> 56[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25778:26346 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-49:25778:26280 [0] NCCL INFO Channel 02/0 : 41[101d0] -> 32[101c0] [receive] via NET/Socket/0
gpu-st-p4d-24xlarge-49:25778:26346 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-49:25778:26280 [0] NCCL INFO Channel 03/0 : 41[101d0] -> 32[101c0] [receive] via NET/Socket/1
gpu-st-p4d-24xlarge-53:26989:27495 [1] NCCL INFO Channel 07 : 57[101d0] -> 56[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-57:27793:28309 [0] NCCL INFO Channel 06/0 : 80[101c0] -> 69[901d0] [send] via NET/Socket/0
gpu-st-p4d-24xlarge-57:27793:28309 [0] NCCL INFO Channel 07/0 : 80[101c0] -> 69[901d0] [send] via NET/Socket/1
gpu-st-p4d-24xlarge-56:28171:28973 [0] NCCL INFO Channel 00/0 : 72[101c0] -> 48[101c0] [send] via NET/Socket/0
gpu-st-p4d-24xlarge-56:28171:28973 [0] NCCL INFO Channel 01/0 : 72[101c0] -> 48[101c0] [send] via NET/Socket/1
gpu-st-p4d-24xlarge-57:27791:28308 [6] NCCL INFO Channel 04/0 : 78[a01c0] -> 67[201d0] [send] via NET/Socket/2
gpu-st-p4d-24xlarge-57:27791:28308 [6] NCCL INFO Channel 05/0 : 78[a01c0] -> 67[201d0] [send] via NET/Socket/3
gpu-st-p4d-24xlarge-48:30191:30741 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-48:30191:30698 [0] NCCL INFO Channel 00/0 : 49[101d0] -> 24[101c0] [receive] via NET/Socket/0
gpu-st-p4d-24xlarge-48:30191:30741 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-48:30191:30698 [0] NCCL INFO Channel 01/0 : 49[101d0] -> 24[101c0] [receive] via NET/Socket/1
gpu-st-p4d-24xlarge-51:30345:30900 [5] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-51:30345:30854 [5] NCCL INFO Channel 06/0 : 68[901c0] -> 45[901d0] [receive] via NET/Socket/2
gpu-st-p4d-24xlarge-51:30345:30900 [5] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-51:30345:30854 [5] NCCL INFO Channel 07/0 : 68[901c0] -> 45[901d0] [receive] via NET/Socket/3
gpu-st-p4d-24xlarge-52:39398:39917 [1] NCCL INFO Channel 00/0 : 49[101d0] -> 24[101c0] [send] via NET/Socket/0
gpu-st-p4d-24xlarge-52:39398:39917 [1] NCCL INFO Channel 01/0 : 49[101d0] -> 24[101c0] [send] via NET/Socket/1
gpu-st-p4d-24xlarge-50:29216:29724 [1] NCCL INFO Channel 02 : 41[101d0] -> 36[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-50:29216:29724 [1] NCCL INFO Channel 03 : 41[101d0] -> 36[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-50:29216:29724 [1] NCCL INFO Channel 06 : 41[101d0] -> 36[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-50:29216:29724 [1] NCCL INFO Channel 07 : 41[101d0] -> 36[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25776:26344 [6] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-51:30343:30905 [3] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-51:30343:30849 [3] NCCL INFO Channel 04/0 : 66[201c0] -> 43[201d0] [receive] via NET/Socket/0
gpu-st-p4d-24xlarge-51:30343:30905 [3] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-51:30343:30849 [3] NCCL INFO Channel 05/0 : 66[201c0] -> 43[201d0] [receive] via NET/Socket/1
gpu-st-p4d-24xlarge-49:25776:26284 [6] NCCL INFO Channel 00/0 : 37[901d0] -> 30[a01c0] [receive] via NET/Socket/2
gpu-st-p4d-24xlarge-49:25776:26344 [6] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-49:25776:26284 [6] NCCL INFO Channel 01/0 : 37[901d0] -> 30[a01c0] [receive] via NET/Socket/3
gpu-st-p4d-24xlarge-55:29157:29663 [5] NCCL INFO Channel 02 : 69[901d0] -> 68[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29157:29663 [5] NCCL INFO Channel 03 : 69[901d0] -> 68[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29157:29663 [5] NCCL INFO Channel 06 : 69[901d0] -> 68[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29155:29662 [3] NCCL INFO Channel 00 : 67[201d0] -> 66[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29157:29663 [5] NCCL INFO Channel 07 : 69[901d0] -> 68[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29155:29662 [3] NCCL INFO Channel 01 : 67[201d0] -> 66[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29155:29662 [3] NCCL INFO Channel 04 : 67[201d0] -> 66[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29155:29662 [3] NCCL INFO Channel 05 : 67[201d0] -> 66[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-52:39398:39917 [1] NCCL INFO Channel 00 : 49[101d0] -> 48[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-52:39398:39917 [1] NCCL INFO Channel 01 : 49[101d0] -> 48[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-52:39398:39917 [1] NCCL INFO Channel 04 : 49[101d0] -> 48[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-52:39398:39917 [1] NCCL INFO Channel 05 : 49[101d0] -> 48[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-54:29518:30122 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-54:29518:30070 [0] NCCL INFO Channel 02/0 : 68[901c0] -> 64[101c0] [receive] via NET/Socket/0
gpu-st-p4d-24xlarge-54:29518:30122 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-54:29518:30070 [0] NCCL INFO Channel 03/0 : 68[901c0] -> 64[101c0] [receive] via NET/Socket/1
gpu-st-p4d-24xlarge-52:39399:39918 [2] NCCL INFO Connected all trees
gpu-st-p4d-24xlarge-52:39399:39918 [2] NCCL INFO threadThresholds 8/8/64 | 768/8/64 | 8/8/512
gpu-st-p4d-24xlarge-52:39399:39918 [2] NCCL INFO 8 coll channels, 8 p2p channels, 2 p2p channels per peer
gpu-st-p4d-24xlarge-46:30085:30688 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-46:30085:30637 [0] NCCL INFO Channel 02/0 : 20[901c0] -> 16[101c0] [receive] via NET/Socket/0
gpu-st-p4d-24xlarge-46:30085:30688 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-46:30085:30637 [0] NCCL INFO Channel 03/0 : 20[901c0] -> 16[101c0] [receive] via NET/Socket/1
gpu-st-p4d-24xlarge-48:30195:30738 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-48:30195:30703 [4] NCCL INFO Channel 02/0 : 40[101c0] -> 28[901c0] [receive] via NET/Socket/2
gpu-st-p4d-24xlarge-48:30195:30738 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-48:30195:30703 [4] NCCL INFO Channel 03/0 : 40[101c0] -> 28[901c0] [receive] via NET/Socket/3
gpu-st-p4d-24xlarge-50:29212:29721 [5] NCCL INFO Channel 00 : 37[901d0] -> 36[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-50:29212:29721 [5] NCCL INFO Channel 01 : 37[901d0] -> 36[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-50:29212:29721 [5] NCCL INFO Channel 04 : 37[901d0] -> 36[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-50:29212:29721 [5] NCCL INFO Channel 05 : 37[901d0] -> 36[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-51:30342:30903 [2] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-51:30344:30902 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-51:30342:30850 [2] NCCL INFO Channel 04/0 : 90[201c0] -> 42[201c0] [receive] via NET/Socket/0
gpu-st-p4d-24xlarge-51:30344:30851 [4] NCCL INFO Channel 06/0 : 92[901c0] -> 44[901c0] [receive] via NET/Socket/2
gpu-st-p4d-24xlarge-51:30342:30903 [2] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-51:30342:30850 [2] NCCL INFO Channel 05/0 : 90[201c0] -> 42[201c0] [receive] via NET/Socket/1
gpu-st-p4d-24xlarge-51:30344:30902 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-51:30344:30851 [4] NCCL INFO Channel 07/0 : 92[901c0] -> 44[901c0] [receive] via NET/Socket/3
gpu-st-p4d-24xlarge-51:30342:30850 [2] NCCL INFO Channel 04/0 : 42[201c0] -> 90[201c0] [send] via NET/Socket/0
gpu-st-p4d-24xlarge-51:30344:30851 [4] NCCL INFO Channel 06/0 : 44[901c0] -> 92[901c0] [send] via NET/Socket/2
gpu-st-p4d-24xlarge-51:30342:30850 [2] NCCL INFO Channel 05/0 : 42[201c0] -> 90[201c0] [send] via NET/Socket/1
gpu-st-p4d-24xlarge-51:30344:30851 [4] NCCL INFO Channel 07/0 : 44[901c0] -> 92[901c0] [send] via NET/Socket/3
gpu-st-p4d-24xlarge-50:29213:29720 [6] NCCL INFO Connected all trees
gpu-st-p4d-24xlarge-50:29213:29720 [6] NCCL INFO threadThresholds 8/8/64 | 768/8/64 | 8/8/512
gpu-st-p4d-24xlarge-50:29213:29720 [6] NCCL INFO 8 coll channels, 8 p2p channels, 2 p2p channels per peer
gpu-st-p4d-24xlarge-47:30190:30756 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-47:30190:30703 [4] NCCL INFO Channel 06/0 : 44[901c0] -> 20[901c0] [receive] via NET/Socket/2
gpu-st-p4d-24xlarge-47:30190:30756 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-47:30190:30703 [4] NCCL INFO Channel 07/0 : 44[901c0] -> 20[901c0] [receive] via NET/Socket/3
gpu-st-p4d-24xlarge-47:30188:30760 [2] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-47:30188:30701 [2] NCCL INFO Channel 04/0 : 42[201c0] -> 18[201c0] [receive] via NET/Socket/0
gpu-st-p4d-24xlarge-47:30188:30760 [2] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-47:30188:30701 [2] NCCL INFO Channel 05/0 : 42[201c0] -> 18[201c0] [receive] via NET/Socket/1
gpu-st-p4d-24xlarge-52:39401:39973 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-52:39401:39921 [4] NCCL INFO Channel 02/0 : 76[901c0] -> 52[901c0] [receive] via NET/Socket/2
gpu-st-p4d-24xlarge-52:39401:39973 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-52:39401:39921 [4] NCCL INFO Channel 03/0 : 76[901c0] -> 52[901c0] [receive] via NET/Socket/3
gpu-st-p4d-24xlarge-55:29156:29664 [4] NCCL INFO Channel 06/0 : 68[901c0] -> 45[901d0] [send] via NET/Socket/2
gpu-st-p4d-24xlarge-55:29156:29664 [4] NCCL INFO Channel 07/0 : 68[901c0] -> 45[901d0] [send] via NET/Socket/3
gpu-st-p4d-24xlarge-49:25778:26280 [0] NCCL INFO Channel 06/0 : 32[101c0] -> 28[901c0] [send] via NET/Socket/0
gpu-st-p4d-24xlarge-49:25778:26280 [0] NCCL INFO Channel 07/0 : 32[101c0] -> 28[901c0] [send] via NET/Socket/1
gpu-st-p4d-24xlarge-44:28069:28963 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-44:28069:28907 [4] NCCL INFO Channel 06/0 : 8[101c0] -> 4[901c0] [receive] via NET/Socket/2
gpu-st-p4d-24xlarge-44:28069:28963 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-44:28069:28907 [4] NCCL INFO Channel 07/0 : 8[101c0] -> 4[901c0] [receive] via NET/Socket/3
gpu-st-p4d-24xlarge-55:29154:29667 [2] NCCL INFO Channel 04/0 : 66[201c0] -> 43[201d0] [send] via NET/Socket/0
gpu-st-p4d-24xlarge-55:29154:29667 [2] NCCL INFO Channel 05/0 : 66[201c0] -> 43[201d0] [send] via NET/Socket/1
gpu-st-p4d-24xlarge-48:30191:30741 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-48:30191:30698 [0] NCCL INFO Channel 00/0 : 36[901c0] -> 24[101c0] [receive] via NET/Socket/0
gpu-st-p4d-24xlarge-48:30191:30741 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-48:30191:30698 [0] NCCL INFO Channel 01/0 : 36[901c0] -> 24[101c0] [receive] via NET/Socket/1
gpu-st-p4d-24xlarge-57:27791:28450 [6] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-57:27793:28455 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-57:27791:28308 [6] NCCL INFO Channel 00/0 : 85[901d0] -> 78[a01c0] [receive] via NET/Socket/2
gpu-st-p4d-24xlarge-57:27793:28309 [0] NCCL INFO Channel 02/0 : 89[101d0] -> 80[101c0] [receive] via NET/Socket/0
gpu-st-p4d-24xlarge-57:27791:28450 [6] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-57:27793:28455 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-57:27791:28308 [6] NCCL INFO Channel 01/0 : 85[901d0] -> 78[a01c0] [receive] via NET/Socket/3
gpu-st-p4d-24xlarge-57:27793:28309 [0] NCCL INFO Channel 03/0 : 89[101d0] -> 80[101c0] [receive] via NET/Socket/1
gpu-st-p4d-24xlarge-51:30345:30854 [5] NCCL INFO Channel 02 : 45[901d0] -> 44[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-51:30345:30854 [5] NCCL INFO Channel 03 : 45[901d0] -> 44[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-51:30345:30854 [5] NCCL INFO Channel 06 : 45[901d0] -> 44[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-51:30345:30854 [5] NCCL INFO Channel 07 : 45[901d0] -> 44[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-44:28065:28962 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-44:28065:28903 [0] NCCL INFO Channel 04/0 : 6[a01c0] -> 0[101c0] [receive] via NET/Socket/0
gpu-st-p4d-24xlarge-44:28065:28962 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-44:28065:28903 [0] NCCL INFO Channel 05/0 : 6[a01c0] -> 0[101c0] [receive] via NET/Socket/1
gpu-st-p4d-24xlarge-49:25776:26284 [6] NCCL INFO Channel 04/0 : 30[a01c0] -> 24[101c0] [send] via NET/Socket/2
gpu-st-p4d-24xlarge-49:25776:26284 [6] NCCL INFO Channel 05/0 : 30[a01c0] -> 24[101c0] [send] via NET/Socket/3
gpu-st-p4d-24xlarge-51:30343:30849 [3] NCCL INFO Channel 00 : 43[201d0] -> 42[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-51:30343:30849 [3] NCCL INFO Channel 01 : 43[201d0] -> 42[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-51:30343:30849 [3] NCCL INFO Channel 04 : 43[201d0] -> 42[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-51:30343:30849 [3] NCCL INFO Channel 05 : 43[201d0] -> 42[201c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-58:27724:28235 [1] NCCL INFO Channel 02 : 89[101d0] -> 84[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-58:27724:28235 [1] NCCL INFO Channel 03 : 89[101d0] -> 84[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-58:27724:28235 [1] NCCL INFO Channel 06 : 89[101d0] -> 84[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-52:39397:39974 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-52:39397:39920 [0] NCCL INFO Channel 00/0 : 72[101c0] -> 48[101c0] [receive] via NET/Socket/0
gpu-st-p4d-24xlarge-52:39397:39974 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-52:39397:39920 [0] NCCL INFO Channel 01/0 : 72[101c0] -> 48[101c0] [receive] via NET/Socket/1
gpu-st-p4d-24xlarge-58:27724:28235 [1] NCCL INFO Channel 07 : 89[101d0] -> 84[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-58:27720:28232 [5] NCCL INFO Channel 00 : 85[901d0] -> 84[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-58:27720:28232 [5] NCCL INFO Channel 01 : 85[901d0] -> 84[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-58:27720:28232 [5] NCCL INFO Channel 04 : 85[901d0] -> 84[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-58:27720:28232 [5] NCCL INFO Channel 05 : 85[901d0] -> 84[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-58:27721:28234 [6] NCCL INFO Connected all trees
gpu-st-p4d-24xlarge-58:27721:28234 [6] NCCL INFO threadThresholds 8/8/64 | 768/8/64 | 8/8/512
gpu-st-p4d-24xlarge-58:27721:28234 [6] NCCL INFO 8 coll channels, 8 p2p channels, 2 p2p channels per peer
gpu-st-p4d-24xlarge-56:28175:29019 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-56:28175:28972 [4] NCCL INFO Channel 02/0 : 88[101c0] -> 76[901c0] [receive] via NET/Socket/2
gpu-st-p4d-24xlarge-56:28175:29019 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-56:28175:28972 [4] NCCL INFO Channel 03/0 : 88[101c0] -> 76[901c0] [receive] via NET/Socket/3
gpu-st-p4d-24xlarge-52:39401:39973 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-52:39401:39921 [4] NCCL INFO Channel 06/0 : 56[101c0] -> 52[901c0] [receive] via NET/Socket/2
gpu-st-p4d-24xlarge-52:39401:39973 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-52:39401:39921 [4] NCCL INFO Channel 07/0 : 56[101c0] -> 52[901c0] [receive] via NET/Socket/3
gpu-st-p4d-24xlarge-55:29156:29664 [4] NCCL INFO Channel 06/0 : 68[901c0] -> 56[101c0] [send] via NET/Socket/2
gpu-st-p4d-24xlarge-55:29156:29664 [4] NCCL INFO Channel 07/0 : 68[901c0] -> 56[101c0] [send] via NET/Socket/3
gpu-st-p4d-24xlarge-55:29154:29667 [2] NCCL INFO Channel 04/0 : 66[201c0] -> 54[a01c0] [send] via NET/Socket/0
gpu-st-p4d-24xlarge-55:29154:29667 [2] NCCL INFO Channel 05/0 : 66[201c0] -> 54[a01c0] [send] via NET/Socket/1
gpu-st-p4d-24xlarge-57:27791:28308 [6] NCCL INFO Channel 04/0 : 78[a01c0] -> 72[101c0] [send] via NET/Socket/2
gpu-st-p4d-24xlarge-57:27793:28309 [0] NCCL INFO Channel 06/0 : 80[101c0] -> 76[901c0] [send] via NET/Socket/0
gpu-st-p4d-24xlarge-57:27791:28308 [6] NCCL INFO Channel 05/0 : 78[a01c0] -> 72[101c0] [send] via NET/Socket/3
gpu-st-p4d-24xlarge-57:27793:28309 [0] NCCL INFO Channel 07/0 : 80[101c0] -> 76[901c0] [send] via NET/Socket/1
gpu-st-p4d-24xlarge-50:29215:29722 [0] NCCL INFO Channel 06/0 : 40[101c0] -> 33[101d0] [send] via NET/Socket/0
gpu-st-p4d-24xlarge-50:29215:29722 [0] NCCL INFO Channel 07/0 : 40[101c0] -> 33[101d0] [send] via NET/Socket/1
gpu-st-p4d-24xlarge-52:39397:39974 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-52:39397:39920 [0] NCCL INFO Channel 04/0 : 54[a01c0] -> 48[101c0] [receive] via NET/Socket/0
gpu-st-p4d-24xlarge-52:39397:39974 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-52:39397:39920 [0] NCCL INFO Channel 05/0 : 54[a01c0] -> 48[101c0] [receive] via NET/Socket/1
gpu-st-p4d-24xlarge-56:28171:29023 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-56:28171:28973 [0] NCCL INFO Channel 00/0 : 84[901c0] -> 72[101c0] [receive] via NET/Socket/0
gpu-st-p4d-24xlarge-56:28171:29023 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-56:28171:28973 [0] NCCL INFO Channel 01/0 : 84[901c0] -> 72[101c0] [receive] via NET/Socket/1
gpu-st-p4d-24xlarge-48:30195:30738 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-48:30195:30703 [4] NCCL INFO Channel 06/0 : 32[101c0] -> 28[901c0] [receive] via NET/Socket/2
gpu-st-p4d-24xlarge-48:30195:30738 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-48:30195:30703 [4] NCCL INFO Channel 07/0 : 32[101c0] -> 28[901c0] [receive] via NET/Socket/3
gpu-st-p4d-24xlarge-51:30342:30850 [2] NCCL INFO Channel 04/0 : 42[201c0] -> 18[201c0] [send] via NET/Socket/0
gpu-st-p4d-24xlarge-51:30344:30851 [4] NCCL INFO Channel 06/0 : 44[901c0] -> 20[901c0] [send] via NET/Socket/2
gpu-st-p4d-24xlarge-51:30342:30850 [2] NCCL INFO Channel 05/0 : 42[201c0] -> 18[201c0] [send] via NET/Socket/1
gpu-st-p4d-24xlarge-51:30344:30851 [4] NCCL INFO Channel 07/0 : 44[901c0] -> 20[901c0] [send] via NET/Socket/3
gpu-st-p4d-24xlarge-59:27918:28517 [2] NCCL INFO Channel 00/0 : 90[201c0] -> 84[901c0] [send] via NET/Socket/0
gpu-st-p4d-24xlarge-59:27920:28519 [4] NCCL INFO Channel 02/0 : 92[901c0] -> 88[101c0] [send] via NET/Socket/2
gpu-st-p4d-24xlarge-59:27920:28519 [4] NCCL INFO Channel 03/0 : 92[901c0] -> 88[101c0] [send] via NET/Socket/3
gpu-st-p4d-24xlarge-59:27918:28517 [2] NCCL INFO Channel 01/0 : 90[201c0] -> 84[901c0] [send] via NET/Socket/1
gpu-st-p4d-24xlarge-49:25779:26285 [1] NCCL INFO Channel 02 : 33[101d0] -> 32[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25779:26285 [1] NCCL INFO Channel 03 : 33[101d0] -> 32[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25779:26285 [1] NCCL INFO Channel 06 : 33[101d0] -> 32[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25779:26285 [1] NCCL INFO Channel 07 : 33[101d0] -> 32[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25779:26285 [1] NCCL INFO Connected all trees
gpu-st-p4d-24xlarge-49:25779:26285 [1] NCCL INFO threadThresholds 8/8/64 | 768/8/64 | 8/8/512
gpu-st-p4d-24xlarge-49:25779:26285 [1] NCCL INFO 8 coll channels, 8 p2p channels, 2 p2p channels per peer
gpu-st-p4d-24xlarge-49:25778:26280 [0] NCCL INFO Connected all trees
gpu-st-p4d-24xlarge-49:25778:26280 [0] NCCL INFO threadThresholds 8/8/64 | 768/8/64 | 8/8/512
gpu-st-p4d-24xlarge-49:25778:26280 [0] NCCL INFO 8 coll channels, 8 p2p channels, 2 p2p channels per peer
gpu-st-p4d-24xlarge-55:29156:29664 [4] NCCL INFO Channel 02/0 : 68[901c0] -> 64[101c0] [send] via NET/Socket/2
gpu-st-p4d-24xlarge-55:29156:29664 [4] NCCL INFO Channel 03/0 : 68[901c0] -> 64[101c0] [send] via NET/Socket/3
gpu-st-p4d-24xlarge-53:26988:27563 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-53:26988:27497 [0] NCCL INFO Channel 02/0 : 65[101d0] -> 56[101c0] [receive] via NET/Socket/0
gpu-st-p4d-24xlarge-53:26988:27563 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-53:26988:27497 [0] NCCL INFO Channel 03/0 : 65[101d0] -> 56[101c0] [receive] via NET/Socket/1
gpu-st-p4d-24xlarge-53:26986:27559 [6] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-53:26986:27498 [6] NCCL INFO Channel 00/0 : 61[901d0] -> 54[a01c0] [receive] via NET/Socket/2
gpu-st-p4d-24xlarge-53:26986:27559 [6] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-53:26986:27498 [6] NCCL INFO Channel 01/0 : 61[901d0] -> 54[a01c0] [receive] via NET/Socket/3
gpu-st-p4d-24xlarge-50:29211:29780 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-50:29211:29719 [4] NCCL INFO Channel 00/0 : 42[201c0] -> 36[901c0] [receive] via NET/Socket/2
gpu-st-p4d-24xlarge-50:29211:29780 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-50:29211:29719 [4] NCCL INFO Channel 01/0 : 42[201c0] -> 36[901c0] [receive] via NET/Socket/3
gpu-st-p4d-24xlarge-48:30191:30741 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-48:30191:30698 [0] NCCL INFO Channel 04/0 : 30[a01c0] -> 24[101c0] [receive] via NET/Socket/0
gpu-st-p4d-24xlarge-48:30191:30741 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-48:30191:30698 [0] NCCL INFO Channel 05/0 : 30[a01c0] -> 24[101c0] [receive] via NET/Socket/1
gpu-st-p4d-24xlarge-55:29154:29667 [2] NCCL INFO Channel 00/0 : 66[201c0] -> 60[901c0] [send] via NET/Socket/0
gpu-st-p4d-24xlarge-55:29154:29667 [2] NCCL INFO Channel 01/0 : 66[201c0] -> 60[901c0] [send] via NET/Socket/1
gpu-st-p4d-24xlarge-54:29519:30067 [1] NCCL INFO Channel 02 : 65[101d0] -> 60[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-54:29519:30067 [1] NCCL INFO Channel 03 : 65[101d0] -> 60[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-54:29519:30067 [1] NCCL INFO Channel 06 : 65[101d0] -> 60[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29156:29664 [4] NCCL INFO Connected all trees
gpu-st-p4d-24xlarge-55:29156:29664 [4] NCCL INFO threadThresholds 8/8/64 | 768/8/64 | 8/8/512
gpu-st-p4d-24xlarge-55:29156:29664 [4] NCCL INFO 8 coll channels, 8 p2p channels, 2 p2p channels per peer
gpu-st-p4d-24xlarge-54:29519:30067 [1] NCCL INFO Channel 07 : 65[101d0] -> 60[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-54:29515:30066 [5] NCCL INFO Channel 00 : 61[901d0] -> 60[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-54:29515:30066 [5] NCCL INFO Channel 01 : 61[901d0] -> 60[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-50:29215:29783 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-50:29215:29722 [0] NCCL INFO Channel 02/0 : 44[901c0] -> 40[101c0] [receive] via NET/Socket/0
gpu-st-p4d-24xlarge-50:29215:29783 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-50:29215:29722 [0] NCCL INFO Channel 03/0 : 44[901c0] -> 40[101c0] [receive] via NET/Socket/1
gpu-st-p4d-24xlarge-55:29157:29663 [5] NCCL INFO Connected all trees
gpu-st-p4d-24xlarge-55:29157:29663 [5] NCCL INFO threadThresholds 8/8/64 | 768/8/64 | 8/8/512
gpu-st-p4d-24xlarge-55:29157:29663 [5] NCCL INFO 8 coll channels, 8 p2p channels, 2 p2p channels per peer
gpu-st-p4d-24xlarge-54:29515:30066 [5] NCCL INFO Channel 04 : 61[901d0] -> 60[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-54:29515:30066 [5] NCCL INFO Channel 05 : 61[901d0] -> 60[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-54:29516:30071 [6] NCCL INFO Connected all trees
gpu-st-p4d-24xlarge-54:29516:30071 [6] NCCL INFO threadThresholds 8/8/64 | 768/8/64 | 8/8/512
gpu-st-p4d-24xlarge-54:29516:30071 [6] NCCL INFO 8 coll channels, 8 p2p channels, 2 p2p channels per peer
gpu-st-p4d-24xlarge-47:30188:30701 [2] NCCL INFO Channel 04/0 : 18[201c0] -> 6[a01c0] [send] via NET/Socket/0
gpu-st-p4d-24xlarge-47:30190:30703 [4] NCCL INFO Channel 06/0 : 20[901c0] -> 8[101c0] [send] via NET/Socket/2
gpu-st-p4d-24xlarge-47:30188:30701 [2] NCCL INFO Channel 05/0 : 18[201c0] -> 6[a01c0] [send] via NET/Socket/1
gpu-st-p4d-24xlarge-47:30190:30703 [4] NCCL INFO Channel 07/0 : 20[901c0] -> 8[101c0] [send] via NET/Socket/3
gpu-st-p4d-24xlarge-48:30196:30702 [5] NCCL INFO Channel 02 : 29[901d0] -> 28[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-51:30344:30851 [4] NCCL INFO Channel 02/0 : 44[901c0] -> 40[101c0] [send] via NET/Socket/2
gpu-st-p4d-24xlarge-51:30342:30850 [2] NCCL INFO Channel 00/0 : 42[201c0] -> 36[901c0] [send] via NET/Socket/0
gpu-st-p4d-24xlarge-51:30342:30850 [2] NCCL INFO Channel 01/0 : 42[201c0] -> 36[901c0] [send] via NET/Socket/1
gpu-st-p4d-24xlarge-51:30344:30851 [4] NCCL INFO Channel 03/0 : 44[901c0] -> 40[101c0] [send] via NET/Socket/3
gpu-st-p4d-24xlarge-48:30196:30702 [5] NCCL INFO Channel 03 : 29[901d0] -> 28[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-48:30196:30702 [5] NCCL INFO Channel 06 : 29[901d0] -> 28[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-48:30196:30702 [5] NCCL INFO Channel 07 : 29[901d0] -> 28[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29159:29666 [7] NCCL INFO Connected all trees
gpu-st-p4d-24xlarge-55:29159:29666 [7] NCCL INFO threadThresholds 8/8/64 | 768/8/64 | 8/8/512
gpu-st-p4d-24xlarge-55:29159:29666 [7] NCCL INFO 8 coll channels, 8 p2p channels, 2 p2p channels per peer
gpu-st-p4d-24xlarge-49:25781:26281 [3] NCCL INFO Connected all trees
gpu-st-p4d-24xlarge-49:25781:26281 [3] NCCL INFO threadThresholds 8/8/64 | 768/8/64 | 8/8/512
gpu-st-p4d-24xlarge-49:25781:26281 [3] NCCL INFO 8 coll channels, 8 p2p channels, 2 p2p channels per peer
gpu-st-p4d-24xlarge-58:27723:28236 [0] NCCL INFO Channel 06/0 : 88[101c0] -> 81[101d0] [send] via NET/Socket/0
gpu-st-p4d-24xlarge-58:27723:28236 [0] NCCL INFO Channel 07/0 : 88[101c0] -> 81[101d0] [send] via NET/Socket/1
gpu-st-p4d-24xlarge-55:29154:29667 [2] NCCL INFO Connected all trees
gpu-st-p4d-24xlarge-55:29154:29667 [2] NCCL INFO threadThresholds 8/8/64 | 768/8/64 | 8/8/512
gpu-st-p4d-24xlarge-56:28175:29019 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-55:29154:29667 [2] NCCL INFO 8 coll channels, 8 p2p channels, 2 p2p channels per peer
gpu-st-p4d-24xlarge-56:28175:28972 [4] NCCL INFO Channel 06/0 : 80[101c0] -> 76[901c0] [receive] via NET/Socket/2
gpu-st-p4d-24xlarge-56:28175:29019 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-56:28175:28972 [4] NCCL INFO Channel 07/0 : 80[101c0] -> 76[901c0] [receive] via NET/Socket/3
gpu-st-p4d-24xlarge-55:29155:29662 [3] NCCL INFO Connected all trees
gpu-st-p4d-24xlarge-55:29155:29662 [3] NCCL INFO threadThresholds 8/8/64 | 768/8/64 | 8/8/512
gpu-st-p4d-24xlarge-55:29155:29662 [3] NCCL INFO 8 coll channels, 8 p2p channels, 2 p2p channels per peer
gpu-st-p4d-24xlarge-48:30191:30698 [0] NCCL INFO Connected all trees
gpu-st-p4d-24xlarge-48:30191:30698 [0] NCCL INFO threadThresholds 8/8/64 | 768/8/64 | 8/8/512
gpu-st-p4d-24xlarge-48:30191:30698 [0] NCCL INFO 8 coll channels, 8 p2p channels, 2 p2p channels per peer
gpu-st-p4d-24xlarge-48:30195:30703 [4] NCCL INFO Connected all trees
gpu-st-p4d-24xlarge-48:30195:30703 [4] NCCL INFO threadThresholds 8/8/64 | 768/8/64 | 8/8/512
gpu-st-p4d-24xlarge-48:30195:30703 [4] NCCL INFO 8 coll channels, 8 p2p channels, 2 p2p channels per peer
gpu-st-p4d-24xlarge-48:30196:30702 [5] NCCL INFO Connected all trees
gpu-st-p4d-24xlarge-48:30196:30702 [5] NCCL INFO threadThresholds 8/8/64 | 768/8/64 | 8/8/512
gpu-st-p4d-24xlarge-48:30196:30702 [5] NCCL INFO 8 coll channels, 8 p2p channels, 2 p2p channels per peer
gpu-st-p4d-24xlarge-48:30192:30700 [1] NCCL INFO Connected all trees
gpu-st-p4d-24xlarge-48:30192:30700 [1] NCCL INFO threadThresholds 8/8/64 | 768/8/64 | 8/8/512
gpu-st-p4d-24xlarge-48:30192:30700 [1] NCCL INFO 8 coll channels, 8 p2p channels, 2 p2p channels per peer
gpu-st-p4d-24xlarge-51:30347:30853 [7] NCCL INFO Connected all trees
gpu-st-p4d-24xlarge-51:30347:30853 [7] NCCL INFO threadThresholds 8/8/64 | 768/8/64 | 8/8/512
gpu-st-p4d-24xlarge-51:30347:30853 [7] NCCL INFO 8 coll channels, 8 p2p channels, 2 p2p channels per peer
gpu-st-p4d-24xlarge-51:30344:30851 [4] NCCL INFO Connected all trees
gpu-st-p4d-24xlarge-51:30344:30851 [4] NCCL INFO threadThresholds 8/8/64 | 768/8/64 | 8/8/512
gpu-st-p4d-24xlarge-51:30344:30851 [4] NCCL INFO 8 coll channels, 8 p2p channels, 2 p2p channels per peer
gpu-st-p4d-24xlarge-51:30345:30854 [5] NCCL INFO Connected all trees
gpu-st-p4d-24xlarge-51:30345:30854 [5] NCCL INFO threadThresholds 8/8/64 | 768/8/64 | 8/8/512
gpu-st-p4d-24xlarge-51:30345:30854 [5] NCCL INFO 8 coll channels, 8 p2p channels, 2 p2p channels per peer
gpu-st-p4d-24xlarge-57:27794:28306 [1] NCCL INFO Channel 02 : 81[101d0] -> 80[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-57:27794:28306 [1] NCCL INFO Channel 03 : 81[101d0] -> 80[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-57:27794:28306 [1] NCCL INFO Channel 06 : 81[101d0] -> 80[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-53:26986:27498 [6] NCCL INFO Channel 04/0 : 54[a01c0] -> 48[101c0] [send] via NET/Socket/2
gpu-st-p4d-24xlarge-53:26988:27497 [0] NCCL INFO Channel 06/0 : 56[101c0] -> 52[901c0] [send] via NET/Socket/0
gpu-st-p4d-24xlarge-53:26986:27498 [6] NCCL INFO Channel 05/0 : 54[a01c0] -> 48[101c0] [send] via NET/Socket/3
gpu-st-p4d-24xlarge-53:26988:27497 [0] NCCL INFO Channel 07/0 : 56[101c0] -> 52[901c0] [send] via NET/Socket/1
gpu-st-p4d-24xlarge-57:27794:28306 [1] NCCL INFO Channel 07 : 81[101d0] -> 80[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-56:28171:29023 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-56:28171:28973 [0] NCCL INFO Channel 04/0 : 78[a01c0] -> 72[101c0] [receive] via NET/Socket/0
gpu-st-p4d-24xlarge-56:28171:29023 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-56:28171:28973 [0] NCCL INFO Channel 05/0 : 78[a01c0] -> 72[101c0] [receive] via NET/Socket/1
gpu-st-p4d-24xlarge-51:30342:30850 [2] NCCL INFO Connected all trees
gpu-st-p4d-24xlarge-51:30342:30850 [2] NCCL INFO threadThresholds 8/8/64 | 768/8/64 | 8/8/512
gpu-st-p4d-24xlarge-51:30342:30850 [2] NCCL INFO 8 coll channels, 8 p2p channels, 2 p2p channels per peer
gpu-st-p4d-24xlarge-51:30343:30849 [3] NCCL INFO Connected all trees
gpu-st-p4d-24xlarge-51:30343:30849 [3] NCCL INFO threadThresholds 8/8/64 | 768/8/64 | 8/8/512
gpu-st-p4d-24xlarge-51:30343:30849 [3] NCCL INFO 8 coll channels, 8 p2p channels, 2 p2p channels per peer
gpu-st-p4d-24xlarge-58:27719:28283 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-58:27719:28237 [4] NCCL INFO Channel 00/0 : 90[201c0] -> 84[901c0] [receive] via NET/Socket/2
gpu-st-p4d-24xlarge-58:27719:28283 [4] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-58:27719:28237 [4] NCCL INFO Channel 01/0 : 90[201c0] -> 84[901c0] [receive] via NET/Socket/3
gpu-st-p4d-24xlarge-54:29514:30069 [4] NCCL INFO Channel 04/0 : 60[901c0] -> 55[a01d0] [send] via NET/Socket/2
gpu-st-p4d-24xlarge-54:29514:30069 [4] NCCL INFO Channel 05/0 : 60[901c0] -> 55[a01d0] [send] via NET/Socket/3
gpu-st-p4d-24xlarge-54:29519:30067 [1] NCCL INFO Channel 02 : 65[101d0] -> 64[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-54:29519:30067 [1] NCCL INFO Channel 03 : 65[101d0] -> 64[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-54:29519:30067 [1] NCCL INFO Channel 06 : 65[101d0] -> 64[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-54:29519:30067 [1] NCCL INFO Channel 07 : 65[101d0] -> 64[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-54:29519:30067 [1] NCCL INFO Connected all trees
gpu-st-p4d-24xlarge-54:29519:30067 [1] NCCL INFO threadThresholds 8/8/64 | 768/8/64 | 8/8/512
gpu-st-p4d-24xlarge-54:29519:30067 [1] NCCL INFO 8 coll channels, 8 p2p channels, 2 p2p channels per peer
gpu-st-p4d-24xlarge-54:29518:30070 [0] NCCL INFO Connected all trees
gpu-st-p4d-24xlarge-54:29518:30070 [0] NCCL INFO threadThresholds 8/8/64 | 768/8/64 | 8/8/512
gpu-st-p4d-24xlarge-54:29518:30070 [0] NCCL INFO 8 coll channels, 8 p2p channels, 2 p2p channels per peer
gpu-st-p4d-24xlarge-57:27794:28306 [1] NCCL INFO Connected all trees
gpu-st-p4d-24xlarge-57:27794:28306 [1] NCCL INFO threadThresholds 8/8/64 | 768/8/64 | 8/8/512
gpu-st-p4d-24xlarge-57:27794:28306 [1] NCCL INFO 8 coll channels, 8 p2p channels, 2 p2p channels per peer
gpu-st-p4d-24xlarge-57:27793:28309 [0] NCCL INFO Connected all trees
gpu-st-p4d-24xlarge-57:27793:28309 [0] NCCL INFO threadThresholds 8/8/64 | 768/8/64 | 8/8/512
gpu-st-p4d-24xlarge-57:27793:28309 [0] NCCL INFO 8 coll channels, 8 p2p channels, 2 p2p channels per peer
gpu-st-p4d-24xlarge-45:30137:30701 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-45:30137:30647 [0] NCCL INFO Channel 02/0 : 17[101d0] -> 8[101c0] [receive] via NET/Socket/0
gpu-st-p4d-24xlarge-45:30137:30701 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-45:30137:30647 [0] NCCL INFO Channel 03/0 : 17[101d0] -> 8[101c0] [receive] via NET/Socket/1
gpu-st-p4d-24xlarge-55:29156:29664 [4] NCCL INFO comm 0x7f1d98009010 rank 68 nranks 96 cudaDev 4 busId 901c0 - Init COMPLETE
gpu-st-p4d-24xlarge-55:29154:29667 [2] NCCL INFO comm 0x7fc438009010 rank 66 nranks 96 cudaDev 2 busId 201c0 - Init COMPLETE
gpu-st-p4d-24xlarge-56:28176:28971 [5] NCCL INFO Channel 02 : 77[901d0] -> 76[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-56:28176:28971 [5] NCCL INFO Channel 03 : 77[901d0] -> 76[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-47:30188:30701 [2] NCCL INFO Channel 00/0 : 18[201c0] -> 12[901c0] [send] via NET/Socket/0
gpu-st-p4d-24xlarge-47:30190:30703 [4] NCCL INFO Channel 02/0 : 20[901c0] -> 16[101c0] [send] via NET/Socket/2
gpu-st-p4d-24xlarge-47:30188:30701 [2] NCCL INFO Channel 01/0 : 18[201c0] -> 12[901c0] [send] via NET/Socket/1
gpu-st-p4d-24xlarge-47:30190:30703 [4] NCCL INFO Channel 03/0 : 20[901c0] -> 16[101c0] [send] via NET/Socket/3
gpu-st-p4d-24xlarge-56:28176:28971 [5] NCCL INFO Channel 06 : 77[901d0] -> 76[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-57:27796:28307 [3] NCCL INFO Connected all trees
gpu-st-p4d-24xlarge-57:27796:28307 [3] NCCL INFO threadThresholds 8/8/64 | 768/8/64 | 8/8/512
gpu-st-p4d-24xlarge-57:27796:28307 [3] NCCL INFO 8 coll channels, 8 p2p channels, 2 p2p channels per peer
gpu-st-p4d-24xlarge-48:30196:30702 [5] NCCL INFO comm 0x7f1d10009010 rank 29 nranks 96 cudaDev 5 busId 901d0 - Init COMPLETE
gpu-st-p4d-24xlarge-48:30195:30703 [4] NCCL INFO comm 0x7f185c009010 rank 28 nranks 96 cudaDev 4 busId 901c0 - Init COMPLETE
gpu-st-p4d-24xlarge-48:30192:30700 [1] NCCL INFO comm 0x7fb6ec009010 rank 25 nranks 96 cudaDev 1 busId 101d0 - Init COMPLETE
gpu-st-p4d-24xlarge-48:30193:30701 [2] NCCL INFO comm 0x7fce2c009010 rank 26 nranks 96 cudaDev 2 busId 201c0 - Init COMPLETE
gpu-st-p4d-24xlarge-48:30194:30699 [3] NCCL INFO comm 0x7f08f0009010 rank 27 nranks 96 cudaDev 3 busId 201d0 - Init COMPLETE
gpu-st-p4d-24xlarge-56:28176:28971 [5] NCCL INFO Channel 07 : 77[901d0] -> 76[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-48:30191:30698 [0] NCCL INFO comm 0x7f5cc0009010 rank 24 nranks 96 cudaDev 0 busId 101c0 - Init COMPLETE
gpu-st-p4d-24xlarge-53:26987:27494 [7] NCCL INFO Channel 00 : 55[a01d0] -> 54[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-53:26989:27495 [1] NCCL INFO Connected all trees
gpu-st-p4d-24xlarge-53:26989:27495 [1] NCCL INFO threadThresholds 8/8/64 | 768/8/64 | 8/8/512
gpu-st-p4d-24xlarge-53:26989:27495 [1] NCCL INFO 8 coll channels, 8 p2p channels, 2 p2p channels per peer
gpu-st-p4d-24xlarge-53:26988:27497 [0] NCCL INFO Connected all trees
gpu-st-p4d-24xlarge-53:26988:27497 [0] NCCL INFO threadThresholds 8/8/64 | 768/8/64 | 8/8/512
gpu-st-p4d-24xlarge-53:26988:27497 [0] NCCL INFO 8 coll channels, 8 p2p channels, 2 p2p channels per peer
gpu-st-p4d-24xlarge-55:29159:29666 [7] NCCL INFO comm 0x7f6c3c009010 rank 71 nranks 96 cudaDev 7 busId a01d0 - Init COMPLETE
gpu-st-p4d-24xlarge-53:26987:27494 [7] NCCL INFO Channel 01 : 55[a01d0] -> 54[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-45:30135:30697 [6] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-45:30135:30642 [6] NCCL INFO Channel 00/0 : 13[901d0] -> 6[a01c0] [receive] via NET/Socket/2
gpu-st-p4d-24xlarge-55:29158:29665 [6] NCCL INFO comm 0x7fca4c009010 rank 70 nranks 96 cudaDev 6 busId a01c0 - Init COMPLETE
gpu-st-p4d-24xlarge-45:30135:30697 [6] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-45:30135:30642 [6] NCCL INFO Channel 01/0 : 13[901d0] -> 6[a01c0] [receive] via NET/Socket/3
gpu-st-p4d-24xlarge-55:29157:29663 [5] NCCL INFO comm 0x7fcb24009010 rank 69 nranks 96 cudaDev 5 busId 901d0 - Init COMPLETE
gpu-st-p4d-24xlarge-53:26987:27494 [7] NCCL INFO Channel 04 : 55[a01d0] -> 54[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-55:29155:29662 [3] NCCL INFO comm 0x7fedc8009010 rank 67 nranks 96 cudaDev 3 busId 201d0 - Init COMPLETE
gpu-st-p4d-24xlarge-50:29211:29719 [4] NCCL INFO Channel 04/0 : 36[901c0] -> 31[a01d0] [send] via NET/Socket/2
gpu-st-p4d-24xlarge-50:29211:29719 [4] NCCL INFO Channel 05/0 : 36[901c0] -> 31[a01d0] [send] via NET/Socket/3
gpu-st-p4d-24xlarge-53:26987:27494 [7] NCCL INFO Channel 05 : 55[a01d0] -> 54[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-52:39402:39919 [5] NCCL INFO Channel 02 : 53[901d0] -> 52[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-50:29216:29724 [1] NCCL INFO Channel 02 : 41[101d0] -> 40[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-50:29216:29724 [1] NCCL INFO Channel 03 : 41[101d0] -> 40[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-52:39402:39919 [5] NCCL INFO Channel 03 : 53[901d0] -> 52[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-50:29216:29724 [1] NCCL INFO Channel 06 : 41[101d0] -> 40[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-59:27923:28518 [7] NCCL INFO Connected all trees
gpu-st-p4d-24xlarge-59:27923:28518 [7] NCCL INFO threadThresholds 8/8/64 | 768/8/64 | 8/8/512
gpu-st-p4d-24xlarge-59:27923:28518 [7] NCCL INFO 8 coll channels, 8 p2p channels, 2 p2p channels per peer
gpu-st-p4d-24xlarge-52:39402:39919 [5] NCCL INFO Channel 06 : 53[901d0] -> 52[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-50:29216:29724 [1] NCCL INFO Channel 07 : 41[101d0] -> 40[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-52:39402:39919 [5] NCCL INFO Channel 07 : 53[901d0] -> 52[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-53:26991:27496 [3] NCCL INFO Connected all trees
gpu-st-p4d-24xlarge-56:28171:28973 [0] NCCL INFO Connected all trees
gpu-st-p4d-24xlarge-56:28171:28973 [0] NCCL INFO threadThresholds 8/8/64 | 768/8/64 | 8/8/512
gpu-st-p4d-24xlarge-53:26991:27496 [3] NCCL INFO threadThresholds 8/8/64 | 768/8/64 | 8/8/512
gpu-st-p4d-24xlarge-53:26991:27496 [3] NCCL INFO 8 coll channels, 8 p2p channels, 2 p2p channels per peer
gpu-st-p4d-24xlarge-56:28171:28973 [0] NCCL INFO 8 coll channels, 8 p2p channels, 2 p2p channels per peer
gpu-st-p4d-24xlarge-58:27723:28289 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-58:27723:28236 [0] NCCL INFO Channel 02/0 : 92[901c0] -> 88[101c0] [receive] via NET/Socket/0
gpu-st-p4d-24xlarge-58:27723:28289 [0] NCCL INFO NET/Socket: Using 2 threads and 8 sockets per thread
gpu-st-p4d-24xlarge-58:27723:28236 [0] NCCL INFO Channel 03/0 : 92[901c0] -> 88[101c0] [receive] via NET/Socket/1
gpu-st-p4d-24xlarge-56:28176:28971 [5] NCCL INFO Connected all trees
gpu-st-p4d-24xlarge-56:28176:28971 [5] NCCL INFO threadThresholds 8/8/64 | 768/8/64 | 8/8/512
gpu-st-p4d-24xlarge-56:28176:28971 [5] NCCL INFO 8 coll channels, 8 p2p channels, 2 p2p channels per peer
gpu-st-p4d-24xlarge-56:28175:28972 [4] NCCL INFO Connected all trees
gpu-st-p4d-24xlarge-56:28175:28972 [4] NCCL INFO threadThresholds 8/8/64 | 768/8/64 | 8/8/512
gpu-st-p4d-24xlarge-56:28175:28972 [4] NCCL INFO 8 coll channels, 8 p2p channels, 2 p2p channels per peer
gpu-st-p4d-24xlarge-54:29514:30069 [4] NCCL INFO Connected all trees
gpu-st-p4d-24xlarge-56:28172:28968 [1] NCCL INFO Connected all trees
gpu-st-p4d-24xlarge-54:29514:30069 [4] NCCL INFO threadThresholds 8/8/64 | 768/8/64 | 8/8/512
gpu-st-p4d-24xlarge-54:29514:30069 [4] NCCL INFO 8 coll channels, 8 p2p channels, 2 p2p channels per peer
gpu-st-p4d-24xlarge-56:28172:28968 [1] NCCL INFO threadThresholds 8/8/64 | 768/8/64 | 8/8/512
gpu-st-p4d-24xlarge-56:28172:28968 [1] NCCL INFO 8 coll channels, 8 p2p channels, 2 p2p channels per peer
gpu-st-p4d-24xlarge-50:29215:29722 [0] NCCL INFO Connected all trees
gpu-st-p4d-24xlarge-59:27918:28517 [2] NCCL INFO Connected all trees
gpu-st-p4d-24xlarge-50:29215:29722 [0] NCCL INFO threadThresholds 8/8/64 | 768/8/64 | 8/8/512
gpu-st-p4d-24xlarge-50:29215:29722 [0] NCCL INFO 8 coll channels, 8 p2p channels, 2 p2p channels per peer
gpu-st-p4d-24xlarge-50:29216:29724 [1] NCCL INFO Connected all trees
gpu-st-p4d-24xlarge-50:29216:29724 [1] NCCL INFO threadThresholds 8/8/64 | 768/8/64 | 8/8/512
gpu-st-p4d-24xlarge-59:27918:28517 [2] NCCL INFO threadThresholds 8/8/64 | 768/8/64 | 8/8/512
gpu-st-p4d-24xlarge-59:27918:28517 [2] NCCL INFO 8 coll channels, 8 p2p channels, 2 p2p channels per peer
gpu-st-p4d-24xlarge-50:29216:29724 [1] NCCL INFO 8 coll channels, 8 p2p channels, 2 p2p channels per peer
gpu-st-p4d-24xlarge-59:27919:28520 [3] NCCL INFO Connected all trees
gpu-st-p4d-24xlarge-59:27919:28520 [3] NCCL INFO threadThresholds 8/8/64 | 768/8/64 | 8/8/512
gpu-st-p4d-24xlarge-59:27919:28520 [3] NCCL INFO 8 coll channels, 8 p2p channels, 2 p2p channels per peer
gpu-st-p4d-24xlarge-53:26987:27494 [7] NCCL INFO Connected all trees
gpu-st-p4d-24xlarge-53:26987:27494 [7] NCCL INFO threadThresholds 8/8/64 | 768/8/64 | 8/8/512
gpu-st-p4d-24xlarge-53:26987:27494 [7] NCCL INFO 8 coll channels, 8 p2p channels, 2 p2p channels per peer
gpu-st-p4d-24xlarge-53:26986:27498 [6] NCCL INFO Connected all trees
gpu-st-p4d-24xlarge-54:29515:30066 [5] NCCL INFO Connected all trees
gpu-st-p4d-24xlarge-53:26986:27498 [6] NCCL INFO threadThresholds 8/8/64 | 768/8/64 | 8/8/512
gpu-st-p4d-24xlarge-53:26986:27498 [6] NCCL INFO 8 coll channels, 8 p2p channels, 2 p2p channels per peer
gpu-st-p4d-24xlarge-54:29515:30066 [5] NCCL INFO threadThresholds 8/8/64 | 768/8/64 | 8/8/512
gpu-st-p4d-24xlarge-54:29515:30066 [5] NCCL INFO 8 coll channels, 8 p2p channels, 2 p2p channels per peer
gpu-st-p4d-24xlarge-52:39397:39920 [0] NCCL INFO Connected all trees
gpu-st-p4d-24xlarge-52:39397:39920 [0] NCCL INFO threadThresholds 8/8/64 | 768/8/64 | 8/8/512
gpu-st-p4d-24xlarge-52:39397:39920 [0] NCCL INFO 8 coll channels, 8 p2p channels, 2 p2p channels per peer
gpu-st-p4d-24xlarge-46:30086:30632 [1] NCCL INFO Channel 02 : 17[101d0] -> 12[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-52:39402:39919 [5] NCCL INFO Connected all trees
gpu-st-p4d-24xlarge-52:39402:39919 [5] NCCL INFO threadThresholds 8/8/64 | 768/8/64 | 8/8/512
gpu-st-p4d-24xlarge-52:39402:39919 [5] NCCL INFO 8 coll channels, 8 p2p channels, 2 p2p channels per peer
gpu-st-p4d-24xlarge-52:39401:39921 [4] NCCL INFO Connected all trees
gpu-st-p4d-24xlarge-46:30086:30632 [1] NCCL INFO Channel 03 : 17[101d0] -> 12[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-52:39401:39921 [4] NCCL INFO threadThresholds 8/8/64 | 768/8/64 | 8/8/512
gpu-st-p4d-24xlarge-52:39401:39921 [4] NCCL INFO 8 coll channels, 8 p2p channels, 2 p2p channels per peer
gpu-st-p4d-24xlarge-46:30086:30632 [1] NCCL INFO Channel 06 : 17[101d0] -> 12[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-46:30086:30632 [1] NCCL INFO Channel 07 : 17[101d0] -> 12[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-51:30344:30851 [4] NCCL INFO comm 0x7f2ac4009010 rank 44 nranks 96 cudaDev 4 busId 901c0 - Init COMPLETE
gpu-st-p4d-24xlarge-51:30346:30852 [6] NCCL INFO comm 0x7f71d4009010 rank 46 nranks 96 cudaDev 6 busId a01c0 - Init COMPLETE
gpu-st-p4d-24xlarge-51:30342:30850 [2] NCCL INFO comm 0x7f8b0c009010 rank 42 nranks 96 cudaDev 2 busId 201c0 - Init COMPLETE
gpu-st-p4d-24xlarge-46:30082:30636 [5] NCCL INFO Channel 00 : 13[901d0] -> 12[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-46:30082:30636 [5] NCCL INFO Channel 01 : 13[901d0] -> 12[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-46:30082:30636 [5] NCCL INFO Channel 04 : 13[901d0] -> 12[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-46:30082:30636 [5] NCCL INFO Channel 05 : 13[901d0] -> 12[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-47:30190:30703 [4] NCCL INFO Connected all trees
gpu-st-p4d-24xlarge-47:30190:30703 [4] NCCL INFO threadThresholds 8/8/64 | 768/8/64 | 8/8/512
gpu-st-p4d-24xlarge-47:30190:30703 [4] NCCL INFO 8 coll channels, 8 p2p channels, 2 p2p channels per peer
gpu-st-p4d-24xlarge-47:30191:30699 [5] NCCL INFO Connected all trees
gpu-st-p4d-24xlarge-47:30191:30699 [5] NCCL INFO threadThresholds 8/8/64 | 768/8/64 | 8/8/512
gpu-st-p4d-24xlarge-47:30191:30699 [5] NCCL INFO 8 coll channels, 8 p2p channels, 2 p2p channels per peer
gpu-st-p4d-24xlarge-52:39398:39917 [1] NCCL INFO Connected all trees
gpu-st-p4d-24xlarge-52:39398:39917 [1] NCCL INFO threadThresholds 8/8/64 | 768/8/64 | 8/8/512
gpu-st-p4d-24xlarge-52:39398:39917 [1] NCCL INFO 8 coll channels, 8 p2p channels, 2 p2p channels per peer
gpu-st-p4d-24xlarge-51:30347:30853 [7] NCCL INFO comm 0x7f59c8009010 rank 47 nranks 96 cudaDev 7 busId a01d0 - Init COMPLETE
gpu-st-p4d-24xlarge-46:30083:30635 [6] NCCL INFO Connected all trees
gpu-st-p4d-24xlarge-46:30083:30635 [6] NCCL INFO threadThresholds 8/8/64 | 768/8/64 | 8/8/512
gpu-st-p4d-24xlarge-46:30083:30635 [6] NCCL INFO 8 coll channels, 8 p2p channels, 2 p2p channels per peer
gpu-st-p4d-24xlarge-51:30343:30849 [3] NCCL INFO comm 0x7fc078009010 rank 43 nranks 96 cudaDev 3 busId 201d0 - Init COMPLETE
gpu-st-p4d-24xlarge-51:30345:30854 [5] NCCL INFO comm 0x7fc5dc009010 rank 45 nranks 96 cudaDev 5 busId 901d0 - Init COMPLETE
gpu-st-p4d-24xlarge-49:25777:26283 [7] NCCL INFO Channel 00 : 31[a01d0] -> 30[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25777:26283 [7] NCCL INFO Channel 01 : 31[a01d0] -> 30[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25777:26283 [7] NCCL INFO Channel 04 : 31[a01d0] -> 30[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25777:26283 [7] NCCL INFO Channel 05 : 31[a01d0] -> 30[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-47:30193:30702 [7] NCCL INFO Connected all trees
gpu-st-p4d-24xlarge-47:30193:30702 [7] NCCL INFO threadThresholds 8/8/64 | 768/8/64 | 8/8/512
gpu-st-p4d-24xlarge-47:30193:30702 [7] NCCL INFO 8 coll channels, 8 p2p channels, 2 p2p channels per peer
gpu-st-p4d-24xlarge-50:29211:29719 [4] NCCL INFO Connected all trees
gpu-st-p4d-24xlarge-50:29211:29719 [4] NCCL INFO threadThresholds 8/8/64 | 768/8/64 | 8/8/512
gpu-st-p4d-24xlarge-50:29211:29719 [4] NCCL INFO 8 coll channels, 8 p2p channels, 2 p2p channels per peer
gpu-st-p4d-24xlarge-50:29212:29721 [5] NCCL INFO Connected all trees
gpu-st-p4d-24xlarge-50:29212:29721 [5] NCCL INFO threadThresholds 8/8/64 | 768/8/64 | 8/8/512
gpu-st-p4d-24xlarge-50:29212:29721 [5] NCCL INFO 8 coll channels, 8 p2p channels, 2 p2p channels per peer
gpu-st-p4d-24xlarge-49:25777:26283 [7] NCCL INFO Connected all trees
gpu-st-p4d-24xlarge-49:25777:26283 [7] NCCL INFO threadThresholds 8/8/64 | 768/8/64 | 8/8/512
gpu-st-p4d-24xlarge-49:25777:26283 [7] NCCL INFO 8 coll channels, 8 p2p channels, 2 p2p channels per peer
gpu-st-p4d-24xlarge-49:25776:26284 [6] NCCL INFO Connected all trees
gpu-st-p4d-24xlarge-49:25776:26284 [6] NCCL INFO threadThresholds 8/8/64 | 768/8/64 | 8/8/512
gpu-st-p4d-24xlarge-49:25776:26284 [6] NCCL INFO 8 coll channels, 8 p2p channels, 2 p2p channels per peer
gpu-st-p4d-24xlarge-47:30188:30701 [2] NCCL INFO Connected all trees
gpu-st-p4d-24xlarge-47:30188:30701 [2] NCCL INFO threadThresholds 8/8/64 | 768/8/64 | 8/8/512
gpu-st-p4d-24xlarge-47:30188:30701 [2] NCCL INFO 8 coll channels, 8 p2p channels, 2 p2p channels per peer
gpu-st-p4d-24xlarge-47:30189:30700 [3] NCCL INFO Connected all trees
gpu-st-p4d-24xlarge-47:30189:30700 [3] NCCL INFO threadThresholds 8/8/64 | 768/8/64 | 8/8/512
gpu-st-p4d-24xlarge-47:30189:30700 [3] NCCL INFO 8 coll channels, 8 p2p channels, 2 p2p channels per peer
gpu-st-p4d-24xlarge-59:27920:28519 [4] NCCL INFO Connected all trees
gpu-st-p4d-24xlarge-59:27920:28519 [4] NCCL INFO threadThresholds 8/8/64 | 768/8/64 | 8/8/512
gpu-st-p4d-24xlarge-59:27920:28519 [4] NCCL INFO 8 coll channels, 8 p2p channels, 2 p2p channels per peer
gpu-st-p4d-24xlarge-59:27921:28521 [5] NCCL INFO Connected all trees
gpu-st-p4d-24xlarge-59:27921:28521 [5] NCCL INFO threadThresholds 8/8/64 | 768/8/64 | 8/8/512
gpu-st-p4d-24xlarge-59:27921:28521 [5] NCCL INFO 8 coll channels, 8 p2p channels, 2 p2p channels per peer
gpu-st-p4d-24xlarge-58:27719:28237 [4] NCCL INFO Channel 04/0 : 84[901c0] -> 79[a01d0] [send] via NET/Socket/2
gpu-st-p4d-24xlarge-58:27719:28237 [4] NCCL INFO Channel 05/0 : 84[901c0] -> 79[a01d0] [send] via NET/Socket/3
gpu-st-p4d-24xlarge-58:27724:28235 [1] NCCL INFO Channel 02 : 89[101d0] -> 88[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-58:27724:28235 [1] NCCL INFO Channel 03 : 89[101d0] -> 88[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-58:27724:28235 [1] NCCL INFO Channel 06 : 89[101d0] -> 88[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-58:27724:28235 [1] NCCL INFO Channel 07 : 89[101d0] -> 88[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-54:29519:30067 [1] NCCL INFO comm 0x7f3130009010 rank 65 nranks 96 cudaDev 1 busId 101d0 - Init COMPLETE
gpu-st-p4d-24xlarge-54:29516:30071 [6] NCCL INFO comm 0x7fe490009010 rank 62 nranks 96 cudaDev 6 busId a01c0 - Init COMPLETE
gpu-st-p4d-24xlarge-54:29514:30069 [4] NCCL INFO comm 0x7f1024009010 rank 60 nranks 96 cudaDev 4 busId 901c0 - Init COMPLETE
gpu-st-p4d-24xlarge-54:29515:30066 [5] NCCL INFO comm 0x7f939c009010 rank 61 nranks 96 cudaDev 5 busId 901d0 - Init COMPLETE
gpu-st-p4d-24xlarge-54:29517:30068 [7] NCCL INFO comm 0x7f25c8009010 rank 63 nranks 96 cudaDev 7 busId a01d0 - Init COMPLETE
gpu-st-p4d-24xlarge-54:29518:30070 [0] NCCL INFO comm 0x7fa29c009010 rank 64 nranks 96 cudaDev 0 busId 101c0 - Init COMPLETE
gpu-st-p4d-24xlarge-56:28171:28973 [0] NCCL INFO comm 0x7f21bc009010 rank 72 nranks 96 cudaDev 0 busId 101c0 - Init COMPLETE
gpu-st-p4d-24xlarge-56:28172:28968 [1] NCCL INFO comm 0x7f2e38009010 rank 73 nranks 96 cudaDev 1 busId 101d0 - Init COMPLETE
gpu-st-p4d-24xlarge-56:28176:28971 [5] NCCL INFO comm 0x7f41ac009010 rank 77 nranks 96 cudaDev 5 busId 901d0 - Init COMPLETE
gpu-st-p4d-24xlarge-56:28173:28967 [2] NCCL INFO comm 0x7f2910009010 rank 74 nranks 96 cudaDev 2 busId 201c0 - Init COMPLETE
gpu-st-p4d-24xlarge-56:28174:28970 [3] NCCL INFO comm 0x7fa2fc009010 rank 75 nranks 96 cudaDev 3 busId 201d0 - Init COMPLETE
gpu-st-p4d-24xlarge-56:28175:28972 [4] NCCL INFO comm 0x7f2810009010 rank 76 nranks 96 cudaDev 4 busId 901c0 - Init COMPLETE
gpu-st-p4d-24xlarge-45:30137:30647 [0] NCCL INFO Channel 06/0 : 8[101c0] -> 4[901c0] [send] via NET/Socket/0
gpu-st-p4d-24xlarge-45:30135:30642 [6] NCCL INFO Channel 04/0 : 6[a01c0] -> 0[101c0] [send] via NET/Socket/2
gpu-st-p4d-24xlarge-45:30137:30647 [0] NCCL INFO Channel 07/0 : 8[101c0] -> 4[901c0] [send] via NET/Socket/1
gpu-st-p4d-24xlarge-45:30135:30642 [6] NCCL INFO Channel 05/0 : 6[a01c0] -> 0[101c0] [send] via NET/Socket/3
gpu-st-p4d-24xlarge-53:26991:27496 [3] NCCL INFO comm 0x7f1c08009010 rank 59 nranks 96 cudaDev 3 busId 201d0 - Init COMPLETE
gpu-st-p4d-24xlarge-53:26988:27497 [0] NCCL INFO comm 0x7fbf94009010 rank 56 nranks 96 cudaDev 0 busId 101c0 - Init COMPLETE
gpu-st-p4d-24xlarge-53:26989:27495 [1] NCCL INFO comm 0x7f6d40009010 rank 57 nranks 96 cudaDev 1 busId 101d0 - Init COMPLETE
gpu-st-p4d-24xlarge-53:26990:27493 [2] NCCL INFO comm 0x7fb30c009010 rank 58 nranks 96 cudaDev 2 busId 201c0 - Init COMPLETE
gpu-st-p4d-24xlarge-53:26986:27498 [6] NCCL INFO comm 0x7f47cc009010 rank 54 nranks 96 cudaDev 6 busId a01c0 - Init COMPLETE
gpu-st-p4d-24xlarge-53:26987:27494 [7] NCCL INFO comm 0x7fa94c009010 rank 55 nranks 96 cudaDev 7 busId a01d0 - Init COMPLETE
gpu-st-p4d-24xlarge-58:27724:28235 [1] NCCL INFO Connected all trees
gpu-st-p4d-24xlarge-58:27724:28235 [1] NCCL INFO threadThresholds 8/8/64 | 768/8/64 | 8/8/512
gpu-st-p4d-24xlarge-58:27724:28235 [1] NCCL INFO 8 coll channels, 8 p2p channels, 2 p2p channels per peer
gpu-st-p4d-24xlarge-58:27723:28236 [0] NCCL INFO Connected all trees
gpu-st-p4d-24xlarge-58:27723:28236 [0] NCCL INFO threadThresholds 8/8/64 | 768/8/64 | 8/8/512
gpu-st-p4d-24xlarge-58:27723:28236 [0] NCCL INFO 8 coll channels, 8 p2p channels, 2 p2p channels per peer
gpu-st-p4d-24xlarge-57:27792:28305 [7] NCCL INFO Channel 00 : 79[a01d0] -> 78[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-57:27792:28305 [7] NCCL INFO Channel 01 : 79[a01d0] -> 78[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-57:27792:28305 [7] NCCL INFO Channel 04 : 79[a01d0] -> 78[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-57:27792:28305 [7] NCCL INFO Channel 05 : 79[a01d0] -> 78[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-52:39401:39921 [4] NCCL INFO comm 0x7f9220009010 rank 52 nranks 96 cudaDev 4 busId 901c0 - Init COMPLETE
gpu-st-p4d-24xlarge-52:39402:39919 [5] NCCL INFO comm 0x7f9d04009010 rank 53 nranks 96 cudaDev 5 busId 901d0 - Init COMPLETE
gpu-st-p4d-24xlarge-52:39398:39917 [1] NCCL INFO comm 0x7ff498009010 rank 49 nranks 96 cudaDev 1 busId 101d0 - Init COMPLETE
gpu-st-p4d-24xlarge-52:39400:39922 [3] NCCL INFO comm 0x7f1134009010 rank 51 nranks 96 cudaDev 3 busId 201d0 - Init COMPLETE
gpu-st-p4d-24xlarge-52:39397:39920 [0] NCCL INFO comm 0x7f7e5c009010 rank 48 nranks 96 cudaDev 0 busId 101c0 - Init COMPLETE
gpu-st-p4d-24xlarge-52:39399:39918 [2] NCCL INFO comm 0x7f254c009010 rank 50 nranks 96 cudaDev 2 busId 201c0 - Init COMPLETE
gpu-st-p4d-24xlarge-46:30081:30633 [4] NCCL INFO Channel 04/0 : 12[901c0] -> 7[a01d0] [send] via NET/Socket/2
gpu-st-p4d-24xlarge-46:30081:30633 [4] NCCL INFO Channel 05/0 : 12[901c0] -> 7[a01d0] [send] via NET/Socket/3
gpu-st-p4d-24xlarge-57:27791:28308 [6] NCCL INFO Connected all trees
gpu-st-p4d-24xlarge-57:27791:28308 [6] NCCL INFO threadThresholds 8/8/64 | 768/8/64 | 8/8/512
gpu-st-p4d-24xlarge-57:27791:28308 [6] NCCL INFO 8 coll channels, 8 p2p channels, 2 p2p channels per peer
gpu-st-p4d-24xlarge-57:27792:28305 [7] NCCL INFO Connected all trees
gpu-st-p4d-24xlarge-57:27792:28305 [7] NCCL INFO threadThresholds 8/8/64 | 768/8/64 | 8/8/512
gpu-st-p4d-24xlarge-57:27792:28305 [7] NCCL INFO 8 coll channels, 8 p2p channels, 2 p2p channels per peer
gpu-st-p4d-24xlarge-58:27719:28237 [4] NCCL INFO Connected all trees
gpu-st-p4d-24xlarge-58:27719:28237 [4] NCCL INFO threadThresholds 8/8/64 | 768/8/64 | 8/8/512
gpu-st-p4d-24xlarge-58:27719:28237 [4] NCCL INFO 8 coll channels, 8 p2p channels, 2 p2p channels per peer
gpu-st-p4d-24xlarge-50:29213:29720 [6] NCCL INFO comm 0x7f2088009010 rank 38 nranks 96 cudaDev 6 busId a01c0 - Init COMPLETE
gpu-st-p4d-24xlarge-50:29211:29719 [4] NCCL INFO comm 0x7f7a40009010 rank 36 nranks 96 cudaDev 4 busId 901c0 - Init COMPLETE
gpu-st-p4d-24xlarge-50:29212:29721 [5] NCCL INFO comm 0x7fd770009010 rank 37 nranks 96 cudaDev 5 busId 901d0 - Init COMPLETE
gpu-st-p4d-24xlarge-50:29216:29724 [1] NCCL INFO comm 0x7f3b88009010 rank 41 nranks 96 cudaDev 1 busId 101d0 - Init COMPLETE
gpu-st-p4d-24xlarge-50:29214:29723 [7] NCCL INFO comm 0x7fb134009010 rank 39 nranks 96 cudaDev 7 busId a01d0 - Init COMPLETE
gpu-st-p4d-24xlarge-50:29215:29722 [0] NCCL INFO comm 0x7fe408009010 rank 40 nranks 96 cudaDev 0 busId 101c0 - Init COMPLETE
gpu-st-p4d-24xlarge-58:27720:28232 [5] NCCL INFO Connected all trees
gpu-st-p4d-24xlarge-58:27720:28232 [5] NCCL INFO threadThresholds 8/8/64 | 768/8/64 | 8/8/512
gpu-st-p4d-24xlarge-58:27720:28232 [5] NCCL INFO 8 coll channels, 8 p2p channels, 2 p2p channels per peer
gpu-st-p4d-24xlarge-46:30086:30632 [1] NCCL INFO Channel 02 : 17[101d0] -> 16[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-46:30086:30632 [1] NCCL INFO Channel 03 : 17[101d0] -> 16[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-46:30086:30632 [1] NCCL INFO Channel 06 : 17[101d0] -> 16[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-46:30086:30632 [1] NCCL INFO Channel 07 : 17[101d0] -> 16[101c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-49:25776:26284 [6] NCCL INFO comm 0x7fd568009010 rank 30 nranks 96 cudaDev 6 busId a01c0 - Init COMPLETE
gpu-st-p4d-24xlarge-49:25779:26285 [1] NCCL INFO comm 0x7f080c009010 rank 33 nranks 96 cudaDev 1 busId 101d0 - Init COMPLETE
gpu-st-p4d-24xlarge-49:25781:26281 [3] NCCL INFO comm 0x7f0154009010 rank 35 nranks 96 cudaDev 3 busId 201d0 - Init COMPLETE
gpu-st-p4d-24xlarge-49:25778:26280 [0] NCCL INFO comm 0x7f1324009010 rank 32 nranks 96 cudaDev 0 busId 101c0 - Init COMPLETE
gpu-st-p4d-24xlarge-49:25780:26286 [2] NCCL INFO comm 0x7fb678009010 rank 34 nranks 96 cudaDev 2 busId 201c0 - Init COMPLETE
gpu-st-p4d-24xlarge-49:25777:26283 [7] NCCL INFO comm 0x7f239c009010 rank 31 nranks 96 cudaDev 7 busId a01d0 - Init COMPLETE
gpu-st-p4d-24xlarge-44:28070:28908 [5] NCCL INFO Channel 02 : 5[901d0] -> 4[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-47:30193:30702 [7] NCCL INFO comm 0x7f2dd8009010 rank 23 nranks 96 cudaDev 7 busId a01d0 - Init COMPLETE
gpu-st-p4d-24xlarge-47:30190:30703 [4] NCCL INFO comm 0x7f7bb4009010 rank 20 nranks 96 cudaDev 4 busId 901c0 - Init COMPLETE
gpu-st-p4d-24xlarge-47:30192:30698 [6] NCCL INFO comm 0x7ff590009010 rank 22 nranks 96 cudaDev 6 busId a01c0 - Init COMPLETE
gpu-st-p4d-24xlarge-47:30188:30701 [2] NCCL INFO comm 0x7f07b8009010 rank 18 nranks 96 cudaDev 2 busId 201c0 - Init COMPLETE
gpu-st-p4d-24xlarge-45:30137:30647 [0] NCCL INFO Connected all trees
gpu-st-p4d-24xlarge-45:30137:30647 [0] NCCL INFO threadThresholds 8/8/64 | 768/8/64 | 8/8/512
gpu-st-p4d-24xlarge-45:30137:30647 [0] NCCL INFO 8 coll channels, 8 p2p channels, 2 p2p channels per peer
gpu-st-p4d-24xlarge-44:28070:28908 [5] NCCL INFO Channel 03 : 5[901d0] -> 4[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-47:30189:30700 [3] NCCL INFO comm 0x7faaec009010 rank 19 nranks 96 cudaDev 3 busId 201d0 - Init COMPLETE
gpu-st-p4d-24xlarge-47:30191:30699 [5] NCCL INFO comm 0x7f19a0009010 rank 21 nranks 96 cudaDev 5 busId 901d0 - Init COMPLETE
gpu-st-p4d-24xlarge-44:28070:28908 [5] NCCL INFO Channel 06 : 5[901d0] -> 4[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-59:27919:28520 [3] NCCL INFO comm 0x7efcfc009010 rank 91 nranks 96 cudaDev 3 busId 201d0 - Init COMPLETE
gpu-st-p4d-24xlarge-59:27918:28517 [2] NCCL INFO comm 0x7f3934009010 rank 90 nranks 96 cudaDev 2 busId 201c0 - Init COMPLETE
gpu-st-p4d-24xlarge-59:27921:28521 [5] NCCL INFO comm 0x7fdd68009010 rank 93 nranks 96 cudaDev 5 busId 901d0 - Init COMPLETE
gpu-st-p4d-24xlarge-59:27923:28518 [7] NCCL INFO comm 0x7f66d4009010 rank 95 nranks 96 cudaDev 7 busId a01d0 - Init COMPLETE
gpu-st-p4d-24xlarge-59:27922:28516 [6] NCCL INFO comm 0x7ff598009010 rank 94 nranks 96 cudaDev 6 busId a01c0 - Init COMPLETE
gpu-st-p4d-24xlarge-59:27920:28519 [4] NCCL INFO comm 0x7f6e6c009010 rank 92 nranks 96 cudaDev 4 busId 901c0 - Init COMPLETE
gpu-st-p4d-24xlarge-44:28070:28908 [5] NCCL INFO Channel 07 : 5[901d0] -> 4[901c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-45:30145:30643 [3] NCCL INFO Connected all trees
gpu-st-p4d-24xlarge-45:30145:30643 [3] NCCL INFO threadThresholds 8/8/64 | 768/8/64 | 8/8/512
gpu-st-p4d-24xlarge-45:30145:30643 [3] NCCL INFO 8 coll channels, 8 p2p channels, 2 p2p channels per peer
gpu-st-p4d-24xlarge-46:30086:30632 [1] NCCL INFO Connected all trees
gpu-st-p4d-24xlarge-46:30086:30632 [1] NCCL INFO threadThresholds 8/8/64 | 768/8/64 | 8/8/512
gpu-st-p4d-24xlarge-46:30086:30632 [1] NCCL INFO 8 coll channels, 8 p2p channels, 2 p2p channels per peer
gpu-st-p4d-24xlarge-46:30085:30637 [0] NCCL INFO Connected all trees
gpu-st-p4d-24xlarge-46:30085:30637 [0] NCCL INFO threadThresholds 8/8/64 | 768/8/64 | 8/8/512
gpu-st-p4d-24xlarge-46:30085:30637 [0] NCCL INFO 8 coll channels, 8 p2p channels, 2 p2p channels per peer
gpu-st-p4d-24xlarge-45:30138:30645 [1] NCCL INFO Connected all trees
gpu-st-p4d-24xlarge-45:30138:30645 [1] NCCL INFO threadThresholds 8/8/64 | 768/8/64 | 8/8/512
gpu-st-p4d-24xlarge-45:30138:30645 [1] NCCL INFO 8 coll channels, 8 p2p channels, 2 p2p channels per peer
gpu-st-p4d-24xlarge-45:30136:30644 [7] NCCL INFO Channel 00 : 7[a01d0] -> 6[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-45:30136:30644 [7] NCCL INFO Channel 01 : 7[a01d0] -> 6[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-45:30136:30644 [7] NCCL INFO Channel 04 : 7[a01d0] -> 6[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-45:30136:30644 [7] NCCL INFO Channel 05 : 7[a01d0] -> 6[a01c0] via P2P/IPC/read
gpu-st-p4d-24xlarge-44:28065:28903 [0] NCCL INFO Connected all trees
gpu-st-p4d-24xlarge-44:28065:28903 [0] NCCL INFO threadThresholds 8/8/64 | 768/8/64 | 8/8/512
gpu-st-p4d-24xlarge-44:28065:28903 [0] NCCL INFO 8 coll channels, 8 p2p channels, 2 p2p channels per peer
gpu-st-p4d-24xlarge-45:30135:30642 [6] NCCL INFO Connected all trees
gpu-st-p4d-24xlarge-45:30135:30642 [6] NCCL INFO threadThresholds 8/8/64 | 768/8/64 | 8/8/512
gpu-st-p4d-24xlarge-45:30135:30642 [6] NCCL INFO 8 coll channels, 8 p2p channels, 2 p2p channels per peer
gpu-st-p4d-24xlarge-46:30081:30633 [4] NCCL INFO Connected all trees
gpu-st-p4d-24xlarge-46:30081:30633 [4] NCCL INFO threadThresholds 8/8/64 | 768/8/64 | 8/8/512
gpu-st-p4d-24xlarge-46:30081:30633 [4] NCCL INFO 8 coll channels, 8 p2p channels, 2 p2p channels per peer
gpu-st-p4d-24xlarge-45:30136:30644 [7] NCCL INFO Connected all trees
gpu-st-p4d-24xlarge-45:30136:30644 [7] NCCL INFO threadThresholds 8/8/64 | 768/8/64 | 8/8/512
gpu-st-p4d-24xlarge-45:30136:30644 [7] NCCL INFO 8 coll channels, 8 p2p channels, 2 p2p channels per peer
gpu-st-p4d-24xlarge-44:28069:28907 [4] NCCL INFO Connected all trees
gpu-st-p4d-24xlarge-44:28069:28907 [4] NCCL INFO threadThresholds 8/8/64 | 768/8/64 | 8/8/512
gpu-st-p4d-24xlarge-44:28069:28907 [4] NCCL INFO 8 coll channels, 8 p2p channels, 2 p2p channels per peer
gpu-st-p4d-24xlarge-44:28070:28908 [5] NCCL INFO Connected all trees
gpu-st-p4d-24xlarge-44:28070:28908 [5] NCCL INFO threadThresholds 8/8/64 | 768/8/64 | 8/8/512
gpu-st-p4d-24xlarge-44:28070:28908 [5] NCCL INFO 8 coll channels, 8 p2p channels, 2 p2p channels per peer
gpu-st-p4d-24xlarge-44:28066:28906 [1] NCCL INFO Connected all trees
gpu-st-p4d-24xlarge-44:28066:28906 [1] NCCL INFO threadThresholds 8/8/64 | 768/8/64 | 8/8/512
gpu-st-p4d-24xlarge-44:28066:28906 [1] NCCL INFO 8 coll channels, 8 p2p channels, 2 p2p channels per peer
gpu-st-p4d-24xlarge-46:30082:30636 [5] NCCL INFO Connected all trees
gpu-st-p4d-24xlarge-46:30082:30636 [5] NCCL INFO threadThresholds 8/8/64 | 768/8/64 | 8/8/512
gpu-st-p4d-24xlarge-46:30082:30636 [5] NCCL INFO 8 coll channels, 8 p2p channels, 2 p2p channels per peer
gpu-st-p4d-24xlarge-57:27791:28308 [6] NCCL INFO comm 0x7f6550009010 rank 78 nranks 96 cudaDev 6 busId a01c0 - Init COMPLETE
gpu-st-p4d-24xlarge-57:27795:28304 [2] NCCL INFO comm 0x7f71bc009010 rank 82 nranks 96 cudaDev 2 busId 201c0 - Init COMPLETE
gpu-st-p4d-24xlarge-57:27793:28309 [0] NCCL INFO comm 0x7f3cc0009010 rank 80 nranks 96 cudaDev 0 busId 101c0 - Init COMPLETE
gpu-st-p4d-24xlarge-57:27794:28306 [1] NCCL INFO comm 0x7f8a80009010 rank 81 nranks 96 cudaDev 1 busId 101d0 - Init COMPLETE
gpu-st-p4d-24xlarge-57:27796:28307 [3] NCCL INFO comm 0x7f729c009010 rank 83 nranks 96 cudaDev 3 busId 201d0 - Init COMPLETE
gpu-st-p4d-24xlarge-57:27792:28305 [7] NCCL INFO comm 0x7fdb24009010 rank 79 nranks 96 cudaDev 7 busId a01d0 - Init COMPLETE
gpu-st-p4d-24xlarge-58:27723:28236 [0] NCCL INFO comm 0x7f866c009010 rank 88 nranks 96 cudaDev 0 busId 101c0 - Init COMPLETE
gpu-st-p4d-24xlarge-58:27721:28234 [6] NCCL INFO comm 0x7f41b0009010 rank 86 nranks 96 cudaDev 6 busId a01c0 - Init COMPLETE
gpu-st-p4d-24xlarge-58:27719:28237 [4] NCCL INFO comm 0x7f7a3c009010 rank 84 nranks 96 cudaDev 4 busId 901c0 - Init COMPLETE
gpu-st-p4d-24xlarge-58:27720:28232 [5] NCCL INFO comm 0x7ff810009010 rank 85 nranks 96 cudaDev 5 busId 901d0 - Init COMPLETE
gpu-st-p4d-24xlarge-58:27722:28233 [7] NCCL INFO comm 0x7fc6d8009010 rank 87 nranks 96 cudaDev 7 busId a01d0 - Init COMPLETE
gpu-st-p4d-24xlarge-58:27724:28235 [1] NCCL INFO comm 0x7f0b24009010 rank 89 nranks 96 cudaDev 1 busId 101d0 - Init COMPLETE
gpu-st-p4d-24xlarge-44:28069:28907 [4] NCCL INFO comm 0x7f7aac009010 rank 4 nranks 96 cudaDev 4 busId 901c0 - Init COMPLETE
gpu-st-p4d-24xlarge-46:30083:30635 [6] NCCL INFO comm 0x7fd7d8009010 rank 14 nranks 96 cudaDev 6 busId a01c0 - Init COMPLETE
gpu-st-p4d-24xlarge-46:30081:30633 [4] NCCL INFO comm 0x7fae6c009010 rank 12 nranks 96 cudaDev 4 busId 901c0 - Init COMPLETE
gpu-st-p4d-24xlarge-46:30085:30637 [0] NCCL INFO comm 0x7f674c009010 rank 16 nranks 96 cudaDev 0 busId 101c0 - Init COMPLETE
gpu-st-p4d-24xlarge-45:30135:30642 [6] NCCL INFO comm 0x7f1668009010 rank 6 nranks 96 cudaDev 6 busId a01c0 - Init COMPLETE
gpu-st-p4d-24xlarge-46:30082:30636 [5] NCCL INFO comm 0x7f8718009010 rank 13 nranks 96 cudaDev 5 busId 901d0 - Init COMPLETE
gpu-st-p4d-24xlarge-46:30086:30632 [1] NCCL INFO comm 0x7f2de8009010 rank 17 nranks 96 cudaDev 1 busId 101d0 - Init COMPLETE
gpu-st-p4d-24xlarge-46:30084:30634 [7] NCCL INFO comm 0x7f79f8009010 rank 15 nranks 96 cudaDev 7 busId a01d0 - Init COMPLETE
gpu-st-p4d-24xlarge-45:30136:30644 [7] NCCL INFO comm 0x7f9840009010 rank 7 nranks 96 cudaDev 7 busId a01d0 - Init COMPLETE
gpu-st-p4d-24xlarge-45:30138:30645 [1] NCCL INFO comm 0x7fb2c4009010 rank 9 nranks 96 cudaDev 1 busId 101d0 - Init COMPLETE
gpu-st-p4d-24xlarge-44:28065:28903 [0] NCCL INFO comm 0x7f483c009010 rank 0 nranks 96 cudaDev 0 busId 101c0 - Init COMPLETE
gpu-st-p4d-24xlarge-44:28067:28911 [2] NCCL INFO comm 0x7fc4e4009010 rank 2 nranks 96 cudaDev 2 busId 201c0 - Init COMPLETE
gpu-st-p4d-24xlarge-44:28065:28065 [0] NCCL INFO Launch mode Parallel
gpu-st-p4d-24xlarge-45:30145:30643 [3] NCCL INFO comm 0x7f3fd4009010 rank 11 nranks 96 cudaDev 3 busId 201d0 - Init COMPLETE
gpu-st-p4d-24xlarge-45:30137:30647 [0] NCCL INFO comm 0x7f4de8009010 rank 8 nranks 96 cudaDev 0 busId 101c0 - Init COMPLETE
gpu-st-p4d-24xlarge-45:30139:30646 [2] NCCL INFO comm 0x7f22b4009010 rank 10 nranks 96 cudaDev 2 busId 201c0 - Init COMPLETE
gpu-st-p4d-24xlarge-44:28068:28905 [3] NCCL INFO comm 0x7fb374009010 rank 3 nranks 96 cudaDev 3 busId 201d0 - Init COMPLETE
gpu-st-p4d-24xlarge-44:28070:28908 [5] NCCL INFO comm 0x7f4824009010 rank 5 nranks 96 cudaDev 5 busId 901d0 - Init COMPLETE
gpu-st-p4d-24xlarge-44:28066:28906 [1] NCCL INFO comm 0x7f2d54009010 rank 1 nranks 96 cudaDev 1 busId 101d0 - Init COMPLETE
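All 96 ranks have now reported Init COMPLETE, so a single NCCL communicator spans every GPU in the job, with inter-node traffic carried over NET/Socket. Below is a minimal sketch of the kind of PyTorch process-group setup that produces this handshake; the environment variables are the standard torchrun ones, and the rest is an assumption rather than the gist's actual script.

import os
import torch
import torch.distributed as dist

def init_distributed():
    # torchrun/SLURM-style launchers export these for every process.
    rank = int(os.environ["RANK"])              # 0..95 in this run
    local_rank = int(os.environ["LOCAL_RANK"])  # which GPU to use on this node
    world_size = int(os.environ["WORLD_SIZE"])  # 96 here

    torch.cuda.set_device(local_rank)
    # This call kicks off the NCCL bootstrap and channel/tree setup logged
    # above ("Connected all trees", ..., "Init COMPLETE").
    dist.init_process_group(backend="nccl", rank=rank, world_size=world_size)
    return rank, local_rank, world_size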
===========================================================================
Layer (type:depth-idx)                                            Param #
===========================================================================
DistributedDataParallel                                                 --
├─ConvNeXt: 1-1                                                         --
│    └─Sequential: 2-1                                                  --
│    │    └─Conv2d: 3-1                                             24,832
│    │    └─LayerNorm2d: 3-2                                           512
│    └─Sequential: 2-2                                                  --
│    │    └─ConvNeXtStage: 3-3                                   1,617,408
│    │    └─ConvNeXtStage: 3-4                                   6,905,856
│    │    └─ConvNeXtStage: 3-5                                 230,195,200
│    │    └─ConvNeXtStage: 3-6                                 109,412,352
│    └─Identity: 2-3                                                    --
│    └─Sequential: 2-4                                                  --
│    │    └─SelectAdaptivePool2d: 3-7                                   --
│    │    └─LayerNorm2d: 3-8                                         4,096
│    │    └─Flatten: 3-9                                                --
│    │    └─Dropout: 3-10                                               --
│    │    └─Linear: 3-11                                           104,499
===========================================================================
Total params: 348,264,755
Trainable params: 348,264,755
Non-trainable params: 0
===========================================================================
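The layer names in the summary (ConvNeXtStage, LayerNorm2d, SelectAdaptivePool2d) are timm's ConvNeXt implementation wrapped in DistributedDataParallel. A hedged reconstruction of that module is sketched below: the variant, the 6-channel stem, and the 51-class head are inferred from the parameter counts (6*256*4*4 + 256 = 24,832 stem params; 51*(2048+1) = 104,499 head params; 348,264,755 total matches an xlarge-sized trunk), not stated anywhere in the log.

import timm
from torchinfo import summary

model = timm.create_model(
    "convnext_xlarge_in22k",  # assumed variant; matches the 348M total
    pretrained=False,
    in_chans=6,               # inferred from the 24,832-param stem conv
    num_classes=51,           # inferred from the 104,499-param head
)
# In the actual run the model is wrapped in DistributedDataParallel before
# summarization (hence the root row above). With no input shape given,
# torchinfo prints exactly this kind of Layer / Param # table.
summary(model, depth=3)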
2022-08-26 22:21:04.354284: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-08-26 22:21:04.372947: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-08-26 22:21:04.377720: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-08-26 22:21:04.379205: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-08-26 22:21:04.382281: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-08-26 22:21:04.383752: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-08-26 22:21:04.386802: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-08-26 22:21:04.387481: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-08-26 22:21:04.388446: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-08-26 22:21:04.389611: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-08-26 22:21:04.389822: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-08-26 22:21:04.389868: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-08-26 22:21:04.392839: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-08-26 22:21:04.396323: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-08-26 22:21:04.397133: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-08-26 22:21:04.397629: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-08-26 22:21:04.398357: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-08-26 22:21:04.399119: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-08-26 22:21:04.400092: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-08-26 22:21:04.400348: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-08-26 22:21:04.401414: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-08-26 22:21:04.402340: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-08-26 22:21:04.403810: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-08-26 22:21:04.406097: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-08-26 22:21:04.406138: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-08-26 22:21:04.408209: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-08-26 22:21:04.408203: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-08-26 22:21:04.408708: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-08-26 22:21:04.408746: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-08-26 22:21:04.409046: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-08-26 22:21:04.409935: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-08-26 22:21:04.410291: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-08-26 22:21:04.410099: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-08-26 22:21:04.410347: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-08-26 22:21:04.411249: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-08-26 22:21:04.411253: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-08-26 22:21:04.411291: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-08-26 22:21:04.413350: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-08-26 22:21:04.413463: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-08-26 22:21:04.413488: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-08-26 22:21:04.415528: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-08-26 22:21:04.416317: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-08-26 22:21:04.416868: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-08-26 22:21:04.416868: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-08-26 22:21:04.416875: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-08-26 22:21:04.416899: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-08-26 22:21:04.417148: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-08-26 22:21:04.418477: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-08-26 22:21:04.418731: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-08-26 22:21:04.419576: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-08-26 22:21:04.419609: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-08-26 22:21:04.419827: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-08-26 22:21:04.420470: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-08-26 22:21:04.420980: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-08-26 22:21:04.421020: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-08-26 22:21:04.422087: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-08-26 22:21:04.425960: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-08-26 22:21:04.428004: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-08-26 22:21:04.427978: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-08-26 22:21:04.428153: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-08-26 22:21:04.428316: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-08-26 22:21:04.429122: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-08-26 22:21:04.429246: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-08-26 22:21:04.433292: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-08-26 22:21:04.433392: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-08-26 22:21:04.436057: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-08-26 22:21:04.436868: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-08-26 22:21:04.437078: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-08-26 22:21:04.438253: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-08-26 22:21:04.440693: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-08-26 22:21:04.440718: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-08-26 22:21:04.441147: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-08-26 22:21:04.441777: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-08-26 22:21:04.442713: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-08-26 22:21:04.443545: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-08-26 22:21:04.443830: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-08-26 22:21:04.444067: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-08-26 22:21:04.444296: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-08-26 22:21:04.444482: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-08-26 22:21:04.445428: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
[the same two-line oneDNN notice appears 17 times in total, once per worker process, with timestamps 22:21:04.445 through 22:21:04.587; the duplicates are elided]
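The oneDNN notices above are purely informational and are printed once by each worker at TensorFlow import time. Not part of the original log: if this startup noise is unwanted, TensorFlow's C++ log level can be raised before the library is imported. A minimal sketch using the documented TF_CPP_MIN_LOG_LEVEL variable:

import os
# Must be set before `import tensorflow`; 0 = all messages, 1 = filter INFO,
# 2 = filter INFO+WARNING, 3 = errors only.
os.environ["TF_CPP_MIN_LOG_LEVEL"] = "1"
import tensorflow as tf  # the cpu_feature_guard INFO lines are now suppressed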
Starting loop
[the "Starting loop" line is printed 96 times, once per rank; the identical duplicates are elided]
Epoch: 0, Step: 0, Loss: 19.943899154663086 , Acc: 0.0 | Time taken: 202.68939018249512
Epoch: 0, Step: 0, Loss: 19.987777709960938 , Acc: 0.0 | Time taken: 205.91121792793274
Epoch: 0, Step: 0, Loss: 19.817249298095703 , Acc: 0.0 | Time taken: 208.08953380584717
Epoch: 0, Step: 0, Loss: 19.493167877197266 , Acc: 0.0 | Time taken: 198.46081614494324
Epoch: 0, Step: 0, Loss: 18.375741958618164 , Acc: 0.0 | Time taken: 139.7204189300537
Epoch: 0, Step: 0, Loss: 17.676410675048828 , Acc: 0.0 | Time taken: 118.25044631958008
Epoch: 0, Step: 0, Loss: 11.584511756896973 , Acc: 0.0 | Time taken: 201.53303003311157
Epoch: 0, Step: 0, Loss: 11.903446197509766 , Acc: 0.0 | Time taken: 200.31899857521057
Epoch: 0, Step: 0, Loss: 18.943023681640625 , Acc: 0.0 | Time taken: 195.6868269443512
Epoch: 0, Step: 0, Loss: 14.19175910949707 , Acc: 0.0 | Time taken: 199.0504264831543
Epoch: 0, Step: 0, Loss: 11.817590713500977 , Acc: 0.0 | Time taken: 107.53136038780212
Epoch: 0, Step: 0, Loss: 19.778484344482422 , Acc: 0.0 | Time taken: 193.1314821243286
Epoch: 0, Step: 0, Loss: 19.515384674072266 , Acc: 0.0 | Time taken: 214.25619220733643
Epoch: 0, Step: 0, Loss: 19.515384674072266 , Acc: 0.0 | Time taken: 199.47413682937622
Epoch: 0, Step: 0, Loss: 19.73740577697754 , Acc: 0.0 | Time taken: 198.49446988105774
Epoch: 0, Step: 0, Loss: 19.836467742919922 , Acc: 0.0 | Time taken: 134.5910062789917
Epoch: 0, Step: 0, Loss: 19.968788146972656 , Acc: 0.0 | Time taken: 110.49465250968933
Epoch: 0, Step: 0, Loss: 11.780370712280273 , Acc: 0.0 | Time taken: 141.0984127521515
Epoch: 0, Step: 0, Loss: 20.118988037109375 , Acc: 0.0 | Time taken: 140.64971256256104
Epoch: 0, Step: 0, Loss: 11.584511756896973 , Acc: 0.0 | Time taken: 197.5575487613678
Epoch: 0, Step: 0, Loss: 19.57272720336914 , Acc: 0.0 | Time taken: 140.49980330467224
Epoch: 0, Step: 0, Loss: 12.053749084472656 , Acc: 0.0 | Time taken: 144.46794605255127
Epoch: 0, Step: 0, Loss: 11.775741577148438 , Acc: 0.0 | Time taken: 197.6045639514923
Epoch: 0, Step: 0, Loss: 19.984542846679688 , Acc: 0.0 | Time taken: 195.34999990463257
Epoch: 0, Step: 0, Loss: 19.984024047851562 , Acc: 0.0 | Time taken: 200.05970001220703
Epoch: 0, Step: 0, Loss: 14.514179229736328 , Acc: 0.0 | Time taken: 122.13053178787231
Epoch: 0, Step: 0, Loss: 20.206199645996094 , Acc: 0.0 | Time taken: 167.89329314231873
Epoch: 0, Step: 0, Loss: 19.871627807617188 , Acc: 0.0 | Time taken: 139.52446031570435
Epoch: 0, Step: 0, Loss: 19.639116287231445 , Acc: 0.0 | Time taken: 141.90000414848328
Epoch: 0, Step: 0, Loss: 19.481441497802734 , Acc: 0.0 | Time taken: 195.99468445777893
Epoch: 0, Step: 0, Loss: 15.276798248291016 , Acc: 0.0 | Time taken: 155.8594193458557
Epoch: 0, Step: 0, Loss: 19.57255744934082 , Acc: 0.0 | Time taken: 196.9389283657074
Epoch: 0, Step: 0, Loss: 19.737197875976562 , Acc: 0.171875 | Time taken: 198.97129273414612
Epoch: 0, Step: 0, Loss: 20.013158798217773 , Acc: 0.0 | Time taken: 139.22712898254395
Epoch: 0, Step: 0, Loss: 19.818248748779297 , Acc: 0.0 | Time taken: 176.25659894943237
Epoch: 0, Step: 0, Loss: 19.889266967773438 , Acc: 0.0 | Time taken: 202.01902174949646
Epoch: 0, Step: 0, Loss: 19.898035049438477 , Acc: 0.0 | Time taken: 259.3313765525818
Epoch: 0, Step: 0, Loss: 19.463672637939453 , Acc: 0.0 | Time taken: 136.95128917694092
Epoch: 0, Step: 0, Loss: 11.808197021484375 , Acc: 0.0 | Time taken: 192.862895488739
Epoch: 0, Step: 0, Loss: 16.529481887817383 , Acc: 0.0 | Time taken: 140.8465394973755
Epoch: 0, Step: 0, Loss: 19.714950561523438 , Acc: 0.15625 | Time taken: 142.33827781677246
Epoch: 0, Step: 0, Loss: 19.99258041381836 , Acc: 0.0 | Time taken: 141.23065185546875
Epoch: 0, Step: 0, Loss: 19.920970916748047 , Acc: 0.0 | Time taken: 141.70846104621887
Epoch: 0, Step: 0, Loss: 19.153942108154297 , Acc: 0.0 | Time taken: 203.262188911438
Epoch: 0, Step: 0, Loss: 20.039621353149414 , Acc: 0.0 | Time taken: 195.4386613368988
Epoch: 0, Step: 0, Loss: 19.778423309326172 , Acc: 0.0 | Time taken: 141.32375192642212
Epoch: 0, Step: 0, Loss: 17.223506927490234 , Acc: 0.0 | Time taken: 149.77710819244385
Epoch: 0, Step: 0, Loss: 20.210657119750977 , Acc: 0.0 | Time taken: 198.67266035079956
Epoch: 0, Step: 0, Loss: 19.5445556640625 , Acc: 0.0 | Time taken: 197.39984917640686
Epoch: 0, Step: 0, Loss: 19.591524124145508 , Acc: 0.0 | Time taken: 132.8765754699707
Epoch: 0, Step: 0, Loss: 15.062592506408691 , Acc: 0.0 | Time taken: 115.13482189178467
Epoch: 0, Step: 0, Loss: 17.355504989624023 , Acc: 0.0 | Time taken: 157.26107096672058
Epoch: 0, Step: 0, Loss: 19.912269592285156 , Acc: 0.0 | Time taken: 145.35011839866638
Epoch: 0, Step: 0, Loss: 18.666757583618164 , Acc: 0.0 | Time taken: 203.18580055236816
Epoch: 0, Step: 0, Loss: 19.93193817138672 , Acc: 0.0 | Time taken: 137.0443925857544
Epoch: 0, Step: 0, Loss: 11.852392196655273 , Acc: 0.0 | Time taken: 136.47687602043152
Epoch: 0, Step: 0, Loss: 11.808197021484375 , Acc: 0.0 | Time taken: 147.81792163848877
Epoch: 0, Step: 0, Loss: 19.73470115661621 , Acc: 0.0 | Time taken: 201.50432968139648
Epoch: 0, Step: 0, Loss: 19.5964412689209 , Acc: 0.0 | Time taken: 196.0880868434906
Epoch: 0, Step: 0, Loss: 19.7924861907959 , Acc: 0.0 | Time taken: 191.85613584518433
Epoch: 0, Step: 0, Loss: 15.382076263427734 , Acc: 0.0 | Time taken: 137.6862621307373
Epoch: 0, Step: 0, Loss: 15.616395950317383 , Acc: 0.0 | Time taken: 139.78927755355835
Epoch: 0, Step: 0, Loss: 19.659992218017578 , Acc: 0.0 | Time taken: 116.17348623275757
Epoch: 0, Step: 0, Loss: 15.188693046569824 , Acc: 0.0 | Time taken: 105.882239818573
Epoch: 0, Step: 0, Loss: 11.80894660949707 , Acc: 0.0 | Time taken: 198.17706179618835
Epoch: 0, Step: 0, Loss: 18.985166549682617 , Acc: 0.0 | Time taken: 138.62101411819458
Epoch: 0, Step: 0, Loss: 14.655655860900879 , Acc: 0.0 | Time taken: 143.02500200271606
Epoch: 0, Step: 0, Loss: 19.696178436279297 , Acc: 0.0 | Time taken: 138.1425290107727
Epoch: 0, Step: 0, Loss: 19.69489860534668 , Acc: 0.0 | Time taken: 122.9480459690094
Epoch: 0, Step: 0, Loss: 18.925922393798828 , Acc: 0.0 | Time taken: 145.2811713218689
Epoch: 0, Step: 0, Loss: 19.63861656188965 , Acc: 0.0 | Time taken: 194.4568736553192
Epoch: 0, Step: 0, Loss: 12.012990951538086 , Acc: 0.0 | Time taken: 140.72496008872986
Epoch: 0, Step: 0, Loss: 15.163640975952148 , Acc: 0.0 | Time taken: 140.3770787715912
Epoch: 0, Step: 0, Loss: 14.655655860900879 , Acc: 0.0 | Time taken: 162.99632287025452
Epoch: 0, Step: 0, Loss: 19.677093505859375 , Acc: 0.0 | Time taken: 198.63446235656738
Epoch: 0, Step: 0, Loss: 11.817590713500977 , Acc: 0.0 | Time taken: 201.51597785949707
Epoch: 0, Step: 0, Loss: 19.68549346923828 , Acc: 0.0 | Time taken: 137.67197608947754
Epoch: 0, Step: 0, Loss: 19.550180435180664 , Acc: 0.03125 | Time taken: 198.74911260604858
Epoch: 0, Step: 0, Loss: 19.981128692626953 , Acc: 0.0 | Time taken: 198.22725701332092
Epoch: 0, Step: 0, Loss: 11.78958797454834 , Acc: 0.0 | Time taken: 199.5324022769928
Epoch: 0, Step: 0, Loss: 20.118988037109375 , Acc: 0.0 | Time taken: 170.49751162528992
Epoch: 0, Step: 0, Loss: 15.868463516235352 , Acc: 0.0 | Time taken: 197.94081830978394
Epoch: 0, Step: 0, Loss: 11.527729034423828 , Acc: 0.0 | Time taken: 137.81649565696716
Epoch: 0, Step: 0, Loss: 19.56537437438965 , Acc: 0.0 | Time taken: 140.21940732002258
Epoch: 0, Step: 0, Loss: 11.804129600524902 , Acc: 0.0 | Time taken: 199.34225797653198
Epoch: 0, Step: 0, Loss: 19.72756576538086 , Acc: 0.0 | Time taken: 198.1713945865631
Epoch: 0, Step: 0, Loss: 19.859020233154297 , Acc: 0.0 | Time taken: 196.63833713531494
Epoch: 0, Step: 0, Loss: 19.952178955078125 , Acc: 0.0 | Time taken: 200.23561024665833
Epoch: 0, Step: 0, Loss: 11.794713973999023 , Acc: 0.0 | Time taken: 199.07277727127075
Epoch: 0, Step: 0, Loss: 19.912269592285156 , Acc: 0.0 | Time taken: 200.44544053077698
Epoch: 0, Step: 0, Loss: 19.771989822387695 , Acc: 0.0 | Time taken: 139.4658329486847
Epoch: 0, Step: 0, Loss: 15.667868614196777 , Acc: 0.0 | Time taken: 135.7098581790924
Epoch: 0, Step: 0, Loss: 19.778484344482422 , Acc: 0.0 | Time taken: 150.74809956550598
Epoch: 0, Step: 0, Loss: 15.883161544799805 , Acc: 0.0 | Time taken: 130.989337682724
Epoch: 0, Step: 0, Loss: 19.87149429321289 , Acc: 0.0 | Time taken: 140.71923184394836
Epoch: 0, Step: 0, Loss: 11.761435508728027 , Acc: 0.0 | Time taken: 195.58116483688354
Epoch: 0, Step: 50, Loss: 20.038393020629883 , Acc: 0.0 | Time taken: 3.5011448860168457
Epoch: 0, Step: 50, Loss: 18.061710357666016 , Acc: 0.0 | Time taken: 3.50065016746521
Epoch: 0, Step: 50, Loss: 24.769943237304688 , Acc: 0.0 | Time taken: 3.240126132965088
Epoch: 0, Step: 50, Loss: 17.05221176147461 , Acc: 0.0 | Time taken: 3.1633899211883545
Epoch: 0, Step: 50, Loss: 15.810155868530273 , Acc: 0.0 | Time taken: 3.499434471130371
Epoch: 0, Step: 50, Loss: 13.472417831420898 , Acc: 0.921875 | Time taken: 3.346092700958252
Epoch: 0, Step: 50, Loss: 16.903121948242188 , Acc: 0.0 | Time taken: 3.4975948333740234
Epoch: 0, Step: 50, Loss: 28.167705535888672 , Acc: 0.0 | Time taken: 3.5026490688323975
Epoch: 0, Step: 50, Loss: 17.760799407958984 , Acc: 0.0 | Time taken: 3.4991610050201416
Epoch: 0, Step: 50, Loss: 11.970874786376953 , Acc: 0.0 | Time taken: 3.504546642303467
Epoch: 0, Step: 50, Loss: 16.397994995117188 , Acc: 0.0 | Time taken: 3.494872808456421
Epoch: 0, Step: 50, Loss: 14.337437629699707 , Acc: 0.1875 | Time taken: 3.499532699584961
Epoch: 0, Step: 50, Loss: 12.147560119628906 , Acc: 0.078125 | Time taken: 3.495561122894287
Epoch: 0, Step: 50, Loss: 27.81817626953125 , Acc: 0.0 | Time taken: 3.495687484741211
Epoch: 0, Step: 50, Loss: 14.962888717651367 , Acc: 0.0 | Time taken: 3.4939026832580566
Epoch: 0, Step: 50, Loss: 16.693016052246094 , Acc: 0.0 | Time taken: 3.4954442977905273
Epoch: 0, Step: 50, Loss: 16.223472595214844 , Acc: 0.0 | Time taken: 3.494844675064087
Epoch: 0, Step: 50, Loss: 16.743656158447266 , Acc: 0.0 | Time taken: 3.502980947494507
Epoch: 0, Step: 50, Loss: 8.54758071899414 , Acc: 0.546875 | Time taken: 3.4361419677734375
Epoch: 0, Step: 50, Loss: 21.364377975463867 , Acc: 0.0 | Time taken: 3.4961352348327637
Epoch: 0, Step: 50, Loss: 24.025676727294922 , Acc: 0.0 | Time taken: 3.502211809158325
Epoch: 0, Step: 50, Loss: 16.617528915405273 , Acc: 0.0 | Time taken: 3.503201723098755
Epoch: 0, Step: 50, Loss: 18.41788673400879 , Acc: 0.0 | Time taken: 3.1850054264068604
Epoch: 0, Step: 50, Loss: 14.301187515258789 , Acc: 0.265625 | Time taken: 3.496520757675171
Epoch: 0, Step: 50, Loss: 16.27230453491211 , Acc: 0.0 | Time taken: 3.503924608230591
Epoch: 0, Step: 50, Loss: 16.859397888183594 , Acc: 0.0 | Time taken: 3.4807605743408203
Epoch: 0, Step: 50, Loss: 14.04751205444336 , Acc: 0.59375 | Time taken: 3.503180980682373
Epoch: 0, Step: 50, Loss: 15.487218856811523 , Acc: 0.0 | Time taken: 3.4923107624053955
Epoch: 0, Step: 50, Loss: 23.8161678314209 , Acc: 0.0 | Time taken: 3.4811079502105713
Epoch: 0, Step: 50, Loss: 15.456480979919434 , Acc: 0.0 | Time taken: 3.3377275466918945
Epoch: 0, Step: 50, Loss: 22.720989227294922 , Acc: 0.0 | Time taken: 3.5009539127349854
Epoch: 0, Step: 50, Loss: 8.577889442443848 , Acc: 0.046875 | Time taken: 3.4999911785125732
Epoch: 0, Step: 50, Loss: 14.469091415405273 , Acc: 0.0 | Time taken: 3.5026357173919678
Epoch: 0, Step: 50, Loss: 19.889137268066406 , Acc: 0.0 | Time taken: 3.5004515647888184
Epoch: 0, Step: 50, Loss: 11.64035701751709 , Acc: 0.0 | Time taken: 3.2225239276885986
Epoch: 0, Step: 50, Loss: 28.77550506591797 , Acc: 0.0 | Time taken: 3.493682861328125
Epoch: 0, Step: 50, Loss: 22.211091995239258 , Acc: 0.0 | Time taken: 3.496549367904663
Epoch: 0, Step: 50, Loss: 13.35706615447998 , Acc: 0.828125 | Time taken: 3.4952545166015625
Epoch: 0, Step: 50, Loss: 18.60054588317871 , Acc: 0.0 | Time taken: 3.5019266605377197
Epoch: 0, Step: 50, Loss: 16.35378646850586 , Acc: 0.0 | Time taken: 3.4917452335357666
Epoch: 0, Step: 50, Loss: 15.054426193237305 , Acc: 0.0 | Time taken: 3.5027215480804443
Epoch: 0, Step: 50, Loss: 8.577889442443848 , Acc: 0.046875 | Time taken: 3.500481605529785
Epoch: 0, Step: 50, Loss: 15.349031448364258 , Acc: 0.0 | Time taken: 3.358882427215576
Epoch: 0, Step: 50, Loss: 14.513076782226562 , Acc: 0.5 | Time taken: 3.496181011199951
Epoch: 0, Step: 50, Loss: 8.390466690063477 , Acc: 0.0 | Time taken: 3.502443313598633
Epoch: 0, Step: 50, Loss: 23.744415283203125 , Acc: 0.0 | Time taken: 3.4960215091705322
Epoch: 0, Step: 50, Loss: 8.993657112121582 , Acc: 0.0 | Time taken: 3.503264904022217
Epoch: 0, Step: 50, Loss: 14.99350357055664 , Acc: 0.0 | Time taken: 3.502300500869751
Epoch: 0, Step: 50, Loss: 16.02570915222168 , Acc: 0.0 | Time taken: 3.510237455368042
Epoch: 0, Step: 50, Loss: 16.632579803466797 , Acc: 0.0 | Time taken: 3.503570079803467
Epoch: 0, Step: 50, Loss: 25.662242889404297 , Acc: 0.0 | Time taken: 3.523710012435913
Epoch: 0, Step: 50, Loss: 16.535140991210938 , Acc: 0.0 | Time taken: 3.5236728191375732
Epoch: 0, Step: 50, Loss: 28.171875 , Acc: 0.0 | Time taken: 3.523655414581299
Epoch: 0, Step: 50, Loss: 19.19924545288086 , Acc: 0.0 | Time taken: 3.293935537338257
Epoch: 0, Step: 50, Loss: 16.626239776611328 , Acc: 0.0 | Time taken: 3.5202085971832275
Epoch: 0, Step: 50, Loss: 16.700523376464844 , Acc: 0.0 | Time taken: 3.4052746295928955
Epoch: 0, Step: 50, Loss: 10.576862335205078 , Acc: 0.234375 | Time taken: 3.488424062728882
Epoch: 0, Step: 50, Loss: 15.670470237731934 , Acc: 0.0 | Time taken: 3.5229287147521973
Epoch: 0, Step: 50, Loss: 15.471056938171387 , Acc: 0.0 | Time taken: 3.058933973312378
Epoch: 0, Step: 50, Loss: 15.814111709594727 , Acc: 0.0 | Time taken: 3.525090217590332
Epoch: 0, Step: 50, Loss: 17.715770721435547 , Acc: 0.0 | Time taken: 3.523406982421875
Epoch: 0, Step: 50, Loss: 11.344772338867188 , Acc: 0.0 | Time taken: 3.509988307952881
Epoch: 0, Step: 50, Loss: 21.345888137817383 , Acc: 0.0 | Time taken: 3.524320125579834
Epoch: 0, Step: 50, Loss: 20.00176239013672 , Acc: 0.0 | Time taken: 3.5074329376220703
Epoch: 0, Step: 50, Loss: 22.211091995239258 , Acc: 0.0 | Time taken: 3.5218594074249268
Epoch: 0, Step: 50, Loss: 14.962888717651367 , Acc: 0.0 | Time taken: 3.523887872695923
Epoch: 0, Step: 50, Loss: 17.250328063964844 , Acc: 0.0 | Time taken: 3.523099184036255
Epoch: 0, Step: 50, Loss: 17.436492919921875 , Acc: 0.0 | Time taken: 3.523989200592041
Epoch: 0, Step: 50, Loss: 16.661800384521484 , Acc: 0.0 | Time taken: 3.5239012241363525
Epoch: 0, Step: 50, Loss: 16.880949020385742 , Acc: 0.0 | Time taken: 3.5219814777374268
Epoch: 0, Step: 50, Loss: 15.777007102966309 , Acc: 0.0 | Time taken: 3.519026517868042
Epoch: 0, Step: 50, Loss: 24.054317474365234 , Acc: 0.0 | Time taken: 3.295353651046753
Epoch: 0, Step: 50, Loss: 14.469091415405273 , Acc: 0.0 | Time taken: 3.521101713180542
Epoch: 0, Step: 50, Loss: 14.86700439453125 , Acc: 0.359375 | Time taken: 3.507798194885254
Epoch: 0, Step: 50, Loss: 15.369190216064453 , Acc: 0.0 | Time taken: 3.5268609523773193
Epoch: 0, Step: 50, Loss: 15.762784957885742 , Acc: 0.0 | Time taken: 3.5224668979644775
Epoch: 0, Step: 50, Loss: 9.163860321044922 , Acc: 0.0 | Time taken: 3.521397113800049
Epoch: 0, Step: 50, Loss: 15.609086990356445 , Acc: 0.0 | Time taken: 3.518476963043213
Epoch: 0, Step: 50, Loss: 8.54758071899414 , Acc: 0.546875 | Time taken: 3.2743489742279053
Epoch: 0, Step: 50, Loss: 17.760753631591797 , Acc: 0.0 | Time taken: 3.5225236415863037
Epoch: 0, Step: 50, Loss: 21.36757469177246 , Acc: 0.0 | Time taken: 3.5263290405273438
Epoch: 0, Step: 50, Loss: 24.625385284423828 , Acc: 0.0 | Time taken: 3.232883930206299
Epoch: 0, Step: 50, Loss: 19.349485397338867 , Acc: 0.0 | Time taken: 3.5224673748016357
Epoch: 0, Step: 50, Loss: 16.285114288330078 , Acc: 0.0 | Time taken: 3.5252325534820557
Epoch: 0, Step: 50, Loss: 15.935747146606445 , Acc: 0.0 | Time taken: 3.489647150039673
Epoch: 0, Step: 50, Loss: 13.042746543884277 , Acc: 0.15625 | Time taken: 3.3556835651397705
Epoch: 0, Step: 50, Loss: 22.479259490966797 , Acc: 0.0 | Time taken: 3.4429924488067627
Epoch: 0, Step: 50, Loss: 13.412002563476562 , Acc: 0.9375 | Time taken: 3.525402545928955
Epoch: 0, Step: 50, Loss: 8.283740997314453 , Acc: 1.0 | Time taken: 3.5308353900909424
Epoch: 0, Step: 50, Loss: 16.06515121459961 , Acc: 0.0 | Time taken: 3.5328478813171387
Epoch: 0, Step: 50, Loss: 15.246273040771484 , Acc: 0.0 | Time taken: 3.529102087020874
Epoch: 0, Step: 50, Loss: 8.567059516906738 , Acc: 0.125 | Time taken: 3.4655728340148926
Epoch: 0, Step: 50, Loss: 15.020666122436523 , Acc: 0.046875 | Time taken: 3.534762144088745
Epoch: 0, Step: 50, Loss: 10.266606330871582 , Acc: 0.0 | Time taken: 3.5175764560699463
Epoch: 0, Step: 50, Loss: 20.12688446044922 , Acc: 0.0 | Time taken: 3.7150871753692627
Epoch: 0, Step: 50, Loss: 15.483797073364258 , Acc: 0.0 | Time taken: 3.550638437271118
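Each rank prints its own Epoch/Step/Loss/Acc/Time line, which is why step 0 appears 96 times: the ~200 s step-0 timings include NCCL rendezvous and CUDA warm-up, after which steady-state steps take ~3.5 s. The training script itself is not part of this gist; the following is a hypothetical reconstruction of a loop that would emit lines in this format, and every name in it (train_epoch, model, loader, criterion) is an assumption:

import time
import torch

def train_epoch(model, loader, optimizer, criterion, epoch, log_every=50):
    start = time.time()
    for step, (inputs, targets) in enumerate(loader):
        optimizer.zero_grad()
        outputs = model(inputs)
        loss = criterion(outputs, targets)
        loss.backward()
        optimizer.step()
        if step % log_every == 0:
            # Batch accuracy for a classification head; an assumption here.
            acc = (outputs.argmax(dim=-1) == targets).float().mean().item()
            # Matches the spacing of the log lines above, including " , " before Acc.
            print(f"Epoch: {epoch}, Step: {step}, Loss: {loss.item()} , Acc: {acc} | Time taken: {time.time() - start}")
            start = time.time()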
gpu-st-p4d-24xlarge-51:30345:30909 [0] misc/socket.cc:523 NCCL WARN Net : Connection closed by remote peer ip-172-31-231-150.ec2.internal<45466>
gpu-st-p4d-24xlarge-51:30345:30909 [0] NCCL INFO transport/net_socket.cc:493 -> 2
gpu-st-p4d-24xlarge-51:30345:30909 [0] NCCL INFO include/net.h:32 -> 2
gpu-st-p4d-24xlarge-51:30345:30909 [0] NCCL INFO transport/net.cc:996 -> 2
gpu-st-p4d-24xlarge-51:30345:30909 [0] NCCL INFO proxy.cc:494 -> 2
gpu-st-p4d-24xlarge-51:30345:30909 [0] NCCL INFO proxy.cc:614 -> 2 [Proxy Thread]
gpu-st-p4d-24xlarge-53:26988:27564 [0] misc/socket.cc:523 NCCL WARN Net : Connection closed by remote peer ip-172-31-231-150.ec2.internal<48964>
gpu-st-p4d-24xlarge-53:26988:27564 [0] NCCL INFO transport/net_socket.cc:493 -> 2
gpu-st-p4d-24xlarge-53:26988:27564 [0] NCCL INFO include/net.h:32 -> 2
gpu-st-p4d-24xlarge-53:26988:27564 [0] NCCL INFO transport/net.cc:996 -> 2
gpu-st-p4d-24xlarge-53:26988:27564 [0] NCCL INFO proxy.cc:494 -> 2
gpu-st-p4d-24xlarge-53:26988:27564 [0] NCCL INFO proxy.cc:614 -> 2 [Proxy Thread]
gpu-st-p4d-24xlarge-54:29518:30123 [0] misc/socket.cc:523 NCCL WARN Net : Connection closed by remote peer ip-172-31-231-150.ec2.internal<52200>
gpu-st-p4d-24xlarge-54:29518:30123 [0] NCCL INFO transport/net_socket.cc:493 -> 2
gpu-st-p4d-24xlarge-54:29518:30123 [0] NCCL INFO include/net.h:32 -> 2
gpu-st-p4d-24xlarge-54:29518:30123 [0] NCCL INFO transport/net.cc:996 -> 2
gpu-st-p4d-24xlarge-54:29518:30123 [0] NCCL INFO proxy.cc:494 -> 2
gpu-st-p4d-24xlarge-54:29518:30123 [0] NCCL INFO proxy.cc:614 -> 2 [Proxy Thread]
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 29154 closing signal SIGTERM
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 29155 closing signal SIGTERM
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 29157 closing signal SIGTERM
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 29158 closing signal SIGTERM
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 29159 closing signal SIGTERM
gpu-st-p4d-24xlarge-54:29519:30125 [0] misc/socket.cc:503 NCCL WARN Net : Call to recv from 172.31.239.214<43187> failed : Connection reset by peer
gpu-st-p4d-24xlarge-54:29519:30125 [0] NCCL INFO misc/socket.cc:520 -> 2
gpu-st-p4d-24xlarge-54:29519:30125 [0] NCCL INFO transport/net_socket.cc:493 -> 2
gpu-st-p4d-24xlarge-54:29519:30125 [0] NCCL INFO include/net.h:32 -> 2
gpu-st-p4d-24xlarge-54:29519:30125 [0] NCCL INFO transport/net.cc:870 -> 2
gpu-st-p4d-24xlarge-54:29519:30125 [0] NCCL INFO proxy.cc:494 -> 2
gpu-st-p4d-24xlarge-54:29519:30125 [0] NCCL INFO proxy.cc:614 -> 2 [Proxy Thread]
gpu-st-p4d-24xlarge-52:39401:39975 [0] misc/socket.cc:523 NCCL WARN Net : Connection closed by remote peer ip-172-31-233-244.ec2.internal<47846>
gpu-st-p4d-24xlarge-52:39401:39975 [0] NCCL INFO transport/net_socket.cc:493 -> 2
gpu-st-p4d-24xlarge-52:39401:39975 [0] NCCL INFO include/net.h:32 -> 2
gpu-st-p4d-24xlarge-52:39401:39975 [0] NCCL INFO transport/net.cc:996 -> 2
gpu-st-p4d-24xlarge-52:39401:39975 [0] NCCL INFO proxy.cc:494 -> 2
gpu-st-p4d-24xlarge-52:39401:39975 [0] NCCL INFO proxy.cc:614 -> 2 [Proxy Thread]
gpu-st-p4d-24xlarge-51:30345:30643 [0] NCCL INFO comm 0x7fc5dc009010 rank 45 nranks 96 cudaDev 5 busId 901d0 - Abort COMPLETE
gpu-st-p4d-24xlarge-53:26988:27286 [0] NCCL INFO comm 0x7fbf94009010 rank 56 nranks 96 cudaDev 0 busId 101c0 - Abort COMPLETE
[E ProcessGroupNCCL.cpp:480] Some NCCL operations have failed or timed out. Due to the asynchronous nature of CUDA kernels, subsequent GPU operations might run on corrupted/incomplete data. To avoid this inconsistency, we are taking the entire process down.
terminate called after throwing an instance of 'std::runtime_error'
what(): NCCL communicator encountered error set by ProcessGroupNCCL: NCCL error: unhandled system error, NCCL version 2.12.12
ncclSystemError: System call (e.g. socket, malloc) or external library call failed or device error. It can be also caused by unexpected exit of a remote peer, you can check NCCL warnings for failure reason and see if there is connection closure by a peer.
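The warnings above show the socket transport dying ("Connection closed by remote peer") after one peer exited, and every surviving rank then aborting its communicator with this ncclSystemError. None of the following appears in the log; it is a minimal diagnostic sketch using documented NCCL environment variables and the torch.distributed timeout knob:

import datetime
import os
import torch.distributed as dist

# Real NCCL variables; they must be set before the process group is created.
os.environ["NCCL_DEBUG"] = "INFO"          # already enabled in this run, judging by the INFO lines
os.environ["NCCL_DEBUG_SUBSYS"] = "NET"    # focus the debug output on the network transport
os.environ["NCCL_SOCKET_IFNAME"] = "eth0"  # pin the socket transport to one interface

# A shorter collective timeout makes a hung or desynchronized rank fail loudly
# (the default for the NCCL backend is 30 minutes).
dist.init_process_group("nccl", timeout=datetime.timedelta(minutes=10))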
[E ProcessGroupNCCL.cpp:480] Some NCCL operations have failed or timed out. Due to the asynchronous nature of CUDA kernels, subsequent GPU operations might run on corrupted/incomplete data. To avoid this inconsistency, we are taking the entire process down.
terminate called after throwing an instance of 'std::runtime_error'
what(): NCCL communicator encountered error set by ProcessGroupNCCL: NCCL error: unhandled system error, NCCL version 2.12.12
ncclSystemError: System call (e.g. socket, malloc) or external library call failed or device error. It can be also caused by unexpected exit of a remote peer, you can check NCCL warnings for failure reason and see if there is connection closure by a peer.
gpu-st-p4d-24xlarge-56:28176:29026 [0] misc/socket.cc:523 NCCL WARN Net : Connection closed by remote peer ip-172-31-229-154.ec2.internal<37810>
gpu-st-p4d-24xlarge-56:28176:29026 [0] NCCL INFO transport/net_socket.cc:493 -> 2
gpu-st-p4d-24xlarge-56:28176:29026 [0] NCCL INFO include/net.h:32 -> 2
gpu-st-p4d-24xlarge-56:28176:29026 [0] NCCL INFO transport/net.cc:996 -> 2
gpu-st-p4d-24xlarge-56:28176:29026 [0] NCCL INFO proxy.cc:494 -> 2
gpu-st-p4d-24xlarge-56:28176:29026 [0] NCCL INFO proxy.cc:614 -> 2 [Proxy Thread]
gpu-st-p4d-24xlarge-53:26989:27566 [0] misc/socket.cc:523 NCCL WARN Net : Connection closed by remote peer ip-172-31-229-154.ec2.internal<40624>
gpu-st-p4d-24xlarge-53:26989:27566 [0] NCCL INFO transport/net_socket.cc:493 -> 2
gpu-st-p4d-24xlarge-53:26989:27566 [0] NCCL INFO include/net.h:32 -> 2
gpu-st-p4d-24xlarge-53:26989:27566 [0] NCCL INFO transport/net.cc:996 -> 2
gpu-st-p4d-24xlarge-53:26989:27566 [0] NCCL INFO proxy.cc:494 -> 2
gpu-st-p4d-24xlarge-53:26989:27566 [0] NCCL INFO proxy.cc:614 -> 2 [Proxy Thread]
gpu-st-p4d-24xlarge-54:29518:29815 [0] NCCL INFO comm 0x7fa29c009010 rank 64 nranks 96 cudaDev 0 busId 101c0 - Abort COMPLETE
gpu-st-p4d-24xlarge-54:29514:30126 [0] misc/socket.cc:523 NCCL WARN Net : Connection closed by remote peer ip-172-31-231-150.ec2.internal<58940>
gpu-st-p4d-24xlarge-54:29514:30126 [0] NCCL INFO transport/net_socket.cc:493 -> 2
gpu-st-p4d-24xlarge-54:29514:30126 [0] NCCL INFO include/net.h:32 -> 2
gpu-st-p4d-24xlarge-54:29514:30126 [0] NCCL INFO transport/net.cc:996 -> 2
gpu-st-p4d-24xlarge-54:29514:30126 [0] NCCL INFO proxy.cc:494 -> 2
gpu-st-p4d-24xlarge-54:29514:30126 [0] NCCL INFO proxy.cc:614 -> 2 [Proxy Thread]
gpu-st-p4d-24xlarge-53:26986:27567 [0] misc/socket.cc:523 NCCL WARN Net : Connection closed by remote peer ip-172-31-231-150.ec2.internal<36242>
gpu-st-p4d-24xlarge-53:26986:27567 [0] NCCL INFO transport/net_socket.cc:493 -> 2
gpu-st-p4d-24xlarge-53:26986:27567 [0] NCCL INFO include/net.h:32 -> 2
gpu-st-p4d-24xlarge-53:26986:27567 [0] NCCL INFO transport/net.cc:996 -> 2
gpu-st-p4d-24xlarge-53:26986:27567 [0] NCCL INFO proxy.cc:494 -> 2
gpu-st-p4d-24xlarge-53:26986:27567 [0] NCCL INFO proxy.cc:614 -> 2 [Proxy Thread]
gpu-st-p4d-24xlarge-51:30343:30906 [0] misc/socket.cc:523 NCCL WARN Net : Connection closed by remote peer ip-172-31-231-150.ec2.internal<55604>
gpu-st-p4d-24xlarge-51:30343:30906 [0] NCCL INFO transport/net_socket.cc:493 -> 2
gpu-st-p4d-24xlarge-51:30343:30906 [0] NCCL INFO include/net.h:32 -> 2
gpu-st-p4d-24xlarge-51:30343:30906 [0] NCCL INFO transport/net.cc:996 -> 2
gpu-st-p4d-24xlarge-51:30343:30906 [0] NCCL INFO proxy.cc:494 -> 2
gpu-st-p4d-24xlarge-51:30343:30906 [0] NCCL INFO proxy.cc:614 -> 2 [Proxy Thread]
gpu-st-p4d-24xlarge-57:27793:28456 [0] misc/socket.cc:523 NCCL WARN Net : Connection closed by remote peer ip-172-31-231-150.ec2.internal<47998>
gpu-st-p4d-24xlarge-57:27793:28456 [0] NCCL INFO transport/net_socket.cc:493 -> 2
gpu-st-p4d-24xlarge-57:27793:28456 [0] NCCL INFO include/net.h:32 -> 2
gpu-st-p4d-24xlarge-57:27793:28456 [0] NCCL INFO transport/net.cc:996 -> 2
gpu-st-p4d-24xlarge-57:27793:28456 [0] NCCL INFO proxy.cc:494 -> 2
gpu-st-p4d-24xlarge-57:27793:28456 [0] NCCL INFO proxy.cc:614 -> 2 [Proxy Thread]
gpu-st-p4d-24xlarge-56:28175:29024 [0] misc/socket.cc:523 NCCL WARN Net : Connection closed by remote peer ip-172-31-231-150.ec2.internal<45056>
gpu-st-p4d-24xlarge-56:28175:29024 [0] NCCL INFO transport/net_socket.cc:493 -> 2
gpu-st-p4d-24xlarge-56:28175:29024 [0] NCCL INFO include/net.h:32 -> 2
gpu-st-p4d-24xlarge-56:28175:29024 [0] NCCL INFO transport/net.cc:996 -> 2
gpu-st-p4d-24xlarge-56:28175:29024 [0] NCCL INFO proxy.cc:494 -> 2
gpu-st-p4d-24xlarge-56:28175:29024 [0] NCCL INFO proxy.cc:614 -> 2 [Proxy Thread]
[E ProcessGroupNCCL.cpp:480] Some NCCL operations have failed or timed out. Due to the asynchronous nature of CUDA kernels, subsequent GPU operations might run on corrupted/incomplete data. To avoid this inconsistency, we are taking the entire process down.
terminate called after throwing an instance of 'std::runtime_error'
what(): NCCL communicator encountered error set by ProcessGroupNCCL: NCCL error: unhandled system error, NCCL version 2.12.12
ncclSystemError: System call (e.g. socket, malloc) or external library call failed or device error. It can be also caused by unexpected exit of a remote peer, you can check NCCL warnings for failure reason and see if there is connection closure by a peer.
gpu-st-p4d-24xlarge-57:27791:28460 [0] misc/socket.cc:523 NCCL WARN Net : Connection closed by remote peer ip-172-31-231-150.ec2.internal<33796>
gpu-st-p4d-24xlarge-57:27791:28460 [0] NCCL INFO transport/net_socket.cc:493 -> 2
gpu-st-p4d-24xlarge-57:27791:28460 [0] NCCL INFO include/net.h:32 -> 2
gpu-st-p4d-24xlarge-57:27791:28460 [0] NCCL INFO transport/net.cc:996 -> 2
gpu-st-p4d-24xlarge-57:27791:28460 [0] NCCL INFO proxy.cc:494 -> 2
gpu-st-p4d-24xlarge-57:27791:28460 [0] NCCL INFO proxy.cc:614 -> 2 [Proxy Thread]
gpu-st-p4d-24xlarge-56:28171:29027 [0] misc/socket.cc:523 NCCL WARN Net : Connection closed by remote peer ip-172-31-231-150.ec2.internal<56018>
gpu-st-p4d-24xlarge-56:28171:29027 [0] NCCL INFO transport/net_socket.cc:493 -> 2
gpu-st-p4d-24xlarge-56:28171:29027 [0] NCCL INFO include/net.h:32 -> 2
gpu-st-p4d-24xlarge-56:28171:29027 [0] NCCL INFO transport/net.cc:996 -> 2
gpu-st-p4d-24xlarge-56:28171:29027 [0] NCCL INFO proxy.cc:494 -> 2
gpu-st-p4d-24xlarge-56:28171:29027 [0] NCCL INFO proxy.cc:614 -> 2 [Proxy Thread]
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: -9) local_rank: 2 (pid: 29156) of binary: /opt/conda/bin/python
gpu-st-p4d-24xlarge-54:29519:29807 [0] NCCL INFO comm 0x7f3130009010 rank 65 nranks 96 cudaDev 1 busId 101d0 - Abort COMPLETE
[E ProcessGroupNCCL.cpp:480] Some NCCL operations have failed or timed out. Due to the asynchronous nature of CUDA kernels, subsequent GPU operations might run on corrupted/incomplete data. To avoid this inconsistency, we are taking the entire process down.
terminate called after throwing an instance of 'std::runtime_error'
what(): NCCL communicator encountered error set by ProcessGroupNCCL: NCCL error: unhandled system error, NCCL version 2.12.12
ncclSystemError: System call (e.g. socket, malloc) or external library call failed or device error. It can be also caused by unexpected exit of a remote peer, you can check NCCL warnings for failure reason and see if there is connection closure by a peer.
gpu-st-p4d-24xlarge-52:39402:39977 [0] misc/socket.cc:503 NCCL WARN Net : Call to recv from 172.31.227.208<54377> failed : Broken pipe
gpu-st-p4d-24xlarge-52:39402:39977 [0] NCCL INFO misc/socket.cc:520 -> 2
gpu-st-p4d-24xlarge-52:39402:39977 [0] NCCL INFO transport/net_socket.cc:493 -> 2
gpu-st-p4d-24xlarge-52:39402:39977 [0] NCCL INFO include/net.h:32 -> 2
gpu-st-p4d-24xlarge-52:39402:39977 [0] NCCL INFO transport/net.cc:870 -> 2
gpu-st-p4d-24xlarge-52:39402:39977 [0] NCCL INFO proxy.cc:494 -> 2
gpu-st-p4d-24xlarge-52:39402:39977 [0] NCCL INFO proxy.cc:614 -> 2 [Proxy Thread]
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 29514 closing signal SIGTERM
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 29515 closing signal SIGTERM
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 29516 closing signal SIGTERM
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 29517 closing signal SIGTERM
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 29519 closing signal SIGTERM
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 30342 closing signal SIGTERM
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 30343 closing signal SIGTERM
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 30344 closing signal SIGTERM
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 30346 closing signal SIGTERM
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 30347 closing signal SIGTERM
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 26986 closing signal SIGTERM
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 26987 closing signal SIGTERM
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 26989 closing signal SIGTERM
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 26990 closing signal SIGTERM
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 26991 closing signal SIGTERM
gpu-st-p4d-24xlarge-58:27723:28290 [0] misc/socket.cc:523 NCCL WARN Net : Connection closed by remote peer ip-172-31-227-130.ec2.internal<34164>
gpu-st-p4d-24xlarge-58:27723:28290 [0] NCCL INFO transport/net_socket.cc:493 -> 2
gpu-st-p4d-24xlarge-58:27723:28290 [0] NCCL INFO include/net.h:32 -> 2
gpu-st-p4d-24xlarge-58:27723:28290 [0] NCCL INFO transport/net.cc:996 -> 2
gpu-st-p4d-24xlarge-58:27723:28290 [0] NCCL INFO proxy.cc:494 -> 2
gpu-st-p4d-24xlarge-58:27723:28290 [0] NCCL INFO proxy.cc:614 -> 2 [Proxy Thread]
gpu-st-p4d-24xlarge-56:28176:28618 [0] NCCL INFO comm 0x7f41ac009010 rank 77 nranks 96 cudaDev 5 busId 901d0 - Abort COMPLETE
gpu-st-p4d-24xlarge-56:28175:28567 [0] NCCL INFO comm 0x7f2810009010 rank 76 nranks 96 cudaDev 4 busId 901c0 - Abort COMPLETE
[E ProcessGroupNCCL.cpp:480] Some NCCL operations have failed or timed out. Due to the asynchronous nature of CUDA kernels, subsequent GPU operations might run on corrupted/incomplete data. To avoid this inconsistency, we are taking the entire process down.
[E ProcessGroupNCCL.cpp:480] Some NCCL operations have failed or timed out. Due to the asynchronous nature of CUDA kernels, subsequent GPU operations might run on corrupted/incomplete data. To avoid this inconsistency, we are taking the entire process down.
terminate called after throwing an instance of 'std::runtime_error'
what():  NCCL communicator encountered error set by ProcessGroupNCCL: NCCL error: unhandled system error, NCCL version 2.12.12
ncclSystemError: System call (e.g. socket, malloc) or external library call failed or device error. It can be also caused by unexpected exit of a remote peer, you can check NCCL warnings for failure reason and see if there is connection closure by a peer.
terminate called after throwing an instance of 'std::runtime_error'
what():  NCCL communicator encountered error set by ProcessGroupNCCL: NCCL error: unhandled system error, NCCL version 2.12.12
ncclSystemError: System call (e.g. socket, malloc) or external library call failed or device error. It can be also caused by unexpected exit of a remote peer, you can check NCCL warnings for failure reason and see if there is connection closure by a peer.
gpu-st-p4d-24xlarge-44:28069:28967 [0] misc/socket.cc:523 NCCL WARN Net : Connection closed by remote peer ip-172-31-230-245.ec2.internal<47806>
gpu-st-p4d-24xlarge-44:28069:28967 [0] NCCL INFO transport/net_socket.cc:493 -> 2
gpu-st-p4d-24xlarge-44:28069:28967 [0] NCCL INFO include/net.h:32 -> 2
gpu-st-p4d-24xlarge-44:28069:28967 [0] NCCL INFO transport/net.cc:996 -> 2
gpu-st-p4d-24xlarge-44:28069:28967 [0] NCCL INFO proxy.cc:494 -> 2
gpu-st-p4d-24xlarge-44:28069:28967 [0] NCCL INFO proxy.cc:614 -> 2 [Proxy Thread]
gpu-st-p4d-24xlarge-52:39401:39699 [0] NCCL INFO comm 0x7f9220009010 rank 52 nranks 96 cudaDev 4 busId 901c0 - Abort COMPLETE
[E ProcessGroupNCCL.cpp:480] Some NCCL operations have failed or timed out. Due to the asynchronous nature of CUDA kernels, subsequent GPU operations might run on corrupted/incomplete data. To avoid this inconsistency, we are taking the entire process down.
terminate called after throwing an instance of 'std::runtime_error'
what(): NCCL communicator encountered error set by ProcessGroupNCCL: NCCL error: unhandled system error, NCCL version 2.12.12
ncclSystemError: System call (e.g. socket, malloc) or external library call failed or device error. It can be also caused by unexpected exit of a remote peer, you can check NCCL warnings for failure reason and see if there is connection closure by a peer.
gpu-st-p4d-24xlarge-56:28172:29025 [0] misc/socket.cc:523 NCCL WARN Net : Connection closed by remote peer ip-172-31-229-154.ec2.internal<42724>
gpu-st-p4d-24xlarge-56:28172:29025 [0] NCCL INFO transport/net_socket.cc:493 -> 2
gpu-st-p4d-24xlarge-56:28172:29025 [0] NCCL INFO include/net.h:32 -> 2
gpu-st-p4d-24xlarge-56:28172:29025 [0] NCCL INFO transport/net.cc:996 -> 2
gpu-st-p4d-24xlarge-56:28172:29025 [0] NCCL INFO proxy.cc:494 -> 2
gpu-st-p4d-24xlarge-56:28172:29025 [0] NCCL INFO proxy.cc:614 -> 2 [Proxy Thread]
gpu-st-p4d-24xlarge-58:27724:28292 [0] misc/socket.cc:523 NCCL WARN Net : Connection closed by remote peer ip-172-31-229-205.ec2.internal<40844>
gpu-st-p4d-24xlarge-58:27724:28292 [0] NCCL INFO transport/net_socket.cc:493 -> 2
gpu-st-p4d-24xlarge-58:27724:28292 [0] NCCL INFO include/net.h:32 -> 2
gpu-st-p4d-24xlarge-58:27724:28292 [0] NCCL INFO transport/net.cc:996 -> 2
gpu-st-p4d-24xlarge-58:27724:28292 [0] NCCL INFO proxy.cc:494 -> 2
gpu-st-p4d-24xlarge-58:27724:28292 [0] NCCL INFO proxy.cc:614 -> 2 [Proxy Thread]
gpu-st-p4d-24xlarge-57:27793:28095 [0] NCCL INFO comm 0x7f3cc0009010 rank 80 nranks 96 cudaDev 0 busId 101c0 - Abort COMPLETE
gpu-st-p4d-24xlarge-50:29215:29785 [0] misc/socket.cc:523 NCCL WARN Net : Connection closed by remote peer ip-172-31-233-244.ec2.internal<51948>
gpu-st-p4d-24xlarge-50:29215:29785 [0] NCCL INFO transport/net_socket.cc:493 -> 2
gpu-st-p4d-24xlarge-50:29215:29785 [0] NCCL INFO include/net.h:32 -> 2
gpu-st-p4d-24xlarge-47:30190:30763 [0] misc/socket.cc:523 NCCL WARN Net : Connection closed by remote peer ip-172-31-233-244.ec2.internal<42136>
gpu-st-p4d-24xlarge-47:30190:30763 [0] NCCL INFO transport/net_socket.cc:493 -> 2
gpu-st-p4d-24xlarge-47:30190:30763 [0] NCCL INFO include/net.h:32 -> 2
gpu-st-p4d-24xlarge-47:30190:30763 [0] NCCL INFO transport/net.cc:996 -> 2
gpu-st-p4d-24xlarge-47:30190:30763 [0] NCCL INFO proxy.cc:494 -> 2
gpu-st-p4d-24xlarge-50:29215:29785 [0] NCCL INFO transport/net.cc:996 -> 2
gpu-st-p4d-24xlarge-50:29215:29785 [0] NCCL INFO proxy.cc:494 -> 2
gpu-st-p4d-24xlarge-50:29215:29785 [0] NCCL INFO proxy.cc:614 -> 2 [Proxy Thread]
gpu-st-p4d-24xlarge-47:30190:30763 [0] NCCL INFO proxy.cc:614 -> 2 [Proxy Thread]
gpu-st-p4d-24xlarge-59:27920:28696 [0] misc/socket.cc:523 NCCL WARN Net : Connection closed by remote peer ip-172-31-233-244.ec2.internal<54450>
gpu-st-p4d-24xlarge-59:27920:28696 [0] NCCL INFO transport/net_socket.cc:493 -> 2
gpu-st-p4d-24xlarge-59:27920:28696 [0] NCCL INFO include/net.h:32 -> 2
gpu-st-p4d-24xlarge-59:27920:28696 [0] NCCL INFO transport/net.cc:996 -> 2
gpu-st-p4d-24xlarge-59:27920:28696 [0] NCCL INFO proxy.cc:494 -> 2
gpu-st-p4d-24xlarge-59:27920:28696 [0] NCCL INFO proxy.cc:614 -> 2 [Proxy Thread]
gpu-st-p4d-24xlarge-52:39397:39978 [0] misc/socket.cc:523 NCCL WARN Net : Connection closed by remote peer ip-172-31-233-244.ec2.internal<44236>
gpu-st-p4d-24xlarge-52:39397:39978 [0] NCCL INFO transport/net_socket.cc:493 -> 2
gpu-st-p4d-24xlarge-52:39397:39978 [0] NCCL INFO include/net.h:32 -> 2
gpu-st-p4d-24xlarge-52:39397:39978 [0] NCCL INFO transport/net.cc:996 -> 2
gpu-st-p4d-24xlarge-52:39397:39978 [0] NCCL INFO proxy.cc:494 -> 2
gpu-st-p4d-24xlarge-52:39397:39978 [0] NCCL INFO proxy.cc:614 -> 2 [Proxy Thread]
gpu-st-p4d-24xlarge-50:29211:29788 [0] misc/socket.cc:523 NCCL WARN Net : Connection closed by remote peer ip-172-31-233-244.ec2.internal<58664>
gpu-st-p4d-24xlarge-50:29211:29788 [0] NCCL INFO transport/net_socket.cc:493 -> 2
gpu-st-p4d-24xlarge-50:29211:29788 [0] NCCL INFO include/net.h:32 -> 2
gpu-st-p4d-24xlarge-47:30188:30762 [0] misc/socket.cc:523 NCCL WARN Net : Connection closed by remote peer ip-172-31-233-244.ec2.internal<39658>
gpu-st-p4d-24xlarge-47:30188:30762 [0] NCCL INFO transport/net_socket.cc:493 -> 2
gpu-st-p4d-24xlarge-47:30188:30762 [0] NCCL INFO include/net.h:32 -> 2
gpu-st-p4d-24xlarge-47:30188:30762 [0] NCCL INFO transport/net.cc:996 -> 2
gpu-st-p4d-24xlarge-50:29211:29788 [0] NCCL INFO transport/net.cc:996 -> 2
gpu-st-p4d-24xlarge-50:29211:29788 [0] NCCL INFO proxy.cc:494 -> 2
gpu-st-p4d-24xlarge-50:29211:29788 [0] NCCL INFO proxy.cc:614 -> 2 [Proxy Thread]
gpu-st-p4d-24xlarge-47:30188:30762 [0] NCCL INFO proxy.cc:494 -> 2
gpu-st-p4d-24xlarge-47:30188:30762 [0] NCCL INFO proxy.cc:614 -> 2 [Proxy Thread]
gpu-st-p4d-24xlarge-59:27918:28695 [0] misc/socket.cc:523 NCCL WARN Net : Connection closed by remote peer ip-172-31-233-244.ec2.internal<58630>
gpu-st-p4d-24xlarge-59:27918:28695 [0] NCCL INFO transport/net_socket.cc:493 -> 2
gpu-st-p4d-24xlarge-59:27918:28695 [0] NCCL INFO include/net.h:32 -> 2
gpu-st-p4d-24xlarge-59:27918:28695 [0] NCCL INFO transport/net.cc:996 -> 2
gpu-st-p4d-24xlarge-59:27918:28695 [0] NCCL INFO proxy.cc:494 -> 2
gpu-st-p4d-24xlarge-59:27918:28695 [0] NCCL INFO proxy.cc:614 -> 2 [Proxy Thread]
[E ProcessGroupNCCL.cpp:480] Some NCCL operations have failed or timed out. Due to the asynchronous nature of CUDA kernels, subsequent GPU operations might run on corrupted/incomplete data. To avoid this inconsistency, we are taking the entire process down.
terminate called after throwing an instance of 'std::runtime_error'
what(): NCCL communicator encountered error set by ProcessGroupNCCL: NCCL error: unhandled system error, NCCL version 2.12.12
ncclSystemError: System call (e.g. socket, malloc) or external library call failed or device error. It can be also caused by unexpected exit of a remote peer, you can check NCCL warnings for failure reason and see if there is connection closure by a peer.
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: -6) local_rank: 4 (pid: 29518) of binary: /opt/conda/bin/python
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: -6) local_rank: 2 (pid: 26988) of binary: /opt/conda/bin/python
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: -6) local_rank: 3 (pid: 30345) of binary: /opt/conda/bin/python
gpu-st-p4d-24xlarge-48:30195:30760 [0] misc/socket.cc:523 NCCL WARN Net : Connection closed by remote peer ip-172-31-230-245.ec2.internal<49974>
gpu-st-p4d-24xlarge-48:30195:30760 [0] NCCL INFO transport/net_socket.cc:493 -> 2
gpu-st-p4d-24xlarge-48:30195:30760 [0] NCCL INFO include/net.h:32 -> 2
gpu-st-p4d-24xlarge-48:30195:30760 [0] NCCL INFO transport/net.cc:996 -> 2
gpu-st-p4d-24xlarge-48:30195:30760 [0] NCCL INFO proxy.cc:494 -> 2
gpu-st-p4d-24xlarge-48:30195:30760 [0] NCCL INFO proxy.cc:614 -> 2 [Proxy Thread]
gpu-st-p4d-24xlarge-52:39402:39702 [0] NCCL INFO comm 0x7f9d04009010 rank 53 nranks 96 cudaDev 5 busId 901d0 - Abort COMPLETE
[E ProcessGroupNCCL.cpp:480] Some NCCL operations have failed or timed out. Due to the asynchronous nature of CUDA kernels, subsequent GPU operations might run on corrupted/incomplete data. To avoid this inconsistency, we are taking the entire process down.
terminate called after throwing an instance of 'std::runtime_error'
what(): NCCL communicator encountered error set by ProcessGroupNCCL: NCCL error: unhandled system error, NCCL version 2.12.12
ncclSystemError: System call (e.g. socket, malloc) or external library call failed or device error. It can be also caused by unexpected exit of a remote peer, you can check NCCL warnings for failure reason and see if there is connection closure by a peer.
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 28171 closing signal SIGTERM
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 28172 closing signal SIGTERM
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 28173 closing signal SIGTERM
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 28174 closing signal SIGTERM
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 39397 closing signal SIGTERM
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 39398 closing signal SIGTERM
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 39399 closing signal SIGTERM
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 39400 closing signal SIGTERM
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 39402 closing signal SIGTERM
gpu-st-p4d-24xlarge-50:29212:29786 [0] misc/socket.cc:503 NCCL WARN Net : Call to recv from 172.31.226.133<56915> failed : Connection reset by peer
gpu-st-p4d-24xlarge-50:29212:29786 [0] NCCL INFO misc/socket.cc:520 -> 2
gpu-st-p4d-24xlarge-50:29212:29786 [0] NCCL INFO transport/net_socket.cc:493 -> 2
gpu-st-p4d-24xlarge-50:29212:29786 [0] NCCL INFO include/net.h:32 -> 2
gpu-st-p4d-24xlarge-50:29212:29786 [0] NCCL INFO transport/net.cc:870 -> 2
gpu-st-p4d-24xlarge-50:29212:29786 [0] NCCL INFO proxy.cc:494 -> 2
gpu-st-p4d-24xlarge-50:29212:29786 [0] NCCL INFO proxy.cc:614 -> 2 [Proxy Thread]
gpu-st-p4d-24xlarge-50:29216:29787 [0] misc/socket.cc:503 NCCL WARN Net : Call to recv from 172.31.234.3<54987> failed : Connection reset by peer
gpu-st-p4d-24xlarge-50:29216:29787 [0] NCCL INFO misc/socket.cc:520 -> 2
gpu-st-p4d-24xlarge-50:29216:29787 [0] NCCL INFO transport/net_socket.cc:493 -> 2
gpu-st-p4d-24xlarge-50:29216:29787 [0] NCCL INFO include/net.h:32 -> 2
gpu-st-p4d-24xlarge-50:29216:29787 [0] NCCL INFO transport/net.cc:870 -> 2
gpu-st-p4d-24xlarge-50:29216:29787 [0] NCCL INFO proxy.cc:494 -> 2
gpu-st-p4d-24xlarge-50:29216:29787 [0] NCCL INFO proxy.cc:614 -> 2 [Proxy Thread]
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 27791 closing signal SIGTERM
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 27792 closing signal SIGTERM
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 27794 closing signal SIGTERM
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 27795 closing signal SIGTERM
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 27796 closing signal SIGTERM
gpu-st-p4d-24xlarge-58:27723:28012 [0] NCCL INFO comm 0x7f866c009010 rank 88 nranks 96 cudaDev 0 busId 101c0 - Abort COMPLETE
[E ProcessGroupNCCL.cpp:480] Some NCCL operations have failed or timed out. Due to the asynchronous nature of CUDA kernels, subsequent GPU operations might run on corrupted/incomplete data. To avoid this inconsistency, we are taking the entire process down.
terminate called after throwing an instance of 'std::runtime_error'
what(): NCCL communicator encountered error set by ProcessGroupNCCL: NCCL error: unhandled system error, NCCL version 2.12.12
ncclSystemError: System call (e.g. socket, malloc) or external library call failed or device error. It can be also caused by unexpected exit of a remote peer, you can check NCCL warnings for failure reason and see if there is connection closure by a peer.
gpu-st-p4d-24xlarge-58:27724:28025 [0] NCCL INFO comm 0x7f0b24009010 rank 89 nranks 96 cudaDev 1 busId 101d0 - Abort COMPLETE
[E ProcessGroupNCCL.cpp:480] Some NCCL operations have failed or timed out. Due to the asynchronous nature of CUDA kernels, subsequent GPU operations might run on corrupted/incomplete data. To avoid this inconsistency, we are taking the entire process down.
terminate called after throwing an instance of 'std::runtime_error'
what(): NCCL communicator encountered error set by ProcessGroupNCCL: NCCL error: unhandled system error, NCCL version 2.12.12
ncclSystemError: System call (e.g. socket, malloc) or external library call failed or device error. It can be also caused by unexpected exit of a remote peer, you can check NCCL warnings for failure reason and see if there is connection closure by a peer.
gpu-st-p4d-24xlarge-58:27719:28293 [0] misc/socket.cc:523 NCCL WARN Net : Connection closed by remote peer ip-172-31-227-130.ec2.internal<58756>
gpu-st-p4d-24xlarge-58:27719:28293 [0] NCCL INFO transport/net_socket.cc:493 -> 2
gpu-st-p4d-24xlarge-58:27719:28293 [0] NCCL INFO include/net.h:32 -> 2
gpu-st-p4d-24xlarge-58:27719:28293 [0] NCCL INFO transport/net.cc:996 -> 2
gpu-st-p4d-24xlarge-58:27719:28293 [0] NCCL INFO proxy.cc:494 -> 2
gpu-st-p4d-24xlarge-58:27719:28293 [0] NCCL INFO proxy.cc:614 -> 2 [Proxy Thread]
gpu-st-p4d-24xlarge-48:30191:30763 [0] misc/socket.cc:523 NCCL WARN Net : Connection closed by remote peer ip-172-31-230-245.ec2.internal<51888>
gpu-st-p4d-24xlarge-48:30191:30763 [0] NCCL INFO transport/net_socket.cc:493 -> 2
gpu-st-p4d-24xlarge-48:30191:30763 [0] NCCL INFO include/net.h:32 -> 2
gpu-st-p4d-24xlarge-48:30191:30763 [0] NCCL INFO transport/net.cc:996 -> 2
gpu-st-p4d-24xlarge-48:30191:30763 [0] NCCL INFO proxy.cc:494 -> 2
gpu-st-p4d-24xlarge-48:30191:30763 [0] NCCL INFO proxy.cc:614 -> 2 [Proxy Thread]
gpu-st-p4d-24xlarge-44:28065:28969 [0] misc/socket.cc:523 NCCL WARN Net : Connection closed by remote peer ip-172-31-230-245.ec2.internal<38130>
gpu-st-p4d-24xlarge-44:28065:28969 [0] NCCL INFO transport/net_socket.cc:493 -> 2
gpu-st-p4d-24xlarge-44:28065:28969 [0] NCCL INFO include/net.h:32 -> 2
gpu-st-p4d-24xlarge-44:28065:28969 [0] NCCL INFO transport/net.cc:996 -> 2
gpu-st-p4d-24xlarge-44:28065:28969 [0] NCCL INFO proxy.cc:494 -> 2
gpu-st-p4d-24xlarge-44:28065:28969 [0] NCCL INFO proxy.cc:614 -> 2 [Proxy Thread]
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: -6) local_rank: 4 (pid: 28175) of binary: /opt/conda/bin/python
gpu-st-p4d-24xlarge-59:27920:28206 [0] NCCL INFO comm 0x7f6e6c009010 rank 92 nranks 96 cudaDev 4 busId 901c0 - Abort COMPLETE
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: -6) local_rank: 4 (pid: 39401) of binary: /opt/conda/bin/python
gpu-st-p4d-24xlarge-45:30137:30702 [0] misc/socket.cc:523 NCCL WARN Net : Connection closed by remote peer ip-172-31-231-78.ec2.internal<42102>
gpu-st-p4d-24xlarge-45:30137:30702 [0] NCCL INFO transport/net_socket.cc:493 -> 2
gpu-st-p4d-24xlarge-45:30137:30702 [0] NCCL INFO include/net.h:32 -> 2
gpu-st-p4d-24xlarge-45:30137:30702 [0] NCCL INFO transport/net.cc:996 -> 2
gpu-st-p4d-24xlarge-45:30137:30702 [0] NCCL INFO proxy.cc:494 -> 2
gpu-st-p4d-24xlarge-45:30137:30702 [0] NCCL INFO proxy.cc:614 -> 2 [Proxy Thread]
gpu-st-p4d-24xlarge-44:28069:28455 [0] NCCL INFO comm 0x7f7aac009010 rank 4 nranks 96 cudaDev 4 busId 901c0 - Abort COMPLETE
[E ProcessGroupNCCL.cpp:480] Some NCCL operations have failed or timed out. Due to the asynchronous nature of CUDA kernels, subsequent GPU operations might run on corrupted/incomplete data. To avoid this inconsistency, we are taking the entire process down.
terminate called after throwing an instance of 'std::runtime_error'
what(): NCCL communicator encountered error set by ProcessGroupNCCL: NCCL error: unhandled system error, NCCL version 2.12.12
ncclSystemError: System call (e.g. socket, malloc) or external library call failed or device error. It can be also caused by unexpected exit of a remote peer, you can check NCCL warnings for failure reason and see if there is connection closure by a peer.
gpu-st-p4d-24xlarge-49:25778:26348 [0] misc/socket.cc:523 NCCL WARN Net : Connection closed by remote peer ip-172-31-234-184.ec2.internal<39924>
gpu-st-p4d-24xlarge-49:25778:26348 [0] NCCL INFO transport/net_socket.cc:493 -> 2
gpu-st-p4d-24xlarge-49:25778:26348 [0] NCCL INFO include/net.h:32 -> 2
gpu-st-p4d-24xlarge-49:25778:26348 [0] NCCL INFO transport/net.cc:996 -> 2
gpu-st-p4d-24xlarge-49:25778:26348 [0] NCCL INFO proxy.cc:494 -> 2
gpu-st-p4d-24xlarge-49:25778:26348 [0] NCCL INFO proxy.cc:614 -> 2 [Proxy Thread]
gpu-st-p4d-24xlarge-48:30195:30488 [0] NCCL INFO comm 0x7f185c009010 rank 28 nranks 96 cudaDev 4 busId 901c0 - Abort COMPLETE
[E ProcessGroupNCCL.cpp:480] Some NCCL operations have failed or timed out. Due to the asynchronous nature of CUDA kernels, subsequent GPU operations might run on corrupted/incomplete data. To avoid this inconsistency, we are taking the entire process down.
terminate called after throwing an instance of 'std::runtime_error'
what(): NCCL communicator encountered error set by ProcessGroupNCCL: NCCL error: unhandled system error, NCCL version 2.12.12
ncclSystemError: System call (e.g. socket, malloc) or external library call failed or device error. It can be also caused by unexpected exit of a remote peer, you can check NCCL warnings for failure reason and see if there is connection closure by a peer.
[E ProcessGroupNCCL.cpp:480] Some NCCL operations have failed or timed out. Due to the asynchronous nature of CUDA kernels, subsequent GPU operations might run on corrupted/incomplete data. To avoid this inconsistency, we are taking the entire process down.
terminate called after throwing an instance of 'std::runtime_error'
what(): NCCL communicator encountered error set by ProcessGroupNCCL: NCCL error: unhandled system error, NCCL version 2.12.12
ncclSystemError: System call (e.g. socket, malloc) or external library call failed or device error. It can be also caused by unexpected exit of a remote peer, you can check NCCL warnings for failure reason and see if there is connection closure by a peer.
gpu-st-p4d-24xlarge-58:27720:28291 [0] misc/socket.cc:523 NCCL WARN Net : Connection closed by remote peer ip-172-31-229-205.ec2.internal<33608>
gpu-st-p4d-24xlarge-58:27720:28291 [0] NCCL INFO transport/net_socket.cc:493 -> 2
gpu-st-p4d-24xlarge-58:27720:28291 [0] NCCL INFO include/net.h:32 -> 2
gpu-st-p4d-24xlarge-58:27720:28291 [0] NCCL INFO transport/net.cc:996 -> 2
gpu-st-p4d-24xlarge-58:27720:28291 [0] NCCL INFO proxy.cc:494 -> 2
gpu-st-p4d-24xlarge-58:27720:28291 [0] NCCL INFO proxy.cc:614 -> 2 [Proxy Thread]
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 27719 closing signal SIGTERM
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 27720 closing signal SIGTERM
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 27721 closing signal SIGTERM
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 27722 closing signal SIGTERM
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 27724 closing signal SIGTERM
gpu-st-p4d-24xlarge-49:25779:26350 [0] misc/socket.cc:523 NCCL WARN Net : Connection closed by remote peer ip-172-31-231-152.ec2.internal<50496>
gpu-st-p4d-24xlarge-49:25779:26350 [0] NCCL INFO transport/net_socket.cc:493 -> 2
gpu-st-p4d-24xlarge-49:25779:26350 [0] NCCL INFO include/net.h:32 -> 2
gpu-st-p4d-24xlarge-49:25779:26350 [0] NCCL INFO transport/net.cc:996 -> 2
gpu-st-p4d-24xlarge-49:25779:26350 [0] NCCL INFO proxy.cc:494 -> 2
gpu-st-p4d-24xlarge-49:25779:26350 [0] NCCL INFO proxy.cc:614 -> 2 [Proxy Thread]
gpu-st-p4d-24xlarge-50:29215:29514 [0] NCCL INFO comm 0x7fe408009010 rank 40 nranks 96 cudaDev 0 busId 101c0 - Abort COMPLETE
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: -6) local_rank: 2 (pid: 27793) of binary: /opt/conda/bin/python
gpu-st-p4d-24xlarge-50:29216:29510 [0] NCCL INFO comm 0x7f3b88009010 rank 41 nranks 96 cudaDev 1 busId 101d0 - Abort COMPLETE
gpu-st-p4d-24xlarge-46:30085:30689 [0] misc/socket.cc:523 NCCL WARN Net : Connection closed by remote peer ip-172-31-226-226.ec2.internal<48056>
gpu-st-p4d-24xlarge-46:30085:30689 [0] NCCL INFO transport/net_socket.cc:493 -> 2
gpu-st-p4d-24xlarge-46:30085:30689 [0] NCCL INFO include/net.h:32 -> 2
gpu-st-p4d-24xlarge-46:30085:30689 [0] NCCL INFO transport/net.cc:996 -> 2
gpu-st-p4d-24xlarge-46:30085:30689 [0] NCCL INFO proxy.cc:494 -> 2
gpu-st-p4d-24xlarge-46:30085:30689 [0] NCCL INFO proxy.cc:614 -> 2 [Proxy Thread]
gpu-st-p4d-24xlarge-47:30190:30482 [0] NCCL INFO comm 0x7f7bb4009010 rank 20 nranks 96 cudaDev 4 busId 901c0 - Abort COMPLETE
[E ProcessGroupNCCL.cpp:480] Some NCCL operations have failed or timed out. Due to the asynchronous nature of CUDA kernels, subsequent GPU operations might run on corrupted/incomplete data. To avoid this inconsistency, we are taking the entire process down.
terminate called after throwing an instance of 'std::runtime_error'
what(): NCCL communicator encountered error set by ProcessGroupNCCL: NCCL error: unhandled system error, NCCL version 2.12.12
ncclSystemError: System call (e.g. socket, malloc) or external library call failed or device error. It can be also caused by unexpected exit of a remote peer, you can check NCCL warnings for failure reason and see if there is connection closure by a peer.
[E ProcessGroupNCCL.cpp:480] Some NCCL operations have failed or timed out. Due to the asynchronous nature of CUDA kernels, subsequent GPU operations might run on corrupted/incomplete data. To avoid this inconsistency, we are taking the entire process down.
terminate called after throwing an instance of 'std::runtime_error'
what(): NCCL communicator encountered error set by ProcessGroupNCCL: NCCL error: unhandled system error, NCCL version 2.12.12
ncclSystemError: System call (e.g. socket, malloc) or external library call failed or device error. It can be also caused by unexpected exit of a remote peer, you can check NCCL warnings for failure reason and see if there is connection closure by a peer.
[E ProcessGroupNCCL.cpp:480] Some NCCL operations have failed or timed out. Due to the asynchronous nature of CUDA kernels, subsequent GPU operations might run on corrupted/incomplete data. To avoid this inconsistency, we are taking the entire process down.
terminate called after throwing an instance of 'std::runtime_error'
what(): NCCL communicator encountered error set by ProcessGroupNCCL: NCCL error: unhandled system error, NCCL version 2.12.12
ncclSystemError: System call (e.g. socket, malloc) or external library call failed or device error. It can be also caused by unexpected exit of a remote peer, you can check NCCL warnings for failure reason and see if there is connection closure by a peer.
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 30191 closing signal SIGTERM
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 30192 closing signal SIGTERM
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 30193 closing signal SIGTERM
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 30194 closing signal SIGTERM
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 30196 closing signal SIGTERM
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 27918 closing signal SIGTERM
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 27919 closing signal SIGTERM
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 27921 closing signal SIGTERM
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 27922 closing signal SIGTERM
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 27923 closing signal SIGTERM
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 28065 closing signal SIGTERM
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 28066 closing signal SIGTERM
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 28067 closing signal SIGTERM
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 28068 closing signal SIGTERM
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 28070 closing signal SIGTERM
gpu-st-p4d-24xlarge-46:30086:30691 [0] misc/socket.cc:503 NCCL WARN Net : Call to recv from 172.31.224.181<45511> failed : Connection reset by peer
gpu-st-p4d-24xlarge-46:30086:30691 [0] NCCL INFO misc/socket.cc:520 -> 2
gpu-st-p4d-24xlarge-46:30086:30691 [0] NCCL INFO transport/net_socket.cc:493 -> 2
gpu-st-p4d-24xlarge-46:30086:30691 [0] NCCL INFO include/net.h:32 -> 2
gpu-st-p4d-24xlarge-46:30086:30691 [0] NCCL INFO transport/net.cc:870 -> 2
gpu-st-p4d-24xlarge-46:30086:30691 [0] NCCL INFO proxy.cc:494 -> 2
gpu-st-p4d-24xlarge-46:30086:30691 [0] NCCL INFO proxy.cc:614 -> 2 [Proxy Thread]
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 30188 closing signal SIGTERM
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 30189 closing signal SIGTERM
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 30191 closing signal SIGTERM
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 30192 closing signal SIGTERM
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 30193 closing signal SIGTERM
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 29211 closing signal SIGTERM
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 29212 closing signal SIGTERM
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 29213 closing signal SIGTERM
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 29214 closing signal SIGTERM
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: -6) local_rank: 4 (pid: 27723) of binary: /opt/conda/bin/python
gpu-st-p4d-24xlarge-49:25779:26078 [0] NCCL INFO comm 0x7f080c009010 rank 33 nranks 96 cudaDev 1 busId 101d0 - Abort COMPLETE
gpu-st-p4d-24xlarge-49:25778:26081 [0] NCCL INFO comm 0x7f1324009010 rank 32 nranks 96 cudaDev 0 busId 101c0 - Abort COMPLETE
gpu-st-p4d-24xlarge-45:30138:30705 [0] misc/socket.cc:523 NCCL WARN Net : Connection closed by remote peer ip-172-31-226-192.ec2.internal<46866>
gpu-st-p4d-24xlarge-45:30138:30705 [0] NCCL INFO transport/net_socket.cc:493 -> 2
gpu-st-p4d-24xlarge-45:30138:30705 [0] NCCL INFO include/net.h:32 -> 2
gpu-st-p4d-24xlarge-45:30138:30705 [0] NCCL INFO transport/net.cc:996 -> 2
gpu-st-p4d-24xlarge-45:30138:30705 [0] NCCL INFO proxy.cc:494 -> 2
gpu-st-p4d-24xlarge-45:30138:30705 [0] NCCL INFO proxy.cc:614 -> 2 [Proxy Thread]
gpu-st-p4d-24xlarge-46:30086:30416 [0] NCCL INFO comm 0x7f2de8009010 rank 17 nranks 96 cudaDev 1 busId 101d0 - Abort COMPLETE
gpu-st-p4d-24xlarge-46:30085:30427 [0] NCCL INFO comm 0x7f674c009010 rank 16 nranks 96 cudaDev 0 busId 101c0 - Abort COMPLETE
[E ProcessGroupNCCL.cpp:480] Some NCCL operations have failed or timed out. Due to the asynchronous nature of CUDA kernels, subsequent GPU operations might run on corrupted/incomplete data. To avoid this inconsistency, we are taking the entire process down.
[E ProcessGroupNCCL.cpp:480] Some NCCL operations have failed or timed out. Due to the asynchronous nature of CUDA kernels, subsequent GPU operations might run on corrupted/incomplete data. To avoid this inconsistency, we are taking the entire process down.
terminate called after throwing an instance of 'std::runtime_error'
terminate called after throwing an instance of 'std::runtime_error'
what(): NCCL communicator encountered error set by ProcessGroupNCCL: NCCL error: unhandled system error, NCCL version 2.12.12
ncclSystemError: System call (e.g. socket, malloc) or external library call failed or device error. It can be also caused by unexpected exit of a remote peer, you can check NCCL warnings for failure reason and see if there is connection closure by a peer.
what(): NCCL communicator encountered error set by ProcessGroupNCCL: NCCL error: unhandled system error, NCCL version 2.12.12
ncclSystemError: System call (e.g. socket, malloc) or external library call failed or device error. It can be also caused by unexpected exit of a remote peer, you can check NCCL warnings for failure reason and see if there is connection closure by a peer.
gpu-st-p4d-24xlarge-49:25776:26349 [0] misc/socket.cc:523 NCCL WARN Net : Connection closed by remote peer ip-172-31-234-184.ec2.internal<47822>
gpu-st-p4d-24xlarge-49:25776:26349 [0] NCCL INFO transport/net_socket.cc:493 -> 2
gpu-st-p4d-24xlarge-49:25776:26349 [0] NCCL INFO include/net.h:32 -> 2
gpu-st-p4d-24xlarge-49:25776:26349 [0] NCCL INFO transport/net.cc:996 -> 2
gpu-st-p4d-24xlarge-49:25776:26349 [0] NCCL INFO proxy.cc:494 -> 2
gpu-st-p4d-24xlarge-49:25776:26349 [0] NCCL INFO proxy.cc:614 -> 2 [Proxy Thread]
gpu-st-p4d-24xlarge-45:30135:30704 [0] misc/socket.cc:523 NCCL WARN Net : Connection closed by remote peer ip-172-31-231-78.ec2.internal<35068>
gpu-st-p4d-24xlarge-45:30135:30704 [0] NCCL INFO transport/net_socket.cc:493 -> 2
gpu-st-p4d-24xlarge-45:30135:30704 [0] NCCL INFO include/net.h:32 -> 2
gpu-st-p4d-24xlarge-45:30135:30704 [0] NCCL INFO transport/net.cc:996 -> 2
gpu-st-p4d-24xlarge-45:30135:30704 [0] NCCL INFO proxy.cc:494 -> 2
gpu-st-p4d-24xlarge-45:30135:30704 [0] NCCL INFO proxy.cc:614 -> 2 [Proxy Thread]
gpu-st-p4d-24xlarge-46:30081:30692 [0] misc/socket.cc:523 NCCL WARN Net : Connection closed by remote peer ip-172-31-234-184.ec2.internal<50118>
gpu-st-p4d-24xlarge-46:30081:30692 [0] NCCL INFO transport/net_socket.cc:493 -> 2
gpu-st-p4d-24xlarge-46:30081:30692 [0] NCCL INFO include/net.h:32 -> 2
gpu-st-p4d-24xlarge-46:30081:30692 [0] NCCL INFO transport/net.cc:996 -> 2
gpu-st-p4d-24xlarge-46:30081:30692 [0] NCCL INFO proxy.cc:494 -> 2
gpu-st-p4d-24xlarge-46:30081:30692 [0] NCCL INFO proxy.cc:614 -> 2 [Proxy Thread]
[E ProcessGroupNCCL.cpp:480] Some NCCL operations have failed or timed out. Due to the asynchronous nature of CUDA kernels, subsequent GPU operations might run on corrupted/incomplete data. To avoid this inconsistency, we are taking the entire process down.
[E ProcessGroupNCCL.cpp:480] Some NCCL operations have failed or timed out. Due to the asynchronous nature of CUDA kernels, subsequent GPU operations might run on corrupted/incomplete data. To avoid this inconsistency, we are taking the entire process down.
terminate called after throwing an instance of 'std::runtime_error'
terminate called after throwing an instance of 'std::runtime_error'
what(): NCCL communicator encountered error set by ProcessGroupNCCL: NCCL error: unhandled system error, NCCL version 2.12.12
ncclSystemError: System call (e.g. socket, malloc) or external library call failed or device error. It can be also caused by unexpected exit of a remote peer, you can check NCCL warnings for failure reason and see if there is connection closure by a peer.
what(): NCCL communicator encountered error set by ProcessGroupNCCL: NCCL error: unhandled system error, NCCL version 2.12.12
ncclSystemError: System call (e.g. socket, malloc) or external library call failed or device error. It can be also caused by unexpected exit of a remote peer, you can check NCCL warnings for failure reason and see if there is connection closure by a peer.
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: -6) local_rank: 4 (pid: 30195) of binary: /opt/conda/bin/python
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: -6) local_rank: 4 (pid: 28069) of binary: /opt/conda/bin/python
gpu-st-p4d-24xlarge-45:30137:30431 [0] NCCL INFO comm 0x7f4de8009010 rank 8 nranks 96 cudaDev 0 busId 101c0 - Abort COMPLETE
[E ProcessGroupNCCL.cpp:480] Some NCCL operations have failed or timed out. Due to the asynchronous nature of CUDA kernels, subsequent GPU operations might run on corrupted/incomplete data. To avoid this inconsistency, we are taking the entire process down.
terminate called after throwing an instance of 'std::runtime_error'
what(): NCCL communicator encountered error set by ProcessGroupNCCL: NCCL error: unhandled system error, NCCL version 2.12.12
ncclSystemError: System call (e.g. socket, malloc) or external library call failed or device error. It can be also caused by unexpected exit of a remote peer, you can check NCCL warnings for failure reason and see if there is connection closure by a peer.
gpu-st-p4d-24xlarge-49:25777:26351 [0] misc/socket.cc:523 NCCL WARN Net : Connection closed by remote peer ip-172-31-231-152.ec2.internal<44236>
gpu-st-p4d-24xlarge-49:25777:26351 [0] NCCL INFO transport/net_socket.cc:493 -> 2
gpu-st-p4d-24xlarge-49:25777:26351 [0] NCCL INFO include/net.h:32 -> 2
gpu-st-p4d-24xlarge-49:25777:26351 [0] NCCL INFO transport/net.cc:996 -> 2
gpu-st-p4d-24xlarge-49:25777:26351 [0] NCCL INFO proxy.cc:494 -> 2
gpu-st-p4d-24xlarge-49:25777:26351 [0] NCCL INFO proxy.cc:614 -> 2 [Proxy Thread]
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 25776 closing signal SIGTERM
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 25777 closing signal SIGTERM
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 25779 closing signal SIGTERM
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 25780 closing signal SIGTERM
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 25781 closing signal SIGTERM
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: -6) local_rank: 2 (pid: 27920) of binary: /opt/conda/bin/python
gpu-st-p4d-24xlarge-45:30138:30426 [0] NCCL INFO comm 0x7fb2c4009010 rank 9 nranks 96 cudaDev 1 busId 101d0 - Abort COMPLETE
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: -6) local_rank: 4 (pid: 29215) of binary: /opt/conda/bin/python
[E ProcessGroupNCCL.cpp:480] Some NCCL operations have failed or timed out. Due to the asynchronous nature of CUDA kernels, subsequent GPU operations might run on corrupted/incomplete data. To avoid this inconsistency, we are taking the entire process down.
terminate called after throwing an instance of 'std::runtime_error'
what(): NCCL communicator encountered error set by ProcessGroupNCCL: NCCL error: unhandled system error, NCCL version 2.12.12
ncclSystemError: System call (e.g. socket, malloc) or external library call failed or device error. It can be also caused by unexpected exit of a remote peer, you can check NCCL warnings for failure reason and see if there is connection closure by a peer.
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: -6) local_rank: 2 (pid: 30190) of binary: /opt/conda/bin/python
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 30135 closing signal SIGTERM
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 30136 closing signal SIGTERM
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 30138 closing signal SIGTERM
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 30139 closing signal SIGTERM
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 30145 closing signal SIGTERM
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 30081 closing signal SIGTERM
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 30082 closing signal SIGTERM
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 30083 closing signal SIGTERM
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 30084 closing signal SIGTERM
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: -6) local_rank: 2 (pid: 25778) of binary: /opt/conda/bin/python
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: -6) local_rank: 4 (pid: 30085) of binary: /opt/conda/bin/python
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: -6) local_rank: 2 (pid: 30137) of binary: /opt/conda/bin/python
Traceback (most recent call last):
File "/opt/conda/bin/torchrun", line 33, in <module>
sys.exit(load_entry_point('torch==1.13.0a0+08820cb', 'console_scripts', 'torchrun')())
File "/opt/conda/lib/python3.8/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 345, in wrapper
return f(*args, **kwargs)
File "/opt/conda/lib/python3.8/site-packages/torch/distributed/run.py", line 761, in main
run(args)
File "/opt/conda/lib/python3.8/site-packages/torch/distributed/run.py", line 752, in run
elastic_launch(
File "/opt/conda/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 131, in __call__
return launch_agent(self._config, self._entrypoint, list(args))
File "/opt/conda/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 245, in launch_agent
raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
============================================================
scripts/ddp_convnext.py FAILED
------------------------------------------------------------
Failures:
<NO_OTHER_FAILURES>
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
time : 2022-08-26_22:41:16
host : gpu-st-p4d-24xlarge-54.hpc-1click-prod450.pcluster
rank : 64 (local_rank: 4)
exitcode : -6 (pid: 29518)
error_file: <N/A>
traceback : Signal 6 (SIGABRT) received by PID 29518
============================================================
Traceback (most recent call last):
File "/opt/conda/bin/torchrun", line 33, in <module>
sys.exit(load_entry_point('torch==1.13.0a0+08820cb', 'console_scripts', 'torchrun')())
File "/opt/conda/lib/python3.8/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 345, in wrapper
return f(*args, **kwargs)
File "/opt/conda/lib/python3.8/site-packages/torch/distributed/run.py", line 761, in main
run(args)
File "/opt/conda/lib/python3.8/site-packages/torch/distributed/run.py", line 752, in run
elastic_launch(
File "/opt/conda/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 131, in __call__
return launch_agent(self._config, self._entrypoint, list(args))
File "/opt/conda/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 245, in launch_agent
raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
============================================================
scripts/ddp_convnext.py FAILED
------------------------------------------------------------
Failures:
[1]:
time : 2022-08-26_22:41:31
host : gpu-st-p4d-24xlarge-56.hpc-1click-prod450.pcluster
rank : 77 (local_rank: 5)
exitcode : -6 (pid: 28176)
error_file: <N/A>
traceback : Signal 6 (SIGABRT) received by PID 28176
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
time : 2022-08-26_22:41:31
host : gpu-st-p4d-24xlarge-56.hpc-1click-prod450.pcluster
rank : 76 (local_rank: 4)
exitcode : -6 (pid: 28175)
error_file: <N/A>
traceback : Signal 6 (SIGABRT) received by PID 28175
============================================================
Traceback (most recent call last):
File "/opt/conda/bin/torchrun", line 33, in <module>
sys.exit(load_entry_point('torch==1.13.0a0+08820cb', 'console_scripts', 'torchrun')())
File "/opt/conda/lib/python3.8/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 345, in wrapper
return f(*args, **kwargs)
File "/opt/conda/lib/python3.8/site-packages/torch/distributed/run.py", line 761, in main
run(args)
File "/opt/conda/lib/python3.8/site-packages/torch/distributed/run.py", line 752, in run
elastic_launch(
File "/opt/conda/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 131, in __call__
return launch_agent(self._config, self._entrypoint, list(args))
File "/opt/conda/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 245, in launch_agent
raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
============================================================
scripts/ddp_convnext.py FAILED
------------------------------------------------------------
Failures:
<NO_OTHER_FAILURES>
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
time : 2022-08-26_22:41:37
host : gpu-st-p4d-24xlarge-57.hpc-1click-prod450.pcluster
rank : 80 (local_rank: 2)
exitcode : -6 (pid: 27793)
error_file: <N/A>
traceback : Signal 6 (SIGABRT) received by PID 27793
============================================================
Traceback (most recent call last):
File "/opt/conda/bin/torchrun", line 33, in <module>
sys.exit(load_entry_point('torch==1.13.0a0+08820cb', 'console_scripts', 'torchrun')())
File "/opt/conda/lib/python3.8/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 345, in wrapper
return f(*args, **kwargs)
File "/opt/conda/lib/python3.8/site-packages/torch/distributed/run.py", line 761, in main
run(args)
File "/opt/conda/lib/python3.8/site-packages/torch/distributed/run.py", line 752, in run
elastic_launch(
File "/opt/conda/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 131, in __call__
return launch_agent(self._config, self._entrypoint, list(args))
File "/opt/conda/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 245, in launch_agent
raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
============================================================
scripts/ddp_convnext.py FAILED
------------------------------------------------------------
Failures:
<NO_OTHER_FAILURES>
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
time : 2022-08-26_22:41:47
host : gpu-st-p4d-24xlarge-58.hpc-1click-prod450.pcluster
rank : 88 (local_rank: 4)
exitcode : -6 (pid: 27723)
error_file: <N/A>
traceback : Signal 6 (SIGABRT) received by PID 27723
============================================================
Traceback (most recent call last):
File "/opt/conda/bin/torchrun", line 33, in <module>
sys.exit(load_entry_point('torch==1.13.0a0+08820cb', 'console_scripts', 'torchrun')())
File "/opt/conda/lib/python3.8/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 345, in wrapper
return f(*args, **kwargs)
File "/opt/conda/lib/python3.8/site-packages/torch/distributed/run.py", line 761, in main
run(args)
File "/opt/conda/lib/python3.8/site-packages/torch/distributed/run.py", line 752, in run
elastic_launch(
File "/opt/conda/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 131, in __call__
return launch_agent(self._config, self._entrypoint, list(args))
File "/opt/conda/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 245, in launch_agent
raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
============================================================
Traceback (most recent call last):
File "/opt/conda/bin/torchrun", line 33, in <module>
sys.exit(load_entry_point('torch==1.13.0a0+08820cb', 'console_scripts', 'torchrun')())
File "/opt/conda/lib/python3.8/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 345, in wrapper
return f(*args, **kwargs)
File "/opt/conda/lib/python3.8/site-packages/torch/distributed/run.py", line 761, in main
run(args)
File "/opt/conda/lib/python3.8/site-packages/torch/distributed/run.py", line 752, in run
elastic_launch(
File "/opt/conda/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 131, in __call__
return launch_agent(self._config, self._entrypoint, list(args))
File "/opt/conda/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 245, in launch_agent
raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
============================================================
scripts/ddp_convnext.py FAILED
------------------------------------------------------------
Failures:
<NO_OTHER_FAILURES>
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
time : 2022-08-26_22:41:51
host : gpu-st-p4d-24xlarge-59.hpc-1click-prod450.pcluster
rank : 92 (local_rank: 2)
exitcode : -6 (pid: 27920)
error_file: <N/A>
traceback : Signal 6 (SIGABRT) received by PID 27920
============================================================
scripts/ddp_convnext.py FAILED
------------------------------------------------------------
Failures:
<NO_OTHER_FAILURES>
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
time : 2022-08-26_22:41:56
host : gpu-st-p4d-24xlarge-47.hpc-1click-prod450.pcluster
rank : 20 (local_rank: 2)
exitcode : -6 (pid: 30190)
error_file: <N/A>
traceback : Signal 6 (SIGABRT) received by PID 30190
============================================================
Traceback (most recent call last):
File "/opt/conda/bin/torchrun", line 33, in <module>
sys.exit(load_entry_point('torch==1.13.0a0+08820cb', 'console_scripts', 'torchrun')())
File "/opt/conda/lib/python3.8/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 345, in wrapper
return f(*args, **kwargs)
File "/opt/conda/lib/python3.8/site-packages/torch/distributed/run.py", line 761, in main
run(args)
File "/opt/conda/lib/python3.8/site-packages/torch/distributed/run.py", line 752, in run
elastic_launch(
File "/opt/conda/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 131, in __call__
return launch_agent(self._config, self._entrypoint, list(args))
File "/opt/conda/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 245, in launch_agent
raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
============================================================
Traceback (most recent call last):
File "/opt/conda/bin/torchrun", line 33, in <module>
sys.exit(load_entry_point('torch==1.13.0a0+08820cb', 'console_scripts', 'torchrun')())
File "/opt/conda/lib/python3.8/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 345, in wrapper
return f(*args, **kwargs)
File "/opt/conda/lib/python3.8/site-packages/torch/distributed/run.py", line 761, in main
run(args)
File "/opt/conda/lib/python3.8/site-packages/torch/distributed/run.py", line 752, in run
elastic_launch(
File "/opt/conda/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 131, in __call__
return launch_agent(self._config, self._entrypoint, list(args))
File "/opt/conda/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 245, in launch_agent
raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
============================================================
scripts/ddp_convnext.py FAILED
------------------------------------------------------------
Failures:
[1]:
time : 2022-08-26_22:41:56
host : gpu-st-p4d-24xlarge-50.hpc-1click-prod450.pcluster
rank : 41 (local_rank: 5)
exitcode : -6 (pid: 29216)
error_file: <N/A>
traceback : Signal 6 (SIGABRT) received by PID 29216
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
time : 2022-08-26_22:41:56
host : gpu-st-p4d-24xlarge-50.hpc-1click-prod450.pcluster
rank : 40 (local_rank: 4)
exitcode : -6 (pid: 29215)
error_file: <N/A>
traceback : Signal 6 (SIGABRT) received by PID 29215
============================================================
scripts/ddp_convnext.py FAILED
------------------------------------------------------------
Failures:
<NO_OTHER_FAILURES>
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
time : 2022-08-26_22:41:16
host : gpu-st-p4d-24xlarge-53.hpc-1click-prod450.pcluster
rank : 56 (local_rank: 2)
exitcode : -6 (pid: 26988)
error_file: <N/A>
traceback : Signal 6 (SIGABRT) received by PID 26988
============================================================
Traceback (most recent call last):
File "/opt/conda/bin/torchrun", line 33, in <module>
sys.exit(load_entry_point('torch==1.13.0a0+08820cb', 'console_scripts', 'torchrun')())
File "/opt/conda/lib/python3.8/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 345, in wrapper
return f(*args, **kwargs)
File "/opt/conda/lib/python3.8/site-packages/torch/distributed/run.py", line 761, in main
run(args)
File "/opt/conda/lib/python3.8/site-packages/torch/distributed/run.py", line 752, in run
elastic_launch(
File "/opt/conda/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 131, in __call__
return launch_agent(self._config, self._entrypoint, list(args))
File "/opt/conda/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 245, in launch_agent
raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
============================================================
scripts/ddp_convnext.py FAILED
------------------------------------------------------------
Failures:
<NO_OTHER_FAILURES>
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
time : 2022-08-26_22:41:51
host : gpu-st-p4d-24xlarge-48.hpc-1click-prod450.pcluster
rank : 28 (local_rank: 4)
exitcode : -6 (pid: 30195)
error_file: <N/A>
traceback : Signal 6 (SIGABRT) received by PID 30195
============================================================
Traceback (most recent call last):
File "/opt/conda/bin/torchrun", line 33, in <module>
sys.exit(load_entry_point('torch==1.13.0a0+08820cb', 'console_scripts', 'torchrun')())
File "/opt/conda/lib/python3.8/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 345, in wrapper
return f(*args, **kwargs)
File "/opt/conda/lib/python3.8/site-packages/torch/distributed/run.py", line 761, in main
run(args)
File "/opt/conda/lib/python3.8/site-packages/torch/distributed/run.py", line 752, in run
elastic_launch(
File "/opt/conda/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 131, in __call__
return launch_agent(self._config, self._entrypoint, list(args))
File "/opt/conda/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 245, in launch_agent
raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
============================================================
scripts/ddp_convnext.py FAILED
------------------------------------------------------------
Failures:
<NO_OTHER_FAILURES>
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
time : 2022-08-26_22:41:52
host : gpu-st-p4d-24xlarge-44.hpc-1click-prod450.pcluster
rank : 4 (local_rank: 4)
exitcode : -6 (pid: 28069)
error_file: <N/A>
traceback : Signal 6 (SIGABRT) received by PID 28069
============================================================
Traceback (most recent call last):
File "/opt/conda/bin/torchrun", line 33, in <module>
sys.exit(load_entry_point('torch==1.13.0a0+08820cb', 'console_scripts', 'torchrun')())
File "/opt/conda/lib/python3.8/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 345, in wrapper
return f(*args, **kwargs)
File "/opt/conda/lib/python3.8/site-packages/torch/distributed/run.py", line 761, in main
run(args)
File "/opt/conda/lib/python3.8/site-packages/torch/distributed/run.py", line 752, in run
elastic_launch(
File "/opt/conda/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 131, in __call__
return launch_agent(self._config, self._entrypoint, list(args))
File "/opt/conda/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 245, in launch_agent
raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
============================================================
scripts/ddp_convnext.py FAILED
------------------------------------------------------------
Failures:
<NO_OTHER_FAILURES>
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
time : 2022-08-26_22:42:06
host : gpu-st-p4d-24xlarge-49.hpc-1click-prod450.pcluster
rank : 32 (local_rank: 2)
exitcode : -6 (pid: 25778)
error_file: <N/A>
traceback : Signal 6 (SIGABRT) received by PID 25778
============================================================
Traceback (most recent call last):
File "/opt/conda/bin/torchrun", line 33, in <module>
sys.exit(load_entry_point('torch==1.13.0a0+08820cb', 'console_scripts', 'torchrun')())
File "/opt/conda/lib/python3.8/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 345, in wrapper
return f(*args, **kwargs)
File "/opt/conda/lib/python3.8/site-packages/torch/distributed/run.py", line 761, in main
run(args)
File "/opt/conda/lib/python3.8/site-packages/torch/distributed/run.py", line 752, in run
elastic_launch(
File "/opt/conda/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 131, in __call__
return launch_agent(self._config, self._entrypoint, list(args))
File "/opt/conda/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 245, in launch_agent
raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
============================================================
scripts/ddp_convnext.py FAILED
------------------------------------------------------------
Failures:
<NO_OTHER_FAILURES>
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
time : 2022-08-26_22:42:11
host : gpu-st-p4d-24xlarge-45.hpc-1click-prod450.pcluster
rank : 8 (local_rank: 2)
exitcode : -6 (pid: 30137)
error_file: <N/A>
traceback : Signal 6 (SIGABRT) received by PID 30137
============================================================
Traceback (most recent call last):
File "/opt/conda/bin/torchrun", line 33, in <module>
sys.exit(load_entry_point('torch==1.13.0a0+08820cb', 'console_scripts', 'torchrun')())
File "/opt/conda/lib/python3.8/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 345, in wrapper
return f(*args, **kwargs)
File "/opt/conda/lib/python3.8/site-packages/torch/distributed/run.py", line 761, in main
run(args)
File "/opt/conda/lib/python3.8/site-packages/torch/distributed/run.py", line 752, in run
elastic_launch(
File "/opt/conda/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 131, in __call__
return launch_agent(self._config, self._entrypoint, list(args))
File "/opt/conda/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 245, in launch_agent
raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
============================================================
scripts/ddp_convnext.py FAILED
------------------------------------------------------------
Failures:
[1]:
time : 2022-08-26_22:42:11
host : gpu-st-p4d-24xlarge-46.hpc-1click-prod450.pcluster
rank : 17 (local_rank: 5)
exitcode : -6 (pid: 30086)
error_file: <N/A>
traceback : Signal 6 (SIGABRT) received by PID 30086
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
time : 2022-08-26_22:42:11
host : gpu-st-p4d-24xlarge-46.hpc-1click-prod450.pcluster
rank : 16 (local_rank: 4)
exitcode : -6 (pid: 30085)
error_file: <N/A>
traceback : Signal 6 (SIGABRT) received by PID 30085
============================================================
Traceback (most recent call last):
File "/opt/conda/bin/torchrun", line 33, in <module>
sys.exit(load_entry_point('torch==1.13.0a0+08820cb', 'console_scripts', 'torchrun')())
File "/opt/conda/lib/python3.8/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 345, in wrapper
return f(*args, **kwargs)
File "/opt/conda/lib/python3.8/site-packages/torch/distributed/run.py", line 761, in main
run(args)
File "/opt/conda/lib/python3.8/site-packages/torch/distributed/run.py", line 752, in run
elastic_launch(
File "/opt/conda/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 131, in __call__
return launch_agent(self._config, self._entrypoint, list(args))
File "/opt/conda/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 245, in launch_agent
raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
============================================================
scripts/ddp_convnext.py FAILED
------------------------------------------------------------
Failures:
<NO_OTHER_FAILURES>
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
time : 2022-08-26_22:40:52
host : gpu-st-p4d-24xlarge-55.hpc-1click-prod450.pcluster
rank : 68 (local_rank: 2)
exitcode : -9 (pid: 29156)
error_file: <N/A>
traceback : Signal 9 (SIGKILL) received by PID 29156
============================================================
Traceback (most recent call last):
File "/opt/conda/bin/torchrun", line 33, in <module>
sys.exit(load_entry_point('torch==1.13.0a0+08820cb', 'console_scripts', 'torchrun')())
File "/opt/conda/lib/python3.8/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 345, in wrapper
return f(*args, **kwargs)
File "/opt/conda/lib/python3.8/site-packages/torch/distributed/run.py", line 761, in main
run(args)
File "/opt/conda/lib/python3.8/site-packages/torch/distributed/run.py", line 752, in run
elastic_launch(
File "/opt/conda/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 131, in __call__
return launch_agent(self._config, self._entrypoint, list(args))
File "/opt/conda/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 245, in launch_agent
raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
============================================================
scripts/ddp_convnext.py FAILED
------------------------------------------------------------
Failures:
<NO_OTHER_FAILURES>
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
time : 2022-08-26_22:41:16
host : gpu-st-p4d-24xlarge-51.hpc-1click-prod450.pcluster
rank : 45 (local_rank: 3)
exitcode : -6 (pid: 30345)
error_file: <N/A>
traceback : Signal 6 (SIGABRT) received by PID 30345
============================================================
Traceback (most recent call last):
File "/opt/conda/bin/torchrun", line 33, in <module>
sys.exit(load_entry_point('torch==1.13.0a0+08820cb', 'console_scripts', 'torchrun')())
File "/opt/conda/lib/python3.8/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 345, in wrapper
return f(*args, **kwargs)
File "/opt/conda/lib/python3.8/site-packages/torch/distributed/run.py", line 761, in main
run(args)
File "/opt/conda/lib/python3.8/site-packages/torch/distributed/run.py", line 752, in run
elastic_launch(
File "/opt/conda/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 131, in __call__
return launch_agent(self._config, self._entrypoint, list(args))
File "/opt/conda/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 245, in launch_agent
raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
============================================================
scripts/ddp_convnext.py FAILED
------------------------------------------------------------
Failures:
<NO_OTHER_FAILURES>
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
time : 2022-08-26_22:41:31
host : gpu-st-p4d-24xlarge-52.hpc-1click-prod450.pcluster
rank : 52 (local_rank: 4)
exitcode : -6 (pid: 39401)
error_file: <N/A>
traceback : Signal 6 (SIGABRT) received by PID 39401
============================================================
srun: error: gpu-st-p4d-24xlarge-50: task 6: Exited with exit code 1
srun: error: gpu-st-p4d-24xlarge-58: task 14: Exited with exit code 1
srun: error: gpu-st-p4d-24xlarge-47: task 3: Exited with exit code 1
srun: error: gpu-st-p4d-24xlarge-59: task 15: Exited with exit code 1
srun: error: gpu-st-p4d-24xlarge-56: task 12: Exited with exit code 1
srun: error: gpu-st-p4d-24xlarge-53: task 9: Exited with exit code 1
srun: error: gpu-st-p4d-24xlarge-54: task 10: Exited with exit code 1
srun: error: gpu-st-p4d-24xlarge-44: task 0: Exited with exit code 1
srun: error: gpu-st-p4d-24xlarge-49: task 5: Exited with exit code 1
srun: error: gpu-st-p4d-24xlarge-52: task 8: Exited with exit code 1
srun: error: gpu-st-p4d-24xlarge-45: task 1: Exited with exit code 1
srun: error: gpu-st-p4d-24xlarge-48: task 4: Exited with exit code 1
srun: error: gpu-st-p4d-24xlarge-57: task 13: Exited with exit code 1
srun: error: gpu-st-p4d-24xlarge-46: task 2: Exited with exit code 1
slurmstepd: error: Detected 1142 oom-kill event(s) in StepId=4236.0. Some of your processes may have been killed by the cgroup out-of-memory handler.
srun: error: gpu-st-p4d-24xlarge-51: task 7: Exited with exit code 1
srun: error: gpu-st-p4d-24xlarge-55: task 11: Out Of Memory
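Reading the tail of the log, the slurmstepd line (1142 oom-kill events in StepId=4236.0) suggests the ncclSystemError / "Connection closed by remote peer" cascade is downstream of host-memory exhaustion: ranks on gpu-st-p4d-24xlarge-55 were SIGKILLed by the cgroup OOM handler, and every peer that was mid-collective with them then aborted. A minimal, hypothetical sketch for surfacing that earlier is below; it assumes psutil is installed and relies on the RANK environment variable that torchrun exports. It is only an illustration, not part of scripts/ddp_convnext.py.

    # Hypothetical helper (not from this gist): periodically log host RSS and
    # free RAM per rank, so an impending cgroup OOM kill shows up in the logs
    # before NCCL starts failing with "Connection closed by remote peer".
    import os
    import threading
    import time

    import psutil  # assumed available in the training environment


    def log_host_memory(interval_s: float = 30.0) -> threading.Thread:
        """Spawn a daemon thread that prints this rank's RSS and node free RAM."""
        rank = os.environ.get("RANK", "?")  # set by torchrun
        proc = psutil.Process()             # the current worker process

        def _loop() -> None:
            while True:
                vm = psutil.virtual_memory()             # whole-node memory stats
                rss_gb = proc.memory_info().rss / 2**30  # this process's resident set
                print(
                    f"[rank {rank}] rss={rss_gb:.1f}GiB "
                    f"host_avail={vm.available / 2**30:.1f}GiB ({vm.percent}% used)",
                    flush=True,
                )
                time.sleep(interval_s)

        t = threading.Thread(target=_loop, daemon=True)
        t.start()
        return t

Called once at the top of each worker's main(), a helper like this would show which node's memory climbs toward the cgroup limit (e.g., from too many DataLoader workers) well before the OOM killer fires and the NCCL socket transport reports peers vanishing.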