@jamesr66a
Created February 23, 2022 20:09
WARNING:torch.distributed.run:
*****************************************
Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
*****************************************
REPLICATE config: False -> MultiUseParameterConfig.TRANSMIT
GraphModule(
  (submod_0): GraphModule()
  (submod_1): GraphModule()
  (submod_2): GraphModule()
  (_loss): MSELoss()
)

def forward(self, x, target):
    submod_0 = self.submod_0(x)
    getitem_2 = submod_0[2]
    getitem = submod_0[0]
    getitem_1 = submod_0[1]
    submod_1 = self.submod_1(getitem, getitem_2)
    getitem_4 = submod_1[1]
    getitem_3 = submod_1[0]
    submod_2 = self.submod_2(getitem_3, getitem_1, getitem_4)
    _loss = self._loss(submod_2, target)
    stage_backward = pippy_IR_stage_backward(stage_output = _loss, output_grads = None, input_values = [submod_2, target]); target = None
    getitem_5 = stage_backward[0]
    getitem_6 = stage_backward[1]; stage_backward = None
    stage_backward_1 = pippy_IR_stage_backward(stage_output = submod_2, output_grads = getitem_5, input_values = [getitem_3, getitem_1, getitem_4]); submod_2 = getitem_5 = getitem_3 = getitem_1 = getitem_4 = None
    getitem_7 = stage_backward_1[0]
    getitem_8 = stage_backward_1[1]
    getitem_9 = stage_backward_1[2]; stage_backward_1 = None
    stage_backward_2 = pippy_IR_stage_backward(stage_output = submod_1, output_grads = [getitem_7, getitem_9], input_values = [getitem, getitem_2]); submod_1 = getitem_7 = getitem_9 = getitem = getitem_2 = None
    getitem_10 = stage_backward_2[0]
    getitem_11 = stage_backward_2[1]; stage_backward_2 = None
    stage_backward_3 = pippy_IR_stage_backward(stage_output = submod_0, output_grads = [getitem_10, getitem_8, getitem_11], input_values = [x]); submod_0 = getitem_10 = getitem_8 = getitem_11 = x = None
    getitem_12 = stage_backward_3[0]; stage_backward_3 = None
    return _loss
/fsx/users/jamesreed/pipeline_for_real/pippy/PipelineDriver.py:394: UserWarning: Running pipeline with 3 stages on world_size of 10. Remaining ranks will be idle.
  warnings.warn(f'Running pipeline with {len(executor_descriptors)} stages on world_size of {self.world_size}. '
INFO:root:[root] Running pipeline
INFO:root:[root] Splitting args with sizes [torch.Size([503, 50]), torch.Size([503, 50])] into 1 chunks.
INFO:root:[root] Arguments have batch dims [0, 0]
INFO:root:[root] Splitting arg 0 with size torch.Size([503, 50]) into 1 chunks along dimension 0
INFO:root:[root] splits [(0, 503)]
INFO:root:[root] Chunk tensor sizes [torch.Size([503, 50])]
INFO:root:[root] Splitting arg 1 with size torch.Size([503, 50]) into 1 chunks along dimension 0
INFO:root:[root] splits [(0, 503)]
INFO:root:[root] Chunk tensor sizes [torch.Size([503, 50])]
INFO:root:[root] Final splits: ['MicrobatchSplitTensor(chunks=[torch.Size([503, 50])]', 'MicrobatchSplitTensor(chunks=[torch.Size([503, 50])]']
INFO:root:[root] Instantiating microbatch interpreter for chunk 0
INFO:root:[root] 1 instantiated
INFO:root:[root][0] Executing forward stages
INFO:root:[0] Issue command to run %x : [#users=2] = placeholder[target=x]
INFO:root:[0] Issue command to run %target : [#users=2] = placeholder[target=target]
INFO:root:[0] Issue command to run %submod_0 : [#users=4] = call_module[target=submod_0](args = (%x,), kwargs = {})
INFO:root:[root][0] Issuing Phase.FORWARD invocation for target submod_0 on rank 0
INFO:root:[0] Instantiating PipeStageExecutor
INFO:root:[0] Issue command to run %getitem_2 : [#users=2] = call_function[target=operator.getitem](args = (%submod_0, 2), kwargs = {})
INFO:root:[0][0] Received invoke call for %submod_0 : [#users=4] = call_module[target=submod_0](args = (%x,), kwargs = {})
INFO:root:[0][0] Invoke call found 0 RRef arguments
INFO:root:[0] Issue command to run %getitem : [#users=2] = call_function[target=operator.getitem](args = (%submod_0, 0), kwargs = {})
INFO:root:[0][0] Invoke instantiated WorkItem WorkItem(%submod_0 : [#users=4] = call_module[target=submod_0](args = (%x,), kwargs = {})) with key 0_submod_0
INFO:root:[0][0] No RRef arguments. Scheduling directly as READY workitem
INFO:root:[0][0] Current ready runlist keys: dict_keys([])
INFO:root:[0] Issue command to run %getitem_1 : [#users=2] = call_function[target=operator.getitem](args = (%submod_0, 1), kwargs = {})
INFO:root:[0] Dequeueing workitem from set of 1
INFO:root:[0][0] Got WorkItem WorkItem(%submod_0 : [#users=4] = call_module[target=submod_0](args = (%x,), kwargs = {}))
INFO:root:[0] Issue command to run %submod_1 : [#users=3] = call_module[target=submod_1](args = (%getitem, %getitem_2), kwargs = {})
INFO:root:[0][0] Running forward module
INFO:root:[root][0] Issuing Phase.FORWARD invocation for target submod_1 on rank 1
INFO:root:[0][0] Populating result of type <class 'tuple'> for 0_submod_0
INFO:root:[0][0] *****INDEXING VALUE %getitem_2 : [#users=2] = call_function[target=operator.getitem](args = (%submod_0, 2), kwargs = {}) input_type <class 'tuple'> index 2 output type <class 'torch.nn.parameter.Parameter'>
INFO:root:[0][0] *****INDEXING VALUE %getitem : [#users=2] = call_function[target=operator.getitem](args = (%submod_0, 0), kwargs = {}) input_type <class 'tuple'> index 0 output type <class 'torch.Tensor'>
INFO:root:[0][0] *****INDEXING VALUE %getitem_1 : [#users=2] = call_function[target=operator.getitem](args = (%submod_0, 1), kwargs = {}) input_type <class 'tuple'> index 1 output type <class 'torch.Tensor'>
INFO:root:[2] Instantiating PipeStageExecutor
INFO:root:[1] Instantiating PipeStageExecutor
INFO:root:[0] Issue command to run %getitem_4 : [#users=2] = call_function[target=operator.getitem](args = (%submod_1, 1), kwargs = {})
INFO:root:[0] Issue command to run %getitem_3 : [#users=2] = call_function[target=operator.getitem](args = (%submod_1, 0), kwargs = {})
INFO:root:[1][0] Received invoke call for %submod_1 : [#users=3] = call_module[target=submod_1](args = (%getitem, %getitem_2), kwargs = {})
INFO:root:[1][0] Invoke call found 2 RRef arguments
INFO:root:[0] Issue command to run %submod_2 : [#users=3] = call_module[target=submod_2](args = (%getitem_3, %getitem_1, %getitem_4), kwargs = {})
INFO:root:[root][0] Issuing Phase.FORWARD invocation for target submod_2 on rank 2
INFO:root:[1][0] Invoke instantiated WorkItem WorkItem(%submod_1 : [#users=3] = call_module[target=submod_1](args = (%getitem, %getitem_2), kwargs = {})) with key 0_submod_1
INFO:root:[1][0] Scheduling WorkItem as WAITING workitem
INFO:root:[1][0] Current waiting runlist keys: dict_keys([])
INFO:root:[1][0] Launching asynchronous data transfer for RRef 0 UserRRef(RRefId = GloballyUniqueId(created_on=0, local_id=10), ForkId = GloballyUniqueId(created_on=0, local_id=15))
INFO:root:[root] Executing loss + backward stages
INFO:root:[0] Issue command to run %_loss : [#users=2] = call_module[target=_loss](args = (%submod_2, %target), kwargs = {})
INFO:root:[root][0] Issuing Phase.LOSS invocation for target _loss on rank 2
INFO:root:[1][0] Launching asynchronous data transfer for RRef 1 UserRRef(RRefId = GloballyUniqueId(created_on=0, local_id=8), ForkId = GloballyUniqueId(created_on=0, local_id=16))
INFO:root:[0] Issue command to run %stage_backward : [#users=2] = call_function[target=pippy.IR.stage_backward](args = (), kwargs = {stage_output: %_loss, output_grads: None, input_values: [%submod_2, %target]})
INFO:root:[root][0] Issuing BW invocation for target _loss on rank 2
INFO:root:[0] Issue command to run %getitem_5 : [#users=1] = call_function[target=operator.getitem](args = (%stage_backward, 0), kwargs = {})
INFO:root:[0] Issue command to run %getitem_6 : [#users=0] = call_function[target=operator.getitem](args = (%stage_backward, 1), kwargs = {})
INFO:root:[2][0] Received invoke call for %_loss : [#users=2] = call_module[target=_loss](args = (%submod_2, %target), kwargs = {})
INFO:root:[2][0] Invoke call found 1 RRef arguments
INFO:root:[0] Issue command to run %stage_backward_1 : [#users=3] = call_function[target=pippy.IR.stage_backward](args = (), kwargs = {stage_output: %submod_2, output_grads: %getitem_5, input_values: [%getitem_3, %getitem_1, %getitem_4]})
INFO:root:[root][0] Issuing BW invocation for target submod_2 on rank 2
INFO:root:[2][0] Invoke instantiated WorkItem WorkItem(%_loss : [#users=2] = call_module[target=_loss](args = (%submod_2, %target), kwargs = {})) with key 0__loss
INFO:root:[2][0] Scheduling WorkItem as WAITING workitem
INFO:root:[2][0] Current waiting runlist keys: dict_keys([])
INFO:root:[2][0] Launching asynchronous data transfer for RRef 0 OwnerRRef(GloballyUniqueId(created_on=0, local_id=30))
INFO:root:[0] Issue command to run %getitem_7 : [#users=1] = call_function[target=operator.getitem](args = (%stage_backward_1, 0), kwargs = {})
INFO:root:[0] Issue command to run %getitem_8 : [#users=1] = call_function[target=operator.getitem](args = (%stage_backward_1, 1), kwargs = {})
INFO:root:[0] Issue command to run %getitem_9 : [#users=1] = call_function[target=operator.getitem](args = (%stage_backward_1, 2), kwargs = {})
INFO:root:[0] Issue command to run %stage_backward_2 : [#users=2] = call_function[target=pippy.IR.stage_backward](args = (), kwargs = {stage_output: %submod_1, output_grads: [%getitem_7, %getitem_9], input_values: [%getitem, %getitem_2]})
INFO:root:[root][0] Issuing BW invocation for target submod_1 on rank 1
INFO:root:[0] Issue command to run %getitem_10 : [#users=1] = call_function[target=operator.getitem](args = (%stage_backward_2, 0), kwargs = {})
INFO:root:[0] Issue command to run %getitem_11 : [#users=1] = call_function[target=operator.getitem](args = (%stage_backward_2, 1), kwargs = {})
INFO:root:[0] Issue command to run %stage_backward_3 : [#users=1] = call_function[target=pippy.IR.stage_backward](args = (), kwargs = {stage_output: %submod_0, output_grads: [%getitem_10, %getitem_8, %getitem_11], input_values: [%x]})
INFO:root:[root][0] Issuing BW invocation for target submod_0 on rank 0
INFO:root:[0] Issue command to run %getitem_12 : [#users=0] = call_function[target=operator.getitem](args = (%stage_backward_3, 0), kwargs = {})
INFO:root:[root] Combining output values from 1 chunks
INFO:root:[0] Issue command to run return _loss
INFO:root:[2][0] Received invoke call for %stage_backward : [#users=2] = call_function[target=pippy.IR.stage_backward](args = (), kwargs = {stage_output: %_loss, output_grads: None, input_values: [%submod_2, %target]})
INFO:root:[2][0] Invoke call found 2 RRef arguments
INFO:root:[2][0] Invoke instantiated WorkItem WorkItem(%stage_backward : [#users=2] = call_function[target=pippy.IR.stage_backward](args = (), kwargs = {stage_output: %_loss, output_grads: None, input_values: [%submod_2, %target]})) with key 0_stage_backward
INFO:root:[2][0] Scheduling WorkItem as WAITING workitem
INFO:root:[2][0] Current waiting runlist keys: dict_keys(['0__loss'])
INFO:root:[2][0] Launching asynchronous data transfer for RRef 0 OwnerRRef(GloballyUniqueId(created_on=0, local_id=34))
INFO:root:[2][0] Launching asynchronous data transfer for RRef 1 OwnerRRef(GloballyUniqueId(created_on=0, local_id=30))
INFO:root:[0][0] Received invoke call for %stage_backward_3 : [#users=1] = call_function[target=pippy.IR.stage_backward](args = (), kwargs = {stage_output: %submod_0, output_grads: [%getitem_10, %getitem_8, %getitem_11], input_values: [%x]})
INFO:root:[0][0] Invoke call found 4 RRef arguments
INFO:root:[2][0] Starting transfer
INFO:root:[0][0] Invoke instantiated WorkItem WorkItem(%stage_backward_3 : [#users=1] = call_function[target=pippy.IR.stage_backward](args = (), kwargs = {stage_output: %submod_0, output_grads: [%getitem_10, %getitem_8, %getitem_11], input_values: [%x]})) with key 0_stage_backward_3
INFO:root:[0][0] Scheduling WorkItem as WAITING workitem
INFO:root:[0][0] Current waiting runlist keys: dict_keys([])
INFO:root:[0][0] Launching asynchronous data transfer for RRef 0 OwnerRRef(GloballyUniqueId(created_on=0, local_id=6))
INFO:root:[2][0] Starting transfer
INFO:root:[0][0] Launching asynchronous data transfer for RRef 1 UserRRef(RRefId = GloballyUniqueId(created_on=0, local_id=73), ForkId = GloballyUniqueId(created_on=0, local_id=80))
INFO:root:[0][0] Starting transfer
INFO:root:[0][0] Launching asynchronous data transfer for RRef 2 UserRRef(RRefId = GloballyUniqueId(created_on=0, local_id=59), ForkId = GloballyUniqueId(created_on=0, local_id=81))
INFO:root:[0][0] Executing async transfer of value OwnerRRef(GloballyUniqueId(created_on=0, local_id=6)) initiated by rank 0 for 0_stage_backward_3
INFO:root:[2][0] Executing async transfer of value OwnerRRef(GloballyUniqueId(created_on=0, local_id=30)) initiated by rank 2 for 0__loss
INFO:root:[0][0] Launching asynchronous data transfer for RRef 3 UserRRef(RRefId = GloballyUniqueId(created_on=0, local_id=76), ForkId = GloballyUniqueId(created_on=0, local_id=82))
INFO:root:[2][0] Executing async transfer of value OwnerRRef(GloballyUniqueId(created_on=0, local_id=34)) initiated by rank 2 for 0_stage_backward
INFO:root:[2][0] Starting transfer
INFO:root:[0][0] Starting transfer
INFO:root:[2][0] Executing async transfer of value OwnerRRef(GloballyUniqueId(created_on=0, local_id=30)) initiated by rank 2 for 0_stage_backward
INFO:root:[0][0] Completing transfer of value OwnerRRef(GloballyUniqueId(created_on=0, local_id=6)) from 0 for runlist item 0_stage_backward_3
INFO:root:[0][0] Still waiting for 3 operands.
INFO:root:[2][0] Executing async transfer of value OwnerRRef(GloballyUniqueId(created_on=0, local_id=59)) initiated by rank 0 for 0_stage_backward_3
INFO:root:[0][0] Starting transfer
INFO:root:[1][0] Received invoke call for %stage_backward_2 : [#users=2] = call_function[target=pippy.IR.stage_backward](args = (), kwargs = {stage_output: %submod_1, output_grads: [%getitem_7, %getitem_9], input_values: [%getitem, %getitem_2]})
INFO:root:[0][0] Starting transfer
INFO:root:[1][0] Invoke call found 5 RRef arguments
INFO:root:[2][0] Received invoke call for %submod_2 : [#users=3] = call_module[target=submod_2](args = (%getitem_3, %getitem_1, %getitem_4), kwargs = {})
INFO:root:[2][0] Invoke call found 3 RRef arguments
INFO:root:[1][0] Invoke instantiated WorkItem WorkItem(%stage_backward_2 : [#users=2] = call_function[target=pippy.IR.stage_backward](args = (), kwargs = {stage_output: %submod_1, output_grads: [%getitem_7, %getitem_9], input_values: [%getitem, %getitem_2]})) with key 0_stage_backward_2
INFO:root:[1][0] Scheduling WorkItem as WAITING workitem
INFO:root:[2][0] Received invoke call for %stage_backward_1 : [#users=3] = call_function[target=pippy.IR.stage_backward](args = (), kwargs = {stage_output: %submod_2, output_grads: %getitem_5, input_values: [%getitem_3, %getitem_1, %getitem_4]})
INFO:root:[2][0] Invoke instantiated WorkItem WorkItem(%submod_2 : [#users=3] = call_module[target=submod_2](args = (%getitem_3, %getitem_1, %getitem_4), kwargs = {})) with key 0_submod_2
INFO:root:[2][0] Invoke call found 5 RRef arguments
INFO:root:[1][0] Starting transfer
INFO:root:[2][0] Scheduling WorkItem as WAITING workitem
INFO:root:[2][0] Invoke instantiated WorkItem WorkItem(%stage_backward_1 : [#users=3] = call_function[target=pippy.IR.stage_backward](args = (), kwargs = {stage_output: %submod_2, output_grads: %getitem_5, input_values: [%getitem_3, %getitem_1, %getitem_4]})) with key 0_stage_backward_1
INFO:root:[2][0] Current waiting runlist keys: dict_keys(['0__loss', '0_stage_backward'])
INFO:root:[2][0] Scheduling WorkItem as WAITING workitem
INFO:root:[2][0] Launching asynchronous data transfer for RRef 0 UserRRef(RRefId = GloballyUniqueId(created_on=0, local_id=23), ForkId = GloballyUniqueId(created_on=0, local_id=27))
INFO:root:[1][0] Current waiting runlist keys: dict_keys(['0_submod_1'])
INFO:root:[1][0] Executing async transfer of value OwnerRRef(GloballyUniqueId(created_on=0, local_id=73)) initiated by rank 0 for 0_stage_backward_3
INFO:root:[1][0] Executing async transfer of value OwnerRRef(GloballyUniqueId(created_on=0, local_id=76)) initiated by rank 0 for 0_stage_backward_3
INFO:root:[0][0] Executing async transfer of value OwnerRRef(GloballyUniqueId(created_on=0, local_id=10)) initiated by rank 1 for 0_submod_1
INFO:root:[1][0] Starting transfer
INFO:root:[2][0] Current waiting runlist keys: dict_keys(['0__loss', '0_stage_backward', '0_submod_2'])
INFO:root:[2][0] Launching asynchronous data transfer for RRef 0 OwnerRRef(GloballyUniqueId(created_on=0, local_id=30))
INFO:root:[1][0] Launching asynchronous data transfer for RRef 0 OwnerRRef(GloballyUniqueId(created_on=0, local_id=17))
INFO:root:[2][0] Launching asynchronous data transfer for RRef 1 OwnerRRef(GloballyUniqueId(created_on=0, local_id=42))
INFO:root:[0][0] Executing async transfer of value OwnerRRef(GloballyUniqueId(created_on=0, local_id=8)) initiated by rank 1 for 0_submod_1
INFO:root:[1][0] Completing transfer of value UserRRef(RRefId = GloballyUniqueId(created_on=0, local_id=10), ForkId = GloballyUniqueId(created_on=1, local_id=2)) from 0 for runlist item 0_submod_1
INFO:root:[1][0] Still waiting for 1 operands.
INFO:root:[2][0] Launching asynchronous data transfer for RRef 1 UserRRef(RRefId = GloballyUniqueId(created_on=0, local_id=12), ForkId = GloballyUniqueId(created_on=0, local_id=28))
INFO:root:[1][0] Launching asynchronous data transfer for RRef 1 UserRRef(RRefId = GloballyUniqueId(created_on=0, local_id=56), ForkId = GloballyUniqueId(created_on=0, local_id=66))
INFO:root:[1][0] Completing transfer of value UserRRef(RRefId = GloballyUniqueId(created_on=0, local_id=8), ForkId = GloballyUniqueId(created_on=1, local_id=5)) from 0 for runlist item 0_submod_1
INFO:root:[2][0] Starting transfer
INFO:root:[1][0] Starting transfer
INFO:root:[1][0] All operands ready
INFO:root:[2][0] Starting transfer
INFO:root:[1][0] Launching asynchronous data transfer for RRef 2 UserRRef(RRefId = GloballyUniqueId(created_on=0, local_id=62), ForkId = GloballyUniqueId(created_on=0, local_id=67))
INFO:root:[1] Dequeueing workitem from set of 1
INFO:root:[1][0] Got WorkItem WorkItem(%submod_1 : [#users=3] = call_module[target=submod_1](args = (%getitem, %getitem_2), kwargs = {}))
INFO:root:[2][0] Launching asynchronous data transfer for RRef 2 UserRRef(RRefId = GloballyUniqueId(created_on=0, local_id=23), ForkId = GloballyUniqueId(created_on=0, local_id=50))
INFO:root:[1][0] Starting transfer
INFO:root:[2][0] Starting transfer
INFO:root:[2][0] Launching asynchronous data transfer for RRef 2 UserRRef(RRefId = GloballyUniqueId(created_on=0, local_id=20), ForkId = GloballyUniqueId(created_on=0, local_id=29))
INFO:root:[1][0] Executing async transfer of value OwnerRRef(GloballyUniqueId(created_on=0, local_id=17)) initiated by rank 1 for 0_stage_backward_2
INFO:root:[1][0] Running forward module
INFO:root:[1][0] Executing async transfer of value OwnerRRef(GloballyUniqueId(created_on=0, local_id=23)) initiated by rank 2 for 0_submod_2
INFO:root:[1][0] Launching asynchronous data transfer for RRef 3 UserRRef(RRefId = GloballyUniqueId(created_on=0, local_id=10), ForkId = GloballyUniqueId(created_on=0, local_id=68))
INFO:root:[2][0] Executing async transfer of value OwnerRRef(GloballyUniqueId(created_on=0, local_id=30)) initiated by rank 2 for 0_stage_backward_1
INFO:root:[1][0] Launching asynchronous data transfer for RRef 4 UserRRef(RRefId = GloballyUniqueId(created_on=0, local_id=8), ForkId = GloballyUniqueId(created_on=0, local_id=69))
INFO:root:[2][0] Starting transfer
INFO:root:[1][0] Starting transfer
INFO:root:[2][0] Launching asynchronous data transfer for RRef 3 UserRRef(RRefId = GloballyUniqueId(created_on=0, local_id=12), ForkId = GloballyUniqueId(created_on=0, local_id=51))
INFO:root:[1][0] Starting transfer
INFO:root:[2][0] Executing async transfer of value OwnerRRef(GloballyUniqueId(created_on=0, local_id=56)) initiated by rank 1 for 0_stage_backward_2
INFO:root:[0][0] Executing async transfer of value OwnerRRef(GloballyUniqueId(created_on=0, local_id=12)) initiated by rank 2 for 0_submod_2
INFO:root:[1][0] Populating result of type <class 'tuple'> for 0_submod_1
INFO:root:[2][0] Starting transfer
INFO:root:[1][0] *****INDEXING VALUE %getitem_4 : [#users=2] = call_function[target=operator.getitem](args = (%submod_1, 1), kwargs = {}) input_type <class 'tuple'> index 1 output type <class 'torch.nn.parameter.Parameter'>
INFO:root:[2][0] Executing async transfer of value OwnerRRef(GloballyUniqueId(created_on=0, local_id=42)) initiated by rank 2 for 0_stage_backward_1
INFO:root:[0][0] Executing async transfer of value OwnerRRef(GloballyUniqueId(created_on=0, local_id=10)) initiated by rank 1 for 0_stage_backward_2
INFO:root:[2][0] Launching asynchronous data transfer for RRef 4 UserRRef(RRefId = GloballyUniqueId(created_on=0, local_id=20), ForkId = GloballyUniqueId(created_on=0, local_id=52))
INFO:root:[1][0] *****INDEXING VALUE %getitem_3 : [#users=2] = call_function[target=operator.getitem](args = (%submod_1, 0), kwargs = {}) input_type <class 'tuple'> index 0 output type <class 'torch.Tensor'>
INFO:root:[2][0] Starting transfer
INFO:root:[2][0] Executing async transfer of value OwnerRRef(GloballyUniqueId(created_on=0, local_id=62)) initiated by rank 1 for 0_stage_backward_2
INFO:root:[1][0] Starting transfer
INFO:root:[2][0] Completing transfer of value UserRRef(RRefId = GloballyUniqueId(created_on=0, local_id=12), ForkId = GloballyUniqueId(created_on=2, local_id=23)) from 0 for runlist item 0_submod_2
INFO:root:[2][0] Still waiting for 2 operands.
INFO:root:[2][0] Starting transfer
INFO:root:[1][0] Completing transfer of value UserRRef(RRefId = GloballyUniqueId(created_on=0, local_id=10), ForkId = GloballyUniqueId(created_on=1, local_id=21)) from 0 for runlist item 0_stage_backward_2
INFO:root:[2][0] Starting transfer
INFO:root:[0][0] Executing async transfer of value OwnerRRef(GloballyUniqueId(created_on=0, local_id=12)) initiated by rank 2 for 0_stage_backward_1
INFO:root:[1][0] Executing async transfer of value OwnerRRef(GloballyUniqueId(created_on=0, local_id=23)) initiated by rank 2 for 0_stage_backward_1
INFO:root:[1][0] Completing transfer of value OwnerRRef(GloballyUniqueId(created_on=0, local_id=17)) from 1 for runlist item 0_stage_backward_2
INFO:root:[0][0] Executing async transfer of value OwnerRRef(GloballyUniqueId(created_on=0, local_id=8)) initiated by rank 1 for 0_stage_backward_2
INFO:root:[1][0] Executing async transfer of value OwnerRRef(GloballyUniqueId(created_on=0, local_id=20)) initiated by rank 2 for 0_submod_2
INFO:root:[1][0] Still waiting for 4 operands.
INFO:root:[2][0] Completing transfer of value UserRRef(RRefId = GloballyUniqueId(created_on=0, local_id=23), ForkId = GloballyUniqueId(created_on=2, local_id=14)) from 1 for runlist item 0_submod_2
INFO:root:[2][0] Still waiting for 1 operands.
INFO:root:[1][0] Still waiting for 3 operands.
INFO:root:[2][0] Completing transfer of value UserRRef(RRefId = GloballyUniqueId(created_on=0, local_id=12), ForkId = GloballyUniqueId(created_on=2, local_id=36)) from 0 for runlist item 0_stage_backward_1
INFO:root:[1][0] Executing async transfer of value OwnerRRef(GloballyUniqueId(created_on=0, local_id=20)) initiated by rank 2 for 0_stage_backward_1
INFO:root:[2][0] Completing transfer of value UserRRef(RRefId = GloballyUniqueId(created_on=0, local_id=23), ForkId = GloballyUniqueId(created_on=2, local_id=28)) from 1 for runlist item 0_stage_backward_1
INFO:root:[2][0] Still waiting for 4 operands.
INFO:root:[1][0] Completing transfer of value UserRRef(RRefId = GloballyUniqueId(created_on=0, local_id=8), ForkId = GloballyUniqueId(created_on=1, local_id=25)) from 0 for runlist item 0_stage_backward_2
INFO:root:[2][0] Still waiting for 3 operands.
INFO:root:[2][0] Completing transfer of value UserRRef(RRefId = GloballyUniqueId(created_on=0, local_id=20), ForkId = GloballyUniqueId(created_on=2, local_id=32)) from 1 for runlist item 0_submod_2
INFO:root:[1][0] Still waiting for 2 operands.
INFO:root:[2][0] All operands ready
INFO:root:[2] Dequeueing workitem from set of 1
INFO:root:[2][0] Got WorkItem WorkItem(%submod_2 : [#users=3] = call_module[target=submod_2](args = (%getitem_3, %getitem_1, %getitem_4), kwargs = {}))
INFO:root:[2][0] Running forward module
INFO:root:[2][0] Completing transfer of value UserRRef(RRefId = GloballyUniqueId(created_on=0, local_id=20), ForkId = GloballyUniqueId(created_on=2, local_id=40)) from 1 for runlist item 0_stage_backward_1
INFO:root:[2][0] Still waiting for 2 operands.
INFO:root:[2][0] Populating result of type <class 'torch.Tensor'> for 0_submod_2