(partial) EasyBuild log for failed build of /cephyr/NOBACKUP/priv/c3-staff/eb-tmp/eb-owkd570g/files_pr21438/d/DeepSpeed/DeepSpeed-0.14.5-foss-2023a-CUDA-12.1.1.eb (PR(s) #21438) (easyblock PR(s) #3450)
6.60s call unit/runtime/zero/test_zero.py::TestZeroAdamOptimizerStepCount::test[3]
6.60s call unit/checkpoint/test_universal_checkpoint.py::TestZeROUniversalCheckpointDP::test_dp_world_size_2to2[False-True-3-dtype1]
6.60s call unit/checkpoint/test_zero_optimizer.py::TestZeROElasticCheckpoint::test_elastic_checkpoint_change_dp[False-False-True]
6.59s call unit/runtime/zero/test_zero_leaf_module.py::TestSetZ3LeafModule::test_choose_module_by_counter
6.59s call unit/runtime/half_precision/test_bf16.py::TestAdamBF16ZeroOneCycleCompatibility::test
6.59s call unit/checkpoint/test_universal_checkpoint.py::TestZeROUniversalCheckpointDP::test_dp_world_size_2to2[False-False-3-dtype1]
6.58s call unit/ops/adagrad/test_cpu_adagrad.py::TestCPUAdagrad::test_cpu_adagrad_opt_sparse_embedding
6.58s call unit/runtime/zero/test_zero.py::TestEmptyParameterGroup::test_empty_param_groups[dtype1-False-False]
6.58s call unit/checkpoint/test_zero_optimizer.py::TestZeROCheckpointFrozenWeights::test_save_exclude_custom_frozen_weights[1]
6.58s call unit/checkpoint/test_zero_optimizer.py::TestZeROCheckpointFrozenWeights::test_save_exclude_custom_frozen_weights[2]
6.58s call unit/runtime/zero/test_zero.py::TestEmptyParameterGroup::test_empty_param_groups[dtype0-True-True]
6.58s call unit/checkpoint/test_zero_optimizer.py::TestZeROCheckpoint::test_load_module_only[0]
6.57s call unit/runtime/test_runtime_utils.py::TestClipGradNorm::test_clipped_val
6.57s call unit/checkpoint/test_universal_checkpoint.py::TestZeROUniversalCheckpointDP::test_dp_world_size_2to2[False-True-3-dtype2]
6.57s call unit/runtime/half_precision/test_fp16.py::TestZeroSupportedClientOptimizer::test[Adam-3]
6.57s call unit/runtime/test_lr_schedulers.py::TestLrSchedule::test_lr_warmup_schedule[log-19]
6.56s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test_model_class[EltwiseMultiplicationTestNetwork_NamedTuple]
6.56s call unit/runtime/test_lr_schedulers.py::TestWarmupCosineLR::test_lr[200-20-0.1-0.2]
6.56s call unit/runtime/test_lr_schedulers.py::TestLrSchedule::test_lr_warmup_schedule[log-10]
6.56s call unit/checkpoint/test_universal_checkpoint.py::TestZeROUniversalCheckpointDP::test_dp_world_size_4to2[True-False-1-dtype0]
6.56s call unit/checkpoint/test_universal_checkpoint.py::TestZeROUniversalCheckpointDP::test_dp_world_size_2to2[True-True-1-dtype0]
6.56s call unit/runtime/activation_checkpointing/test_activation_checkpointing.py::TestActivationCheckpoint::test_ckpt_inputs2_outputs3[mask0]
6.55s call unit/runtime/sparse_tensor/test_averaging_sparse_gradients.py::TestSparseAdam::test
6.55s call unit/checkpoint/test_universal_checkpoint.py::TestZeROUniversalCheckpointDP::test_dp_world_size_4to2[False-True-3-dtype1]
6.55s call unit/comm/test_dist.py::TestInit::test
6.54s call unit/runtime/test_lr_schedulers.py::TestLrRange::test[1e-05-1e-05-1-False]
6.54s call unit/runtime/zero/test_zero.py::TestZeroToFP32::test_1_param_group[False-2]
6.54s call unit/runtime/test_lr_schedulers.py::TestLrSchedule::test_lr_warmup_decay_schedule[log-15]
6.54s call unit/checkpoint/test_universal_checkpoint.py::TestZeROUniversalCheckpointDP::test_dp_world_size_2to2[True-False-3-dtype0]
6.54s call unit/runtime/test_lr_schedulers.py::TestLrSchedule::test_lr_warmup_schedule[linear-15]
6.53s call unit/checkpoint/test_universal_checkpoint.py::TestZeROUniversalCheckpointDP::test_dp_world_size_4to2[True-False-1-dtype2]
6.53s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test_offload_optimizer[False]
6.53s setup unit/checkpoint/test_universal_checkpoint.py::TestZeROUniversalCheckpointDP::test_dp_world_size_2to4[True-True-1-dtype1]
6.52s call unit/checkpoint/test_universal_checkpoint.py::TestZeROUniversalCheckpointDP::test_dp_world_size_4to2[True-False-3-dtype2]
6.51s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test_prefetching[True]
6.51s call unit/runtime/half_precision/test_fp16.py::TestZeroSupportedClientOptimizer::test[Adam-2]
6.51s call unit/checkpoint/test_universal_checkpoint.py::TestZeROUniversalCheckpointDP::test_dp_world_size_2to2[True-True-1-dtype2]
6.51s call unit/runtime/test_lr_schedulers.py::TestLrSchedule::test_lr_warmup_schedule[linear-33]
6.50s call unit/runtime/activation_checkpointing/test_activation_checkpointing.py::TestCheckpointNonTensor::test_ckpt_non_tensor_output[True]
6.49s call unit/runtime/zero/test_zero.py::TestEmptyParameterGroup::test_empty_param_groups[dtype0-True-False]
6.48s setup unit/checkpoint/test_universal_checkpoint.py::TestZeROUniversalCheckpointDP::test_dp_world_size_2to4[True-False-1-dtype0]
6.48s call unit/runtime/half_precision/test_fp16.py::TestAdamFP16ZeroOneCycleCompatibility::test[False-3]
6.48s call unit/runtime/zero/test_zero.py::TestZero3RepeatForwardLoop::test[True]
6.48s call unit/checkpoint/test_universal_checkpoint.py::TestZeROUniversalCheckpointDP::test_dp_world_size_2to2[False-False-3-dtype2]
6.48s call unit/checkpoint/test_universal_checkpoint.py::TestZeROUniversalCheckpointDP::test_dp_world_size_2to2[False-True-3-dtype0]
6.47s call unit/checkpoint/test_zero_optimizer.py::TestZeROSaveLoadEdgeCase::test_immediate_save_load[3]
6.47s call unit/runtime/activation_checkpointing/test_activation_checkpointing_non_reentrant.py::TestCheckpointNonTensorWithoutGrad::test_ckpt_non_tensor_output[non_tensor4]
6.47s call unit/runtime/zero/test_zero_context_return.py::TestReturnParam::test_ext_param_return
6.46s call unit/runtime/test_ds_initialize.py::TestClientOptimizer::test[None]
6.46s call unit/runtime/test_multi_output_model.py::TestTwoOutputModel::test
6.46s call unit/checkpoint/test_sparse.py::TestSparseCheckpoint::test_non_strict_load_sparse[True-True-True-True]
6.46s call unit/runtime/activation_checkpointing/test_activation_checkpointing_non_reentrant.py::TestActivationCheckpointWithoutGrad::test_ckpt_inputs1_outputs1[mask0]
6.46s call unit/runtime/comm/test_coalesced_collectives.py::TestAllToAllQuantReduceFallback::test_1d_tensor
6.45s call unit/runtime/test_autocast.py::TestAutoCastEnable::test_autocast_linear[True-True]
6.45s call unit/runtime/zero/test_zero.py::TestZero3DictFwd::test[tuple]
6.45s call unit/checkpoint/test_universal_checkpoint.py::TestZeROUniversalCheckpointDP::test_dp_world_size_4to2[False-False-3-dtype0]
6.45s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test_model_class[EltwiseMultiplicationTestNetwork_namedtuple]
6.45s call unit/moe/test_moe.py::TestTopk::test
6.44s call unit/runtime/comm/test_coalesced_collectives.py::TestReduceScatterCoalescedTensorSmallerThanWorldSize::test
6.44s call unit/runtime/test_lr_schedulers.py::TestOneCycle::test_mom[0.08-0.09-0.001-100]
6.44s call unit/checkpoint/test_universal_checkpoint.py::TestZeROUniversalCheckpointDP::test_dp_world_size_4to2[False-False-3-dtype2]
6.44s call unit/runtime/activation_checkpointing/test_activation_checkpointing.py::TestActivationCheckpoint::test_ckpt_inputs2_outputs2[mask1]
6.44s call unit/runtime/activation_checkpointing/test_activation_checkpointing.py::TestCheckpointNonTensor::test_ckpt_non_tensor_output[non_tensor3]
6.44s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test_model_class[EltwiseMultiplicationTestNetwork_List]
6.43s call unit/checkpoint/test_lr_scheduler.py::TestLRSchedulerCheckpoint::test_checkpoint_no_lr_scheduler[0-False]
6.43s call unit/checkpoint/test_zero_optimizer.py::TestZeROCheckpointFrozenWeights::test_save_exclude_frozen_weights[1]
6.43s call unit/runtime/zero/test_zero.py::TestZeroOffloadOptim::test[True]
6.43s call unit/runtime/zero/test_zero_context_return.py::TestReturnParam::test_stage_3_output_type[None]
6.43s call unit/runtime/half_precision/test_fp16.py::TestZeroEmptyPartition::test[False-3]
6.42s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test_reduce_scatter[True]
6.42s call unit/runtime/zero/test_zero.py::TestZero3DictFwd::test[dict]
6.41s call unit/runtime/test_lr_schedulers.py::TestLrSchedule::test_lr_warmup_decay_schedule[linear-19]
6.41s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test_contiguous_gradients[False]
6.41s call unit/runtime/half_precision/test_fp16.py::TestZeroAllowUntestedOptimizer::test[True-1]
6.41s call unit/runtime/test_lr_schedulers.py::TestLrRange::test[0.0001-1e-05-1-True]
6.41s call unit/runtime/activation_checkpointing/test_activation_checkpointing_non_reentrant.py::TestCheckpointNonTensorWithoutGrad::test_ckpt_non_tensor_input[None]
6.41s call unit/checkpoint/test_zero_optimizer.py::TestZeROSaveLoadEdgeCase::test_save_before_accum_grad_is_done[3]
6.40s call unit/runtime/test_lr_schedulers.py::TestOneCycle::test_lr[1e-05-0.1-0-10-0]
6.39s call unit/checkpoint/test_other_optimizer.py::TestOtherOptimizerCheckpoint::test_checkpoint_fused_optimizer
6.39s call unit/runtime/test_ds_initialize.py::TestClientLrSchedulerInit::test_diff_lrscheler_and_callable[None]
6.38s call unit/runtime/zero/test_zero.py::TestIncorectAllgatherBucketSize::test[1000]
6.38s call unit/runtime/test_lr_schedulers.py::TestOneCycle::test_lr[1e-05-0.01-0.001-10-101]
6.37s call unit/runtime/test_lr_schedulers.py::TestGetLrBeforeTrain::test[WarmupDecayLR-params1]
6.37s call unit/runtime/half_precision/test_fp16.py::TestZeroSupportedClientOptimizer::test[FusedAdam-1]
6.37s call unit/runtime/zero/test_zero.py::TestEmptyParameterGroup::test_empty_param_groups[dtype0-False-True]
6.36s call unit/runtime/zero/test_zero.py::TestZero3RepeatForwardLoop::test[False]
6.36s call unit/checkpoint/test_zero_optimizer.py::TestZeROSaveLoadEdgeCase::test_save_before_accum_grad_is_done[2]
6.36s call unit/ops/adagrad/test_cpu_adagrad.py::TestCPUAdagrad::test_cpu_adagrad_opt
6.35s call unit/runtime/activation_checkpointing/test_activation_checkpointing_non_reentrant.py::TestActivationCheckpointWithGrad::test_ckpt_inputs2_outputs2[mask0]
6.35s call unit/runtime/test_autocast.py::TestAutoCastEnable::test_autocast_linear[False-True]
6.35s call unit/checkpoint/test_zero_optimizer.py::TestZeROElasticCheckpoint::test_elastic_checkpoint_change_dp[False-False-False]
6.35s call unit/runtime/activation_checkpointing/test_activation_checkpointing_non_reentrant.py::TestActivationCheckpointWithoutGrad::test_ckpt_arg_none[mask1]
6.34s call unit/runtime/half_precision/test_fp16.py::TestZeroSupportedClientOptimizer::test[FusedAdam-3]
6.34s call unit/checkpoint/test_zero_optimizer.py::TestZeROSaveLoadEdgeCase::test_immediate_save_load[1]
6.34s call unit/elasticity/test_elastic.py::TestNonElasticBatchParamsWithOverride::test
6.34s call unit/checkpoint/test_zero_optimizer.py::TestZeROElasticCheckpoint::test_elastic_checkpoint_change_dp[True-True-False]
6.34s call unit/runtime/test_lr_schedulers.py::TestLrSchedule::test_lr_warmup_decay_schedule[linear-15]
6.34s call unit/runtime/test_lr_schedulers.py::TestGetLrBeforeTrain::test[LRRangeTest-params3]
6.34s call unit/runtime/half_precision/test_fp16.py::TestZeroAllowUntestedOptimizer::test[False-1]
6.33s call unit/runtime/test_lr_schedulers.py::TestOneCycle::test_mom[0.08-0.09-0-211]
6.33s setup unit/checkpoint/test_universal_checkpoint.py::TestZeROUniversalCheckpointDP::test_dp_world_size_2to2[True-True-3-dtype1]
6.33s call unit/runtime/test_ds_config_dict.py::TestBatchConfig::test[2-32-8-2-True]
6.33s call unit/checkpoint/test_sparse.py::TestSparseCheckpoint::test_non_strict_load_sparse[True-True-True-False]
6.33s setup unit/checkpoint/test_universal_checkpoint.py::TestZeROUniversalCheckpointDP::test_dp_world_size_2to4[False-True-1-dtype0]
6.32s call unit/runtime/activation_checkpointing/test_activation_checkpointing_non_reentrant.py::TestActivationCheckpointWithGrad::test_ckpt_inputs2_outputs1[mask1]
6.32s call unit/linear/test_ctx.py::TestInitTransformers::test_config_init
6.32s call unit/runtime/comm/test_coalesced_collectives.py::TestReduceScatterCoalesced::test_two_inputs
6.32s setup unit/checkpoint/test_universal_checkpoint.py::TestZeROUniversalCheckpointDP::test_dp_world_size_2to4[False-False-1-dtype2]
6.32s call unit/checkpoint/test_zero_optimizer.py::TestZeROSaveLoadEdgeCase::test_immediate_save_load[2]
6.32s call unit/checkpoint/test_zero_optimizer.py::TestZeROElasticCheckpoint::test_elastic_checkpoint_change_dp[True-False-True]
6.32s call unit/runtime/test_lr_schedulers.py::TestSchedulerOptimizerParity::test[WarmupDecayLR-params1]
6.31s call unit/checkpoint/test_shared_weights.py::TestCheckpointSharedWeights::test_checkpoint_shared_weights
6.31s call unit/runtime/half_precision/test_fp16.py::TestZeroSupportedClientOptimizer::test[FusedAdam-2]
6.31s call unit/runtime/activation_checkpointing/test_activation_checkpointing_non_reentrant.py::TestCheckpointNonTensorWithoutGrad::test_ckpt_non_tensor_output[True]
6.31s call unit/runtime/test_lr_schedulers.py::TestLrRange::test[0.0001-0.001-10-True]
6.30s call unit/runtime/test_lr_schedulers.py::TestOneCycle::test_lr[0.001-0.1-0-21-21]
6.30s call unit/runtime/zero/test_zero_context_return.py::TestReturnParam::test_stage_3_output_type[dict]
6.29s call unit/checkpoint/test_zero_optimizer.py::TestZeROElasticCheckpoint::test_elastic_checkpoint_change_dp[False-True-False]
6.28s setup unit/checkpoint/test_universal_checkpoint.py::TestZeROUniversalCheckpointDP::test_dp_world_size_2to4[True-False-3-dtype2]
6.28s call unit/runtime/test_lr_schedulers.py::TestSchedulerOptimizerParity::test[WarmupLR-params0]
6.28s call unit/utils/test_init_on_device.py::TestOnDevice::test_on_device[cuda:0]
6.28s call unit/runtime/activation_checkpointing/test_activation_checkpointing_non_reentrant.py::TestActivationCheckpointWithoutGrad::test_ckpt_inputs1_outputs1[mask1]
6.28s call unit/runtime/half_precision/test_dynamic_loss_scale.py::TestUnfused::test_some_overflow
6.27s call unit/comm/test_dist.py::TestDistInitWithModel::test_already_init[False]
6.27s call unit/runtime/zero/test_zero.py::TestEmptyParameterGroup::test_empty_param_groups[dtype2-True-True]
6.27s call unit/runtime/test_pld.py::TestPLDModel::test_pld_model[1.0]
6.27s call unit/runtime/zero/test_zero_context_return.py::TestReturnParam::test_stage_3_output_type[tensor]
6.27s call unit/runtime/test_pld.py::TestPLDModel::test_pld_model[0.1]
6.27s call unit/checkpoint/test_zero_optimizer.py::TestSaveTensorClone::test_save_tensor_clone[True-1]
6.27s call unit/checkpoint/test_sparse.py::TestSparseCheckpoint::test_non_strict_load_sparse[True-False-True-False]
6.27s call unit/runtime/test_data_efficiency.py::TestLegacyCurriculumScheduler::test_fixed_discrete
6.27s call unit/runtime/test_runtime_utils.py::TestClipGradNorm::test_gather
6.27s call unit/runtime/activation_checkpointing/test_activation_checkpointing.py::TestCheckpointNonTensor::test_ckpt_non_tensor_output[None]
6.26s setup unit/checkpoint/test_universal_checkpoint.py::TestZeROUniversalCheckpointDP::test_dp_world_size_2to2[True-False-3-dtype2]
6.26s call unit/runtime/activation_checkpointing/test_activation_checkpointing_non_reentrant.py::TestCheckpointNonTensorOutputOrderingWithGrad::test_ckpt_non_tensor_output_ordering[non_tensor_output3]
6.26s call unit/runtime/activation_checkpointing/test_activation_checkpointing_non_reentrant.py::TestCheckpointNonTensorWithGrad::test_ckpt_non_tensor_input[2]
6.26s call unit/runtime/test_lr_schedulers.py::TestOneCycle::test_mom[0.08-0.09-0-210]
6.26s call unit/runtime/test_lr_schedulers.py::TestLrRange::test[0.01-0.01-19-True]
6.26s call unit/comm/test_dist.py::TestDistInit::test_already_init[None]
6.26s call unit/runtime/test_lr_schedulers.py::TestGetLrBeforeTrain::test[OneCycle-params2]
6.26s call unit/runtime/zero/test_zero_nesting_init.py::TestNestingInit::test_nesting_init
6.26s call unit/runtime/test_lr_schedulers.py::TestLrSchedule::test_lr_warmup_decay_schedule[log-10]
6.25s call unit/runtime/activation_checkpointing/test_activation_checkpointing_non_reentrant.py::TestCheckpointNonTensorWithoutGrad::test_ckpt_non_tensor_input[2]
6.25s call unit/runtime/comm/test_coalesced_collectives.py::TestReduceScatterCoalesced::test_single_input
6.25s call unit/runtime/test_lr_schedulers.py::TestLrSchedule::test_lr_warmup_schedule[linear-19]
6.25s setup unit/checkpoint/test_universal_checkpoint.py::TestZeROUniversalCheckpointDP::test_dp_world_size_2to4[False-True-3-dtype1]
6.25s call unit/checkpoint/test_zero_optimizer.py::TestZeRONonDistributed::test_chmod_exception_handling[1]
6.25s setup unit/checkpoint/test_universal_checkpoint.py::TestZeROUniversalCheckpointDP::test_dp_world_size_2to2[False-True-1-dtype1]
6.24s setup unit/checkpoint/test_universal_checkpoint.py::TestZeROUniversalCheckpointDP::test_dp_world_size_2to2[True-True-1-dtype1]
6.24s call unit/runtime/test_ds_initialize.py::TestClientLrScheduler::test[None-_LRScheduler]
6.24s call unit/runtime/test_ds_initialize.py::TestClientLrSchedulerInit::test_same_lrscheler_and_callable[Callable]
6.24s call unit/runtime/zero/test_zero.py::TestEmptyParameterGroup::test_empty_param_groups[dtype1-True-False]
6.24s call unit/runtime/test_lr_schedulers.py::TestLrRange::test[0.01-0.01-19-False]
6.23s setup unit/checkpoint/test_universal_checkpoint.py::TestZeROUniversalCheckpointDP::test_dp_world_size_2to4[False-False-1-dtype1]
6.23s call unit/runtime/test_runtime_utils.py::TestCheckOverflow::test[True]
6.23s call unit/runtime/zero/test_zero.py::TestZero3ParamPartitioningBase::test_offload_optimizer[True]
6.23s call unit/runtime/zero/test_zero.py::TestIncorectAllgatherBucketSize::test[1001]
6.23s call unit/runtime/activation_checkpointing/test_activation_checkpointing.py::TestActivationCheckpoint::test_ckpt_arg_none[mask1]
6.23s call unit/checkpoint/test_latest_checkpoint.py::TestLatestCheckpoint::test_existing_latest
6.23s setup unit/checkpoint/test_universal_checkpoint.py::TestZeROUniversalCheckpointDP::test_dp_world_size_2to4[False-False-3-dtype2]
6.22s call unit/checkpoint/test_tag_validation.py::TestCheckpointValidationTag::test_checkpoint_unique_tag[FAIL]
6.22s call unit/runtime/test_data.py::TestDataLoaderDropLast::test[1-False]
6.22s call unit/ops/adagrad/test_cpu_adagrad.py::TestCPUAdagradGPUError::test_cpu_adagrad_gpu_error
6.22s call unit/runtime/zero/test_zero_context.py::TestSerialContext::test_scattered_init_dist
6.22s call unit/runtime/test_ds_initialize.py::TestClientLrScheduler::test[Optimizer-_LRScheduler]
6.21s setup unit/checkpoint/test_universal_checkpoint.py::TestZeROUniversalCheckpointDP::test_dp_world_size_2to2[False-False-1-dtype2]
6.21s call unit/runtime/zero/test_zero.py::TestEmptyParameterGroup::test_empty_param_groups[dtype2-False-False]
6.21s call unit/runtime/compile/test_compile_zero.py::TestZeRO::test_compile_zero[nvme-1-dtype0]
6.21s call unit/checkpoint/test_tag_validation.py::TestCheckpointValidationTag::test_checkpoint_unique_tag[IGNORE]
6.20s call unit/runtime/test_lr_schedulers.py::TestLrRange::test[0.001-0.001-10-False]
6.20s call unit/comm/test_dist.py::TestDistInitWithModel::test_no_init[True]
6.20s call unit/runtime/test_lr_schedulers.py::TestSchedulerOptimizerParity::test[OneCycle-params2]
6.20s call unit/checkpoint/test_sparse.py::TestSparseCheckpoint::test_non_strict_load_sparse[True-False-False-False]
6.20s call unit/runtime/test_ds_initialize.py::TestNoOptim::test[0]
6.19s setup unit/checkpoint/test_universal_checkpoint.py::TestZeROUniversalCheckpointDP::test_dp_world_size_2to2[True-False-1-dtype0]
6.19s call unit/runtime/zero/test_zero.py::TestEmptyParameterGroup::test_empty_param_groups[dtype1-False-True]
6.19s call unit/runtime/activation_checkpointing/test_activation_checkpointing.py::TestActivationCheckpoint::test_ckpt_inputs1_outputs1[mask1]
6.19s setup unit/checkpoint/test_universal_checkpoint.py::TestZeROUniversalCheckpointDP::test_dp_world_size_2to2[True-True-1-dtype0]
6.19s call unit/runtime/activation_checkpointing/test_activation_checkpointing_non_reentrant.py::TestCheckpointNonTensorOutputOrderingWithGrad::test_ckpt_non_tensor_output_ordering[non_tensor_output2]
6.19s call unit/ops/accelerators/test_accelerator_backward.py::TestCUDABackward::test_backward[64-1600-128-2-4-False-True-0.2]
6.18s setup unit/checkpoint/test_universal_checkpoint.py::TestZeROUniversalCheckpointDP::test_dp_world_size_2to4[False-True-1-dtype2]
6.17s call unit/runtime/test_ds_config_dict.py::TestBatchConfig::test[2-32-18-1-False]
6.17s call unit/runtime/test_ds_config_dict.py::TestArgs::test_no_args
6.17s call unit/runtime/activation_checkpointing/test_activation_checkpointing.py::TestActivationCheckpoint::test_ckpt_inputs1_outputs1[mask0]
6.17s call unit/checkpoint/test_sparse.py::TestSparseCheckpoint::test_non_strict_load_sparse[False-False-False-False]
6.17s call unit/runtime/activation_checkpointing/test_activation_checkpointing.py::TestCheckpointNonTensor::test_ckpt_non_tensor_input[non_tensor4]
6.17s call unit/runtime/test_lr_schedulers.py::TestLrSchedule::test_lr_warmup_schedule[log-15]
6.17s call unit/runtime/test_autocast.py::TestAutoCastDisable::test_disable_autocast_linear[True]
6.17s call unit/runtime/half_precision/test_dynamic_loss_scale.py::TestUnfused::test_no_overflow
6.16s call unit/runtime/test_lr_schedulers.py::TestLrSchedule::test_lr_warmup_schedule[log-33]
6.16s call unit/checkpoint/test_sparse.py::TestSparseCheckpoint::test_non_strict_load_sparse[False-False-True-True]
6.16s call unit/runtime/zero/test_zero.py::TestEmptyParameterGroup::test_empty_param_groups[dtype1-True-True]
6.16s setup unit/checkpoint/test_universal_checkpoint.py::TestZeROUniversalCheckpointDP::test_dp_world_size_2to4[False-False-1-dtype0]
6.16s call unit/runtime/test_lr_schedulers.py::TestLrSchedule::test_lr_warmup_decay_schedule[linear-33]
6.16s call unit/checkpoint/test_sparse.py::TestSparseCheckpoint::test_non_strict_load_sparse[True-True-False-False]
6.16s call unit/checkpoint/test_zero_optimizer.py::TestSaveTensorClone::test_save_tensor_clone[True-2]
6.15s call unit/comm/test_dist.py::TestDistInitWithModel::test_already_init[True]
6.15s call unit/runtime/zero/test_zero.py::TestEmptyParameterGroup::test_empty_param_groups[dtype0-False-False]
6.15s call unit/runtime/test_lr_schedulers.py::TestGetLrBeforeTrain::test[WarmupLR-params0]
6.15s call unit/checkpoint/test_other_optimizer.py::TestOtherOptimizerCheckpoint::test_checkpoint_fp32_optimizer
6.14s call unit/runtime/test_ds_initialize.py::TestClientLrSchedulerInit::test_diff_lrscheler_and_callable_onecyclelr_steplr[None]
6.14s call unit/runtime/half_precision/test_bf16.py::TestZeroSupportedClientOptimizer::test[FusedAdam]
6.13s setup unit/checkpoint/test_universal_checkpoint.py::TestZeROUniversalCheckpointDP::test_dp_world_size_2to2[True-False-3-dtype0]
6.13s call unit/runtime/activation_checkpointing/test_activation_checkpointing_non_reentrant.py::TestCheckpointNonTensorWithoutGrad::test_ckpt_non_tensor_input[non_tensor4]
6.13s call unit/runtime/half_precision/test_bf16.py::TestZeroSupportedClientOptimizer::test[Adam]
6.13s call unit/runtime/half_precision/test_fp16.py::TestZeroSupportedClientOptimizer::test[Adam-1]
6.13s call unit/runtime/test_multi_output_model.py::TestThreeOutputModel::test
6.13s call unit/runtime/activation_checkpointing/test_activation_checkpointing_non_reentrant.py::TestCheckpointNonTensorWithGrad::test_ckpt_non_tensor_input[non_tensor3]
6.12s call unit/checkpoint/test_tag_validation.py::TestCheckpointValidationTag::test_checkpoint_unique_tag[WARN]
6.12s call unit/runtime/activation_checkpointing/test_activation_checkpointing_non_reentrant.py::TestActivationCheckpointWithoutGrad::test_ckpt_inputs2_outputs1[mask0]
6.12s call unit/runtime/zero/test_zero.py::TestZero3InitForParentWeightInitialization::test
6.12s call unit/comm/test_dist.py::TestDistInit::test_no_init[None]
6.12s call unit/runtime/activation_checkpointing/test_activation_checkpointing.py::TestActivationCheckpoint::test_ckpt_inputs2_outputs2[mask0]
6.12s setup unit/checkpoint/test_universal_checkpoint.py::TestZeROUniversalCheckpointDP::test_dp_world_size_2to2[False-True-3-dtype0]
6.12s call unit/runtime/test_pld.py::TestPLDModel::test_pld_model[0]
6.12s setup unit/checkpoint/test_universal_checkpoint.py::TestZeROUniversalCheckpointDP::test_dp_world_size_2to4[False-False-3-dtype1]
6.11s call unit/runtime/zero/test_zero_nesting_init.py::TestShutdownInNestingInit::test_shutdown_in_nesting_init
6.11s call unit/runtime/activation_checkpointing/test_activation_checkpointing.py::TestCheckpointNonTensor::test_ckpt_non_tensor_input[None]
6.10s setup unit/checkpoint/test_universal_checkpoint.py::TestZeROUniversalCheckpointDP::test_dp_world_size_2to4[True-False-3-dtype1]
6.10s call unit/runtime/activation_checkpointing/test_activation_checkpointing.py::TestCheckpointNonTensor::test_ckpt_non_tensor_output[2]
6.10s call unit/runtime/activation_checkpointing/test_activation_checkpointing.py::TestCheckpointNonTensor::test_ckpt_non_tensor_output[non_tensor4]
6.10s setup unit/checkpoint/test_universal_checkpoint.py::TestZeROUniversalCheckpointDP::test_dp_world_size_2to2[False-True-1-dtype0]
6.10s call unit/runtime/test_ds_config_dict.py::TestInitNoOptimizer::test
6.09s call unit/runtime/half_precision/test_bf16.py::TestZeroAllowUntestedOptimizer::test
6.09s call unit/runtime/activation_checkpointing/test_activation_checkpointing_non_reentrant.py::TestCheckpointNonTensorWithoutGrad::test_ckpt_non_tensor_output[None]
6.09s call unit/runtime/half_precision/test_dynamic_loss_scale.py::TestFused::test_some_overflow
6.09s setup unit/checkpoint/test_universal_checkpoint.py::TestZeROUniversalCheckpointDP::test_dp_world_size_2to4[True-False-3-dtype0]
6.09s call unit/runtime/test_lr_schedulers.py::TestOneCycle::test_mom[0.08-0.09-0.001-101]
6.09s call unit/compression/test_compression.py::TestCompression::test_linear_layer_compress
6.09s call unit/runtime/test_pld.py::TestPLDModel::test_pld_model[0.9]
6.09s call unit/runtime/activation_checkpointing/test_activation_checkpointing_non_reentrant.py::TestCheckpointNonTensorWithoutGrad::test_ckpt_non_tensor_input[non_tensor3]
6.09s call unit/runtime/activation_checkpointing/test_activation_checkpointing_non_reentrant.py::TestCheckpointNonTensorWithGrad::test_ckpt_non_tensor_output[non_tensor4]
6.08s call unit/monitor/test_monitor.py::TestCSVMonitor::test_empty_csv_monitor
6.08s call unit/runtime/activation_checkpointing/test_activation_checkpointing_non_reentrant.py::TestActivationCheckpointWithGrad::test_ckpt_inputs2_outputs3[mask1]
6.07s call unit/runtime/activation_checkpointing/test_activation_checkpointing_non_reentrant.py::TestActivationCheckpointWithGrad::test_ckpt_inputs2_outputs1[mask0]
6.06s call unit/runtime/activation_checkpointing/test_activation_checkpointing.py::TestCheckpointNonTensorOutputOrdering::test_ckpt_non_tensor_output_ordering[non_tensor_output2]
6.06s setup unit/checkpoint/test_universal_checkpoint.py::TestZeROUniversalCheckpointDP::test_dp_world_size_2to4[True-False-1-dtype1]
6.06s call unit/checkpoint/test_sparse.py::TestSparseCheckpoint::test_non_strict_load_sparse[False-False-True-False]
6.06s call unit/runtime/activation_checkpointing/test_activation_checkpointing.py::TestCheckpointNonTensorOutputOrdering::test_ckpt_non_tensor_output_ordering[non_tensor_output3]
6.05s call unit/runtime/activation_checkpointing/test_activation_checkpointing_non_reentrant.py::TestCheckpointNonTensorOutputOrderingWithGrad::test_ckpt_non_tensor_output_ordering[None]
6.05s setup unit/checkpoint/test_universal_checkpoint.py::TestZeROUniversalCheckpointDP::test_dp_world_size_2to2[False-False-1-dtype0]
6.05s call unit/checkpoint/test_tag_validation.py::TestCheckpointValidationTag::test_checkpoint_unknown_tag_validation
6.05s setup unit/checkpoint/test_universal_checkpoint.py::TestZeROUniversalCheckpointDP::test_dp_world_size_2to4[True-True-3-dtype1]
6.05s call unit/runtime/test_lr_schedulers.py::TestLrSchedule::test_lr_warmup_decay_schedule[linear-10]
6.05s call unit/runtime/activation_checkpointing/test_activation_checkpointing_non_reentrant.py::TestCheckpointNonTensorOutputOrderingWithGrad::test_ckpt_non_tensor_output_ordering[non_tensor_output1]
6.05s setup unit/checkpoint/test_universal_checkpoint.py::TestZeROUniversalCheckpointDP::test_dp_world_size_2to2[False-True-1-dtype2]
6.05s call unit/runtime/activation_checkpointing/test_activation_checkpointing_non_reentrant.py::TestCheckpointNonTensorWithGrad::test_ckpt_non_tensor_input[non_tensor4]
6.05s call unit/runtime/activation_checkpointing/test_activation_checkpointing.py::TestActivationCheckpoint::test_ckpt_inputs2_outputs1[mask1]
6.04s call unit/runtime/zero/test_zero_dynamic_class.py::TestNewClassDeclaredNestingInit::test_new_class_declared_nesting_init
6.04s call unit/runtime/activation_checkpointing/test_activation_checkpointing_non_reentrant.py::TestCheckpointNonTensorWithoutGrad::test_ckpt_non_tensor_output[2]
6.04s call unit/runtime/test_ds_config_dict.py::TestDeprecatedDeepScaleConfig::test
6.04s setup unit/checkpoint/test_universal_checkpoint.py::TestZeROUniversalCheckpointDP::test_dp_world_size_2to2[False-True-3-dtype2]
6.04s call unit/compression/test_compression.py::TestCompression::test_conv1d_convertion
6.04s call unit/runtime/activation_checkpointing/test_activation_checkpointing.py::TestCheckpointNonTensorOutputOrdering::test_ckpt_non_tensor_output_ordering[None]
6.04s call unit/comm/test_dist.py::TestDistInit::test_no_init[True]
6.03s setup unit/checkpoint/test_universal_checkpoint.py::TestZeROUniversalCheckpointDP::test_dp_world_size_2to2[True-True-3-dtype2]
6.03s call unit/runtime/activation_checkpointing/test_activation_checkpointing_non_reentrant.py::TestActivationCheckpointWithoutGrad::test_ckpt_inputs2_outputs2[mask0]
6.03s call unit/runtime/activation_checkpointing/test_activation_checkpointing_non_reentrant.py::TestCheckpointNonTensorOutputOrderingWithoutGrad::test_ckpt_non_tensor_output_ordering[non_tensor_output1]
6.03s call unit/runtime/test_ds_initialize.py::TestClientOptimizer::test[Optimizer]
6.03s call unit/runtime/activation_checkpointing/test_activation_checkpointing_non_reentrant.py::TestActivationCheckpointWithGrad::test_ckpt_arg_none[mask0]
6.03s call unit/runtime/activation_checkpointing/test_activation_checkpointing.py::TestCheckpointNonTensor::test_ckpt_non_tensor_input[True]
6.03s call unit/runtime/activation_checkpointing/test_activation_checkpointing_non_reentrant.py::TestActivationCheckpointWithGrad::test_ckpt_inputs1_outputs1[mask1]
6.03s call unit/runtime/activation_checkpointing/test_activation_checkpointing_non_reentrant.py::TestCheckpointNonTensorWithoutGrad::test_ckpt_non_tensor_input[True]
6.03s setup unit/checkpoint/test_universal_checkpoint.py::TestZeROUniversalCheckpointDP::test_dp_world_size_2to2[False-False-3-dtype1]
6.03s call unit/runtime/activation_checkpointing/test_activation_checkpointing_non_reentrant.py::TestCheckpointNonTensorWithGrad::test_ckpt_non_tensor_input[None]
6.03s call unit/runtime/activation_checkpointing/test_activation_checkpointing_non_reentrant.py::TestCheckpointNonTensorOutputOrderingWithoutGrad::test_ckpt_non_tensor_output_ordering[non_tensor_output3]
6.02s call unit/runtime/activation_checkpointing/test_activation_checkpointing_non_reentrant.py::TestActivationCheckpointWithoutGrad::test_ckpt_inputs2_outputs1[mask1]
6.02s setup unit/checkpoint/test_universal_checkpoint.py::TestZeROUniversalCheckpointDP::test_dp_world_size_2to2[True-True-1-dtype2]
6.02s call unit/checkpoint/test_zero_optimizer.py::TestSaveTensorClone::test_save_tensor_clone[False-2]
6.02s setup unit/checkpoint/test_universal_checkpoint.py::TestZeROUniversalCheckpointDP::test_dp_world_size_2to2[True-True-3-dtype0]
6.02s setup unit/checkpoint/test_universal_checkpoint.py::TestZeROUniversalCheckpointDP::test_dp_world_size_2to2[True-False-3-dtype1]
6.02s call unit/runtime/half_precision/test_fp16.py::TestZeroAllowUntestedOptimizer::test[True-3]
6.02s call unit/runtime/activation_checkpointing/test_activation_checkpointing_non_reentrant.py::TestCheckpointNonTensorWithoutGrad::test_ckpt_non_tensor_output[non_tensor3]
6.02s call unit/checkpoint/test_zero_optimizer.py::TestSaveTensorClone::test_save_tensor_clone[False-1]
6.02s setup unit/checkpoint/test_universal_checkpoint.py::TestZeROUniversalCheckpointDP::test_dp_world_size_2to4[True-True-1-dtype2]
6.02s call unit/runtime/test_ds_initialize.py::TestClientLrScheduler::test[Optimizer-Callable]
6.02s call unit/runtime/zero/test_zero.py::TestZeroPartitionCache::test_training_partition_cache[False]
6.01s call unit/runtime/activation_checkpointing/test_activation_checkpointing_non_reentrant.py::TestActivationCheckpointWithGrad::test_ckpt_inputs2_outputs3[mask0]
6.01s call unit/runtime/test_ds_initialize.py::TestClientLrSchedulerInit::test_diff_lrscheler_and_callable[Callable]
6.01s call unit/comm/test_dist.py::TestWorldSizeOverrideDistTest::test_world_size_2
6.01s call unit/comm/test_dist.py::TestDistInit::test_already_init[False]
6.01s call unit/runtime/half_precision/test_dynamic_loss_scale.py::TestFused::test_no_overflow
6.01s call unit/runtime/test_ds_initialize.py::TestClientLrScheduler::test[None-Callable]
6.00s call unit/monitor/test_monitor.py::TestCSVMonitor::test_csv_monitor
6.00s call unit/runtime/test_ds_initialize.py::TestClientLrSchedulerInit::test_diff_lrscheler_and_callable[_LRScheduler]
6.00s call unit/runtime/activation_checkpointing/test_activation_checkpointing_non_reentrant.py::TestActivationCheckpointWithGrad::test_ckpt_arg_none[mask1]
6.00s call unit/runtime/activation_checkpointing/test_activation_checkpointing.py::TestCheckpointNonTensor::test_ckpt_non_tensor_input[2]
6.00s setup unit/checkpoint/test_universal_checkpoint.py::TestZeROUniversalCheckpointDP::test_dp_world_size_2to4[False-True-3-dtype0]
5.99s call unit/runtime/activation_checkpointing/test_activation_checkpointing.py::TestCheckpointNonTensorOutputOrdering::test_ckpt_non_tensor_output_ordering[non_tensor_output1]
5.99s call unit/runtime/test_ds_initialize.py::TestClientLrScheduler::test[Optimizer-None]
5.99s call unit/runtime/activation_checkpointing/test_activation_checkpointing.py::TestCheckpointNonTensor::test_ckpt_non_tensor_input[non_tensor3]
5.98s call unit/comm/test_dist.py::TestDistArgs::test[hello-icosahedron-1138-purple]
5.98s call unit/elasticity/test_elastic.py::TestNonElasticBatchParams::test
5.98s setup unit/checkpoint/test_universal_checkpoint.py::TestZeROUniversalCheckpointDP::test_dp_world_size_2to2[False-False-1-dtype1]
5.98s call unit/runtime/test_ds_initialize.py::TestClientLrSchedulerInit::test_diff_lrscheler_and_callable_onecyclelr_steplr[_LRScheduler]
5.98s setup unit/checkpoint/test_universal_checkpoint.py::TestZeROUniversalCheckpointDP::test_dp_world_size_2to4[True-False-1-dtype2]
5.98s call unit/runtime/test_autocast.py::TestAutoCastDisable::test_disable_autocast_linear[False]
5.98s call unit/runtime/test_lr_schedulers.py::TestOneCycle::test_lr[1e-05-0.01-0.001-10-100]
5.98s call unit/runtime/activation_checkpointing/test_activation_checkpointing_non_reentrant.py::TestActivationCheckpointWithoutGrad::test_ckpt_inputs2_outputs3[mask0]
5.98s call unit/comm/test_dist.py::TestDistributedFixture::test[2-32]
5.97s call unit/runtime/test_ds_initialize.py::TestClientLrScheduler::test[None-None]
5.97s call unit/runtime/activation_checkpointing/test_activation_checkpointing_non_reentrant.py::TestActivationCheckpointWithGrad::test_ckpt_inputs1_outputs1[mask0]
5.97s call unit/runtime/half_precision/test_dynamic_loss_scale.py::TestUnfused::test_all_overflow
5.96s call unit/runtime/activation_checkpointing/test_activation_checkpointing.py::TestActivationCheckpoint::test_ckpt_inputs2_outputs1[mask0]
5.96s call unit/runtime/test_autocast.py::TestAutoCastDisable::test_missing_amp_autocast[True]
5.96s call unit/runtime/test_ds_config_dict.py::TestBatchConfig::test[2-33-17-2-False]
5.95s call unit/runtime/activation_checkpointing/test_activation_checkpointing_non_reentrant.py::TestCheckpointNonTensorWithGrad::test_ckpt_non_tensor_output[None]
5.95s setup unit/checkpoint/test_universal_checkpoint.py::TestZeROUniversalCheckpointDP::test_dp_world_size_2to2[False-False-3-dtype0]
5.95s call unit/runtime/activation_checkpointing/test_activation_checkpointing_non_reentrant.py::TestActivationCheckpointWithoutGrad::test_ckpt_arg_none[mask0]
5.94s call unit/comm/test_dist.py::TestDistInit::test_already_init[True]
5.94s call unit/runtime/test_autocast.py::TestAutoCastEnable::test_autocast_linear[False-False]
5.94s setup unit/checkpoint/test_universal_checkpoint.py::TestZeROUniversalCheckpointDP::test_dp_world_size_2to4[False-True-3-dtype2]
5.94s call unit/elasticity/test_elastic.py::TestElasticConfigChanged::test
5.93s call unit/runtime/activation_checkpointing/test_activation_checkpointing_non_reentrant.py::TestActivationCheckpointWithGrad::test_ckpt_inputs2_outputs2[mask1]
5.93s setup unit/checkpoint/test_universal_checkpoint.py::TestZeROUniversalCheckpointDP::test_dp_world_size_2to4[True-True-3-dtype0]
5.93s setup unit/checkpoint/test_universal_checkpoint.py::TestZeROUniversalCheckpointDP::test_dp_world_size_2to2[True-False-1-dtype2]
5.93s call unit/runtime/half_precision/test_fp16.py::TestZeroAllowUntestedOptimizer::test[False-2]
5.92s setup unit/checkpoint/test_universal_checkpoint.py::TestZeROUniversalCheckpointDP::test_dp_world_size_2to2[False-True-3-dtype1]
5.92s call unit/runtime/activation_checkpointing/test_activation_checkpointing_non_reentrant.py::TestActivationCheckpointWithoutGrad::test_ckpt_inputs2_outputs2[mask1]
5.92s call unit/runtime/test_ds_initialize.py::TestConfigOptimizer::test[True]
5.92s call unit/runtime/test_autocast.py::TestAutoCastDisable::test_missing_amp_autocast[False]
5.92s call unit/runtime/test_ds_initialize.py::TestClientLrScheduler::test[Callable-None]
5.91s call unit/runtime/compile/test_compile_zero.py::TestZeRO::test_compile_zero[nvme-1-dtype1]
5.91s setup unit/checkpoint/test_universal_checkpoint.py::TestZeROUniversalCheckpointDP::test_dp_world_size_2to4[True-True-3-dtype2]
5.91s call unit/runtime/compile/test_compile_zero.py::TestZeRO::test_compile_zero[nvme-2-dtype2]
5.91s call unit/runtime/activation_checkpointing/test_activation_checkpointing_non_reentrant.py::TestCheckpointNonTensorWithGrad::test_ckpt_non_tensor_output[True]
5.91s call unit/runtime/test_ds_initialize.py::TestClientLrScheduler::test[Callable-Callable]
5.91s call unit/runtime/activation_checkpointing/test_activation_checkpointing_non_reentrant.py::TestCheckpointNonTensorWithGrad::test_ckpt_non_tensor_input[True]
5.91s call unit/ops/adam/test_adamw.py::TestAdamConfigs::test[AdamW-False-False-False-resulting_optimizer0]
5.91s call unit/runtime/activation_checkpointing/test_activation_checkpointing_non_reentrant.py::TestActivationCheckpointWithoutGrad::test_ckpt_inputs2_outputs3[mask1]
5.90s call unit/checkpoint/test_zero_optimizer.py::TestZeROSaveLoadEdgeCase::test_immediate_save_load[0]
5.90s call unit/checkpoint/test_zero_optimizer.py::TestZeRONonDistributed::test_chmod_exception_handling[2]
5.89s call unit/comm/test_dist.py::TestGroupedDistTest::test_one[1138]
5.89s call unit/runtime/test_data.py::TestDataLoaderDropLast::test[4-True]
5.88s call unit/runtime/test_ds_initialize.py::TestClientLrSchedulerInit::test_same_lrscheler_and_callable[None]
5.88s call unit/runtime/test_ds_config_dict.py::TestBatchConfig::test[2-32-16-1-True]
5.88s call unit/runtime/zero/test_zero_context.py::TestGatherUpdate::test
5.87s call unit/runtime/half_precision/test_fp16.py::TestZeroAllowUntestedOptimizer::test[False-3]
5.87s call unit/runtime/test_ds_initialize.py::TestClientLrSchedulerInit::test_same_lrscheler_and_callable[_LRScheduler]
5.87s setup unit/checkpoint/test_universal_checkpoint.py::TestZeROUniversalCheckpointDP::test_dp_world_size_2to4[False-False-3-dtype0]
5.87s call unit/checkpoint/test_latest_checkpoint.py::TestLatestCheckpoint::test_missing_latest
5.86s setup unit/checkpoint/test_universal_checkpoint.py::TestZeROUniversalCheckpointDP::test_dp_world_size_2to2[False-False-3-dtype2]
5.85s call unit/runtime/test_data.py::TestDataLoaderDropLast::test[1-True]
5.85s call unit/runtime/test_ds_initialize.py::TestClientLrSchedulerInit::test_diff_lrscheler_and_callable_onecyclelr_steplr[Callable]
5.85s call unit/comm/test_dist.py::TestDistributedFixture::test[2-16]
5.84s call unit/runtime/test_ds_config_dict.py::TestConfigLoad::test_json
5.84s call unit/runtime/activation_checkpointing/test_activation_checkpointing_non_reentrant.py::TestCheckpointNonTensorWithGrad::test_ckpt_non_tensor_output[non_tensor3]
5.84s call unit/runtime/activation_checkpointing/test_activation_checkpointing_non_reentrant.py::TestCheckpointNonTensorWithGrad::test_ckpt_non_tensor_output[2]
5.83s call unit/runtime/test_ds_initialize.py::TestConfigOptimizer::test[False]
5.83s call unit/runtime/test_data.py::TestDataLoaderDropLast::test[4-False]
5.82s call unit/comm/test_dist.py::TestDistInitNoEnv::test
5.81s call unit/runtime/compile/test_compile_zero.py::TestZeRO::test_compile_zero[nvme-2-dtype1]
5.81s call unit/runtime/compile/test_compile_zero.py::TestZeRO::test_compile_zero[nvme-1-dtype2]
5.81s call unit/runtime/zero/test_zero_context.py::TestScatterGather::test
5.81s call unit/runtime/half_precision/test_fp16.py::TestZeroEmptyPartition::test[True-3]
5.80s call unit/comm/test_dist.py::TestWorldSizeOverrideDistTest::test_world_size_1
5.80s call unit/runtime/test_ds_config_dict.py::TestConfigLoad::test_hjson
5.79s call unit/runtime/zero/test_zero_dynamic_class.py::TestNewClassDeclaredInsideNestingInit::test_new_class_declared_inside_nesting_init
5.78s call unit/runtime/test_ds_config_dict.py::TestConfigLoad::test_dict
5.75s call unit/runtime/test_ds_config_dict.py::TestNoModel::test
5.74s call unit/profiling/flops_profiler/test_flops_profiler.py::TestFlopsProfiler::test
5.74s call unit/comm/test_dist.py::TestDistributedFixture::test[4-32]
5.73s call unit/runtime/zero/test_zero_context.py::TestSerialContext::test_scatter_halftype
5.73s call unit/runtime/test_ds_config_dict.py::TestBasicConfig::test_check_version
5.71s call unit/runtime/compile/test_compile_zero.py::TestZeRO::test_compile_zero[nvme-2-dtype0]
5.71s call unit/runtime/test_ds_initialize.py::TestClientLrScheduler::test[Callable-_LRScheduler]
5.70s call unit/runtime/zero/test_zero_context_ancestry.py::TestDSInitWZinit::test
5.70s call unit/runtime/activation_checkpointing/test_activation_checkpointing_non_reentrant.py::TestCheckpointNonTensorOutputOrderingWithoutGrad::test_ckpt_non_tensor_output_ordering[non_tensor_output2]
5.69s call unit/runtime/zero/test_zero_context.py::TestSerialContext::test_subclass_param
5.68s setup unit/comm/test_dist.py::TestDistributedFixture::test[4-16]
5.67s call unit/ops/accelerators/test_accelerator_forward.py::TestCUDAForwardSmallBatchSize::test_forward_with_small_bsz[8-7-1024-512-16-3-False-True]
5.65s setup unit/comm/test_dist.py::TestDistributedFixture::test[2-32]
5.65s call unit/ops/accelerators/test_accelerator_backward.py::TestCUDABackward::test_backward[8-1600-128-25-3-True-True-0.05]
5.63s call unit/profiling/flops_profiler/test_flops_profiler.py::TestFlopsProfiler::test_flops_profiler_in_inference
5.63s call unit/ops/accelerators/test_accelerator_forward.py::TestCUDAForwardSmallBatchSize::test_forward_with_small_bsz[8-3-1024-512-16-3-True-False]
5.60s call unit/runtime/zero/test_zero_context.py::TestZeroGatheredParametersFree::test
5.58s call unit/runtime/zero/test_zero_context_ancestry.py::TestSerialParamInit::test_subclass_param_init
5.56s call unit/comm/test_dist.py::TestDistributedFixture::test[4-16]
5.52s call unit/ops/accelerators/test_accelerator_forward.py::TestCUDAForwardSmallBatchSize::test_forward_with_small_bsz[8-3-1024-512-16-3-False-False]
5.52s call unit/pipe/test_pipe_module.py::TestPipeModuleSequential::test[False]
5.51s call unit/ops/accelerators/test_accelerator_backward.py::TestCUDABackward::test_backward[8-1600-128-2-3-True-True-0.05]
5.46s call unit/runtime/zero/test_zero_context.py::TestMiCSGatheredParametersFree::test
5.44s call unit/ops/accelerators/test_accelerator_forward.py::TestCUDAForwardSmallBatchSize::test_forward_with_small_bsz[8-7-1024-512-16-3-True-True]
5.44s setup unit/comm/test_dist.py::TestDistributedFixture::test[2-16]
5.42s setup unit/comm/test_dist.py::TestDistributedFixture::test[4-32]
5.40s call unit/ops/accelerators/test_accelerator_backward.py::TestCUDABackward::test_backward[64-160-128-2-24-False-True-0.2]
5.33s call unit/ops/accelerators/test_accelerator_backward.py::TestCUDABackward::test_backward[8-160-128-2-3-True-True-0.1]
5.33s call unit/pipe/test_pipe_module.py::TestPipeModuleSequential::test[True]
4.83s call unit/comm/test_dist.py::TestDistInit::test_no_init[False]
4.53s call unit/comm/test_dist.py::TestDistInitWithModel::test_no_init[False]
3.19s call unit/inference/quantization/test_intX_quantization.py::TestQuantizedInt::test_zero3_int4_post_init_quant[4bits]
2.31s call unit/runtime/zero/test_zero_tensor_fragment.py::TestTensorFragmentGet::test_zero_fragments[nvme-3-full-True]
2.15s call unit/runtime/zero/test_zero_tensor_fragment.py::TestTensorFragmentUpdate::test_zero_fragments[cpu-3-local-dtype1]
2.04s call unit/runtime/zero/test_zero_tensor_fragment.py::TestTensorFragmentUpdate::test_zero_fragments[nvme-3-full-dtype2]
2.00s call unit/runtime/zero/test_zero_tensor_fragment.py::TestTensorFragmentGet::test_zero_fragments[nvme-3-local-False]
1.97s call unit/ops/adam/test_hybrid_adam.py::TestHybridAdam::test_hybrid_adam_equal[8-fp16]
1.95s call unit/runtime/zero/test_zero_tensor_fragment.py::TestTensorFragmentUpdate::test_zero_fragments[nvme-3-full-dtype0]
1.91s call unit/runtime/zero/test_zero_tensor_fragment.py::TestTensorFragmentUpdate::test_zero_fragments[cpu-3-full-dtype2]
1.91s call unit/runtime/zero/test_zero_tensor_fragment.py::TestTensorFragmentUpdate::test_zero_fragments[nvme-3-full-dtype1]
1.90s call unit/runtime/zero/test_zero_tensor_fragment.py::TestTensorFragmentUpdate::test_zero_fragments[cpu-1-full-dtype0]
1.90s call unit/runtime/zero/test_zero_tensor_fragment.py::TestTensorFragmentGet::test_zero_fragments[cpu-3-full-False]
1.90s call unit/runtime/zero/test_zero_tensor_fragment.py::TestTensorFragmentUpdate::test_zero_fragments[cpu-3-full-dtype0]
1.90s call unit/runtime/zero/test_zero_tensor_fragment.py::TestTensorFragmentGet::test_zero_fragments[nvme-3-full-False]
1.89s call unit/runtime/zero/test_zero_tensor_fragment.py::TestTensorFragmentGet::test_zero_fragments[cpu-3-full-True]
1.89s call unit/runtime/zero/test_zero_tensor_fragment.py::TestTensorFragmentUpdate::test_zero_fragments[cpu-3-full-dtype1]
1.86s call unit/runtime/zero/test_zero_tensor_fragment.py::TestTensorFragmentGet::test_zero_fragments[cpu-3-local-False]
1.86s call unit/runtime/zero/test_zero_tensor_fragment.py::TestTensorFragmentGet::test_zero_fragments[nvme-3-local-True]
1.84s call unit/runtime/zero/test_zero_tensor_fragment.py::TestTensorFragmentUpdate::test_zero_fragments[nvme-3-local-dtype0]
1.82s call unit/runtime/zero/test_zero_tensor_fragment.py::TestTensorFragmentUpdate::test_zero_fragments[cpu-3-local-dtype0]
1.81s call unit/runtime/zero/test_zero_tensor_fragment.py::TestTensorFragmentUpdate::test_zero_fragments[cpu-3-local-dtype2]
1.81s call unit/runtime/zero/test_zero_tensor_fragment.py::TestTensorFragmentUpdate::test_zero_fragments[nvme-3-local-dtype1]
1.77s call unit/runtime/zero/test_zero_tensor_fragment.py::TestTensorFragmentGet::test_zero_fragments[cpu-3-local-True]
1.76s call unit/ops/adam/test_adamw.py::TestAdamConfigs::test[AdamW-True-False-False-resulting_optimizer2]
1.75s call unit/runtime/zero/test_zero_tensor_fragment.py::TestTensorFragmentUpdate::test_zero_fragments[nvme-3-local-dtype2]
1.71s call unit/ops/adam/test_cpu_adam.py::TestCPUAdam::test_fused_adam_equal[64-fp16]
1.70s call unit/runtime/zero/test_zero_tensor_fragment.py::TestTensorFragmentUpdate::test_zero_fragments[cpu-1-full-dtype2]
1.67s call unit/ops/lion/test_cpu_lion.py::TestCPULion::test_fused_lion_equal[64-fp16]
1.64s call unit/ops/lion/test_lion.py::TestLionConfigs::test[Lion-True-DeepSpeedCPULion]
1.64s call unit/ops/adam/test_adamw.py::TestAdamConfigs::test[Adam-True-False-False-resulting_optimizer10]
1.61s call unit/ops/adam/test_adamw.py::TestAdamConfigs::test[Adam-True-False-True-resulting_optimizer14]
1.60s call unit/ops/adam/test_adamw.py::TestAdamConfigs::test[AdamW-True-False-True-resulting_optimizer6]
1.54s call unit/ops/adam/test_cpu_adam.py::TestCPUAdam::test_torch_adamw_equal[1048576-fp32]
1.47s call unit/ops/adam/test_cpu_adam.py::TestCPUAdam::test_torch_adamw_equal[1048576-bf16]
1.47s call unit/runtime/zero/test_zero_tensor_fragment.py::TestTensorFragmentUpdate::test_zero_fragments[cpu-2-full-dtype0]
1.46s call unit/runtime/zero/test_zero_tensor_fragment.py::TestTensorFragmentUpdate::test_zero_fragments[cpu-2-full-dtype1]
1.45s call unit/runtime/zero/test_zero_tensor_fragment.py::TestTensorFragmentUpdate::test_zero_fragments[cpu-1-full-dtype1]
1.44s call unit/runtime/zero/test_zero_tensor_fragment.py::TestTensorFragmentUpdate::test_zero_fragments[cpu-2-full-dtype2]
1.43s call unit/ops/adam/test_adamw.py::TestAdamConfigs::test[AdamW-False-True-False-resulting_optimizer1]
1.40s call unit/runtime/zero/test_zero_tensor_fragment.py::TestTensorFragmentGet::test_zero_fragments[cpu-1-full-False]
1.35s call unit/runtime/zero/test_zero_tensor_fragment.py::TestTensorFragmentGet::test_zero_fragments[cpu-1-full-True]
1.35s call unit/runtime/zero/test_zero_tensor_fragment.py::TestTensorFragmentGet::test_zero_fragments[cpu-2-full-True]
1.35s call unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[fp32-fp16-zero3]
1.35s call unit/runtime/zero/test_zero_tensor_fragment.py::TestTensorFragmentGet::test_zero_fragments[cpu-2-full-False]
1.34s call unit/runtime/zero/test_zero_tensor_fragment.py::TestTensorFragmentGet::test_bf16_fragments[False]
1.32s call unit/runtime/zero/test_zero_tensor_fragment.py::TestTensorFragmentUpdate::test_zero_fragments[none-3-full-dtype0]
1.29s call unit/runtime/zero/test_zero_tensor_fragment.py::TestTensorFragmentUpdate::test_zero_fragments[none-3-local-dtype0]
1.27s call unit/runtime/zero/test_zero_tensor_fragment.py::TestTensorFragmentUpdate::test_zero_fragments[none-3-full-dtype1]
1.27s call unit/runtime/zero/test_zero_tensor_fragment.py::TestTensorFragmentGet::test_zero_fragments[none-3-full-True]
1.26s call unit/runtime/zero/test_zero_tensor_fragment.py::TestTensorFragmentGet::test_zero_fragments[none-3-local-True]
1.25s call unit/runtime/zero/test_zero_tensor_fragment.py::TestTensorFragmentUpdate::test_zero_fragments[none-3-full-dtype2]
1.23s call unit/runtime/zero/test_zero_tensor_fragment.py::TestTensorFragmentGet::test_zero_fragments[none-3-full-False]
1.22s call unit/runtime/zero/test_zero_leaf_module.py::TestSetZ3LeafModule::test_choose_module_by_rank
1.20s call unit/runtime/zero/test_zero_tensor_fragment.py::TestTensorFragmentUpdate::test_zero_fragments[none-2-full-dtype2]
1.20s call unit/runtime/zero/test_zero_tensor_fragment.py::TestTensorFragmentGet::test_zero_fragments[none-3-local-False]
1.19s call unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[None-fp16-zero3]
1.19s call unit/runtime/zero/test_zero_tensor_fragment.py::TestTensorFragmentUpdate::test_zero_fragments[none-2-full-dtype1]
1.19s call unit/runtime/zero/test_zero_tensor_fragment.py::TestTensorFragmentUpdate::test_zero_fragments[none-3-local-dtype2]
1.17s call unit/runtime/zero/test_zero_tensor_fragment.py::TestTensorFragmentUpdate::test_zero_fragments[none-1-full-dtype2]
1.16s call unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[bf16-fp16-zero3]
1.16s call unit/runtime/zero/test_zero_tensor_fragment.py::TestTensorFragmentUpdate::test_zero_fragments[none-1-full-dtype1]
1.14s call unit/runtime/zero/test_zero_tensor_fragment.py::TestTensorFragmentUpdate::test_zero_fragments[none-3-local-dtype1]
1.14s call unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[fp16-bf16-zero3]
1.11s call unit/runtime/zero/test_zero_leaf_module.py::TestSetZ3LeafModule::test_no_grad_input_error
1.10s call unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[fp32-bf16-zero3]
1.09s call unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[fp32-fp32-zero3]
1.09s call unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[fp16-fp16-zero3]
1.09s call unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[fp16-fp32-zero3]
1.08s call unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[bf16-fp32-zero3]
1.08s call unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[None-fp32-zero3]
1.07s call unit/ops/adam/test_hybrid_adam.py::TestHybridAdam::test_hybrid_adam_equal[16-fp16]
1.07s call unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[bf16-bf16-zero3]
1.07s call unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[None-bf16-zero3]
1.05s call unit/inference/quantization/test_intX_quantization.py::TestQuantizedInt::test_model_quantization[4bits]
1.02s call unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[None-bf16-zero1]
1.02s call unit/runtime/zero/test_zero_tensor_fragment.py::TestTensorFragmentUpdate::test_zero_fragments[none-2-full-dtype0]
1.01s call unit/ops/adam/test_hybrid_adam.py::TestHybridAdam::test_hybrid_adam_equal[8-bf16]
(2628 durations < 1s hidden. Use -vv to show these durations.)
===================================================================================== short test summary info =====================================================================================
FAILED tests/unit/inference/quantization/test_intX_quantization.py::TestQuantizedInt::test_zero3_int4_post_init_quant[4bits] - ModuleNotFoundError: No module named 'mpi4py'
FAILED tests/unit/inference/quantization/test_intX_quantization.py::TestQuantizedInt::test_zero3_int4_post_init_quant[8bits] - ModuleNotFoundError: No module named 'mpi4py'
FAILED tests/unit/inference/quantization/test_intX_quantization.py::TestQuantizedInt::test_zero3_int4_post_init_quant_cpu_offload[4bits] - ModuleNotFoundError: No module named 'mpi4py'
FAILED tests/unit/inference/quantization/test_intX_quantization.py::TestQuantizedInt::test_zero3_int4_post_init_quant_cpu_offload[8bits] - ModuleNotFoundError: No module named 'mpi4py'
FAILED tests/unit/inference/quantization/test_intX_quantization.py::TestQuantizedInt::test_zero3_int4_post_init_quant_nvme_offload - ModuleNotFoundError: No module named 'mpi4py'
FAILED tests/unit/inference/quantization/test_intX_quantization.py::TestQuantizedInt::test_zero3_int4_quantized_initialization[4bits] - ModuleNotFoundError: No module named 'mpi4py'
FAILED tests/unit/inference/quantization/test_intX_quantization.py::TestQuantizedInt::test_zero3_int4_quantized_initialization[8bits] - ModuleNotFoundError: No module named 'mpi4py'
FAILED tests/unit/inference/quantization/test_intX_quantization.py::TestQuantizedInt::test_zero3_int4_quantized_initialization_cpu_offload[4bits] - ModuleNotFoundError: No module named 'mpi4py'
FAILED tests/unit/inference/quantization/test_intX_quantization.py::TestQuantizedInt::test_zero3_int4_quantized_initialization_cpu_offload[8bits] - ModuleNotFoundError: No module named 'mpi4py'
FAILED tests/unit/inference/quantization/test_intX_quantization.py::TestQuantizedInt::test_zero3_int4_quantized_initialization_nvme_offload - ModuleNotFoundError: No module named 'mpi4py'
FAILED tests/unit/ops/accelerators/test_accelerator_backward.py::TestCUDABackward::test_backward[64-160-128-2-24-False-True-0.2] - ImportError: dynamic module does not define module export function (PyInit_transformer_op)
FAILED tests/unit/ops/accelerators/test_accelerator_backward.py::TestCUDABackward::test_backward[64-1600-128-2-4-False-True-0.2] - ImportError: dynamic module does not define module export function (PyInit_transformer_op)
FAILED tests/unit/ops/accelerators/test_accelerator_backward.py::TestCUDABackward::test_backward[8-1600-128-25-3-True-True-0.05] - ImportError: dynamic module does not define module export function (PyInit_transformer_op)
FAILED tests/unit/ops/accelerators/test_accelerator_backward.py::TestCUDABackward::test_backward[8-160-128-2-3-True-True-0.1] - ImportError: dynamic module does not define module export function (PyInit_transformer_op)
FAILED tests/unit/ops/accelerators/test_accelerator_backward.py::TestCUDABackward::test_backward[8-1600-128-2-3-True-True-0.05] - ImportError: dynamic module does not define module export function (PyInit_transformer_op)
FAILED tests/unit/ops/accelerators/test_accelerator_forward.py::TestCUDAForwardSmallBatchSize::test_forward_with_small_bsz[8-3-1024-512-16-3-True-False] - ImportError: dynamic module does not define module export function (PyInit_transformer_op)
FAILED tests/unit/ops/accelerators/test_accelerator_forward.py::TestCUDAForwardSmallBatchSize::test_forward_with_small_bsz[8-7-1024-512-16-3-True-True] - ImportError: dynamic module does not define module export function (PyInit_transformer_op)
FAILED tests/unit/ops/accelerators/test_accelerator_forward.py::TestCUDAForwardSmallBatchSize::test_forward_with_small_bsz[8-3-1024-512-16-3-False-False] - ImportError: dynamic module does not define module export function (PyInit_transformer_op)
FAILED tests/unit/ops/accelerators/test_accelerator_forward.py::TestCUDAForwardSmallBatchSize::test_forward_with_small_bsz[8-7-1024-512-16-3-False-True] - ImportError: dynamic module does not define module export function (PyInit_transformer_op)
FAILED tests/unit/pipe/test_pipe_module.py::TestPipeModuleSequential::test[False] - RuntimeError: "addmm_impl_cpu_" not implemented for 'Half'
FAILED tests/unit/pipe/test_pipe_module.py::TestPipeModuleSequential::test[True] - RuntimeError: "addmm_impl_cpu_" not implemented for 'Half'
==================================================== 21 failed, 1033 passed, 129 skipped, 4426 deselected, 185 warnings in 5739.02s (1:35:39) =====================================================
(at easybuild/tools/run.py:695 in parse_cmd_output)
== 2024-11-04 17:44:09,951 build_log.py:267 INFO ... (took 1 hour 43 mins 2 secs)
== 2024-11-04 17:44:09,951 build_log.py:267 INFO ... (took 1 hour 43 mins 21 secs)
== 2024-11-04 17:44:09,951 filetools.py:2025 INFO Removing lock /apps/Test/software/.locks/_apps_Test_software_DeepSpeed_0.14.5-foss-2023a-CUDA-12.1.1.lock...
== 2024-11-04 17:44:09,957 filetools.py:385 INFO Path /apps/Test/software/.locks/_apps_Test_software_DeepSpeed_0.14.5-foss-2023a-CUDA-12.1.1.lock successfully removed.
== 2024-11-04 17:44:09,957 filetools.py:2029 INFO Lock removed: /apps/Test/software/.locks/_apps_Test_software_DeepSpeed_0.14.5-foss-2023a-CUDA-12.1.1.lock
== 2024-11-04 17:44:09,957 easyblock.py:4297 WARNING build failed (first 300 chars): cmd "export PATH=/cephyr/NOBACKUP/priv/c3-staff/eb-tmp/eb-zy0xl6iw/tmplkzkg3ii/bin:$PATH PYTHONPATH=/cephyr/NOBACKUP/priv/c3-staff/eb-tmp/eb-zy0xl6iw/tmplkzkg3ii/lib/python3.11/site-packages:$PYTHONPATH && ln -s $PWD/tests/ ../tests && cd ../ && pytest tests/unit/ -k "not TestTensorBoard and not Te
== 2024-11-04 17:44:09,958 easyblock.py:326 INFO Closing log for application name DeepSpeed version 0.14.5