(partial) EasyBuild log for failed build of /apps/c3se-easyconfigs/DeepSpeed-0.14.5-foss-2023a-CUDA-12.1.1.eb
5.98s call unit/ops/accelerators/test_accelerator_backward.py::TestCUDABackward::test_backward[64-1600-128-2-4-False-True-0.2]
5.94s call unit/runtime/half_precision/test_fp16.py::TestAdamwFP16EmptyGrad::test
5.94s call unit/runtime/activation_checkpointing/test_activation_checkpointing.py::TestActivationCheckpoint::test_ckpt_inputs2_outputs3[mask1]
5.93s call unit/runtime/activation_checkpointing/test_activation_checkpointing.py::TestCheckpointNonTensor::test_ckpt_non_tensor_output[non_tensor3]
5.93s call unit/runtime/activation_checkpointing/test_activation_checkpointing_non_reentrant.py::TestCheckpointNonTensorWithoutGrad::test_ckpt_non_tensor_input[None]
5.92s call unit/runtime/half_precision/test_fp16.py::TestZeroAllowUntestedOptimizer::test[False-3]
5.90s call unit/runtime/test_lr_schedulers.py::TestOneCycle::test_mom[0.08-0.09-0.001-101]
5.90s call unit/runtime/activation_checkpointing/test_activation_checkpointing_non_reentrant.py::TestCheckpointNonTensorOutputOrderingWithGrad::test_ckpt_non_tensor_output_ordering[None]
5.90s call unit/runtime/activation_checkpointing/test_activation_checkpointing.py::TestCheckpointNonTensor::test_ckpt_non_tensor_output[non_tensor4]
5.89s call unit/runtime/activation_checkpointing/test_activation_checkpointing_non_reentrant.py::TestCheckpointNonTensorWithGrad::test_ckpt_non_tensor_input[True]
5.88s call unit/runtime/activation_checkpointing/test_activation_checkpointing.py::TestCheckpointNonTensor::test_ckpt_non_tensor_input[None]
5.88s call unit/runtime/half_precision/test_fp16.py::TestZeroAllowUntestedOptimizer::test[True-3]
5.86s call unit/runtime/test_lr_schedulers.py::TestSchedulerOptimizerParity::test[WarmupDecayLR-params1]
5.84s call unit/runtime/activation_checkpointing/test_activation_checkpointing_non_reentrant.py::TestCheckpointNonTensorWithoutGrad::test_ckpt_non_tensor_output[True]
5.84s call unit/runtime/test_ds_initialize.py::TestClientLrSchedulerInit::test_diff_lrscheler_and_callable[Callable]
5.83s call unit/runtime/activation_checkpointing/test_activation_checkpointing_non_reentrant.py::TestCheckpointNonTensorWithoutGrad::test_ckpt_non_tensor_input[non_tensor4]
5.82s call unit/runtime/activation_checkpointing/test_activation_checkpointing.py::TestActivationCheckpoint::test_ckpt_inputs2_outputs1[mask0]
5.82s call unit/runtime/half_precision/test_fp16.py::TestZeroAllowUntestedOptimizer::test[False-1]
5.81s call unit/runtime/test_lr_schedulers.py::TestWarmupCosineLR::test_lr[100-10-0.1-0.2]
5.81s call unit/runtime/activation_checkpointing/test_activation_checkpointing_non_reentrant.py::TestActivationCheckpointWithGrad::test_ckpt_arg_none[mask0]
5.81s call unit/checkpoint/test_latest_checkpoint.py::TestLatestCheckpoint::test_missing_latest
5.81s call unit/comm/test_dist.py::TestDistAllReduce::test
5.80s call unit/runtime/activation_checkpointing/test_activation_checkpointing.py::TestCheckpointNonTensor::test_ckpt_non_tensor_input[non_tensor4]
5.80s call unit/runtime/activation_checkpointing/test_activation_checkpointing_non_reentrant.py::TestCheckpointNonTensorOutputOrderingWithoutGrad::test_ckpt_non_tensor_output_ordering[non_tensor_output1]
5.79s call unit/runtime/half_precision/test_dynamic_loss_scale.py::TestFused::test_some_overflow
5.79s call unit/runtime/test_data.py::TestDataLoaderDropLast::test[4-True]
5.79s call unit/runtime/activation_checkpointing/test_activation_checkpointing_non_reentrant.py::TestCheckpointNonTensorOutputOrderingWithoutGrad::test_ckpt_non_tensor_output_ordering[non_tensor_output3]
5.79s call unit/runtime/activation_checkpointing/test_activation_checkpointing_non_reentrant.py::TestActivationCheckpointWithoutGrad::test_ckpt_arg_none[mask1]
5.78s call unit/runtime/activation_checkpointing/test_activation_checkpointing.py::TestCheckpointNonTensor::test_ckpt_non_tensor_output[True]
5.78s call unit/runtime/activation_checkpointing/test_activation_checkpointing.py::TestCheckpointNonTensor::test_ckpt_non_tensor_input[non_tensor3]
5.77s call unit/runtime/activation_checkpointing/test_activation_checkpointing_non_reentrant.py::TestActivationCheckpointWithGrad::test_ckpt_inputs2_outputs2[mask1]
5.77s call unit/runtime/test_ds_initialize.py::TestClientLrSchedulerInit::test_same_lrscheler_and_callable[_LRScheduler]
5.77s call unit/runtime/test_multi_output_model.py::TestTwoOutputModel::test
5.77s call unit/runtime/half_precision/test_fp16.py::TestZeroSupportedClientOptimizer::test[Adam-2]
5.76s call unit/runtime/zero/test_zero.py::TestZeroOffloadOptim::test[False]
5.76s call unit/runtime/activation_checkpointing/test_activation_checkpointing_non_reentrant.py::TestActivationCheckpointWithGrad::test_ckpt_inputs2_outputs3[mask0]
5.75s call unit/runtime/zero/test_zero.py::TestZeroUnbalancedGradients::test[2]
5.74s call unit/runtime/activation_checkpointing/test_activation_checkpointing_non_reentrant.py::TestCheckpointNonTensorWithGrad::test_ckpt_non_tensor_input[non_tensor3]
5.74s call unit/runtime/activation_checkpointing/test_activation_checkpointing_non_reentrant.py::TestCheckpointNonTensorOutputOrderingWithGrad::test_ckpt_non_tensor_output_ordering[non_tensor_output2]
5.73s call unit/runtime/zero/test_zero_context.py::TestMiCSGatheredParametersFree::test
5.73s call unit/runtime/test_ds_initialize.py::TestClientLrSchedulerInit::test_same_lrscheler_and_callable[Callable]
5.73s call unit/runtime/activation_checkpointing/test_activation_checkpointing_non_reentrant.py::TestActivationCheckpointWithGrad::test_ckpt_inputs2_outputs2[mask0]
5.72s call unit/runtime/activation_checkpointing/test_activation_checkpointing.py::TestCheckpointNonTensor::test_ckpt_non_tensor_output[2]
5.72s call unit/runtime/activation_checkpointing/test_activation_checkpointing_non_reentrant.py::TestCheckpointNonTensorWithGrad::test_ckpt_non_tensor_output[2]
5.72s call unit/comm/test_dist.py::TestDistInferenceAllReduce::test[dtype1]
5.71s call unit/runtime/activation_checkpointing/test_activation_checkpointing_non_reentrant.py::TestActivationCheckpointWithoutGrad::test_ckpt_inputs2_outputs3[mask0]
5.71s call unit/runtime/half_precision/test_bf16.py::TestZeroSupportedClientOptimizer::test[Adam]
5.71s call unit/runtime/half_precision/test_fp16.py::TestZeroSupportedClientOptimizer::test[Adam-3]
5.70s call unit/runtime/test_lr_schedulers.py::TestLrSchedule::test_lr_warmup_schedule[linear-19]
5.70s call unit/runtime/activation_checkpointing/test_activation_checkpointing_non_reentrant.py::TestCheckpointNonTensorWithGrad::test_ckpt_non_tensor_input[None]
5.70s call unit/runtime/activation_checkpointing/test_activation_checkpointing.py::TestActivationCheckpoint::test_ckpt_arg_none[mask0]
5.70s call unit/runtime/zero/test_zero.py::TestEmptyParameterGroup::test_empty_param_groups[dtype2-False-False]
5.70s call unit/runtime/zero/test_zero.py::TestEmptyParameterGroup::test_empty_param_groups[dtype0-True-True]
5.70s call unit/runtime/zero/test_zero.py::TestEmptyParameterGroup::test_empty_param_groups[dtype0-False-True]
5.69s call unit/ops/accelerators/test_accelerator_backward.py::TestCUDABackward::test_backward[8-1600-128-2-3-True-True-0.05]
5.69s call unit/runtime/activation_checkpointing/test_activation_checkpointing.py::TestActivationCheckpoint::test_ckpt_arg_none[mask1]
5.68s call unit/runtime/test_ds_initialize.py::TestClientOptimizer::test[Optimizer]
5.68s call unit/runtime/test_ds_initialize.py::TestClientLrScheduler::test[Optimizer-Callable]
5.68s call unit/runtime/activation_checkpointing/test_activation_checkpointing.py::TestActivationCheckpoint::test_ckpt_inputs2_outputs3[mask0]
5.68s call unit/runtime/test_lr_schedulers.py::TestLrSchedule::test_lr_warmup_schedule[log-19]
5.67s call unit/runtime/zero/test_zero.py::TestEmptyParameterGroup::test_empty_param_groups[dtype2-False-True]
5.67s call unit/runtime/activation_checkpointing/test_activation_checkpointing.py::TestCheckpointNonTensorOutputOrdering::test_ckpt_non_tensor_output_ordering[None]
5.67s call unit/runtime/activation_checkpointing/test_activation_checkpointing.py::TestActivationCheckpoint::test_ckpt_inputs1_outputs1[mask1]
5.67s call unit/runtime/activation_checkpointing/test_activation_checkpointing_non_reentrant.py::TestCheckpointNonTensorWithGrad::test_ckpt_non_tensor_output[non_tensor4]
5.66s call unit/runtime/half_precision/test_fp16.py::TestZeroStaticScale::test[False-2]
5.66s call unit/runtime/test_lr_schedulers.py::TestLrSchedule::test_lr_warmup_schedule[log-15]
5.66s call unit/runtime/activation_checkpointing/test_activation_checkpointing_non_reentrant.py::TestActivationCheckpointWithoutGrad::test_ckpt_inputs2_outputs2[mask0]
5.65s call unit/runtime/activation_checkpointing/test_activation_checkpointing_non_reentrant.py::TestCheckpointNonTensorWithoutGrad::test_ckpt_non_tensor_input[2]
5.64s call unit/runtime/activation_checkpointing/test_activation_checkpointing.py::TestCheckpointNonTensorOutputOrdering::test_ckpt_non_tensor_output_ordering[non_tensor_output1]
5.64s call unit/runtime/zero/test_zero.py::TestZeroOffloadOptim::test[True]
5.64s call unit/runtime/test_pld.py::TestNonPLDModel::test_non_pld_model
5.64s call unit/runtime/activation_checkpointing/test_activation_checkpointing_non_reentrant.py::TestActivationCheckpointWithGrad::test_ckpt_inputs2_outputs1[mask0]
5.64s call unit/runtime/half_precision/test_bf16.py::TestZeroAllowUntestedOptimizer::test
5.63s call unit/runtime/activation_checkpointing/test_activation_checkpointing_non_reentrant.py::TestCheckpointNonTensorWithGrad::test_ckpt_non_tensor_output[True]
5.63s call unit/runtime/test_data.py::TestDataLoaderDropLast::test[1-True]
5.63s call unit/runtime/activation_checkpointing/test_activation_checkpointing_non_reentrant.py::TestActivationCheckpointWithoutGrad::test_ckpt_inputs1_outputs1[mask1]
5.62s call unit/runtime/activation_checkpointing/test_activation_checkpointing_non_reentrant.py::TestActivationCheckpointWithGrad::test_ckpt_inputs2_outputs1[mask1]
5.62s call unit/runtime/zero/test_zero_context.py::TestSerialContext::test_throughput_calculation
5.61s call unit/runtime/activation_checkpointing/test_activation_checkpointing_non_reentrant.py::TestCheckpointNonTensorWithGrad::test_ckpt_non_tensor_input[2]
5.61s call unit/runtime/zero/test_zero.py::TestEmptyParameterGroup::test_empty_param_groups[dtype2-True-True]
5.60s call unit/runtime/activation_checkpointing/test_activation_checkpointing_non_reentrant.py::TestActivationCheckpointWithoutGrad::test_ckpt_inputs1_outputs1[mask0]
5.60s call unit/runtime/half_precision/test_fp16.py::TestZeroAllowUntestedOptimizer::test[True-2]
5.59s call unit/runtime/activation_checkpointing/test_activation_checkpointing_non_reentrant.py::TestActivationCheckpointWithoutGrad::test_ckpt_inputs2_outputs1[mask0]
5.59s call unit/runtime/test_data.py::TestDataLoaderDropLast::test[1-False]
5.59s call unit/runtime/test_ds_initialize.py::TestClientLrScheduler::test[Optimizer-None]
5.59s call unit/runtime/test_ds_initialize.py::TestClientLrSchedulerInit::test_diff_lrscheler_and_callable[None]
5.58s call unit/runtime/activation_checkpointing/test_activation_checkpointing_non_reentrant.py::TestActivationCheckpointWithoutGrad::test_ckpt_inputs2_outputs1[mask1]
5.58s call unit/runtime/activation_checkpointing/test_activation_checkpointing_non_reentrant.py::TestCheckpointNonTensorWithGrad::test_ckpt_non_tensor_input[non_tensor4]
5.58s call unit/runtime/zero/test_zero.py::TestEmptyParameterGroup::test_empty_param_groups[dtype1-True-True]
5.58s call unit/runtime/activation_checkpointing/test_activation_checkpointing_non_reentrant.py::TestCheckpointNonTensorWithGrad::test_ckpt_non_tensor_output[None]
5.57s call unit/runtime/activation_checkpointing/test_activation_checkpointing.py::TestActivationCheckpoint::test_ckpt_inputs1_outputs1[mask0]
5.57s call unit/runtime/activation_checkpointing/test_activation_checkpointing_non_reentrant.py::TestCheckpointNonTensorOutputOrderingWithGrad::test_ckpt_non_tensor_output_ordering[non_tensor_output1]
5.56s call unit/runtime/activation_checkpointing/test_activation_checkpointing_non_reentrant.py::TestCheckpointNonTensorWithoutGrad::test_ckpt_non_tensor_output[2]
5.55s call unit/runtime/test_lr_schedulers.py::TestLrSchedule::test_lr_warmup_decay_schedule[linear-15]
5.55s call unit/runtime/half_precision/test_fp16.py::TestZeroAllowUntestedOptimizer::test[True-1]
5.55s call unit/runtime/test_ds_initialize.py::TestClientLrScheduler::test[Optimizer-_LRScheduler]
5.54s call unit/runtime/activation_checkpointing/test_activation_checkpointing_non_reentrant.py::TestCheckpointNonTensorWithGrad::test_ckpt_non_tensor_output[non_tensor3]
5.53s call unit/comm/test_dist.py::TestWorldSizeOverrideDistTest::test_world_size_1
5.53s call unit/profiling/flops_profiler/test_flops_profiler.py::TestFlopsProfiler::test_flops_profiler_in_inference
5.53s call unit/runtime/test_ds_config_dict.py::TestInitNoOptimizer::test
5.52s call unit/utils/test_init_on_device.py::TestOnDevice::test_on_device[cuda:0]
5.52s call unit/runtime/activation_checkpointing/test_activation_checkpointing_non_reentrant.py::TestCheckpointNonTensorWithoutGrad::test_ckpt_non_tensor_output[None]
5.52s call unit/runtime/activation_checkpointing/test_activation_checkpointing_non_reentrant.py::TestActivationCheckpointWithoutGrad::test_ckpt_inputs2_outputs2[mask1]
5.51s call unit/runtime/test_ds_config_dict.py::TestBasicConfig::test_check_version
5.50s call unit/runtime/test_lr_schedulers.py::TestOneCycle::test_lr[1e-05-0.01-0.001-10-101]
5.50s call unit/runtime/half_precision/test_fp16.py::TestZeroSupportedClientOptimizer::test[FusedAdam-3]
5.49s call unit/runtime/test_ds_initialize.py::TestClientLrSchedulerInit::test_diff_lrscheler_and_callable_onecyclelr_steplr[Callable]
5.49s call unit/runtime/test_data.py::TestDataLoaderDropLast::test[4-False]
5.48s call unit/comm/test_dist.py::TestDistInferenceAllReduce::test[dtype0]
5.48s call unit/runtime/half_precision/test_fp16.py::TestZeroStaticScale::test[False-1]
5.48s call unit/runtime/test_ds_config_dict.py::TestNoModel::test
5.47s call unit/runtime/test_ds_initialize.py::TestClientLrSchedulerInit::test_diff_lrscheler_and_callable[_LRScheduler]
5.47s call unit/runtime/test_lr_schedulers.py::TestOneCycle::test_mom[0.08-0.09-0-211]
5.46s call unit/checkpoint/test_zero_optimizer.py::TestZeRONonDistributed::test_chmod_exception_handling[2]
5.44s call unit/runtime/test_lr_schedulers.py::TestLrRange::test[0.001-0.001-10-False]
5.44s call unit/runtime/test_lr_schedulers.py::TestLrSchedule::test_lr_warmup_schedule[linear-15]
5.43s call unit/runtime/half_precision/test_bf16.py::TestAdamBF16ZeroOneCycleCompatibility::test
5.43s call unit/runtime/zero/test_zero_nesting_init.py::TestShutdownInNestingInit::test_shutdown_in_nesting_init
5.43s call unit/runtime/test_ds_config_dict.py::TestBasicConfig::test_accelerator
5.41s call unit/runtime/test_ds_initialize.py::TestClientOptimizer::test[None]
5.40s call unit/comm/test_dist.py::TestDistInitNoEnv::test
5.38s call unit/runtime/test_lr_schedulers.py::TestLrSchedule::test_lr_warmup_schedule[linear-10]
5.36s call unit/runtime/half_precision/test_fp16.py::TestZeroSupportedClientOptimizer::test[FusedAdam-1]
5.36s call unit/runtime/half_precision/test_fp16.py::TestZeroSupportedClientOptimizer::test[FusedAdam-2]
5.36s call unit/ops/accelerators/test_accelerator_forward.py::TestCUDAForwardSmallBatchSize::test_forward_with_small_bsz[8-7-1024-512-16-3-False-True]
5.35s call unit/runtime/test_lr_schedulers.py::TestSchedulerOptimizerParity::test[OneCycle-params2]
5.35s call unit/runtime/zero/test_zero.py::TestZeroPartitionCache::test_training_partition_cache[False]
5.35s call unit/runtime/test_ds_config_dict.py::TestConfigLoad::test_hjson
5.34s call unit/runtime/test_ds_initialize.py::TestClientLrScheduler::test[None-_LRScheduler]
5.33s call unit/runtime/test_lr_schedulers.py::TestSchedulerOptimizerParity::test[WarmupLR-params0]
5.31s call unit/runtime/test_ds_initialize.py::TestClientLrScheduler::test[None-None]
5.31s call unit/ops/accelerators/test_accelerator_backward.py::TestCUDABackward::test_backward[8-1600-128-25-3-True-True-0.05]
5.30s call unit/runtime/zero/test_zero_context.py::TestSerialContext::test_scattered_init_dist
5.30s call unit/runtime/half_precision/test_dynamic_loss_scale.py::TestUnfused::test_some_overflow
5.29s call unit/runtime/test_lr_schedulers.py::TestLrSchedule::test_lr_warmup_schedule[log-10]
5.29s call unit/runtime/test_lr_schedulers.py::TestWarmupCosineLR::test_lr[500-30-0.0-0.2]
5.27s call unit/runtime/zero/test_zero_dynamic_class.py::TestNewClassDeclaredInsideNestingInit::test_new_class_declared_inside_nesting_init
5.26s call unit/runtime/test_ds_initialize.py::TestClientLrScheduler::test[None-Callable]
5.25s call unit/runtime/half_precision/test_fp16.py::TestAdamFP16ZeroOneCycleCompatibility::test[False-3]
5.25s call unit/runtime/test_lr_schedulers.py::TestWarmupCosineLR::test_lr[600-300-0.1-0.0]
5.24s call unit/runtime/half_precision/test_fp16.py::TestFP16AdamTypes::test[False-AdamW]
5.24s call unit/runtime/test_pld.py::TestPLDModel::test_pld_model[0.1]
5.24s call unit/runtime/zero/test_zero.py::TestEmptyParameterGroup::test_empty_param_groups[dtype0-True-False]
5.24s call unit/runtime/half_precision/test_fp16.py::TestFP16AdamTypes::test[True-Adam]
5.23s call unit/runtime/zero/test_zero_context.py::TestSerialContext::test_ext_param_getattr
5.23s call unit/runtime/half_precision/test_bf16.py::TestZeroSupportedClientOptimizer::test[FusedAdam]
5.22s call unit/runtime/test_ds_config_dict.py::TestDistInit::test
5.22s call unit/runtime/test_lr_schedulers.py::TestWarmupCosineLR::test_lr[600-550-0.0-0.0]
5.22s call unit/runtime/half_precision/test_dynamic_loss_scale.py::TestUnfused::test_all_overflow
5.21s call unit/runtime/half_precision/test_fp16.py::TestAdamFP16ZeroOneCycleCompatibility::test[False-2]
5.21s call unit/runtime/test_pld.py::TestPLDModel::test_pld_model[0]
5.21s call unit/runtime/zero/test_zero_nesting_init.py::TestNestingInit::test_nesting_init
5.20s call unit/runtime/half_precision/test_dynamic_loss_scale.py::TestFused::test_all_overflow
5.20s call unit/runtime/test_lr_schedulers.py::TestOneCycle::test_lr[1e-05-0.1-0-10-0]
5.19s call unit/runtime/test_lr_schedulers.py::TestOneCycle::test_lr[0.001-0.1-0-21-21]
5.19s call unit/runtime/half_precision/test_dynamic_loss_scale.py::TestFused::test_no_overflow
5.19s call unit/ops/accelerators/test_accelerator_backward.py::TestCUDABackward::test_backward[8-160-128-2-3-True-True-0.1]
5.19s call unit/runtime/zero/test_zero_context_return.py::TestReturnParam::test_stage_3_output_type[None]
5.19s call unit/runtime/test_pld.py::TestPLDModel::test_pld_model[1.0]
5.18s call unit/ops/accelerators/test_accelerator_forward.py::TestCUDAForwardSmallBatchSize::test_forward_with_small_bsz[8-3-1024-512-16-3-True-False]
5.18s call unit/runtime/zero/test_zero_context_ancestry.py::TestSerialParamInit::test_subclass_param_init
5.18s call unit/runtime/test_lr_schedulers.py::TestLrSchedule::test_lr_warmup_decay_schedule[linear-33]
5.18s call unit/runtime/test_multi_output_model.py::TestThreeOutputModel::test
5.17s call unit/runtime/test_lr_schedulers.py::TestWarmupCosineLR::test_lr[200-20-0.1-0.2]
5.17s call unit/checkpoint/test_zero_optimizer.py::TestSaveTensorClone::test_save_tensor_clone[True-2]
5.17s call unit/runtime/test_lr_schedulers.py::TestLrRange::test[0.01-0.01-19-True]
5.16s call unit/runtime/zero/test_zero.py::TestZeroUnbalancedGradients::test[3]
5.16s call unit/runtime/test_ds_config_dict.py::TestConfigLoad::test_dict
5.15s call unit/runtime/zero/test_zero_dynamic_class.py::TestNewClassDeclaredNestingInit::test_new_class_declared_nesting_init
5.15s call unit/runtime/test_lr_schedulers.py::TestOneCycle::test_mom[0.08-0.09-0-210]
5.14s call unit/runtime/test_lr_schedulers.py::TestSchedulerOptimizerParity::test[LRRangeTest-params3]
5.14s call unit/runtime/test_lr_schedulers.py::TestLrRange::test[0.0001-0.001-10-True]
5.14s call unit/runtime/zero/test_zero.py::TestZeroAdamOptimizerStepCount::test[2]
5.13s call unit/checkpoint/test_zero_optimizer.py::TestSaveTensorClone::test_save_tensor_clone[False-2]
5.13s call unit/runtime/half_precision/test_fp16.py::TestAdamFP16ZeroOneCycleCompatibility::test[False-1]
5.13s call unit/runtime/half_precision/test_fp16.py::TestFP16AdamTypes::test[True-AdamW]
5.12s call unit/runtime/test_lr_schedulers.py::TestGetLrBeforeTrain::test[WarmupDecayLR-params1]
5.12s call unit/runtime/test_lr_schedulers.py::TestGetLrBeforeTrain::test[LRRangeTest-params3]
5.11s call unit/runtime/test_lr_schedulers.py::TestGetLrBeforeTrain::test[WarmupLR-params0]
5.11s call unit/runtime/zero/test_zero.py::TestZero3RepeatForwardLoop::test[True]
5.11s call unit/checkpoint/test_zero_optimizer.py::TestSaveTensorClone::test_save_tensor_clone[True-1]
5.10s call unit/runtime/test_lr_schedulers.py::TestOneCycle::test_lr[1e-05-0.01-0.001-10-100]
5.10s call unit/runtime/test_lr_schedulers.py::TestOneCycle::test_mom[0.08-0.09-0.001-100]
5.10s call unit/runtime/test_lr_schedulers.py::TestLrSchedule::test_lr_warmup_decay_schedule[log-15]
5.10s call unit/runtime/test_ds_config_dict.py::TestConfigLoad::test_json
5.09s call unit/checkpoint/test_zero_optimizer.py::TestZeRONonDistributed::test_chmod_exception_handling[1]
5.09s call unit/runtime/half_precision/test_fp16.py::TestZeroStaticScale::test[False-3]
5.09s call unit/checkpoint/test_zero_optimizer.py::TestSaveTensorClone::test_save_tensor_clone[False-1]
5.09s call unit/runtime/zero/test_zero.py::TestZeroAdamOptimizerStepCount::test[1]
5.08s call unit/runtime/zero/test_zero.py::TestZero3RepeatForwardLoop::test[False]
5.08s call unit/runtime/half_precision/test_fp16.py::TestFP16AdamTypes::test[False-Adam]
5.08s call unit/runtime/zero/test_zero.py::TestEmptyParameterGroup::test_empty_param_groups[dtype0-False-False]
5.08s call unit/runtime/zero/test_zero.py::TestEmptyParameterGroup::test_empty_param_groups[dtype2-True-False]
5.08s call unit/runtime/test_lr_schedulers.py::TestLrRange::test[0.01-0.01-19-False]
5.07s call unit/runtime/zero/test_zero_context_return.py::TestReturnParam::test_stage_3_output_type[tensor]
5.07s call unit/runtime/test_lr_schedulers.py::TestGetLrBeforeTrain::test[OneCycle-params2]
5.07s call unit/runtime/zero/test_zero.py::TestIncorectAllgatherBucketSize::test[1001]
5.07s call unit/checkpoint/test_zero_optimizer.py::TestZeRONonDistributed::test_chmod_exception_handling[3]
5.06s call unit/runtime/zero/test_zero_context.py::TestSerialContext::test_subclass_param
5.06s call unit/runtime/test_lr_schedulers.py::TestLrSchedule::test_lr_warmup_decay_schedule[linear-19]
5.06s call unit/runtime/zero/test_zero.py::TestIncorectAllgatherBucketSize::test[1000]
5.06s call unit/runtime/test_lr_schedulers.py::TestLrRange::test[1e-05-1e-05-1-False]
5.06s call unit/runtime/test_ds_initialize.py::TestConfigOptimizer::test[False]
5.05s call unit/runtime/zero/test_zero_context_return.py::TestReturnParam::test_stage_3_output_type[dict]
5.05s call unit/runtime/zero/test_zero.py::TestEmptyParameterGroup::test_empty_param_groups[dtype1-True-False]
5.05s call unit/runtime/test_lr_schedulers.py::TestLrSchedule::test_lr_warmup_decay_schedule[log-19]
5.05s call unit/runtime/zero/test_zero_context.py::TestSerialContext::test_scatter_halftype
5.04s call unit/runtime/test_lr_schedulers.py::TestLrRange::test[0.0001-1e-05-1-True]
5.03s call unit/runtime/zero/test_zero.py::TestEmptyParameterGroup::test_empty_param_groups[dtype1-False-False]
5.02s call unit/runtime/test_ds_config_dict.py::TestArgs::test_none_args
5.01s call unit/runtime/test_lr_schedulers.py::TestLrSchedule::test_lr_warmup_schedule[linear-33]
5.01s call unit/runtime/test_lr_schedulers.py::TestLrSchedule::test_lr_warmup_decay_schedule[log-10]
5.01s call unit/runtime/zero/test_zero_context_return.py::TestReturnParam::test_ext_param_return
4.97s call unit/runtime/test_pld.py::TestPLDModel::test_pld_model[0.9]
4.96s call unit/runtime/test_lr_schedulers.py::TestLrSchedule::test_lr_warmup_schedule[log-33]
4.94s call unit/runtime/test_ds_config_dict.py::TestArgs::test_no_args
4.93s call unit/runtime/zero/test_zero_context.py::TestZeroGatheredParametersFree::test
4.77s call unit/runtime/zero/test_zero.py::TestZeroPartitionCache::test_training_partition_cache[True]
4.73s call unit/ops/adam/test_adamw.py::TestAdamConfigs::test[AdamW-False-False-False-resulting_optimizer0]
4.67s call unit/runtime/zero/test_zero.py::TestZero3DictFwd::test[tuple]
4.54s call unit/runtime/zero/test_zero.py::TestZero3DictFwd::test[list]
4.53s call unit/runtime/zero/test_zero.py::TestZero3DictFwd::test[dict]
1.66s call unit/ops/aio/test_aio.py::TestWrite::test_async_write[False-False-False-False]
1.62s call unit/ops/adam/test_adamw.py::TestAdamConfigs::test[AdamW-True-False-True-resulting_optimizer6]
1.56s call unit/ops/adam/test_adamw.py::TestAdamConfigs::test[Adam-True-False-False-resulting_optimizer10]
1.56s call unit/ops/adam/test_cpu_adam.py::TestCPUAdam::test_torch_adamw_equal[128-fp32]
1.56s call unit/ops/adam/test_cpu_adam.py::TestCPUAdam::test_torch_adamw_equal[1048576-fp32]
1.51s call unit/ops/adam/test_adamw.py::TestAdamConfigs::test[Adam-True-False-True-resulting_optimizer14]
1.50s call unit/ops/adam/test_cpu_adam.py::TestCPUAdam::test_torch_adamw_equal[1048576-bf16]
1.45s call unit/ops/adam/test_cpu_adam.py::TestCPUAdam::test_torch_adamw_equal[1024-bf16]
1.44s call unit/ops/adam/test_cpu_adam.py::TestCPUAdam::test_torch_adamw_equal[128-bf16]
1.44s call unit/ops/adam/test_cpu_adam.py::TestCPUAdam::test_torch_adamw_equal[1024-fp32]
1.43s call unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[bf16-bf16-None]
1.35s call unit/ops/aio/test_aio.py::TestRead::test_parallel_read[True-True-False]
1.35s call unit/ops/adam/test_hybrid_adam.py::TestHybridAdam::test_hybrid_adam_equal[8-fp16]
1.34s call unit/ops/aio/test_aio.py::TestWrite::test_async_write[False-True-False-True]
1.30s call unit/ops/aio/test_aio.py::TestWrite::test_parallel_write[False-True-False]
1.28s call unit/ops/adam/test_cpu_adam.py::TestCPUAdam::test_fused_adam_equal[64-fp16]
1.18s call unit/ops/aio/test_aio.py::TestWrite::test_async_write[True-True-False-False]
1.16s call unit/ops/aio/test_aio.py::TestRead::test_async_read[False-False-True-True]
1.12s call unit/ops/aio/test_aio.py::TestWrite::test_parallel_write[True-True-True]
1.11s call unit/ops/aio/test_aio.py::TestWrite::test_parallel_write[False-True-True]
1.11s call unit/ops/aio/test_aio.py::TestRead::test_parallel_read[True-False-False]
1.11s call unit/ops/aio/test_aio.py::TestWrite::test_async_write[False-False-True-True]
1.09s call unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[fp32-fp32-None]
1.08s call unit/ops/aio/test_aio.py::TestWrite::test_async_write[True-True-False-True]
1.05s call unit/ops/aio/test_aio.py::TestRead::test_async_read[False-True-False-True]
1.05s call unit/ops/lion/test_cpu_lion.py::TestCPULion::test_fused_lion_equal[128-fp16]
1.05s call unit/ops/lion/test_cpu_lion.py::TestCPULion::test_fused_lion_equal[128-bf16]
1.04s call unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[fp16-fp32-zero3]
1.03s call unit/ops/aio/test_aio.py::TestWrite::test_async_write[True-False-True-False]
1.02s call unit/runtime/test_mup_optimizers.py::TestMuPOptimizers::test[True-MuAdam-Adam]
1.01s call unit/ops/aio/test_aio.py::TestRead::test_async_read[False-True-False-False]
1.01s call unit/ops/aio/test_aio.py::TestWrite::test_async_write[True-True-True-True]
(3062 durations < 1s hidden. Use -vv to show these durations.)
===================================================================================== short test summary info =====================================================================================
FAILED tests/unit/checkpoint/test_latest_checkpoint.py::TestLatestCheckpoint::test_existing_latest - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/checkpoint/test_latest_checkpoint.py::TestLatestCheckpoint::test_missing_latest - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/checkpoint/test_zero_optimizer.py::TestSaveTensorClone::test_save_tensor_clone[True-1] - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/checkpoint/test_zero_optimizer.py::TestSaveTensorClone::test_save_tensor_clone[True-2] - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/checkpoint/test_zero_optimizer.py::TestSaveTensorClone::test_save_tensor_clone[False-1] - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/checkpoint/test_zero_optimizer.py::TestSaveTensorClone::test_save_tensor_clone[False-2] - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/checkpoint/test_zero_optimizer.py::TestZeRONonDistributed::test_chmod_exception_handling[1] - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/checkpoint/test_zero_optimizer.py::TestZeRONonDistributed::test_chmod_exception_handling[2] - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/checkpoint/test_zero_optimizer.py::TestZeRONonDistributed::test_chmod_exception_handling[3] - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/launcher/test_user_args.py::test_user_args[True-"I am 6' tall"] - FileNotFoundError: [Errno 2] No such file or directory: 'deepspeed'
FAILED tests/unit/launcher/test_user_args.py::test_user_args[True-'I am 72" tall'] - FileNotFoundError: [Errno 2] No such file or directory: 'deepspeed'
FAILED tests/unit/launcher/test_user_args.py::test_user_args[True-'"translate English to Romanian: "'] - FileNotFoundError: [Errno 2] No such file or directory: 'deepspeed'
FAILED tests/unit/launcher/test_user_args.py::test_user_args[True-I'm going to tell them "DeepSpeed is the best"] - FileNotFoundError: [Errno 2] No such file or directory: 'deepspeed'
FAILED tests/unit/launcher/test_user_args.py::test_user_args[False-"I am 6' tall"] - FileNotFoundError: [Errno 2] No such file or directory: 'deepspeed'
FAILED tests/unit/launcher/test_user_args.py::test_user_args[False-'I am 72" tall'] - FileNotFoundError: [Errno 2] No such file or directory: 'deepspeed'
FAILED tests/unit/launcher/test_user_args.py::test_user_args[False-'"translate English to Romanian: "'] - FileNotFoundError: [Errno 2] No such file or directory: 'deepspeed'
FAILED tests/unit/launcher/test_user_args.py::test_user_args[False-I'm going to tell them "DeepSpeed is the best"] - FileNotFoundError: [Errno 2] No such file or directory: 'deepspeed'
FAILED tests/unit/launcher/test_user_args.py::test_bash_string_args - AssertionError: User args not parsed correctly: xargs: deepspeed: No such file or directory
FAILED tests/unit/ops/accelerators/test_accelerator_backward.py::TestCUDABackward::test_backward[64-160-128-2-24-False-True-0.2] - RuntimeError: Error building extension 'transformer'
FAILED tests/unit/ops/accelerators/test_accelerator_backward.py::TestCUDABackward::test_backward[64-1600-128-2-4-False-True-0.2] - RuntimeError: Error building extension 'transformer'
FAILED tests/unit/ops/accelerators/test_accelerator_backward.py::TestCUDABackward::test_backward[8-1600-128-25-3-True-True-0.05] - RuntimeError: Error building extension 'transformer'
FAILED tests/unit/ops/accelerators/test_accelerator_backward.py::TestCUDABackward::test_backward[8-160-128-2-3-True-True-0.1] - RuntimeError: Error building extension 'transformer'
FAILED tests/unit/ops/accelerators/test_accelerator_backward.py::TestCUDABackward::test_backward[8-1600-128-2-3-True-True-0.05] - RuntimeError: Error building extension 'transformer'
FAILED tests/unit/ops/accelerators/test_accelerator_forward.py::TestCUDAForwardSmallBatchSize::test_forward_with_small_bsz[8-3-1024-512-16-3-True-False] - RuntimeError: Error building extension 'transformer'
FAILED tests/unit/ops/accelerators/test_accelerator_forward.py::TestCUDAForwardSmallBatchSize::test_forward_with_small_bsz[8-7-1024-512-16-3-True-True] - RuntimeError: Error building extension 'transformer'
FAILED tests/unit/ops/accelerators/test_accelerator_forward.py::TestCUDAForwardSmallBatchSize::test_forward_with_small_bsz[8-3-1024-512-16-3-False-False] - RuntimeError: Error building extension 'transformer'
FAILED tests/unit/ops/accelerators/test_accelerator_forward.py::TestCUDAForwardSmallBatchSize::test_forward_with_small_bsz[8-7-1024-512-16-3-False-True] - RuntimeError: Error building extension 'transformer'
FAILED tests/unit/ops/adam/test_adamw.py::TestAdamConfigs::test[AdamW-False-False-False-resulting_optimizer0] - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/ops/adam/test_adamw.py::TestAdamConfigs::test[AdamW-False-False-True-resulting_optimizer4] - ImportError: /dev/shm/DeepSpeed/0.14.5/foss-2023a-CUDA-12.1.1/xdg-cache-home/torch_extensions/py311_cu121/fused_adam/fused_adam.so: cannot open shared object file: No such file or directory
FAILED tests/unit/ops/adam/test_adamw.py::TestAdamConfigs::test[Adam-False-False-False-resulting_optimizer8] - ImportError: /dev/shm/DeepSpeed/0.14.5/foss-2023a-CUDA-12.1.1/xdg-cache-home/torch_extensions/py311_cu121/fused_adam/fused_adam.so: cannot open shared object file: No such file or directory
FAILED tests/unit/ops/adam/test_adamw.py::TestAdamConfigs::test[Adam-False-False-True-resulting_optimizer12] - ImportError: /dev/shm/DeepSpeed/0.14.5/foss-2023a-CUDA-12.1.1/xdg-cache-home/torch_extensions/py311_cu121/fused_adam/fused_adam.so: cannot open shared object file: No such file or directory
FAILED tests/unit/ops/adam/test_cpu_adam.py::TestCPUAdam::test_fused_adam_equal[64-fp16] - ImportError: /dev/shm/DeepSpeed/0.14.5/foss-2023a-CUDA-12.1.1/xdg-cache-home/torch_extensions/py311_cu121/fused_adam/fused_adam.so: cannot open shared object file: No such file or directory
FAILED tests/unit/ops/adam/test_cpu_adam.py::TestCPUAdam::test_fused_adam_equal[64-bf16] - ImportError: /dev/shm/DeepSpeed/0.14.5/foss-2023a-CUDA-12.1.1/xdg-cache-home/torch_extensions/py311_cu121/fused_adam/fused_adam.so: cannot open shared object file: No such file or directory
FAILED tests/unit/ops/adam/test_cpu_adam.py::TestCPUAdam::test_fused_adam_equal[64-fp32] - ImportError: /dev/shm/DeepSpeed/0.14.5/foss-2023a-CUDA-12.1.1/xdg-cache-home/torch_extensions/py311_cu121/fused_adam/fused_adam.so: cannot open shared object file: No such file or directory
FAILED tests/unit/ops/adam/test_cpu_adam.py::TestCPUAdam::test_fused_adam_equal[22-fp16] - ImportError: /dev/shm/DeepSpeed/0.14.5/foss-2023a-CUDA-12.1.1/xdg-cache-home/torch_extensions/py311_cu121/fused_adam/fused_adam.so: cannot open shared object file: No such file or directory
FAILED tests/unit/ops/adam/test_cpu_adam.py::TestCPUAdam::test_fused_adam_equal[22-bf16] - ImportError: /dev/shm/DeepSpeed/0.14.5/foss-2023a-CUDA-12.1.1/xdg-cache-home/torch_extensions/py311_cu121/fused_adam/fused_adam.so: cannot open shared object file: No such file or directory
FAILED tests/unit/ops/adam/test_cpu_adam.py::TestCPUAdam::test_fused_adam_equal[22-fp32] - ImportError: /dev/shm/DeepSpeed/0.14.5/foss-2023a-CUDA-12.1.1/xdg-cache-home/torch_extensions/py311_cu121/fused_adam/fused_adam.so: cannot open shared object file: No such file or directory
FAILED tests/unit/ops/adam/test_cpu_adam.py::TestCPUAdam::test_fused_adam_equal[128-fp16] - ImportError: /dev/shm/DeepSpeed/0.14.5/foss-2023a-CUDA-12.1.1/xdg-cache-home/torch_extensions/py311_cu121/fused_adam/fused_adam.so: cannot open shared object file: No such file or directory
FAILED tests/unit/ops/adam/test_cpu_adam.py::TestCPUAdam::test_fused_adam_equal[128-bf16] - ImportError: /dev/shm/DeepSpeed/0.14.5/foss-2023a-CUDA-12.1.1/xdg-cache-home/torch_extensions/py311_cu121/fused_adam/fused_adam.so: cannot open shared object file: No such file or directory
FAILED tests/unit/ops/adam/test_cpu_adam.py::TestCPUAdam::test_fused_adam_equal[128-fp32] - ImportError: /dev/shm/DeepSpeed/0.14.5/foss-2023a-CUDA-12.1.1/xdg-cache-home/torch_extensions/py311_cu121/fused_adam/fused_adam.so: cannot open shared object file: No such file or directory
FAILED tests/unit/ops/adam/test_cpu_adam.py::TestCPUAdam::test_fused_adam_equal[1024-fp16] - ImportError: /dev/shm/DeepSpeed/0.14.5/foss-2023a-CUDA-12.1.1/xdg-cache-home/torch_extensions/py311_cu121/fused_adam/fused_adam.so: cannot open shared object file: No such file or directory
FAILED tests/unit/ops/adam/test_cpu_adam.py::TestCPUAdam::test_fused_adam_equal[1024-bf16] - ImportError: /dev/shm/DeepSpeed/0.14.5/foss-2023a-CUDA-12.1.1/xdg-cache-home/torch_extensions/py311_cu121/fused_adam/fused_adam.so: cannot open shared object file: No such file or directory
FAILED tests/unit/ops/adam/test_cpu_adam.py::TestCPUAdam::test_fused_adam_equal[1024-fp32] - ImportError: /dev/shm/DeepSpeed/0.14.5/foss-2023a-CUDA-12.1.1/xdg-cache-home/torch_extensions/py311_cu121/fused_adam/fused_adam.so: cannot open shared object file: No such file or directory
FAILED tests/unit/ops/adam/test_cpu_adam.py::TestCPUAdam::test_fused_adam_equal[1048576-fp16] - ImportError: /dev/shm/DeepSpeed/0.14.5/foss-2023a-CUDA-12.1.1/xdg-cache-home/torch_extensions/py311_cu121/fused_adam/fused_adam.so: cannot open shared object file: No such file or directory
FAILED tests/unit/ops/adam/test_cpu_adam.py::TestCPUAdam::test_fused_adam_equal[1048576-bf16] - ImportError: /dev/shm/DeepSpeed/0.14.5/foss-2023a-CUDA-12.1.1/xdg-cache-home/torch_extensions/py311_cu121/fused_adam/fused_adam.so: cannot open shared object file: No such file or directory
FAILED tests/unit/ops/adam/test_cpu_adam.py::TestCPUAdam::test_fused_adam_equal[1048576-fp32] - ImportError: /dev/shm/DeepSpeed/0.14.5/foss-2023a-CUDA-12.1.1/xdg-cache-home/torch_extensions/py311_cu121/fused_adam/fused_adam.so: cannot open shared object file: No such file or directory
FAILED tests/unit/ops/adam/test_hybrid_adam.py::TestHybridAdam::test_hybrid_adam_equal[8-fp16] - ImportError: /dev/shm/DeepSpeed/0.14.5/foss-2023a-CUDA-12.1.1/xdg-cache-home/torch_extensions/py311_cu121/fused_adam/fused_adam.so: cannot open shared object file: No such file or directory
FAILED tests/unit/ops/adam/test_hybrid_adam.py::TestHybridAdam::test_hybrid_adam_equal[8-bf16] - ImportError: /dev/shm/DeepSpeed/0.14.5/foss-2023a-CUDA-12.1.1/xdg-cache-home/torch_extensions/py311_cu121/fused_adam/fused_adam.so: cannot open shared object file: No such file or directory
FAILED tests/unit/ops/adam/test_hybrid_adam.py::TestHybridAdam::test_hybrid_adam_equal[8-fp32] - ImportError: /dev/shm/DeepSpeed/0.14.5/foss-2023a-CUDA-12.1.1/xdg-cache-home/torch_extensions/py311_cu121/fused_adam/fused_adam.so: cannot open shared object file: No such file or directory
FAILED tests/unit/ops/adam/test_hybrid_adam.py::TestHybridAdam::test_hybrid_adam_equal[16-fp16] - ImportError: /dev/shm/DeepSpeed/0.14.5/foss-2023a-CUDA-12.1.1/xdg-cache-home/torch_extensions/py311_cu121/fused_adam/fused_adam.so: cannot open shared object file: No such file or directory
FAILED tests/unit/ops/adam/test_hybrid_adam.py::TestHybridAdam::test_hybrid_adam_equal[16-bf16] - ImportError: /dev/shm/DeepSpeed/0.14.5/foss-2023a-CUDA-12.1.1/xdg-cache-home/torch_extensions/py311_cu121/fused_adam/fused_adam.so: cannot open shared object file: No such file or directory
FAILED tests/unit/ops/adam/test_hybrid_adam.py::TestHybridAdam::test_hybrid_adam_equal[16-fp32] - ImportError: /dev/shm/DeepSpeed/0.14.5/foss-2023a-CUDA-12.1.1/xdg-cache-home/torch_extensions/py311_cu121/fused_adam/fused_adam.so: cannot open shared object file: No such file or directory
FAILED tests/unit/ops/deepspeed4science/test_DS4Sci_EvoformerAttention.py::test_DS4Sci_EvoformerAttention[tensor_shape0-dtype0] - RuntimeError: Error building extension 'evoformer_attn'
FAILED tests/unit/ops/deepspeed4science/test_DS4Sci_EvoformerAttention.py::test_DS4Sci_EvoformerAttention[tensor_shape0-dtype1] - ImportError: /dev/shm/DeepSpeed/0.14.5/foss-2023a-CUDA-12.1.1/xdg-cache-home/torch_extensions/py311_cu121/evoformer_attn/evoformer_attn.so: cannot open shared object file: No such file or di...
FAILED tests/unit/ops/deepspeed4science/test_DS4Sci_EvoformerAttention.py::test_DS4Sci_EvoformerAttention[tensor_shape1-dtype0] - ImportError: /dev/shm/DeepSpeed/0.14.5/foss-2023a-CUDA-12.1.1/xdg-cache-home/torch_extensions/py311_cu121/evoformer_attn/evoformer_attn.so: cannot open shared object file: No such file or di...
FAILED tests/unit/ops/deepspeed4science/test_DS4Sci_EvoformerAttention.py::test_DS4Sci_EvoformerAttention[tensor_shape1-dtype1] - ImportError: /dev/shm/DeepSpeed/0.14.5/foss-2023a-CUDA-12.1.1/xdg-cache-home/torch_extensions/py311_cu121/evoformer_attn/evoformer_attn.so: cannot open shared object file: No such file or di...
FAILED tests/unit/ops/lion/test_cpu_lion.py::TestCPULion::test_fused_lion_equal[64-fp16] - RuntimeError: Error building extension 'fused_lion'
FAILED tests/unit/ops/lion/test_cpu_lion.py::TestCPULion::test_fused_lion_equal[64-bf16] - ImportError: /dev/shm/DeepSpeed/0.14.5/foss-2023a-CUDA-12.1.1/xdg-cache-home/torch_extensions/py311_cu121/fused_lion/fused_lion.so: cannot open shared object file: No such file or directory
FAILED tests/unit/ops/lion/test_cpu_lion.py::TestCPULion::test_fused_lion_equal[64-fp32] - ImportError: /dev/shm/DeepSpeed/0.14.5/foss-2023a-CUDA-12.1.1/xdg-cache-home/torch_extensions/py311_cu121/fused_lion/fused_lion.so: cannot open shared object file: No such file or directory
FAILED tests/unit/ops/lion/test_cpu_lion.py::TestCPULion::test_fused_lion_equal[22-fp16] - ImportError: /dev/shm/DeepSpeed/0.14.5/foss-2023a-CUDA-12.1.1/xdg-cache-home/torch_extensions/py311_cu121/fused_lion/fused_lion.so: cannot open shared object file: No such file or directory
FAILED tests/unit/ops/lion/test_cpu_lion.py::TestCPULion::test_fused_lion_equal[22-bf16] - ImportError: /dev/shm/DeepSpeed/0.14.5/foss-2023a-CUDA-12.1.1/xdg-cache-home/torch_extensions/py311_cu121/fused_lion/fused_lion.so: cannot open shared object file: No such file or directory
FAILED tests/unit/ops/lion/test_cpu_lion.py::TestCPULion::test_fused_lion_equal[22-fp32] - ImportError: /dev/shm/DeepSpeed/0.14.5/foss-2023a-CUDA-12.1.1/xdg-cache-home/torch_extensions/py311_cu121/fused_lion/fused_lion.so: cannot open shared object file: No such file or directory
FAILED tests/unit/ops/lion/test_cpu_lion.py::TestCPULion::test_fused_lion_equal[128-fp16] - ImportError: /dev/shm/DeepSpeed/0.14.5/foss-2023a-CUDA-12.1.1/xdg-cache-home/torch_extensions/py311_cu121/fused_lion/fused_lion.so: cannot open shared object file: No such file or directory
FAILED tests/unit/ops/lion/test_cpu_lion.py::TestCPULion::test_fused_lion_equal[128-bf16] - ImportError: /dev/shm/DeepSpeed/0.14.5/foss-2023a-CUDA-12.1.1/xdg-cache-home/torch_extensions/py311_cu121/fused_lion/fused_lion.so: cannot open shared object file: No such file or directory
FAILED tests/unit/ops/lion/test_cpu_lion.py::TestCPULion::test_fused_lion_equal[128-fp32] - ImportError: /dev/shm/DeepSpeed/0.14.5/foss-2023a-CUDA-12.1.1/xdg-cache-home/torch_extensions/py311_cu121/fused_lion/fused_lion.so: cannot open shared object file: No such file or directory
FAILED tests/unit/ops/lion/test_cpu_lion.py::TestCPULion::test_fused_lion_equal[1024-fp16] - ImportError: /dev/shm/DeepSpeed/0.14.5/foss-2023a-CUDA-12.1.1/xdg-cache-home/torch_extensions/py311_cu121/fused_lion/fused_lion.so: cannot open shared object file: No such file or directory
FAILED tests/unit/ops/lion/test_cpu_lion.py::TestCPULion::test_fused_lion_equal[1024-bf16] - ImportError: /dev/shm/DeepSpeed/0.14.5/foss-2023a-CUDA-12.1.1/xdg-cache-home/torch_extensions/py311_cu121/fused_lion/fused_lion.so: cannot open shared object file: No such file or directory
FAILED tests/unit/ops/lion/test_cpu_lion.py::TestCPULion::test_fused_lion_equal[1024-fp32] - ImportError: /dev/shm/DeepSpeed/0.14.5/foss-2023a-CUDA-12.1.1/xdg-cache-home/torch_extensions/py311_cu121/fused_lion/fused_lion.so: cannot open shared object file: No such file or directory
FAILED tests/unit/ops/lion/test_cpu_lion.py::TestCPULion::test_fused_lion_equal[1048576-fp16] - ImportError: /dev/shm/DeepSpeed/0.14.5/foss-2023a-CUDA-12.1.1/xdg-cache-home/torch_extensions/py311_cu121/fused_lion/fused_lion.so: cannot open shared object file: No such file or directory
FAILED tests/unit/ops/lion/test_cpu_lion.py::TestCPULion::test_fused_lion_equal[1048576-bf16] - ImportError: /dev/shm/DeepSpeed/0.14.5/foss-2023a-CUDA-12.1.1/xdg-cache-home/torch_extensions/py311_cu121/fused_lion/fused_lion.so: cannot open shared object file: No such file or directory
FAILED tests/unit/ops/lion/test_cpu_lion.py::TestCPULion::test_fused_lion_equal[1048576-fp32] - ImportError: /dev/shm/DeepSpeed/0.14.5/foss-2023a-CUDA-12.1.1/xdg-cache-home/torch_extensions/py311_cu121/fused_lion/fused_lion.so: cannot open shared object file: No such file or directory
FAILED tests/unit/ops/lion/test_lion.py::TestLionConfigs::test[Lion-False-FusedLion] - ImportError: /dev/shm/DeepSpeed/0.14.5/foss-2023a-CUDA-12.1.1/xdg-cache-home/torch_extensions/py311_cu121/fused_lion/fused_lion.so: cannot open shared object file: No such file or directory
FAILED tests/unit/profiling/flops_profiler/test_flops_profiler.py::TestFlopsProfiler::test - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/test_ds_config_dict.py::TestConfigLoad::test_dict - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/test_ds_config_dict.py::TestConfigLoad::test_json - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/test_ds_config_dict.py::TestConfigLoad::test_hjson - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/test_ds_config_dict.py::TestDeprecatedDeepScaleConfig::test - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/test_ds_config_dict.py::TestDistInit::test - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/test_ds_config_dict.py::TestArgs::test_none_args - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/test_ds_config_dict.py::TestArgs::test_no_args - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/test_ds_initialize.py::TestClientOptimizer::test[None] - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/test_ds_initialize.py::TestConfigOptimizer::test[True] - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/test_ds_initialize.py::TestConfigOptimizer::test[False] - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[None-fp16-zero1] - ImportError: /dev/shm/DeepSpeed/0.14.5/foss-2023a-CUDA-12.1.1/xdg-cache-home/torch_extensions/py311_cu121/fused_adam/fused_adam.so: cannot open shared object file: No such file or directory
FAILED tests/unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[None-fp16-zero2] - ImportError: /dev/shm/DeepSpeed/0.14.5/foss-2023a-CUDA-12.1.1/xdg-cache-home/torch_extensions/py311_cu121/fused_adam/fused_adam.so: cannot open shared object file: No such file or directory
FAILED tests/unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[None-fp16-zero3] - ImportError: /dev/shm/DeepSpeed/0.14.5/foss-2023a-CUDA-12.1.1/xdg-cache-home/torch_extensions/py311_cu121/fused_adam/fused_adam.so: cannot open shared object file: No such file or directory
FAILED tests/unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[None-fp16-None] - ImportError: /dev/shm/DeepSpeed/0.14.5/foss-2023a-CUDA-12.1.1/xdg-cache-home/torch_extensions/py311_cu121/fused_adam/fused_adam.so: cannot open shared object file: No such file or directory
FAILED tests/unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[None-bf16-zero1] - ImportError: /dev/shm/DeepSpeed/0.14.5/foss-2023a-CUDA-12.1.1/xdg-cache-home/torch_extensions/py311_cu121/fused_adam/fused_adam.so: cannot open shared object file: No such file or directory
FAILED tests/unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[None-bf16-zero2] - ImportError: /dev/shm/DeepSpeed/0.14.5/foss-2023a-CUDA-12.1.1/xdg-cache-home/torch_extensions/py311_cu121/fused_adam/fused_adam.so: cannot open shared object file: No such file or directory
FAILED tests/unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[None-bf16-zero3] - ImportError: /dev/shm/DeepSpeed/0.14.5/foss-2023a-CUDA-12.1.1/xdg-cache-home/torch_extensions/py311_cu121/fused_adam/fused_adam.so: cannot open shared object file: No such file or directory
FAILED tests/unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[None-bf16-None] - ImportError: /dev/shm/DeepSpeed/0.14.5/foss-2023a-CUDA-12.1.1/xdg-cache-home/torch_extensions/py311_cu121/fused_adam/fused_adam.so: cannot open shared object file: No such file or directory
FAILED tests/unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[None-fp32-zero1] - ImportError: /dev/shm/DeepSpeed/0.14.5/foss-2023a-CUDA-12.1.1/xdg-cache-home/torch_extensions/py311_cu121/fused_adam/fused_adam.so: cannot open shared object file: No such file or directory
FAILED tests/unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[None-fp32-zero2] - ImportError: /dev/shm/DeepSpeed/0.14.5/foss-2023a-CUDA-12.1.1/xdg-cache-home/torch_extensions/py311_cu121/fused_adam/fused_adam.so: cannot open shared object file: No such file or directory
FAILED tests/unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[None-fp32-zero3] - ImportError: /dev/shm/DeepSpeed/0.14.5/foss-2023a-CUDA-12.1.1/xdg-cache-home/torch_extensions/py311_cu121/fused_adam/fused_adam.so: cannot open shared object file: No such file or directory
FAILED tests/unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[None-fp32-None] - ImportError: /dev/shm/DeepSpeed/0.14.5/foss-2023a-CUDA-12.1.1/xdg-cache-home/torch_extensions/py311_cu121/fused_adam/fused_adam.so: cannot open shared object file: No such file or directory
FAILED tests/unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[fp16-fp16-zero1] - ImportError: /dev/shm/DeepSpeed/0.14.5/foss-2023a-CUDA-12.1.1/xdg-cache-home/torch_extensions/py311_cu121/fused_adam/fused_adam.so: cannot open shared object file: No such file or directory
FAILED tests/unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[fp16-fp16-zero2] - ImportError: /dev/shm/DeepSpeed/0.14.5/foss-2023a-CUDA-12.1.1/xdg-cache-home/torch_extensions/py311_cu121/fused_adam/fused_adam.so: cannot open shared object file: No such file or directory
FAILED tests/unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[fp16-fp16-zero3] - ImportError: /dev/shm/DeepSpeed/0.14.5/foss-2023a-CUDA-12.1.1/xdg-cache-home/torch_extensions/py311_cu121/fused_adam/fused_adam.so: cannot open shared object file: No such file or directory
FAILED tests/unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[fp16-fp16-None] - ImportError: /dev/shm/DeepSpeed/0.14.5/foss-2023a-CUDA-12.1.1/xdg-cache-home/torch_extensions/py311_cu121/fused_adam/fused_adam.so: cannot open shared object file: No such file or directory
FAILED tests/unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[fp16-bf16-zero1] - ImportError: /dev/shm/DeepSpeed/0.14.5/foss-2023a-CUDA-12.1.1/xdg-cache-home/torch_extensions/py311_cu121/fused_adam/fused_adam.so: cannot open shared object file: No such file or directory
FAILED tests/unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[fp16-bf16-zero2] - ImportError: /dev/shm/DeepSpeed/0.14.5/foss-2023a-CUDA-12.1.1/xdg-cache-home/torch_extensions/py311_cu121/fused_adam/fused_adam.so: cannot open shared object file: No such file or directory
FAILED tests/unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[fp16-bf16-zero3] - ImportError: /dev/shm/DeepSpeed/0.14.5/foss-2023a-CUDA-12.1.1/xdg-cache-home/torch_extensions/py311_cu121/fused_adam/fused_adam.so: cannot open shared object file: No such file or directory
FAILED tests/unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[fp16-bf16-None] - ImportError: /dev/shm/DeepSpeed/0.14.5/foss-2023a-CUDA-12.1.1/xdg-cache-home/torch_extensions/py311_cu121/fused_adam/fused_adam.so: cannot open shared object file: No such file or directory
FAILED tests/unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[fp16-fp32-zero1] - ImportError: /dev/shm/DeepSpeed/0.14.5/foss-2023a-CUDA-12.1.1/xdg-cache-home/torch_extensions/py311_cu121/fused_adam/fused_adam.so: cannot open shared object file: No such file or directory
FAILED tests/unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[fp16-fp32-zero2] - ImportError: /dev/shm/DeepSpeed/0.14.5/foss-2023a-CUDA-12.1.1/xdg-cache-home/torch_extensions/py311_cu121/fused_adam/fused_adam.so: cannot open shared object file: No such file or directory
FAILED tests/unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[fp16-fp32-zero3] - ImportError: /dev/shm/DeepSpeed/0.14.5/foss-2023a-CUDA-12.1.1/xdg-cache-home/torch_extensions/py311_cu121/fused_adam/fused_adam.so: cannot open shared object file: No such file or directory
FAILED tests/unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[fp16-fp32-None] - ImportError: /dev/shm/DeepSpeed/0.14.5/foss-2023a-CUDA-12.1.1/xdg-cache-home/torch_extensions/py311_cu121/fused_adam/fused_adam.so: cannot open shared object file: No such file or directory
FAILED tests/unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[bf16-fp16-zero1] - ImportError: /dev/shm/DeepSpeed/0.14.5/foss-2023a-CUDA-12.1.1/xdg-cache-home/torch_extensions/py311_cu121/fused_adam/fused_adam.so: cannot open shared object file: No such file or directory
FAILED tests/unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[bf16-fp16-zero2] - ImportError: /dev/shm/DeepSpeed/0.14.5/foss-2023a-CUDA-12.1.1/xdg-cache-home/torch_extensions/py311_cu121/fused_adam/fused_adam.so: cannot open shared object file: No such file or directory
FAILED tests/unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[bf16-fp16-zero3] - ImportError: /dev/shm/DeepSpeed/0.14.5/foss-2023a-CUDA-12.1.1/xdg-cache-home/torch_extensions/py311_cu121/fused_adam/fused_adam.so: cannot open shared object file: No such file or directory
FAILED tests/unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[bf16-fp16-None] - ImportError: /dev/shm/DeepSpeed/0.14.5/foss-2023a-CUDA-12.1.1/xdg-cache-home/torch_extensions/py311_cu121/fused_adam/fused_adam.so: cannot open shared object file: No such file or directory
FAILED tests/unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[bf16-bf16-zero1] - ImportError: /dev/shm/DeepSpeed/0.14.5/foss-2023a-CUDA-12.1.1/xdg-cache-home/torch_extensions/py311_cu121/fused_adam/fused_adam.so: cannot open shared object file: No such file or directory
FAILED tests/unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[bf16-bf16-zero2] - ImportError: /dev/shm/DeepSpeed/0.14.5/foss-2023a-CUDA-12.1.1/xdg-cache-home/torch_extensions/py311_cu121/fused_adam/fused_adam.so: cannot open shared object file: No such file or directory
FAILED tests/unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[bf16-bf16-zero3] - ImportError: /dev/shm/DeepSpeed/0.14.5/foss-2023a-CUDA-12.1.1/xdg-cache-home/torch_extensions/py311_cu121/fused_adam/fused_adam.so: cannot open shared object file: No such file or directory
FAILED tests/unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[bf16-bf16-None] - ImportError: /dev/shm/DeepSpeed/0.14.5/foss-2023a-CUDA-12.1.1/xdg-cache-home/torch_extensions/py311_cu121/fused_adam/fused_adam.so: cannot open shared object file: No such file or directory
FAILED tests/unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[bf16-fp32-zero1] - ImportError: /dev/shm/DeepSpeed/0.14.5/foss-2023a-CUDA-12.1.1/xdg-cache-home/torch_extensions/py311_cu121/fused_adam/fused_adam.so: cannot open shared object file: No such file or directory
FAILED tests/unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[bf16-fp32-zero2] - ImportError: /dev/shm/DeepSpeed/0.14.5/foss-2023a-CUDA-12.1.1/xdg-cache-home/torch_extensions/py311_cu121/fused_adam/fused_adam.so: cannot open shared object file: No such file or directory
FAILED tests/unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[bf16-fp32-zero3] - ImportError: /dev/shm/DeepSpeed/0.14.5/foss-2023a-CUDA-12.1.1/xdg-cache-home/torch_extensions/py311_cu121/fused_adam/fused_adam.so: cannot open shared object file: No such file or directory
FAILED tests/unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[bf16-fp32-None] - ImportError: /dev/shm/DeepSpeed/0.14.5/foss-2023a-CUDA-12.1.1/xdg-cache-home/torch_extensions/py311_cu121/fused_adam/fused_adam.so: cannot open shared object file: No such file or directory
FAILED tests/unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[fp32-fp16-zero1] - ImportError: /dev/shm/DeepSpeed/0.14.5/foss-2023a-CUDA-12.1.1/xdg-cache-home/torch_extensions/py311_cu121/fused_adam/fused_adam.so: cannot open shared object file: No such file or directory
FAILED tests/unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[fp32-fp16-zero2] - ImportError: /dev/shm/DeepSpeed/0.14.5/foss-2023a-CUDA-12.1.1/xdg-cache-home/torch_extensions/py311_cu121/fused_adam/fused_adam.so: cannot open shared object file: No such file or directory
FAILED tests/unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[fp32-fp16-zero3] - ImportError: /dev/shm/DeepSpeed/0.14.5/foss-2023a-CUDA-12.1.1/xdg-cache-home/torch_extensions/py311_cu121/fused_adam/fused_adam.so: cannot open shared object file: No such file or directory
FAILED tests/unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[fp32-fp16-None] - ImportError: /dev/shm/DeepSpeed/0.14.5/foss-2023a-CUDA-12.1.1/xdg-cache-home/torch_extensions/py311_cu121/fused_adam/fused_adam.so: cannot open shared object file: No such file or directory
FAILED tests/unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[fp32-bf16-zero1] - ImportError: /dev/shm/DeepSpeed/0.14.5/foss-2023a-CUDA-12.1.1/xdg-cache-home/torch_extensions/py311_cu121/fused_adam/fused_adam.so: cannot open shared object file: No such file or directory
FAILED tests/unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[fp32-bf16-zero2] - ImportError: /dev/shm/DeepSpeed/0.14.5/foss-2023a-CUDA-12.1.1/xdg-cache-home/torch_extensions/py311_cu121/fused_adam/fused_adam.so: cannot open shared object file: No such file or directory
FAILED tests/unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[fp32-bf16-zero3] - ImportError: /dev/shm/DeepSpeed/0.14.5/foss-2023a-CUDA-12.1.1/xdg-cache-home/torch_extensions/py311_cu121/fused_adam/fused_adam.so: cannot open shared object file: No such file or directory
FAILED tests/unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[fp32-bf16-None] - ImportError: /dev/shm/DeepSpeed/0.14.5/foss-2023a-CUDA-12.1.1/xdg-cache-home/torch_extensions/py311_cu121/fused_adam/fused_adam.so: cannot open shared object file: No such file or directory
FAILED tests/unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[fp32-fp32-zero1] - ImportError: /dev/shm/DeepSpeed/0.14.5/foss-2023a-CUDA-12.1.1/xdg-cache-home/torch_extensions/py311_cu121/fused_adam/fused_adam.so: cannot open shared object file: No such file or directory
FAILED tests/unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[fp32-fp32-zero2] - ImportError: /dev/shm/DeepSpeed/0.14.5/foss-2023a-CUDA-12.1.1/xdg-cache-home/torch_extensions/py311_cu121/fused_adam/fused_adam.so: cannot open shared object file: No such file or directory
FAILED tests/unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[fp32-fp32-zero3] - ImportError: /dev/shm/DeepSpeed/0.14.5/foss-2023a-CUDA-12.1.1/xdg-cache-home/torch_extensions/py311_cu121/fused_adam/fused_adam.so: cannot open shared object file: No such file or directory
FAILED tests/unit/runtime/test_ds_initialize.py::TestOptimizerImplementation::test[fp32-fp32-None] - ImportError: /dev/shm/DeepSpeed/0.14.5/foss-2023a-CUDA-12.1.1/xdg-cache-home/torch_extensions/py311_cu121/fused_adam/fused_adam.so: cannot open shared object file: No such file or directory
FAILED tests/unit/runtime/test_ds_initialize.py::TestClientLrScheduler::test[None-None] - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/test_ds_initialize.py::TestClientLrScheduler::test[None-_LRScheduler] - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/test_ds_initialize.py::TestClientLrScheduler::test[None-Callable] - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/test_lr_schedulers.py::TestGetLrBeforeTrain::test[WarmupLR-params0] - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/test_lr_schedulers.py::TestGetLrBeforeTrain::test[WarmupDecayLR-params1] - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/test_lr_schedulers.py::TestGetLrBeforeTrain::test[OneCycle-params2] - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/test_lr_schedulers.py::TestGetLrBeforeTrain::test[LRRangeTest-params3] - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/test_lr_schedulers.py::TestLrSchedule::test_lr_warmup_schedule[log-10] - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/test_lr_schedulers.py::TestLrSchedule::test_lr_warmup_schedule[log-15] - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/test_lr_schedulers.py::TestLrSchedule::test_lr_warmup_schedule[log-19] - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/test_lr_schedulers.py::TestLrSchedule::test_lr_warmup_schedule[log-33] - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/test_lr_schedulers.py::TestLrSchedule::test_lr_warmup_schedule[linear-10] - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/test_lr_schedulers.py::TestLrSchedule::test_lr_warmup_schedule[linear-15] - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/test_lr_schedulers.py::TestLrSchedule::test_lr_warmup_schedule[linear-19] - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/test_lr_schedulers.py::TestLrSchedule::test_lr_warmup_schedule[linear-33] - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/test_lr_schedulers.py::TestLrSchedule::test_lr_warmup_decay_schedule[log-10] - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/test_lr_schedulers.py::TestLrSchedule::test_lr_warmup_decay_schedule[log-15] - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/test_lr_schedulers.py::TestLrSchedule::test_lr_warmup_decay_schedule[log-19] - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/test_lr_schedulers.py::TestLrSchedule::test_lr_warmup_decay_schedule[log-33] - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/test_lr_schedulers.py::TestLrSchedule::test_lr_warmup_decay_schedule[linear-10] - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/test_lr_schedulers.py::TestLrSchedule::test_lr_warmup_decay_schedule[linear-15] - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/test_lr_schedulers.py::TestLrSchedule::test_lr_warmup_decay_schedule[linear-19] - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/test_lr_schedulers.py::TestLrSchedule::test_lr_warmup_decay_schedule[linear-33] - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/test_lr_schedulers.py::TestSchedulerOptimizerParity::test[WarmupLR-params0] - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/test_lr_schedulers.py::TestSchedulerOptimizerParity::test[WarmupDecayLR-params1] - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/test_lr_schedulers.py::TestSchedulerOptimizerParity::test[OneCycle-params2] - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/test_lr_schedulers.py::TestSchedulerOptimizerParity::test[LRRangeTest-params3] - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/test_lr_schedulers.py::TestLrRange::test[0.0001-1e-05-1-True] - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/test_lr_schedulers.py::TestLrRange::test[1e-05-1e-05-1-False] - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/test_lr_schedulers.py::TestLrRange::test[0.0001-0.001-10-True] - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/test_lr_schedulers.py::TestLrRange::test[0.001-0.001-10-False] - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/test_lr_schedulers.py::TestLrRange::test[0.01-0.01-19-True] - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/test_lr_schedulers.py::TestLrRange::test[0.01-0.01-19-False] - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/test_lr_schedulers.py::TestOneCycle::test_lr[1e-05-0.01-0.001-10-100] - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/test_lr_schedulers.py::TestOneCycle::test_lr[0.001-0.1-0-21-21] - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/test_lr_schedulers.py::TestOneCycle::test_lr[1e-05-0.01-0.001-10-101] - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/test_lr_schedulers.py::TestOneCycle::test_lr[0.001-0.1-0.1-21-21] - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/test_lr_schedulers.py::TestOneCycle::test_lr[1e-05-0.1-0-10-0] - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/test_lr_schedulers.py::TestOneCycle::test_mom[0.08-0.09-0.001-100] - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/test_lr_schedulers.py::TestOneCycle::test_mom[0.08-0.09-0-210] - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/test_lr_schedulers.py::TestOneCycle::test_mom[0.08-0.09-0.001-101] - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/test_lr_schedulers.py::TestOneCycle::test_mom[0.08-0.09-0-211] - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/test_lr_schedulers.py::TestWarmupCosineLR::test_lr[100-10-0.1-0.2] - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/test_lr_schedulers.py::TestWarmupCosineLR::test_lr[200-20-0.1-0.2] - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/test_lr_schedulers.py::TestWarmupCosineLR::test_lr[500-30-0.0-0.2] - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/test_lr_schedulers.py::TestWarmupCosineLR::test_lr[600-300-0.1-0.0] - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/test_lr_schedulers.py::TestWarmupCosineLR::test_lr[600-550-0.0-0.0] - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/test_multi_output_model.py::TestTwoOutputModel::test - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/test_multi_output_model.py::TestThreeOutputModel::test - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/test_pld.py::TestPLDModel::test_pld_model[0] - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/test_pld.py::TestPLDModel::test_pld_model[0.1] - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/test_pld.py::TestPLDModel::test_pld_model[0.9] - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/test_pld.py::TestPLDModel::test_pld_model[1.0] - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/test_pld.py::TestNonPLDModel::test_non_pld_model - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/half_precision/test_bf16.py::TestAdamBF16ZeroOneCycleCompatibility::test - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/half_precision/test_bf16.py::TestZeroSupportedClientOptimizer::test[FusedAdam] - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/half_precision/test_dynamic_loss_scale.py::TestFused::test_no_overflow - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/half_precision/test_dynamic_loss_scale.py::TestFused::test_all_overflow - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/half_precision/test_dynamic_loss_scale.py::TestFused::test_some_overflow - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/half_precision/test_dynamic_loss_scale.py::TestUnfused::test_no_overflow - RuntimeError: Error building extension 'fused_lamb'
FAILED tests/unit/runtime/half_precision/test_dynamic_loss_scale.py::TestUnfused::test_all_overflow - RuntimeError: Error building extension 'fused_lamb'
FAILED tests/unit/runtime/half_precision/test_dynamic_loss_scale.py::TestUnfused::test_some_overflow - RuntimeError: Error building extension 'fused_lamb'
FAILED tests/unit/runtime/half_precision/test_fp16.py::TestAdamFP16ZeroOneCycleCompatibility::test[False-1] - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/half_precision/test_fp16.py::TestAdamFP16ZeroOneCycleCompatibility::test[False-2] - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/half_precision/test_fp16.py::TestAdamFP16ZeroOneCycleCompatibility::test[False-3] - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/half_precision/test_fp16.py::TestZeroStaticScale::test[False-1] - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/half_precision/test_fp16.py::TestZeroStaticScale::test[False-2] - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/half_precision/test_fp16.py::TestZeroStaticScale::test[False-3] - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/half_precision/test_fp16.py::TestZeroSupportedClientOptimizer::test[FusedAdam-1] - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/half_precision/test_fp16.py::TestZeroSupportedClientOptimizer::test[FusedAdam-2] - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/half_precision/test_fp16.py::TestZeroSupportedClientOptimizer::test[FusedAdam-3] - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/half_precision/test_fp16.py::TestFP16AdamTypes::test[True-Adam] - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/half_precision/test_fp16.py::TestFP16AdamTypes::test[True-AdamW] - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/half_precision/test_fp16.py::TestFP16AdamTypes::test[False-Adam] - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/half_precision/test_fp16.py::TestFP16AdamTypes::test[False-AdamW] - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/half_precision/test_fp16.py::TestZero3LazyScatter::test - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/zero/test_zero.py::TestZeroUnbalancedGradients::test[1] - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/zero/test_zero.py::TestZeroUnbalancedGradients::test[2] - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/zero/test_zero.py::TestZeroUnbalancedGradients::test[3] - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/zero/test_zero.py::TestZero3RepeatForwardLoop::test[True] - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/zero/test_zero.py::TestZero3RepeatForwardLoop::test[False] - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/zero/test_zero.py::TestIncorectAllgatherBucketSize::test[1000] - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/zero/test_zero.py::TestIncorectAllgatherBucketSize::test[1001] - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/zero/test_zero.py::TestZero3DictFwd::test[tuple] - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/zero/test_zero.py::TestZero3DictFwd::test[list] - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/zero/test_zero.py::TestZero3DictFwd::test[dict] - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/zero/test_zero.py::TestZeroAdamOptimizerStepCount::test[1] - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/zero/test_zero.py::TestZeroAdamOptimizerStepCount::test[2] - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/zero/test_zero.py::TestZeroAdamOptimizerStepCount::test[3] - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/zero/test_zero.py::TestZeroPartitionCache::test_training_partition_cache[True] - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/zero/test_zero.py::TestEmptyParameterGroup::test_empty_param_groups[dtype0-True-False] - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/zero/test_zero.py::TestEmptyParameterGroup::test_empty_param_groups[dtype0-False-False] - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/zero/test_zero.py::TestEmptyParameterGroup::test_empty_param_groups[dtype1-True-False] - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/zero/test_zero.py::TestEmptyParameterGroup::test_empty_param_groups[dtype1-False-False] - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/zero/test_zero.py::TestEmptyParameterGroup::test_empty_param_groups[dtype2-True-False] - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/zero/test_zero.py::TestEmptyParameterGroup::test_empty_param_groups[dtype2-False-False] - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/zero/test_zero_context.py::TestSerialContext::test_throughput_calculation - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/zero/test_zero_context.py::TestSerialContext::test_ext_param_getattr - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/zero/test_zero_context_return.py::TestReturnParam::test_ext_param_return - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/zero/test_zero_context_return.py::TestReturnParam::test_stage_3_output_type[tensor] - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/zero/test_zero_context_return.py::TestReturnParam::test_stage_3_output_type[dict] - RuntimeError: Error building extension 'fused_adam'
FAILED tests/unit/runtime/zero/test_zero_context_return.py::TestReturnParam::test_stage_3_output_type[None] - RuntimeError: Error building extension 'fused_adam'
===================================================== 233 failed, 369 passed, 581 skipped, 4426 deselected, 88 warnings in 2208.85s (0:36:48) =====================================================
(at easybuild/tools/run.py:695 in parse_cmd_output)
== 2024-10-30 09:47:05,668 build_log.py:267 INFO ... (took 37 mins 15 secs)
== 2024-10-30 09:47:05,668 build_log.py:267 INFO ... (took 37 mins 42 secs)
== 2024-10-30 09:47:05,668 filetools.py:2025 INFO Removing lock /apps/Test/software/.locks/_apps_Test_software_DeepSpeed_0.14.5-foss-2023a-CUDA-12.1.1.lock...
== 2024-10-30 09:47:05,672 filetools.py:385 INFO Path /apps/Test/software/.locks/_apps_Test_software_DeepSpeed_0.14.5-foss-2023a-CUDA-12.1.1.lock successfully removed.
== 2024-10-30 09:47:05,672 filetools.py:2029 INFO Lock removed: /apps/Test/software/.locks/_apps_Test_software_DeepSpeed_0.14.5-foss-2023a-CUDA-12.1.1.lock
== 2024-10-30 09:47:05,672 easyblock.py:4297 WARNING build failed (first 300 chars): cmd "export PYTHONPATH=/cephyr/NOBACKUP/priv/c3-staff/eb-tmp/eb-t87e95ii/tmpltfa1fet/lib/python3.11/site-packages:$PYTHONPATH && mv deepspeed deepspeed.src && pytest tests/unit/ -k "not TestTensorBoard and not TestWandb and not TestCometMonitor" && mv deepspeed.src deepspeed " exited with exit code
== 2024-10-30 09:47:05,672 easyblock.py:326 INFO Closing log for application name DeepSpeed version 0.14.5