@sofianhamiti
Created March 22, 2022 13:14
from sagemaker.workflow.retry import (
    StepRetryPolicy,
    StepExceptionTypeEnum,
    SageMakerJobStepRetryPolicy,
    SageMakerJobExceptionTypeEnum,
)

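# Retry on step-level (service-side) errors such as service faults and throttling.
step_retry_policy = StepRetryPolicy(
    exception_types=[
        StepExceptionTypeEnum.SERVICE_FAULT,
        StepExceptionTypeEnum.THROTTLING,
    ],
    backoff_rate=2.0,       # multiply the wait interval by 2 after each retry
    interval_seconds=30,    # wait 30 seconds before the first retry
    expire_after_mins=240,  # keep retrying for at most 4 hours
)
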
# Retry on failures reported by the SageMaker job itself.
job_retry_policy = SageMakerJobStepRetryPolicy(
    exception_types=[SageMakerJobExceptionTypeEnum.RESOURCE_LIMIT],
    failure_reason_types=[
        SageMakerJobExceptionTypeEnum.INTERNAL_ERROR,
        SageMakerJobExceptionTypeEnum.CAPACITY_ERROR,
    ],
    backoff_rate=2.0,       # multiply the wait interval by 2 after each retry
    interval_seconds=30,    # wait 30 seconds before the first retry
    expire_after_mins=240,  # keep retrying for at most 4 hours
)
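
To get a feel for what backoff_rate=2.0, interval_seconds=30 and expire_after_mins=240 mean together, here is a minimal pure-Python sketch of the resulting retry schedule. It does not call the SageMaker SDK. The helper retry_schedule is hypothetical, and it assumes that the wait before retry n is interval_seconds * backoff_rate ** (n - 1), that no retry starts once the expiry window has passed, and that the failed attempts themselves take no time. The real service behavior may differ in its details.

```python
def retry_schedule(interval_seconds, backoff_rate, expire_after_mins):
    """Return cumulative wait times (in seconds) at which retries would start.

    Hypothetical helper, not part of the SageMaker SDK.
    Assumptions: exponential backoff, a hard expiry window, and zero
    execution time for each failed attempt.
    """
    if interval_seconds <= 0 or backoff_rate < 1:
        # Simplification for this sketch: only growing or constant delays.
        raise ValueError("expected interval_seconds > 0 and backoff_rate >= 1")

    deadline = expire_after_mins * 60  # expiry window in seconds
    schedule = []
    elapsed = 0.0
    delay = float(interval_seconds)

    # Keep adding retries while the next one still fits inside the window.
    while elapsed + delay <= deadline:
        elapsed += delay
        schedule.append(elapsed)
        delay *= backoff_rate

    return schedule


schedule = retry_schedule(interval_seconds=30, backoff_rate=2.0, expire_after_mins=240)
print(f"{len(schedule)} retries, last one starting after {schedule[-1]:.0f} s")
```

Under these assumptions, the settings above allow 8 retries within the 4-hour window, because the ninth wait would push the total past 240 minutes. In practice the failed attempts also consume time, so fewer retries may fit.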