@vanpelt
Created January 11, 2019 20:27
Sagemaker Sweep Monitor
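import wandb
from sagemaker.tuner import HyperparameterTuner

# Attach to an existing hyperparameter tuning job by name.
tuner_name = "sagemaker-pytorch-190111-1056"
tuner = HyperparameterTuner.attach(tuner_name)
# Read the job name shared by the sweep's training jobs from the static hyperparameters.
job_name = tuner.analytics().description()['TrainingJobDefinition']['StaticHyperParameters']['sagemaker_job_name']
client = tuner.sagemaker_session.sagemaker_client

# Find the W&B runs that belong to this sweep.
api = wandb.Api()
runs = api.runs("wandb/sm-pytorch-cifar", {"config.sagemaker_job_name": job_name})
for run in runs:
    # Stop jobs that are past step 100 but still below 50% test accuracy.
    # .get() skips runs that have not logged these keys yet instead of raising KeyError.
    if run.summary.get("_step", 0) > 100 and run.summary.get("Test Acc", 1.0) < 0.5:
        client.stop_training_job(TrainingJobName=run.name.replace("algo-1", ""))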
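
The stopping rule in the loop above can be pulled out into a small pure function, which makes it possible to check the thresholds without calling SageMaker or W&B. This is only a minimal sketch: `should_stop` and its `min_step` / `min_acc` parameters are hypothetical names and not part of either library. The defaults mirror the 100-step and 0.5-accuracy thresholds used above, and it assumes the summary behaves like a mapping with the `_step` and `Test Acc` keys.

```python
from typing import Any, Mapping


def should_stop(summary: Mapping[str, Any],
                min_step: int = 100,
                min_acc: float = 0.5) -> bool:
    """Return True if a run is past min_step but still below min_acc.

    Hypothetical helper: `summary` is any mapping that may contain the
    "_step" and "Test Acc" keys, such as a run's summary metrics.
    """
    step = summary.get("_step")
    acc = summary.get("Test Acc")
    # A run that has not logged both values yet is never stopped.
    if step is None or acc is None:
        return False
    return step > min_step and acc < min_acc
```

With a helper like this, the loop body could call `should_stop(run.summary)` before `client.stop_training_job(...)`, and the rule can be tried on plain dictionaries first as a dry run.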