@yngtodd
Created September 22, 2017 01:21
Checking reproducibility with Scikit-Optimize
@yngtodd (Author) commented Sep 22, 2017

Notes:

There are two places where we set a random_state in order to manage the pseudorandom portions of the code:

  1. Inside the initialization of the gradient boosting machine
  2. Inside the Scikit-Optimize optimizer gp_minimize()

There could also be some potential for variation in Scikit-Learn's cross_val_score, but repeated trials produce the same cross-validation folds. A minimal sketch of the full setup follows.
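For reference, here is a minimal sketch of that setup. The dataset, search space, and hyperparameters are illustrative assumptions, not the gist's actual code; the two random_state settings are the point:

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import cross_val_score
from skopt import gp_minimize

# Toy regression data; the gist's actual dataset is an assumption here.
X, y = make_regression(n_samples=200, n_features=10, random_state=0)

SEED = 0  # the random_state under test

# Illustrative search space: learning_rate and max_depth only.
dimensions = [(0.01, 0.5),  # learning_rate (Real)
              (2, 8)]       # max_depth (Integer)

def objective(params):
    learning_rate, max_depth = params
    # (1) random_state set inside the gradient boosting machine
    model = GradientBoostingRegressor(learning_rate=learning_rate,
                                      max_depth=max_depth,
                                      random_state=SEED)
    # cross_val_score with an integer cv uses deterministic, unshuffled
    # KFold splits, so the folds are identical from trial to trial.
    score = cross_val_score(model, X, y, cv=5,
                            scoring='neg_mean_squared_error').mean()
    return -score  # gp_minimize minimizes, so negate the score

# (2) random_state set inside the Scikit-Optimize optimizer
result = gp_minimize(objective, dimensions, n_calls=20, random_state=SEED)
print(result.fun, result.x)
```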

Results:

Looks like everything is good to go. I performed two runs for each of two random_state values. In every case the final return value of the objective function is the same, as are the parameter choices selected by gp_minimize. Finally, the convergence plots are consistent across experiments: for each random state, the two runs return identical values of the objective at every iteration of the optimization, so the plots sit exactly on top of one another.
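One way to reproduce that check, continuing the sketch above (it reuses the illustrative objective, dimensions, and SEED, so it is likewise an assumption rather than the gist's code):

```python
from matplotlib import pyplot as plt
from skopt.plots import plot_convergence

# Two runs with the same seed, reusing objective/dimensions/SEED above.
run_a = gp_minimize(objective, dimensions, n_calls=20, random_state=SEED)
run_b = gp_minimize(objective, dimensions, n_calls=20, random_state=SEED)

# Overlay the convergence traces; reproducible runs sit on top of one another.
plot_convergence(("run A", run_a), ("run B", run_b))
plt.show()

# A stricter check than eyeballing the plot: the objective values should
# match exactly at every iteration.
assert (run_a.func_vals == run_b.func_vals).all()
```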
