@yngtodd
Created September 22, 2017 01:21
Checking reproducibility with Scikit-Optimize
@yngtodd (Author) commented Sep 22, 2017

Notes:

There are two places where we set a random_state in order to manage the pseudorandom portions of the code:

  1. Inside the initialization of the gradient boosting machine
  2. Inside the Scikit-Optimize optimizer gp_minimize()

There could also be some potential for variation in Scikit-Learn's cross_val_score, but repeated trials produce the same cross-validation folds. A minimal sketch of the full setup follows.
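For reference, here is a minimal sketch of that setup. The dataset, search space, and hyperparameters are illustrative assumptions, not the gist's actual code; the two random_state settings are the point:

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import cross_val_score
from skopt import gp_minimize

# Toy regression data; the gist's actual dataset is an assumption here.
X, y = make_regression(n_samples=200, n_features=10, random_state=0)

SEED = 0  # the random_state under test

# Illustrative search space: learning_rate and max_depth only.
dimensions = [(0.01, 0.5),  # learning_rate (Real)
              (2, 8)]       # max_depth (Integer)

def objective(params):
    learning_rate, max_depth = params
    # (1) random_state set inside the gradient boosting machine
    model = GradientBoostingRegressor(learning_rate=learning_rate,
                                      max_depth=max_depth,
                                      random_state=SEED)
    # cross_val_score with an integer cv uses deterministic, unshuffled
    # KFold splits, so the folds are identical from trial to trial.
    score = cross_val_score(model, X, y, cv=5,
                            scoring='neg_mean_squared_error').mean()
    return -score  # gp_minimize minimizes, so negate the score

# (2) random_state set inside the Scikit-Optimize optimizer
result = gp_minimize(objective, dimensions, n_calls=20, random_state=SEED)
print(result.fun, result.x)
```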

Results:

Looks like everything is good to go. I performed two runs for each of two random_state values. In every case the final return value of the objective function is the same, as are the parameter choices selected by gp_minimize. Finally, the convergence plots are consistent across experiments: for each random state, the two runs return identical values of the objective at every iteration of the optimization, so the plots sit exactly on top of one another.
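One way to reproduce that check, continuing the sketch above (it reuses the illustrative objective, dimensions, and SEED, so it is likewise an assumption rather than the gist's code):

```python
from matplotlib import pyplot as plt
from skopt.plots import plot_convergence

# Two runs with the same seed, reusing objective/dimensions/SEED above.
run_a = gp_minimize(objective, dimensions, n_calls=20, random_state=SEED)
run_b = gp_minimize(objective, dimensions, n_calls=20, random_state=SEED)

# Overlay the convergence traces; reproducible runs sit on top of one another.
plot_convergence(("run A", run_a), ("run B", run_b))
plt.show()

# A stricter check than eyeballing the plot: the objective values should
# match exactly at every iteration.
assert (run_a.func_vals == run_b.func_vals).all()
```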
