@PhanDuc
Last active February 18, 2018 18:54
import numpy as np
from sklearn.ensemble import BaggingClassifier


def bootstrap_predictions(estimator, X, y, X_test, n_bootstrap=101):
    """Bootstrap a given classifier.

    Parameters
    ----------
    estimator : object
        A classifier instance with an sklearn-compatible interface.
    X : array, shape = (n_samples, n_features)
        The feature matrix of the full dataset.
    y : array, shape = (n_samples,)
        The target labels of the full dataset.
    X_test : array, shape = (n_test_samples, n_features)
        The test data on which to measure the output of the bootstrapped
        estimators.
    n_bootstrap : int, positive
        The number of bootstrap replications of `estimator` to make.

    Returns
    -------
    proba : array, shape = (n_test_samples, n_bootstrap), dtype=float
        The matrix of bootstrapped outputs of the classifier (the result of
        `predict`), one column per replication.
    bag : list
        The list of bootstrap replications of the classifier.

    Details
    -------
    The full dataset `(X, y)` is used to generate bootstrap samples. Each one
    of the `n_bootstrap` samples is used to train a separate copy of the
    provided `estimator`. Each classifier from the resulting set of
    bootstrapped estimators is applied to `X_test`, and its output is
    recorded in the `proba` array.
    """
    ### BEGIN Solution
    # The estimator is passed positionally because the keyword was renamed
    # from `base_estimator` to `estimator` in scikit-learn 1.2.
    model = BaggingClassifier(estimator,
                              n_estimators=n_bootstrap,
                              bootstrap=True,
                              n_jobs=-1,
                              random_state=42)
    model.fit(X, y)

    # The list of bootstrap replications of the classifier.
    bag = model.estimators_

    # The matrix of bootstrapped outputs of the classifier.
    proba = np.zeros(shape=(X_test.shape[0], n_bootstrap))
    # `clf` avoids shadowing the `estimator` argument inside the loop.
    for index, clf in enumerate(bag):
        proba[:, index] = clf.predict(X_test)
    ### END Solution
    return proba, bag
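
The procedure described in the docstring's Details section can also be written out by hand, without `BaggingClassifier`. The following is only a minimal sketch under stated assumptions: the helper name `manual_bootstrap_predictions`, the `seed` parameter, the synthetic dataset, and the choice of `DecisionTreeClassifier` are all hypothetical and not part of the original gist. It is meant to make the resample, fit, and predict steps explicit, not to reproduce the exact samples that `BaggingClassifier` draws.

```python
import numpy as np
from sklearn.base import clone
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier


def manual_bootstrap_predictions(estimator, X, y, X_test,
                                 n_bootstrap=11, seed=0):
    """Hypothetical helper: fit one clone of `estimator` per bootstrap sample."""
    rng = np.random.RandomState(seed)
    n_samples = X.shape[0]
    outputs = np.zeros((X_test.shape[0], n_bootstrap))
    bag = []
    for b in range(n_bootstrap):
        # One bootstrap sample: n_samples row indices drawn with replacement.
        idx = rng.randint(0, n_samples, size=n_samples)
        # clone() gives a fresh, unfitted copy with the same hyperparameters.
        clf = clone(estimator).fit(X[idx], y[idx])
        outputs[:, b] = clf.predict(X_test)
        bag.append(clf)
    return outputs, bag


# Synthetic binary problem (labels 0 and 1), used only for illustration.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
X_train, y_train, X_test = X[:150], y[:150], X[150:]

outputs, bag = manual_bootstrap_predictions(
    DecisionTreeClassifier(random_state=0), X_train, y_train, X_test)

# Share of replications that predict class 1 for each test point.
vote_share = outputs.mean(axis=1)
```

Averaging the columns, as in `vote_share`, is one simple way to summarize how much the replications agree on each test point.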