@Eligijus112
Created June 14, 2021 16:16
Bootstrapping of a pandas dataframe
# self - an instance of the RandomForestClassifier class
# self.X - pandas dataframe containing feature information
# self.Y - a list of binary response values, one per row of self.X
# self.X_obs_fraction - a float in range [0, 1]
def bootstrap_sample(self):
    """
    Function that creates a bootstrapped sample with the class instance parameters
    """
    # Sampling a fraction of the rows with replacement
    Xbootstrap = self.X.sample(frac=self.X_obs_fraction, replace=True)

    # Getting the index labels of the sampled rows
    indexes = Xbootstrap.index

    # Getting the corresponding Y values; this lookup assumes self.X has
    # the default 0..n-1 index, so that labels match positions in self.Y
    Ybootstrap = [self.Y[x] for x in indexes]

    # Dropping the index of X
    Xbootstrap.reset_index(inplace=True, drop=True)

    # Returning the X, Y pair
    return Xbootstrap, Ybootstrap
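
The method above depends on a surrounding class, so it cannot be run on its own. Below is a minimal standalone sketch of the same idea, not the gist author's implementation. The free function, its `frac` and `seed` parameters, and the toy data are all assumptions made for illustration. It also swaps in a different technique: rows are drawn by position with NumPy instead of by `DataFrame.sample`, so the X/Y pairing holds even when the dataframe does not have a default index.

```python
import numpy as np
import pandas as pd


def bootstrap_sample(X, Y, frac=1.0, seed=None):
    """Draw rows of X with replacement and return them with the matching Y values.

    Hypothetical standalone helper: X is a pandas dataframe, Y is a list with
    one entry per row of X, frac is the fraction of rows to draw.
    """
    rng = np.random.default_rng(seed)
    n_rows = int(round(frac * len(X)))

    # Row positions (not index labels) drawn with replacement
    positions = rng.integers(0, len(X), size=n_rows)

    # Select the same positions from X and Y so the pairs stay aligned
    X_boot = X.iloc[positions].reset_index(drop=True)
    Y_boot = [Y[i] for i in positions]
    return X_boot, Y_boot


# Toy data with a non-default index, made up for this example
X = pd.DataFrame(
    {"age": [22, 35, 58, 41], "fare": [7.2, 71.3, 8.0, 53.1]},
    index=[10, 20, 30, 40],
)
Y = [0, 1, 0, 1]

X_boot, Y_boot = bootstrap_sample(X, Y, frac=1.0, seed=42)
print(X_boot)
print(Y_boot)
```

Sampling by position keeps the lookup into `Y` valid for any index, whereas a label-based lookup into a plain list only works when the labels happen to be 0..n-1.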