Created
June 14, 2021 16:16
Bootstrapping of a pandas dataframe
# self - an instance of the RandomForestClassifier class
# self.X - pandas dataframe containing feature information
# self.Y - a list of binary response values
# self.X_obs_fraction - a float in the range [0, 1]
def bootstrap_sample(self):
    """
    Creates a bootstrapped sample using the class instance parameters
    """
    # Sampling a fraction of the rows with replacement
    Xbootstrap = self.X.sample(frac=self.X_obs_fraction, replace=True)

    # Getting the index labels of the sampled rows
    # (assumes self.X has the default 0..n-1 index, so labels match positions in self.Y)
    indexes = Xbootstrap.index

    # Getting the corresponding Y values
    Ybootstrap = [self.Y[x] for x in indexes]

    # Dropping the index of X
    Xbootstrap.reset_index(inplace=True, drop=True)

    # Returning the X, Y pair
    return Xbootstrap, Ybootstrap
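The method above looks up `self.Y` by the index labels of the sampled rows, which works when `self.X` carries the default 0..n-1 index. The following is a minimal standalone sketch of the same idea that samples row positions instead, so it should also work for a dataframe with a custom index. The function name `bootstrap_pair`, its `frac` and `seed` parameters, and the toy data are assumptions made for illustration and are not part of the original class.

```python
import numpy as np
import pandas as pd


def bootstrap_pair(X, Y, frac=1.0, seed=None):
    """Draw a bootstrap sample of rows from X and the matching entries of Y.

    Hypothetical helper: rows are picked by position, so it does not
    rely on X having a default 0..n-1 index.
    """
    rng = np.random.default_rng(seed)
    n_rows = len(X)
    n_draws = int(round(frac * n_rows))

    # Row positions sampled with replacement
    positions = rng.integers(0, n_rows, size=n_draws)

    # Select the same positions from both X and Y, then drop the old index of X
    X_boot = X.iloc[positions].reset_index(drop=True)
    Y_boot = [Y[i] for i in positions]
    return X_boot, Y_boot


# Toy data with a non-default index, where label-based lookup into Y would fail
X = pd.DataFrame({"age": [22, 35, 58, 41]}, index=[10, 11, 12, 13])
Y = [0, 1, 1, 0]

X_boot, Y_boot = bootstrap_pair(X, Y, frac=1.0, seed=0)
```

Passing a fixed `seed` makes the draw repeatable, which can help when comparing trees grown on different bootstrap samples.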