Created
January 12, 2020 11:23
Example code snippet for Naive Bayes fundamentals article, part [2]
# [2] now split into train/test set
# assumes `dataset` is a pandas DataFrame loaded in an earlier part
import numpy as np

# create a random boolean mask that is True for roughly 70% of rows
mask = np.random.rand(len(dataset)) < 0.7
train = dataset[mask]  # rows where the mask is True (about 70% of samples)
test = dataset[~mask]  # remaining rows (about 30% of samples)

# we also need to split the training data by whether the person earns
# more than 50K or at most 50K
less = train[train['income'] == '<=50K']
more = train[train['income'] == '>50K']
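The snippet above depends on a `dataset` defined elsewhere, so it cannot be run on its own. The following is a minimal self-contained sketch of the same mask-based split. The toy DataFrame, its `age` column, and the fixed seed are assumptions made for illustration and are not part of the original gist. Because the mask is random, the split is only approximately 70/30, and on small data it can deviate noticeably.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for the article's `dataset`
dataset = pd.DataFrame({
    "age": [25, 38, 52, 41, 29, 60, 33, 47, 36, 55],
    "income": ["<=50K", ">50K", ">50K", "<=50K", "<=50K",
               ">50K", "<=50K", ">50K", "<=50K", ">50K"],
})

# Seeded generator (an assumption) so the split is repeatable between runs
rng = np.random.default_rng(0)

# Boolean mask: each row lands in the training set with probability 0.7
mask = rng.random(len(dataset)) < 0.7
train = dataset[mask]
test = dataset[~mask]

# Split the training rows by income class
less = train[train["income"] == "<=50K"]
more = train[train["income"] == ">50K"]

print(len(train), len(test))
print(len(less), len(more))
```

Every row ends up in exactly one of `train` or `test`, and every training row ends up in exactly one of `less` or `more`, provided the `income` column contains only those two labels.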