Last active
March 11, 2020 13:52
import random


def shuffle_split(documents, labels, split):
    """Shuffle data to ensure random class distribution in train/test split."""
    # Pair each document with its label so both are shuffled together.
    tuples = list(zip(documents, labels))
    random.shuffle(tuples)
    # Unzip back into parallel tuples of documents and labels.
    X, Y = zip(*tuples)
    # `split` is the fraction of the data used for training, e.g. 0.8.
    split_point = int(split * len(X))
    Xtrain = X[:split_point]
    Ytrain = Y[:split_point]
    Xtest = X[split_point:]
    Ytest = Y[split_point:]
    return Xtrain, Xtest, Ytrain, Ytest
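
A minimal usage sketch, assuming reproducible splits are wanted. `shuffle_split_seeded` and its `seed` parameter are hypothetical and not part of the original gist; the shuffle-then-slice logic is the same, but it uses a local `random.Random` instance so the global random state is left untouched. The sample documents and labels are made up for illustration.

```python
import random


def shuffle_split_seeded(documents, labels, split, seed=None):
    """Hypothetical seeded variant of shuffle_split (not in the original gist)."""
    # A local generator makes the shuffle repeatable for a given seed.
    rng = random.Random(seed)
    pairs = list(zip(documents, labels))
    rng.shuffle(pairs)
    # `split` is the fraction of the data assigned to the training set.
    split_point = int(split * len(pairs))
    train, test = pairs[:split_point], pairs[split_point:]
    Xtrain = [doc for doc, _ in train]
    Ytrain = [label for _, label in train]
    Xtest = [doc for doc, _ in test]
    Ytest = [label for _, label in test]
    return Xtrain, Xtest, Ytrain, Ytest


# Made-up example data: ten documents with alternating binary labels.
docs = ["doc %d" % i for i in range(10)]
labels = [i % 2 for i in range(10)]

Xtrain, Xtest, Ytrain, Ytest = shuffle_split_seeded(docs, labels, 0.8, seed=42)
print(len(Xtrain), len(Xtest))  # 8 2
```

Because documents and labels are shuffled as pairs, each document keeps its own label after the split. Note that this kind of split is random but not stratified, so the class balance in the train and test sets can differ from that of the full data.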