Created July 15, 2022 16:02
Split a folder of files (e.g. images) into train, validation, and test folders, then load them as TensorFlow Datasets.
# https://stackoverflow.com/a/64006242/9292995
# https://github.com/jfilter/split-folders
import splitfolders
import tensorflow_datasets as tfds

# If your dataset is balanced (each class has the same number of samples), use ratio;
# if it is imbalanced, use splitfolders.fixed instead (that is where oversample applies).
# Writes output_dir/train, output_dir/val and output_dir/test, each with one subfolder per class.
splitfolders.ratio('input_dir', output="output_dir", seed=1337, ratio=(0.8, 0.1, 0.1))

# https://www.tensorflow.org/datasets/api_docs/python/tfds/folder_dataset/ImageFolder
# ImageFolder expects <root>/<split>/<label>/<image>; point it at the folder that holds the splits.
builder = tfds.ImageFolder('data/4_tfds_dataset')
print(builder.info)  # num examples, labels... are automatically calculated
# split=None returns a dict of all splits; as_supervised=True yields (image, label) tuples
data = builder.as_dataset(split=None, as_supervised=True)
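
To make the folder layout concrete, here is a minimal standard-library sketch of what a seeded ratio split does. It is not the split-folders implementation: `split_by_ratio` is a hypothetical helper, and the rounding rule (truncate train and val, give the remainder to test) is an assumption chosen for illustration.

```python
import random
import shutil
from pathlib import Path


def split_by_ratio(input_dir, output_dir, ratio=(0.8, 0.1, 0.1), seed=1337):
    """Hypothetical helper: copy input_dir/<class>/<file> into
    output_dir/{train,val,test}/<class>/<file> according to `ratio`."""
    input_dir, output_dir = Path(input_dir), Path(output_dir)
    counts = {}
    for class_dir in sorted(p for p in input_dir.iterdir() if p.is_dir()):
        files = sorted(f for f in class_dir.iterdir() if f.is_file())
        # A fresh seeded generator per class gives the same split on every run.
        random.Random(seed).shuffle(files)
        n_train = int(len(files) * ratio[0])
        n_val = int(len(files) * ratio[1])
        parts = {
            "train": files[:n_train],
            "val": files[n_train:n_train + n_val],
            "test": files[n_train + n_val:],  # assumption: remainder goes to test
        }
        for split, members in parts.items():
            target = output_dir / split / class_dir.name
            target.mkdir(parents=True, exist_ok=True)
            for f in members:
                shutil.copy2(f, target / f.name)
            counts[(split, class_dir.name)] = len(members)
    return counts
```

The resulting `<output_dir>/<split>/<class>/<file>` tree has the same shape as the one `tfds.ImageFolder` reads, which is why the two steps chain together.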