Download datasets using TensorFlow Datasets


The TensorFlow Datasets module provides easy access to many useful machine learning datasets. It can be installed with pip using the following command:

pip install tensorflow-datasets

Many datasets are available from this module in various categories, including audio, image classification, object detection, text, translation, and more. The full list of datasets and categories is available in the TensorFlow Datasets catalog. One useful example is the imagenette dataset, a small subset of ImageNet with 10 classes, 9,469 training examples and 3,925 validation examples. It is available in 3 different resolutions (160 px, 320 px, and full size), requiring roughly 100 MB, 330 MB, and 1.5 GB respectively. Below is an example script which downloads imagenette if it isn't already available, and displays images from it one by one using matplotlib:

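import tensorflow_datasets as tfds
import matplotlib.pyplot as plt

# Download imagenette on first use, then load the training split
ds = tfds.load('imagenette', split='train')

# Show the first 100 examples one at a time (close each window to continue)
for example in ds.take(100):
    image, label = example["image"], example["label"]
    print(label)
    print(image)
    plt.imshow(image)
    plt.show()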
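
To pick one of the smaller variants rather than the default, the configuration can be appended to the dataset name. The sketch below assumes the configuration names `160px-v2`, `320px-v2` and `full-size-v2` as I understand them from the catalog; they may differ between versions of tensorflow-datasets, so check the catalog entry for imagenette. The helper names `imagenette_name` and `load_imagenette` are hypothetical, not part of the library.

```python
# Assumed config names; verify them against the TFDS catalog for your version.
IMAGENETTE_CONFIGS = {
    "160px": "imagenette/160px-v2",
    "320px": "imagenette/320px-v2",
    "full": "imagenette/full-size-v2",
}


def imagenette_name(resolution="160px"):
    """Return the 'dataset/config' string to pass to tfds.load."""
    if resolution not in IMAGENETTE_CONFIGS:
        raise ValueError(
            f"Unknown resolution {resolution!r}; "
            f"expected one of {sorted(IMAGENETTE_CONFIGS)}"
        )
    return IMAGENETTE_CONFIGS[resolution]


def load_imagenette(resolution="160px", split="train"):
    """Download (if needed) and load one split at the chosen resolution."""
    # Imported lazily so the name helper can be used without TensorFlow.
    import tensorflow_datasets as tfds

    return tfds.load(imagenette_name(resolution), split=split)
```

For example, `load_imagenette("160px", split="validation")` should fetch the validation split of the smallest variant, which keeps the download to roughly 100 MB.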

In the TensorFlow Datasets Guide, see the sections on NumPy and batched Tensors for examples of how to convert the resulting `tf.data.Dataset` object to different data types.
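
As a rough illustration of the NumPy route, `tfds.as_numpy(ds)` yields each example as a dictionary of NumPy arrays. The hypothetical helper below (`collect_examples` is my own name, not a library function) gathers the first few examples from any such iterable. It is a minimal sketch that assumes each example has `"image"` and `"label"` keys, as in the script above.

```python
import numpy as np


def collect_examples(examples, limit=8):
    """Collect up to `limit` images and labels from an iterable of dicts.

    `examples` is expected to yield dicts with "image" and "label" entries,
    which is what iterating over tfds.as_numpy(ds) produces for imagenette.
    """
    images, labels = [], []
    for i, example in enumerate(examples):
        if i >= limit:
            break
        images.append(np.asarray(example["image"]))
        labels.append(int(example["label"]))
    # Images stay in a list because their shapes may differ.
    return images, np.array(labels, dtype=np.int64)
```

With the dataset from the script above, this could be called as `images, labels = collect_examples(tfds.as_numpy(ds), limit=8)`. The images are returned as a list rather than one stacked array because imagenette images vary in size, so they would need resizing to a common shape before stacking or batching.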
