Last active May 19, 2021 21:04
TFRecords: from_tensors vs from_tensor_slices

from_tensors vs. from_tensor_slices

From the manual

TFRecord files using tf.data

The tf.data module also provides tools for reading and writing data in TensorFlow.

Writing a TFRecord file

The easiest way to get the data into a dataset is to use the from_tensor_slices method.

Applied to an array, it returns a dataset of scalars:

tf.data.Dataset.from_tensor_slices(feature1)

<TensorSliceDataset shapes: (), types: tf.int64>

Applied to a tuple of arrays, it returns a dataset of tuples:

features_dataset = tf.data.Dataset.from_tensor_slices((feature0, feature1, feature2, feature3))

<TensorSliceDataset shapes: ((), (), (), ()), types: (tf.bool, tf.int64, tf.string, tf.float64)>
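
The tuple case can be tried end to end with a minimal sketch, assuming TensorFlow 2.x and NumPy are installed. The array names feature0 to feature3 and their values are illustrative assumptions, chosen only to match the four dtypes printed above.

```python
import numpy as np
import tensorflow as tf

# Hypothetical feature arrays, one per dtype shown in the output above.
n = 5
feature0 = np.array([True, False, True, False, True])           # bool
feature1 = np.arange(n, dtype=np.int64)                         # int64
feature2 = np.array([b"cat", b"dog", b"cat", b"bird", b"dog"])  # bytes -> tf.string
feature3 = np.linspace(0.0, 1.0, n)                             # float64

features_dataset = tf.data.Dataset.from_tensor_slices(
    (feature0, feature1, feature2, feature3)
)

# element_spec describes one element: a tuple of four scalar TensorSpecs.
spec = features_dataset.element_spec
dtypes = tuple(s.dtype for s in spec)
shapes = tuple(s.shape.as_list() for s in spec)

# Pull the first element to see one "row" of the dataset.
f0, f1, f2, f3 = next(iter(features_dataset))
```

Each element is a 4-tuple of scalar tensors, so unpacking it as `f0, f1, f2, f3` gives one value from each input array.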

From stackoverflow

from_tensors combines the input and returns a dataset with a single element:

>>> t = tf.constant([[1, 2], [3, 4]])
>>> ds = tf.data.Dataset.from_tensors(t)
>>> [x for x in ds]
[<tf.Tensor: shape=(2, 2), dtype=int32, numpy=
 array([[1, 2],
        [3, 4]], dtype=int32)>]

from_tensor_slices creates a dataset with a separate element for each row of the input tensor:

>>> t = tf.constant([[1, 2], [3, 4]])
>>> ds = tf.data.Dataset.from_tensor_slices(t)
>>> [x for x in ds]
[<tf.Tensor: shape=(2,), dtype=int32, numpy=array([1, 2], dtype=int32)>,
 <tf.Tensor: shape=(2,), dtype=int32, numpy=array([3, 4], dtype=int32)>]

I think the source of confusion (at least it was for me) is the name. Since from_tensor_slices creates slices from the original data, the ideal name would have been "to_tensor_slices", because you are taking your data and creating tensor slices out of it. Once you think along those lines, all the TF2 documentation becomes very clear!

A key piece of info for me that was absent from the docs was that multiple tensors are passed to these methods as a tuple, e.g. from_tensors((t1, t2, t3)). With that knowledge, from_tensors makes a dataset with a single element, the tuple itself, in which each input tensor is kept whole, and from_tensor_slices makes a dataset where each input tensor is a column of your data. In the latter case all tensors must be the same length, and the elements (rows) of the resulting dataset are tuples with one element from each column.
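
The tuple behaviour described above can be checked with a short sketch, assuming TensorFlow 2.x in eager mode. The tensors t1, t2, t3 and their values are hypothetical. The exception types caught for the length mismatch are an assumption about how current releases report it.

```python
import tensorflow as tf

# Three hypothetical tensors of equal length (names are illustrative).
t1 = tf.constant([1, 2, 3])
t2 = tf.constant([10.0, 20.0, 30.0])
t3 = tf.constant(["a", "b", "c"])

# from_tensors: the whole tuple becomes ONE dataset element.
ds_whole = tf.data.Dataset.from_tensors((t1, t2, t3))
whole_elements = list(ds_whole)

# from_tensor_slices: each tensor is a column; each element is one row.
ds_rows = tf.data.Dataset.from_tensor_slices((t1, t2, t3))
rows = [(a.numpy(), b.numpy(), c.numpy()) for a, b, c in ds_rows]

# Columns of different lengths are rejected by from_tensor_slices.
try:
    tf.data.Dataset.from_tensor_slices((t1, tf.constant([1, 2])))
    mismatch_rejected = False
except (ValueError, tf.errors.InvalidArgumentError):
    mismatch_rejected = True
```

Here `whole_elements` holds a single tuple of three full-length tensors, while `rows` holds three tuples, each with one scalar from every column.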

Another reference to study

Also, how to use this:

'time':              tf.FixedLenSequenceFeature([], tf.float32, allow_missing = True), 
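
One possible answer, as a minimal sketch: in TensorFlow 2.x the class is exposed as tf.io.FixedLenSequenceFeature, and with allow_missing=True it can be passed to tf.io.parse_single_example to read a feature of variable length. The Example built below, including its values for 'time', is a hypothetical stand-in for a real record.

```python
import tensorflow as tf

# Build a hypothetical Example whose 'time' feature holds a variable number of floats.
example = tf.train.Example(
    features=tf.train.Features(
        feature={
            "time": tf.train.Feature(
                float_list=tf.train.FloatList(value=[0.0, 0.5, 1.0])
            )
        }
    )
)
serialized = example.SerializeToString()

# allow_missing=True lets the feature have any length;
# the parsed result is a dense 1-D tensor.
feature_spec = {
    "time": tf.io.FixedLenSequenceFeature([], tf.float32, allow_missing=True),
}
parsed = tf.io.parse_single_example(serialized, feature_spec)
time_values = parsed["time"].numpy()
```

The same `feature_spec` dictionary can be used inside a function passed to `Dataset.map` when the records come from a TFRecord file.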

Also useful:


The stuff here would work, but the is much easier:

tf.train.Example.FromString(ex.numpy()).ListFields()[0][1].feature['objects']
json.loads(tf.train.Example.FromString(ex.numpy()).ListFields()[0][1].feature['objects'].bytes_list.value[0].decode())[0]
json.loads(tf.train.Example.FromString(ex.numpy()).features.feature['objects'].bytes_list.value[0].decode())
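
A self-contained round trip makes these expressions easier to try. This is a sketch under assumptions: the JSON payload is invented, and `ex` is built by hand as a scalar string tensor standing in for one record read from a TFRecord dataset.

```python
import json
import tensorflow as tf

# Hypothetical payload: a JSON-encoded list stored as one bytes value under 'objects'.
objects = [{"label": "car", "score": 0.9}, {"label": "dog", "score": 0.4}]
example = tf.train.Example(
    features=tf.train.Features(
        feature={
            "objects": tf.train.Feature(
                bytes_list=tf.train.BytesList(value=[json.dumps(objects).encode()])
            )
        }
    )
)

# Stand-in for one element read from a TFRecord dataset: a scalar tf.string tensor.
ex = tf.constant(example.SerializeToString())

# Parse the raw bytes back into an Example proto and decode the JSON payload.
proto = tf.train.Example.FromString(ex.numpy())
decoded = json.loads(proto.features.feature["objects"].bytes_list.value[0].decode())

# The ListFields() route reaches the same Feature message less directly.
via_list_fields = proto.ListFields()[0][1].feature["objects"]
```

Going through `proto.features` names the field explicitly, whereas `ListFields()[0][1]` relies on `features` being the first populated field of the message.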
