@gangliao
Last active March 6, 2020 09:06
Using TensorFlow's tf.data to load data from HDFS
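import tensorflow as tf

# HDFS paths use the hdfs://<namenode-host>:<port>/<path> form.
filenames = ["hdfs://10.152.104.73:8020/sogou/train_data/1_final.feature_transform"]

# Each dataset element is one line of the input file.
dataset = tf.data.TextLineDataset(filenames)

# TF 1.x graph-mode API: build an iterator, then pull elements inside a session.
iterator = dataset.make_one_shot_iterator()
next_batch = iterator.get_next()

with tf.Session() as sess:
    while True:
        try:
            print(sess.run(next_batch).decode())
        except tf.errors.OutOfRangeError:
            # Raised once the dataset is exhausted.
            break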
@amithbk12man

You need to set the HDFS path or install with libhdfs.so. Please check this: https://www.tensorflow.org/deploy/hadoop
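For reference, here is a rough sketch of the environment setup that guide describes: JAVA_HOME, HADOOP_HDFS_HOME, an LD_LIBRARY_PATH that includes the directory holding libjvm.so, and a CLASSPATH listing the Hadoop jars. The helper name `build_hdfs_env` and the example paths are made up for illustration, and the `jre/lib/amd64/server` layout is an assumption that matches a Java 8 JRE on x86-64 Linux; adjust it for your install.

```python
import os


def build_hdfs_env(java_home, hadoop_home, hadoop_classpath, base_env=None):
    """Return a copy of the environment with the HDFS-related variables set.

    Hypothetical helper, not part of TensorFlow or Hadoop.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["JAVA_HOME"] = java_home
    env["HADOOP_HDFS_HOME"] = hadoop_home

    # Directory expected to contain libjvm.so (assumed Java 8 / x86-64 layout).
    jvm_dir = f"{java_home}/jre/lib/amd64/server"
    parts = [p for p in env.get("LD_LIBRARY_PATH", "").split(":") if p]
    if jvm_dir not in parts:
        parts.append(jvm_dir)
    env["LD_LIBRARY_PATH"] = ":".join(parts)

    # Expected to be the output of `hadoop classpath --glob`.
    env["CLASSPATH"] = hadoop_classpath
    return env


# Possible usage (paths and script name are placeholders):
#   import subprocess
#   cp = subprocess.check_output(
#       ["/opt/hadoop/bin/hadoop", "classpath", "--glob"], text=True).strip()
#   env = build_hdfs_env("/usr/lib/jvm/java-8-openjdk-amd64", "/opt/hadoop", cp)
#   subprocess.run(["python", "your_script.py"], env=env, check=True)
```

The script is launched as a child process because the dynamic loader generally reads LD_LIBRARY_PATH at process start, so changing `os.environ` inside an already running Python process may not be enough.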
