TensorFlow Serving in 10 minutes!

TensorFlow Serving is Google's recommended way to deploy TensorFlow models. Without a proper computer engineering background it can be quite intimidating, even for people who feel comfortable with TensorFlow itself. A few things that I found particularly hard were:

  • Tutorial examples have C++ code (which I don't know)
  • Tutorials have Kubernetes, gRPC, Bazel (some of which I saw for the first time)
  • It needs to be compiled. That process takes forever!

In the end, it worked just fine. Here I present the easiest possible way to deploy your models with TensorFlow Serving. You will have your self-built model running inside TF-Serving by the end of this tutorial. It will be scalable, and you will be able to query it via REST.

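import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

# Softmax regression on MNIST (TensorFlow 1.x API)
x = tf.placeholder(tf.float32, shape=[None, 784])
y_ = tf.placeholder(tf.float32, shape=[None, 10])
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
y = tf.matmul(x, W) + b

# Loss and optimizer as in the standard MNIST softmax tutorial
# (assumed; the learning rate of 0.5 is illustrative)
cross_entropy = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=y))
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)
# Predicted digit, exported later as 'pred' (assumed to be the argmax over the logits)
pred = tf.argmax(y, 1)

mnist = input_data.read_data_sets('MNIST_data', one_hot=True)

sess = tf.InteractiveSession()
sess.run(tf.global_variables_initializer())

# Train for 1000 steps on mini-batches of 100 images
for _ in range(1000):
    batch = mnist.train.next_batch(100)
    train_step.run(feed_dict={x: batch[0], y_: batch[1]})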
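
For intuition, the graph above is a single linear layer: each image is a flat vector of 784 pixel values, and the model computes one score per digit as `x·W + b`. The exported `pred` output is assumed here to be the index of the highest score. Below is a minimal pure-Python sketch of that forward pass on a tiny made-up example (3 features, 2 classes) rather than real MNIST weights. The function names `logits` and `predict` are illustrative and not part of TensorFlow.

```python
def logits(x, W, b):
    """Compute x·W + b for one input vector x (a list of floats).

    W is a list of rows (one row per input feature, one column per class),
    b is a list with one bias per class.
    """
    n_classes = len(b)
    return [
        sum(x[i] * W[i][k] for i in range(len(x))) + b[k]
        for k in range(n_classes)
    ]


def predict(x, W, b):
    """Return the index of the largest logit (what an argmax over y gives)."""
    scores = logits(x, W, b)
    return max(range(len(scores)), key=lambda k: scores[k])


# Toy example: 3 input features, 2 classes (hypothetical numbers).
W = [[1.0, 0.0],
     [0.0, 1.0],
     [0.5, 0.5]]
b = [0.0, 0.1]
x = [0.2, 0.9, 0.0]

print(logits(x, W, b))
print(predict(x, W, b))
```

With real MNIST data, `x` would have 784 entries and `W` would have 784 rows and 10 columns, but the arithmetic is the same.
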
%matplotlib inline
import matplotlib.pyplot as plt
number = mnist.train.next_batch(1)[0]
plt.imshow(number.reshape(28,28))
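from tensorflow.contrib.session_bundle import exporter

# Export the trained graph with named input and output signatures
saver = tf.train.Saver(sharded=True)
model_exporter = exporter.Exporter(saver)
model_exporter.init(
    sess.graph.as_graph_def(),
    named_graph_signatures={
        'inputs': exporter.generic_signature({'x': x}),
        'outputs': exporter.generic_signature({'pred': pred})})
# Write version 1 to /tmp/models (path and version assumed from the directory listing)
model_exporter.export('/tmp/models', tf.constant(1), sess)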
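
The exporter writes each export into a subdirectory of the base path named after the model version. The listing that follows shows version 1 stored as `00000001`, which suggests an eight-digit zero-padded name. Here is a small sketch of that naming, where `version_dir` is a hypothetical helper (not a TensorFlow function) and the eight-digit padding is inferred from that listing:

```python
import os


def version_dir(base_dir, version):
    """Build the per-version export path, e.g. /tmp/models/00000001.

    The eight-digit zero padding is an assumption inferred from the
    directory name in the listing, not taken from TensorFlow's docs.
    """
    if version < 0:
        raise ValueError("version must be non-negative")
    return os.path.join(base_dir, "%08d" % version)


print(version_dir("/tmp/models", 1))
```

Exporting again with a higher version number would then land in a sibling directory, which is how a model server can tell versions apart.
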
!ls -lhR /tmp/models
/tmp/models:
total 12K
drwxr-xr-x 2 root root 4.0K Mar 10 10:29 00000001
-rw-r--r-- 1 root root 7.6K Mar 10 10:29 model.log

/tmp/models/00000001:

!tail -n2 /tmp/models/model.log
2017-03-10 10:29:49.461339: I tensorflow_serving/core/loader_harness.cc:86] Successfully loaded servable version {name: default version: 1}
2017-03-10 10:29:49.464518: I tensorflow_serving/model_servers/main.cc:257] Running ModelServer at 0.0.0.0:9000 ...
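
The two log lines above are what to look for when checking that the server is up: one reports that a servable version was loaded, the other gives the address the ModelServer listens on. Below is a minimal sketch of extracting both from the log text with regular expressions. `parse_model_log` is a hypothetical helper, and the patterns are based only on the two lines shown here, so other TensorFlow Serving versions may word them differently.

```python
import re

LOADED_RE = re.compile(
    r"Successfully loaded servable version \{name: (\S+) version: (\d+)\}")
RUNNING_RE = re.compile(r"Running ModelServer at (\S+):(\d+)")


def parse_model_log(text):
    """Return (loaded, address).

    loaded is a list of (name, version) pairs; address is (host, port),
    or None if the "Running ModelServer" line is absent.
    """
    loaded = [(m.group(1), int(m.group(2))) for m in LOADED_RE.finditer(text)]
    m = RUNNING_RE.search(text)
    address = (m.group(1), int(m.group(2))) if m else None
    return loaded, address


# The two log lines from the tutorial, used as sample input.
sample = (
    "2017-03-10 10:29:49.461339: I tensorflow_serving/core/loader_harness.cc:86] "
    "Successfully loaded servable version {name: default version: 1}\n"
    "2017-03-10 10:29:49.464518: I tensorflow_serving/model_servers/main.cc:257] "
    "Running ModelServer at 0.0.0.0:9000 ...\n"
)
print(parse_model_log(sample))
```
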
import numpy as np
import cPickle as pickle
import requests
def test_flask_client(x):
    URL = "http://localhost:8915/model_prediction"
    s = pickle.dumps({"x":x}, protocol=0)
    DATA = {"model_name": "default",
%matplotlib inline
import matplotlib.pyplot as plt
from tensorflow.examples.tutorials.mnist import input_data

mnist = input_data.read_data_sets('MNIST_data', one_hot=True)
number = mnist.train.next_batch(1)[0]

plt.imshow(number.reshape(28,28))
test_flask_client(number)
{u'outputs': {u'pred': {u'dtype': u'DT_INT64',
   u'int64Val': [u'7'],
   u'tensorShape': {u'dim': [{u'size': u'1'}]}}}}
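
The response is a JSON rendering of a tensor: the values arrive as strings under `int64Val`, and `tensorShape` gives the dimensions. Below is a short sketch of pulling the predicted digit out as a Python integer. `extract_prediction` is a hypothetical helper, and the field names are taken from the response shown above.

```python
def extract_prediction(response):
    """Return the predicted digits as a list of ints from a response dict
    shaped like the one above (outputs -> pred -> int64Val)."""
    pred = response["outputs"]["pred"]
    if pred.get("dtype") != "DT_INT64":
        raise ValueError("unexpected dtype: %r" % pred.get("dtype"))
    values = [int(v) for v in pred["int64Val"]]
    # Sanity check: the number of values should match the declared shape.
    expected = 1
    for dim in pred["tensorShape"]["dim"]:
        expected *= int(dim["size"])
    if len(values) != expected:
        raise ValueError("shape mismatch: %d values, expected %d"
                         % (len(values), expected))
    return values


response = {"outputs": {"pred": {"dtype": "DT_INT64",
                                 "int64Val": ["7"],
                                 "tensorShape": {"dim": [{"size": "1"}]}}}}
print(extract_prediction(response))  # [7]
```
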