Introduction to RESTful API with Tensorflow Serving

Yu Ishikawa
4 min read · Jun 16, 2018


I previously described how to serve trained TensorFlow models in Serving Pre-Modeled and Custom Tensorflow Estimator with Tensorflow Serving. That article explained how to build TensorFlow models with estimators and how to serve them with TensorFlow Serving and Docker. As of version 1.8, TensorFlow Serving supports a RESTful API in addition to the gRPC API, so this time I would like to describe how to serve RESTful APIs with TensorFlow Serving.
In this article, I will walk through the RESTful API feature hands-on. The goal is to serve an iris classifier over an HTTP/REST API with TensorFlow Serving. There are three main steps:

  • Create a TensorFlow model for the iris data with a TensorFlow estimator.
  • Launch the TensorFlow model server with the HTTP/REST API on Docker.
  • Send requests to the HTTP/REST API.

To support the RESTful API, I extended the repository that I used earlier to explain the gRPC API with TensorFlow Serving.

Train Model

First of all, we will train a model to classify the iris data. I have already explained how to train a TensorFlow model for TensorFlow Serving, so here I would like to focus on building the RESTful API.

Designing the input is the most important step. As you probably know, the iris data has four features: sepal length, sepal width, petal length and petal width. These four features should be the input items of the RESTful API. Its serving_input_receiver_fn can look like the code below. We define the input items with receiver_tensors; that is, the inputs are sepal_length, sepal_width, petal_length and petal_width, and each feature is a list of float values. To keep the format of the features consistent with the training data, the four features are concatenated before being fed into the TensorFlow graph. The full code is in the repository mentioned above.

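def serving_input_receiver_fn():
    # Assumes `import tensorflow as tf` and the INPUT_FEATURE constant from the full script.
    receiver_tensors = {
        'sepal_length': tf.placeholder(tf.float32, [None, 1]),
        'sepal_width': tf.placeholder(tf.float32, [None, 1]),
        'petal_length': tf.placeholder(tf.float32, [None, 1]),
        'petal_width': tf.placeholder(tf.float32, [None, 1]),
    }
    # Convert the given inputs into the single [None, 4] tensor the model expects.
    features = {
        INPUT_FEATURE: tf.concat([
            receiver_tensors['sepal_length'],
            receiver_tensors['sepal_width'],
            receiver_tensors['petal_length'],
            receiver_tensors['petal_width'],
        ], axis=1)
    }
    return tf.estimator.export.ServingInputReceiver(
        receiver_tensors=receiver_tensors, features=features)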
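
To make the concatenation step concrete, here is a minimal pure-Python sketch of what joining four [N, 1] inputs along axis 1 produces. The helper name concat_columns and the sample values are assumptions for illustration only; in the actual graph this work is done by tf.concat.

```python
def concat_columns(columns):
    """Join [N, 1]-shaped nested lists side by side into [N, len(columns)].

    Illustrative stand-in for tf.concat(columns, axis=1); not the author's code.
    """
    n_rows = len(columns[0])
    if any(len(col) != n_rows for col in columns):
        raise ValueError("every feature must have the same number of rows")
    return [[value for col in columns for value in col[i]] for i in range(n_rows)]


# A batch of two flowers, one [N, 1] list per feature (sample values).
sepal_length = [[6.8], [5.1]]
sepal_width = [[3.2], [3.5]]
petal_length = [[5.9], [1.4]]
petal_width = [[2.3], [0.2]]

rows = concat_columns([sepal_length, sepal_width, petal_length, petal_width])
print(rows)  # [[6.8, 3.2, 5.9, 2.3], [5.1, 3.5, 1.4, 0.2]]
```

Each output row has the same four-column layout as a row of the training data, which is why the separate inputs can be fed to a model trained on a single feature tensor.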

The following command trains the model. The checkpoints and exported models are created under ./models/iris_premodeled_estimator. The model for TensorFlow Serving is exported to a directory whose path looks like ./models/iris_premodeled_estimator/export/models/1529121297.

python python/train/ \
--steps 100 \
--model_dir ./models/iris_premodeled_estimator/
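
Because each export lands in a directory named after a timestamp, it can be handy to locate the newest one programmatically. The sketch below is one possible way to do that; the helper name latest_export_dir is hypothetical, and the demo uses a temporary directory as a stand-in for the real export directory.

```python
import os
import tempfile


def latest_export_dir(export_root):
    """Return the path of the newest timestamp-named export under export_root."""
    versions = [
        name for name in os.listdir(export_root)
        if name.isdigit() and os.path.isdir(os.path.join(export_root, name))
    ]
    if not versions:
        raise FileNotFoundError("no exported model found under " + export_root)
    # Compare numerically so the most recent timestamp wins.
    return os.path.join(export_root, max(versions, key=int))


# Self-contained demo: a throwaway directory stands in for the export
# directory, populated with the two timestamps that appear in this article.
with tempfile.TemporaryDirectory() as demo_root:
    for stamp in ("1529109907", "1529121297"):
        os.makedirs(os.path.join(demo_root, stamp))
    newest = os.path.basename(latest_export_dir(demo_root))
    print(newest)  # 1529121297
```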

As I described before when showing the signatures of a trained TensorFlow model, saved_model_cli lets us inspect the specification of the API. The result should look like the output below.

python tensorflow/python/tools/ show \
--dir ./models/iris_premodeled_estimator/ckpt/export/models/1529109907/ \
The given SavedModel SignatureDef contains the following input(s):
inputs['petal_length'] tensor_info:
dtype: DT_FLOAT
shape: (-1, 1)
name: Placeholder_2:0
inputs['petal_width'] tensor_info:
dtype: DT_FLOAT
shape: (-1, 1)
name: Placeholder_3:0
inputs['sepal_length'] tensor_info:
dtype: DT_FLOAT
shape: (-1, 1)
name: Placeholder:0
inputs['sepal_width'] tensor_info:
dtype: DT_FLOAT
shape: (-1, 1)
name: Placeholder_1:0
The given SavedModel SignatureDef contains the following output(s):
outputs['class_ids'] tensor_info:
dtype: DT_INT64
shape: (-1, 1)
name: dnn/head/predictions/ExpandDims:0
outputs['classes'] tensor_info:
dtype: DT_STRING
shape: (-1, 1)
name: dnn/head/predictions/str_classes:0
outputs['logits'] tensor_info:
dtype: DT_FLOAT
shape: (-1, 3)
name: dnn/logits/BiasAdd:0
outputs['probabilities'] tensor_info:
dtype: DT_FLOAT
shape: (-1, 3)
name: dnn/head/predictions/probabilities:0
Method name is: tensorflow/serving/predict

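That is, each instance in the HTTP/REST request body carries the four inputs sepal_length, sepal_width, petal_length and petal_width, each as a list holding one float value.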
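
As a sketch of how such a request body could be assembled, the following Python builds the JSON from plain tuples. The top-level keys signature_name and instances follow the curl example later in this article; the helper name build_predict_body and the FEATURE_NAMES constant are assumptions introduced for illustration.

```python
import json

# Input names taken from the signature shown above.
FEATURE_NAMES = ("sepal_length", "sepal_width", "petal_length", "petal_width")


def build_predict_body(flowers, signature_name="predict"):
    """Build a predict request body with one instance per flower.

    flowers: iterable of 4-tuples in FEATURE_NAMES order.
    Each value is wrapped in a list to match the (-1, 1) input shape.
    """
    instances = []
    for flower in flowers:
        if len(flower) != len(FEATURE_NAMES):
            raise ValueError("expected %d values per flower" % len(FEATURE_NAMES))
        instances.append(
            {name: [float(value)] for name, value in zip(FEATURE_NAMES, flower)}
        )
    return json.dumps({"signature_name": signature_name, "instances": instances})


body = build_predict_body([(6.8, 3.2, 5.9, 2.3)])
print(body)
```

Passing several tuples produces several instances in one request, so a batch of flowers can be classified with a single call.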


Offer RESTful API with Tensorflow Serving

Just as we offered the gRPC server with Docker, we will offer the HTTP/REST API with Docker as well. The only change needed in the Dockerfile is to add an option to tensorflow_model_server. TensorFlow Serving 1.8 supports --rest_api_port, which specifies the port number of the REST API. The default value is 0, which means the REST API is not offered by default. The Dockerfile is here.

CMD tensorflow_model_server \
--port=8500 \
--rest_api_port=8501 \
--model_name="$MODEL_NAME" \

After building a Docker image with the Dockerfile, we prepare the model to be served. First, we make a directory to share between the local machine and a Docker container. Next, we choose a trained model to serve and copy it to that directory.

# Make a directory to store the served model.
mkdir -p ./model_for_serving/iris/1
# Copy the model files to the appropriate directory.
cp -R ./models/iris_premodeled_estimator/export/models/1529121297/ \
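
TensorFlow Serving looks for numbered version subdirectories under the model's base path, which is why the directory above ends in iris/1. The same staging step could be sketched in Python as follows. The helper name stage_model is hypothetical, and the demo copies a placeholder file in a temporary directory rather than a real exported model.

```python
import os
import shutil
import tempfile


def stage_model(export_dir, serving_root, model_name, version):
    """Copy an exported SavedModel to serving_root/model_name/version."""
    target = os.path.join(serving_root, model_name, str(version))
    if os.path.exists(target):
        raise FileExistsError(target + " already exists")
    # copytree creates the version directory itself, so only its parent is made here.
    os.makedirs(os.path.dirname(target), exist_ok=True)
    shutil.copytree(export_dir, target)
    return target


# Demo with temporary directories and a placeholder file instead of a real export.
with tempfile.TemporaryDirectory() as tmp:
    export_dir = os.path.join(tmp, "export", "1529121297")
    os.makedirs(os.path.join(export_dir, "variables"))
    with open(os.path.join(export_dir, "saved_model.pb"), "wb") as f:
        f.write(b"placeholder")
    serving_root = os.path.join(tmp, "model_for_serving")
    target = stage_model(export_dir, serving_root, "iris", 1)
    staged = sorted(os.listdir(target))
    relative_target = os.path.relpath(target, serving_root)
    print(relative_target, staged)
```

Unlike the shell commands, this sketch refuses to overwrite an existing version directory, so a new model would be staged under the next version number instead.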

Now we are ready to offer not only the gRPC API but also the HTTP/REST API with Docker. The following command is an example of running a Docker container while sharing the model directory between the local machine and the container. Here, MODEL_NAME sets the model name and MODEL_PATH specifies the path to the trained models.

# Port 8500 serves gRPC and port 8501 serves REST.
docker run --rm -v /PATH/TO/models_for_serving:/models \
-e MODEL_NAME=iris_premodeled_estimator \
-e MODEL_PATH=/models/iris_premodeled_estimator \
-p 8500:8500 \
-p 8501:8501 \
--name tensorflow-serving-example \

Request to RESTful API with Tensorflow Serving

You can find the RESTful API specification of TensorFlow Serving in the official documentation. TensorFlow ModelServer running on host:port accepts the following REST API requests:

POST http://host:port/<URI>:<VERB>

URI: /v1/models/${MODEL_NAME}[/versions/${MODEL_VERSION}]
VERB: classify|regress|predict
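
To illustrate how the URI and VERB parts combine, here is a small Python sketch that assembles the endpoint URL from its pieces. The function name endpoint_url and the localhost host are assumptions for illustration; only the URL pattern itself comes from the specification above.

```python
def endpoint_url(host, port, model_name, version=None, verb="predict"):
    """Build http://host:port/v1/models/<name>[/versions/<version>]:<verb>."""
    if verb not in ("classify", "regress", "predict"):
        raise ValueError("unsupported verb: " + verb)
    uri = "/v1/models/" + model_name
    if version is not None:
        # The version segment is optional in the URL pattern.
        uri += "/versions/" + str(version)
    return "http://%s:%s%s:%s" % (host, port, uri, verb)


print(endpoint_url("localhost", 8501, "iris", version=1))
# http://localhost:8501/v1/models/iris/versions/1:predict
print(endpoint_url("localhost", 8501, "iris", verb="classify"))
# http://localhost:8501/v1/models/iris:classify
```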

We have launched the Docker container that offers the RESTful API for classifying iris data. Now let’s send a request to the HTTP/REST API with curl. Here the model name is iris and the version is 1, and the JSON object is passed with the -d option. As a result, we get the class IDs, the probabilities for each class, and so on.

curl -X POST \
http://${DOCKER_HOST}:8501/v1/models/iris/versions/1:predict \
-d '{"signature_name":"predict","instances":[{"sepal_length":[6.8],"sepal_width":[3.2],"petal_length":[5.9],"petal_width":[2.3]}]}'
{
    "predictions": [
        {
            "class_ids": [2],
            "probabilities": [9.6812e-06, 0.155927, 0.844063],
            "classes": ["2"],
            "logits": [-4.74621, 4.94074, 6.62958]
        }
    ]
}

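The same call can also be made from Python. The sketch below uses only the standard library: it builds the POST request without sending it, and then parses a response like the one above. The function names and the localhost host are assumptions for illustration, and the sample response text assumes the usual object wrapping around the fields shown in the curl output.

```python
import json
import urllib.request


def make_predict_request(host, instances, model="iris", version=1, port=8501):
    """Build (but do not send) the POST request that the curl call issues."""
    url = "http://%s:%d/v1/models/%s/versions/%d:predict" % (host, port, model, version)
    body = json.dumps({"signature_name": "predict", "instances": instances})
    return urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def top_classes(response_text):
    """Return (class_id, probability) for each prediction in a response body."""
    results = []
    for prediction in json.loads(response_text)["predictions"]:
        class_id = prediction["class_ids"][0]
        results.append((class_id, prediction["probabilities"][class_id]))
    return results


request = make_predict_request(
    "localhost",  # assumed host; the curl example uses ${DOCKER_HOST}
    [{"sepal_length": [6.8], "sepal_width": [3.2],
      "petal_length": [5.9], "petal_width": [2.3]}],
)
# Against a running server:
#   response_text = urllib.request.urlopen(request).read().decode("utf-8")

# Parse the values from the curl output without contacting a server.
sample_response = """
{"predictions": [{"class_ids": [2],
                  "probabilities": [9.6812e-06, 0.155927, 0.844063],
                  "classes": ["2"],
                  "logits": [-4.74621, 4.94074, 6.62958]}]}
"""
print(top_classes(sample_response))  # [(2, 0.844063)]
```

Keeping request construction separate from sending makes the request easy to inspect or test before a server is available.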

Thanks to TensorFlow Serving, once we have a good TensorFlow model, we can expose it automatically through not only a gRPC API but also a RESTful API. That is, we machine learning practitioners can focus more on building models.

If you would like to get more familiar with it, the official documentation describes the JSON request and response bodies.


