Machine learning as a microservice in Python

Yu Ishikawa
5 min read · Dec 12, 2017

As you know, microservices are a hot topic these days. We machine learning engineers should follow the trend and provide machine learning as a microservice behind an API. In this article, I describe a minimal structure for machine learning as a microservice with gRPC, Python, and Docker.

Step 1: Train and persist a machine learning model

First of all, we must train a machine learning model to put in the microservice. Here, we build a classification model for the iris dataset with scikit-learn. Since we will put the saved model into a Docker image, we persist the classifier to disk. We simply use LinearSVC to predict the iris species from the given features.

from sklearn import datasets
from sklearn import svm
from sklearn.externals import joblib
# load iris dataset
iris = datasets.load_iris()
X, y = iris.data, iris.target
# train model
clf = svm.LinearSVC()
clf.fit(X, y)
# persistent model
joblib.dump(clf, 'iris_model.pickle')
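
Before wiring the model into a service, it is worth a quick sanity check that the persisted file round-trips. A minimal snippet (the feature values are just an example):

from sklearn.externals import joblib

# Reload the persisted model and classify one sample;
# class 0 corresponds to setosa in the iris dataset.
clf = joblib.load('iris_model.pickle')
print(clf.predict([[5.0, 3.6, 1.3, 0.25]]))  # expected: [0]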

Step 2: Define the protocol buffer

gRPC is a modern, open source, high-performance RPC framework that can run in any environment. It can efficiently connect services in and across data centers with pluggable support for load balancing, tracing, health checking, and authentication. It is also applicable in the last mile of distributed computing to connect devices, mobile applications, and browsers to backend services.

The service receives an IrisPredictRequest that includes properties for sepal length, sepal width, petal length, and petal width. The response is composed of the predicted species. As you can imagine, it would also be useful to put a probability in the response if you need one. We save the definition below to iris.proto.

syntax = "proto3";option java_multiple_files = true;
option java_package = "io.grpc.examples.ml";
option java_outer_classname = "IrisProto";
option objc_class_prefix = "HLW";
package ml;service IrisPredictor {
rpc PredictIrisSpecies (IrisPredictRequest) returns (IrisPredictReply) {}
}
message IrisPredictRequest {
double sepal_length = 1;
double sepal_width = 2;
double petal_length = 3;
double petal_width = 4;
}
message IrisPredictReply {
int32 species = 1;
}
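
If you do want a probability in the response, a sketch of an extended reply message could look like the one below. Note this is an assumption on my part, not part of the article's service: LinearSVC does not expose predict_proba, so actually serving this field would require a probabilistic model such as LogisticRegression or SVC(probability=True).

message IrisPredictReply {
  int32 species = 1;
  double probability = 2;  // hypothetical field; needs a model with predict_proba
}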

Step 3: Generate Python code for gRPC

We have defined the protocol buffer for the microservice. Now we generate the Python code from the definitions. To do that, we write the script below and save it to codegen.py. Executing python codegen.py generates iris_pb2.py and iris_pb2_grpc.py. Of course, we can also generate them with a shell command.

from grpc_tools import protoc

# Invoke protoc programmatically to generate iris_pb2.py and iris_pb2_grpc.py.
protoc.main(
    (
        '',
        '-I.',
        '--python_out=.',
        '--grpc_python_out=.',
        './iris.proto',
    )
)
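
For reference, the shell equivalent uses the same grpcio-tools package:

python -m grpc_tools.protoc -I. --python_out=. --grpc_python_out=. ./iris.proto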

Step 4: Implement the gRPC server in Python

We have generated the Python code for gRPC, so let's implement the server side. In Step 1 we trained a classification model and saved it to a pickle file; the server loads that saved model. To avoid the overhead of loading the trained model on every request, we use a singleton-style get_or_create_model classmethod. The request handler then simply calls the model's predict method. Finally, we expose the service on port 50052.

import os
import time
from concurrent import futures

import grpc
from sklearn.externals import joblib

import iris_pb2
import iris_pb2_grpc

_ONE_DAY_IN_SECONDS = 60 * 60 * 24


class IrisPredictor(iris_pb2_grpc.IrisPredictorServicer):

    _model = None

    @classmethod
    def get_or_create_model(cls):
        """Get or create the iris classification model."""
        if cls._model is None:
            path = os.path.join(
                os.path.dirname(os.path.abspath(__file__)),
                'model', 'iris_model.pickle')
            cls._model = joblib.load(path)
        return cls._model

    def PredictIrisSpecies(self, request, context):
        model = self.__class__.get_or_create_model()
        result = model.predict([[
            request.sepal_length,
            request.sepal_width,
            request.petal_length,
            request.petal_width,
        ]])
        # Cast the numpy integer to a plain int for the protobuf field.
        return iris_pb2.IrisPredictReply(species=int(result[0]))


def serve():
    server = grpc.server(futures.ThreadPoolExecutor(max_workers=10))
    iris_pb2_grpc.add_IrisPredictorServicer_to_server(IrisPredictor(), server)
    server.add_insecure_port('[::]:50052')
    server.start()
    try:
        while True:
            time.sleep(_ONE_DAY_IN_SECONDS)
    except KeyboardInterrupt:
        server.stop(0)


if __name__ == '__main__':
    serve()
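
As a quick smoke test, you can also call the servicer directly without starting the network server. This sketch assumes the server code above is saved as grpc_server.py (as the Dockerfile in Step 6 expects) and that the pickle sits at model/iris_model.pickle:

import iris_pb2
from grpc_server import IrisPredictor

request = iris_pb2.IrisPredictRequest(
    sepal_length=5.0, sepal_width=3.6, petal_length=1.3, petal_width=0.25)
# The context argument is unused by our handler, so None is fine here.
reply = IrisPredictor().PredictIrisSpecies(request, None)
print(reply.species)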

Step 5: Implement the gRPC client

Since this is an example of serving a machine learning model as a microservice, we also implement a client to call the gRPC API. We build a request object with iris_pb2.IrisPredictRequest. Here, we use fixed feature values as an example.

from __future__ import print_function

import argparse

import grpc

import iris_pb2
import iris_pb2_grpc


def run(host, port):
    channel = grpc.insecure_channel('%s:%d' % (host, port))
    stub = iris_pb2_grpc.IrisPredictorStub(channel)
    request = iris_pb2.IrisPredictRequest(
        sepal_length=5.0,
        sepal_width=3.6,
        petal_length=1.3,
        petal_width=0.25,
    )
    response = stub.PredictIrisSpecies(request)
    print("Predicted species number: " + str(response.species))


if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('--host', help='host name', default='localhost', type=str)
    parser.add_argument('--port', help='port number', default=50052, type=int)
    args = parser.parse_args()
    run(args.host, args.port)
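
If the server from Step 4 is running locally, invoking the client should print the predicted class index; for the fixed feature values above, a model trained on the full iris dataset will typically return 0 (setosa):

python iris_client.py --host localhost --port 50052
Predicted species number: 0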

Step 6: Build a Docker image

We have the trained model and the gRPC server and client, and now we put them into a Docker image. We build the image with the command docker build -t iris-predictor . (note the trailing dot). Since the server listens on port 50052, we must expose the same port number in the Docker image.

FROM ubuntu:16.04

WORKDIR /root

# Install build dependencies
RUN apt-get update \
  && apt-get install -y --no-install-recommends \
    build-essential \
    curl \
    pkg-config \
    rsync \
    software-properties-common \
    unzip \
    git \
  && apt-get clean \
  && rm -rf /var/lib/apt/lists/*

# Install miniconda
RUN curl -LO http://repo.continuum.io/miniconda/Miniconda-latest-Linux-x86_64.sh \
  && bash Miniconda-latest-Linux-x86_64.sh -p /miniconda -b \
  && rm Miniconda-latest-Linux-x86_64.sh
ENV PATH /miniconda/bin:$PATH

# Create a conda environment
ENV CONDA_ENV_NAME iris-predictor
COPY environment.yml ./environment.yml
RUN conda env create -f environment.yml -n $CONDA_ENV_NAME
ENV PATH /miniconda/envs/${CONDA_ENV_NAME}/bin:$PATH

# Clean up tarballs and downloaded package files
RUN conda clean -tp -y \
  && apt-get clean \
  && rm -rf /var/lib/apt/lists/*

EXPOSE 50052

COPY . /root/
CMD ["python", "grpc_server.py"]

Step 7: Run a Docker container

Congratulations! We are now ready to serve the machine learning gRPC server on Docker. We run a container with the commands below and then check the server with the client we made. Replace 192.168.99.100 with your Docker host's IP address; it is the default Docker Machine IP, and localhost works if you run Docker natively.

# run a docker container
docker run --rm -d -p 50052:50052 --name iris-predictor iris-predictor
# run a client
python iris_client.py --host 192.168.99.100 --port 50052

Summary

I explained the basics of implementing machine learning as a microservice with gRPC and Docker. I am sure that serving machine learning models as microservices will only become more important. By expanding on what I described here, you can adapt this minimal structure to more practical, production-ready versions.
