Machine learning as a microservice in Python
As you know, microservices are a hot topic these days. As machine learning engineers, we should follow the trend and provide an API for machine learning as a microservice. In this article, I describe a minimal structure for serving machine learning as a microservice with gRPC in Python and Docker.
Step 1: Train and persist a machine learning model
First of all, we must train a machine learning model to put in the microservice. Here, we build a classification model for the iris dataset with scikit-learn. Since the saved model will be baked into a Docker image, we persist the trained classifier to disk. We simply use LinearSVC to predict the iris species from the given features.
from sklearn import datasets
from sklearn import svm
# sklearn.externals.joblib was removed in scikit-learn 0.23; use the standalone joblib package
import joblib

# load the iris dataset
iris = datasets.load_iris()
X, y = iris.data, iris.target

# train the model
clf = svm.LinearSVC()
clf.fit(X, y)

# persist the trained model to disk
joblib.dump(clf, 'iris_model.pickle')
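As a quick sanity check (optional, and not part of the original flow), we can reload the pickle and predict on a sample; the measurements below are a hypothetical setosa-like flower, the same values the client will send later.
import joblib

# reload the persisted model and verify it still predicts
clf = joblib.load('iris_model.pickle')
print(clf.predict([[5.0, 3.6, 1.3, 0.25]]))  # expected: [0], i.e. setosa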
Step 2: Define the protocol buffer
gRPC is a modern open source high performance RPC framework that can run in any environment. It can efficiently connect services in and across data centers with pluggable support for load balancing, tracing, health checking and authentication. It is also applicable in the last mile of distributed computing to connect devices, mobile applications and browsers to backend services.
The service receives an IrisPredictRequest that includes properties for sepal length, sepal width, petal length and petal width. The response contains the predicted species. As you can imagine, you could also add a probability field to the response if you need one. We save the following definition to iris.proto.
syntax = "proto3";option java_multiple_files = true;
option java_package = "io.grpc.examples.ml";
option java_outer_classname = "IrisProto";
option objc_class_prefix = "HLW";package ml;service IrisPredictor {
rpc PredictIrisSpecies (IrisPredictRequest) returns (IrisPredictReply) {}
}message IrisPredictRequest {
double sepal_length = 1;
double sepal_width = 2;
double petal_length = 3;
double petal_width = 4;
}message IrisPredictReply {
int32 species = 1;
}
Step 3: Generate Python code for gRPC
We have defined the protocol buffer for the microservice. Now we generate the Python code from that definition. To do that, we write the script below and save it as codegen.py. Running python codegen.py generates iris_pb2.py and iris_pb2_grpc.py. Of course, we can also generate them with a shell command, as shown after the snippet.
# the protoc module is provided by the grpcio-tools package
from grpc_tools import protoc

protoc.main(
    (
        '',
        '-I.',
        '--python_out=.',
        '--grpc_python_out=.',
        './iris.proto',
    )
)
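The equivalent shell command (assuming grpcio-tools is installed in the current environment) is:
python -m grpc_tools.protoc -I. --python_out=. --grpc_python_out=. ./iris.proto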
Step 4: Implement the gRPC server in Python
We have generated the Python code for gRPC. Now, let's implement the server. We trained a classification model and saved it as a pickle file; the server loads that saved model. To avoid the overhead of reloading the trained model on every request, we cache it in a singleton-style classmethod, get_or_create_model. The request handler then simply calls predict on the trained model. Finally, we expose the service on port 50052.
import os
import time
from concurrent import futures

# sklearn.externals.joblib was removed in scikit-learn 0.23; use the standalone joblib package
import joblib
import grpc

import iris_pb2
import iris_pb2_grpc

_ONE_DAY_IN_SECONDS = 60 * 60 * 24


class IrisPredictor(iris_pb2_grpc.IrisPredictorServicer):

    _model = None

    @classmethod
    def get_or_create_model(cls):
        """
        Get or create the iris classification model.
        """
        if cls._model is None:
            path = os.path.join(os.path.dirname(os.path.abspath(__file__)), 'model', 'iris_model.pickle')
            cls._model = joblib.load(path)
        return cls._model

    def PredictIrisSpecies(self, request, context):
        model = self.__class__.get_or_create_model()
        features = [[request.sepal_length, request.sepal_width,
                     request.petal_length, request.petal_width]]
        result = model.predict(features)
        # cast the numpy integer to a plain int for the protobuf field
        return iris_pb2.IrisPredictReply(species=int(result[0]))


def serve():
    server = grpc.server(futures.ThreadPoolExecutor(max_workers=10))
    iris_pb2_grpc.add_IrisPredictorServicer_to_server(IrisPredictor(), server)
    server.add_insecure_port('[::]:50052')
    server.start()
    try:
        while True:
            time.sleep(_ONE_DAY_IN_SECONDS)
    except KeyboardInterrupt:
        server.stop(0)


if __name__ == '__main__':
    serve()
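Before containerizing, you can run the server locally to make sure everything is wired up. This assumes grpcio, scikit-learn and joblib are installed, and that the pickle sits in a model/ subdirectory next to the script:
# move the pickle where the server expects it, then start the server
mkdir -p model && mv iris_model.pickle model/
python grpc_server.py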
Step 5: Implement the gRPC client
Since this is an example of serving a machine learning model as a microservice, we also implement a client to call the gRPC API. We build a request object with iris_pb2.IrisPredictRequest. Here, we use fixed feature values as an example.
from __future__ import print_function
import argparse

import grpc

import iris_pb2
import iris_pb2_grpc


def run(host, port):
    channel = grpc.insecure_channel('%s:%d' % (host, port))
    stub = iris_pb2_grpc.IrisPredictorStub(channel)
    request = iris_pb2.IrisPredictRequest(
        sepal_length=5.0,
        sepal_width=3.6,
        petal_length=1.3,
        petal_width=0.25
    )
    response = stub.PredictIrisSpecies(request)
    print("Predicted species number: " + str(response.species))


if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('--host', help='host name', default='localhost', type=str)
    parser.add_argument('--port', help='port number', default=50052, type=int)
    args = parser.parse_args()
    run(args.host, args.port)
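With the server running locally, the client should print something like the output below; the hard-coded measurements are typical of Iris setosa, which load_iris encodes as class 0.
$ python iris_client.py
Predicted species number: 0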
Step 6: Build a Docker image
We have implemented the trained model and the gRPC server and client. Now we put them into a Docker image using the Dockerfile below, and build the image with docker build . -t iris-predictor. Since the server listens on port 50052, we must expose the same port in the image.
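For the COPY . /root/ step in the Dockerfile to work, the build context should look roughly like this (a hypothetical layout inferred from the code above):
.
|-- Dockerfile
|-- environment.yml
|-- grpc_server.py
|-- iris_pb2.py
|-- iris_pb2_grpc.py
`-- model/
    `-- iris_model.pickle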
FROM ubuntu:16.04

WORKDIR /root

# Pick up some build dependencies
RUN apt-get update \
    && apt-get install -y --no-install-recommends \
    build-essential \
    curl \
    pkg-config \
    rsync \
    software-properties-common \
    unzip \
    git \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*

# Install miniconda
RUN curl -LO http://repo.continuum.io/miniconda/Miniconda-latest-Linux-x86_64.sh \
    && bash Miniconda-latest-Linux-x86_64.sh -p /miniconda -b \
    && rm Miniconda-latest-Linux-x86_64.sh
ENV PATH /miniconda/bin:$PATH

# Create a conda environment
ENV CONDA_ENV_NAME iris-predictor
COPY environment.yml ./environment.yml
RUN conda env create -f environment.yml -n $CONDA_ENV_NAME
ENV PATH /miniconda/envs/${CONDA_ENV_NAME}/bin:$PATH

# Clean up tarballs and downloaded package files
RUN conda clean -tp -y \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*

EXPOSE 50052

COPY . /root/
CMD ["python", "grpc_server.py"]
Step 7: Run a Docker container
Congratulations! We are now ready to serve the machine learning gRPC server on Docker. We run a Docker container with the commands below, then check the server with the client we made. Note that 192.168.99.100 is a typical Docker Machine IP; if you run Docker locally, use localhost instead.
# run a docker container
docker run --rm -d -p 50052:50052 --name iris-predictor iris-predictor

# run a client
python iris_client.py --host 192.168.99.100 --port 50052
Summary
I explained the basics of implementing machine learning as a microservice with gRPC and Docker. I am sure that offering machine learning as a microservice will become even more important. By expanding on what I described here, you can adapt it to more practical, production-ready versions.