Tutorial: Accelerate AI at Edge with ONNX Runtime and Intel Neural Compute Stick 2


In the previous parts of this series, we explored the ONNX model format and runtime. In this final tutorial, I will walk you through the steps of accelerating an ONNX model on an edge device powered by the Intel Movidius Neural Compute Stick (NCS) 2 and the Intel Distribution of OpenVINO Toolkit. We will first run the Tiny YOLO v2 model on a desktop CPU and then on an edge device with almost no change to the code.
Quick Recap — ONNX Runtime
Apart from bringing interoperability across deep learning frameworks, ONNX promises optimized execution of the neural network graph depending on the available hardware. ONNX Runtime abstracts various hardware architectures such as AMD64 CPU, ARM64 CPU, GPU, FPGA, and VPU.
For example, the same ONNX model can deliver better inference performance when it is run against a GPU backend, without any changes to the model itself. This is possible due to the plugin model of ONNX Runtime, which supports multiple execution providers.

A hint provided to ONNX Runtime just before creating the inference session translates to a considerable performance boost.
The code snippet below is an example of such a hint, asking ONNX Runtime to use an Intel Integrated Graphics backend.

import onnxruntime as rt
# Device hint; "GPU_FP32" is assumed here as the OpenVINO identifier for integrated graphics.
rt.capi._pybind_state.set_openvino_device("GPU_FP32")
sess = rt.InferenceSession('TinyYOLO.onnx')

When the same model is used in a smart camera powered by an Intel NCS device, the backend can be changed to target the MYRIAD Vision Processing Unit (VPU).

rt.capi._pybind_state.set_openvino_device("MYRIAD_FP16")

In the sections below, we will build a simple object detection system based on the popular Tiny YOLO v2 model. We will first run it on a PC against a CPU backend before moving it to the edge device with a VPU.
To finish this tutorial, you need the following:

PC/Mac with Python 3.x and a webcam
Ubuntu 18.04 machine with a webcam
Docker CE installed on Ubuntu
Intel NCS 2
Up Squared AI Vision X Kit with a USB camera (optional)

Setting up the Environment
Start by creating a Python virtual environment for the project.

python -m venv demoenv
source demoenv/bin/activate

Create a requirements.txt file with the required Python modules.

onnxruntime
opencv-python

Since the model detects 20 object classes, create a file called labels.txt with the labels below:

aeroplane,bicycle,bird,boat,bottle,bus,car,cat,chair,cow,diningtable,dog,horse,motorbike,person,pottedplant,sheep,sofa,train,tvmonitor

Finally, download the Tiny YOLO v2 model from the ONNX Model Zoo and save it as TinyYOLO.onnx in the same directory.
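The class index returned by the model is only meaningful when it lines up with the order of the entries in labels.txt, so a quick sanity check on the file can save debugging time later. The following is a minimal sketch; parse_labels is a hypothetical helper that is not part of the original program, and the expected count of 20 comes from the label list above.

```python
def parse_labels(text, expected=20):
    """Split a comma-separated label string and verify the class count.

    parse_labels is a hypothetical helper used for illustration only.
    """
    labels = [name.strip() for name in text.split(",") if name.strip()]
    if len(labels) != expected:
        raise ValueError(f"expected {expected} labels, found {len(labels)}")
    return labels


VOC_LABELS = ("aeroplane,bicycle,bird,boat,bottle,bus,car,cat,chair,cow,"
              "diningtable,dog,horse,motorbike,person,pottedplant,sheep,"
              "sofa,train,tvmonitor")

# A trailing newline, as most editors add, is tolerated.
labels = parse_labels(VOC_LABELS + "\n")
print(len(labels), labels[14])
```

In infer.py the same check could be applied to the contents of labels.txt right after the file is read.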
Object Detection with Tiny YOLO V2 on Desktop
We are now ready to code the inference program based on Tiny YOLO v2 and ONNX Runtime. Create a file, infer.py, with the code below:

import cv2
import numpy as np
import onnxruntime as rt

# Index of the webcam to open; change it if several cameras are attached.
cam = 0
# Minimum score for a detection to be reported (illustrative value; tune as needed).
conf_threshold = 0.1

def preprocess(msg):
    # Decode the JPEG byte buffer back into a BGR image.
    inp = np.array(msg).reshape((len(msg), 1))
    frame = cv2.imdecode(inp.astype(np.uint8), 1)
    frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    frame = np.array(frame).astype(np.float32)
    frame = cv2.resize(frame, (416, 416))
    # HWC -> CHW, then add the batch dimension expected by the model.
    frame = frame.transpose(2, 0, 1)
    frame = np.reshape(frame, (1, 3, 416, 416))
    return frame

def infer(frame, sess, conf_threshold):
    input_name = sess.get_inputs()[0].name

    def softmax(x):
        return np.exp(x) / np.sum(np.exp(x), axis=0)

    def sigmoid(x):
        return 1 / (1 + np.exp(-x))

    pred = sess.run(None, {input_name: frame})
    # Drop the batch dimension; the result has shape (125, 13, 13).
    pred = np.array(pred[0][0])

    with open("labels.txt") as labels_file:
        labels = labels_file.read().strip().split(",")

    tiny_yolo_cell_width = 13
    tiny_yolo_cell_height = 13
    num_boxes = 5
    tiny_yolo_classes = 20

    output = []
    for bx in range(0, tiny_yolo_cell_width):
        for by in range(0, tiny_yolo_cell_height):
            for bound in range(0, num_boxes):
                # Each box occupies 25 channels: 4 geometry values, 1 objectness, 20 classes.
                channel = bound * 25
                # Box geometry; read here for reference but not used below.
                tx = pred[channel][by][bx]
                ty = pred[channel + 1][by][bx]
                tw = pred[channel + 2][by][bx]
                th = pred[channel + 3][by][bx]
                tc = pred[channel + 4][by][bx]

                confidence = sigmoid(tc)
                # Class scores for this cell, indexed in the same (by, bx) order as above.
                class_out = pred[channel + 5:channel + 5 + tiny_yolo_classes, by, bx]
                class_out = softmax(np.array(class_out))
                class_detected = np.argmax(class_out)
                display_confidence = class_out[class_detected] * confidence
                if display_confidence > conf_threshold:
                    output.append({'object': labels[class_detected],
                                   'confidence': float(display_confidence)})
    return output

def main():
    sess = rt.InferenceSession('TinyYOLO.onnx')
    # Open the camera once, outside the capture loop.
    cap = cv2.VideoCapture(cam)
    while True:
        ret, frame = cap.read()
        if not ret:
            break
        ret, enc = cv2.imencode('.jpg', frame)
        enc = enc.flatten()
        for detection in infer(preprocess(enc), sess, conf_threshold):
            print(detection)
    cap.release()

if __name__ == "__main__":
    main()

If you are familiar with OpenCV and basic Convolutional Neural Networks (CNN), the code is self-explanatory.
It does three things:

Grabs the frame from the webcam
Converts and preprocesses the frame as expected by the model
Performs inference on the frame, keeps the detections whose score exceeds the confidence threshold, and pairs each one with a label from labels.txt
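The program reads tx, ty, tw, and th for every candidate box but reports only the class and score. If bounding boxes are needed, those four values can be decoded in the usual YOLO v2 fashion. The sketch below uses only the standard library; decode_box is a hypothetical helper, the 32-pixel cell size follows from the 416-pixel input divided into 13 cells, and the anchor values are the ones commonly published for Tiny YOLO v2 trained on Pascal VOC, which is an assumption you should verify against the model you downloaded.

```python
import math

# Anchor (width, height) pairs in grid-cell units.
# ASSUMPTION: commonly published Tiny YOLO v2 (Pascal VOC) anchors.
ANCHORS = [(1.08, 1.19), (3.42, 4.41), (6.63, 11.38), (9.42, 5.11), (16.62, 10.52)]
CELL_SIZE = 416 / 13  # pixels per grid cell


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def decode_box(bx, by, bound, tx, ty, tw, th):
    """Return (center_x, center_y, width, height) in input-image pixels.

    decode_box is a hypothetical helper; bx, by, and bound are the same
    loop indices used in infer.py.
    """
    anchor_w, anchor_h = ANCHORS[bound]
    center_x = (bx + sigmoid(tx)) * CELL_SIZE
    center_y = (by + sigmoid(ty)) * CELL_SIZE
    width = math.exp(tw) * anchor_w * CELL_SIZE
    height = math.exp(th) * anchor_h * CELL_SIZE
    return center_x, center_y, width, height


# All-zero raw outputs place the box at the middle of cell (6, 6).
print(decode_box(6, 6, 0, 0.0, 0.0, 0.0, 0.0))
```

The returned coordinates refer to the 416x416 network input, so they would still need to be scaled back to the original frame size before drawing.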

If you have multiple cameras attached to the machine, don’t forget to update the camera index by changing the value of the cam variable.
Executing the code prints the objects it found along with their confidence scores. Adjust the confidence threshold based on your requirements.

{'object': 'diningtable', 'confidence': 0.1934369369567218}
{'object': 'diningtable', 'confidence': 0.12359955877868607}
{'object': 'diningtable', 'confidence': 0.11795787527541246}
{'object': 'chair', 'confidence': 0.13212954996625334}
{'object': 'diningtable', 'confidence': 0.1899228051957825}
{'object': 'chair', 'confidence': 0.1374235041020961}
{'object': 'chair', 'confidence': 0.1632368686534813}

This scenario represents ONNX Runtime performing inference against a CPU backend. In the next step, we will port this code to run on an edge device powered by Intel NCS 2.
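The sample output above contains several low-scoring, repeated detections of the same object. One simple way to explore the effect of the threshold is to post-process the reported dictionaries. The following is a minimal sketch, assuming detections are dictionaries with 'object' and 'confidence' keys as in the output above; best_per_label is a hypothetical helper, and it is a plain filter rather than non-maximum suppression.

```python
def best_per_label(detections, conf_threshold):
    """Keep the highest-scoring detection for each label above the threshold.

    best_per_label is a hypothetical helper used for illustration only.
    """
    best = {}
    for det in detections:
        score = det["confidence"]
        if score <= conf_threshold:
            continue
        label = det["object"]
        if label not in best or score > best[label]["confidence"]:
            best[label] = det
    # Strongest detection first.
    return sorted(best.values(), key=lambda d: d["confidence"], reverse=True)


sample = [
    {"object": "diningtable", "confidence": 0.1934369369567218},
    {"object": "diningtable", "confidence": 0.12359955877868607},
    {"object": "chair", "confidence": 0.13212954996625334},
    {"object": "chair", "confidence": 0.1632368686534813},
]
for det in best_per_label(sample, conf_threshold=0.15):
    print(det)
```

Raising the threshold trades recall for fewer spurious reports; with the scores seen in this run, anything much above 0.2 would suppress every detection.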
Object Detection with Tiny YOLO V2 at the Edge
Assuming you have an Ubuntu 18.04 machine connected to an Intel NCS 2 device running the latest version of Intel OpenVINO Toolkit, you are ready to execute the code at the edge. Otherwise, follow the steps to configure Intel NCS 2 and OpenVINO Toolkit as per the documentation.
If you have an Up Squared AI Vision X Kit, you can use it for this tutorial.
Even if you don’t install the entire OpenVINO Toolkit, ensure you install the Myriad rules drivers for NCS on the host machine according to the reference.
Microsoft provides Docker images and Dockerfiles for mainstream environments. Let’s start by downloading the container image for OpenVINO Toolkit with Myriad.

docker pull mcr.microsoft.com/azureml/onnxruntime:latest-openvino-myriad

Create a directory, tinyyolo, on the Ubuntu machine and copy the files from your PC. Your directory should contain the files below:

infer.py
labels.txt
requirements.txt
TinyYOLO.onnx
Before we execute the code, let’s add a line that tells ONNX Runtime about the presence of the Intel NCS device.
Open infer.py and add the line below just before creating the inference session.

rt.capi._pybind_state.set_openvino_device("MYRIAD_FP16")

We are now set to run the inference code within the Docker container against the Myriad device.
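Rather than editing infer.py for each target, the device could also be chosen at launch time. The sketch below is one possible approach under stated assumptions: the OPENVINO_DEVICE environment variable and both helper functions are hypothetical names, and the set_openvino_device call is the same one used in this tutorial, which is only present in OpenVINO-enabled builds of ONNX Runtime.

```python
import os


def requested_device(environ=os.environ):
    """Return the device string from OPENVINO_DEVICE, or None if unset or blank.

    OPENVINO_DEVICE is a hypothetical variable name chosen for this sketch.
    """
    device = environ.get("OPENVINO_DEVICE", "").strip()
    return device or None


def create_session(model_path, environ=os.environ):
    """Create an inference session, applying the device hint when requested."""
    # Imported lazily so requested_device() can be used without ONNX Runtime.
    import onnxruntime as rt

    device = requested_device(environ)
    if device is not None:
        # Same call as in the tutorial; only available in OpenVINO-enabled builds.
        rt.capi._pybind_state.set_openvino_device(device)
    return rt.InferenceSession(model_path)


print(requested_device({"OPENVINO_DEVICE": "MYRIAD_FP16"}))
print(requested_device({}))
```

With this in place, the desktop run would leave the variable unset, while the edge device could start the program as OPENVINO_DEVICE=MYRIAD_FP16 python infer.py.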
Let’s launch the Docker container by mapping the /dev directory and mounting the tinyyolo directory. We also need to add the --privileged and --network host flags to provide appropriate permissions to access the camera and the NCS USB device.
While in the tinyyolo directory, execute the command below:

docker run \
  -v /dev:/dev \
  -v $PWD:/tinyyolo \
  --privileged \
  --network host \
  -it --rm mcr.microsoft.com/azureml/onnxruntime:latest-openvino-myriad /bin/bash

After getting into the shell, move into the directory and install the prerequisites.

cd /tinyyolo
pip install -r requirements.txt

Execute the code to see the inference output in the terminal.

python infer.py

It may take a few minutes for the graph to be loaded and warmed up. You should now see the objects detected by the camera in the terminal.
This scenario can be easily extended to publish the inference output to an MQTT channel configured locally or in the cloud. Refer to my previous AIoT tutorial and a video demo of this use case.
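Before a detection can be published to a message broker, it has to be serialized. The following is a minimal sketch that uses only the standard library; the topic name and the to_payload helper are hypothetical, and the actual publish call depends on the MQTT client library you choose, so it is not shown here.

```python
import json
import time

TOPIC = "tinyyolo/detections"  # hypothetical topic name


def to_payload(detection, timestamp=None):
    """Serialize one detection dictionary to a compact JSON string.

    to_payload is a hypothetical helper used for illustration only.
    """
    message = {
        "object": str(detection["object"]),
        # float() also converts NumPy float32 scalars, which json cannot encode.
        "confidence": float(detection["confidence"]),
        "timestamp": time.time() if timestamp is None else timestamp,
    }
    return json.dumps(message, separators=(",", ":"))


payload = to_payload({"object": "chair", "confidence": 0.1632368686534813}, timestamp=0)
print(TOPIC, payload)
```

The resulting string is what would be handed to the client's publish method together with the topic.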
Janakiram MSV’s Webinar series, “Machine Intelligence and Modern Infrastructure (MI2)” offers informative and insightful sessions covering cutting-edge technologies. Sign up for the upcoming MI2 webinar at http://mi2.live.
The post Tutorial: Accelerate AI at Edge with ONNX Runtime and Intel Neural Compute Stick 2 appeared first on The New Stack.
