This demo requires kernel version >= 5.15.
The full VIM3 C++ demo is complex and not very beginner-friendly, so we also provide a lite version. This document explains how to use it.
YOLOv7-Tiny is a lightweight object detection model. It localizes each object in an image with a bounding box.
Inference results on VIM3.
Inference speed test: about 216 ms per frame with a USB camera, and about 200 ms per frame with a MIPI camera.
Download the official YOLOv7 code from WongKinYiu/yolov7:
$ git clone https://github.com/WongKinYiu/yolov7
Refer to the repository's README.md to create and train a YOLOv7-Tiny model.
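For reference, a typical training invocation in the style of the upstream README (the data, cfg, and hyp paths are the repository's COCO defaults; adjust workers, device, and batch size to your hardware and dataset):

$ python train.py --workers 8 --device 0 --batch-size 32 \
      --data data/coco.yaml --img 640 640 \
      --cfg cfg/training/yolov7-tiny.yaml --weights '' \
      --name yolov7-tiny --hyp data/hyp.scratch.tiny.yaml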
We provide a Docker image that contains the environment required to convert the model.
Follow the official Docker documentation to install Docker: Install Docker Engine on Ubuntu.
Then pull the Docker image:

$ docker pull numbqq/npu-vim3
$ git lfs install
$ git lfs clone https://github.com/khadas/aml_npu_sdk.git
$ cd aml_npu_sdk/acuity-toolkit/demo && ls
0_import_model.sh  1_quantize_model.sh  2_export_case_code.sh  data  dataset_npy.txt  dataset.txt  extractoutput.py  inference.sh  input.npy  model
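Here, dataset.txt lists the images the toolkit uses to calibrate the model during quantization, one path per line. A minimal example (the image path is illustrative; point it at a representative image from your training data):

$ cat dataset.txt
./data/your_image.jpg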
After training the model, modify yolov7/models/yolo.py as follows. Commenting out this reshape makes the exported graph end at the raw detection-head outputs; the demo then decodes the boxes on the CPU during post-processing.
diff --git a/models/yolo.py b/models/yolo.py
index 95a019c..a2e611d 100644
--- a/models/yolo.py
+++ b/models/yolo.py
@@ -144,7 +144,7 @@ class IDetect(nn.Module):
             x[i] = self.m[i](x[i])  # conv
             bs, _, ny, nx = x[i].shape  # x(bs,255,20,20) to x(bs,3,20,20,85)
-            x[i] = x[i].view(bs, self.na, self.no, ny, nx).permute(0, 1, 3, 4, 2).contiguous()
+            # x[i] = x[i].view(bs, self.na, self.no, ny, nx).permute(0, 1, 3, 4, 2).contiguous()
 
             if not self.training:  # inference
                 if self.grid[i].shape[2:4] != x[i].shape[2:4]:
Then run export.py to convert the trained model to ONNX:
$ python export.py
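By default, export.py looks for weights at ./yolov7.pt; if your checkpoint lives elsewhere, pass it explicitly (flag names follow the upstream export.py; the paths below are illustrative):

$ python export.py --weights runs/train/yolov7-tiny/weights/best.pt --img-size 640 640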
Enter aml_npu_sdk/acuity-toolkit/demo and put yolov7_tiny.onnx into demo/model. Then modify 0_import_model.sh, 1_quantize_model.sh, and 2_export_case_code.sh as follows.
0_import_model.sh:

#!/bin/bash
NAME=yolov7_tiny
ACUITY_PATH=../bin/
pegasus=${ACUITY_PATH}pegasus
if [ ! -e "$pegasus" ]; then
    pegasus=${ACUITY_PATH}pegasus.py
fi
#Onnx
$pegasus import onnx \
--model ./model/${NAME}.onnx \
--output-model ${NAME}.json \
--output-data ${NAME}.data
#generate inputmeta --source-file dataset.txt
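# channel-mean-value is "mean_r mean_g mean_b scale": each pixel becomes (value - mean) * scale;
# 0.0039215 ≈ 1/255, i.e. pixels are normalized to [0, 1] with no mean subtraction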
$pegasus generate inputmeta \
--model ${NAME}.json \
--input-meta-output ${NAME}_inputmeta.yml \
--channel-mean-value "0 0 0 0.0039215" \
--source-file dataset.txt
1_quantize_model.sh:

#!/bin/bash
NAME=yolov7_tiny
ACUITY_PATH=../bin/
pegasus=${ACUITY_PATH}pegasus
if [ ! -e "$pegasus" ]; then
    pegasus=${ACUITY_PATH}pegasus.py
fi
#--quantizer asymmetric_affine --qtype uint8
#--quantizer dynamic_fixed_point --qtype int8 (or int16; note the S905D3 does not support int16 quantization)
#--quantizer perchannel_symmetric_affine --qtype int8 (or int16; note only T3 (0xBE) supports per-channel quantization)
$pegasus quantize \
--quantizer dynamic_fixed_point \
--qtype int8 \
--rebuild \
--with-input-meta ${NAME}_inputmeta.yml \
--model ${NAME}.json \
--model-data ${NAME}.data
2_export_case_code.sh:

#!/bin/bash
NAME=yolov7_tiny
ACUITY_PATH=../bin/
pegasus=${ACUITY_PATH}pegasus
if [ ! -e "$pegasus" ]; then
    pegasus=${ACUITY_PATH}pegasus.py
fi
$pegasus export ovxlib \
--model ${NAME}.json \
--model-data ${NAME}.data \
--model-quantize ${NAME}.quantize \
--with-input-meta ${NAME}_inputmeta.yml \
--dtype quantized \
--optimize VIPNANOQI_PID0X88 \
--viv-sdk ${ACUITY_PATH}vcmdtools \
--pack-nbg-unify
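# pegasus places the packed *_nbg_unify output in the parent directory; collect and rename it below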
rm -rf ${NAME}_nbg_unify
mv ../*_nbg_unify ${NAME}_nbg_unify
cd ${NAME}_nbg_unify
mv network_binary.nb ${NAME}.nb
cd ..
#save normal case demo sources and export.data
mkdir -p ${NAME}_normal_case_demo
mv *.h *.c .project .cproject *.vcxproj BUILD *.linux *.export.data ${NAME}_normal_case_demo
# delete normal_case demo source
#rm *.h *.c .project .cproject *.vcxproj BUILD *.linux *.export.data
rm *.data *.quantize *.json *_inputmeta.yml
If you use a VIM3L, set --optimize to VIPNANOQI_PID0X99 instead.
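One way to apply this change (a convenience one-liner; you can also edit the script by hand):

$ sed -i 's/VIPNANOQI_PID0X88/VIPNANOQI_PID0X99/' 2_export_case_code.sh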
After modifying the scripts, return to the aml_npu_sdk root directory and run convert-in-docker.sh.
If the conversion succeeds, the converted model and case code are generated in demo/yolov7_tiny_nbg_unify.
$ cd ../../
$ bash convert-in-docker.sh
$ cd acuity-toolkit/demo/yolov7_tiny_nbg_unify
$ ls
BUILD  main.c  makefile.linux  nbg_meta.json  vnn_global.h  vnn_post_process.c  vnn_post_process.h  vnn_pre_process.c  vnn_pre_process.h  vnn_yolov7tiny.c  vnn_yolov7tiny.h  yolov7_tiny.nb  yolov7tiny.vcxproj
Get the source code: khadas/vim3_npu_applications_lite
$ git clone https://github.com/khadas/vim3_npu_applications_lite
$ sudo apt update
$ sudo apt install libopencv-dev python3-opencv cmake
Put yolov7_tiny.nb into vim3_npu_applications_lite/yolov7_tiny_demo_x11_usb/nn_data.
Replace yolov7_tiny_demo_x11_usb/vnn_yolov7tiny.c and yolov7_tiny_demo_x11_usb/include/vnn_yolov7tiny.h with the generated vnn_yolov7tiny.c and vnn_yolov7tiny.h.
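For example (assuming aml_npu_sdk and vim3_npu_applications_lite were cloned side by side; adjust the paths to your layout):

$ cd aml_npu_sdk/acuity-toolkit/demo/yolov7_tiny_nbg_unify
$ cp yolov7_tiny.nb ../../../../vim3_npu_applications_lite/yolov7_tiny_demo_x11_usb/nn_data/
$ cp vnn_yolov7tiny.c ../../../../vim3_npu_applications_lite/yolov7_tiny_demo_x11_usb/
$ cp vnn_yolov7tiny.h ../../../../vim3_npu_applications_lite/yolov7_tiny_demo_x11_usb/include/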
The commands below reference yolov7_tiny_88.nb; if your converted model is named yolov7_tiny.nb, adjust the -m argument accordingly.

# Compile
$ cd vim3_npu_applications_lite/yolov7_tiny_demo_x11_usb
$ bash build_vx.sh
$ cd bin_r_cv4

# Run with a USB camera
$ ./yolov7_tiny_demo_x11_usb -m ../nn_data/yolov7_tiny_88.nb -t usb -d /dev/video0

# Run with a MIPI camera
$ ./yolov7_tiny_demo_x11_usb -m ../nn_data/yolov7_tiny_88.nb -t mipi -d /dev/video50