This demo requires kernel version 5.15 or later.
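One way to check the kernel requirement on the board is to parse the kernel release string. The snippet below is a minimal sketch, not part of any Khadas tool: the helper name `kernel_at_least` is hypothetical, and it assumes the release string starts with `major.minor` (as in `5.15.78`).

```python
import platform
import re


def kernel_at_least(release: str, major: int, minor: int) -> bool:
    """Return True if a release string such as '5.15.78' is >= major.minor."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    found = (int(match.group(1)), int(match.group(2)))
    return found >= (major, minor)


if __name__ == "__main__":
    # platform.release() reports the running kernel release on Linux.
    release = platform.release()
    status = "OK" if kernel_at_least(release, 5, 15) else "too old for this demo"
    print(f"kernel {release}: {status}")
```

Running `uname -r` on the board gives the same release string that `platform.release()` reads.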
The VIM3 C++ demo is complex and not very user-friendly, so we provide a lite version. This document explains how to use it.
We will use a RetinaFace model based on bubbliiiing/retinaface-pytorch.
$ git clone https://github.com/bubbliiiing/retinaface-pytorch
Before training, modify retinaface-pytorch/utils/utils.py as follows.
diff --git a/utils/utils.py b/utils/utils.py
index 87bb528..4a22f2a 100644
--- a/utils/utils.py
+++ b/utils/utils.py
@@ -25,5 +25,6 @@ def get_lr(optimizer):
         return param_group['lr']
 
 def preprocess_input(image):
-    image -= np.array((104, 117, 123),np.float32)
+    image = image / 255.0
     return image
We provide a Docker image that contains the environment required to convert the model.
Follow the official Docker documentation to install Docker: Install Docker Engine on Ubuntu.
Run the command below to pull the Docker image:
docker pull numbqq/npu-vim3
$ git lfs install
$ git lfs clone https://github.com/khadas/aml_npu_sdk.git
$ cd aml_npu_sdk/acuity-toolkit/demo && ls
aml_npu_sdk/acuity-toolkit/demo$ ls
0_import_model.sh  1_quantize_model.sh  2_export_case_code.sh  data  dataset_npy.txt  dataset.txt  extractoutput.py  inference.sh  input.npy  model
After training, convert the PyTorch model to an ONNX model. Create the following Python conversion script and run it.
import torch
import numpy as np
from nets.retinaface import RetinaFace
from utils.config import cfg_mnet, cfg_re50

# Path to the trained checkpoint
model_path = "logs/Epoch150-Total_Loss6.2802.pth"
net = RetinaFace(cfg=cfg_mnet, mode='eval').eval()
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
net.load_state_dict(torch.load(model_path, map_location=device))

# Dummy 1x3x640x640 input used to trace the network during export
img = torch.zeros(1, 3, 640, 640)
torch.onnx.export(net, img, "./retinaface.onnx", verbose=False, opset_version=12, input_names=['images'])
Enter aml_npu_sdk/acuity-toolkit/demo and put retinaface.onnx into demo/model. Modify 0_import_model.sh, 1_quantize_model.sh and 2_export_case_code.sh as follows.
#!/bin/bash

NAME=retinaface
ACUITY_PATH=../bin/

pegasus=${ACUITY_PATH}pegasus
if [ ! -e "$pegasus" ]; then
    pegasus=${ACUITY_PATH}pegasus.py
fi

#Onnx
$pegasus import onnx \
    --model ./model/${NAME}.onnx \
    --output-model ${NAME}.json \
    --output-data ${NAME}.data

#generate inputmeta --source-file dataset.txt
$pegasus generate inputmeta \
    --model ${NAME}.json \
    --input-meta-output ${NAME}_inputmeta.yml \
    --channel-mean-value "0 0 0 0.0039215" \
    --source-file dataset.txt
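The `--channel-mean-value "0 0 0 0.0039215"` setting appears to mirror the `image / 255.0` preprocessing used for training, since 0.0039215 is 1/255 truncated. The sketch below assumes the four values are three per-channel means followed by a scale, applied as `(pixel - mean) * scale`; that reading of the option is an assumption, and both helper names are hypothetical.

```python
def normalize_pixel(pixel, means=(0.0, 0.0, 0.0), scale=0.0039215):
    """Apply (value - mean) * scale per channel.

    Assumed interpretation of --channel-mean-value "m0 m1 m2 scale".
    """
    return tuple((value - mean) * scale for value, mean in zip(pixel, means))


def training_preprocess(pixel):
    """Per-pixel equivalent of the modified preprocess_input (image / 255.0)."""
    return tuple(value / 255.0 for value in pixel)


if __name__ == "__main__":
    pixel = (255, 128, 0)
    converted = normalize_pixel(pixel)
    trained = training_preprocess(pixel)
    # The two agree up to the truncation of 1/255 to 0.0039215.
    for c, t in zip(converted, trained):
        print(f"{c:.6f} vs {t:.6f}")
```

If the training preprocessing is changed again, the values passed to `--channel-mean-value` would need to change with it so that conversion and training see the same input range.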
#!/bin/bash

NAME=retinaface
ACUITY_PATH=../bin/

pegasus=${ACUITY_PATH}pegasus
if [ ! -e "$pegasus" ]; then
    pegasus=${ACUITY_PATH}pegasus.py
fi

#--quantizer asymmetric_affine --qtype uint8
#--quantizer dynamic_fixed_point --qtype int8(int16,note s905d3 not support int16 quantize)
# --quantizer perchannel_symmetric_affine --qtype int8(int16, note only T3(0xBE) can support perchannel quantize)
$pegasus quantize \
    --quantizer dynamic_fixed_point \
    --qtype int8 \
    --rebuild \
    --with-input-meta ${NAME}_inputmeta.yml \
    --model ${NAME}.json \
    --model-data ${NAME}.data
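To give a feel for what `--quantizer dynamic_fixed_point --qtype int8` implies, here is a generic illustration of dynamic fixed-point quantization. It is a sketch of the general technique, not the Acuity toolkit's implementation: each tensor gets a fractional length `fl`, values are stored as `round(x * 2**fl)` clamped to the int8 range, and recovered as `q / 2**fl`. All function names are hypothetical.

```python
import math


def choose_fraction_length(values, bits=8):
    """Pick the largest fractional length that still covers max(|values|)."""
    max_abs = max(abs(v) for v in values)
    q_max = 2 ** (bits - 1) - 1  # 127 for int8
    if max_abs == 0:
        return bits - 1
    return math.floor(math.log2(q_max / max_abs))


def quantize(values, fraction_length, bits=8):
    """Scale by 2**fraction_length, round, and clamp to the signed range."""
    q_min, q_max = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    scale = 2.0 ** fraction_length
    return [max(q_min, min(q_max, round(v * scale))) for v in values]


def dequantize(quantized, fraction_length):
    """Map integer codes back to approximate real values."""
    scale = 2.0 ** fraction_length
    return [q / scale for q in quantized]


if __name__ == "__main__":
    weights = [0.5, -1.25, 3.0]
    fl = choose_fraction_length(weights)
    codes = quantize(weights, fl)
    print(fl, codes, dequantize(codes, fl))
```

A tensor with a larger value range gets a smaller fractional length and therefore coarser steps, which is why the calibration images listed in dataset.txt matter for accuracy.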
#!/bin/bash

NAME=retinaface
ACUITY_PATH=../bin/

pegasus=$ACUITY_PATH/pegasus
if [ ! -e "$pegasus" ]; then
    pegasus=$ACUITY_PATH/pegasus.py
fi

$pegasus export ovxlib \
    --model ${NAME}.json \
    --model-data ${NAME}.data \
    --model-quantize ${NAME}.quantize \
    --with-input-meta ${NAME}_inputmeta.yml \
    --dtype quantized \
    --optimize VIPNANOQI_PID0X88 \
    --viv-sdk ${ACUITY_PATH}vcmdtools \
    --pack-nbg-unify

rm -rf ${NAME}_nbg_unify

mv ../*_nbg_unify ${NAME}_nbg_unify

cd ${NAME}_nbg_unify

mv network_binary.nb ${NAME}.nb

cd ..

#save normal case demo export.data
mkdir -p ${NAME}_normal_case_demo
mv *.h *.c .project .cproject *.vcxproj BUILD *.linux *.export.data ${NAME}_normal_case_demo

# delete normal_case demo source
#rm *.h *.c .project .cproject *.vcxproj BUILD *.linux *.export.data

rm *.data *.quantize *.json *_inputmeta.yml
If you use a VIM3L, set --optimize to VIPNANOQI_PID0X99 instead.
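The board-to-target choice described above can be written as a small lookup. This is only an illustrative sketch: the helper `optimize_flag` is hypothetical, and the table contains just the two values that appear in this document.

```python
# Values taken from this document: the export script uses PID0X88 for VIM3,
# and VIM3L needs PID0X99.
OPTIMIZE_TARGETS = {
    "VIM3": "VIPNANOQI_PID0X88",
    "VIM3L": "VIPNANOQI_PID0X99",
}


def optimize_flag(board: str) -> str:
    """Return the --optimize argument for 2_export_case_code.sh."""
    key = board.strip().upper()
    if key not in OPTIMIZE_TARGETS:
        raise ValueError(f"unknown board: {board!r}")
    return f"--optimize {OPTIMIZE_TARGETS[key]}"


if __name__ == "__main__":
    for board in ("VIM3", "VIM3L"):
        print(board, "->", optimize_flag(board))
```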
After modifying the scripts, return to aml_npu_sdk and run convert-in-docker.sh.
If the conversion succeeds, the converted model and library are generated in demo/retinaface_nbg_unify.
$ cd ../../
$ bash convert-in-docker.sh
$ cd acuity-toolkit/demo/retinaface_nbg_unify
$ ls
BUILD  main.c  makefile.linux  nbg_meta.json  retinaface_99.nb  retinaface.vcxproj  vnn_global.h  vnn_post_process.c  vnn_post_process.h  vnn_pre_process.c  vnn_pre_process.h  vnn_retinaface.c  vnn_retinaface.h
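Before moving on, it can help to confirm that the files needed by the later steps were actually generated. The following is a minimal sketch with a hypothetical helper, `missing_outputs`; it accepts any `*.nb` file because the model file name may carry a suffix, as in the listing above.

```python
from pathlib import Path

# Source files that the demo application step replaces later on.
REQUIRED_FILES = ("vnn_retinaface.c", "vnn_retinaface.h")


def missing_outputs(out_dir):
    """Return the names of expected conversion outputs not found in out_dir."""
    out_dir = Path(out_dir)
    missing = [name for name in REQUIRED_FILES if not (out_dir / name).is_file()]
    # Accept any compiled network binary, whatever its exact name.
    if not any(out_dir.glob("*.nb")):
        missing.append("*.nb")
    return missing


if __name__ == "__main__":
    problems = missing_outputs("acuity-toolkit/demo/retinaface_nbg_unify")
    print("all outputs present" if not problems else f"missing: {problems}")
```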
Get the source code: khadas/vim3_npu_applications_lite
$ git clone https://github.com/khadas/vim3_npu_applications_lite
$ sudo apt update
$ sudo apt install libopencv-dev python3-opencv cmake
Put retinaface.nb into vim3_npu_applications_lite/retinaface_demo_x11_usb/nn_data.
Replace retinaface_demo_x11_usb/vnn_retinaface.c and retinaface_demo_x11_usb/include/vnn_retinaface.h with your generated vnn_retinaface.c and vnn_retinaface.h.
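The copy steps above can be scripted. This is a sketch under assumptions: it presumes the conversion output directory and the demo source tree are both reachable from the same machine (otherwise the files have to be transferred to the board first), and the helper `install_generated_files` and its `nb_name` parameter are hypothetical.

```python
import shutil
from pathlib import Path


def install_generated_files(nbg_dir, demo_dir, nb_name="retinaface.nb"):
    """Copy the model and generated sources into retinaface_demo_x11_usb."""
    nbg_dir, demo_dir = Path(nbg_dir), Path(demo_dir)
    copies = {
        nbg_dir / nb_name: demo_dir / "nn_data" / nb_name,
        nbg_dir / "vnn_retinaface.c": demo_dir / "vnn_retinaface.c",
        nbg_dir / "vnn_retinaface.h": demo_dir / "include" / "vnn_retinaface.h",
    }
    for src, dst in copies.items():
        # Create nn_data/ or include/ if the tree does not have them yet.
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)
    return list(copies.values())


if __name__ == "__main__":
    written = install_generated_files(
        "aml_npu_sdk/acuity-toolkit/demo/retinaface_nbg_unify",
        "vim3_npu_applications_lite/retinaface_demo_x11_usb",
    )
    for path in written:
        print("installed", path)
```

Pass a different `nb_name` if the generated model file carries a suffix.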
# Compile
$ cd vim3_npu_applications_lite/retinaface_demo_x11_usb
$ bash build_vx.sh
$ cd bin_r_cv4
$ ./retinaface_demo_x11_usb -m ../nn_data/retinaface_88.nb -d /dev/video0