

RetinaFace PyTorch VIM4 Demo - 5

Get source code

bubbliiiing/retinaface-pytorch

$ git clone https://github.com/bubbliiiing/retinaface-pytorch

Before training, modify retinaface-pytorch/utils/utils.py as follows.

diff --git a/utils/utils.py b/utils/utils.py
index 87bb528..4a22f2a 100644
--- a/utils/utils.py
+++ b/utils/utils.py
@@ -25,5 +25,6 @@ def get_lr(optimizer):
         return param_group['lr']
 
 def preprocess_input(image):
-    image -= np.array((104, 117, 123),np.float32)
+    image = image / 255.0
     return image
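
After this change, preprocessing scales pixel values to [0, 1] instead of subtracting the BGR channel means. A minimal sketch to sanity-check the new function:

import numpy as np

def preprocess_input(image):
    image = image / 255.0
    return image

img = np.random.randint(0, 256, (640, 640, 3)).astype(np.float32)
out = preprocess_input(img)
print(out.min(), out.max())  # both values fall within [0.0, 1.0]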

Convert the model

Build virtual environment

Follow the official Docker documentation to install Docker: Install Docker Engine on Ubuntu.

Then pull the prebuilt NPU Docker image and run a container from it.

$ docker pull yanwyb/npu:v1
$ docker run -it --name vim4-npu1 -v $(pwd):/home/khadas/npu \
				-v /etc/localtime:/etc/localtime:ro \
				-v /etc/timezone:/etc/timezone:ro \
				yanwyb/npu:v1

Get conversion tool

Download the conversion tool from khadas/vim4_npu_sdk.

$ git clone https://gitlab.com/khadas/vim4_npu_sdk

Convert

After training, convert the PyTorch model into an ONNX model.

Copy nets/retinaface.py, rename the copy retinaface_export.py, and modify it as follows.

class ClassHead(nn.Module):
    def __init__(self,inchannels=512,num_anchors=2):
        super(ClassHead,self).__init__()
        self.num_anchors = num_anchors
        self.conv1x1 = nn.Conv2d(inchannels,self.num_anchors*2,kernel_size=(1,1),stride=1,padding=0)
 
    def forward(self,x):
        out = self.conv1x1(x)
-       out = out.permute(0,2,3,1).contiguous()
+       out = out.contiguous()
 
-       return out.view(out.shape[0], -1, 2)
+       return out.view(out.shape[0], 4, -1)
 
class BboxHead(nn.Module):
    def __init__(self,inchannels=512,num_anchors=2):
        super(BboxHead,self).__init__()
        self.conv1x1 = nn.Conv2d(inchannels,num_anchors*4,kernel_size=(1,1),stride=1,padding=0)
 
    def forward(self,x):
        out = self.conv1x1(x)
-       out = out.permute(0,2,3,1).contiguous()
+       out = out.contiguous()
 
-       return out.view(out.shape[0], -1, 4)
+       return out.view(out.shape[0], 8, -1)
 
class LandmarkHead(nn.Module):
    def __init__(self,inchannels=512,num_anchors=2):
        super(LandmarkHead,self).__init__()
        self.conv1x1 = nn.Conv2d(inchannels,num_anchors*10,kernel_size=(1,1),stride=1,padding=0)
 
    def forward(self,x):
        out = self.conv1x1(x)
-       out = out.permute(0,2,3,1).contiguous()
+       out = out.contiguous()
 
-       return out.view(out.shape[0], -1, 10)
+       return out.view(out.shape[0], 20, -1)
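
# In the RetinaFace class, modify forward() as follows.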
-       bbox_regressions    = torch.cat([self.BboxHead[i](feature) for i, feature in enumerate(features)], dim=1)
-       classifications     = torch.cat([self.ClassHead[i](feature) for i, feature in enumerate(features)], dim=1)
-       ldm_regressions     = torch.cat([self.LandmarkHead[i](feature) for i, feature in enumerate(features)], dim=1)
+       bbox_regressions    = torch.cat([self.BboxHead[i](feature) for i, feature in enumerate(features)], dim=2)
+       classifications     = torch.cat([self.ClassHead[i](feature) for i, feature in enumerate(features)], dim=2)
+       ldm_regressions     = torch.cat([self.LandmarkHead[i](feature) for i, feature in enumerate(features)], dim=2)
 
        if self.mode == 'train':
            output = (bbox_regressions, classifications, ldm_regressions)
        else:
-           output = (bbox_regressions, F.softmax(classifications, dim=-1), ldm_regressions)
+           output = (bbox_regressions, classifications, ldm_regressions)
        return output
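
A minimal sketch (assuming a 640x640 input with feature strides 8, 16 and 32, hence 80x80, 40x40 and 20x20 feature maps) of why the heads now concatenate on dim=2: each head keeps a fixed channel count (4, 8 or 20) and flattens the spatial dimensions, so the three feature levels differ only along the last axis.

import torch

# Simulated BboxHead outputs: 8 channels (num_anchors * 4) per level.
levels = [torch.zeros(1, 8, s, s) for s in (80, 40, 20)]
flat = [x.contiguous().view(x.shape[0], 8, -1) for x in levels]
merged = torch.cat(flat, dim=2)
print(merged.shape)  # torch.Size([1, 8, 8400])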

Create the following Python export script and run it.

export.py
import torch
import numpy as np
from nets.retinaface_export import RetinaFace
from utils.config import cfg_mnet, cfg_re50
 
model_path = "logs/Epoch150-Total_Loss6.2802.pth"
net = RetinaFace(cfg=cfg_mnet, mode='eval').eval()
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
net.load_state_dict(torch.load(model_path, map_location=device))
 
img = torch.zeros(1, 3, 640, 640)
torch.onnx.export(net, img, "./retinaface.onnx", verbose=False, opset_version=12, input_names=['images'])
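
Optionally, sanity-check the exported model before conversion. This sketch assumes onnxruntime is installed; the input name "images" matches the input_names passed to torch.onnx.export above.

import numpy as np
import onnxruntime as ort

sess = ort.InferenceSession("./retinaface.onnx")
dummy = np.zeros((1, 3, 640, 640), dtype=np.float32)
for out in sess.run(None, {"images": dummy}):
    print(out.shape)  # bbox, class and landmark outputs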

Enter vim4_npu_sdk/demo and modify convert_adla.sh as follows.

convert_adla.sh
#!/bin/bash
 
ACUITY_PATH=../bin/
#ACUITY_PATH=../python/tvm/
adla_convert=${ACUITY_PATH}adla_convert
 
 
if [ ! -e "$adla_convert" ]; then
    adla_convert=${ACUITY_PATH}adla_convert.py
fi
 
$adla_convert --model-type onnx \
        --model ./model_source/retinaface/retinaface.onnx \
        --inputs "images" \
        --input-shapes  "3,640,640"  \
        --inference-input-type float32 \
        --inference-output-type float32 \
        --dtypes "float32" \
        --quantize-dtype int8 --outdir onnx_output  \
        --channel-mean-value "0,0,0,255"  \
        --source-file ./dataset.txt  \
        --iterations 500 \
        --disable-per-channel False \
        --batch-size 1 --target-platform PRODUCT_PID0XA003
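
Here --channel-mean-value "0,0,0,255" is taken to mean per-channel means of 0 and a scale of 255, i.e. normalized = (pixel - mean) / scale, which matches the modified preprocess_input (image / 255.0). A quick numeric check under that assumption:

import numpy as np

# Assumed semantics of --channel-mean-value "m1,m2,m3,scale".
pixel = np.array([104.0, 117.0, 123.0], dtype=np.float32)
mean = np.zeros(3, dtype=np.float32)
scale = 255.0
print((pixel - mean) / scale)  # identical to pixel / 255.0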

Prepare about 500 images for quantization. If the images are smaller than the model input size, resize them to the input size before quantization.
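
A hypothetical helper for that step: it resizes the calibration images to 640x640 and writes one image path per line into dataset.txt, the file passed to --source-file above. The directory names are placeholders.

import os
import cv2

src_dir = "calib_images"   # assumed folder holding the ~500 images
dst_dir = "calib_640"
os.makedirs(dst_dir, exist_ok=True)

with open("dataset.txt", "w") as f:
    for name in sorted(os.listdir(src_dir)):
        img = cv2.imread(os.path.join(src_dir, name))
        if img is None:
            continue  # skip non-image files
        img = cv2.resize(img, (640, 640))
        path = os.path.join(dst_dir, name)
        cv2.imwrite(path, img)
        f.write(path + "\n")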

Run convert_adla.sh to generate the VIM4 model. The converted model, xxx.adla, is written to onnx_output.

$ bash convert_adla.sh

Run inference on the NPU

Get source code

Clone the source code khadas/vim4_npu_applications.

$ git clone https://github.com/khadas/vim4_npu_applications

If your kernel version is 5.4 or earlier, use tag ddk-1.7.5.5; tag ddk-2.3.6.7 is for kernel 5.15.

Install dependencies

$ sudo apt update
$ sudo apt install libopencv-dev python3-opencv cmake

Compile and run

Picture input demo

Put retinaface_int8.adla in vim4_npu_applications/retinaface/data/.

# Compile
$ cd vim4_npu_applications/retinaface
$ mkdir build
$ cd build
$ cmake ..
$ make
 
# Run
$ sudo ./retinaface -m ../data/retinaface_int8.adla -p ../data/timg.jpg

Camera input demo

Put retinaface_int8.adla in vim4_npu_applications/retinaface_cap/data/.

# Compile
$ cd vim4_npu_applications/retinaface_cap
$ mkdir build
$ cd build
$ cmake ..
$ make
 
# Run
$ sudo ./retinaface_cap -m ../data/retinaface_int8.adla -d 0 -w 1920 -h 1080

The value passed to -d is the camera device index; 0 typically corresponds to /dev/video0.
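
For illustration only (this is not the demo's own code), the same index would select the V4L2 device when opened through OpenCV:

import cv2

cap = cv2.VideoCapture(0)  # index 0 usually maps to /dev/video0
cap.set(cv2.CAP_PROP_FRAME_WIDTH, 1920)
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 1080)
ok, frame = cap.read()
print(ok, frame.shape if ok else None)
cap.release()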
