Differences

This shows you the differences between two versions of the page.

--- products:sbc:vim4:npu:demos:retinaface [2024/01/04 05:16] louis
+++ products:sbc:vim4:npu:demos:retinaface [2025/06/11 21:51] (current) louis
@@ Line 1: / Line 1: @@
 ~~tag> NPU RetinaFace VIM4 PyTorch~~
+**Doc for version ddk-3.4.7.7**
 ====== RetinaFace PyTorch VIM4 Demo - 5 ======
 {{indexmenu_n>5}}
+===== Introduction =====
+RetinaFace is a face detection model. It locates five key points on each face: the two eyes, the nose, and the two corners of the mouth.
+Inference results on VIM4:
+{{:products:sbc:vim4:npu:demos:retinaface-demo-output.webp?800|}}
+**Inference speed test**: about **78 ms** per frame with a USB camera.
 ===== Get source code =====
@@ Line 34: / Line 47: @@
 Follow Docker official documentation to install Docker: [[https://docs.docker.com/engine/install/ubuntu/|Install Docker Engine on Ubuntu]].
-Then fetch the prebuilt NPU Docker container and run it.
+Run the command below to pull the Docker image:
 ```shell
-$ docker pull yanwyb/npu:v1
+docker pull numbqq/npu-vim4
-$ docker run -it --name vim4-npu1 -v $(pwd):/home/khadas/npu \
-				-v /etc/localtime:/etc/localtime:ro \
-				-v /etc/timezone:/etc/timezone:ro \
-				yanwyb/npu:v1
 ```
-==== Get conversion tool ====
+==== Get Conversion Tool ====
-Download Tool from [[gl>khadas/vim4_npu_sdk]].
+Download the tool from [[gh>khadas/vim4_npu_sdk]].
 ```shell
-$ git clone https://gitlab.com/khadas/vim4_npu_sdk
+$ git lfs install
+$ git lfs clone https://github.com/khadas/vim4_npu_sdk
+$ cd vim4_npu_sdk
+$ ls
+adla-toolkit-binary  adla-toolkit-binary-3.1.7.4  convert-in-docker.sh  Dockerfile  docs  README.md
 ```
-==== Convert ====
+  * ''adla-toolkit-binary/docs'' - SDK documentation
+  * ''adla-toolkit-binary/bin'' - SDK tools required for model conversion
+  * ''adla-toolkit-binary/demo'' - Conversion examples
-After training the model, we should convert the PyTorch model into an ONNX model.
+<WRAP important>
+If your kernel is older than 241129, please use branch npu-ddk-1.7.5.5.
+</WRAP>
-Copy ''nets/retinaface.py'' and rename ''retinaface_export.py''. Modify ''retinaface_export.py'' as follows.
+==== Convert ====
-```diff
-class ClassHead(nn.Module):
-    def __init__(self,inchannels=512,num_anchors=2):
-        super(ClassHead,self).__init__()
-        self.num_anchors = num_anchors
-        self.conv1x1 = nn.Conv2d(inchannels,self.num_anchors*2,kernel_size=(1,1),stride=1,padding=0)
-    def forward(self,x):
-        out = self.conv1x1(x)
--       out = out.permute(0,2,3,1).contiguous()
-+       out = out.contiguous()
--       return out.view(out.shape[0], -1, 2)
-+       return out.view(out.shape[0], 4, -1)
-class BboxHead(nn.Module):
-    def __init__(self,inchannels=512,num_anchors=2):
-        super(BboxHead,self).__init__()
-        self.conv1x1 = nn.Conv2d(inchannels,num_anchors*4,kernel_size=(1,1),stride=1,padding=0)
-    def forward(self,x):
-        out = self.conv1x1(x)
--       out = out.permute(0,2,3,1).contiguous()
-+       out = out.contiguous()
--       return out.view(out.shape[0], -1, 4)
-+       return out.view(out.shape[0], 8, -1)
-class LandmarkHead(nn.Module):
-    def __init__(self,inchannels=512,num_anchors=2):
-        super(LandmarkHead,self).__init__()
-        self.conv1x1 = nn.Conv2d(inchannels,num_anchors*10,kernel_size=(1,1),stride=1,padding=0)
-    def forward(self,x):
-        out = self.conv1x1(x)
--       out = out.permute(0,2,3,1).contiguous()
-+       out = out.contiguous()
--       return out.view(out.shape[0], -1, 10)
-+       return out.view(out.shape[0], 20, -1)
-```
-```diff
--       bbox_regressions    = torch.cat([self.BboxHead[i](feature) for i, feature in enumerate(features)], dim=1)
--       classifications     = torch.cat([self.ClassHead[i](feature) for i, feature in enumerate(features)], dim=1)
--       ldm_regressions     = torch.cat([self.LandmarkHead[i](feature) for i, feature in enumerate(features)], dim=1)
-+       bbox_regressions    = torch.cat([self.BboxHead[i](feature) for i, feature in enumerate(features)], dim=2)
-+       classifications     = torch.cat([self.ClassHead[i](feature) for i, feature in enumerate(features)], dim=2)
-+       ldm_regressions     = torch.cat([self.LandmarkHead[i](feature) for i, feature in enumerate(features)], dim=2)
-        if self.mode == 'train':
-            output = (bbox_regressions, classifications, ldm_regressions)
-        else:
--           output = (bbox_regressions, F.softmax(classifications, dim=-1), ldm_regressions)
-+           output = (bbox_regressions, classifications, ldm_regressions)
-        return output
-```
-Create the Python conversion script as follows and run.
+After training, convert the PyTorch model to an ONNX model. Create the following Python export script and run it.
 ```python export.py
 import torch
 import numpy as np
-from nets.retinaface_export import RetinaFace
+from nets.retinaface import RetinaFace
 from utils.config import cfg_mnet, cfg_re50
@@ Line 151: / Line 110: @@
         --inputs "images" \
         --input-shapes  "3,640,640"  \
+        --dtypes "float32" \
         --inference-input-type float32 \
         --inference-output-type float32 \
-        --dtypes "float32" \
         --quantize-dtype int8 --outdir onnx_output  \
         --channel-mean-value "0,0,0,255"  \
-        --source-file ./dataset.txt  \
+        --source-file ./retinaface_dataset.txt  \
         --iterations 500 \
         --disable-per-channel False \
         --batch-size 1 --target-platform PRODUCT_PID0XA003
 ```
-<WRAP important >
-Please prepare about 500 pictures for quantification. If the pictures size is smaller than model input size, please resize pictures to input size before quantification.
-</WRAP>
 Run ''convert_adla.sh'' to generate the VIM4 model. The converted model is ''xxx.adla'' in ''onnx_output''.
@@ Line 182: / Line 137: @@
 ```
-<WRAP important >
+<WRAP important>
-If your kernel version is 5.4 or earlier, please use tag ''ddk-1.7.5.5''. Tag ''ddk-2.3.6.7'' is for 5.15.
+If your kernel is older than 241129, please use a version before tag ''ddk-3.4.7.7''.
 </WRAP>
@@ Line 208: / Line 163: @@
 # Run
-$ sudo ./retinaface -m ../data/retinaface_int8.adla -p ../data/timg.jpg
+$ ./retinaface -m ../data/retinaface_int8.adla -p ../data/timg.jpg
 ```
@@ Line 214: / Line 169: @@
 Put ''retinaface_int8.adla'' in ''vim4_npu_applications/retinaface_cap/data/''.
+== Compile ==
 ```shell
-# Compile
+$ cd vim4_npu_applications/retinaface_cap
-$ cd vim4_npu_applications/retinaface_cap
 $ mkdir build
 $ cd build
 $ cmake ..
 $ make
+```
-# Run
+== Run ==
-$ sudo ./retinaface_cap -m ../data/retinaface_int8.adla -d 0 -w 1920 -h 1080
+**MIPI Camera**
+```
+$ ./retinaface_cap -m ../data/retinaface_int8.adla -t mipi
+```
+**USB Camera**
+```
+$ cd build
+$ ./retinaface_cap -m ../data/retinaface_int8.adla -t usb -d 0
 ```
-''0'' is the camera device index.
+**TIP**: Replace ''0'' with the index of your camera device. For example, for ''/dev/video5'', use ''-d 5''.
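The mapping from a device node to the ''-d'' argument described in the tip can be sketched in a few lines of Python. This is only an illustration: ''camera_index'' and ''usb_camera_command'' are hypothetical helper names, not part of the Khadas SDK, and the sketch assumes the camera appears as a ''/dev/videoN'' node. It only builds the command string shown above and does not run it.

```python
import re


def camera_index(device_path: str) -> int:
    """Return N for a device node path of the form /dev/videoN."""
    match = re.fullmatch(r"/dev/video(\d+)", device_path)
    if match is None:
        raise ValueError(f"not a /dev/videoN path: {device_path!r}")
    return int(match.group(1))


def usb_camera_command(device_path: str,
                       model: str = "../data/retinaface_int8.adla") -> str:
    """Build the USB camera command line from this page for a given device node."""
    return f"./retinaface_cap -m {model} -t usb -d {camera_index(device_path)}"


if __name__ == "__main__":
    # Hypothetical example: the camera is assumed to be /dev/video5.
    print(usb_camera_command("/dev/video5"))
```

Rejecting anything that is not a ''/dev/videoN'' path keeps a mistyped device name from silently turning into the wrong index.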
