Khadas Docs

products:sbc:vim4:npu:demos:retinaface

Differences

This shows you the differences between two versions of the page.

products:sbc:vim4:npu:demos:retinaface [2024/01/04 05:16]
louis
products:sbc:vim4:npu:demos:retinaface [2024/01/29 02:16] (current)
louis
Line 54: Line 54:
 ==== Convert ==== ==== Convert ====
  
-After training the model, we should convert the PyTorch model into an ONNX model. +After training the model, we should convert the PyTorch model into an ONNX model. Create the Python conversion script as follows and run it.
- +
-Copy ''nets/retinaface.py'' and rename the copy to ''retinaface_export.py''. Modify ''retinaface_export.py'' as follows. +
- +
-```diff +
-class ClassHead(nn.Module): +
-    def __init__(self,inchannels=512,num_anchors=2): +
-        super(ClassHead,self).__init__() +
-        self.num_anchors = num_anchors +
-        self.conv1x1 = nn.Conv2d(inchannels,self.num_anchors*2,kernel_size=(1,1),stride=1,padding=0) +
- +
-    def forward(self,x): +
-        out = self.conv1x1(x) +
--       out = out.permute(0,2,3,1).contiguous() +
-+       out = out.contiguous() +
-         +
--       return out.view(out.shape[0], -1, 2) +
-+       return out.view(out.shape[0], 4, -1) +
- +
-class BboxHead(nn.Module): +
-    def __init__(self,inchannels=512,num_anchors=2): +
-        super(BboxHead,self).__init__() +
-        self.conv1x1 = nn.Conv2d(inchannels,num_anchors*4,kernel_size=(1,1),stride=1,padding=0) +
- +
-    def forward(self,x): +
-        out = self.conv1x1(x) +
--       out = out.permute(0,2,3,1).contiguous() +
-+       out = out.contiguous() +
- +
--       return out.view(out.shape[0], -1, 4) +
-+       return out.view(out.shape[0], 8, -1) +
- +
-class LandmarkHead(nn.Module): +
-    def __init__(self,inchannels=512,num_anchors=2): +
-        super(LandmarkHead,self).__init__() +
-        self.conv1x1 = nn.Conv2d(inchannels,num_anchors*10,kernel_size=(1,1),stride=1,padding=0) +
- +
-    def forward(self,x): +
-        out = self.conv1x1(x) +
--       out = out.permute(0,2,3,1).contiguous() +
-+       out = out.contiguous() +
- +
--       return out.view(out.shape[0], -1, 10) +
-+       return out.view(out.shape[0], 20, -1) +
-``` +
- +
-```diff +
--       bbox_regressions    = torch.cat([self.BboxHead[i](feature) for i, feature in enumerate(features)], dim=1) +
--       classifications     = torch.cat([self.ClassHead[i](feature) for i, feature in enumerate(features)], dim=1) +
--       ldm_regressions     = torch.cat([self.LandmarkHead[i](feature) for i, feature in enumerate(features)], dim=1) +
-+       bbox_regressions    = torch.cat([self.BboxHead[i](feature) for i, feature in enumerate(features)], dim=2) +
-+       classifications     = torch.cat([self.ClassHead[i](feature) for i, feature in enumerate(features)], dim=2) +
-+       ldm_regressions     = torch.cat([self.LandmarkHead[i](feature) for i, feature in enumerate(features)], dim=2) +
- +
-        if self.mode == 'train': +
-            output = (bbox_regressions, classifications, ldm_regressions) +
-        else: +
--           output = (bbox_regressions, F.softmax(classifications, dim=-1), ldm_regressions) +
-+           output = (bbox_regressions, classifications, ldm_regressions) +
-        return output +
-``` +
- +
-Create the Python conversion script as follows and run it.+
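
The effect of the head changes in the older revision on the output layout can be worked out by hand. The sketch below is plain Python arithmetic and is not part of the repository. It assumes a 640x640 input (as in the conversion command), 2 anchors per location (the ''num_anchors'' default in the head classes), and feature-map strides of 8, 16 and 32. The strides are an assumption; this page does not state them. The helper name ''head_shapes'' is hypothetical.

```python
# Hypothetical helper: output shapes of each head before and after the export change.
INPUT_SIZE = 640          # from --input-shapes "3,640,640"
NUM_ANCHORS = 2           # default num_anchors in the head classes
STRIDES = (8, 16, 32)     # assumed feature-map strides (not stated on this page)


def head_shapes(values_per_anchor, batch=1):
    """Return (original_shape, export_shape) for one head type.

    values_per_anchor: 2 for ClassHead, 4 for BboxHead, 10 for LandmarkHead.
    """
    # Number of spatial cells (H * W) on each feature map.
    cells = [(INPUT_SIZE // s) ** 2 for s in STRIDES]
    # Original: permute to NHWC, view(N, -1, values), concatenate on dim=1.
    original = (batch, sum(c * NUM_ANCHORS for c in cells), values_per_anchor)
    # Export: keep NCHW, view(N, channels, -1), concatenate on dim=2.
    export = (batch, NUM_ANCHORS * values_per_anchor, sum(cells))
    return original, export


for name, values in (("class", 2), ("bbox", 4), ("landmark", 10)):
    original, export = head_shapes(values)
    print(f"{name:8s} original={original} export={export}")
```

Under these assumptions both layouts hold the same number of values; only their ordering differs, so any post-processing has to index the outputs according to the layout the exported model actually uses.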
  
 ```python export.py ```python export.py
 import torch import torch
 import numpy as np import numpy as np
-from nets.retinaface_export import RetinaFace+from nets.retinaface import RetinaFace
 from utils.config import cfg_mnet, cfg_re50 from utils.config import cfg_mnet, cfg_re50
  
Line 151: Line 89:
         --inputs "images" \         --inputs "images" \
         --input-shapes  "3,640,640"  \         --input-shapes  "3,640,640"  \
-        --inference-input-type float32 \ 
-        --inference-output-type float32 \ 
         --dtypes "float32" \         --dtypes "float32" \
 +        --inference-input-type float32 \
 +        --inference-output-type float32 \
         --quantize-dtype int8 --outdir onnx_output  \         --quantize-dtype int8 --outdir onnx_output  \
         --channel-mean-value "0,0,0,255"  \         --channel-mean-value "0,0,0,255"  \
-        --source-file ./dataset.txt  \+        --source-file ./retinaface_dataset.txt  \
         --iterations 500 \         --iterations 500 \
         --disable-per-channel False \         --disable-per-channel False \
         --batch-size 1 --target-platform PRODUCT_PID0XA003         --batch-size 1 --target-platform PRODUCT_PID0XA003
 ``` ```
- 
-<WRAP important > 
-Please prepare about 500 pictures for quantization. If a picture is smaller than the model input size, resize it to the input size before quantization. 
-</WRAP> 
  
 Run ''convert_adla.sh'' to generate the VIM4 model. The converted model is ''xxx.adla'' in ''onnx_output''. Run ''convert_adla.sh'' to generate the VIM4 model. The converted model is ''xxx.adla'' in ''onnx_output''.
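
The conversion command reads its calibration images from the file passed to ''--source-file'', and the earlier revision recommended about 500 pictures. A minimal sketch for producing such a list follows. It assumes the file holds one image path per line, which this page does not confirm. The function name ''write_dataset_list'' and the ''./images'' directory in the usage comment are hypothetical.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}


def write_dataset_list(image_dir, list_path, limit=500):
    """Write up to `limit` image paths from image_dir to list_path, one per line.

    The one-path-per-line layout is an assumption about the --source-file format.
    Returns the number of paths written.
    """
    images = sorted(
        p for p in Path(image_dir).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )[:limit]
    Path(list_path).write_text("".join(f"{p.as_posix()}\n" for p in images))
    return len(images)


# Example call (hypothetical image directory; list name as in the command above):
# write_dataset_list("./images", "./retinaface_dataset.txt")
```

This sketch only lists the files. Resizing undersized pictures to the model input size, as the earlier note asked, would be a separate step.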
Line 181: Line 115:
 $ git clone https://github.com/khadas/vim4_npu_applications $ git clone https://github.com/khadas/vim4_npu_applications
 ``` ```
- 
-<WRAP important > 
-If your kernel version is 5.4 or earlier, please use tag ''ddk-1.7.5.5''. Tag ''ddk-2.3.6.7'' is for 5.15. 
-</WRAP> 
  
 ==== Install dependencies ==== ==== Install dependencies ====
Line 224: Line 154:
  
 # Run # Run
-$ sudo ./retinaface_cap -m ../data/retinaface_int8.adla -d 0 -w 1920 -h 1080+$ sudo ./retinaface_cap -m ../data/retinaface_int8.adla -d 0
 ``` ```
  
 ''0'' is the camera device index. ''0'' is the camera device index.
  
Last modified: 2024/01/04 05:16 by louis