~~tag> NPU YOLO OpenCV VIM4 ~~

**Doc for version ddk-3.4.7.7**

====== YOLOv8n-Pose OpenCV VIM4 Demo - 8 ======

{{indexmenu_n>8}}

===== Introduction =====

YOLOv8n-Pose inherits the powerful object detection backbone and neck architecture of YOLOv8n. It extends the standard YOLOv8n object detection model by integrating dedicated pose estimation layers into its head. This allows it not only to detect people (bounding boxes) but also to simultaneously predict the spatial positions (keypoints) of their anatomical joints (e.g., shoulders, elbows, knees, ankles).
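Each detected person therefore carries one bounding box plus a fixed set of keypoints. Below is a minimal sketch of how those keypoints are commonly unpacked, assuming the standard COCO 17-keypoint order that YOLOv8n-Pose uses by default (the same ''17*3'' layout appears in the ''head.py'' patch later on this page); the helper and file name are illustrative only.

```python coco_keypoints.py
# Illustrative helper (not part of the demo code): unpack one person's keypoints,
# assuming the default COCO 17-keypoint order and (x, y, confidence) triples.
COCO_KEYPOINTS = [
    "nose", "left_eye", "right_eye", "left_ear", "right_ear",
    "left_shoulder", "right_shoulder", "left_elbow", "right_elbow",
    "left_wrist", "right_wrist", "left_hip", "right_hip",
    "left_knee", "right_knee", "left_ankle", "right_ankle",
]

def unpack_keypoints(kpts_flat):
    """Turn a flat sequence of 17*3 values into {name: (x, y, conf)}."""
    assert len(kpts_flat) == len(COCO_KEYPOINTS) * 3
    return {
        name: tuple(kpts_flat[3 * i:3 * i + 3])
        for i, name in enumerate(COCO_KEYPOINTS)
    }
```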

Inference results on VIM4:

{{:products:sbc:vim4:npu:demos:yolov8n-pose-vim4-c-result.jpg?800|}}

**Inference speed test**: USB camera, about **90 ms** per frame.

===== Get Source Code =====

Download the official YOLOv8 code from [[gh>ultralytics/ultralytics]].

```shell
$ git clone https://github.com/ultralytics/ultralytics
```

Refer to ''README.md'' to train a YOLOv8n-Pose model. The versions used here are ''torch==1.10.1'' and ''ultralytics==8.0.86''.
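If you prefer the Python API for training, a minimal sketch is shown below (the dataset YAML and epoch count are placeholders; see the ultralytics documentation for full options):

```python train_pose.py
# Minimal training sketch; dataset path and epoch count are placeholders.
from ultralytics import YOLO

model = YOLO("yolov8n-pose.pt")            # start from the pretrained pose weights
model.train(data="coco8-pose.yaml",        # replace with your own pose dataset YAML
            epochs=100, imgsz=640)
# The best weights are saved as runs/pose/train/weights/best.pt
```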

===== Convert Model =====

==== Build virtual environment ====

Follow the official Docker documentation to install Docker: [[https://docs.docker.com/engine/install/ubuntu/|Install Docker Engine on Ubuntu]].

Then pull the Docker image:

```shell
docker pull numbqq/npu-vim4
```

==== Get Model Conversion Tools ====

Get the conversion tool source from [[gh>khadas/vim4_npu_sdk]].

```shell
$ git lfs install
$ git lfs clone https://github.com/khadas/vim4_npu_sdk
$ cd vim4_npu_sdk
$ ls
adla-toolkit-binary  adla-toolkit-binary-3.1.7.4  convert-in-docker.sh  Dockerfile  docs  README.md
```

  * ''adla-toolkit-binary/docs'' - SDK documentation
  * ''adla-toolkit-binary/bin'' - SDK tools required for model conversion
  * ''adla-toolkit-binary/demo'' - conversion examples

<WRAP important>
If your kernel is older than 241129, please use branch ''npu-ddk-1.7.5.5''.
</WRAP>

==== Convert ====

After training the model, modify class ''Detect'' and class ''Pose'' in ''ultralytics/ultralytics/nn/modules/head.py'' as follows. (If you use ''ultralytics==8.0.86'', the classes are in ''ultralytics/ultralytics/nn/modules.py''.)

```diff head.py
diff --git a/ultralytics/nn/modules/head.py b/ultralytics/nn/modules/head.py
index 0b02eb3..0a6e43a 100644
--- a/ultralytics/nn/modules/head.py
+++ b/ultralytics/nn/modules/head.py
@@ -42,6 +42,9 @@ class Detect(nn.Module):
 
     def forward(self, x):
         """Concatenates and returns predicted bounding boxes and class probabilities."""
+        if torch.onnx.is_in_onnx_export():
+            return self.forward_export(x)
+
        shape = x[0].shape  # BCHW
        for i in range(self.nl):
            x[i] = torch.cat((self.cv2[i](x[i]), self.cv3[i](x[i])), 1)
@@ -80,6 +83,15 @@ class Detect(nn.Module):
            a[-1].bias.data[:] = 1.0  # box
            b[-1].bias.data[:m.nc] = math.log(5 / m.nc / (640 / s) ** 2)  # cls (.01 objects, 80 classes, 640 img)
 
+    def forward_export(self, x):
+        results = []
+        for i in range(self.nl):
+            dfl = self.cv2[i](x[i]).contiguous()
+            cls = self.cv3[i](x[i]).contiguous()
+            results.append(torch.cat([cls, dfl], 1))
+        return tuple(results)
+

@@ -255,6 +283,16 @@ class Pose(Detect):
    def forward(self, x):
        """Perform forward pass through YOLO model and return predictions."""
        bs = x[0].shape[0]  # batch size
-        kpt = torch.cat([self.cv4[i](x[i]).view(bs, self.nk, -1) for i in range(self.nl)], -1)  # (bs, 17*3, h*w)
+        if torch.onnx.is_in_onnx_export():
+            kpt = [self.cv4[i](x[i]) for i in range(self.nl)]
+        else:
+            kpt = torch.cat([self.cv4[i](x[i]).view(bs, self.nk, -1) for i in range(self.nl)], -1)  # (bs, 17*3, h*w)
        x = self.detect(self, x)
+
+        if torch.onnx.is_in_onnx_export():
+            output = []
+            for i in range(self.nl):
+                output.append((torch.cat([x[i], kpt[i]], dim=1)))
+            return output
```

<WRAP important>
If you installed the ultralytics package via pip, you need to modify these classes in the installed package instead.
</WRAP>
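If you are not sure where pip installed it, a quick helper like the one below (or simply ''pip show ultralytics'') prints the package location; the file name is illustrative only.

```python where_ultralytics.py
# Print the install location of the ultralytics package,
# so you know where to edit head.py / modules.py.
import os
import ultralytics

print(os.path.dirname(ultralytics.__file__))
```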

Create a Python file as follows to export the ONNX model.

```python export.py
from ultralytics import YOLO
model = YOLO("./runs/pose/train/weights/best.pt")
results = model.export(format="onnx")
```

```shell
$ python export.py
```

<WRAP important>
Use [[https://netron.app/ | Netron]] to check that your model outputs look like this. If they do not, please check your ''head.py'' changes.

{{:products:sbc:vim4:npu:demos:yolov8n-pose-vim4-output.png?600|}}
</WRAP>
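If you prefer a scripted check over Netron, the sketch below uses the ''onnx'' Python package (assumed to be installed) to print the graph output names and shapes; the model path follows the export step above.

```python check_outputs.py
# Print ONNX output names and shapes to verify the modified head exported correctly.
import onnx

model = onnx.load("./runs/pose/train/weights/best.onnx")  # path assumed from the export step
for out in model.graph.output:
    dims = [d.dim_value for d in out.type.tensor_type.shape.dim]
    print(out.name, dims)
```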

Enter ''vim4_npu_sdk/demo'' and modify ''convert_adla.sh'' as follows.

```sh convert_adla.sh
#!/bin/bash

ACUITY_PATH=../bin/
#ACUITY_PATH=../python/tvm/
adla_convert=${ACUITY_PATH}adla_convert


if [ ! -e "$adla_convert" ]; then
    adla_convert=${ACUITY_PATH}adla_convert.py
fi

$adla_convert --model-type onnx \
        --model ./model_source/yolov8n_pose/yolov8n_pose.onnx \
        --inputs "images" \
        --input-shapes  "3,640,640"  \
        --dtypes "float32" \
        --quantize-dtype int16 --outdir onnx_output  \
        --channel-mean-value "0,0,0,255"  \
        --inference-input-type "float32" \
        --inference-output-type "float32" \
        --source-file dataset.txt  \
        --batch-size 1 --target-platform PRODUCT_PID0XA003
```
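The ''--source-file dataset.txt'' option points to a list of images used for quantization calibration. A minimal sketch to generate such a list is shown below, assuming the file simply contains one image path per line (the folder name is a placeholder; check the SDK documentation in ''adla-toolkit-binary/docs'' for the exact format):

```python make_dataset_list.py
# Build dataset.txt from a folder of calibration images (assumed format: one path per line).
from pathlib import Path

image_dir = Path("./calibration_images")   # placeholder folder with a few representative images
paths = sorted(str(p) for p in image_dir.glob("*.jpg"))
Path("dataset.txt").write_text("\n".join(paths) + "\n")
print(f"Wrote {len(paths)} image paths to dataset.txt")
```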

Run ''convert_adla.sh'' to generate the VIM4 model. The converted model is ''xxx.adla'' in ''onnx_output''.

```shell
$ bash convert_adla.sh
```

===== Run NPU =====

==== Get source code ====

Clone the source code from our [[gh>khadas/vim4_npu_applications]].

```shell
$ git clone https://github.com/khadas/vim4_npu_applications
```

<WRAP important>
If your kernel is older than 241129, please use a version before tag ddk-3.4.7.7.
</WRAP>

==== Install dependencies ====

```shell
$ sudo apt update
$ sudo apt install libopencv-dev python3-opencv cmake
```

==== Compile and run ====

=== Picture input demo ===

Put ''yolov8n_pose_int8.adla'' in ''vim4_npu_applications/yolov8n_pose/data/''.

```shell
# Compile
$ cd vim4_npu_applications/yolov8n_pose
$ mkdir build
$ cd build
$ cmake ..
$ make

# Run
$ ./yolov8n_pose -m ../data/yolov8n_pose_int8.adla -p ../data/bus.jpg
```

=== Camera input demo ===

Put ''yolov8n_pose_int8.adla'' in ''vim4_npu_applications/yolov8n_pose_cap/data/''.

```shell
# Compile
$ cd vim4_npu_applications/yolov8n_pose_cap
$ mkdir build
$ cd build
$ cmake ..
$ make

# Run
$ ./yolov8n_pose_cap -m ../data/yolov8n_pose_int8.adla -t usb -d 0
```

''0'' is the camera device index.
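If you are not sure which index your USB camera has, the small sketch below uses the ''python3-opencv'' package installed earlier to probe the first few ''/dev/video'' devices:

```python probe_camera.py
# Probe /dev/video0 .. /dev/video3 and report which index delivers frames.
import cv2

for idx in range(4):
    cap = cv2.VideoCapture(idx)
    ok, _ = cap.read()
    cap.release()
    if ok:
        print(f"/dev/video{idx} works; pass -d {idx} to the demo")
```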