
YOLOv8n KSNN Demo - 2

Train the model

Download the official YOLOv8 code from ultralytics/ultralytics:

$ git clone https://github.com/ultralytics/ultralytics

Refer to the README.md to create and train a YOLOv8n model. We used torch==1.10.1 and ultralytics==8.0.86.
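As a rough sketch, training with the ultralytics Python API looks like the following; the dataset YAML, epoch count, and starting weights are placeholders to replace with your own, not values from this guide.

train.py
from ultralytics import YOLO

# Start from the pretrained nano weights and fine-tune on your own dataset
model = YOLO("yolov8n.pt")
model.train(data="your_dataset.yaml", epochs=100, imgsz=640)  # 640 matches the demo input size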

Convert the model

Get the conversion tool

$ git lfs install
$ git lfs clone https://gitlab.com/khadas/vim4_npu_sdk.git

Convert

After training the model, modify ultralytics/ultralytics/nn/modules/head.py as follows.

head.py
diff --git a/ultralytics/nn/modules/head.py b/ultralytics/nn/modules/head.py
index 0b02eb3..0a6e43a 100644
--- a/ultralytics/nn/modules/head.py
+++ b/ultralytics/nn/modules/head.py
@@ -42,6 +42,9 @@ class Detect(nn.Module):
 
     def forward(self, x):
         """Concatenates and returns predicted bounding boxes and class probabilities."""
+        if torch.onnx.is_in_onnx_export():
+            return self.forward_export(x)
+
         shape = x[0].shape  # BCHW
         for i in range(self.nl):
             x[i] = torch.cat((self.cv2[i](x[i]), self.cv3[i](x[i])), 1)
@@ -80,6 +83,15 @@ class Detect(nn.Module):
             a[-1].bias.data[:] = 1.0  # box
             b[-1].bias.data[:m.nc] = math.log(5 / m.nc / (640 / s) ** 2)  # cls (.01 objects, 80 classes, 640 img)
 
+
+    def forward_export(self, x):
+        results = []
+        for i in range(self.nl):
+            dfl = self.cv2[i](x[i]).contiguous()
+            cls = self.cv3[i](x[i]).contiguous()
+            results.append(torch.cat([cls, dfl], 1))
+        return tuple(results)
+

If you installed the ultralytics package with pip, make the same modification to head.py inside the installed package.
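If you are unsure where pip placed the package, a small sketch to print the path of the file to edit:

locate_head.py
import os
import ultralytics

# Print the location of head.py inside the installed ultralytics package
print(os.path.join(os.path.dirname(ultralytics.__file__), "nn", "modules", "head.py"))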

Create a Python file as follows to export the ONNX model.

export.py
from ultralytics import YOLO
model = YOLO("./runs/detect/train/weights/best.pt")
results = model.export(format="onnx")

Then run the script:

$ python export.py

Use Netron to check your model's outputs. The model should have three output tensors, e.g. 1×144×80×80, 1×144×40×40, and 1×144×20×20 for an 80-class model. If not, re-check your head.py changes.
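If you prefer a scripted check over Netron, a quick sketch using onnxruntime (assuming an 80-class model, so 80 + 64 = 144 channels per output) can verify the shapes:

check_onnx.py
import numpy as np
import onnxruntime as ort

sess = ort.InferenceSession("yolov8n.onnx")
dummy = np.zeros((1, 3, 640, 640), dtype=np.float32)
outputs = sess.run(None, {sess.get_inputs()[0].name: dummy})
for out in outputs:
    print(out.shape)  # expect (1, 144, 80, 80), (1, 144, 40, 40), (1, 144, 20, 20)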

Copy the yolov8n.onnx model into vim4_npu_sdk/adla-toolkit-binary-1.2.0.9/python, then run convert-in-docker.sh:

$ ./convert-in-docker.sh ksnn

Please remember to add a space at the end of each parameter.

If your yolov8n model's parameters differ from ours, change them in ksnn_args.txt.

Run inference on the NPU with KSNN

Install KSNN

Download the KSNN library and demo code from khadas/ksnn-vim4:

$ git clone https://github.com/khadas/ksnn-vim4
$ cd ksnn-vim4/ksnn
$ sudo apt update
$ sudo apt install python3-pip
$ pip3 install ksnn_vim4-1.4-py3-none-any.whl

Put yolov8n_int8.adla and libnn_yolov8n.so into ksnn-vim4/examples/yolov8n/models/VIM4 and ksnn-vim4/examples/yolov8n/libs respectively.

If your model does not have 80 classes, remember to modify the parameter LISTSIZE:

LISTSIZE = number of classes + 64

The 64 extra channels are the 4 × 16 DFL box-regression bins, so an 80-class model yields 144-channel outputs.
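As an illustration of that layout (class channels first, then the 4 × 16 DFL box bins, matching the head.py patch above), a slicing sketch:

import numpy as np

NUM_CLASSES = 80
LISTSIZE = NUM_CLASSES + 64  # 64 = 4 box sides x 16 DFL bins

grid = np.zeros((80, 80, LISTSIZE), dtype=np.float32)      # one output grid of the stride-8 level
cls_scores = grid[..., :NUM_CLASSES]                       # per-class score channels
dfl_bins = grid[..., NUM_CLASSES:].reshape(80, 80, 4, 16)  # box-distance distributions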

Picture input demo

$ cd ksnn-vim4/examples/yolov8n
$ python3 yolov8n-picture.py --model ./models/VIM4/yolov8n_int8.adla --library ./libs/libnn_yolov8n.so --picture ./data/horses.jpg --level 0

Camera input demo

$ cd ksnn-vim4/examples/yolov8n
$ python3 yolov8n-cap.py --model ./models/VIM4/yolov8n_int8.adla --library ./libs/libnn_yolov8n.so --device 0

0 is the camera device index.

For an RGB input model:

    # yolov8n_int8
    orig_img = cv.imread(picture, cv.IMREAD_COLOR)
    img = cv.resize(orig_img, (640, 640))
    
    print('Done.')

    print('Start inference ...')
    start = time.time()
    data = yolov8.nn_inference(img, input_shape=(640, 640, 3), input_type="RGB", output_shape=[(80, 80, 144), (40, 40, 144), (20, 20, 144)], output_type="FLOAT")
    end = time.time()
    print('Done. inference time: ', end - start)

If you want to use a RAW input model, use this input code instead.

    # yolov8n_int8_raw
    # mean and var must match the normalization values used when converting the model
    img = cv.resize(orig_img, (640, 640)).astype(np.float32)
    img[:, :, 0] = img[:, :, 0] - mean[0]
    img[:, :, 1] = img[:, :, 1] - mean[1]
    img[:, :, 2] = img[:, :, 2] - mean[2]
    img = img / var[0]
    
    print('Done.')

    print('Start inference ...')
    start = time.time()
    data = yolov8.nn_inference(img, input_shape=(640, 640, 3), input_type="RAW", output_shape=[(80, 80, 144), (40, 40, 144), (20, 20, 144)], output_type="RAW")
    end = time.time()
    print('Done. inference time: ', end - start)
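The demo's own post-processing decodes these outputs for you; purely as an illustration of what that involves, a simplified sketch of decoding one (H, W, 144) grid (assuming class channels first, strides 8/16/32, and reg_max = 16) might look like this. Confidence filtering and NMS are omitted for brevity.

decode_sketch.py
import numpy as np

def decode_level(out, stride, num_classes=80, reg_max=16):
    """Turn one (H, W, num_classes + 4*reg_max) grid into xyxy boxes and class scores."""
    h, w, _ = out.shape
    scores = 1.0 / (1.0 + np.exp(-out[..., :num_classes]))  # sigmoid class scores
    dfl = out[..., num_classes:].reshape(h, w, 4, reg_max)
    e = np.exp(dfl - dfl.max(axis=-1, keepdims=True))       # softmax over the 16 bins
    probs = e / e.sum(axis=-1, keepdims=True)
    dist = probs @ np.arange(reg_max, dtype=np.float32)     # expected l, t, r, b in grid units
    ys, xs = np.mgrid[0:h, 0:w]
    cx, cy = xs + 0.5, ys + 0.5                             # anchor-point centres
    boxes = np.stack([(cx - dist[..., 0]) * stride,
                      (cy - dist[..., 1]) * stride,
                      (cx + dist[..., 2]) * stride,
                      (cy + dist[..., 3]) * stride], axis=-1)
    return boxes.reshape(-1, 4), scores.reshape(-1, num_classes)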