~~tag> NPU YOLO KSNN VIM4 ~~

====== YOLOv8n KSNN Demo - 2 ======

{{indexmenu_n>2}}

===== Train the model =====

Download the official YOLOv8 code. [[gh>ultralytics/ultralytics]]

```shell
$ git clone https://github.com/ultralytics/ultralytics
```

Refer to ''README.md'' to create and train a YOLOv8n model. This demo was tested with ''torch==1.10.1'' and ''ultralytics==8.0.86''.

===== Convert the model =====

==== Get the conversion tool ====

```shell
$ git lfs install
$ git lfs clone https://gitlab.com/khadas/vim4_npu_sdk.git
```

==== Convert ====

After training the model, modify ''ultralytics/ultralytics/nn/modules/head.py'' as follows.

```diff head.py
diff --git a/ultralytics/nn/modules/head.py b/ultralytics/nn/modules/head.py
index 0b02eb3..0a6e43a 100644
--- a/ultralytics/nn/modules/head.py
+++ b/ultralytics/nn/modules/head.py
@@ -42,6 +42,9 @@ class Detect(nn.Module):
     def forward(self, x):
         """Concatenates and returns predicted bounding boxes and class probabilities."""
+        if torch.onnx.is_in_onnx_export():
+            return self.forward_export(x)
+
         shape = x[0].shape  # BCHW
         for i in range(self.nl):
             x[i] = torch.cat((self.cv2[i](x[i]), self.cv3[i](x[i])), 1)
@@ -80,6 +83,15 @@ class Detect(nn.Module):
             a[-1].bias.data[:] = 1.0  # box
             b[-1].bias.data[:m.nc] = math.log(5 / m.nc / (640 / s) ** 2)  # cls (.01 objects, 80 classes, 640 img)
 
+    def forward_export(self, x):
+        results = []
+        for i in range(self.nl):
+            dfl = self.cv2[i](x[i]).contiguous()
+            cls = self.cv3[i](x[i]).contiguous()
+            results.append(torch.cat([cls, dfl], 1))
+        return tuple(results)
+
```

If you installed ultralytics with pip, apply the same change to ''head.py'' inside the installed package.

Create a Python file with the following content to export the ONNX model.

```python export.py
from ultralytics import YOLO

# Load the trained weights and export them to ONNX
model = YOLO("./runs/detect/train/weights/best.pt")
results = model.export(format="onnx")
```

```shell
$ python export.py
```

Use [[https://netron.app/ | Netron]] to check that your model outputs look like this. If they do not, check your changes to ''head.py''.
{{:products:sbc:vim3:npu:ksnn:yolov8n-vim3-ksnn-output.png?600|}}

Put the ''yolov8n.onnx'' model into ''vim4_npu_sdk/adla-toolkit-binary-1.2.0.9/python'' and then run ''convert-in-docker.sh''.

```shell
$ ./convert-in-docker.sh ksnn
```

Please remember to add a space at the end of each parameter.

{{:products:sbc:vim4:npu:ksnn:vim4_ksnn_1.png?300|}}

If your YOLOv8n model parameters differ from ours, change them in ''ksnn_args.txt''.

===== Run inference on the NPU by KSNN =====

==== Install KSNN ====

Download the KSNN library and demo code. [[gh>khadas/ksnn-vim4]]

```shell
$ git clone https://github.com/khadas/ksnn-vim4
$ cd ksnn-vim4/ksnn
$ sudo apt update
$ sudo apt install python3-pip
$ pip3 install ksnn_vim4-1.4-py3-none-any.whl
```

Put ''yolov8n_int8.adla'' and ''libnn_yolov8n.so'' into ''ksnn-vim4/examples/yolov8n/models/VIM4'' and ''ksnn-vim4/examples/yolov8n/libs'' respectively.

If your model does not have 80 classes, remember to update the ''LISTSIZE'' parameter.

```
LISTSIZE = classes number + 64
```

==== Picture input demo ====

```shell
$ cd ksnn-vim4/examples/yolov8n
$ python3 yolov8n-picture.py --model ./models/VIM4/yolov8n_int8.adla --library ./libs/libnn_yolov8n.so --picture ./data/horses.jpg --level 0
```

==== Camera input demo ====

```shell
$ cd ksnn-vim4/examples/yolov8n
$ python3 yolov8n-cap.py --model ./models/VIM4/yolov8n_int8.adla --library ./libs/libnn_yolov8n.so --device 0
```

''0'' is the camera device index.

For an ''RGB'' input model, use the following input code.

```python
# yolov8n_int8
# Read the picture and resize it to the 640x640 model input
orig_img = cv.imread(picture, cv.IMREAD_COLOR)
img = cv.resize(orig_img, (640, 640))
print('Done.')

print('Start inference ...')
start = time.time()
data = yolov8.nn_inference(img, input_shape=(640, 640, 3), input_type="RGB", output_shape=[(80, 80, 144), (40, 40, 144), (20, 20, 144)], output_type="FLOAT")
end = time.time()
print('Done. inference time: ', end - start)
```

To use a ''RAW'' input model, use the following input code instead.
```python
# yolov8n_int8_raw
# mean and var must already be defined; np is numpy
# Resize to the model input, then normalize each channel by hand
img = cv.resize(orig_img, (640, 640)).astype(np.float32)
img[:, :, 0] = img[:, :, 0] - mean[0]
img[:, :, 1] = img[:, :, 1] - mean[1]
img[:, :, 2] = img[:, :, 2] - mean[2]
img = img / var[0]
print('Done.')

print('Start inference ...')
start = time.time()
data = yolov8.nn_inference(img, input_shape=(640, 640, 3), input_type="RAW", output_shape=[(80, 80, 144), (40, 40, 144), (20, 20, 144)], output_type="RAW")
end = time.time()
print('Done. inference time: ', end - start)
```
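
The relationship between the class count, ''LISTSIZE'', and the ''output_shape'' tuples passed to ''nn_inference'' can be sketched as follows. This is a minimal sketch and not part of the KSNN demo: the helper names ''list_size'' and ''output_shapes'' are hypothetical, and it assumes the default YOLOv8 head with 16 DFL bins per box side (4 x 16 = 64 channels) and detection strides of 8, 16 and 32.

```python
# Hypothetical helpers for illustration; not part of KSNN or ultralytics.
REG_MAX = 16            # assumption: YOLOv8 default number of DFL bins per box side
STRIDES = (8, 16, 32)   # assumption: YOLOv8 default detection-head strides


def list_size(num_classes, reg_max=REG_MAX):
    """Channels per grid cell: class scores plus 4 box sides * reg_max bins."""
    return num_classes + 4 * reg_max


def output_shapes(input_size=640, num_classes=80):
    """One (height, width, channels) tuple per detection head."""
    channels = list_size(num_classes)
    return [(input_size // s, input_size // s, channels) for s in STRIDES]


print(list_size(80))            # 144
print(output_shapes(640, 80))   # [(80, 80, 144), (40, 40, 144), (20, 20, 144)]
```

Under these assumptions, a model trained with 3 classes would give ''LISTSIZE = 67'', and the last dimension of each ''output_shape'' entry would presumably need to change from 144 to 67 as well.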