PPOCR KSNN Demo - 10

Introduction

PPOCR (PP-OCR, from the PaddleOCR project) is an open-source Optical Character Recognition system. It is designed to be practical, lightweight, and fast, which makes it suitable for running OCR directly on mobile or edge devices with limited computational resources (such as smartphones or IoT devices), as well as for high-volume cloud-based processing.

Inference results on VIM4.

Download the model

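Download the PPOCR models from ppocr/model_list.md. This demo uses the ch_PP-OCRv4_det_infer (detection) and ch_PP-OCRv4_rec_infer (recognition) models.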

Convert the model

Get the conversion tool

Get the source from khadas/vim4_npu_sdk.

$ git lfs install
$ git lfs clone https://github.com/khadas/vim4_npu_sdk
$ cd vim4_npu_sdk
$ ls
adla-toolkit-binary  adla-toolkit-binary-3.1.7.4  convert-in-docker.sh  Dockerfile  docs  README.md

adla-toolkit-binary/docs - SDK documentation
adla-toolkit-binary/bin - SDK tools required for model conversion
adla-toolkit-binary/demo - Conversion examples

Only the conversion tool from branch npu-ddk-3.4.7.7 or higher supports PPOCR.
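The version requirement above can be checked mechanically. The following is a minimal sketch, assuming that branch and tag names end in a dotted numeric version such as 3.4.7.7. The helper names are hypothetical and are not part of the SDK.

```python
import re

# Minimum version stated in the requirement above.
MIN_PPOCR_VERSION = (3, 4, 7, 7)


def parse_version(name):
    """Extract the trailing dotted version from a name like 'npu-ddk-3.4.7.7'."""
    match = re.search(r"(\d+(?:\.\d+)+)$", name)
    if match is None:
        raise ValueError(f"no dotted version found in {name!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def supports_ppocr(name):
    """Return True if the version in the name is at least MIN_PPOCR_VERSION."""
    return parse_version(name) >= MIN_PPOCR_VERSION


print(supports_ppocr("npu-ddk-3.4.7.7"))
print(supports_ppocr("npu-ddk-3.4.7.6"))
```

Comparing tuples of integers avoids the pitfalls of comparing version strings character by character (for example, "3.10" sorting before "3.4").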

Convert

First, convert the model from Paddle to ONNX. Please refer to these two documents: Paddle2ONNX/blob/develop/README_en.md and PaddleOCR/blob/main/docs/ppocr/infer_deploy/paddle2onnx.en.md.

We used paddlepaddle==2.6.1 and paddle2onnx==1.2.11 with the following conversion commands.

$ paddle2onnx --model_dir ./ch_PP-OCRv4_det_infer --model_filename inference.pdmodel --params_filename inference.pdiparams --save_file ppocr_det.onnx
$ paddle2onnx --model_dir ./ch_PP-OCRv4_rec_infer --model_filename inference.pdmodel --params_filename inference.pdiparams --save_file ppocr_rec.onnx
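The two commands differ only in the model directory and the output file name, so they can also be driven from a small script. This is a sketch, assuming paddle2onnx is on the PATH and the model directories sit in the current directory. The function names are hypothetical.

```python
import subprocess

# Model directory -> ONNX output file, as used in the commands above.
MODELS = {
    "ch_PP-OCRv4_det_infer": "ppocr_det.onnx",
    "ch_PP-OCRv4_rec_infer": "ppocr_rec.onnx",
}


def build_command(model_dir, save_file):
    """Build the paddle2onnx argument list for one model."""
    return [
        "paddle2onnx",
        "--model_dir", f"./{model_dir}",
        "--model_filename", "inference.pdmodel",
        "--params_filename", "inference.pdiparams",
        "--save_file", save_file,
    ]


def convert_all(run=subprocess.run):
    """Run the conversion for every model; raises if a command fails."""
    for model_dir, save_file in MODELS.items():
        run(build_command(model_dir, save_file), check=True)
```

Calling convert_all() runs both conversions in turn. The run parameter exists only so the command lists can be inspected without executing anything.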

Copy ppocr_det.onnx and ppocr_rec.onnx into vim4_npu_sdk/adla-toolkit-binary-3.1.7.4/python and modify ksnn_args.txt as follows.

ppocr_det

--model-name ppocr_det 
--model-type onnx 
--model ./ppocr_det.onnx 
--inputs "x" 
--input-shapes "3,736,736" 
--dtypes "float32" 
--quantize-dtype int8 
--outdir onnx_output 
--channel-mean-value "123.675,116.28,103.53,57.375" 
--source-file ocr_det_dataset.txt 
--iterations 1 
--batch-size 1 
--kboard VIM4 
--inference-input-type "float32" 
--inference-output-type "float32"

ppocr_rec

--model-name ppocr_rec 
--model-type onnx 
--model ./ppocr_rec.onnx 
--inputs "x" 
--input-shapes "3,48,320" 
--dtypes "float32" 
--quantize-dtype int16 
--outdir onnx_output 
--channel-mean-value "127.5,127.5,127.5,128" 
--source-file ocr_rec_dataset.txt 
--iterations 1 
--batch-size 1 
--kboard VIM4 
--inference-input-type "float32" 
--inference-output-type "float32"

$ ./convert-in-docker.sh ksnn
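The --channel-mean-value argument carries four numbers. The sketch below assumes that the first three are per-channel means and the fourth is a divisor, so each input value becomes (value - mean) / scale. That interpretation of the toolkit argument is an assumption, though it is consistent with the detection values being the usual ImageNet statistics scaled to the 0-255 range (0.485 × 255 = 123.675, 0.225 × 255 = 57.375). The function names are hypothetical.

```python
def parse_channel_mean_value(text):
    """Split 'm0,m1,m2,scale' into ((m0, m1, m2), scale)."""
    values = [float(v) for v in text.split(",")]
    if len(values) != 4:
        raise ValueError("expected three channel means and one scale value")
    return tuple(values[:3]), values[3]


def normalize_pixel(pixel, channel_mean_value):
    """Apply (value - mean) / scale to one three-channel pixel (assumed formula)."""
    means, scale = parse_channel_mean_value(channel_mean_value)
    return tuple((value - mean) / scale for value, mean in zip(pixel, means))


DET_MEAN = "123.675,116.28,103.53,57.375"  # from the ppocr_det arguments
REC_MEAN = "127.5,127.5,127.5,128"         # from the ppocr_rec arguments

print(normalize_pixel((255, 255, 255), DET_MEAN))
print(normalize_pixel((255, 255, 255), REC_MEAN))
```

Under this reading, the recognition model sees inputs in roughly the range -1 to 1, while the detection model sees inputs standardized per channel.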

Run inference on the NPU by KSNN

Install KSNN

Download the KSNN library and demo code from khadas/ksnn-vim4.

$ git clone https://github.com/khadas/ksnn-vim4

Only KSNN demo tag ddk-3.4.7.7 or higher supports PPOCR. Only firmware newer than 241129 supports this PPOCR demo.

If you use Ubuntu 24.04, the demo must run in a Python virtual environment.

$ sudo apt update
$ sudo apt install python3-venv
$ python3 -m venv myenv
$ source myenv/bin/activate

$ cd ksnn-vim4/ksnn
$ sudo apt update
$ sudo apt install python3-pip
$ pip3 install ksnn_vim4-1.4.1-py3-none-any.whl
$ pip3 install shapely pyclipper Pillow
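After installation, a quick check can confirm that the interpreter is running inside the virtual environment and that the packages are importable. This is a minimal sketch. "PIL" is the import name of Pillow, and "ksnn" as the import name of the KSNN wheel is an assumption.

```python
import importlib.util
import sys

# Import names for the packages installed above ("ksnn" is assumed).
REQUIRED_MODULES = ["ksnn", "shapely", "pyclipper", "PIL"]


def in_virtualenv():
    """True when the interpreter runs inside a venv created by 'python3 -m venv'."""
    return sys.prefix != sys.base_prefix


def missing_modules(names=REQUIRED_MODULES):
    """Return the module names that cannot be found by the import system."""
    return [name for name in names if importlib.util.find_spec(name) is None]


print("virtual environment active:", in_virtualenv())
print("missing modules:", missing_modules())
```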

Picture input demo

The demo uses the Chinese (cn) rec model. If you want to use another rec model, you need to modify the rec model output size and the character index dictionary.

ppocr-picture.py

# model input and output
det_input_size = (736, 736) # (model height, model width)
rec_input_size = ( 48, 320) # (model height, model width)
rec_output_size = (40, 97) # rec output
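The sizes above are given as (height, width). In the PaddleOCR reference preprocessing, a cropped text line is resized to the model height with its aspect ratio preserved, capped at the model width, and the remaining columns are padded. The sketch below computes only those sizes. Whether the KSNN demo follows exactly this scheme is an assumption, and the function name is hypothetical.

```python
rec_input_size = (48, 320)  # (model height, model width), as in ppocr-picture.py


def rec_resize_width(crop_height, crop_width, input_size=rec_input_size):
    """Return (resized_width, right_padding) for a text crop.

    The crop is scaled to the model height while keeping its aspect ratio;
    the width is capped at the model width and the rest is padding.
    """
    model_h, model_w = input_size
    if crop_height <= 0 or crop_width <= 0:
        raise ValueError("crop dimensions must be positive")
    # Ceiling division using integers only, to avoid float rounding issues.
    scaled_w = -(-model_h * crop_width // crop_height)
    resized_w = max(1, min(model_w, scaled_w))
    return resized_w, model_w - resized_w


print(rec_resize_width(24, 100))
print(rec_resize_width(48, 1000))
```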

postprocess.py

character_str = ["blank"]
with open("./data/ppocr_keys_v1.txt", "rb") as fin: # the path for character index dictionary
    lines = fin.readlines()
    for line in lines:
        line = line.decode("utf-8").strip("\n").strip("\r\n")
        character_str.append(line)
character_str.append(" ")
ignored_token = [0]
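When switching to another rec model, the length of the character list has to agree with the rec output. The sketch below rebuilds the list the same way as the snippet above (a leading "blank" entry, the dictionary lines, and a trailing space) and compares its length with the second value of rec_output_size. Treating that second value as the number of classes per time step is an assumption, and the function names are hypothetical.

```python
def load_character_list(path):
    """Build the character list: 'blank' + dictionary lines + ' '."""
    characters = ["blank"]
    with open(path, "rb") as fin:
        for line in fin:
            characters.append(line.decode("utf-8").strip("\r\n"))
    characters.append(" ")
    return characters


def matches_rec_output(path, rec_output_size):
    """Check that the class count (assumed to be rec_output_size[1]) fits the dictionary."""
    return len(load_character_list(path)) == rec_output_size[1]
```

For example, matches_rec_output("./data/ppocr_keys_v1.txt", rec_output_size) returns False when the dictionary and the configured output size do not belong to the same model.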

You can find the dictionary txt file in PaddleOCR/ppocr/utils.
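A character list with a blank entry at index 0 is typically used for CTC greedy decoding: take the best class at each time step, collapse consecutive repeats, and drop the ignored tokens. The following is a sketch of that standard technique in plain Python, not the demo's own postprocess code. The function name and the list-of-lists score layout are assumptions.

```python
def ctc_greedy_decode(scores, character_str, ignored_tokens=(0,)):
    """Decode per-time-step class scores into text with CTC greedy decoding.

    scores: sequence of time steps, each a sequence of class scores.
    character_str: list mapping class index to character (index 0 is blank).
    """
    # Best class index at each time step.
    indices = [max(range(len(step)), key=step.__getitem__) for step in scores]
    text = []
    previous = None
    for index in indices:
        # Keep a class only when it differs from the previous step
        # and is not an ignored token such as blank.
        if index != previous and index not in ignored_tokens:
            text.append(character_str[index])
        previous = index
    return "".join(text)
```

A blank between two identical characters is what allows doubled letters to survive the collapsing step.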

$ cd ksnn-vim4/example/ppocr
$ export QT_QPA_PLATFORM=xcb
$ python3 ppocr-picture.py --det_model ./models/VIM4/ppocr_det_int8.adla --det_library ./libs/libnn_ppocr_det.so --rec_model ./models/VIM4/ppocr_rec_int16.adla --rec_library ./libs/libnn_ppocr_rec.so --picture ./data/test.png

Camera input demo

$ cd ksnn-vim4/example/ppocr
$ export QT_QPA_PLATFORM=xcb
 
# USB Camera
$ python3 ppocr-cap.py --det_model ./models/VIM4/ppocr_det_int8.adla --det_library ./libs/libnn_ppocr_det.so --rec_model ./models/VIM4/ppocr_rec_int16.adla --rec_library ./libs/libnn_ppocr_rec.so --type usb --device 0
 
# MIPI Camera
$ python3 ppocr-cap.py --det_model ./models/VIM4/ppocr_det_int8.adla --det_library ./libs/libnn_ppocr_det.so --rec_model ./models/VIM4/ppocr_rec_int16.adla --rec_library ./libs/libnn_ppocr_rec.so --type mipi --device 63

The value after --device is the camera device index (0 for the USB camera and 63 for the MIPI camera in the commands above).
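The command-line flags used above can be summarized as an argparse definition. This is an illustrative sketch that mirrors only the flags shown in the commands. It is not the actual parser in ppocr-cap.py, and the defaults and allowed choices are assumptions.

```python
import argparse


def build_parser():
    """Argument parser mirroring the flags of the camera demo commands."""
    parser = argparse.ArgumentParser(description="PPOCR camera demo arguments (sketch)")
    parser.add_argument("--det_model", required=True, help="path to the detection .adla model")
    parser.add_argument("--det_library", required=True, help="path to the detection .so library")
    parser.add_argument("--rec_model", required=True, help="path to the recognition .adla model")
    parser.add_argument("--rec_library", required=True, help="path to the recognition .so library")
    # The choices and defaults below are assumptions based on the two commands above.
    parser.add_argument("--type", choices=("usb", "mipi"), default="usb", help="camera type")
    parser.add_argument("--device", type=int, default=0, help="camera device index")
    return parser


args = build_parser().parse_args([
    "--det_model", "./models/VIM4/ppocr_det_int8.adla",
    "--det_library", "./libs/libnn_ppocr_det.so",
    "--rec_model", "./models/VIM4/ppocr_rec_int16.adla",
    "--rec_library", "./libs/libnn_ppocr_rec.so",
    "--type", "mipi",
    "--device", "63",
])
print(args.type, args.device)
```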

If you want to use a MIPI camera, remove the OpenCV package installed by pip and then install OpenCV with apt.

$ pip3 uninstall opencv-python
$ sudo apt install python3-opencv

If you use a virtual environment, you need to download the OpenCV source code and compile it with GStreamer support manually. (If you do not know how to compile it, you can ask for help on the Khadas Forum.)
