Doc for version ddk-3.4.7.7

DenseNet CTC ONNX Keras VIM4 Demo - 3

Introduction

DenseNet-CTC is a text recognition model. It can only recognize a single line of text, so it is usually used together with a text detection model.

Recognition image and inference results on VIM4.

Get the source code

We will use a DenseNet model based on YCG09/chinese_ocr.

git clone https://github.com/YCG09/chinese_ocr

Convert the model

Build a virtual environment

Follow the official Docker documentation to install Docker: Install Docker Engine on Ubuntu.

Run the command below to pull the Docker image:

docker pull numbqq/npu-vim4

Get the conversion tool

Download the conversion tool from khadas/vim4_npu_sdk.

$ git clone https://github.com/khadas/vim4_npu_sdk
$ cd vim4_npu_sdk
$ git lfs pull
$ ls
adla-toolkit-binary  adla-toolkit-binary-3.1.7.4  convert-in-docker.sh  Dockerfile  docs  README.md

adla-toolkit-binary/docs - SDK documentation
adla-toolkit-binary/bin - SDK tools required for model conversion
adla-toolkit-binary/demo - Conversion examples

If your kernel is older than 241129, use the branch npu-ddk-1.7.5.5.

Convert

After training the model, run the script below to fix the network's input and output shapes and convert the model to ONNX.

A Keras model (.h5) can be converted into a VIM4 model directly. If you want to convert a Keras model, use model.save so that the saved file contains both the weights and the network structure.

export.py

import onnx
from keras.models import *
import keras
import keras2onnx
from train import get_model
import densenet

# Build the network and load the trained weights.
basemodel, model = get_model(32, 88) # input height, classes number
basemodel.load_weights("models/weights_densenet-32-0.40.h5")

# Convert the inference model (basemodel) to ONNX with opset 12.
onnx_model = keras2onnx.convert_keras(basemodel, basemodel.name, target_opset=12)

# Fix the input shape to 1 x 1 x 32 x 280 (height 32, width 280).
onnx_model.graph.input[0].type.tensor_type.shape.dim[0].dim_value = int(1)
onnx_model.graph.input[0].type.tensor_type.shape.dim[1].dim_value = int(1)
onnx_model.graph.input[0].type.tensor_type.shape.dim[2].dim_value = int(32)
onnx_model.graph.input[0].type.tensor_type.shape.dim[3].dim_value = int(280)

# Fix the output batch size to 1.
onnx_model.graph.output[0].type.tensor_type.shape.dim[0].dim_value = int(1)

# Drop the first graph node and feed "the_input" directly into the node that followed it.
onnx_model.graph.node.remove(onnx_model.graph.node[0])
onnx_model.graph.node[0].input[0] = "the_input"

onnx.save_model(onnx_model, "./densenet_ctc.onnx")
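
Before handing the exported file to the conversion tool, it can help to confirm that the shape edits took effect. The following is only a sketch: the helper names dim_values and check_export are hypothetical, and it assumes the onnx package is installed and that densenet_ctc.onnx sits in the current directory. It reads the first graph input and compares its dimensions against 1 x 1 x 32 x 280.

```python
import os


def dim_values(value_info):
    """Return the static dimensions of an ONNX graph input or output."""
    return [d.dim_value for d in value_info.type.tensor_type.shape.dim]


def check_export(model, expected_input=(1, 1, 32, 280)):
    """Raise ValueError if the first graph input does not have the expected shape."""
    actual = dim_values(model.graph.input[0])
    if tuple(actual) != tuple(expected_input):
        raise ValueError(f"unexpected input shape {actual}, expected {list(expected_input)}")
    return actual


MODEL_PATH = "densenet_ctc.onnx"  # file written by export.py

if __name__ == "__main__" and os.path.exists(MODEL_PATH):
    import onnx  # imported here so the helpers above stay dependency-free

    model = onnx.load(MODEL_PATH)
    onnx.checker.check_model(model)  # structural validation of the graph
    print("input :", model.graph.input[0].name, check_export(model))
    print("output:", model.graph.output[0].name, dim_values(model.graph.output[0]))
```

If the check raises, the input shape in the file is not the one convert_adla.sh expects, and the export step should be rerun.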

Enter vim4_npu_sdk/demo and modify convert_adla.sh as follows.

convert_adla.sh

#!/bin/bash
 
ACUITY_PATH=../bin/
#ACUITY_PATH=../python/tvm/
adla_convert=${ACUITY_PATH}adla_convert
 
 
if [ ! -e "$adla_convert" ]; then
    adla_convert=${ACUITY_PATH}adla_convert.py
fi
 
$adla_convert --model-type onnx \
        --model ./model_source/densenet_ctc/densenet_ctc.onnx \
        --inputs "the_input" \
        --input-shapes  "1,32,280"  \
        --dtypes "float32" \
        --inference-input-type float32 \
        --inference-output-type float32 \
        --quantize-dtype int8 --outdir onnx_output  \
        --channel-mean-value "0,0,0,255"  \
        --source-file ./densenet_ctc_dataset.txt  \
        --iterations 500 \
        --disable-per-channel False \
        --batch-size 1 --target-platform PRODUCT_PID0XA003
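
The --source-file option points at densenet_ctc_dataset.txt, which lists the images used for quantization calibration. The sketch below shows one way to generate such a list. It assumes the file holds one image path per line, which should be checked against the SDK documentation; the function name write_dataset_list and the ./data/calibration folder are hypothetical.

```python
from pathlib import Path

# File extensions treated as calibration images (an assumption, adjust as needed).
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}


def write_dataset_list(image_dir, list_path):
    """Write one image path per line to list_path and return the number of entries."""
    image_dir = Path(image_dir)
    paths = sorted(
        p for p in image_dir.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )
    Path(list_path).write_text("".join(f"{p.as_posix()}\n" for p in paths))
    return len(paths)


if __name__ == "__main__":
    calib_dir = Path("./data/calibration")  # hypothetical folder of sample text-line images
    if calib_dir.is_dir():
        count = write_dataset_list(calib_dir, "./densenet_ctc_dataset.txt")
        print(f"wrote {count} entries")
```

Calibration images should resemble the text lines the model will see at inference time, since they drive the int8 quantization ranges.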

Run convert_adla.sh to generate the VIM4 model. The converted model is saved as xxx.adla in the onnx_output directory.

$ bash convert_adla.sh

Run inference on the NPU

Get source code

Clone the source code from khadas/vim4_npu_applications.

$ git clone https://github.com/khadas/vim4_npu_applications

If your kernel is older than 241129, use a version before the tag ddk-3.4.7.7.

Install dependencies

$ sudo apt update
$ sudo apt install libopencv-dev python3-opencv cmake

Compile and run

Picture input demo

Put densenet_ctc_int8.adla in vim4_npu_applications/densenet_ctc/data/.

# Compile
$ cd vim4_npu_applications/densenet_ctc
$ mkdir build
$ cd build
$ cmake ..
$ make
 
# Run
$ ./densenet_ctc -m ../data/densenet_ctc_int8.adla -p ../data/KhadasTeam.png

If your DenseNet-CTC model uses a different set of classes, update data/class_str.txt and OBJ_CLASS_NUM in include/postprocess.h to match.
