Large Model - DeepSeek-R1-Distill-Qwen-1.5B/7B

Convert Model

Convert the model on a Linux PC. Converting DeepSeek-R1-Distill-Qwen-1.5B requires at least 13 GB of GPU or CPU memory; converting DeepSeek-R1-Distill-Qwen-7B requires at least 32 GB.
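
Before starting, it may help to compare the host's memory against these minimums. The following is a minimal sketch, not part of the toolkit: the helper names are hypothetical, it inspects CPU RAM only (not GPU memory), and it assumes "G" above means 10^9 bytes.

```python
import os

# Minimum memory in GB, taken from the requirements stated above.
REQUIRED_GB = {
    "DeepSeek-R1-Distill-Qwen-1.5B": 13,
    "DeepSeek-R1-Distill-Qwen-7B": 32,
}


def total_ram_gb():
    """Return total physical RAM in GB (Linux, via sysconf)."""
    return os.sysconf("SC_PHYS_PAGES") * os.sysconf("SC_PAGE_SIZE") / 1e9


def has_enough_memory(model_name, available_gb):
    """True if available_gb meets the stated minimum for model_name."""
    return available_gb >= REQUIRED_GB[model_name]


if __name__ == "__main__":
    ram = total_ram_gb()
    for name, need in REQUIRED_GB.items():
        verdict = "ok" if has_enough_memory(name, ram) else "insufficient"
        print(f"{name}: need {need} GB, have {ram:.1f} GB -> {verdict}")
```

Total RAM is an upper bound; memory already used by other processes is not subtracted.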

Build virtual environment

Install conda by following its installation documentation.

Then create a virtual environment.

$ conda create -n RKLLM-Toolkit python=3.8
$ conda activate RKLLM-Toolkit     #activate
$ conda deactivate                 #deactivate

Download the toolkit from airockchip/rknn-llm.

$ git clone https://github.com/airockchip/rknn-llm.git

Install dependencies

$ cd rknn-llm/rkllm-toolkit
$ pip3 install rkllm_toolkit-1.1.4-cp38-cp38-linux_x86_64.whl

Check that the installation succeeded. The import should finish without errors.

$ python
>>> from rkllm.api import RKLLM
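
The same check can be scripted so that it reports a clear message instead of a traceback. This is a small sketch using only the standard library; `is_importable` is a hypothetical helper, and it only confirms that the `rkllm` package can be located in the active environment.

```python
import importlib.util


def is_importable(package):
    """Return True if the top-level package can be found on sys.path."""
    return importlib.util.find_spec(package) is not None


if __name__ == "__main__":
    if is_importable("rkllm"):
        from rkllm.api import RKLLM  # same import as the manual check above
        print("rkllm-toolkit is installed")
    else:
        print("rkllm not found; is the RKLLM-Toolkit environment activated?")
```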

Convert

Download the DeepSeek-R1-Distill-Qwen model into rknn-llm/examples/DeepSeek-R1-Distill-Qwen-1.5B_Demo/export.

$ cd rknn-llm/examples/DeepSeek-R1-Distill-Qwen-1.5B_Demo/export
$ git lfs install
$ git clone https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B # Download 1.5B model
$ git clone https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-7B   # Download 7B model
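
If Git LFS was not active during the clone, the weight files are small text stubs rather than real weights, and conversion will fail later. The sketch below looks for such stubs. It assumes the weights are stored as `*.safetensors` files in the model directory; the helper names are hypothetical.

```python
from pathlib import Path

# Every Git LFS pointer file begins with this line.
LFS_HEADER = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path):
    """True if the file is a Git LFS pointer stub instead of real content."""
    with open(path, "rb") as f:
        return f.read(len(LFS_HEADER)) == LFS_HEADER


def unfetched_weights(model_dir):
    """List weight files in model_dir that are still LFS pointer stubs."""
    return sorted(
        p.name for p in Path(model_dir).glob("*.safetensors") if is_lfs_pointer(p)
    )


if __name__ == "__main__":
    stubs = unfetched_weights("./DeepSeek-R1-Distill-Qwen-1.5B")
    print("unfetched weight files:", stubs if stubs else "none")
```

If any file is listed, running `git lfs pull` inside the model directory should fetch the real content.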

Modify export_rkllm.py as follows.

diff --git a/examples/DeepSeek-R1-Distill-Qwen-1.5B_Demo/export/export_rkllm.py b/examples/DeepSeek-R1-Distill-Qwen-1.5B_Demo/export/export_rkllm.py
index 2396f66..d1fdf01 100755
--- a/examples/DeepSeek-R1-Distill-Qwen-1.5B_Demo/export/export_rkllm.py
+++ b/examples/DeepSeek-R1-Distill-Qwen-1.5B_Demo/export/export_rkllm.py
@@ -8,7 +8,8 @@ https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
 Download the DeepSeek R1 model from the above url.
 '''
 
-modelpath = '/path/to/DeepSeek-R1-Distill-Qwen-1.5B'
+modelpath = './DeepSeek-R1-Distill-Qwen-1.5B'
+# modelpath = './DeepSeek-R1-Distill-Qwen-7B'
 llm = RKLLM()
 
 # Load model

Run export_rkllm.py to generate the rkllm model.

$ python export_rkllm.py

The model file DeepSeek-R1-Distill-Qwen-1.5B_W8A8_RK3588.rkllm is generated in rknn-llm/examples/DeepSeek-R1-Distill-Qwen-1.5B_Demo/export.
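
The output name appears to follow the pattern `<model directory>_<quantization>_<platform>.rkllm`. The sketch below predicts the file name for a given `modelpath` so that its presence can be checked after export. The pattern is inferred from the file name above rather than confirmed against export_rkllm.py, and `expected_rkllm_name` is a hypothetical helper.

```python
import os


def expected_rkllm_name(modelpath, quantized_dtype="W8A8", target_platform="RK3588"):
    """Predict the exported file name (naming pattern is an assumption)."""
    base = os.path.basename(os.path.normpath(modelpath))
    return f"{base}_{quantized_dtype}_{target_platform}.rkllm"


if __name__ == "__main__":
    name = expected_rkllm_name("./DeepSeek-R1-Distill-Qwen-1.5B")
    status = "found" if os.path.isfile(name) else "not found"
    print(f"{name}: {status}")
```

Under the same assumption, switching `modelpath` to the 7B directory would produce a file whose name starts with DeepSeek-R1-Distill-Qwen-7B.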

Run NPU

Get source code

Running DeepSeek-R1-Distill-Qwen-1.5B requires an Edge2 with 8 GB of DDR; running DeepSeek-R1-Distill-Qwen-7B requires 16 GB.

Clone airockchip/rknn-llm onto the Edge2.

$ git clone https://github.com/airockchip/rknn-llm.git

Install dependencies

$ sudo apt update
$ sudo apt install cmake

Compile

Modify rknn-llm/rkllm-runtime/Linux/librkllm_api/include/rkllm.h as follows.

diff --git a/rkllm-runtime/Linux/librkllm_api/include/rkllm.h b/rkllm-runtime/Linux/librkllm_api/include/rkllm.h
index e565e6c..6d3623c 100644
--- a/rkllm-runtime/Linux/librkllm_api/include/rkllm.h
+++ b/rkllm-runtime/Linux/librkllm_api/include/rkllm.h
@@ -1,3 +1,4 @@
+#include <cstdint>
 #ifndef _RKLLM_H_
 #define _RKLLM_H_

Copy DeepSeek-R1-Distill-Qwen-1.5B_W8A8_RK3588.rkllm into rknn-llm/examples/DeepSeek-R1-Distill-Qwen-1.5B_Demo/deploy.

Modify rknn-llm/examples/DeepSeek-R1-Distill-Qwen-1.5B_Demo/deploy/build-linux.sh as follows.

diff --git a/examples/DeepSeek-R1-Distill-Qwen-1.5B_Demo/deploy/build-linux.sh b/examples/DeepSeek-R1-Distill-Qwen-1.5B_Demo/deploy/build-linux.sh
index 4e74656..1f72ad9 100644
--- a/examples/DeepSeek-R1-Distill-Qwen-1.5B_Demo/deploy/build-linux.sh
+++ b/examples/DeepSeek-R1-Distill-Qwen-1.5B_Demo/deploy/build-linux.sh
@@ -4,7 +4,7 @@ if [[ -z ${BUILD_TYPE} ]];then
     BUILD_TYPE=Release
 fi
 
-GCC_COMPILER_PATH=~/opts/gcc-arm-10.2-2020.11-x86_64-aarch64-none-linux-gnu/bin/aarch64-none-linux-gnu
+GCC_COMPILER_PATH=aarch64-linux-gnu
 C_COMPILER=${GCC_COMPILER_PATH}-gcc
 CXX_COMPILER=${GCC_COMPILER_PATH}-g++
 STRIP_COMPILER=${GCC_COMPILER_PATH}-strip
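
With this change the script expects `aarch64-linux-gnu-gcc`, `aarch64-linux-gnu-g++`, and `aarch64-linux-gnu-strip` to be on `PATH`. A quick way to confirm that before building is sketched below; `missing_tools` is a hypothetical helper and is not part of the repository.

```python
import shutil


def missing_tools(prefix="aarch64-linux-gnu"):
    """Return the toolchain binaries with this prefix that are not on PATH."""
    tools = [f"{prefix}-{suffix}" for suffix in ("gcc", "g++", "strip")]
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("missing:", ", ".join(missing))
    else:
        print("toolchain found")
```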

Run build-linux.sh to compile.

$ bash build-linux.sh

Run

$ cd install/demo_Linux_aarch64
$ export LD_LIBRARY_PATH=./lib
$ export RKLLM_LOG_LEVEL=1 # print inference speed
$ ./llm_demo ../../DeepSeek-R1-Distill-Qwen-1.5B_W8A8_RK3588.rkllm 2048 4096
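
The same launch can be wrapped in a script, which makes it easier to switch between the 1.5B and 7B models. This is a sketch only. The two numeric arguments are passed through unchanged from the command above; naming them `max_new_tokens` and `max_context_len` is an assumption about how the demo interprets them, and `build_launch` is a hypothetical helper.

```python
import os
import subprocess


def build_launch(model_path, max_new_tokens=2048, max_context_len=4096):
    """Return (argv, env) mirroring the shell commands above.

    The parameter names are assumed; the values are simply forwarded
    as the second and third command-line arguments.
    """
    env = dict(os.environ, LD_LIBRARY_PATH="./lib", RKLLM_LOG_LEVEL="1")
    argv = ["./llm_demo", model_path, str(max_new_tokens), str(max_context_len)]
    return argv, env


if __name__ == "__main__":
    # Run from install/demo_Linux_aarch64, as in the shell commands.
    argv, env = build_launch("../../DeepSeek-R1-Distill-Qwen-1.5B_W8A8_RK3588.rkllm")
    subprocess.run(argv, env=env, check=True)
```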

DeepSeek-R1-Distill-Qwen-1.5B

DeepSeek-R1-Distill-Qwen-7B
