{{indexmenu_n>4}}

~~tag> DeepSeek NPU Edge2 RK3588~~

====== Large Model - DeepSeek-R1-Distill-Qwen-1.5B/7B ======

===== Convert Model =====

Model conversion must be done on a Linux PC. Converting **DeepSeek-R1-Distill-Qwen-1.5B** requires at least 13 GB of GPU or CPU memory; converting **DeepSeek-R1-Distill-Qwen-7B** requires at least 32 GB.

==== Build virtual environment ====

Follow this guide to install [[https://conda.io/projects/conda/en/stable/user-guide/install/linux.html | conda]], then create a virtual environment.

```shell
$ conda create -n RKLLM-Toolkit python=3.8
$ conda activate RKLLM-Toolkit   # activate
$ conda deactivate               # deactivate
```

Download the toolkit from [[gh>airockchip/rknn-llm]].

```shell
$ git clone https://github.com/airockchip/rknn-llm.git
```

==== Install dependencies ====

```shell
$ cd rknn-llm/rkllm-toolkit
$ pip3 install rkllm_toolkit-1.1.4-cp38-cp38-linux_x86_64.whl
```

Check whether the installation succeeded.

```shell
$ python
>>> from rkllm.api import RKLLM
```

==== Convert ====

Download the DeepSeek-R1-Distill-Qwen model into ''rknn-llm/examples/DeepSeek-R1-Distill-Qwen-1.5B_Demo/export''.

```shell
$ cd rknn-llm/examples/DeepSeek-R1-Distill-Qwen-1.5B_Demo/export
$ git lfs install
$ git clone https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B   # Download 1.5B model
$ git clone https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-7B     # Download 7B model
```

Modify ''export_rkllm.py'' as follows.

```diff
diff --git a/examples/DeepSeek-R1-Distill-Qwen-1.5B_Demo/export/export_rkllm.py b/examples/DeepSeek-R1-Distill-Qwen-1.5B_Demo/export/export_rkllm.py
index 2396f66..d1fdf01 100755
--- a/examples/DeepSeek-R1-Distill-Qwen-1.5B_Demo/export/export_rkllm.py
+++ b/examples/DeepSeek-R1-Distill-Qwen-1.5B_Demo/export/export_rkllm.py
@@ -8,7 +8,8 @@
 https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
 Download the DeepSeek R1 model from the above url.
 '''
-modelpath = '/path/to/DeepSeek-R1-Distill-Qwen-1.5B'
+modelpath = './DeepSeek-R1-Distill-Qwen-1.5B'
+# modelpath = './DeepSeek-R1-Distill-Qwen-7B'
 llm = RKLLM()

 # Load model
```

Run ''export_rkllm.py'' to generate the rkllm model.

```shell
$ python export_rkllm.py
```

The model ''DeepSeek-R1-Distill-Qwen-1.5B_W8A8_RK3588.rkllm'' will be generated in ''rknn-llm/examples/DeepSeek-R1-Distill-Qwen-1.5B_Demo/export''.

===== Run NPU =====

==== Get source code ====

Running **DeepSeek-R1-Distill-Qwen-1.5B** requires an Edge2 with 8 GB of DDR; running **DeepSeek-R1-Distill-Qwen-7B** requires 16 GB.

Download [[gh>airockchip/rknn-llm]] onto Edge2.

```shell
$ git clone https://github.com/airockchip/rknn-llm.git
```

==== Install dependencies ====

```shell
$ sudo apt update
$ sudo apt install cmake
```

==== Compile ====

Modify ''rknn-llm/rkllm-runtime/Linux/librkllm_api/include/rkllm.h'' as follows.

```diff
diff --git a/rkllm-runtime/Linux/librkllm_api/include/rkllm.h b/rkllm-runtime/Linux/librkllm_api/include/rkllm.h
index e565e6c..6d3623c 100644
--- a/rkllm-runtime/Linux/librkllm_api/include/rkllm.h
+++ b/rkllm-runtime/Linux/librkllm_api/include/rkllm.h
@@ -1,3 +1,4 @@
+#include
 #ifndef _RKLLM_H_
 #define _RKLLM_H_
```

Copy the generated ''DeepSeek-R1-Distill-Qwen-1.5B_W8A8_RK3588.rkllm'' into ''rknn-llm/examples/DeepSeek-R1-Distill-Qwen-1.5B_Demo/deploy''.

Modify ''rknn-llm/examples/DeepSeek-R1-Distill-Qwen-1.5B_Demo/deploy/build-linux.sh'' as follows.
```diff
diff --git a/examples/DeepSeek-R1-Distill-Qwen-1.5B_Demo/deploy/build-linux.sh b/examples/DeepSeek-R1-Distill-Qwen-1.5B_Demo/deploy/build-linux.sh
index 4e74656..1f72ad9 100644
--- a/examples/DeepSeek-R1-Distill-Qwen-1.5B_Demo/deploy/build-linux.sh
+++ b/examples/DeepSeek-R1-Distill-Qwen-1.5B_Demo/deploy/build-linux.sh
@@ -4,7 +4,7 @@
 if [[ -z ${BUILD_TYPE} ]];then
     BUILD_TYPE=Release
 fi
-GCC_COMPILER_PATH=~/opts/gcc-arm-10.2-2020.11-x86_64-aarch64-none-linux-gnu/bin/aarch64-none-linux-gnu
+GCC_COMPILER_PATH=aarch64-linux-gnu
 C_COMPILER=${GCC_COMPILER_PATH}-gcc
 CXX_COMPILER=${GCC_COMPILER_PATH}-g++
 STRIP_COMPILER=${GCC_COMPILER_PATH}-strip
```

Run ''build-linux.sh'' to compile.

```shell
$ bash build-linux.sh
```

==== Run ====

```shell
$ cd install/demo_Linux_aarch64
$ export LD_LIBRARY_PATH=./lib
$ export RKLLM_LOG_LEVEL=1   # print inference speed
$ ./llm_demo ../../DeepSeek-R1-Distill-Qwen-1.5B_W8A8_RK3588.rkllm 2048 4096
```

**DeepSeek-R1-Distill-Qwen-1.5B**

{{:products:sbc:edge2:npu:deepseek-1-5b-1.webp?800|}}

{{:products:sbc:edge2:npu:deepseek-1-5b-2.webp?800|}}

**DeepSeek-R1-Distill-Qwen-7B**

{{:products:sbc:edge2:npu:deepseek-7b-1.webp?800|}}

{{:products:sbc:edge2:npu:deepseek-7b-2.webp?800|}}
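As a rough sanity check on the DDR requirements above, W8A8 quantization stores about one byte per weight, so the weight footprint alone is close to the nominal parameter count in bytes. The sketch below is back-of-envelope only: the ''1.5e9'' and ''7e9'' parameter counts are nominal placeholders, and real memory use adds the KV cache (which grows with the 2048/4096 context settings passed to ''llm_demo''), activations, and runtime overhead.

```python
# Back-of-envelope weight-memory estimate for a quantized LLM.
# Nominal parameter counts are assumptions, not exact model sizes.

def weight_size_gib(n_params: float, bits_per_weight: int = 8) -> float:
    """Approximate in-memory size of model weights in GiB."""
    return n_params * bits_per_weight / 8 / (1024 ** 3)

for name, params in [("DeepSeek-R1-Distill-Qwen-1.5B", 1.5e9),
                     ("DeepSeek-R1-Distill-Qwen-7B", 7.0e9)]:
    print(f"{name}: ~{weight_size_gib(params):.1f} GiB of weights at W8A8")
```

At W8A8 the 7B weights alone take roughly 6.5 GiB, which together with KV cache and runtime overhead explains why the 7B model needs a 16 GB board while the 1.5B model fits on 8 GB.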