Add the Deadsnakes PPA for Python 3.11

orangepi@orangepi5-plus:~$ sudo apt install -y software-properties-common
orangepi@orangepi5-plus:~$ sudo add-apt-repository ppa:deadsnakes/ppa -y
orangepi@orangepi5-plus:~$ sudo apt update

Before installing, confirm that apt will pull the package from the PPA

orangepi@orangepi5-plus:~$ sudo apt policy python3.11
python3.11:
  Installed: (none)
  Candidate: 3.11.14-1+noble1
  Version table:
     3.11.14-1+noble1 500
        500 https://ppa.launchpadcontent.net/deadsnakes/ppa/ubuntu noble/main arm64 Packages

Install Python 3.11

orangepi@orangepi5-plus:~$ sudo apt install python3.11 python3.11-venv python3.11-dev
orangepi@orangepi5-plus:~$ sudo apt install python3.11-dbg python3.11-gdbm python3.11-tk

Verify the Python 3.11 installation

orangepi@orangepi5-plus:~$ python3.11 --version
Python 3.11.14
orangepi@orangepi5-plus:~$ python3.11 -c "import ssl, sqlite3, bz2; print('Source build is healthy')"
Source build is healthy

Python 3.11 virtual environment

orangepi@orangepi5-plus:~$ mkdir Llama
orangepi@orangepi5-plus:~$ cd Llama
orangepi@orangepi5-plus:~/Llama$ sudo apt install python3.11-venv
orangepi@orangepi5-plus:~/Llama$ python3.11 -m venv .venv
orangepi@orangepi5-plus:~/Llama$ source .venv/bin/activate
(.venv) orangepi@orangepi5-plus:~/Llama$ python -m pip --version
pip 24.0 from /home/orangepi/Llama/.venv/lib/python3.11/site-packages/pip (python 3.11)
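
Optionally, the pip inside the fresh venv can be upgraded before installing anything else; this is a routine extra step, not part of the original session:

$ python -m pip install --upgrade pip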

Bootstrap Pip with get-pip.py

orangepi@orangepi5-plus:~$ wget https://bootstrap.pypa.io/get-pip.py
orangepi@orangepi5-plus:~$ python3.11 get-pip.py
orangepi@orangepi5-plus:~$ rm get-pip.py
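
To confirm the bootstrap worked, check that pip now reports the 3.11 interpreter (the exact pip version printed will vary):

$ python3.11 -m pip --version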

Check the Linux version

orangepi@orangepi5plus:~$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 22.04.5 LTS
Release:        22.04
Codename:       jammy
orangepi@orangepi5plus:~$ free -h
               total        used        free      shared  buff/cache   available
Mem:           7.7Gi       511Mi       6.7Gi        13Mi       592Mi       7.2Gi
Swap:          3.9Gi          0B       3.9Gi

Check the NPU driver version

orangepi@orangepi5plus:~$ sudo cat /sys/kernel/debug/rknpu/version
RKNPU driver: v0.9.8

Install Python

orangepi@orangepi5plus:~$ sudo apt install python3 python3-pip python3-venv

Create a Python virtual environment

orangepi@orangepi5plus:~$ mkdir Llama
orangepi@orangepi5plus:~$ cd Llama
orangepi@orangepi5plus:~/Llama$ python3 -m venv .venv
orangepi@orangepi5plus:~/Llama$ source .venv/bin/activate
(.venv) orangepi@orangepi5plus:~/Llama$ 

RKNN-Toolkit2

Install cmake

(.venv) orangepi@orangepi5plus:~/Llama$ sudo apt install cmake

Install rknn-toolkit2

(.venv) orangepi@orangepi5plus:~/Llama$ pip install rknn-toolkit2

Verify the rknn-toolkit2 installation

(.venv) orangepi@orangepi5plus:~/Llama$ python -c "import rknn.api.rknn_base as base; print(base.__file__)"
/home/orangepi/Llama/.venv/lib/python3.10/site-packages/rknn/api/rknn_base.cpython-310-aarch64-linux-gnu.so
(.venv) orangepi@orangepi5plus:~/Llama$ pip show rknn-toolkit2
Name: rknn-toolkit2
Version: 2.3.2
Summary: Rockchip Neural Network Toolkit2. (commit: c7d6ffcf)
Home-page: https://github.com/airockchip/rknn-toolkit2
Author: ai@rock-chips.com
Author-email: ai@rock-chips.com
License: 
Location: /home/orangepi/Llama/.venv/lib/python3.10/site-packages
Requires: fast-histogram, numpy, onnx, onnxoptimizer, onnxruntime, opencv-python, protobuf, psutil, ruamel.yaml, scipy, torch, tqdm
Required-by: 
(.venv) orangepi@orangepi5plus:~/Llama$  pip show torch
Name: torch
Version: 2.2.0
Summary: Tensors and Dynamic neural networks in Python with strong GPU acceleration
Home-page: https://pytorch.org/
Author: PyTorch Team
Author-email: packages@pytorch.org
License: BSD-3
Location: /home/orangepi/Llama/.venv/lib/python3.10/site-packages
Requires: filelock, fsspec, jinja2, networkx, sympy, typing-extensions
Required-by: rknn-toolkit2
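
Beyond pip show, a minimal import smoke test using the toolkit's documented entry point confirms the package actually loads inside the venv:

$ python -c "from rknn.api import RKNN; print('rknn-toolkit2 import OK')"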

Install torchvision and torchaudio

(.venv) orangepi@orangepi5plus:~/Llama$ pip install torchvision==0.17.0 torchaudio==2.2.0
(.venv) orangepi@orangepi5plus:~/Llama$ pip show torchvision
Name: torchvision
Version: 0.17.0
Summary: image and video datasets and models for torch deep learning
Home-page: https://github.com/pytorch/vision
Author: PyTorch Core Team
Author-email: soumith@pytorch.org
License: BSD
Location: /home/orangepi/Llama/.venv/lib/python3.10/site-packages
Requires: numpy, pillow, requests, torch
Required-by: 
(.venv) orangepi@orangepi5plus:~/Llama$ pip show torchaudio
Name: torchaudio
Version: 2.2.0
Summary: An audio package for PyTorch
Home-page: https://github.com/pytorch/audio
Author: Soumith Chintala, David Pollack, Sean Naren, Peter Goldsborough, Moto Hira, Caroline Chen, Jeff Hwang, Zhaoheng Ni, Xiaohui Zhang
Author-email: soumith@pytorch.org
License: 
Location: /home/orangepi/Llama/.venv/lib/python3.10/site-packages
Requires: torch
Required-by: 
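
As a sanity check, torch 2.2.0, torchvision 0.17.0 and torchaudio 2.2.0 are the matching version trio; a one-liner like this confirms they import together and prints their versions:

$ python -c "import torch, torchvision, torchaudio; print(torch.__version__, torchvision.__version__, torchaudio.__version__)"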

Check the Linux version

orangepi@orangepi5plus:~$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 22.04.5 LTS
Release:        22.04
Codename:       jammy
orangepi@orangepi5plus:~$ free -h
               total        used        free      shared  buff/cache   available
Mem:           7.7Gi       522Mi       6.4Gi        13Mi       907Mi       7.1Gi
Swap:          3.9Gi          0B       3.9Gi

Check the rknpu2 driver version

orangepi@orangepi5plus:~$ sudo cat /sys/kernel/debug/rknpu/version
RKNPU driver: v0.9.6

Download orangepi-build from GitHub

orangepi@orangepi5plus:~$ cd ~
orangepi@orangepi5plus:~$ git clone https://github.com/orangepi-xunlong/orangepi-build.git -b next

Download the Linux 6.1 kernel source code

orangepi@orangepi5plus:~$ cd orangepi-build
orangepi@orangepi5plus:~/orangepi-build$ mkdir kernel && cd kernel
orangepi@orangepi5plus:~/orangepi-build/kernel$ git clone https://github.com/orangepi-xunlong/linux-orangepi.git -b orange-pi-6.1-rk35xx
orangepi@orangepi5plus:~/orangepi-build/kernel$ mv linux-orangepi/ orange-pi-6.1-rk35xx

Overwrite the in-tree RKNPU driver with v0.9.8 from rknn-llm

orangepi@orangepi5plus:~/orangepi-build/kernel$ cd ~/orangepi-build
orangepi@orangepi5plus:~/orangepi-build$ git clone https://github.com/airockchip/rknn-llm.git
orangepi@orangepi5plus:~/orangepi-build$ tar -xvf rknn-llm/rknpu-driver/rknpu_driver_0.9.8_20241009.tar.bz2 
orangepi@orangepi5plus:~/orangepi-build$ cp -r drivers/ kernel/orange-pi-6.1-rk35xx/

Modify a few files to avoid compile errors

Add code to include/linux/mm.h in the kernel tree

orangepi@orangepi5plus:~/orangepi-build$ vi kernel/orange-pi-6.1-rk35xx/include/linux/mm.h
/* Helpers expected by the newer RKNPU driver; not provided by this 6.1 tree. */
static inline void vm_flags_set(struct vm_area_struct *vma, vm_flags_t flags) {
  vma->vm_flags |= flags;
}

static inline void vm_flags_clear(struct vm_area_struct *vma, vm_flags_t flags) {
  vma->vm_flags &= ~flags;
}

Modify rknpu_devfreq.c

orangepi@orangepi5plus:~/orangepi-build$ vi kernel/orange-pi-6.1-rk35xx/drivers/rknpu/rknpu_devfreq.c

Comment out line 237: set_soc_info = rockchip_opp_set_low_length,

Disable source code synchronization

Because the driver sources in the kernel/orange-pi-6.1-rk35xx directory were overwritten manually, running the build directly makes the script detect a mismatch against the upstream sources, pull the code again, and overwrite our changes. The source synchronization feature therefore has to be disabled in the configuration file.

Run the build script once to let it initialize

orangepi@orangepi5plus:~/orangepi-build$ sudo ./build.sh

Edit config-default.conf

orangepi@orangepi5plus:~/orangepi-build$ sudo vi userpatches/config-default.conf
IGNORE_UPDATES="yes"

Run build.sh to start compiling the Linux kernel

orangepi@orangepi5plus:~/orangepi-build$ sudo ./build.sh
 
dpkg-deb: building package 'linux-headers-current-rockchip-rk3588' in '../linux-headers-current-rockchip-rk3588_1.2.0_arm64.deb'.
dpkg-deb: building package 'linux-dtb-current-rockchip-rk3588' in '../linux-dtb-current-rockchip-rk3588_1.2.0_arm64.deb'.
dpkg-deb: building package 'linux-image-current-rockchip-rk3588' in '../linux-image-current-rockchip-rk3588_1.2.0_arm64.deb'.
dpkg-deb: building package 'linux-image-current-rockchip-rk3588-dbg' in '../linux-image-current-rockchip-rk3588-dbg_1.2.0_arm64.deb'.
dpkg-genchanges: info: binary-only upload (no source code included)
dpkg-buildpackage: info: binary-only upload (no source included)
[ o.k. ] Kernel build done [ @host ]
[ o.k. ] Target directory [ /home/orangepi/orangepi-build/output/debs/ ]
[ o.k. ] File name [ linux-image-current-rockchip-rk3588_1.2.0_arm64.deb ]
[ o.k. ] Runtime [ 35 min ]
[ o.k. ] Repeat Build Options [ sudo ./build.sh  BOARD=orangepi5plus BRANCH=current BUILD_OPT=kernel KERNEL_CONFIGURE=no  ]

Install the deb packages

orangepi@orangepi5plus:~/orangepi-build$ ls output/debs/linux-*
output/debs/linux-dtb-current-rockchip-rk3588_1.2.0_arm64.deb       contains the dtb files used by the kernel
output/debs/linux-headers-current-rockchip-rk3588_1.2.0_arm64.deb   contains the kernel headers
output/debs/linux-image-current-rockchip-rk3588_1.2.0_arm64.deb     contains the kernel image and kernel modules
output/debs/linux-image-current-rockchip-rk3588-dbg_1.2.0_arm64.deb
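
Only the image package is installed in the next step; if the matching dtb and header packages are also wanted, the same dpkg -i pattern should apply:

$ sudo dpkg -i output/debs/linux-dtb-current-rockchip-rk3588_1.2.0_arm64.deb output/debs/linux-headers-current-rockchip-rk3588_1.2.0_arm64.deb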

Install linux-image-current-rockchip-rk3588_1.2.0_arm64.deb

orangepi@orangepi5plus:~/orangepi-build$ sudo apt purge -y linux-image-current-rockchip-rk3588
orangepi@orangepi5plus:~/orangepi-build$ sudo dpkg -i output/debs/linux-image-current-rockchip-rk3588_1.2.0_arm64.deb
Selecting previously unselected package linux-image-current-rockchip-rk3588.
(Reading database ... 168054 files and directories currently installed.)
Preparing to unpack .../linux-image-current-rockchip-rk3588_1.2.0_arm64.deb ...
Unpacking linux-image-current-rockchip-rk3588 (1.2.0) ...
Setting up linux-image-current-rockchip-rk3588 (1.2.0) ...
 * dkms: running auto installation service for kernel 6.1.43-rockchip-rk3588
   ...done.
update-initramfs: Generating /boot/initrd.img-6.1.43-rockchip-rk3588
update-initramfs: Converting to u-boot format
Free space after deleting the package linux-image-current-rockchip-rk3588 in /boot: 936.9M

Reboot

orangepi@orangepi5plus:~/orangepi-build$ sudo reboot

Check the rknpu2 driver version

orangepi@orangepi5plus:~$ sudo cat /sys/kernel/debug/rknpu/version
RKNPU driver: v0.9.8

Check the Linux version

orangepi@orangepi5plus:~$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 22.04.5 LTS
Release:        22.04
Codename:       jammy
orangepi@orangepi5plus:~$ free -h
               total        used        free      shared  buff/cache   available
Mem:           7.5Gi       559Mi       6.4Gi        26Mi       541Mi       6.9Gi
Swap:          3.8Gi          0B       3.8Gi

Check the rknpu2 driver version

orangepi@orangepi5plus:~$ sudo cat /sys/kernel/debug/rknpu/version
RKNPU driver: v0.9.6

Install Linux to the eMMC

Identify the eMMC device

orangepi@orangepi5plus:~$ ls /dev/mmcblk*boot0 | cut -c1-12
/dev/mmcblk0

Wipe the eMMC with the dd command

orangepi@orangepi5plus:~$ sudo dd bs=1M if=/dev/zero of=/dev/mmcblk0 count=1000 status=progress
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB, 1000 MiB) copied, 4.71944 s, 222 MB/s
orangepi@orangepi5plus:~$ sudo sync

Write the Linux image to the eMMC

orangepi@orangepi5plus:~$ sudo dd bs=1M if=Armbian_25.11.1_Orangepi5-plus_noble_vendor_6.1.115_xfce_desktop.img of=/dev/mmcblk0 status=progress
8772386816 bytes (8.8 GB, 8.2 GiB) copied, 378 s, 23.2 MB/s
8384+0 records in
8384+0 records out
8791261184 bytes (8.8 GB, 8.2 GiB) copied, 380.109 s, 23.1 MB/s
orangepi@orangepi5plus:~$ sudo sync

Reboot

orangepi@orangepi5plus:~$ sudo reboot

Fix OS update errors

Check for OS errors

orangepi@orangepi5plus:~$ sudo apt update
 
Fetched 28.2 MB in 6s (5,014 kB/s)  
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
326 packages can be upgraded. Run 'apt list --upgradable' to see them.
W: https://repo.huaweicloud.com/docker-ce/linux/ubuntu/dists/jammy/InRelease: Key is stored in legacy trusted.gpg keyring (/etc/apt/trusted.gpg), see the DEPRECATION section in apt-key(8) for details

Add a keyring reference to the repository configuration file

orangepi@orangepi5plus:~$ sudo vi /etc/apt/sources.list.d/docker.list
# deb [arch=arm64] https://repo.huaweicloud.com/docker-ce/linux/ubuntu jammy stable
deb [arch=arm64 signed-by=/usr/share/keyrings/docker.gpg] https://repo.huaweicloud.com/docker-ce/linux/ubuntu jammy stable

Remove all Docker GPG keys (apply when key errors occur)

orangepi@orangepi5plus:~$ sudo rm -f /usr/share/keyrings/docker.gpg
orangepi@orangepi5plus:~$ sudo rm -f /etc/apt/trusted.gpg.d/docker*.gpg

Re-add the Docker GPG key (apply when key errors occur)

orangepi@orangepi5plus:~$ sudo mkdir -p /usr/share/keyrings
orangepi@orangepi5plus:~$ curl -fsSL https://repo.huaweicloud.com/docker-ce/linux/ubuntu/gpg \
 | sudo gpg --dearmor -o /usr/share/keyrings/docker.gpg
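
To verify the keyring file was written correctly, gpg can list the keys it contains without importing them:

$ gpg --show-keys /usr/share/keyrings/docker.gpg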

Update again

orangepi@orangepi5plus:~$ sudo apt update

Install XRDP

Install XRDP

orangepi@orangepi5-plus:~$ sudo apt install xrdp xorgxrdp

Create .xsession

orangepi@orangepi5-plus:~$ vi ~/.xsession
exec startxfce4
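
xrdp usually starts on installation, but it can be enabled explicitly and checked with standard systemd commands; afterwards the desktop is reachable from any RDP client on port 3389:

$ sudo systemctl enable --now xrdp
$ systemctl status xrdp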

Check the Linux version

orangepi@orangepi5pro:~$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 22.04.5 LTS
Release:        22.04
Codename:       jammy
orangepi@orangepi5pro:~$ free -h
               total        used        free      shared  buff/cache   available
Mem:            15Gi       873Mi       5.4Gi        67Mi       9.3Gi        14Gi
Swap:          7.8Gi          0B       7.8Gi

Check the rknpu2 driver version

orangepi@orangepi5pro:~$ sudo cat /sys/kernel/debug/rknpu/version
RKNPU driver: v0.9.6

Erase the SPI flash (not required)

The SPI flash on the Orange Pi 5 series is 16 MiB; for example, a 16 MiB disk at /dev/mtdblock0 is the SPI flash.
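
One way to confirm which device is the SPI flash is to list the MTD partitions (assuming the MTD driver is loaded); a 16 MiB device shows up with size 01000000 in hex bytes:

$ cat /proc/mtd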

orangepi@orangepi5pro:~$ sudo apt install gdisk
orangepi@orangepi5pro:~$ sudo gdisk /dev/mtdblock0
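
gdisk is interactive and the session is not shown here; if the goal is to wipe the GPT on the SPI flash, the expert menu's zap option would be used, roughly like this (note that this permanently destroys the partition table):

Command (? for help): x
Expert command (? for help): z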

Install Linux to the eMMC

Identify the eMMC device

orangepi@orangepi5pro:~$ ls /dev/mmcblk*boot0 | cut -c1-12
/dev/mmcblk0

Wipe the eMMC with the dd command

orangepi@orangepi5pro:~$ sudo dd bs=1M if=/dev/zero of=/dev/mmcblk0 count=1000 status=progress
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB, 1000 MiB) copied, 4.56976 s, 229 MB/s
orangepi@orangepi5pro:~$ sudo sync

Write the Linux image to the eMMC

orangepi@orangepi5pro:~$ sudo dd bs=1M if=Armbian_community_26.2.0-trunk.130_Orangepi5pro_noble_vendor_6.1.115_gnome_desktop.img of=/dev/mmcblk0 status=progress
8762949632 bytes (8.8 GB, 8.2 GiB) copied, 23 s, 381 MB/s
8372+0 records in
8372+0 records out
8778678272 bytes (8.8 GB, 8.2 GiB) copied, 34.8036 s, 252 MB/s
orangepi@orangepi5pro:~$ sudo sync

Reboot

orangepi@orangepi5pro:~$ sudo reboot

Fix OS update errors

Check for OS errors

orangepi@orangepi5pro:~$ sudo apt update
 
Fetched 28.1 MB in 3s (10.2 MB/s)                            
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
326 packages can be upgraded. Run 'apt list --upgradable' to see them.
W: https://repo.huaweicloud.com/docker-ce/linux/ubuntu/dists/jammy/InRelease: Key is stored in legacy trusted.gpg keyring (/etc/apt/trusted.gpg), see the DEPRECATION section in apt-key(8) for details.

Add a keyring reference to the repository configuration file

orangepi@orangepi5pro:~$ sudo vi /etc/apt/sources.list.d/docker.list
# deb [arch=arm64] https://repo.huaweicloud.com/docker-ce/linux/ubuntu jammy stable
deb [arch=arm64 signed-by=/usr/share/keyrings/docker.gpg] https://repo.huaweicloud.com/docker-ce/linux/ubuntu jammy stable

Remove all Docker GPG keys (apply when key errors occur)

orangepi@orangepi5pro:~$ sudo rm -f /usr/share/keyrings/docker.gpg
orangepi@orangepi5pro:~$ sudo rm -f /etc/apt/trusted.gpg.d/docker*.gpg

Re-add the Docker GPG key (apply when key errors occur)

orangepi@orangepi5pro:~$ sudo mkdir -p /usr/share/keyrings
orangepi@orangepi5pro:~$ curl -fsSL https://repo.huaweicloud.com/docker-ce/linux/ubuntu/gpg \
 | sudo gpg --dearmor -o /usr/share/keyrings/docker.gpg

Update again

orangepi@orangepi5pro:~$ sudo apt update

Download the source

orangepi@orangepi5pro:~/Llama$ source .venv/bin/activate
(.venv) orangepi@orangepi5pro:~/Llama$ git clone https://github.com/NotPunchnox/web-client-rkllm.git

Run

(.venv) orangepi@orangepi5pro:~/Llama$ cd web-client-rkllm/
orangepi@orangepi5pro:~/Llama/web-client-rkllm$ ./start.sh
Node.js is already installed (version: v24.11.1).
Starting rkllama Web Server on port 3000...
 ERROR  Cannot copy server address to clipboard: Couldn't find the `xsel` binary and fallback didn't work. On Debian/Ubuntu you can install xsel with: sudo apt install xsel.
 
   ┌───────────────────────────────────────────┐
   │                                           │
   │   Serving!                                │
   │                                           │
   │   - Local:    http://localhost:3000       │
   │   - Network:  http://192.168.0.218:3000   │
   │                                           │
   └───────────────────────────────────────────┘
 
rkllama Web Server is now running at http://localhost:3000
Port used: 3000
Press Ctrl+C to stop the server.

Modify script.js

(.venv) orangepi@orangepi5pro:~/Llama/web-client-rkllm$ vi src/js/script.js
// const API_URL = 'http://localhost:8080/';
const API_URL = 'http://192.168.0.218:8080/';


// sendBtn.addEventListener('click', sendMessage);
sendBtn.addEventListener('click', sendChat);
userInput.addEventListener('keypress', (e) => {
    if (e.key === 'Enter') sendChat(); // sendMessage();
});


async function sendChat() {
    const modelName = modelSelect.value;
    if (!modelName) return;
    
    // Prevent double sending
    if (requestInProgress) return;

    const message = userInput.value.trim();
    const selectedModel = modelSelect.value;

    if (!message) return;
    if (!selectedModel) {
        addMessage('error', 'Please select a model first');
        return;
    }

    // Add user message to chat
    addMessage('user', message);
    HISTORY.push({ role: 'user', content: message });
    userInput.value = '';

    // Prepare for sending
    requestInProgress = true;
    chatLoading.style.display = 'flex';

    try {
        const response = await fetch(API_URL + '/v1/chat/completions', {
            method: 'POST',
            headers: { 'Content-Type': 'application/json' },
            body: JSON.stringify({ model: modelName, messages: HISTORY, stream: true })
        });

        if (!response.ok) {
            const errorData = await response.json();
            throw new Error(errorData.error || 'Failed to generate response');
        }

        // Create message container
        const messageDiv = document.createElement('div');
        messageDiv.className = 'message';

        // Add author
        const authorSpan = document.createElement('strong');
        authorSpan.textContent = `RKLLAMA (${selectedModel}): `;
        messageDiv.appendChild(authorSpan);

        // Add content container for streaming
        const contentContainer = document.createElement('div');
        contentContainer.className = 'message-content';
        messageDiv.appendChild(contentContainer);

        chatBox.appendChild(messageDiv);

        // Stream response - define the reader here
        const reader = response.body.getReader();
        const decoder = new TextDecoder('utf-8');
        let assistantMessage = '';

        // Function that reads the stream
        async function readStream() {
            let done = false;

            while (!done) {
                const { value, done: doneReading } = await reader.read();
                done = doneReading;

                if (done) break;

                const chunk = decoder.decode(value, { stream: true });
                try {
                    const lines = chunk.split('\n');

                    for (const line of lines) {
                        if (!line.trim()) continue;
                        
                        let jsonChunk = '';
                        if (line.startsWith("data: ")) {
                          const jsonString = line.replace(/^data:\s*/, "");
                          jsonChunk = JSON.parse(jsonString);
                        } else {
                          jsonChunk = JSON.parse(line);
                        }

                        // const jsonChunk = JSON.parse(line);
                        console.log(jsonChunk);

                        if (jsonChunk.choices && jsonChunk.choices[0].delta && jsonChunk.choices[0].delta.content) {
                            assistantMessage += jsonChunk.choices[0].delta.content;
                            processContent(contentContainer, assistantMessage);
                            chatBox.scrollTop = chatBox.scrollHeight;
                        }
                    }
                } catch (e) {
                    console.error('Error parsing chunk:', e);
                }
            }
        }

        await readStream();

        // Add to history and save conversation
        HISTORY.push({ role: 'assistant', content: assistantMessage });
        saveConversation();
    } catch (error) {
        addMessage('error', error.message);
    } finally {
        requestInProgress = false;
        chatLoading.style.display = 'none';
    }
}

async function showModelInfo() {
    const modelName = modelSelect.value;
    if (!modelName) return;

    try {
        // Show loading state
        // modelSelect.disabled = true;

        const res = await fetch(API_URL + 'api/show', {
            method: 'POST',
            headers: { 'Content-Type': 'application/json' },
            body: JSON.stringify({ name: modelName })
        });
        
        if (!res.ok) throw new Error('Failed to fetch models');

        const data = await res.json();
        console.log('showApi.data = ' + data);
        
        /*
        modelSelect.innerHTML = '<option value="">Select a model</option>';

        if (data.models && data.models.length > 0) {
            data.models.forEach(model => {
                modelSelect.innerHTML += `<option value="${model}">${model}</option>`;
            });
            addMessage('system', `${data.models.length} models available. Select one to start chatting.`);
        } else {
            addMessage('system', 'No models available. Download a model to get started.');
        }
        */
    } catch (err) {
        console.error('Error loading models:', err);
        addMessage('error', `Could not load models: ${err.message}`);
    }
}

Open the web client in a browser

728x90
728x90

출처

Download the source

orangepi@orangepi-desktop:~/Llama$ source .venv/bin/activate
(.venv) orangepi@orangepi-desktop:~/Llama$ git clone https://github.com/notpunchnox/rkllama

Update the Python packaging tools to their latest versions

(.venv) orangepi@orangepi-desktop:~/Llama/rkllama$ pip install --upgrade pip setuptools wheel

Install RKLLama

(.venv) orangepi@orangepi-desktop:~/Llama/rkllama$ python -m pip install .
 
Successfully built rkllama
Installing collected packages: zipp, Werkzeug, typing-inspection, safetensors, requests, regex, pyyaml, python-dotenv, pydantic-core, itsdangerous, hf-xet, h11, exceptiongroup, click, blinker, annotated-types, rknn-toolkit-lite2, pydantic, importlib_metadata, huggingface_hub, httpcore, Flask, anyio, tokenizers, httpx, flask-cors, transformers, diffusers, rkllama
  Attempting uninstall: requests
    Found existing installation: requests 2.32.5
    Uninstalling requests-2.32.5:
      Successfully uninstalled requests-2.32.5
Successfully installed Flask-2.3.2 Werkzeug-3.1.4 annotated-types-0.7.0 anyio-4.12.0 blinker-1.9.0 click-8.3.1 diffusers-0.36.0 exceptiongroup-1.3.1 flask-cors-6.0.1 h11-0.16.0 hf-xet-1.2.0 httpcore-1.0.9 httpx-0.28.1 huggingface_hub-0.36.0 importlib_metadata-8.7.0 itsdangerous-2.2.0 pydantic-2.12.5 pydantic-core-2.41.5 python-dotenv-1.2.1 pyyaml-6.0.3 regex-2025.11.3 requests-2.31.0 rkllama-0.0.52 rknn-toolkit-lite2-2.3.2 safetensors-0.7.0 tokenizers-0.22.1 transformers-4.57.3 typing-inspection-0.4.2 zipp-3.23.0

Run the RKLLama server

(.venv) orangepi@orangepi-desktop:~/Llama/rkllama$ mkdir ~/Llama/models
(.venv) orangepi@orangepi5pro:~/Llama/rkllama$ rkllama_server --debug --models ~/Llama/models
Disabling PyTorch because PyTorch >= 2.1 is required but found 2.0.1
None of PyTorch, TensorFlow >= 2.0, or Flax have been found. Models won't be available and only tokenizers, configuration and file/data utilities can be used.
2025-12-09 22:16:51,896 - rkllama.worker - INFO - Models Monitor running.
2025-12-09 22:16:51,911 - rkllama.config - INFO - Created directory: /home/orangepi/Llama/.venv/lib/python3.10/site-packages/rkllama/config/data
2025-12-09 22:16:51,911 - rkllama.config - INFO - Created directory: /home/orangepi/Llama/.venv/lib/python3.10/site-packages/rkllama/config/temp
Start the API at http://localhost:8080
 * Serving Flask app 'rkllama.server.server'
 * Debug mode: off
2025-12-09 22:16:51,914 - werkzeug - INFO - WARNING: This is a development server. Do not use it in a production deployment. Use a production WSGI server instead.
 * Running on all addresses (0.0.0.0)
 * Running on http://127.0.0.1:8080
 * Running on http://192.168.0.217:8080
2025-12-09 22:16:51,914 - werkzeug - INFO - Press CTRL+C to quit

Download models

punchnox/Tinnyllama-1.1B-rk3588-rkllm-1.1.4

(.venv) orangepi@orangepi-desktop:~/Llama$ rkllama_client pull
Repo ID ( example: punchnox/Tinnyllama-1.1B-rk3588-rkllm-1.1.4 ): punchnox/Tinnyllama-1.1B-rk3588-rkllm-1.1.4
File ( example: TinyLlama-1.1B-Chat-v1.0-rk3588-w8a8-opt-0-hybrid-ratio-0.5.rkllm ): TinyLlama-1.1B-Chat-v1.0-rk3588-w8a8-opt-0-hybrid-ratio-0.5.rkllm
Custom Model Name ( example: tinyllama-chat:1.1b ): tinyllama-chat:1.1b
 
Downloading TinyLlama-1.1B-Chat-v1.0-rk3588-w8a8-opt-0-hybrid-ratio-0.5.rkllm (1126.29 MB)...
Progress: [##################################################] 100.00%
Download complete.
(.venv) orangepi@orangepi-desktop:~/Llama$ rkllama_client list
Available models:
- tinyllama-chat:1.1b

c01zaut/Qwen2.5-3B-Instruct-RK3588-1.1.4

(.venv) orangepi@orangepi-desktop:~/Llama$ rkllama_client pull
Repo ID ( example: punchnox/Tinnyllama-1.1B-rk3588-rkllm-1.1.4 ): c01zaut/Qwen2.5-3B-Instruct-RK3588-1.1.4
File ( example: TinyLlama-1.1B-Chat-v1.0-rk3588-w8a8-opt-0-hybrid-ratio-0.5.rkllm ): Qwen2.5-3B-Instruct-rk3588-w8a8-opt-0-hybrid-ratio-0.5.rkllm
Custom Model Name ( example: tinyllama-chat:1.1b ): Qwen2.5-3B-Instruct-RK3588:1.1.4
 
Downloading Qwen2.5-3B-Instruct-rk3588-w8a8-opt-0-hybrid-ratio-0.5.rkllm (3565.17 MB)...
Progress: [##################################################] 100.00%
Download complete.
 
(.venv) orangepi@orangepi-desktop:~/Llama$ rkllama_client list
Available models:
- Qwen2.5-3B-Instruct-RK3588:1.1.4
- tinyllama-chat:1.1b
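
With the RKLLama server started above still running on port 8080, the OpenAI-style endpoint that the web client calls can be exercised directly with curl; this is a sketch assuming that endpoint and the model name downloaded above:

$ curl -s http://localhost:8080/v1/chat/completions \
   -H "Content-Type: application/json" \
   -d '{"model": "tinyllama-chat:1.1b", "messages": [{"role": "user", "content": "Hello"}], "stream": false}'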

Download tokenizer files of c01zaut/Qwen2.5-3B-Instruct-RK3588-1.1.4 for offline use

(.venv) orangepi@orangepi5pro:~/Llama$ huggingface-cli download \
 c01zaut/Qwen2.5-3B-Instruct-RK3588-1.1.4 \
 --local-dir ~/Llama/models/Qwen2.5-3B-Instruct-RK3588\:1.1.4/ \
 --include "tokenizer.*" "vocab.*" "merges.*" "special_tokens_map.json" "tokenizer_config.json" "config.json"

Run the model

Model load error

2025-12-16 12:14:03,387 - werkzeug - INFO - 127.0.0.1 - - [16/Dec/2025 12:14:03] "GET / HTTP/1.1" 200 -
FROM: Qwen2.5-3B-Instruct-rk3588-w8a8-opt-0-hybrid-ratio-0.5.rkllm
HuggingFace Path: c01zaut/Qwen2.5-3B-Instruct-RK3588-1.1.4
2025-12-16 12:14:03,440 - rkllama.rkllm - DEBUG - Initializing RKLLM model from /home/orangepi/Llama/models/Qwen2.5-3B-Instruct-RK3588:1.1.4/Qwen2.5-3B-Instruct-rk3588-w8a8-opt-0-hybrid-ratio-0.5.rkllm with options: {'temperature': '0.5', 'num_ctx': '16384', 'max_new_tokens': '16384', 'top_k': '7', 'top_p': '0.5', 'repeat_penalty': '1.1', 'frequency_penalty': '0.0', 'presence_penalty': '0.0', 'mirostat': '0', 'mirostat_tau': '3', 'mirostat_eta': '0.1', 'from': '"Qwen2.5-3B-Instruct-rk3588-w8a8-opt-0-hybrid-ratio-0.5.rkllm"', 'huggingface_path': '"c01zaut/Qwen2.5-3B-Instruct-RK3588-1.1.4"', 'system': '""', 'enable_thinking': 'False'}
I rkllm: rkllm-runtime version: 1.2.3, rknpu driver version: 0.9.8, platform: RK3588
I rkllm: loading rkllm model from /home/orangepi/Llama/models/Qwen2.5-3B-Instruct-RK3588:1.1.4/Qwen2.5-3B-Instruct-rk3588-w8a8-opt-0-hybrid-ratio-0.5.rkllm
E rkllm: max_context[16384] must be less than the model's max_context_limit[4096]
2025-12-16 12:14:04,137 - rkllama.worker - ERROR - Failed creating the worker for model 'Qwen2.5-3B-Instruct-RK3588:1.1.4': Failed to initialize RKLLM model: -1
2025-12-16 12:14:04,145 - werkzeug - INFO - 127.0.0.1 - - [16/Dec/2025 12:14:04] "POST /load_model HTTP/1.1" 400 -

Edit the Modelfile

(.venv) orangepi@orangepi5pro:~/Llama$ vi models/Qwen2.5-3B-Instruct-RK3588\:1.1.4/Modelfile
NUM_CTX=4096

MAX_NEW_TOKENS=2048

Run the model

(.venv) orangepi@orangepi5pro:~/Llama/models/Qwen2.5-3B-Instruct-RK3588:1.1.4$ rkllama_client run Qwen2.5-3B-Instruct-RK3588:1.1.4
Model Qwen2.5-3B-Instruct-RK3588:1.1.4 loaded successfully.
Available commands:
/help           : Displays this help menu.
/clear          : Clears the current conversation history.
/cls or /c      : Clears the console content.
/set stream     : Enables stream mode.
/unset stream   : Disables stream mode.
/set verbose    : Enables verbose mode.
/unset verbose  : Disables verbose mode.
/set system     : Modifies the system message.
exit            : Exits the conversation.
 
You: Hello
Assistant: Hello there! How can I assist you today? Whether it's answering questions, helping with tasks, or just chatting, feel free to let me know how I can help.
 
You: 안녕
Assistant: 글렌치니# 안녕하세요! 무엇을 도와드릴까요?
 
You: 세종대왕 알려죠
Assistant: Qwen는 역사적인 세부 사항에 대해 제공하거나 질문에 답변하지 못합니다. 하지만 세종대왕은 조선 시대의 유명한 왕으로, 그의 공과 업적을 알고 싶으시다면 말씀해 주세요. 다른 궁금하신 점이 있으신가요?
 
You: 

Offline handling

A .rkllm model does not embed a tokenizer; it simply receives the prompt string as-is. AutoTokenizer, on the other hand, relies on Hugging Face API calls or cached files, so it runs into several restrictions in a network-blocked offline environment. For this reason, when using .rkllm models I set up a prompt-handling path optimized for offline use that avoids remote AutoTokenizer lookups.

Set the environment variables -> restart the server

$ export HF_HUB_OFFLINE=1
$ export TRANSFORMERS_OFFLINE=1

Modify server_utils.py

(.venv) orangepi@orangepi5pro:~/Llama$ vi .venv/lib/python3.10/site-packages/rkllama/api/server_utils.py
    @staticmethod
    def prepare_prompt(model_name, messages, system="", tools=None, enable_thinking=False):
        """Prepare prompt with proper system handling"""
        
        # Begin modification
        # Get the model-specific tokenizer from Hugging Face as specified in the Modelfile
        hf_offline = os.getenv("HF_HUB_OFFLINE", "0")
        print(f"[ENV] HF_HUB_OFFLINE = {hf_offline}")
        if hf_offline == "1":
            print("👉 HuggingFace OFFLINE mode enabled")
            
            rkllama_config_models = rkllama.config.get_path("models")
            print(f"model_name = {model_name}, rkllama_config_models = {rkllama_config_models}")
        
            # tokenizer = AutoTokenizer.from_pretrained("/home/orangepi/Llama/models/Qwen2.5-3B-Instruct-RK3588:1.1.4", trust_remote_code=True, local_files_only=True)
        
            tokenizer_model_path = os.path.join(rkllama_config_models, model_name)
            print(f"tokenizer_model_path = {tokenizer_model_path}")
            if os.path.isdir(tokenizer_model_path):
                # Runs only when the tokenizer files exist locally
                tokenizer = AutoTokenizer.from_pretrained(tokenizer_model_path, trust_remote_code=True, local_files_only=True)
            else:
                # Fail early: without a local tokenizer, 'tokenizer' would be
                # unbound below and the request would crash anyway.
                print(f"Model path not found: {tokenizer_model_path}")
                raise FileNotFoundError(f"Model path not found: {tokenizer_model_path}")
        else:
            print("👉 HuggingFace ONLINE mode enabled")
            model_in_hf = get_property_modelfile(model_name, "HUGGINGFACE_PATH", rkllama.config.get_path("models")).replace('"', '').replace("'", "")

            # Get the tokenizer configured for the model
            tokenizer = AutoTokenizer.from_pretrained(model_in_hf, trust_remote_code=True)
        # End modification
            
        supports_system_role = "raise_exception('System role not supported')" not in tokenizer.chat_template
        
        if system and supports_system_role:
            prompt_messages = [{"role": "system", "content": system}] + messages
        else:
            prompt_messages = messages
        
        prompt_tokens = tokenizer.apply_chat_template(prompt_messages, tools=tools, tokenize=True, add_generation_prompt=True, enable_thinking=enable_thinking)

        return tokenizer, prompt_tokens, len(prompt_tokens)

Armbian Linux v6.1

Updating the rknpu2 driver would mean compiling the Linux kernel myself, so I did a fresh install of Armbian Linux v6.1 instead (its vendor kernel already ships a newer RKNPU driver).

orangepi@orangepi5pro:~$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Armbian 26.2.0-trunk.71 noble
Release:        24.04
Codename:       noble

Check the rknpu2 driver version

orangepi@orangepi5pro:~$ sudo cat /sys/kernel/debug/rknpu/version
RKNPU driver: v0.9.8

Check the currently installed Python version

orangepi@orangepi5pro:~$ python3
Python 3.12.3 (main, Nov  6 2025, 13:44:16) [GCC 13.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> quit()

Remove the existing Python installation

orangepi@orangepi5pro:~$ sudo apt remove python3 python3-pip python3-venv
orangepi@orangepi5pro:~$ sudo apt autoremove

Install the packages required to build Python 3.11

orangepi@orangepi5pro:~$ sudo apt update
orangepi@orangepi5pro:~$ sudo apt install build-essential zlib1g-dev libncurses5-dev libgdbm-dev libnss3-dev libssl-dev libreadline-dev libffi-dev libsqlite3-dev wget libbz2-dev pkg-config lzma liblzma-dev cmake libx11-dev libxext-dev libxft-dev libxss-dev libxrender-dev libfontconfig1-dev libxinerama-dev libxrandr-dev libxcursor-dev libminizip-dev libzstd-dev zip unzip

Install Tcl 8.6

orangepi@orangepi5pro:~/Llama$ tar xvf tcl8.6.16-src.tar.gz
orangepi@orangepi5pro:~/Llama$ cd tcl8.6.16/unix/
orangepi@orangepi5pro:~/Llama/tcl8.6.16/unix$ ./configure --prefix=/usr/local
orangepi@orangepi5pro:~/Llama/tcl8.6.16/unix$ make -j$(nproc)
orangepi@orangepi5pro:~/Llama/tcl8.6.16/unix$ sudo make install

Install Tk 8.6

orangepi@orangepi5pro:~/Llama$ tar xvf tk8.6.16-src.tar.gz
orangepi@orangepi5pro:~/Llama$ cd tk8.6.16/unix/
orangepi@orangepi5pro:~/Llama/tk8.6.16/unix$ ./configure --prefix=/usr/local --with-tcl=/usr/local/lib
orangepi@orangepi5pro:~/Llama/tk8.6.16/unix$ make -j$(nproc)
orangepi@orangepi5pro:~/Llama/tk8.6.16/unix$ sudo make install

Download the Python 3.11 source

orangepi@orangepi5pro:~/Llama$ wget https://www.python.org/ftp/python/3.11.14/Python-3.11.14.tgz

Extract the Python 3.11 source

orangepi@orangepi5pro:~/Llama$ tar -xvf Python-3.11.14.tgz

Run configure

orangepi@orangepi5pro:~/Llama$ cd Python-3.11.14/
orangepi@orangepi5pro:~/Llama/Python-3.11.14$ export TCLTK_CFLAGS="-I/usr/local/include"
orangepi@orangepi5pro:~/Llama/Python-3.11.14$ export TCLTK_LIBS="-L/usr/local/lib -ltcl8.6 -ltk8.6"
orangepi@orangepi5pro:~/Llama/Python-3.11.14$ ./configure --enable-optimizations

Build Python 3.11

orangepi@orangepi5pro:~/Llama/Python-3.11.14$ grep -c processor /proc/cpuinfo
8
orangepi@orangepi5pro:~/Llama/Python-3.11.14$ make -j$(nproc)
orangepi@orangepi5pro:~/Llama/Python-3.11.14$ sudo make install
orangepi@orangepi5pro:~/Llama/Python-3.11.14$ sudo ln -s /usr/local/bin/python3.11 /usr/local/bin/python

Verify the Python 3.11 installation

orangepi@orangepi5pro:~/Llama/Python-3.11.14$ python --version
Python 3.11.14
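
Since Tcl/Tk and the compression libraries were installed specifically for this build, it is worth repeating the module check used after the PPA install to confirm the optional modules were compiled in:

$ python -c "import ssl, sqlite3, bz2, lzma, tkinter; print('Source build is healthy')"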

Install pip

orangepi@orangepi5pro:~/Llama/Python-3.11.14$ wget https://bootstrap.pypa.io/get-pip.py
orangepi@orangepi5pro:~/Llama/Python-3.11.14$ python get-pip.py
orangepi@orangepi5pro:~/Llama/Python-3.11.14$ pip3 install --upgrade pip
orangepi@orangepi5pro:~/Llama/Python-3.11.14$ pip3 --version
pip 25.3 from /home/orangepi/.local/lib/python3.11/site-packages/pip (python 3.11)

Create a virtual environment

orangepi@orangepi5pro:~/Llama$ python -m venv .venv
orangepi@orangepi5pro:~/Llama$ source .venv/bin/activate
(.venv) orangepi@orangepi5pro:~/Llama$