
Create the working directory

(.venv) bluesanta@bluesanta-desktop:~/llm$ sudo mkdir -p /opt/llama.cpp
(.venv) bluesanta@bluesanta-desktop:~/llm$ sudo chown bluesanta:bluesanta -Rf /opt/llama.cpp/

Create the service file

(.venv) bluesanta@bluesanta-desktop:~/llm$ sudo vi /etc/systemd/system/llama.service
[Unit]
Description=Llama.cpp Server Service
After=network.target

[Service]
# User account to run as
User=bluesanta
Group=bluesanta
WorkingDirectory=/opt/llama.cpp

# Optimized launch command
ExecStart=/usr/local/bin/llama-server \
    -m /home/bluesanta/llm/models/gemma-4-26b-Q4_K_M.gguf \
    --host 0.0.0.0 \
    --port 8080 \
    --ctx-size 4096 \
    --n-gpu-layers 99 \
    --threads 12 \
    --no-mmap \
    --mlock \
    --cont-batching \
    --metrics
    
# Automatically restart if the process exits
# Restart=always
# RestartSec=5

[Install]
WantedBy=multi-user.target

Register the service

(.venv) bluesanta@bluesanta-desktop:~/llm$ sudo systemctl enable llama.service
Created symlink /etc/systemd/system/multi-user.target.wants/llama.service → /etc/systemd/system/llama.service.

Start the service

(.venv) bluesanta@bluesanta-desktop:~/llm$ sudo systemctl start llama

Check the service status

(.venv) bluesanta@bluesanta-desktop:~/llm$ sudo systemctl status llama
● llama.service - Llama.cpp Server Service
     Loaded: loaded (/etc/systemd/system/llama.service; enabled; vendor preset: enabled)
     Active: active (running) since Tue 2026-04-28 10:58:28 KST; 3min 47s ago
   Main PID: 8449 (llama-server)
      Tasks: 16 (limit: 74767)
     Memory: 10.7G
        CPU: 8.245s
     CGroup: /system.slice/llama.service
             └─8449 /usr/local/bin/llama-server -m /home/bluesanta/llm/models/gemma-4-26b-Q4_K_M.gguf --host 0.0.0.0 --port 8080 --ctx-size 4096 --n-gpu-layers 99 --threads 12 --no-mmap --mlock --cont-batching --metrics
 
 4월 28 10:58:37 bluesanta-desktop llama-server[8449]: Hi there<turn|>
 4월 28 10:58:37 bluesanta-desktop llama-server[8449]: <|turn>user
 4월 28 10:58:37 bluesanta-desktop llama-server[8449]: How are you?<turn|>
 4월 28 10:58:37 bluesanta-desktop llama-server[8449]: <|turn>model
 4월 28 10:58:37 bluesanta-desktop llama-server[8449]: '
 4월 28 10:58:37 bluesanta-desktop llama-server[8449]: srv          init: init: chat template, thinking = 1
 4월 28 10:58:37 bluesanta-desktop llama-server[8449]: main: model loaded
 4월 28 10:58:37 bluesanta-desktop llama-server[8449]: main: server is listening on http://0.0.0.0:8080
 4월 28 10:58:37 bluesanta-desktop llama-server[8449]: main: starting the main loop...
 4월 28 10:58:37 bluesanta-desktop llama-server[8449]: srv  update_slots: all slots are idle

Verify

bluesanta@bluesanta-desktop:~$ curl http://localhost:8080/completion \
-H "Content-Type: application/json" \
-d '{
  "prompt": "Jetson AGX Orin의 장점 3가지는?",
  "n_predict": 256
}'
 
{"index":0,"content":"\n\n            \n\n most major or im3 \n most major or im3     \n most          \n most       thought\n<channel|>NVIDIA **Jetson AGX Orin**은 임베디드 시스템(Edge AI) 분야에서 현존하는 가장 강력한 성능을 제공하는 모듈 중 하나입니다. 질문하신 내용에 대해 3가지 핵심 장점을 정리해 드립니다.\n\n### 1. 압도적인 AI 연산 성능 (High Through-put)\nJetson AGX Orin의 가장 큰 장점은 동급 임베디드 모듈 중 최고 수준의 **275 TOPS**(INT8 기준)라는 강력한 AI 연산 능력을 갖추고 있다는 점입니다.\n*   **장점:** 복잡한 딥러닝 모델(CNN, Transformer 등)을 지연 시간(Latency) 없이 실시간으로 실행할 수 있습니다. 이는 자율주행 로봇이나 정밀 의료 기기처럼 초 단위의 빠른 판단이 필요한 시스템에서 결정적인 역할을 합니다.\n\n### 2. 탁월한 전력 효율성 (Performance per Watt)\nAGX Orin은 고성능을 유지하면서도 전력 소모를 정밀하게 제어할 수","tokens":[],"id_slot":3,"stop":true,"model":"gemma-4-26b-Q4_K_M.gguf","tokens_predicted":256,"tokens_evaluated":15,"generation_settings":{"seed":4294967295,"temperature":1.0,"dynatemp_range":0.0,"dynatemp_exponent":1.0,"top_k":64,"top_p":0.949999988079071,"min_p":0.05000000074505806,"top_n_sigma":-1.0,"xtc_probability":0.0,"xtc_threshold":0.10000000149011612,"typical_p":1.0,"repeat_last_n":64,"repeat_penalty":1.0,"presence_penalty":0.0,"frequency_penalty":0.0,"dry_multiplier":0.0,"dry_base":1.75,"dry_allowed_length":2,"dry_penalty_last_n":4096,"dry_sequence_breakers":["\n",":","\"","*"],"mirostat":0,"mirostat_tau":5.0,"mirostat_eta":0.10000000149011612,"stop":[],"max_tokens":256,"n_predict":256,"n_keep":0,"n_discard":0,"ignore_eos":false,"stream":false,"logit_bias":[],"n_probs":0,"min_keep":0,"grammar":"","grammar_lazy":false,"grammar_triggers":[],"preserved_tokens":[],"chat_format":"Content-only","reasoning_format":"deepseek","reasoning_in_content":false,"generation_prompt":"","samplers":["penalties","dry","top_n_sigma","top_k","typ_p","top_p","min_p","xtc","temperature"],"speculative.n_max":16,"speculative.n_min":0,"speculative.p_min":0.75,"speculative.type":"none","speculative.ngram_size_n":1024,"speculative.ngram_size_m":1024,"speculative.ngram_m_hits":1024,"timings_per_token":false,"post_sampling_probs":false,"backend_sampling":false,"lora":[]},"prom
pt":"Jetson AGX Orin의 장점 3가지는?","has_new_line":true,"truncated":false,"stop_type":"limit","stopping_word":"","tokens_cached":270,"timings":{"cache_n":0,"prompt_n":15,"prompt_ms":9557.021,"prompt_per_token_ms":637.1347333333334,"prompt_per_second":1.5695267385098348,"predicted_n":256,"predicted_ms":7686.338,"predicted_per_token_ms":30.0247578125,"predicted_per_second":33.305847335883485}}
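The `timings` object at the end of the response is the quickest way to gauge throughput. A minimal sketch that recomputes tokens-per-second from those fields (the literal values below are copied from the response above; the field names are llama-server's):

```python
import json

# Fields excerpted from the /completion response shown above.
response = json.loads("""{
  "tokens_predicted": 256,
  "timings": {
    "prompt_n": 15, "prompt_ms": 9557.021,
    "predicted_n": 256, "predicted_ms": 7686.338
  }
}""")

t = response["timings"]
prompt_tps = t["prompt_n"] / t["prompt_ms"] * 1000     # prompt processing speed
gen_tps = t["predicted_n"] / t["predicted_ms"] * 1000  # generation speed

print(f"prompt: {prompt_tps:.2f} tok/s, generation: {gen_tps:.2f} tok/s")
```

These recomputed numbers match the `prompt_per_second` and `predicted_per_second` values the server reports itself (about 1.57 and 33.3 tok/s here).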

Source

Check the JetPack version

bluesanta@ubuntu:~$ sudo apt show nvidia-jetpack
Package: nvidia-jetpack
Version: 6.2.2+b24
Priority: standard
Section: metapackages
Source: nvidia-jetpack (6.2.2)
Maintainer: NVIDIA Corporation
Installed-Size: 199 kB
Depends: nvidia-jetpack-runtime (= 6.2.2+b24), nvidia-jetpack-dev (= 6.2.2+b24)
Homepage: http://developer.nvidia.com/jetson
Download-Size: 29.3 kB
APT-Sources: https://repo.download.nvidia.com/jetson/common r36.5/main arm64 Packages
Description: NVIDIA Jetpack Meta Package

Check the CUDA version

bluesanta@ubuntu:~$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2024 NVIDIA Corporation
Built on Wed_Aug_14_10:14:07_PDT_2024
Cuda compilation tools, release 12.6, V12.6.68
Build cuda_12.6.r12.6/compiler.34714021_0

Create a virtual environment

bluesanta@bluesanta-desktop:~$ cd llm
bluesanta@bluesanta-desktop:~/llm$ python -m venv .venv
bluesanta@bluesanta-desktop:~/llm$ source .venv/bin/activate
(.venv) bluesanta@bluesanta-desktop:~/llm$ 

Install build packages

bluesanta@ubuntu:~$ sudo apt install -y git cmake build-essential libopenblas-dev

Build llama.cpp

(.venv) bluesanta@ubuntu:~/llm$ git clone https://github.com/ggerganov/llama.cpp
(.venv) bluesanta@ubuntu:~/llm$ cd llama.cpp
(.venv) bluesanta@ubuntu:~/llm/llama.cpp$ cmake -B build -DGGML_CUDA=ON -DCMAKE_BUILD_TYPE=Release
(.venv) bluesanta@ubuntu:~/llm/llama.cpp$ cmake --build build -j$(nproc)
(.venv) bluesanta@ubuntu:~/llm/llama.cpp$ sudo cmake --install build

Verify the llama.cpp installation

bluesanta@ubuntu:~/llm/llama.cpp$ llama-cli --version
ggml_cuda_init: found 1 CUDA devices (Total VRAM: 62827 MiB):
  Device 0: Orin, compute capability 8.7, VMM: yes, VRAM: 62827 MiB
version: 8845 (037bfe38d)
built with GNU 11.4.0 for Linux aarch64

Download models (Hugging Face)

(.venv) bluesanta@ubuntu:~/llm$ pip install -U "huggingface_hub[cli]"
(.venv) bluesanta@ubuntu:~/llm$ hf download google/gemma-4-26B-A4B-it --local-dir ~/llm/models/gemma-4-original
(.venv) bluesanta@ubuntu:~/llm$ hf download RedHatAI/gemma-4-26B-A4B-it-NVFP4 --local-dir ~/llm/models/gemma-4-26B-A4B-it-NVFP4
(.venv) bluesanta@ubuntu:~/llm$ hf download nvidia/Gemma-4-31B-IT-NVFP4 --local-dir ~/llm/models/gemma-4-31b-it-nvfp4

Install PyTorch

Remove any existing PyTorch

bluesanta@ubuntu:~/llm$ pip uninstall -y torch torchvision torchaudio

Download PyTorch

https://pypi.jetson-ai-lab.io/jp6/cu126

Install PyTorch

(.venv) bluesanta@bluesanta-desktop:~/llm/download$ pip install torch-2.11.0-cp310-cp310-linux_aarch64.whl

Verify PyTorch

(.venv) bluesanta@bluesanta-desktop:~/llm/download$ python -c "import torch; print(torch.cuda.is_available()); print(torch.cuda.get_device_name(0))"
True
Orin

Install CMake

bluesanta@ubuntu:~/llm/vllm$ wget -O - https://apt.kitware.com/keys/kitware-archive-latest.asc 2>/dev/null | gpg --dearmor - | sudo tee /usr/share/keyrings/kitware-archive-keyring.gpg >/dev/null
bluesanta@ubuntu:~/llm/vllm$ echo 'deb [signed-by=/usr/share/keyrings/kitware-archive-keyring.gpg] https://apt.kitware.com/ubuntu/ jammy main' | sudo tee /etc/apt/sources.list.d/kitware.list >/dev/null
bluesanta@ubuntu:~/llm/vllm$ sudo apt update
bluesanta@ubuntu:~/llm/vllm$ cmake --version
cmake version 4.3.1
 
CMake suite maintained and supported by Kitware (kitware.com/cmake).

Install vLLM

(.venv) bluesanta@bluesanta-desktop:~/llm/vllm$ pip uninstall -y vllm
(.venv) bluesanta@bluesanta-desktop:~/llm/vllm$ git clone https://github.com/vllm-project/vllm.git
(.venv) bluesanta@bluesanta-desktop:~/llm/vllm$ cd vllm
(.venv) bluesanta@bluesanta-desktop:~/llm/vllm$ pip install setuptools_scm
(.venv) bluesanta@bluesanta-desktop:~/llm/vllm$ pip install --upgrade pip setuptools setuptools-scm wheel
(.venv) bluesanta@bluesanta-desktop:~/llm/vllm$ sudo apt install -y ninja-build 
(.venv) bluesanta@bluesanta-desktop:~/llm/vllm$ MAX_JOBS=$(nproc) pip install -e . --user
 
Installing vllm script to /home/bluesanta/llm/.venv/bin

Run gemma-4-31b-it-nvfp4 (the --gpu-memory-utilization 0.8 option is needed because of limited memory)

Fails to run with this error: (EngineCore pid=2446) ERROR 04-19 16:35:23 [core.py:1132] ValueError("type fp8e4nv not supported in this architecture. The supported fp8 dtypes are ('fp8e4b15', 'fp8e5')")

fp8e4nv is only supported on H100 (Hopper) GPUs

bluesanta@ubuntu:~/llm$ vllm serve ~/llm/models/gemma-4-31b-it-nvfp4 --quantization modelopt --tensor-parallel-size 1 --gpu-memory-utilization 0.8
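The failure is consistent with the device's compute capability: Triton only enables its fp8e4nv (FP8 E4M3) dtype on newer silicon than Orin's sm_87. A tiny illustrative guard, assuming the sm_89 (Ada) cutoff used by recent Triton releases; as noted above, on older stacks the effective requirement may be Hopper (sm_90), so treat the threshold as an assumption to check against your Triton version:

```python
def supports_fp8e4nv(major: int, minor: int) -> bool:
    """Illustrative helper: True if the compute capability is assumed
    new enough for Triton's fp8e4nv dtype (sm_89+ is assumed here)."""
    return (major, minor) >= (8, 9)

# Jetson AGX Orin reports compute capability 8.7 (see the llama-cli output above)
print(supports_fp8e4nv(8, 7))  # → False (Orin)
print(supports_fp8e4nv(9, 0))  # → True  (H100, Hopper)
```

On a live system the pair comes from `torch.cuda.get_device_capability(0)`, which returns (8, 7) on Orin.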

Install torchvision, torchaudio, and NumPy

(.venv) bluesanta@ubuntu:~/llm/download$ pip install torchvision-0.26.0-cp310-cp310-linux_aarch64.whl
(.venv) bluesanta@ubuntu:~/llm/download$ pip install torchaudio-2.10.0-cp310-cp310-linux_aarch64.whl
(.venv) bluesanta@ubuntu:~/llm/download$ pip install "numpy<2.0"

Check the CUDA version (after installing cuda-toolkit)

bluesanta@ubuntu:~$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2024 NVIDIA Corporation
Built on Wed_Aug_14_10:14:07_PDT_2024
Cuda compilation tools, release 12.6, V12.6.68
Build cuda_12.6.r12.6/compiler.34714021_0

Install libcudss0 (if needed after running torch examples)

bluesanta@ubuntu:~$ wget https://developer.download.nvidia.com/compute/cudss/0.6.0/local_installers/cudss-local-tegra-repo-ubuntu2204-0.6.0_0.6.0-1_arm64.deb
bluesanta@ubuntu:~$ sudo dpkg -i cudss-local-tegra-repo-ubuntu2204-0.6.0_0.6.0-1_arm64.deb
bluesanta@ubuntu:~$ sudo cp /var/cudss-local-tegra-repo-ubuntu2204-0.6.0/cudss-*-keyring.gpg /usr/share/keyrings/
bluesanta@ubuntu:~$ sudo apt update
bluesanta@ubuntu:~$ sudo apt install -y libcudss0-cuda-12 libcudss0-dev-cuda-12
bluesanta@ubuntu:~$ find /usr -name "libcudss.so.0" 2>/dev/null
/usr/lib/aarch64-linux-gnu/libcudss/12/libcudss.so.0
bluesanta@ubuntu:~$ export LD_LIBRARY_PATH=/usr/lib/aarch64-linux-gnu/libcudss/12:/usr/local/cuda-12.6/lib64:$LD_LIBRARY_PATH

Create a virtual environment

bluesanta@ubuntu:~$ mkdir llm
bluesanta@ubuntu:~$ cd llm
bluesanta@ubuntu:~/llm$ python -m venv .venv
bluesanta@ubuntu:~/llm$ source .venv/bin/activate
(.venv) bluesanta@ubuntu:~/llm$ 

Install the latest CMake

(.venv) bluesanta@ubuntu:~/llm$ wget -O - https://apt.kitware.com/keys/kitware-archive-latest.asc 2>/dev/null | gpg --dearmor - | sudo tee /usr/share/keyrings/kitware-archive-keyring.gpg >/dev/null
(.venv) bluesanta@ubuntu:~/llm$ echo 'deb [signed-by=/usr/share/keyrings/kitware-archive-keyring.gpg] https://apt.kitware.com/ubuntu/ jammy main' | sudo tee /etc/apt/sources.list.d/kitware.list >/dev/null
(.venv) bluesanta@ubuntu:~/llm$ sudo apt update
(.venv) bluesanta@ubuntu:~/llm$ sudo apt install -y cmake ninja-build 
(.venv) bluesanta@ubuntu:~/llm$ cmake --version
cmake version 4.3.1
 
CMake suite maintained and supported by Kitware (kitware.com/cmake).

Install cuda-toolkit

(.venv) bluesanta@ubuntu:~/llm/download$ sudo apt install cuda-toolkit-12-6

Add environment variables (append to ~/.bashrc)

# 1. Set the CUDA home
export CUDA_HOME=/usr/local/cuda-12.6

# 2. Add the include path so the compiler can find header files (most important)
export CPATH=$CUDA_HOME/targets/aarch64-linux/include:$CPATH
export TRITON_PTXAS_PATH=/usr/local/cuda-12.6/bin/ptxas

# 3. Add the library paths
export LD_LIBRARY_PATH=$CUDA_HOME/targets/aarch64-linux/lib:$LD_LIBRARY_PATH
export LIBRARY_PATH=$CUDA_HOME/targets/aarch64-linux/lib:$LIBRARY_PATH

# 4. Add the executable path
export PATH=$CUDA_HOME/bin:$PATH
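All of the exports above derive from one CUDA root, so a small helper makes the layout easy to audit. The function is purely illustrative (not part of any tool); it rebuilds the same paths so each can be checked with `os.path.isdir` on the device:

```python
def cuda_env(cuda_home: str) -> dict[str, str]:
    """Rebuild the CUDA-related paths exported in ~/.bashrc above."""
    tgt = f"{cuda_home}/targets/aarch64-linux"
    return {
        "CUDA_HOME": cuda_home,
        "CPATH": f"{tgt}/include",          # compiler include path
        "LD_LIBRARY_PATH": f"{tgt}/lib",    # runtime library path
        "LIBRARY_PATH": f"{tgt}/lib",       # link-time library path
        "PATH": f"{cuda_home}/bin",         # nvcc, ptxas, ...
        "TRITON_PTXAS_PATH": f"{cuda_home}/bin/ptxas",
    }

env = cuda_env("/usr/local/cuda-12.6")
print(env["CPATH"])  # → /usr/local/cuda-12.6/targets/aarch64-linux/include
```

In the real ~/.bashrc these values are prepended to any existing setting rather than replacing it.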

Install PyTorch

Remove any existing PyTorch

bluesanta@ubuntu:~/llm$ pip uninstall -y torch torchvision torchaudio

Download PyTorch

https://pypi.jetson-ai-lab.io/jp6/cu126

Install PyTorch

(.venv) bluesanta@ubuntu:~/llm/download$ pip install torch-2.11.0-cp310-cp310-linux_aarch64.whl
(.venv) bluesanta@ubuntu:~/llm/download$ pip install "numpy<2.0"

Verify PyTorch

(.venv) bluesanta@ubuntu:~/llm/download$ python -c "import torch; print(torch.cuda.is_available()); print(torch.cuda.get_device_name(0))"
True
Orin

Install vLLM

(.venv) bluesanta@ubuntu:~/llm$ pip uninstall -y vllm
(.venv) bluesanta@ubuntu:~/llm$ git clone https://github.com/vllm-project/vllm.git
(.venv) bluesanta@ubuntu:~/llm$ cd vllm
(.venv) bluesanta@ubuntu:~/llm/vllm$ pip install setuptools_scm
(.venv) bluesanta@ubuntu:~/llm/vllm$ pip install --upgrade pip setuptools setuptools-scm wheel
(.venv) bluesanta@ubuntu:~/llm/vllm$ sudo apt install -y ninja-build
(.venv) bluesanta@ubuntu:~/llm/vllm$ MAX_JOBS=$(nproc) pip install -e .
 
Installing vllm script to /home/bluesanta/llm/.venv/bin

Verify the vLLM installation

(.venv) bluesanta@ubuntu:~/llm/download$ vllm --version
0.19.2rc1.dev7+g38907e439.cu126
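Once `vllm serve` is running it exposes an OpenAI-compatible HTTP API (port 8000 by default). A sketch of the request body for /v1/chat/completions; note the model name must match the path or ID passed to `vllm serve`, so the value below is a placeholder:

```python
import json

payload = {
    "model": "gemma-4-31b-it-nvfp4",  # placeholder: must match the served model
    "messages": [
        {"role": "user", "content": "Jetson AGX Orin의 장점 3가지는?"}
    ],
    "max_tokens": 256,
}

body = json.dumps(payload, ensure_ascii=False)
print(body)
# POST this to http://localhost:8000/v1/chat/completions
# (e.g. with curl -H "Content-Type: application/json" -d "$body")
```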

Source

Install the Ollama server

bluesanta@bluesanta-desktop:~$ curl -fsSL https://ollama.com/install.sh | sh
>>> Installing ollama to /usr/local
>>> Downloading ollama-linux-arm64.tar.zst
######################################################################## 100.0%
>>> Downloading ollama-linux-arm64-jetpack6.tar.zst
######################################################################## 100.0%
>>> Creating ollama user...
>>> Adding ollama user to render group...
>>> Adding ollama user to video group...
>>> Adding current user to ollama group...
>>> Creating ollama systemd service...
>>> Enabling and starting ollama service...
Created symlink /etc/systemd/system/default.target.wants/ollama.service → /etc/systemd/system/ollama.service.
>>> NVIDIA JetPack ready.
>>> The Ollama API is now available at 127.0.0.1:11434.
>>> Install complete. Run "ollama" from the command line.

Run Ollama

bluesanta@bluesanta-desktop:~$ ollama
Ollama 0.20.6
 
▸ Chat with a model
    Start an interactive chat with a model
 
  Launch OpenClaw (install)
    Personal AI with 100+ skills
 
  Launch Claude Code (not installed)
    Anthropic's coding tool with subagents
 
  Launch OpenCode (not installed)
    Anomaly's open-source coding agent
 
  More...
    Show additional integrations
 
 
↑/↓ navigate • enter launch • → configure • esc quit

Run gemma4

bluesanta@bluesanta-desktop:~$ ollama run gemma4:31b-it-q8_0

Performance tuning

Enable maximum-performance (MAXN) mode

bluesanta@bluesanta-desktop:~$ sudo nvpmodel -m 0
NVPM WARN: Golden image context is already created
NVPM WARN: Reboot required for changing to this power mode: 0
NVPM WARN: DO YOU WANT TO REBOOT NOW? enter YES/yes to confirm:
yes

Use physical memory in preference to the swapfile

bluesanta@bluesanta-desktop:~$ sudo sysctl -w vm.swappiness=10
vm.swappiness = 10
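`sysctl -w` only applies until the next reboot. To make the setting persistent, the usual approach is a drop-in file under /etc/sysctl.d/ (the file name below is arbitrary):

```
# /etc/sysctl.d/99-swappiness.conf
vm.swappiness = 10
```

It can then be applied without rebooting via `sudo sysctl --system`.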

Ollama tuning environment variables

bluesanta@bluesanta-desktop:~$ vi ~/.bashrc

Append to ~/.bashrc

export OLLAMA_NUM_PARALLEL=1
export OLLAMA_MAX_LOADED_MODELS=1
export OLLAMA_NUM_THREADS=12
export OLLAMA_KEEP_ALIVE=5m
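Besides the interactive CLI, the service answers HTTP on 127.0.0.1:11434 (see the install log above). A sketch of a non-streaming request body for Ollama's /api/generate endpoint; the `keep_alive` value mirrors the OLLAMA_KEEP_ALIVE setting above:

```python
import json

payload = {
    "model": "gemma4:31b-it-q8_0",  # the model pulled with `ollama run` above
    "prompt": "Jetson AGX Orin의 장점 3가지는?",
    "stream": False,                # return one JSON object instead of a stream
    "keep_alive": "5m",             # keep the model loaded after the call
}

body = json.dumps(payload, ensure_ascii=False)
print(body)
# POST to http://127.0.0.1:11434/api/generate
```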

Download Maven

bluesanta@bluesanta-AI-Series:~$ wget https://dlcdn.apache.org/maven/maven-3/3.9.12/binaries/apache-maven-3.9.12-bin.tar.gz

Extract and move

bluesanta@bluesanta-AI-Series:~$ tar xvf apache-maven-3.9.12-bin.tar.gz
bluesanta@bluesanta-AI-Series:~$ sudo mv apache-maven-3.9.12 /opt

Set environment variables

Edit ~/.bashrc

bluesanta@bluesanta-AI-Series:/opt/apache-maven-3.9.12$ vi ~/.bashrc

Add the following

export MAVEN_HOME=/opt/apache-maven-3.9.12
export PATH=$PATH:$MAVEN_HOME/bin



Source

Check the disk format and partition layout

radxa@radxa-dragon-q6a:~$ lsblk
NAME        MAJ:MIN RM   SIZE RO TYPE MOUNTPOINTS
mmcblk1     179:0    0  29.1G  0 disk 
├─mmcblk1p1 179:1    0    16M  0 part /config
├─mmcblk1p2 179:2    0     1G  0 part /boot/efi
└─mmcblk1p3 179:3    0  28.1G  0 part /
zram0       252:0    0   5.6G  0 disk [SWAP]
nvme0n1     259:0    0 465.8G  0 disk 

Download the OS image

radxa@radxa-dragon-q6a:~$ wget https://github.com/radxa-build/radxa-dragon-q6a/releases/download/rsdk-r2/radxa-dragon-q6a_noble_gnome_r2.output_512.img.xz

Decompress the image

radxa@radxa-dragon-q6a:~$ unxz radxa-dragon-q6a_noble_gnome_r2.output_512.img.xz

Install the OS

radxa@radxa-dragon-q6a:~$ sudo dd if=radxa-dragon-q6a_noble_gnome_r2.output_512.img of=/dev/nvme0n1 bs=4M status=progress
5268045824 bytes (5.3 GB, 4.9 GiB) copied, 3 s, 1.8 GB/s
1431+1 records in
1431+1 records out
6003352576 bytes (6.0 GB, 5.6 GiB) copied, 5.56137 s, 1.1 GB/s
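dd will happily write a corrupted download, so it is worth checksumming the image first and comparing against the digest published with the release (or simply running `sha256sum` in the shell). A minimal sketch with Python's hashlib; the small stand-in file below is only for demonstration:

```python
import hashlib

def sha256_of(path: str, chunk: int = 1 << 20) -> str:
    """Stream a file through SHA-256 without loading it all into RAM."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while block := f.read(chunk):
            h.update(block)
    return h.hexdigest()

# Demo on a tiny stand-in file; for the real image, compare the result
# against the checksum published alongside the release.
with open("demo.img", "wb") as f:
    f.write(b"radxa")
print(sha256_of("demo.img"))
```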

Check the disk format and partition layout after installation

radxa@radxa-dragon-q6a:~$ lsblk
NAME        MAJ:MIN RM   SIZE RO TYPE MOUNTPOINTS
mmcblk1     179:0    0  29.1G  0 disk 
├─mmcblk1p1 179:1    0    16M  0 part /config
├─mmcblk1p2 179:2    0     1G  0 part /boot/efi
└─mmcblk1p3 179:3    0  28.1G  0 part /
zram0       252:0    0   5.6G  0 disk [SWAP]
nvme0n1     259:0    0 465.8G  0 disk 
├─nvme0n1p1 259:1    0    16M  0 part 
├─nvme0n1p2 259:2    0     1G  0 part 
└─nvme0n1p3 259:3    0   4.6G  0 part 

Remove the SD card and reboot

Default login: radxa / radxa
