Check the JetPack version
bluesanta@ubuntu:~$ sudo apt show nvidia-jetpack
Package: nvidia-jetpack
Version: 6.2.2+b24
Priority: standard
Section: metapackages
Source: nvidia-jetpack (6.2.2)
Maintainer: NVIDIA Corporation
Installed-Size: 199 kB
Depends: nvidia-jetpack-runtime (= 6.2.2+b24), nvidia-jetpack-dev (= 6.2.2+b24)
Homepage: http://developer.nvidia.com/jetson
Download-Size: 29.3 kB
APT-Sources: https://repo.download.nvidia.com/jetson/common r36.5/main arm64 Packages
Description: NVIDIA Jetpack Meta Package
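As a cross-check (hedged sketch): on Jetson images the file /etc/nv_tegra_release carries the L4T release string, and the r36.x line corresponds to JetPack 6 — matching the `r36.5` shown in APT-Sources above.

```shell
# Print the L4T release string if this is a Jetson board;
# fall back gracefully on other systems.
if [ -f /etc/nv_tegra_release ]; then
  l4t=$(head -n 1 /etc/nv_tegra_release)
else
  l4t="not a Jetson L4T system"
fi
echo "$l4t"
```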
Check the CUDA version
bluesanta@ubuntu:~$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2024 NVIDIA Corporation
Built on Wed_Aug_14_10:14:07_PDT_2024
Cuda compilation tools, release 12.6, V12.6.68
Build cuda_12.6.r12.6/compiler.34714021_0
Create a virtual environment
bluesanta@bluesanta-desktop:~$ cd llm
bluesanta@bluesanta-desktop:~/llm$ python -m venv .venv
bluesanta@bluesanta-desktop:~/llm$ source .venv/bin/activate
(.venv) bluesanta@bluesanta-desktop:~/llm$
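A quick sanity check that the venv is really active: inside a virtual environment, Python's `sys.prefix` differs from `sys.base_prefix`, so this prints True.

```shell
# True when running inside an activated virtual environment.
in_venv=$(python3 -c "import sys; print(sys.prefix != sys.base_prefix)")
echo "in venv: $in_venv"
```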
Download the models (Hugging Face)
(.venv) bluesanta@ubuntu:~/llm$ pip install -U "huggingface_hub[cli]"
(.venv) bluesanta@ubuntu:~/llm$ hf download google/gemma-4-26B-A4B-it --local-dir ~/llm/models/gemma-4-original
(.venv) bluesanta@ubuntu:~/llm$ hf download RedHatAI/gemma-4-26B-A4B-it-NVFP4 --local-dir ~/llm/models/gemma-4-26B-A4B-it-NVFP4
(.venv) bluesanta@ubuntu:~/llm$ hf download nvidia/Gemma-4-31B-IT-NVFP4 --local-dir ~/llm/models/gemma-4-31b-it-nvfp4
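Note that gated repositories (the google/gemma-* models require accepting the license on Hugging Face first) also need an access token, which the hf CLI reads from the HF_TOKEN environment variable. A hedged sketch — the token value below is a placeholder, not a real token:

```shell
# Placeholder token; replace with your own from huggingface.co/settings/tokens.
export HF_TOKEN="hf_xxx"
if command -v hf >/dev/null 2>&1; then
  hf download google/gemma-4-26B-A4B-it --local-dir ~/llm/models/gemma-4-original
else
  echo "hf CLI not found; install with: pip install -U 'huggingface_hub[cli]'"
fi
```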
Install PyTorch
Remove the existing PyTorch
bluesanta@ubuntu:~/llm$ pip uninstall -y torch torchvision torchaudio
Download the PyTorch wheel
https://pypi.jetson-ai-lab.io/jp6/cu126
Install the downloaded wheel
(.venv) bluesanta@bluesanta-desktop:~/llm/download$ pip install torch-2.11.0-cp310-cp310-linux_aarch64.whl
Verify the PyTorch installation
(.venv) bluesanta@bluesanta-desktop:~/llm/download$ python -c "import torch; print(torch.cuda.is_available()); print(torch.cuda.get_device_name(0))"
True
Orin
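Beyond `is_available()`, the GPU's compute capability is worth recording, since it determines which quantized kernels are usable later — Orin is sm_87. A hedged sketch that degrades gracefully when torch or CUDA is absent:

```shell
# Print the compute capability of GPU 0, or a fallback message.
cap=$(python3 - <<'PY'
try:
    import torch
    if torch.cuda.is_available():
        major, minor = torch.cuda.get_device_capability(0)
        print(f"sm_{major}{minor}")
    else:
        print("cuda unavailable")
except ImportError:
    print("torch not installed")
PY
)
echo "$cap"
```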
Install CMake
bluesanta@ubuntu:~/llm/vllm$ wget -O - https://apt.kitware.com/keys/kitware-archive-latest.asc 2>/dev/null | gpg --dearmor - | sudo tee /usr/share/keyrings/kitware-archive-keyring.gpg >/dev/null
bluesanta@ubuntu:~/llm/vllm$ echo 'deb [signed-by=/usr/share/keyrings/kitware-archive-keyring.gpg] https://apt.kitware.com/ubuntu/ jammy main' | sudo tee /etc/apt/sources.list.d/kitware.list >/dev/null
bluesanta@ubuntu:~/llm/vllm$ sudo apt update
bluesanta@ubuntu:~/llm/vllm$ sudo apt install -y cmake
bluesanta@ubuntu:~/llm/vllm$ cmake --version
cmake version 4.3.1
CMake suite maintained and supported by Kitware (kitware.com/cmake).
Install vLLM
(.venv) bluesanta@bluesanta-desktop:~/llm$ pip uninstall -y vllm
(.venv) bluesanta@bluesanta-desktop:~/llm$ git clone https://github.com/vllm-project/vllm.git
(.venv) bluesanta@bluesanta-desktop:~/llm$ cd vllm
(.venv) bluesanta@bluesanta-desktop:~/llm/vllm$ pip install --upgrade pip setuptools setuptools-scm wheel
(.venv) bluesanta@bluesanta-desktop:~/llm/vllm$ sudo apt install -y ninja-build
(.venv) bluesanta@bluesanta-desktop:~/llm/vllm$ MAX_JOBS=$(nproc) pip install -e .
Installing vllm script to /home/bluesanta/llm/.venv/bin
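For source builds like the one above, pinning the CUDA architecture list to Orin (sm_87) before running pip avoids compiling kernels for every GPU generation — a hedged sketch; TORCH_CUDA_ARCH_LIST is the standard PyTorch extension-build variable, and MAX_JOBS caps parallel compile jobs so the board doesn't run out of RAM:

```shell
# Build only the sm_87 (Orin) kernels and bound compile parallelism.
export TORCH_CUDA_ARCH_LIST="8.7"
export MAX_JOBS="$(nproc)"
echo "arch=${TORCH_CUDA_ARCH_LIST} jobs=${MAX_JOBS}"
```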
Run gemma-4-31b-it-nvfp4 (the --gpu-memory-utilization 0.8 option is needed because of limited memory)
Fails to run with an error: (EngineCore pid=2446) ERROR 04-19 16:35:23 [core.py:1132] ValueError("type fp8e4nv not supported in this architecture. The supported fp8 dtypes are ('fp8e4b15', 'fp8e5')")
fp8e4nv is only supported on newer GPUs (compute capability 8.9+, e.g. H100 / Hopper); Orin (sm_87) lacks it
bluesanta@ubuntu:~/llm$ vllm serve ~/llm/models/gemma-4-31b-it-nvfp4 --quantization modelopt --tensor-parallel-size 1 --gpu-memory-utilization 0.8
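The error above can be predicted from the compute capability alone — a hedged sketch of the check: Triton's fp8e4nv (FP8 E4M3) kernels need compute capability 8.9 or higher (Ada/Hopper), while Orin reports 8.7, so the first line prints False.

```shell
out=$(python3 - <<'PY'
def fp8e4nv_supported(cc):
    # cc is a (major, minor) compute-capability tuple
    return cc >= (8, 9)

print(fp8e4nv_supported((8, 7)))   # Orin sm_87
print(fp8e4nv_supported((9, 0)))   # H100 sm_90
PY
)
echo "$out"
```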