labunix's blog

labunix's Lab Unix

Checking GPU driver recognition on Debian bookworm

■Checking GPU driver recognition on Debian bookworm
 I use this machine remotely, so I'll work out what to use the GPU for later...

$ lsb_release -d
No LSB modules are available.
Description:	Debian GNU/Linux 12 (bookworm)

■The GPU is recognized by the amdgpu driver

$ lspci | awk '{for(a=1;a<=NF;a++){if($a ~ /VGA/){print "lspci -v -s "$1}}}' | sh
04:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Cezanne [Radeon Vega Series / Radeon Vega Mobile Series] (rev c1) (prog-if 00 [VGA controller])
	Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Cezanne [Radeon Vega Series / Radeon Vega Mobile Series]
	Flags: bus master, fast devsel, latency 0, IRQ 57, IOMMU group 12
	Memory at d0000000 (64-bit, prefetchable) [size=256M]
	Memory at e0000000 (64-bit, prefetchable) [size=2M]
	I/O ports at d000 [size=256]
	Memory at fcc00000 (32-bit, non-prefetchable) [size=512K]
	Capabilities: <access denied>
	Kernel driver in use: amdgpu
	Kernel modules: amdgpu

$ lsmod | grep amdgpu
amdgpu               9596928  3
gpu_sched              53248  1 amdgpu
drm_buddy              20480  1 amdgpu
i2c_algo_bit           16384  1 amdgpu
drm_display_helper    184320  1 amdgpu
drm_ttm_helper         16384  1 amdgpu
ttm                    94208  2 amdgpu,drm_ttm_helper
drm_kms_helper        204800  4 drm_display_helper,amdgpu
drm                   614400  10 gpu_sched,drm_kms_helper,drm_display_helper,drm_buddy,amdgpu,drm_ttm_helper,ttm
video                  65536  1 amdgpu
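The same driver binding can be cross-checked through sysfs without parsing lspci output. The following is a minimal sketch: the function name pci_driver and its optional second argument (an alternative sysfs root, used only for testing) are assumptions for illustration, not part of any tool.

```shell
# Print the kernel driver bound to a PCI device by following its sysfs symlink.
# $1: full PCI address, e.g. 0000:04:00.0
# $2: sysfs root (hypothetical parameter; defaults to /sys)
pci_driver() {
    link="${2:-/sys}/bus/pci/devices/$1/driver"
    # No symlink means no driver is bound to the device.
    [ -L "$link" ] || return 1
    basename "$(readlink "$link")"
}
```

On the machine above, calling pci_driver 0000:04:00.0 should print the same driver name that lspci reports under "Kernel driver in use".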

■The device is visible as /dev/fb0

$ sudo lshw -C display
  *-display                 
       description: VGA compatible controller
       product: Cezanne [Radeon Vega Series / Radeon Vega Mobile Series]
       vendor: Advanced Micro Devices, Inc. [AMD/ATI]
       physical id: 0
       bus info: pci@0000:04:00.0
       logical name: /dev/fb0
       version: c1
       width: 64 bits
       clock: 33MHz
       capabilities: pm pciexpress msi msix vga_controller bus_master cap_list fb
       configuration: depth=32 driver=amdgpu latency=0 resolution=1024,768
       resources: irq:57 memory:d0000000-dfffffff memory:e0000000-e01fffff ioport:d000(size=256) memory:fcc00000-fcc7ffff
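The framebuffer can also be inspected without lshw by reading its sysfs attributes. This is a rough sketch, assuming the usual name and virtual_size attributes under /sys/class/graphics; the function name fb_info and its second argument (an alternative sysfs root for testing) are made up for this example.

```shell
# Summarize a framebuffer device from its sysfs attributes.
# $1: framebuffer name, e.g. fb0
# $2: sysfs root (hypothetical parameter; defaults to /sys)
fb_info() {
    dir="${2:-/sys}/class/graphics/$1"
    if [ ! -d "$dir" ]; then
        echo "$1: not found" >&2
        return 1
    fi
    # virtual_size is reported as "width,height", like lshw's resolution field.
    printf '%s name=%s virtual_size=%s\n' \
        "$1" "$(cat "$dir/name")" "$(cat "$dir/virtual_size")"
}
```

Running fb_info fb0 on the host above should report a virtual size that matches the resolution=1024,768 shown by lshw, provided the console mode has not changed.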

■Check GPU information

$ apt-cache search ^rocm-smi
rocm-smi - ROCm System Management Interface (ROCm SMI) command-line interface

$ sudo apt-get install -y rocm-smi

$ apt-cache show rocm-smi | grep ^Version
Version: 5.2.3-2
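Note that apt-cache show reports the package records known to APT, which is not necessarily the installed version. As a hedged sketch, the installed version can be picked out of apt-cache policy output; the helper name installed_version is an assumption for illustration.

```shell
# Read `apt-cache policy <package>` output on stdin and print the installed version.
# Prints "(none)" when the package is known to APT but not installed.
installed_version() {
    awk '$1 == "Installed:" { print $2 }'
}
```

Usage would be along the lines of: apt-cache policy rocm-smi | installed_version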

■Official manual

 https://rocm.docs.amd.com/en/latest/

■Basic information with no options

$ rocm-smi


======================= ROCm System Management Interface =======================
================================= Concise Info =================================
Exception caught: map::at
ERROR: GPU[0] 		: sclk clock is unsupported
================================================================================
ERROR: 2 GPU[0]:RSMI_STATUS_NOT_SUPPORTED: This function is not supported in the current environment.	
GPU  Temp   AvgPwr  SCLK  MCLK     Fan  Perf  PwrCap       VRAM%  GPU%  
0    43.0c  19.0W   None  1200Mhz  0%   auto  Unsupported   10%   0%    
================================================================================
============================= End of ROCm SMI Log ==============================
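Because the concise output mixes error messages with the table, it can help to filter out only the per-GPU rows when scripting. The sketch below assumes the column order shown above for rocm-smi 5.2.3 (GPU, Temp, AvgPwr first); the function name rocm_smi_rows is hypothetical, and other versions may lay the table out differently.

```shell
# Read `rocm-smi` concise output on stdin and print one line per GPU row.
# A row is recognized by a numeric GPU index followed by a temperature like 43.0c.
rocm_smi_rows() {
    awk '$1 ~ /^[0-9]+$/ && $2 ~ /^[0-9.]+c$/ {
        print "gpu=" $1, "temp=" $2, "power=" $3
    }'
}
```

For the output above, rocm-smi | rocm_smi_rows would print gpu=0 temp=43.0c power=19.0W and skip the banner and ERROR lines.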

■Only one GPU is installed

$ rocm-smi --showproductname


======================= ROCm System Management Interface =======================
================================= Product Info =================================
GPU[0]		: Card series: 		Cezanne [Radeon Vega Series / Radeon Vega Mobile Series]
GPU[0]		: Card model: 		0x0123
GPU[0]		: Card vendor: 		Advanced Micro Devices, Inc. [AMD/ATI]
GPU[0]		: Card SKU: 		CEZANN
================================================================================
============================= End of ROCm SMI Log ==============================
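The GPU count can be derived from the GPU[n] prefixes in this output rather than by eye. A minimal sketch, assuming each device line starts with GPU[n] as shown above; count_gpus is a name invented for this example.

```shell
# Read rocm-smi output on stdin and print the number of distinct GPU[n] prefixes.
count_gpus() {
    awk 'match($0, /^GPU\[[0-9]+\]/) { seen[substr($0, RSTART, RLENGTH)] = 1 }
         END { n = 0; for (k in seen) n++; print n }'
}
```

Piping rocm-smi --showproductname into count_gpus should print 1 for the output above.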

■VBIOS version

$ rocm-smi -v


======================= ROCm System Management Interface =======================
==================================== VBIOS =====================================
GPU[0]		: VBIOS version: 113-CEZANNE-020
================================================================================
============================= End of ROCm SMI Log ==============================
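The amdgpu driver also exposes the VBIOS version as a sysfs attribute, which gives a way to read it without rocm-smi. This is a sketch under assumptions: the card may not be card0 on every system, and the function name vbios_version and its second argument (an alternative sysfs root for testing) are hypothetical.

```shell
# Print the VBIOS version of a DRM card from the amdgpu sysfs attribute.
# $1: DRM card name (defaults to card0; the index may differ per system)
# $2: sysfs root (hypothetical parameter; defaults to /sys)
vbios_version() {
    f="${2:-/sys}/class/drm/${1:-card0}/device/vbios_version"
    [ -r "$f" ] || return 1
    cat "$f"
}
```

If the attribute is present, vbios_version card0 should print the same string that rocm-smi -v reports.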

■Check PIDs and process names

$ rocm-smi --showpids


======================= ROCm System Management Interface =======================
================================ KFD Processes =================================
No KFD PIDs currently running
================================================================================
============================= End of ROCm SMI Log ==============================

■Check the IDs of the GPUs in use (when multiple cards are installed)

$ rocm-smi --showpidgpus


======================= ROCm System Management Interface =======================
============================= GPUs Indexed by PID ==============================
No KFD PIDs currently running
================================================================================
============================= End of ROCm SMI Log ==============================
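For scripted checks, the "No KFD PIDs currently running" message seen in both outputs above can serve as a simple idle test. A minimal sketch that relies only on that message text as printed by rocm-smi 5.2.3; the helper name kfd_idle is an assumption.

```shell
# Read `rocm-smi --showpids` (or --showpidgpus) output on stdin.
# Exit status 0 means no compute (KFD) process is reported; non-zero otherwise.
kfd_idle() {
    grep -q 'No KFD PIDs currently running'
}
```

For example: rocm-smi --showpids | kfd_idle && echo "GPU has no compute processes"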