pytorch/pytorch#94891 (comment): the PCIe atomics issue should already be fixed. I currently cannot verify this.
- Install PyTorch with ROCm support
Follow the official installation guide: https://pytorch.org/get-started/locally/#linux-installation
Choose [Stable] -> [Linux] -> [Pip] -> [Python] -> [ROCm]. The command should look something like:
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm5.4.2
Remember the ROCm version here (5.4.2 in this example).
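To confirm which ROCm version the installed wheel was built against, you can check it from Python. A minimal sketch; torch.version.hip is only populated in ROCm builds of PyTorch:
import torch

# The version string of a ROCm wheel ends with the ROCm tag (e.g. "+rocm5.4.2"),
# and torch.version.hip reports the HIP version the wheel was built with.
print(torch.__version__)
print(torch.version.hip)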
- Install ROCm drivers
- Download the installer package; it MUST BE THE SAME ROCm VERSION AS PYTORCH
https://repo.radeon.com/amdgpu-install/5.4.2/ubuntu/focal/
File name should be: amdgpu-install_[version]_all.deb
- Install the deb package
sudo dpkg -i ./amdgpu-install*.deb
- Run the installation script
amdgpu-install --usecase=graphics,rocm,opencl -y --accept-eula
Note: the Ryzen 7 5825U iGPU architecture is Vega 8, which is supposed to use legacy OpenCL.
If you are using another AMD GPU or APU, modifications may be required.
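To see which GPU architecture ROCm actually detects (useful for the gfx override step below), you can grep the rocminfo output. A minimal sketch, assuming rocminfo is on PATH after the installation above; run it after the group setup and reboot below if it cannot open /dev/kfd:
import subprocess

# Print the agent/ISA lines that mention a gfx target (e.g. gfx90c for the 5825U iGPU).
out = subprocess.run(["rocminfo"], capture_output=True, text=True).stdout
for line in out.splitlines():
    if "gfx" in line:
        print(line.strip())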
- Add current user to groups
To access the devices /dev/kfd, /dev/dri/card0, and /dev/dri/renderD*, the current user must be added to the render and video groups.
sudo usermod -a -G render $LOGNAME
sudo usermod -a -G video $LOGNAME
If not added, only root will be allowed to use ROCm.
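After re-logging in (or rebooting, as in the next step), you can confirm the device nodes are accessible without root. A minimal Python sketch, assuming the standard device paths:
import glob
import os

# Check read/write access to the ROCm device nodes for the current user.
for path in ["/dev/kfd"] + glob.glob("/dev/dri/renderD*"):
    ok = os.access(path, os.R_OK | os.W_OK)
    print(path, "accessible" if ok else "NOT accessible")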
- Reboot the system
- Add environment variables in .bashrc
The Ryzen 7 5825U iGPU is gfx90c, which should be compatible with gfx900, so we force ROCm to treat it as gfx900.
export PYTORCH_ROCM_ARCH=gfx900
export HSA_OVERRIDE_GFX_VERSION=9.0.0
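After sourcing .bashrc (or opening a new shell), a quick sanity check that the overrides are visible to the Python process that will run PyTorch:
import os

# These are the two variables exported above; the ROCm runtime reads
# HSA_OVERRIDE_GFX_VERSION at startup.
for name in ("PYTORCH_ROCM_ARCH", "HSA_OVERRIDE_GFX_VERSION"):
    print(name, "=", os.environ.get(name, "<not set>"))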
- Check iGPU status
rocm-smi
From the output, you can see GPU[0].
======================= ROCm System Management Interface =======================
================================= Concise Info =================================
ERROR: GPU[0] : sclk clock is unsupported
================================================================================
GPU[0] : Not supported on the given system
GPU  Temp (DieEdge)  AvgPwr  SCLK  MCLK     Fan  Perf  PwrCap       VRAM%  GPU%
0    43.0c           0.003W  None  1200Mhz  0%   auto  Unsupported  43%    0%
================================================================================
============================= End of ROCm SMI Log ==============================
Also, you can check the OpenCL status:
clinfo
From the output you can see that the GPU has been detected:
Number of platforms: 1
Platform Profile: FULL_PROFILE
Platform Version: OpenCL 2.1 AMD-APP (3513.0)
Platform Name: AMD Accelerated Parallel Processing
Platform Vendor: Advanced Micro Devices, Inc.
Platform Extensions: cl_khr_icd cl_amd_event_callback
Platform Name: AMD Accelerated Parallel Processing
Number of devices: 1
Device Type: CL_DEVICE_TYPE_GPU
Vendor ID: 1002h
Board name:
Device Topology: PCI[ B#4, D#0, F#0 ]
Max compute units: 8
Max work items dimensions: 3
Max work items[0]: 1024
Max work items[1]: 1024
Max work items[2]: 1024
Max work group size: 256
- Test run
import torch
print(torch.cuda.device_count())
for i in range(torch.cuda.device_count()):
    print(torch.cuda.get_device_properties(i))
Output:
1
_CudaDeviceProperties(name='AMD Radeon Graphics', major=9, minor=0, total_memory=1024MB, multi_processor_count=8)
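To confirm the device is actually usable for computation and not just enumerated, you can run a small matrix multiplication on it. A minimal sketch; the "cuda" device string is used because ROCm builds of PyTorch expose HIP devices through the torch.cuda interface:
import torch

# Multiply two small matrices on the iGPU and bring the result back.
device = torch.device("cuda:0")
a = torch.randn(512, 512, device=device)
b = torch.randn(512, 512, device=device)
c = a @ b
torch.cuda.synchronize()
print(c.sum().item())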
Or just use Windows + DirectML + torch instead.