@warmonkey
Last active November 2, 2024 03:09
Use ROCm and PyTorch on AMD integrated graphics (iGPU, Ryzen 7 5825u)

NOT YET VERIFIED - WAIT FOR UPDATES

According to pytorch/pytorch#94891 (comment), the PCIe atomics issue should already be fixed. I cannot currently verify this.

  1. Install PyTorch with ROCm support
    Follow the official installation guide: https://pytorch.org/get-started/locally/#linux-installation
    Choose [Stable] -> [Linux] -> [Pip] -> [Python] -> [ROCm]. The command should look something like:
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm5.4.2

Note the ROCm version in the index URL (5.4.2 in this example); you will need it when installing the ROCm drivers.
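To keep track of that version, it can be read back from the index URL and compared with what the installed wheel reports. This is a minimal sketch, not an official check: the helper names are hypothetical, and it assumes `torch.version.hip` holds a HIP version string on ROCm builds (it is `None` on CPU and CUDA builds), whose leading major.minor normally tracks the ROCm release.

```python
import re
from typing import Optional


def rocm_version_from_index_url(url: str) -> Optional[str]:
    """Extract the ROCm version (e.g. '5.4.2') from a PyTorch wheel index URL."""
    match = re.search(r"/whl/rocm(\d+(?:\.\d+)*)", url)
    return match.group(1) if match else None


def installed_hip_version() -> Optional[str]:
    """Return the HIP version string of the installed torch build, if any."""
    try:
        import torch
    except ImportError:
        # torch is not installed in this environment.
        return None
    return getattr(torch.version, "hip", None)


if __name__ == "__main__":
    url = "https://download.pytorch.org/whl/rocm5.4.2"
    print("ROCm version in URL:", rocm_version_from_index_url(url))
    print("HIP version of installed torch:", installed_hip_version())
```

If the second line prints `None` after installation, the wheel is probably a CPU or CUDA build rather than a ROCm one.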

  2. Install ROCm drivers
  • Install the downloaded amdgpu-install package
sudo dpkg -i ./amdgpu-install*.deb
  • Run the installation script
amdgpu-install --usecase=graphics,rocm,opencl -y --accept-eula

Note: The Ryzen 7 5825u iGPU architecture is Vega 8, which is supposed to use legacy OpenCL.
If you are using another AMD GPU or APU, modifications may be required.

  • Add current user to groups
    To access the devices /dev/kfd, /dev/dri/card0 and /dev/dri/renderD*, the current user must be added to the render and video groups.
sudo usermod -a -G render $LOGNAME
sudo usermod -a -G video $LOGNAME

If the user is not added to these groups, only root can use ROCm.

  • Reboot the system
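After the reboot, group membership and device access can be checked without root. A minimal sketch using only the Python standard library; the helper names are hypothetical, and the device paths are the ones listed above (the card number may differ on other systems).

```python
import grp
import os
from typing import Iterable, List, Set


def current_group_names() -> Set[str]:
    """Names of the groups the current process belongs to."""
    names = set()
    for gid in set(os.getgroups()) | {os.getegid()}:
        try:
            names.add(grp.getgrgid(gid).gr_name)
        except KeyError:
            # GID without an entry in the group database; skip it.
            pass
    return names


def missing_groups(required: Iterable[str], member_of: Set[str]) -> List[str]:
    """Return the required groups that are not in member_of, in order."""
    return [g for g in required if g not in member_of]


if __name__ == "__main__":
    missing = missing_groups(["render", "video"], current_group_names())
    print("missing groups:", missing or "none")
    for dev in ("/dev/kfd", "/dev/dri/card0"):
        ok = os.access(dev, os.R_OK | os.W_OK)
        print(f"{dev}: {'accessible' if ok else 'not accessible'}")
```

If a group is still reported as missing, log out and back in (or reboot) so the new membership takes effect.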
  3. Add environment variables to .bashrc
    The Ryzen 7 5825u is gfx90c, which should be compatible with gfx900, so we force ROCm to treat it as gfx900.
export PYTORCH_ROCM_ARCH=gfx900
export HSA_OVERRIDE_GFX_VERSION=9.0.0
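For a single script, the same values can be set from Python instead of .bashrc. This is a sketch under one assumption: the ROCm runtime reads the variables when it initializes, so they have to be in the environment before torch is imported. The `apply_overrides` helper is hypothetical; it does not overwrite values that are already set.

```python
import os
from typing import Dict, MutableMapping

# Same values as the exports above.
OVERRIDES = {
    "PYTORCH_ROCM_ARCH": "gfx900",
    "HSA_OVERRIDE_GFX_VERSION": "9.0.0",
}


def apply_overrides(
    env: MutableMapping[str, str], overrides: Dict[str, str]
) -> Dict[str, str]:
    """Set each override unless it is already set; return the ones added."""
    added = {}
    for key, value in overrides.items():
        if key not in env:
            env[key] = value
            added[key] = value
    return added


if __name__ == "__main__":
    # Run this before `import torch` so the runtime sees the values.
    print("added:", apply_overrides(os.environ, OVERRIDES))
```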
  4. Check iGPU status
rocm-smi

From the output, you can see GPU[0].

======================= ROCm System Management Interface =======================
================================= Concise Info =================================
ERROR: GPU[0]	: sclk clock is unsupported
================================================================================
GPU[0]		: Not supported on the given system
GPU  Temp (DieEdge)  AvgPwr  SCLK  MCLK     Fan  Perf  PwrCap       VRAM%  GPU%  
0    43.0c           0.003W  None  1200Mhz  0%   auto  Unsupported   43%   0%    
================================================================================
============================= End of ROCm SMI Log ==============================
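If you want to check this from a script, the concise table can be split into fields. A rough sketch that assumes the ten-column layout shown above; other rocm-smi versions may print different columns, and the field names in `COLUMNS` are my own labels, not rocm-smi identifiers.

```python
from typing import Dict, List

# Labels for the ten columns of the concise table shown above (assumed layout).
COLUMNS = [
    "gpu", "temp", "avg_power", "sclk", "mclk",
    "fan", "perf", "power_cap", "vram_pct", "gpu_pct",
]

SAMPLE = """\
GPU  Temp (DieEdge)  AvgPwr  SCLK  MCLK     Fan  Perf  PwrCap       VRAM%  GPU%
0    43.0c           0.003W  None  1200Mhz  0%   auto  Unsupported   43%   0%
"""


def parse_concise_rows(text: str) -> List[Dict[str, str]]:
    """Return one dict per GPU row (rows start with a numeric GPU index)."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == len(COLUMNS) and fields[0].isdigit():
            rows.append(dict(zip(COLUMNS, fields)))
    return rows


if __name__ == "__main__":
    for row in parse_concise_rows(SAMPLE):
        print(f"GPU {row['gpu']}: temp={row['temp']} vram={row['vram_pct']}")
```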

Also, you can check OpenCL status

clinfo

From the output, you can see that the GPU has been detected.

Number of platforms:				 1
  Platform Profile:				 FULL_PROFILE
  Platform Version:				 OpenCL 2.1 AMD-APP (3513.0)
  Platform Name:				 AMD Accelerated Parallel Processing
  Platform Vendor:				 Advanced Micro Devices, Inc.
  Platform Extensions:				 cl_khr_icd cl_amd_event_callback 


  Platform Name:				 AMD Accelerated Parallel Processing
Number of devices:				 1
  Device Type:					 CL_DEVICE_TYPE_GPU
  Vendor ID:					 1002h
  Board name:					 
  Device Topology:				 PCI[ B#4, D#0, F#0 ]
  Max compute units:				 8
  Max work items dimensions:			 3
    Max work items[0]:				 1024
    Max work items[1]:				 1024
    Max work items[2]:				 1024
  Max work group size:				 256
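The same check can be scripted by counting the GPU entries in the clinfo output. A minimal sketch, assuming the "Device Type: CL_DEVICE_TYPE_GPU" line format shown above; `count_gpu_devices` is a hypothetical helper.

```python
import shutil
import subprocess


def count_gpu_devices(clinfo_text: str) -> int:
    """Count 'Device Type' lines that report CL_DEVICE_TYPE_GPU."""
    count = 0
    for line in clinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "Device Type" and "CL_DEVICE_TYPE_GPU" in value:
            count += 1
    return count


if __name__ == "__main__":
    if shutil.which("clinfo"):
        out = subprocess.run(["clinfo"], capture_output=True, text=True).stdout
        print("GPU devices:", count_gpu_devices(out))
    else:
        print("clinfo not found on PATH")
```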
  5. Test run
import torch

# ROCm builds of PyTorch expose AMD GPUs through the torch.cuda API.
print(torch.cuda.device_count())
for i in range(torch.cuda.device_count()):
    print(torch.cuda.get_device_properties(i))

Output:

1
_CudaDeviceProperties(name='AMD Radeon Graphics', major=9, minor=0, total_memory=1024MB, multi_processor_count=8)
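Listing the device does not prove that kernels actually run on it. A slightly stronger smoke test is sketched below: it multiplies two small all-ones matrices on the GPU and compares the sum with the value computed by hand. The function name and the matrix size are arbitrary choices, and the function returns a message instead of raising so it can also be run where torch or the GPU is missing.

```python
def gpu_smoke_test(n: int = 64) -> str:
    """Run a tiny matmul on the first GPU and report what happened."""
    try:
        import torch
    except ImportError:
        return "torch is not installed"
    if not torch.cuda.is_available():
        return "no GPU visible to PyTorch"
    try:
        ones = torch.ones(n, n, device="cuda")
        # Each entry of ones @ ones equals n, so the sum is n * n * n.
        total = (ones @ ones).sum().item()
    except RuntimeError as exc:
        return f"GPU computation failed: {exc}"
    name = torch.cuda.get_device_name(0)
    if total == float(n ** 3):
        return f"ok: matmul on {name} gave the expected sum"
    return f"unexpected result on {name}: {total}"


if __name__ == "__main__":
    print(gpu_smoke_test())
```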
@warmonkey
Author

> @warmonkey Thank you, but I am afraid to try it because the last time I tried something similar it bricked my Ubuntu system completely, leaving it in an infinite boot loop. I hope AMD does something officially, and then we can try that. AMD is the only one in a position to take on NVIDIA, but it lacks software support for its GPUs. I hope this improves and AMD starts supporting all GPUs, including integrated graphics.

Then use Windows + DirectML + torch.

hemangjoshi37a commented Nov 2, 2024

@warmonkey I hate the Windows OS; I daily-drive Linux (latest Ubuntu). I hate Windows because it is closed and much less optimized for high-performance tasks such as machine learning. I don't know why people at AMD do not understand that they need to support Linux more than Windows, because Linux is where all the serious programming happens, not Windows. Windows is basically the OS for noobs, not for pros.
