@Hermann-SW
Last active December 21, 2024 09:23
Use gpuowl pre_c++20 branch with rocm/dev-ubuntu-18.04 for PRE_VEGA "Radeon RX 480" GPU
#!/bin/bash
set -x
apt-get update 2>&1 > /dev/null
apt-get install -y git 2>&1 > /dev/null
usermod -a -G video root
echo ROC_ENABLE_PRE_VEGA=1 >> /etc/environment
echo HSA_OVERRIDE_GFX_VERSION=8.0.3 >> /etc/environment
cat > /root/doit <<EOF
echo -e "\nNow executing /root/doit"
set -x
groups
/opt/rocm/bin/clinfo | grep "Number"
/opt/rocm/bin/clinfo | grep Board
apt-get install -y software-properties-common 2>&1 > /dev/null
add-apt-repository ppa:ubuntu-toolchain-r/test -y 2>&1 > /dev/null
apt-get install -y "g++-9" 2>&1 > /dev/null
apt-get install -y libgmp-dev 2>&1 > /dev/null
apt-get install -y vim 2>&1 > /dev/null
update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-9 10
update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-9 10
git clone -b pre_c++20 https://github.com/Hermann-SW/gpuowl.git 2>&1 > /dev/null
cd gpuowl/
sed -i "s#\(lOpenCL\)#\1 -L/opt/rocm/lib#" Makefile
make amd > /dev/null
echo "PRP=0,1,2,859433,-1,77,0" > worktodo.txt
build-release/prpll-amd
EOF
# execute created /root/doit script
chmod 755 /root/doit
su - root ./doit
echo -e "\nEnter login shell for further work"
su -l - root
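
The `sed` line in the script patches the gpuowl Makefile so the linker also searches `/opt/rocm/lib`. Below is a minimal sketch of what that substitution does, run against a hypothetical linker line (the `LIBS = ...` text is an assumption for illustration; the real Makefile may look different):

```shell
#!/bin/bash
# Hypothetical Makefile line; not copied from the gpuowl repository.
sample='LIBS = -lOpenCL -lgmp'

# Same expression as in the script: capture "lOpenCL" and re-emit it
# followed by the extra library search path.
patched=$(printf '%s\n' "$sample" | sed 's#\(lOpenCL\)#\1 -L/opt/rocm/lib#')

echo "before: $sample"
echo "after:  $patched"
```

Without a `g` flag, `sed` only rewrites the first match on each line, so a line that mentions `lOpenCL` twice would get the path appended once.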
@Hermann-SW
Author

hermann@7600x:~$ ls -l /dev/dri/renderD*
crw-rw----+ 1 root render 226, 128 Dec 18 22:32 /dev/dri/renderD128
hermann@7600x:~$ 
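
The container needs the host's `/dev/kfd` and render node passed in with `--device`. As a rough sketch, a host-side check like the following could confirm both nodes exist before starting the container (`missing_nodes` is a hypothetical helper name, and the render node number may differ on other machines):

```shell
#!/bin/bash
# missing_nodes PATH...: print every given path that is not a character device.
missing_nodes() {
  local node
  for node in "$@"; do
    [ -c "$node" ] || echo "$node"
  done
}

# Device paths taken from the docker command in this gist.
missing=$(missing_nodes /dev/kfd /dev/dri/renderD128)
if [ -n "$missing" ]; then
  echo "missing device nodes: $missing" >&2
else
  echo "device nodes present"
fi
```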

Installing packages with apt, cloning with git, and compiling take only about half a minute (32 s wall-clock in the run below):

hermann@7600x:~$ time ( docker run -it -v ./openowl:/openowl --device /dev/kfd --device /dev/dri/renderD128 rocm/dev-ubuntu-18.04 /openowl )
+ apt-get update
E: The repository 'https://repo.radeon.com/amdgpu/22.20.3/ubuntu bionic Release' does not have a Release file.
+ apt-get install -y git
debconf: delaying package configuration, since apt-utils is not installed
+ usermod -a -G video root
+ echo ROC_ENABLE_PRE_VEGA=1
+ echo HSA_OVERRIDE_GFX_VERSION=8.0.3
+ cat
+ chmod 755 /root/doit
+ su - root ./doit

Now executing /root/doit
+ groups
root video
+ /opt/rocm/bin/clinfo
+ grep Number
Number of platforms:				 1
Number of devices:				 1
+ /opt/rocm/bin/clinfo
+ grep Board
  Board name:					 AMD Radeon (TM) RX 480 Graphics
+ apt install -y software-properties-common

WARNING: apt does not have a stable CLI interface. Use with caution in scripts.

debconf: delaying package configuration, since apt-utils is not installed
+ add-apt-repository ppa:ubuntu-toolchain-r/test -y
E: The repository 'https://repo.radeon.com/amdgpu/22.20.3/ubuntu bionic Release' does not have a Release file.
+ apt-get install -y g++-9
debconf: delaying package configuration, since apt-utils is not installed
+ apt-get install -y libgmp-dev
debconf: delaying package configuration, since apt-utils is not installed
+ git clone https://github.com/preda/gpuowl.git
Cloning into 'gpuowl'...
remote: Enumerating objects: 10486, done.
remote: Counting objects: 100% (1857/1857), done.
remote: Compressing objects: 100% (266/266), done.
remote: Total 10486 (delta 1684), reused 1634 (delta 1591), pack-reused 8629 (from 2)
Receiving objects: 100% (10486/10486), 13.91 MiB | 18.55 MiB/s, done.
Resolving deltas: 100% (7742/7742), done.
+ cd gpuowl/
+ git checkout 41ab5bdf551ec9da354fe712380c0d1ca3ef6469
Note: checking out '41ab5bdf551ec9da354fe712380c0d1ca3ef6469'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by performing another checkout.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -b with the checkout command again. Example:

  git checkout -b <new-branch-name>

HEAD is now at 41ab5bd Argument parsing with regex and strings
+ sed -i 's#\(L\.\)#\1 -L/opt/rocm/lib#' Makefile
+ make
+ echo PRP=0,1,2,19937,-1,77,0
+ ./openowl
2024-12-18 21:43:32 gpuowl 6.4-41ab5bd-mod
2024-12-18 21:43:32 Args: 
2024-12-18 21:43:32 19937 FFT 8K: Width 8x8, Height 8x8; 2.43 bits/word
2024-12-18 21:43:32 using long carry kernels
2024-12-18 21:43:33 warning: argument unused during compilation: '-I .' [-Wunused-command-line-argument]
1 warning generated.

2024-12-18 21:43:33 OpenCL compilation in 710 ms, with "-DEXP=19937u -DWIDTH=64u -DSMALL_HEIGHT=64u -DMIDDLE=1u  -I. -cl-fast-relaxed-math -cl-std=CL2.0"
2024-12-18 21:43:33 19937.owl not found, starting from the beginning.
2024-12-18 21:43:33 19937 OK      800  4.00%; 0.12 ms/sq; ETA 0d 00:00; 7aa6c3340ce46bab (check 0.05s)
2024-12-18 21:43:34 19937       10000 50.00%; 0.12 ms/sq; ETA 0d 00:00; 6248f957ba3ee3c5
2024-12-18 21:43:35 PP    19936 / 19937, fffffffffffffffc
2024-12-18 21:43:36 19937 OK    20000 100.00%; 0.12 ms/sq; ETA 0d 00:00; f5eb5782c7855ffd (check 0.05s)
2024-12-18 21:43:36 {"exponent":"19937", "worktype":"PRP-3", "status":"P", "program":{"name":"gpuowl", "version":"6.4-41ab5bd-mod"}, "timestamp":"2024-12-18 21:43:36 UTC", "aid":"0", "fft-length":8192, "res64":"fffffffffffffffc", "residue-type":4}
2024-12-18 21:43:36 Bye
+ echo -e '\nEnter login shell for further work'

Enter login shell for further work
+ su -l - root
root@d21db38fc686:~# logout

real	0m32.124s
user	0m0.003s
sys	0m0.017s
hermann@7600x:~$ 
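
The `E:` and `debconf:` lines still show up in the transcript above even though the script redirects to `/dev/null`. That is presumably because those messages go to stderr and the script writes `2>&1 > /dev/null`: redirections are applied left to right, so stderr is attached to the terminal before stdout is discarded. A small sketch of the difference (`emit` is a hypothetical stand-in for a command that writes to both streams):

```shell
#!/bin/bash
# Hypothetical helper: one line to stdout, one line to stderr.
emit() { echo "to-stdout"; echo "to-stderr" >&2; }

# Order used in the script: stderr is pointed at the current stdout first,
# then only stdout is sent to /dev/null, so stderr stays visible.
visible_a=$( { emit 2>&1 > /dev/null; } )

# Reversed order: stdout goes to /dev/null first, then stderr follows it.
visible_b=$( { emit > /dev/null 2>&1; } )

echo "2>&1 > /dev/null shows: '$visible_a'"
echo "> /dev/null 2>&1 shows: '$visible_b'"
```

Keeping stderr visible may well be intentional here, since it surfaces apt errors while hiding routine progress output; in that case a plain `> /dev/null` has the same effect.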

@Hermann-SW
Author

Hermann-SW commented Dec 20, 2024

Revision 2 jumps forward 5 years to this commit from April 25, 2024:
preda/gpuowl@9ce7f68

Many changes were needed. The smallest exponent that can now be processed is 859433, which the demo uses.
This version also generates a proof:

hermann@7600x:~$ docker run -it -v ./openowl:/openowl --device /dev/kfd --device /dev/dri/renderD128 rocm/dev-ubuntu-18.04 /openowl
+ apt-get update
E: The repository 'https://repo.radeon.com/amdgpu/22.20.3/ubuntu bionic Release' does not have a Release file.
+ apt-get install -y git
debconf: delaying package configuration, since apt-utils is not installed
+ usermod -a -G video root
+ echo ROC_ENABLE_PRE_VEGA=1
+ echo HSA_OVERRIDE_GFX_VERSION=8.0.3
+ cat
+ chmod 755 /root/doit
+ su - root ./doit

Now executing /root/doit
+ groups
root video
+ /opt/rocm/bin/clinfo
+ grep Number
Number of platforms:				 1
Number of devices:				 1
+ /opt/rocm/bin/clinfo
+ grep Board
  Board name:					 AMD Radeon (TM) RX 480 Graphics
+ apt install -y software-properties-common

WARNING: apt does not have a stable CLI interface. Use with caution in scripts.

debconf: delaying package configuration, since apt-utils is not installed
+ add-apt-repository ppa:ubuntu-toolchain-r/test -y
E: The repository 'https://repo.radeon.com/amdgpu/22.20.3/ubuntu bionic Release' does not have a Release file.
+ apt-get install -y g++-9
debconf: delaying package configuration, since apt-utils is not installed
+ apt-get install -y libgmp-dev
debconf: delaying package configuration, since apt-utils is not installed
+ update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-9 10
update-alternatives: using /usr/bin/gcc-9 to provide /usr/bin/gcc (gcc) in auto mode
+ update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-9 10
update-alternatives: using /usr/bin/g++-9 to provide /usr/bin/g++ (g++) in auto mode
+ git clone https://github.com/preda/gpuowl.git
Cloning into 'gpuowl'...
remote: Enumerating objects: 10498, done.
remote: Counting objects: 100% (1869/1869), done.
remote: Compressing objects: 100% (274/274), done.
remote: Total 10498 (delta 1694), reused 1640 (delta 1595), pack-reused 8629 (from 2)
Receiving objects: 100% (10498/10498), 13.91 MiB | 21.52 MiB/s, done.
Resolving deltas: 100% (7752/7752), done.
+ cd gpuowl/
+ git checkout 9ce7f683900e4cf0434d7d2354a3ee195f6bd561
Note: checking out '9ce7f683900e4cf0434d7d2354a3ee195f6bd561'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by performing another checkout.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -b with the checkout command again. Example:

  git checkout -b <new-branch-name>

HEAD is now at 9ce7f68 Add instance nb to Worktodo getTask()
+ sed -i 's#\(lOpenCL\)#\1 -L/opt/rocm/lib#' Makefile
+ make amd
+ echo PRP=0,1,2,859433,-1,77,0
+ build-release/prpll-amd
20241220 00:00:33  PRPLL 9ce7f68-dirty
20241220 00:00:33  PRPLL 9ce7f68-dirty
20241220 00:00:34  device 0, unique id '', driver '3452.0 (HSA1.1,LC)'
20241220 00:00:34 859433 FFT: 256K 256:2:256 (3.28 bpw)
20241220 00:00:34 859433 Using long carry
20241220 00:00:34 859433 Stats: 0
20241220 00:00:34 859433 OpenCL context 3452.0 (HSA1.1,LC):gfx803, args -cl-finite-math-only -cl-std=CL2.0  -DEXP=859433u -DWIDTH=256u -DSMALL_HEIGHT=256u -DMIDDLE=2u -DAMDGPU=1 -DWEIGHT_STEP=0.64892214771249701 -DIWEIGHT_STEP=-0.39354322980786466 -DIWEIGHTS={0,-0.26442037177624717,-0.45892261054220551,-0.20398899004471363,-0.41447051723507644,-0.13859288150746241,-0.36636647202996497,-0.067824170131393469,} -DFWEIGHTS={0,0.35947212460839689,0.84816445758726855,0.25626403089094474,0.70785593114442136,0.16089126561897926,0.5781993153103645,0.072758988120249279,}
20241220 00:00:36 859433 OpenCL compilation: 2.03s
20241220 00:00:36 859433 OK         0 on-load: blockSize 400, 0000000000000003
20241220 00:00:36 859433 Proof of power 6 requires about 0.0GB of disk space
20241220 00:00:36 859433 OK       800   0.09% 6f20f874c7a62afb  261 us/it + check 0.09s + save 0.00s; ETA 00:04
20241220 00:00:36 859433 Profile:
 25.56% tailSquare    49 x  1599  1.4 0.4
 16.34% fftMidOut     31 x  1630  1.4 0.4
 14.18% fftW          27 x  1630  1.4 0.5
 13.64% fftMidIn      25 x  1661  1.4 0.4
 13.36% fftP          25 x  1661  1.4 0.4
 10.76% kCarryA       20 x  1628  1.4 0.5
  3.55% carryB         7 x  1630  1.4 0.5
  1.07% bufSmallOut    4 x   801  0.0 0.2
  0.85% tailMul       85 x    31  1.9 0.7
  0.51% readResidue    2 x   801  0.0 0.2
20241220 00:00:39 859433     10000 21bc9a2e362200a7  337
20241220 00:00:43 859433     20000 a4f59592661e3ddb  366
20241220 00:00:47 859433     30000 1f938349adc812a4  386
20241220 00:00:51 859433     40000 7c18a1b78f82ab9c  392
20241220 00:00:55 859433     50000 5e91b506bb419030  406
20241220 00:00:59 859433     60000 ad61d39d3f3ef9eb  388
20241220 00:01:03 859433     70000 cdadd593d163ea7d  412
20241220 00:01:07 859433     80000 c430e19a95491108  415
20241220 00:01:11 859433     90000 454a2cbf2ee63940  415
20241220 00:01:15 859433    100000 e84d13846dceca1f  393
20241220 00:01:19 859433    110000 6170f0cd1a73a78b  417
20241220 00:01:24 859433    120000 284efc47168d3ae2  418
20241220 00:01:28 859433    130000 ad1450e725082626  414
20241220 00:01:32 859433    140000 8da7ecf5b99b1061  392
20241220 00:01:36 859433    150000 c03be3fe4af8ab4b  414
20241220 00:01:40 859433    160000 c70728a4550662e4  414
20241220 00:01:44 859433    170000 d67b1ae63863d670  415
20241220 00:01:48 859433    180000 bc69fb966badb751  391
20241220 00:01:52 859433    190000 5438c67d27cdd1f0  412
20241220 00:01:56 859433 OK    200000  23.27% e255caf2d5510c4c  414 us/it + check 0.09s + save 0.00s; ETA 00:05
20241220 00:01:56 859433 Profile:
 25.40% tailSquare    78 x199600  0.1 0.1
 16.23% fftMidOut     50 x200098  0.1 0.1
 14.68% fftMidIn      45 x200596  0.1 0.0
 13.90% fftP          43 x200596  0.0 0.0
 13.63% fftW          42 x200098  0.1 0.2
  9.71% kCarryA       30 x200097  0.1 0.2
  3.61% carryB        11 x200098  0.1 0.3
  1.83% bufSmallOut    6 x199200  0.1 0.3
  0.89% readResidue    3 x199200  0.1 0.3
20241220 00:02:00 859433    210000 84656e7129221c17  414
20241220 00:02:04 859433    220000 2336d58d7dfe3fc0  394
20241220 00:02:08 859433    230000 faace886636a3424  407
20241220 00:02:13 859433    240000 78a8e4a8e0763a0b  413
20241220 00:02:17 859433    250000 53e67da25691516e  411
20241220 00:02:21 859433    260000 d0e2a544a0c3f3cd  390
20241220 00:02:25 859433    270000 9e9d1fd4af6be12e  413
20241220 00:02:29 859433    280000 21ab4cd1773a3899  412
20241220 00:02:33 859433    290000 4718ff00bd3f17e9  390
20241220 00:02:37 859433    300000 8cf496f649c6b2c8  412
20241220 00:02:41 859433    310000 f72be17d57aba872  413
20241220 00:02:45 859433    320000 ac6cea57958f9ced  413
20241220 00:02:49 859433    330000 98e573923b73969b  392
20241220 00:02:53 859433    340000 8682ea3f3198e1a3  416
20241220 00:02:57 859433    350000 d4d61df2c11c2595  419
20241220 00:03:01 859433    360000 a44384929b2bfeb9  392
20241220 00:03:05 859433    370000 60c0b1be5cd558e0  415
20241220 00:03:10 859433    380000 431f6a333c41bfe9  416
20241220 00:03:14 859433    390000 7575938b49827771  411
20241220 00:03:18 859433 OK    400000  46.53% 25e36787131ad5fb  395 us/it + check 0.18s + save 0.00s; ETA 00:03
20241220 00:03:18 859433 Profile:
 25.36% tailSquare    79 x200400  0.1 0.1
 16.25% fftMidOut     51 x200900  0.1 0.1
 14.72% fftMidIn      46 x201400  0.1 0.0
 13.92% fftP          43 x201400  0.0 0.0
 13.62% fftW          42 x200900  0.1 0.2
  9.70% kCarryA       30 x200899  0.1 0.2
  3.61% carryB        11 x200900  0.1 0.3
  1.82% bufSmallOut    6 x200000  0.1 0.3
  0.88% readResidue    3 x200000  0.1 0.3
20241220 00:03:22 859433    410000 6458306451bb926c  406
20241220 00:03:26 859433    420000 27c8cfdbb1468075  416
20241220 00:03:30 859433    430000 d21960e53f9776b8  416
20241220 00:03:34 859433    440000 7be7f8b9b847a331  393
20241220 00:03:38 859433    450000 f075bd7363926d4d  413
20241220 00:03:42 859433    460000 ee536f078d5fde23  416
20241220 00:03:47 859433    470000 2d4531cdbfcae6d6  415
20241220 00:03:51 859433    480000 9972b4bd564f49f0  394
20241220 00:03:55 859433    490000 b4f5cc3a2f558dad  415
20241220 00:03:59 859433    500000 a2a3480ca547149d  416
20241220 00:04:03 859433    510000 57afd1242174cb5c  411
20241220 00:04:07 859433    520000 7983318fdd9f1e5b  398
20241220 00:04:11 859433    530000 c3a4c1d48d8a9904  418
20241220 00:04:15 859433    540000 25c246e2f4adf539  416
20241220 00:04:19 859433    550000 d2dce53a81be6439  408
20241220 00:04:23 859433    560000 3d83e11788355eb2  401
20241220 00:04:28 859433    570000 c1c4ab5da10bc46b  419
20241220 00:04:32 859433    580000 a5511ff3f2859951  418
20241220 00:04:36 859433    590000 fb47b2d812d79afa  410
20241220 00:04:40 859433 OK    600000  69.80% 46f9f51d665b1d5a  400 us/it + check 0.31s + save 0.00s; ETA 00:02
20241220 00:04:40 859433 Profile:
 25.33% tailSquare    80 x200400  0.1 0.1
 16.26% fftMidOut     51 x200900  0.1 0.1
 14.75% fftMidIn      46 x201400  0.1 0.0
 13.94% fftP          44 x201400  0.0 0.0
 13.61% fftW          43 x200900  0.1 0.2
  9.70% kCarryA       31 x200899  0.1 0.3
  3.62% carryB        11 x200900  0.1 0.3
  1.81% bufSmallOut    6 x200000  0.1 0.3
  0.88% readResidue    3 x200000  0.1 0.3
20241220 00:04:44 859433    610000 0ab495a58e1fbbc1  395
20241220 00:04:48 859433    620000 b7036b0e55ec15c6  418
20241220 00:04:52 859433    630000 e7efa89fe6e412a5  418
20241220 00:04:57 859433    640000 ab3efca20120fab1  419
20241220 00:05:01 859433    650000 47178347cac95589  402
20241220 00:05:05 859433    660000 2ecda57fccf777a0  409
20241220 00:05:09 859433    670000 e64063f729df4678  417
20241220 00:05:13 859433    680000 979ba9adeac90f18  419
20241220 00:05:17 859433    690000 e2100d6a025632f3  419
20241220 00:05:21 859433    700000 6f91aa5b881a0103  400
20241220 00:05:25 859433    710000 2fa34c075f8de721  414
20241220 00:05:30 859433    720000 45041e99a8446485  418
20241220 00:05:34 859433    730000 435d4405bb3faa95  419
20241220 00:05:38 859433    740000 7951c1c57533ec65  420
20241220 00:05:42 859433    750000 51a188fe1b779061  417
20241220 00:05:46 859433    760000 3a8c04e89ccecdae  404
20241220 00:05:50 859433    770000 4ac00c9275108f4e  423
20241220 00:05:55 859433    780000 8b0c6e36428e1291  422
20241220 00:05:59 859433    790000 6b4d41204a61297e  421
20241220 00:06:03 859433 OK    800000  93.07% 81f457ad5e2573af  406 us/it + check 0.26s + save 0.00s; ETA 00:00
20241220 00:06:03 859433 Profile:
 25.30% tailSquare    81 x200400  0.1 0.1
 16.27% fftMidOut     52 x200900  0.1 0.2
 14.78% fftMidIn      47 x201400  0.1 0.0
 13.95% fftP          44 x201400  0.0 0.0
 13.60% fftW          43 x200900  0.1 0.2
  9.70% kCarryA       31 x200899  0.1 0.3
  3.62% carryB        12 x200900  0.1 0.3
  1.81% bufSmallOut    6 x200000  0.1 0.3
  0.87% readResidue    3 x200000  0.1 0.3
20241220 00:06:07 859433    810000 7a9dbb7e1a91147a  410
20241220 00:06:11 859433    820000 82f2541a5735b613  409
20241220 00:06:16 859433    830000 2e3fdd53fd8f6717  423
20241220 00:06:20 859433    840000 fdf7eefe20330d3f  422
20241220 00:06:24 859433    850000 98954ce9f15bdc12  421
20241220 00:06:28 859433 PP   859433 / 859433, 0000000000000001
20241220 00:06:28 859433 OK    859600 100.00% f69dbe1c12d7020b  403 us/it + check 0.33s + save 0.00s; ETA 00:00
20241220 00:06:28 859433 Acquired memory lock 'memlock-0'
20241220 00:06:28 859433 proof level 0 : M 8836e74b21950cfe, h f5ad24828c4fc3e2
20241220 00:06:28 859433 proof level 1 : M bfc267e01dbe73d3, h 9ce35bf05f8802f8
20241220 00:06:28 859433 proof level 2 : M 18b4817e3ed82aa2, h d4d64e10c89a10ca
20241220 00:06:29 859433 proof level 3 : M 9746659f39020349, h a21a54de004896f6
20241220 00:06:29 859433 proof level 4 : M c003cb0babfba570, h a2664e2e21ad8ab6
20241220 00:06:30 859433 proof level 5 : M 92102516095d2c49, h 9d1a0dd52c817054
20241220 00:06:30 859433 Proof 'proof/859433-6.proof' generated
20241220 00:06:30 859433 Released memory lock 'memlock-0'
20241220 00:06:30 859433 {"status":"P", "exponent":"859433", "worktype":"PRP-3", "res64":"0000000000000001", "residue-type":"1", "errors":{"gerbicz":"0"}, "fft-length":"262144", "proof":{"version":"1", "power":"6", "hashsize":"64", "md5":"132a479cedea59ddaa16bfe50c2e9f9d"}, "program":{"name":"gpuowl", "version":"9ce7f68-dirty", "port":"8"}, "timestamp":"2024-12-20 00:06:30 UTC"}
20241220 00:06:30  Bye
+ echo -e '\nEnter login shell for further work'

Enter login shell for further work
+ su -l - root
root@fc1fb3ee46fb:~# 
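
The run above reports `"worktype":"PRP-3"` with `"residue-type":"1"` and a final `res64` of `0000000000000001`, which is consistent with a base-3 Fermat test, 3^(N-1) mod N = 1, on N = 2^859433 - 1. The sketch below illustrates only that arithmetic idea on tiny exponents using bash's 64-bit integers; it is not how gpuowl computes it (gpuowl runs the squarings on the GPU). The field names used to split the worktodo line are assumptions based on the usual PrimeNet-style order, and `modpow` and `is_prp3` are hypothetical helpers:

```shell
#!/bin/bash
# Split the worktodo entry from the script into its comma-separated fields.
# Field names are assumed, not taken from gpuowl's source.
line='PRP=0,1,2,859433,-1,77,0'
IFS=, read -r aid k b n c factored_to tests_saved <<< "${line#PRP=}"
echo "candidate: $k*$b^$n$c"

# modpow BASE EXP MOD: square-and-multiply. MOD must stay below 2^31 so
# that every product fits in bash's 64-bit signed integers.
modpow() {
  local base=$(( $1 % $3 )) exp=$2 mod=$3 result=1
  while (( exp > 0 )); do
    if (( exp & 1 )); then result=$(( result * base % mod )); fi
    base=$(( base * base % mod ))
    exp=$(( exp >> 1 ))
  done
  echo "$result"
}

# is_prp3 P: succeed if 3^(N-1) mod N is 1 for N = 2^P - 1 (P <= 31 only).
is_prp3() {
  local N=$(( (1 << $1) - 1 ))
  [ "$(modpow 3 $(( N - 1 )) "$N")" -eq 1 ]
}

for p in 7 11 13; do
  if is_prp3 "$p"; then
    echo "2^$p-1 is a base-3 probable prime"
  else
    echo "2^$p-1 is composite"
  fi
done
```

Here 2^7-1 = 127 and 2^13-1 = 8191 are prime and pass, while 2^11-1 = 2047 = 23 * 89 fails the test.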

@Hermann-SW
Author

Revision 3 uses the gpuowl pre_c++20 branch for upcoming dev work
(with rocm/dev-ubuntu-18.04 for the PRE_VEGA "Radeon RX 480" GPU).
