@rajesh-s
Last active December 21, 2021 00:30
Quick docker
@rajesh-s
Author

make -j16 -C ./gpu-app-collection/src rodinia-3.1

@rajesh-s
Author

Using logfiles ['/home/runner/accel-sim-framework/util/job_launching/../job_launching/logfiles/sim_log.RUNSPTX.21.12.13-Monday.txt']
procman.id Node App AppArgs Version Config RunningTime Mem JobStatus Basic GPGPU-Sim Stats

59 f84884333a47 b+tree-rodinia-3.1 file___data_mil_txt_ b+tree-rodinia-3.1.g QV100-PTX 0:31:03 2 B COMPLETE_NO_OTHER_INFO SIMRATE_IPS=204 K SIM_TIME=31 min, 4 sec (1864 sec) TOT_IPC=3 K TOT_INSN=379 M TOT_CYCLE=139 K
60 f84884333a47 backprop-rodinia-3.1 65536 backprop-rodinia-3.1 QV100-PTX 0:13:32 2 B COMPLETE_NO_OTHER_INFO SIMRATE_IPS=193 K SIM_TIME=13 min, 58 sec (838 sec) TOT_IPC=3 K TOT_INSN=162 M TOT_CYCLE=61 K
61 f84884333a47 bfs-rodinia-3.1 __data_graph4096_txt bfs-rodinia-3.1.gpgp QV100-PTX 0:02:00 428 M COMPLETE_NO_OTHER_INFO SIMRATE_IPS=13 K SIM_TIME=2 min, 6 sec (126 sec) TOT_IPC=13 TOT_INSN=2 M TOT_CYCLE=127 K
62 f84884333a47 bfs-rodinia-3.1 __data_graph65536_tx bfs-rodinia-3.1.gpgp QV100-PTX 0:08:01 699 M COMPLETE_NO_OTHER_INFO SIMRATE_IPS=63 K SIM_TIME=8 min, 6 sec (486 sec) TOT_IPC=172 TOT_INSN=31 M TOT_CYCLE=179 K
63 f84884333a47 bfs-rodinia-3.1 __data_graph1MW_6_tx bfs-rodinia-3.1.gpgp QV100-PTX 1:34:39 1 B COMPLETE_NO_OTHER_INFO SIMRATE_IPS=95 K SIM_TIME=1 hrs, 34 min, 50 sec (5690 sec) TOT_IPC=623 TOT_INSN=542 M TOT_CYCLE=870 K
64 f84884333a47 dwt2d-rodinia-3.1 __data_192_bmp__d_19 dwt2d-rodinia-3.1.gp QV100-PTX 0:03:30 710 M COMPLETE_NO_OTHER_INFO SIMRATE_IPS=33 K SIM_TIME=3 min, 51 sec (231 sec) TOT_IPC=48 TOT_INSN=8 M TOT_CYCLE=160 K
65 f84884333a47 dwt2d-rodinia-3.1 __data_rgb_bmp__d_10 dwt2d-rodinia-3.1.gp QV100-PTX 0:25:33 1 B COMPLETE_NO_OTHER_INFO SIMRATE_IPS=120 K SIM_TIME=25 min, 56 sec (1556 sec) TOT_IPC=510 TOT_INSN=187 M TOT_CYCLE=368 K
66 f84884333a47 gaussian-rodinia-3.1 _f___data_matrix4_tx gaussian-rodinia-3.1 QV100-PTX 0:00:30 361 M COMPLETE_NO_OTHER_INFO SIMRATE_IPS=552 SIM_TIME=32 sec (32 sec) TOT_IPC=0 TOT_INSN=18 K TOT_CYCLE=39 K
67 f84884333a47 gaussian-rodinia-3.1 _s_16 gaussian-rodinia-3.1 QV100-PTX 0:02:00 361 M COMPLETE_NO_OTHER_INFO SIMRATE_IPS=914 SIM_TIME=2 min, 18 sec (138 sec) TOT_IPC=1 TOT_INSN=126 K TOT_CYCLE=195 K
68 f84884333a47 gaussian-rodinia-3.1 f___data_matrix208 gaussian-rodinia-3.1 QV100-PTX 0:50:35 629 M COMPLETE_NO_OTHER_INFO SIMRATE_IPS=74 K SIM_TIME=50 min, 54 sec (3054 sec) TOT_IPC=87 TOT_INSN=226 M TOT_CYCLE=3 M
69 f84884333a47 gaussian-rodinia-3.1 _s_64 gaussian-rodinia-3.1 QV100-PTX 0:10:31 428 M COMPLETE_NO_OTHER_INFO SIMRATE_IPS=10 K SIM_TIME=10 min, 33 sec (633 sec) TOT_IPC=8 TOT_INSN=7 M TOT_CYCLE=787 K
70 f84884333a47 gaussian-rodinia-3.1 s_256 gaussian-rodinia-3.1 QV100-PTX 1:08:07 697 M COMPLETE_NO_OTHER_INFO SIMRATE_IPS=103 K SIM_TIME=1 hrs, 8 min, 12 sec (4092 sec) TOT_IPC=130 TOT_INSN=420 M TOT_CYCLE=3 M
71 f84884333a47 hotspot-rodinia-3.1 512_2_2___data_temp hotspot-rodinia-3.1. QV100-PTX 0:05:31 2 B COMPLETE_NO_OTHER_INFO SIMRATE_IPS=259 K SIM_TIME=5 min, 47 sec (347 sec) TOT_IPC=4 K TOT_INSN=90 M TOT_CYCLE=23 K
72 f84884333a47 hotspot-rodinia-3.1 1024_2_2___data_temp hotspot-rodinia-3.1. QV100-PTX 0:25:33 2 B COMPLETE_NO_OTHER_INFO SIMRATE_IPS=232 K SIM_TIME=25 min, 51 sec (1551 sec) TOT_IPC=6 K TOT_INSN=360 M TOT_CYCLE=63 K
73 f84884333a47 hybridsort-rodinia-3 r hybridsort-rodinia-3 QV100-PTX 0 0 COMPLETE_ERR_FILE_HAS_CONTENTS
74 f84884333a47 hybridsort-rodinia-3 __data_500000_txt hybridsort-rodinia-3 QV100-PTX 0 0 COMPLETE_ERR_FILE_HAS_CONTENTS
75 f84884333a47 kmeans-rodinia-3.1 o__i___data_28k_4x kmeans-rodinia-3.1.g QV100-PTX 1:49:40 2 B RUNNING
76 f84884333a47 kmeans-rodinia-3.1 _o__i___data_kdd_cup kmeans-rodinia-3.1.g QV100-PTX 1:49:40 2 B RUNNING
77 f84884333a47 kmeans-rodinia-3.1 o__i___data_819200 kmeans-rodinia-3.1.g QV100-PTX 1:49:09 2 B RUNNING
78 f84884333a47 lavaMD-rodinia-3.1 _boxes1d_10 lavaMD-rodinia-3.1.g QV100-PTX 1:47:39 2 B RUNNING
79 f84884333a47 lud-rodinia-3.1 _s_256__v lud-rodinia-3.1.gpgp QV100-PTX 0:18:02 769 M COMPLETE_NO_OTHER_INFO SIMRATE_IPS=30 K SIM_TIME=18 min, 11 sec (1091 sec) TOT_IPC=31 TOT_INSN=32 M TOT_CYCLE=1 M
80 f84884333a47 lud-rodinia-3.1 _i___data_512_dat lud-rodinia-3.1.gpgp QV100-PTX 0:48:05 1 B COMPLETE_NO_OTHER_INFO SIMRATE_IPS=89 K SIM_TIME=48 min, 29 sec (2909 sec) TOT_IPC=127 TOT_INSN=258 M TOT_CYCLE=2 M
81 f84884333a47 myocyte-rodinia-3.1 100_1_0 myocyte-rodinia-3.1. QV100-PTX 0 0 ABORTED
82 f84884333a47 nn-rodinia-3.1 __data_filelist_4__r nn-rodinia-3.1.gpgpu QV100-PTX 0 0 COMPLETE_NO_OTHER_INFO SIMRATE_IPS=71 K SIM_TIME=17 sec (17 sec) TOT_IPC=164 TOT_INSN=1 M TOT_CYCLE=7 K
83 f84884333a47 nw-rodinia-3.1 2048_10 nw-rodinia-3.1.gpgpu QV100-PTX 1:43:09 486 M RUNNING
84 f84884333a47 particlefilter_float x_128__y_128__z_10 particlefilter_float QV100-PTX 0 0 COMPLETE_NO_OTHER_INFO
85 f84884333a47 particlefilter_naive x_128__y_128__z_10 particlefilter_naive QV100-PTX 0 0 COMPLETE_NO_OTHER_INFO
86 f84884333a47 pathfinder-rodinia-3 100000_100_20___resu pathfinder-rodinia-3 QV100-PTX 0 0 COMPLETE_NO_OTHER_INFO
87 f84884333a47 srad_v1-rodinia-3.1 100_0_5_502_458 srad_v1-rodinia-3.1. QV100-PTX 1:40:08 2 B RUNNING

failed job log written to /home/runner/accel-sim-framework/util/job_launching/../job_launching/logfiles/failed_job_log_sim_log.RUNSPTX.21.12.13-Monday.txt
Passed:0/29, No error:20/29, Failed/Error:3/29, Running:6/29, Waiting:0/29
Contents /home/runner/accel-sim-framework/util/job_launching/../job_launching/logfiles/failed_job_log_sim_log.RUNSPTX.21.12.13-Monday.txt:
73 f84884333a47 hybridsort-rodinia-3 r hybridsort-rodinia-3 QV100-PTX 0 0 COMPLETE_ERR_FILE_HAS_CONTENTS
74 f84884333a47 hybridsort-rodinia-3 __data_500000_txt hybridsort-rodinia-3 QV100-PTX 0 0 COMPLETE_ERR_FILE_HAS_CONTENTS
81 f84884333a47 myocyte-rodinia-3.1 100_1_0 myocyte-rodinia-3.1. QV100-PTX 0 0 ABORTED


hybridsort-rodinia-3.1-r--QV100-PTX. Status=COMPLETE_ERR_FILE_HAS_CONTENTS
Last 10 line of /home/runner/accel-sim-framework/util/job_launching/../../sim_run_11.0/hybridsort-rodinia-3.1/r/QV100-PTX/hybridsort-rodinia-3.1-r.gpgpu-sim_git-commit-971dd69e8a6c355c8cc4c8bd997db51376ff520a_modified_0.0.o73

Sorting on GPU...
Sorting list of 4194304 floats
doing: /home/runner/accel-sim-framework/gpu-app-collection/src/..//bin/11.0/release/hybridsort-rodinia-3.1 r
doing export CUDA_LAUNCH_BLOCKING=1
doing: export PATH=/home/runner/accel-sim-framework/gpu-simulator/gpgpu-sim/bin:/usr/local/cuda-11/bin:/usr/local/cuda-11/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
doing
doing: export OPENCL_REMOTE_GPU_HOST=REPLACE_REMOTE_HOST
doing: export OPENCL_CURRENT_TEST_PATH=/home/runner/accel-sim-framework/util/job_launching/../../sim_run_11.0/hybridsort-rodinia-3.1/r/QV100-PTX
doing: cd /home/runner/accel-sim-framework/util/job_launching/../../sim_run_11.0/hybridsort-rodinia-3.1/r/QV100-PTX
doing: export LD_LIBRARY_PATH=/home/runner/accel-sim-framework/util/job_launching/../../sim_run_11.0/gpgpu-sim-builds/gpgpu-sim_git-commit-971dd69e8a6c355c8cc4c8bd997db51376ff520a_modified_0.0:/home/runner/accel-sim-framework/gpu-simulator/gpgpu-sim/lib/gcc-7.5.0/cuda-11000/release:

Contents of /home/runner/accel-sim-framework/util/job_launching/../../sim_run_11.0/hybridsort-rodinia-3.1/r/QV100-PTX/hybridsort-rodinia-3.1-r.gpgpu-sim_git-commit-971dd69e8a6c355c8cc4c8bd997db51376ff520a_modified_0.0.e73

CUDA error at bucketsort.cu:49 code=35(cudaErrorInsufficientDriver) "cudaMalloc((void**) &d_offsets, histosize * sizeof(unsigned int))"




hybridsort-rodinia-3.1-__data_500000_txt--QV100-PTX. Status=COMPLETE_ERR_FILE_HAS_CONTENTS
Last 10 line of /home/runner/accel-sim-framework/util/job_launching/../../sim_run_11.0/hybridsort-rodinia-3.1/__data_500000_txt/QV100-PTX/hybridsort-rodinia-3.1-__data_500000_txt.gpgpu-sim_git-commit-971dd69e8a6c355c8cc4c8bd997db51376ff520a_modified_0.0.o74

Sorting on GPU...
Sorting list of 500000 floats
doing: /home/runner/accel-sim-framework/gpu-app-collection/src/..//bin/11.0/release/hybridsort-rodinia-3.1 ./data/500000.txt
doing export CUDA_LAUNCH_BLOCKING=1
doing: export PATH=/home/runner/accel-sim-framework/gpu-simulator/gpgpu-sim/bin:/usr/local/cuda-11/bin:/usr/local/cuda-11/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
doing
doing: export OPENCL_REMOTE_GPU_HOST=REPLACE_REMOTE_HOST
doing: export OPENCL_CURRENT_TEST_PATH=/home/runner/accel-sim-framework/util/job_launching/../../sim_run_11.0/hybridsort-rodinia-3.1/__data_500000_txt/QV100-PTX
doing: cd /home/runner/accel-sim-framework/util/job_launching/../../sim_run_11.0/hybridsort-rodinia-3.1/__data_500000_txt/QV100-PTX
doing: export LD_LIBRARY_PATH=/home/runner/accel-sim-framework/util/job_launching/../../sim_run_11.0/gpgpu-sim-builds/gpgpu-sim_git-commit-971dd69e8a6c355c8cc4c8bd997db51376ff520a_modified_0.0:/home/runner/accel-sim-framework/gpu-simulator/gpgpu-sim/lib/gcc-7.5.0/cuda-11000/release:

Contents of /home/runner/accel-sim-framework/util/job_launching/../../sim_run_11.0/hybridsort-rodinia-3.1/__data_500000_txt/QV100-PTX/hybridsort-rodinia-3.1-__data_500000_txt.gpgpu-sim_git-commit-971dd69e8a6c355c8cc4c8bd997db51376ff520a_modified_0.0.e74

CUDA error at bucketsort.cu:49 code=35(cudaErrorInsufficientDriver) "cudaMalloc((void**) &d_offsets, histosize * sizeof(unsigned int))"




myocyte-rodinia-3.1-100_1_0--QV100-PTX. Status=ABORTED
Last 10 line of /home/runner/accel-sim-framework/util/job_launching/../../sim_run_11.0/myocyte-rodinia-3.1/100_1_0/QV100-PTX/myocyte-rodinia-3.1-100_1_0.gpgpu-sim_git-commit-971dd69e8a6c355c8cc4c8bd997db51376ff520a_modified_0.0.o81

GPGPU-Sim PTX: ... done pre-decoding instructions for '__internal_accurate_pow'.
GPGPU-Sim PTX: ... end of reconvergence points for __internal_accurate_pow
GPGPU-Sim PTX: immediate post dominator @ PC=0x3a770 (myocyte-rodinia-3.2.sm_62.ptx:20003) st.param.f64[func_retval0+0], %fd136;
GPGPU-Sim PTX: 6 (potential) branch divergence @ PC=0x3a760 (myocyte-rodinia-3.2.sm_62.ptx:19997) @%p8 bra BB2_10;
GPGPU-Sim PTX: immediate post dominator @ PC=0x3a770 (myocyte-rodinia-3.2.sm_62.ptx:20003) st.param.f64[func_retval0+0], %fd136;
GPGPU-Sim PTX: 5 (potential) branch divergence @ PC=0x3a748 (myocyte-rodinia-3.2.sm_62.ptx:19990) @%p7 bra BB2_9;
GPGPU-Sim PTX: immediate post dominator @ PC=0x3a730 (myocyte-rodinia-3.2.sm_62.ptx:19986) mov.b64 {%temp, %r45}, %fd136;
GPGPU-Sim PTX: 4 (potential) branch divergence @ PC=0x3a6a8 (myocyte-rodinia-3.2.sm_62.ptx:19961) @%p6 bra BB2_7;
GPGPU-Sim PTX: immediate post dominator @ PC=0x3a730 (myocyte-rodinia-3.2.sm_62.ptx:19986) mov.b64 {%temp, %r45}, %fd136;
GPGPU-Sim PTX: 3 (potential) branch divergence @ PC=0x3a680 (myocyte-rodinia-3.2.sm_62.ptx:19955) @%p4 bra BB2_7;

Contents of /home/runner/accel-sim-framework/util/job_launching/../../sim_run_11.0/myocyte-rodinia-3.1/100_1_0/QV100-PTX/myocyte-rodinia-3.1-100_1_0.gpgpu-sim_git-commit-971dd69e8a6c355c8cc4c8bd997db51376ff520a_modified_0.0.e81

double free or corruption (fasttop)
/home/runner/accel-sim-framework/util/job_launching/../../sim_run_11.0/myocyte-rodinia-3.1/100_1_0/QV100-PTX/slurm.sim: line 51: 11161 Aborted (core dumped) /home/runner/accel-sim-framework/gpu-app-collection/src/..//bin/11.0/release/myocyte-rodinia-3.1 100 1 0
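
The "No error / Failed / Running" summary in the log above can be cross-checked by tallying the JobStatus token of each per-job row. The following is a minimal sketch, not part of accel-sim: the function name tally_status is hypothetical, and the status names it recognizes are only the ones that appear in this log (COMPLETE_*, RUNNING, ABORTED), so other statuses the tool may emit would be missed. Feed it only the per-job rows, since the failed-job log repeats three of them.

```shell
# Hypothetical helper (not an accel-sim tool): count job rows per JobStatus.
# Reads rows from the files given as arguments, or from stdin if none.
tally_status() {
  awk '
    {
      # The Mem column can be one or two tokens ("428 M" vs "0"), so the
      # status is located by pattern rather than by a fixed field index.
      for (i = 1; i <= NF; i++) {
        if ($i ~ /^(COMPLETE_[A-Z_]+|RUNNING|ABORTED)$/) {
          count[$i]++
          break
        }
      }
    }
    END { for (s in count) print s, count[s] }
  ' "$@" | sort
}

# Usage (assuming the per-job rows were saved to a file named rows.txt):
#   tally_status rows.txt
```

Matching on the status token instead of a column number keeps the tally independent of how many tokens the App, AppArgs, and Mem columns happen to contain.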



@rajesh-s
Author

export CUDA_HOME=/s/cuda-9.1/amd64_ubu20
export CUDA_PATH=$CUDA_HOME
export CUDA_INSTALL_PATH=$CUDA_HOME
export LD_LIBRARY_PATH=/s/cuda-9.1/amd64_ubu20/lib64:/s/mpfr-3.1.6/amd64_ubu20/lib:/s/gcc-6.1/amd64_ubu20/lib64:/s/gcc-6.1/amd64_ubu20/lib:$LD_LIBRARY_PATH
export PATH=/s/cuda-9.1/amd64_ubu20/bin:/s/gcc-6.1/amd64_ubu20/bin:$PATH

source setup_environment release
source setup_environment debug   # alternative to the line above: debug mode, for better GDB access

Steps to run GPGPU-Sim under GDB on CSL machines:

  1. Use gcc 6.1 (the newest version allowed with CUDA 9.1) instead of 5.4: export PATH=/s/cuda-9.1/amd64_ubu20/bin:/s/gcc-6.1/amd64_ubu20/bin:$PATH
  2. export LD_LIBRARY_PATH="/usr/lib/x86_64-linux-gnu/:$LD_LIBRARY_PATH"
  3. Optional: rebuild in debug mode for more detailed access: source setup_environment debug; make
  4. gdb ./application (assuming ldd ./application shows its libraries resolving correctly)
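
The ldd check in step 4 can be scripted. The sketch below rests on an assumption the notes do not spell out: that "correctly set up" means libcudart resolves to the GPGPU-Sim build and not to the stock CUDA install. The function name cudart_path is hypothetical; it only parses ldd output and does not inspect the binary itself.

```shell
# Hypothetical helper: reads `ldd` output on stdin and prints the path that
# libcudart resolves to. Returns non-zero if libcudart is absent from the
# output or is reported as "not found".
cudart_path() {
  awk '
    $1 ~ /^libcudart\.so/ && $2 == "=>" && $3 ~ /^\// {
      print $3
      found = 1
      exit
    }
    END { exit !found }
  '
}

# Usage (run after the exports and `source setup_environment ...`):
#   ldd ./application | cudart_path
```

If the printed path points into the CUDA install instead of the simulator's lib directory, the application would most likely run against the real runtime, and breakpoints in simulator code would never be hit.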

GDB commands:

p <expr>: print the value of an expression, e.g. p mf->get_pc()
up: move one frame up the call stack from the point where execution stopped
r: run the program
