@matejaputic
Created March 12, 2016 01:26
#ProfileFileVersion=3.1
#ProfilerVersion=3.1.7247
#Application=/home/users/mputic/persistent/Projects/clBLAS/build/samples/example_sgemm
#ApplicationArgs=
#WorkingDirectory=
#Device AMD A10-7850K Radeon R7, 12 Compute Cores 4C+8G Platform Vendor=Advanced Micro Devices, Inc.
#Device AMD A10-7850K Radeon R7, 12 Compute Cores 4C+8G Platform Name=AMD Accelerated Parallel Processing
#Device AMD A10-7850K Radeon R7, 12 Compute Cores 4C+8G Platform Version=OpenCL 1.2 AMD-APP (1445.5)
#Device AMD A10-7850K Radeon R7, 12 Compute Cores 4C+8G CLDriver Version=1445.5 (sse2,avx,fma4)
#Device AMD A10-7850K Radeon R7, 12 Compute Cores 4C+8G CLRuntime Version=OpenCL 1.2 AMD-APP (1445.5)
#Device AMD A10-7850K Radeon R7, 12 Compute Cores 4C+8G NumberAppAddressBits=64
#Device Hawaii Platform Vendor=Advanced Micro Devices, Inc.
#Device Hawaii Platform Name=AMD Accelerated Parallel Processing
#Device Hawaii Platform Version=OpenCL 1.2 AMD-APP (1445.5)
#Device Hawaii CLDriver Version=1445.5 (VM)
#Device Hawaii CLRuntime Version=OpenCL 1.2 AMD-APP (1445.5)
#Device Hawaii NumberAppAddressBits=64
#OS Version=Ubuntu 14.04.2 LTS \n \l
#DisplayName=
#ListSeparator=,
#ForceSinglePass=False
Method, ExecutionOrder, ThreadID, CallIndex, GlobalWorkSize, WorkGroupSize, Time, LocalMemSize, VGPRs, SGPRs, ScratchRegs, FCStacks, Wavefronts, VALUInsts, SALUInsts, VFetchInsts, SFetchInsts, VWriteInsts, LDSInsts, GDSInsts, VALUUtilization, VALUBusy, SALUBusy, FetchSize, WriteSize, CacheHit, MemUnitBusy, MemUnitStalled, WriteUnitStalled, LDSBankConflict
sgemm_Col_NN_B1_MX096_NX096_KX16__k1_Hawaii1, 1, 107896, 64, { 1360 1360 1}, { 16 16 1}, 15180.28756, 12416, 228, 37, 324, NA, 28900.00, 365115.00, 3099.00, 245786.00, 17.00, 58611.00, 104448.00, 0.00, 100.00, 1.72, 0.06, 1224019192.25, 423462947.44, 1.17, 31.81, 0.85, 0.00, 0.01
sgemm_Col_NN_B1_ML096_NX096_KX16__k2_Hawaii1, 2, 107896, 79, { 16 1360 1}, { 16 16 1}, 20.92904, 12288, 209, 38, 0, NA, 340.00, 65073.00, 10259.00, 6156.00, 17.00, 12.00, 47104.00, 0.00, 92.20, 2.38, 1.90, 273753.19, 1020.00, 21.84, 35.31, 0.00, 0.29, 0.83
sgemm_Col_NN_B1_MX096_NL096_KX16__k3_Hawaii1, 3, 107896, 94, { 1360 16 1}, { 16 16 1}, 5.46089, 12288, 212, 38, 0, NA, 340.00, 91210.00, 10289.00, 6156.00, 17.00, 12.00, 47104.00, 0.00, 94.41, 14.19, 6.32, 264196.12, 1020.00, 24.58, 22.47, 0.01, 0.01, 3.54
sgemm_Col_NN_B1_ML096_NL096_KX16__k4_Hawaii1, 4, 107896, 109, { 16 16 1}, { 16 16 1}, 3.94104, 12288, 213, 48, 0, NA, 4.00, 94225.00, 19495.00, 6148.00, 17.00, 4.00, 47104.00, 0.00, 89.25, 0.22, 0.20, 2058.81, 4.00, 0.19, 4.71, 0.00, 0.00, 0.05
#ProfileFileVersion=3.1
#ProfilerVersion=3.1.7247
#Application=/home/users/mputic/persistent/Projects/clBLAS/build/samples/example_sgemm
#ApplicationArgs=
#WorkingDirectory=
#Device AMD A10-7850K Radeon R7, 12 Compute Cores 4C+8G Platform Vendor=Advanced Micro Devices, Inc.
#Device AMD A10-7850K Radeon R7, 12 Compute Cores 4C+8G Platform Name=AMD Accelerated Parallel Processing
#Device AMD A10-7850K Radeon R7, 12 Compute Cores 4C+8G Platform Version=OpenCL 1.2 AMD-APP (1445.5)
#Device AMD A10-7850K Radeon R7, 12 Compute Cores 4C+8G CLDriver Version=1445.5 (sse2,avx,fma4)
#Device AMD A10-7850K Radeon R7, 12 Compute Cores 4C+8G CLRuntime Version=OpenCL 1.2 AMD-APP (1445.5)
#Device AMD A10-7850K Radeon R7, 12 Compute Cores 4C+8G NumberAppAddressBits=64
#Device Hawaii Platform Vendor=Advanced Micro Devices, Inc.
#Device Hawaii Platform Name=AMD Accelerated Parallel Processing
#Device Hawaii Platform Version=OpenCL 1.2 AMD-APP (1445.5)
#Device Hawaii CLDriver Version=1445.5 (VM)
#Device Hawaii CLRuntime Version=OpenCL 1.2 AMD-APP (1445.5)
#Device Hawaii NumberAppAddressBits=64
#OS Version=Ubuntu 14.04.2 LTS \n \l
#DisplayName=
#ListSeparator=,
#ForceSinglePass=False
Method, ExecutionOrder, ThreadID, CallIndex, GlobalWorkSize, WorkGroupSize, Time, LocalMemSize, VGPRs, SGPRs, ScratchRegs, FCStacks, Wavefronts, VALUInsts, SALUInsts, VFetchInsts, SFetchInsts, VWriteInsts, LDSInsts, GDSInsts, VALUUtilization, VALUBusy, SALUBusy, FetchSize, WriteSize, CacheHit, MemUnitBusy, MemUnitStalled, WriteUnitStalled, LDSBankConflict
sgemmBlock__k1_Hawaii1, 1, 107888, 86, { 1024 2048 1}, { 8 8 1}, 535.99452, 0, 116, 32, 0, NA, 32768.00, 311608.11, 4120.01, 24584.00, 16.00, 8.00, 0.00, 0.00, 100.00, 47.06, 2.49, 72342316.81, 263666.00, 61.39, 99.44, 0.07, 0.00, 0.00