Created March 26, 2024 20:34 (gist geohot/7dff8fd6259b1e6d57efb772b900fd69)
MES Page Fault Crash
[55883.721977] amdgpu: map VA 0x702eae9d2000 - 0x702eae9d3000 in entry 0000000072d2b750
[55883.721996] amdgpu: INC mapping count 1
[55883.722133] kfd kfd: amdgpu: ioctl cmd 0xc0184b0c (#0xc), arg 0x7ffe16172bef
[55883.722238] gmc_v11_0_process_interrupt: 6 callbacks suppressed
[55883.722250] amdgpu 0000:c3:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:8 pasid:32774, for process python3 pid 356134 thread python3 pid 356134)
[55883.722343] amdgpu 0000:c3:00.0: amdgpu: in page starting at address 0x00000000aabbc000 from client 10
[55883.722391] amdgpu 0000:c3:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00800A30
[55883.722429] amdgpu 0000:c3:00.0: amdgpu: Faulty UTCL2 client ID: CPC (0x5)
[55883.722466] amdgpu 0000:c3:00.0: amdgpu: MORE_FAULTS: 0x0
[55883.722497] amdgpu 0000:c3:00.0: amdgpu: WALKER_ERROR: 0x0
[55883.722528] amdgpu 0000:c3:00.0: amdgpu: PERMISSION_FAULTS: 0x3
[55883.722561] amdgpu 0000:c3:00.0: amdgpu: MAPPING_ERROR: 0x0
[55883.722592] amdgpu 0000:c3:00.0: amdgpu: RW: 0x0
[55883.722621] amdgpu: client id 0xa, source id 0, vmid 8, pasid 0x8006. raw data:
[55883.722628] amdgpu: 818000A, F6ED6F02, 4CD, 8006, AABBC, 40, 0, 0.
[55883.722660] amdgpu: Evicting PASID 0x8006 queues
[55883.861108] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
[55883.861477] amdgpu: failed to remove hardware queue from MES, doorbell=0x1000
[55883.861495] amdgpu: MES might be in unrecoverable state, issue a GPU reset
[55883.861514] amdgpu: Failed to evict queue 0
[55883.862401] amdgpu 0000:c3:00.0: amdgpu: GPU reset begin!
[55883.862444] amdgpu: Free mem_obj = 00000000300c7743, range_start = 0, range_end = 0
[55884.885195] amdgpu 0000:c3:00.0: amdgpu: IP block:gfx_v11_0 is hung!
[55884.885390] amdgpu 0000:c3:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:32 vmid:0 pasid:0, for process pid 0 thread pid 0)
[55884.885469] amdgpu 0000:c3:00.0: amdgpu: in page starting at address 0x0000000000000000 from client 10
[55884.885515] amdgpu 0000:c3:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00040A41
[55884.885554] amdgpu 0000:c3:00.0: amdgpu: Faulty UTCL2 client ID: CPC (0x5)
[55884.885590] amdgpu 0000:c3:00.0: amdgpu: MORE_FAULTS: 0x1
[55884.885620] amdgpu 0000:c3:00.0: amdgpu: WALKER_ERROR: 0x0
[55884.885651] amdgpu 0000:c3:00.0: amdgpu: PERMISSION_FAULTS: 0x4
[55884.885682] amdgpu 0000:c3:00.0: amdgpu: MAPPING_ERROR: 0x0
[55884.885713] amdgpu 0000:c3:00.0: amdgpu: RW: 0x1
[55884.885746] amdgpu 0000:c3:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:32 vmid:0 pasid:0, for process pid 0 thread pid 0)
[55884.885801] amdgpu 0000:c3:00.0: amdgpu: in page starting at address 0x0000000000000000 from client 10
[55884.885846] amdgpu 0000:c3:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00040C40
[55884.885883] amdgpu 0000:c3:00.0: amdgpu: Faulty UTCL2 client ID: CPG (0x6)
[55884.885919] amdgpu 0000:c3:00.0: amdgpu: MORE_FAULTS: 0x0
[55884.885949] amdgpu 0000:c3:00.0: amdgpu: WALKER_ERROR: 0x0
[55884.885979] amdgpu 0000:c3:00.0: amdgpu: PERMISSION_FAULTS: 0x4
[55884.886011] amdgpu 0000:c3:00.0: amdgpu: MAPPING_ERROR: 0x0
[55884.886041] amdgpu 0000:c3:00.0: amdgpu: RW: 0x1
[55884.886073] amdgpu 0000:c3:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:32 vmid:0 pasid:0, for process pid 0 thread pid 0)
[55884.886128] amdgpu 0000:c3:00.0: amdgpu: in page starting at address 0x0000000000000000 from client 10
[55884.886172] amdgpu 0000:c3:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000
[55884.886209] amdgpu 0000:c3:00.0: amdgpu: Faulty UTCL2 client ID: CB/DB (0x0)
[55884.886245] amdgpu 0000:c3:00.0: amdgpu: MORE_FAULTS: 0x0
[55884.886275] amdgpu 0000:c3:00.0: amdgpu: WALKER_ERROR: 0x0
[55884.886305] amdgpu 0000:c3:00.0: amdgpu: PERMISSION_FAULTS: 0x0
[55884.886337] amdgpu 0000:c3:00.0: amdgpu: MAPPING_ERROR: 0x0
[55884.886368] amdgpu 0000:c3:00.0: amdgpu: RW: 0x0
[55884.886401] amdgpu 0000:c3:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:32 vmid:0 pasid:0, for process pid 0 thread pid 0)
[55884.888229] amdgpu 0000:c3:00.0: amdgpu: in page starting at address 0x0000000000000000 from client 10
[55884.889186] amdgpu 0000:c3:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000
[55884.890178] amdgpu 0000:c3:00.0: amdgpu: Faulty UTCL2 client ID: CB/DB (0x0)
[55884.891085] amdgpu 0000:c3:00.0: amdgpu: MORE_FAULTS: 0x0
[55884.891970] amdgpu 0000:c3:00.0: amdgpu: WALKER_ERROR: 0x0
[55884.892844] amdgpu 0000:c3:00.0: amdgpu: PERMISSION_FAULTS: 0x0
[55884.893782] amdgpu 0000:c3:00.0: amdgpu: MAPPING_ERROR: 0x0
[55884.894651] amdgpu 0000:c3:00.0: amdgpu: RW: 0x0
[55885.154869] amdgpu: Free mem_obj = 00000000851f6ffb, range_start = 1, range_end = 8
[55885.154880] amdgpu: GFXOFF is enabled
[55885.155291] amdgpu: Unmap VA 0x702e0da49000 - 0x702e0da51000 from vm 00000000baebac5e
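
The GCVM_L2_PROTECTION_FAULT_STATUS values in the log above can be cross-checked against the decoded fields the driver prints beneath each one. The following is a minimal Python sketch, not driver code. The bit layout in FIELDS is an assumption inferred from the raw/decoded pairs in this log, not taken from a register specification, and the names FIELDS and decode_fault_status are hypothetical.

```python
# Assumed field layout as (shift, width) in bits. This is inferred from the
# raw status words and decoded fields printed in the log above; treat it as
# a hypothesis, not as the authoritative register definition.
FIELDS = {
    "MORE_FAULTS": (0, 1),
    "WALKER_ERROR": (1, 3),
    "PERMISSION_FAULTS": (4, 4),
    "MAPPING_ERROR": (8, 1),
    "CID": (9, 9),    # printed in the log as "Faulty UTCL2 client ID"
    "RW": (18, 1),
    "VMID": (20, 4),  # assumption: agrees with vmid:8 and vmid:0 in the log
}


def decode_fault_status(status: int) -> dict:
    """Split a raw fault status word into the assumed bit fields."""
    return {
        name: (status >> shift) & ((1 << width) - 1)
        for name, (shift, width) in FIELDS.items()
    }


if __name__ == "__main__":
    # The three non-zero status words that appear in the log.
    for raw in (0x00800A30, 0x00040A41, 0x00040C40):
        fields = decode_fault_status(raw)
        decoded = ", ".join(f"{name}=0x{value:x}" for name, value in fields.items())
        print(f"0x{raw:08X}: {decoded}")
```

Under this assumed layout, 0x00800A30 decodes to client ID 0x5 with PERMISSION_FAULTS 0x3 and RW 0x0, which agrees with the first fault (CPC, vmid 8) printed in the log.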