@wedesoft
Created June 15, 2023 19:59
dmesg output of amdgpu rendering failure
[ 3789.603274] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, signaled seq=428555, emitted seq=428557
[ 3789.603742] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process java pid 7811 thread java:cs0 pid 7848
[ 3789.604177] amdgpu 0000:05:00.0: amdgpu: GPU reset begin!
[ 3789.705737] ------------[ cut here ]------------
[ 3789.705743] WARNING: CPU: 0 PID: 4769 at drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c:656 amdgpu_irq_put+0x45/0x70 [amdgpu]
[ 3789.706167] Modules linked in: ctr ccm rfcomm xt_conntrack nft_chain_nat xt_MASQUERADE nf_nat nf_conntrack_netlink nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xfrm_user xfrm_algo xt_addrtype nft_compat nf_tables libcrc32c nfnetlink br_netfilter bridge stp llc snd_seq_dummy snd_hrtimer snd_seq snd_seq_device qrtr overlay cmac algif_hash algif_skcipher af_alg bnep binfmt_misc nls_ascii nls_cp437 vfat fat iwlmvm btusb mac80211 btrtl btbcm btintel btmtk intel_rapl_msr libarc4 intel_rapl_common bluetooth edac_mce_amd uvcvideo videobuf2_vmalloc iwlwifi kvm_amd snd_hda_codec_realtek videobuf2_memops snd_hda_codec_generic videobuf2_v4l2 jitterentropy_rng ledtrig_audio kvm cfg80211 irqbypass videobuf2_common snd_hda_codec_hdmi drbg videodev ghash_clmulni_intel ansi_cprng snd_hda_intel sha512_ssse3 snd_intel_dspcfg sha512_generic snd_intel_sdw_acpi ecdh_generic mc aesni_intel ecc snd_hda_codec crypto_simd cryptd rapl snd_pci_acp6x snd_hda_core snd_pci_acp5x snd_hwdep wdat_wdt pcspkr clevo_wmi(OE)
[ 3789.706264] snd_pcm snd_timer snd_rn_pci_acp3x sp5100_tco ccp snd_acp_config k10temp snd_soc_acpi snd watchdog soundcore snd_pci_acp3x rfkill rng_core clevo_acpi(OE) tuxedo_io(OE) tuxedo_keyboard(OE) led_class_multicolor ac sparse_keymap acpi_cpufreq joydev hid_multitouch serio_raw evdev msr parport_pc ppdev lp parport fuse dm_mod loop efi_pstore configfs efivarfs ip_tables x_tables autofs4 ext4 crc16 mbcache jbd2 crc32c_generic amdgpu gpu_sched drm_buddy i2c_algo_bit xhci_pci drm_display_helper nvme xhci_hcd cec nvme_core rc_core drm_ttm_helper t10_pi ttm r8169 sdhci_pci usbcore crc64_rocksoft drm_kms_helper cqhci crc64 realtek crc_t10dif crct10dif_generic sdhci mdio_devres video crc32_pclmul crct10dif_pclmul hid_generic drm psmouse crc32c_intel i2c_piix4 usb_common libphy crct10dif_common mmc_core battery i2c_hid_acpi i2c_hid wmi hid button
[ 3789.706363] CPU: 0 PID: 4769 Comm: kworker/u32:4 Tainted: G OE 6.1.0-9-amd64 #1 Debian 6.1.27-1
[ 3789.706368] Hardware name: TUXEDO TUXEDO Aura 15 Gen1/NL5xRU, BIOS 1.07.11RTR2 01/30/2023
[ 3789.706372] Workqueue: amdgpu-reset-dev drm_sched_job_timedout [gpu_sched]
[ 3789.706387] RIP: 0010:amdgpu_irq_put+0x45/0x70 [amdgpu]
[ 3789.706783] Code: 48 8b 4e 10 48 83 39 00 74 2c 89 d1 48 8d 04 88 8b 08 85 c9 74 14 f0 ff 08 b8 00 00 00 00 74 05 e9 00 d8 70 db e9 8b fd ff ff <0f> 0b b8 ea ff ff ff e9 ef d7 70 db b8 ea ff ff ff e9 e5 d7 70 db
[ 3789.706787] RSP: 0018:ffffa2fa42747c88 EFLAGS: 00010246
[ 3789.706791] RAX: ffff8e433aa65c70 RBX: 0000000000000001 RCX: 0000000000000000
[ 3789.706793] RDX: 0000000000000000 RSI: ffff8e4339aaf3e0 RDI: ffff8e4339aa0000
[ 3789.706796] RBP: ffff8e4339aa0000 R08: 0000000000000000 R09: ffff8e460f3160d0
[ 3789.706798] R10: ffffa2fa42747b80 R11: 0000000000000000 R12: ffff8e4339aaf3e0
[ 3789.706800] R13: ffff8e4339ab73f0 R14: ffff8e44ec6de800 R15: 0000000000000000
[ 3789.706803] FS: 0000000000000000(0000) GS:ffff8e45ff400000(0000) knlGS:0000000000000000
[ 3789.706806] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 3789.706809] CR2: 000039e608772000 CR3: 0000000019610000 CR4: 0000000000350ef0
[ 3789.706812] Call Trace:
[ 3789.706816] <TASK>
[ 3789.706819] sdma_v4_0_hw_fini+0x38/0xa0 [amdgpu]
[ 3789.707226] amdgpu_device_ip_suspend_phase2+0x107/0x1a0 [amdgpu]
[ 3789.707596] ? amdgpu_device_ip_suspend_phase1+0x75/0xe0 [amdgpu]
[ 3789.707965] amdgpu_device_ip_suspend+0x32/0x70 [amdgpu]
[ 3789.708334] amdgpu_device_pre_asic_reset+0xcf/0x290 [amdgpu]
[ 3789.708703] amdgpu_device_gpu_recover.cold+0x607/0xad4 [amdgpu]
[ 3789.709255] amdgpu_job_timedout+0x1d8/0x220 [amdgpu]
[ 3789.709695] ? psi_group_change+0x145/0x360
[ 3789.709704] ? __switch_to+0x106/0x410
[ 3789.709711] drm_sched_job_timedout+0x76/0x110 [gpu_sched]
[ 3789.709724] process_one_work+0x1c7/0x380
[ 3789.709732] worker_thread+0x4d/0x380
[ 3789.709738] ? rescuer_thread+0x3a0/0x3a0
[ 3789.709743] kthread+0xe9/0x110
[ 3789.709748] ? kthread_complete_and_exit+0x20/0x20
[ 3789.709753] ret_from_fork+0x22/0x30
[ 3789.709763] </TASK>
[ 3789.709765] ---[ end trace 0000000000000000 ]---
[ 3789.710227] ------------[ cut here ]------------
[ 3789.710229] WARNING: CPU: 0 PID: 4769 at drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c:656 amdgpu_irq_put+0x45/0x70 [amdgpu]
[ 3789.710630] Modules linked in: ctr ccm rfcomm xt_conntrack nft_chain_nat xt_MASQUERADE nf_nat nf_conntrack_netlink nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xfrm_user xfrm_algo xt_addrtype nft_compat nf_tables libcrc32c nfnetlink br_netfilter bridge stp llc snd_seq_dummy snd_hrtimer snd_seq snd_seq_device qrtr overlay cmac algif_hash algif_skcipher af_alg bnep binfmt_misc nls_ascii nls_cp437 vfat fat iwlmvm btusb mac80211 btrtl btbcm btintel btmtk intel_rapl_msr libarc4 intel_rapl_common bluetooth edac_mce_amd uvcvideo videobuf2_vmalloc iwlwifi kvm_amd snd_hda_codec_realtek videobuf2_memops snd_hda_codec_generic videobuf2_v4l2 jitterentropy_rng ledtrig_audio kvm cfg80211 irqbypass videobuf2_common snd_hda_codec_hdmi drbg videodev ghash_clmulni_intel ansi_cprng snd_hda_intel sha512_ssse3 snd_intel_dspcfg sha512_generic snd_intel_sdw_acpi ecdh_generic mc aesni_intel ecc snd_hda_codec crypto_simd cryptd rapl snd_pci_acp6x snd_hda_core snd_pci_acp5x snd_hwdep wdat_wdt pcspkr clevo_wmi(OE)
[ 3789.710721] snd_pcm snd_timer snd_rn_pci_acp3x sp5100_tco ccp snd_acp_config k10temp snd_soc_acpi snd watchdog soundcore snd_pci_acp3x rfkill rng_core clevo_acpi(OE) tuxedo_io(OE) tuxedo_keyboard(OE) led_class_multicolor ac sparse_keymap acpi_cpufreq joydev hid_multitouch serio_raw evdev msr parport_pc ppdev lp parport fuse dm_mod loop efi_pstore configfs efivarfs ip_tables x_tables autofs4 ext4 crc16 mbcache jbd2 crc32c_generic amdgpu gpu_sched drm_buddy i2c_algo_bit xhci_pci drm_display_helper nvme xhci_hcd cec nvme_core rc_core drm_ttm_helper t10_pi ttm r8169 sdhci_pci usbcore crc64_rocksoft drm_kms_helper cqhci crc64 realtek crc_t10dif crct10dif_generic sdhci mdio_devres video crc32_pclmul crct10dif_pclmul hid_generic drm psmouse crc32c_intel i2c_piix4 usb_common libphy crct10dif_common mmc_core battery i2c_hid_acpi i2c_hid wmi hid button
[ 3789.710808] CPU: 0 PID: 4769 Comm: kworker/u32:4 Tainted: G W OE 6.1.0-9-amd64 #1 Debian 6.1.27-1
[ 3789.710813] Hardware name: TUXEDO TUXEDO Aura 15 Gen1/NL5xRU, BIOS 1.07.11RTR2 01/30/2023
[ 3789.710815] Workqueue: amdgpu-reset-dev drm_sched_job_timedout [gpu_sched]
[ 3789.710829] RIP: 0010:amdgpu_irq_put+0x45/0x70 [amdgpu]
[ 3789.711219] Code: 48 8b 4e 10 48 83 39 00 74 2c 89 d1 48 8d 04 88 8b 08 85 c9 74 14 f0 ff 08 b8 00 00 00 00 74 05 e9 00 d8 70 db e9 8b fd ff ff <0f> 0b b8 ea ff ff ff e9 ef d7 70 db b8 ea ff ff ff e9 e5 d7 70 db
[ 3789.711223] RSP: 0018:ffffa2fa42747c80 EFLAGS: 00010246
[ 3789.711226] RAX: ffff8e433aa656c0 RBX: ffff8e4339aa0000 RCX: 0000000000000000
[ 3789.711228] RDX: 0000000000000000 RSI: ffff8e4339aabb80 RDI: ffff8e4339aa0000
[ 3789.711231] RBP: ffff8e4339aa0000 R08: 0000000000000000 R09: ffff8e460f3160d0
[ 3789.711233] R10: ffffa2fa42747b80 R11: 0000000000000000 R12: 0000000000001050
[ 3789.711235] R13: ffff8e4339ab73f0 R14: ffff8e44ec6de800 R15: 0000000000000000
[ 3789.711238] FS: 0000000000000000(0000) GS:ffff8e45ff400000(0000) knlGS:0000000000000000
[ 3789.711241] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 3789.711243] CR2: 000039e608772000 CR3: 0000000019610000 CR4: 0000000000350ef0
[ 3789.711246] Call Trace:
[ 3789.711249] <TASK>
[ 3789.711250] gfx_v9_0_hw_fini+0x1c/0x6e0 [amdgpu]
[ 3789.711646] amdgpu_device_ip_suspend_phase2+0x107/0x1a0 [amdgpu]
[ 3789.712016] ? amdgpu_device_ip_suspend_phase1+0x75/0xe0 [amdgpu]
[ 3789.712384] amdgpu_device_ip_suspend+0x32/0x70 [amdgpu]
[ 3789.712752] amdgpu_device_pre_asic_reset+0xcf/0x290 [amdgpu]
[ 3789.713121] amdgpu_device_gpu_recover.cold+0x607/0xad4 [amdgpu]
[ 3789.713669] amdgpu_job_timedout+0x1d8/0x220 [amdgpu]
[ 3789.714107] ? psi_group_change+0x145/0x360
[ 3789.714114] ? __switch_to+0x106/0x410
[ 3789.714120] drm_sched_job_timedout+0x76/0x110 [gpu_sched]
[ 3789.714133] process_one_work+0x1c7/0x380
[ 3789.714140] worker_thread+0x4d/0x380
[ 3789.714146] ? rescuer_thread+0x3a0/0x3a0
[ 3789.714151] kthread+0xe9/0x110
[ 3789.714156] ? kthread_complete_and_exit+0x20/0x20
[ 3789.714161] ret_from_fork+0x22/0x30
[ 3789.714170] </TASK>
[ 3789.714171] ---[ end trace 0000000000000000 ]---
[ 3789.720480] [drm] psp gfx command UNLOAD_TA(0x2) failed and response status is (0x117)
[ 3789.720486] amdgpu 0000:05:00.0: amdgpu: free PSP TMR buffer
[ 3789.748490] ------------[ cut here ]------------
[ 3789.748493] WARNING: CPU: 1 PID: 4769 at drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c:656 amdgpu_irq_put+0x45/0x70 [amdgpu]
[ 3789.748916] Modules linked in: ctr ccm rfcomm xt_conntrack nft_chain_nat xt_MASQUERADE nf_nat nf_conntrack_netlink nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xfrm_user xfrm_algo xt_addrtype nft_compat nf_tables libcrc32c nfnetlink br_netfilter bridge stp llc snd_seq_dummy snd_hrtimer snd_seq snd_seq_device qrtr overlay cmac algif_hash algif_skcipher af_alg bnep binfmt_misc nls_ascii nls_cp437 vfat fat iwlmvm btusb mac80211 btrtl btbcm btintel btmtk intel_rapl_msr libarc4 intel_rapl_common bluetooth edac_mce_amd uvcvideo videobuf2_vmalloc iwlwifi kvm_amd snd_hda_codec_realtek videobuf2_memops snd_hda_codec_generic videobuf2_v4l2 jitterentropy_rng ledtrig_audio kvm cfg80211 irqbypass videobuf2_common snd_hda_codec_hdmi drbg videodev ghash_clmulni_intel ansi_cprng snd_hda_intel sha512_ssse3 snd_intel_dspcfg sha512_generic snd_intel_sdw_acpi ecdh_generic mc aesni_intel ecc snd_hda_codec crypto_simd cryptd rapl snd_pci_acp6x snd_hda_core snd_pci_acp5x snd_hwdep wdat_wdt pcspkr clevo_wmi(OE)
[ 3789.749016] snd_pcm snd_timer snd_rn_pci_acp3x sp5100_tco ccp snd_acp_config k10temp snd_soc_acpi snd watchdog soundcore snd_pci_acp3x rfkill rng_core clevo_acpi(OE) tuxedo_io(OE) tuxedo_keyboard(OE) led_class_multicolor ac sparse_keymap acpi_cpufreq joydev hid_multitouch serio_raw evdev msr parport_pc ppdev lp parport fuse dm_mod loop efi_pstore configfs efivarfs ip_tables x_tables autofs4 ext4 crc16 mbcache jbd2 crc32c_generic amdgpu gpu_sched drm_buddy i2c_algo_bit xhci_pci drm_display_helper nvme xhci_hcd cec nvme_core rc_core drm_ttm_helper t10_pi ttm r8169 sdhci_pci usbcore crc64_rocksoft drm_kms_helper cqhci crc64 realtek crc_t10dif crct10dif_generic sdhci mdio_devres video crc32_pclmul crct10dif_pclmul hid_generic drm psmouse crc32c_intel i2c_piix4 usb_common libphy crct10dif_common mmc_core battery i2c_hid_acpi i2c_hid wmi hid button
[ 3789.749119] CPU: 1 PID: 4769 Comm: kworker/u32:4 Tainted: G W OE 6.1.0-9-amd64 #1 Debian 6.1.27-1
[ 3789.749125] Hardware name: TUXEDO TUXEDO Aura 15 Gen1/NL5xRU, BIOS 1.07.11RTR2 01/30/2023
[ 3789.749128] Workqueue: amdgpu-reset-dev drm_sched_job_timedout [gpu_sched]
[ 3789.749143] RIP: 0010:amdgpu_irq_put+0x45/0x70 [amdgpu]
[ 3789.749592] Code: 48 8b 4e 10 48 83 39 00 74 2c 89 d1 48 8d 04 88 8b 08 85 c9 74 14 f0 ff 08 b8 00 00 00 00 74 05 e9 00 d8 70 db e9 8b fd ff ff <0f> 0b b8 ea ff ff ff e9 ef d7 70 db b8 ea ff ff ff e9 e5 d7 70 db
[ 3789.749595] RSP: 0018:ffffa2fa42747c98 EFLAGS: 00010246
[ 3789.749599] RAX: ffff8e433aa65820 RBX: ffff8e4339aa0000 RCX: 0000000000000000
[ 3789.749602] RDX: 0000000000000000 RSI: ffff8e4339aa24d8 RDI: ffff8e4339aa0000
[ 3789.749604] RBP: ffff8e4339aa0000 R08: 0000000000000000 R09: 0000000000000000
[ 3789.749607] R10: 0000000000000001 R11: 0000000000000001 R12: 0000000000001050
[ 3789.749609] R13: ffff8e4339ab73f0 R14: ffff8e44ec6de800 R15: 0000000000000000
[ 3789.749611] FS: 0000000000000000(0000) GS:ffff8e45ff440000(0000) knlGS:0000000000000000
[ 3789.749614] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 3789.749617] CR2: 00007f5a70ff5000 CR3: 000000010d2c0000 CR4: 0000000000350ee0
[ 3789.749620] Call Trace:
[ 3789.749624] <TASK>
[ 3789.749626] gmc_v9_0_hw_fini+0x60/0xa0 [amdgpu]
[ 3789.750019] amdgpu_device_ip_suspend_phase2+0x107/0x1a0 [amdgpu]
[ 3789.750390] ? amdgpu_device_ip_suspend_phase1+0x75/0xe0 [amdgpu]
[ 3789.750759] amdgpu_device_ip_suspend+0x32/0x70 [amdgpu]
[ 3789.751129] amdgpu_device_pre_asic_reset+0xcf/0x290 [amdgpu]
[ 3789.751498] amdgpu_device_gpu_recover.cold+0x607/0xad4 [amdgpu]
[ 3789.752003] amdgpu_job_timedout+0x1d8/0x220 [amdgpu]
[ 3789.752448] ? psi_group_change+0x145/0x360
[ 3789.752456] ? __switch_to+0x106/0x410
[ 3789.752462] drm_sched_job_timedout+0x76/0x110 [gpu_sched]
[ 3789.752476] process_one_work+0x1c7/0x380
[ 3789.752483] worker_thread+0x4d/0x380
[ 3789.752490] ? rescuer_thread+0x3a0/0x3a0
[ 3789.752495] kthread+0xe9/0x110
[ 3789.752499] ? kthread_complete_and_exit+0x20/0x20
[ 3789.752505] ret_from_fork+0x22/0x30
[ 3789.752514] </TASK>
[ 3789.752515] ---[ end trace 0000000000000000 ]---
[ 3789.752551] amdgpu 0000:05:00.0: amdgpu: MODE2 reset
[ 3789.752618] amdgpu 0000:05:00.0: amdgpu: GPU reset succeeded, trying to resume
[ 3789.752773] [drm] PCIE GART of 1024M enabled.
[ 3789.752775] [drm] PTB located at 0x000000F41FC00000
[ 3789.752854] [drm] PSP is resuming...
[ 3789.772722] [drm] reserve 0x400000 from 0xf41f800000 for PSP TMR
[ 3790.047891] amdgpu 0000:05:00.0: amdgpu: RAS: optional ras ta ucode is not available
[ 3790.056777] amdgpu 0000:05:00.0: amdgpu: RAP: optional rap ta ucode is not available
[ 3790.061365] [drm] psp gfx command LOAD_TA(0x1) failed and response status is (0x7)
[ 3790.061600] [drm] psp gfx command INVOKE_CMD(0x3) failed and response status is (0x4)
[ 3790.061606] amdgpu 0000:05:00.0: amdgpu: Secure display: Generic Failure.
[ 3790.061611] amdgpu 0000:05:00.0: amdgpu: SECUREDISPLAY: query securedisplay TA failed. ret 0x0
[ 3790.061617] amdgpu 0000:05:00.0: amdgpu: SMU is resuming...
[ 3790.062111] amdgpu 0000:05:00.0: amdgpu: SMU is resumed successfully!
[ 3790.062668] [drm] DMUB hardware initialized: version=0x01010026
[ 3790.667814] [drm] kiq ring mec 2 pipe 1 q 0
[ 3790.670306] [drm] VCN decode and encode initialized successfully(under DPG Mode).
[ 3790.670353] [drm] JPEG decode initialized successfully.
[ 3790.670356] amdgpu 0000:05:00.0: amdgpu: ring gfx uses VM inv eng 0 on hub 0
[ 3790.670358] amdgpu 0000:05:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 1 on hub 0
[ 3790.670359] amdgpu 0000:05:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 4 on hub 0
[ 3790.670360] amdgpu 0000:05:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 5 on hub 0
[ 3790.670361] amdgpu 0000:05:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 6 on hub 0
[ 3790.670362] amdgpu 0000:05:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 7 on hub 0
[ 3790.670363] amdgpu 0000:05:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 8 on hub 0
[ 3790.670364] amdgpu 0000:05:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 9 on hub 0
[ 3790.670365] amdgpu 0000:05:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 10 on hub 0
[ 3790.670366] amdgpu 0000:05:00.0: amdgpu: ring kiq_2.1.0 uses VM inv eng 11 on hub 0
[ 3790.670367] amdgpu 0000:05:00.0: amdgpu: ring sdma0 uses VM inv eng 0 on hub 1
[ 3790.670368] amdgpu 0000:05:00.0: amdgpu: ring vcn_dec uses VM inv eng 1 on hub 1
[ 3790.670369] amdgpu 0000:05:00.0: amdgpu: ring vcn_enc0 uses VM inv eng 4 on hub 1
[ 3790.670370] amdgpu 0000:05:00.0: amdgpu: ring vcn_enc1 uses VM inv eng 5 on hub 1
[ 3790.670371] amdgpu 0000:05:00.0: amdgpu: ring jpeg_dec uses VM inv eng 6 on hub 1
[ 3790.672513] amdgpu 0000:05:00.0: amdgpu: recover vram bo from shadow start
[ 3790.672515] amdgpu 0000:05:00.0: amdgpu: recover vram bo from shadow done
[ 3790.672518] [drm] Skip scheduling IBs!
[ 3790.672519] [drm] Skip scheduling IBs!
[ 3790.672530] amdgpu 0000:05:00.0: amdgpu: GPU reset(2) succeeded!
[ 3790.674981] [drm] Skip scheduling IBs!
[ 3790.674987] [drm] Skip scheduling IBs!
[ 3790.685118] [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!