https://bugs.freedesktop.org/show_bug.cgi?id=107689
Bug ID: 107689
Summary: System freezes on shutdown. [drm:gfx_v8_0_hw_fini [amdgpu]] *ERROR* KCQ disabled failed (scratch(0xC040)=0xCAFEDEAD)
Product: DRI
Version: unspecified
Hardware: x86-64 (AMD64)
OS: Linux (All)
Status: NEW
Severity: blocker
Priority: medium
Component: DRM/AMDgpu
Assignee: dri-devel@lists.freedesktop.org
Reporter: john-s-84@gmx.net
A normal shutdown seems to be OK. After any suspend (pressing the stand-by button or closing the lid), I am not able to shut down successfully. The system hangs on shutdown.
Sorry for the double posting. I do not know the right place for this issue.
https://lists.freedesktop.org/archives/amd-gfx/2018-August/025818.html
Error-Log:
This is the bad output when shutting down after pressing the stand-by button:
[ 294.651066] [drm:gfx_v8_0_hw_fini [amdgpu]] *ERROR* KCQ disabled failed (scratch(0xC040)=0xCAFEDEAD)
This is the bad output after closing the lid:
[ 71.696123] [drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR* amdgpu: ring 0 test failed (scratch(0xC040)=0xCAFEDEAD)
[ 71.696378] [drm:amdgpu_device_ip_resume_phase2 [amdgpu]] *ERROR* resume of IP block <gfx_v8_0> failed -22
[ 71.696421] [drm:amdgpu_device_resume [amdgpu]] *ERROR* amdgpu_device_ip_resume failed (-22).
[ 87.431032] [drm:gfx_v8_0_hw_fini [amdgpu]] *ERROR* KCQ disabled failed (scratch(0xC040)=0xCAFEDEAD)
[ 87.521991] [drm:gfx_v8_0_hw_fini [amdgpu]] *ERROR* KCQ disabled failed (scratch(0xC040)=0xCAFEDEAD)
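For readers triaging logs like the one above, here is a minimal Python sketch that pulls the timestamp, reporting function, and message out of each amdgpu `*ERROR*` line and sorts the entries by time. The regular expression and the names `LINE_RE`, `parse_errors`, and `SAMPLE` are illustrative assumptions, not part of any kernel tooling; the sample text is copied from the lid-close log in this report. The earliest error is only a hint about where things first went wrong, not a diagnosis.

```python
import re

# Assumed line shape, based on the excerpt in this report:
#   [ 71.696123] [drm:<function> [amdgpu]] *ERROR* <message>
LINE_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+"
    r"\[drm:(?P<func>\w+) \[amdgpu\]\]\s+\*ERROR\*\s+(?P<msg>.*)"
)


def parse_errors(log_text):
    """Return (timestamp, function, message) tuples sorted by timestamp."""
    events = []
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            events.append((float(match["ts"]), match["func"], match["msg"]))
    return sorted(events)


SAMPLE = """\
[ 71.696123] [drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR* amdgpu: ring 0 test failed (scratch(0xC040)=0xCAFEDEAD)
[ 71.696378] [drm:amdgpu_device_ip_resume_phase2 [amdgpu]] *ERROR* resume of IP block <gfx_v8_0> failed -22
[ 71.696421] [drm:amdgpu_device_resume [amdgpu]] *ERROR* amdgpu_device_ip_resume failed (-22).
[ 87.431032] [drm:gfx_v8_0_hw_fini [amdgpu]] *ERROR* KCQ disabled failed (scratch(0xC040)=0xCAFEDEAD)
[ 87.521991] [drm:gfx_v8_0_hw_fini [amdgpu]] *ERROR* KCQ disabled failed (scratch(0xC040)=0xCAFEDEAD)
"""

events = parse_errors(SAMPLE)
first_ts, first_func, first_msg = events[0]
print(f"earliest error at {first_ts}: {first_func}: {first_msg}")
```

On this excerpt the ring test failure during resume is logged about 16 seconds before the first `gfx_v8_0_hw_fini` message, which is consistent with the shutdown errors following a failed resume.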
--- Comment #1 from john-s-84@gmx.net ---
This log is from after the suspend:
[ 3091.111509] amdgpu: [powerplay]
[ 3091.600889] amdgpu: [powerplay]
[ 3092.085174] amdgpu: [powerplay]
[ 3092.547237] amdgpu: [powerplay]
[ 3093.005634] amdgpu: [powerplay]
[ 3093.464712] amdgpu: [powerplay]
[ 3093.928337] amdgpu: [powerplay]
[ 3094.391342] amdgpu: [powerplay]
[ 3094.850455] amdgpu: [powerplay]
[ 3095.774119] amdgpu: [powerplay]
[ 3096.239871] amdgpu: [powerplay]
[ 3096.711250] amdgpu: [powerplay]
[ 3097.172032] amdgpu: [powerplay]
[ 3097.631737] amdgpu: [powerplay]
[ 3098.090704] amdgpu: [powerplay]
[ 3098.550846] amdgpu: [powerplay]
[ 3099.013701] amdgpu: [powerplay]
[ 3099.476970] amdgpu: [powerplay]
[ 3099.941099] amdgpu: [powerplay]
[ 3100.404389] amdgpu: [powerplay]
[ 3100.867675] amdgpu: [powerplay]
[ 3101.326540] amdgpu: [powerplay]
[ 3101.785426] amdgpu: [powerplay]
[ 3102.709452] amdgpu: [powerplay]
[ 3103.167897] amdgpu: [powerplay]
[ 3104.091096] amdgpu: [powerplay]
[ 3104.554677] amdgpu: [powerplay]
[ 3105.018251] amdgpu: [powerplay]
[ 3105.481543] amdgpu: [powerplay]
[ 3106.397859] amdgpu: [powerplay]
[ 3106.859070] amdgpu: [powerplay]
[ 3107.319476] amdgpu: [powerplay]
[ 3107.778301] amdgpu: [powerplay]
[ 3108.696209] amdgpu: [powerplay]
[ 3109.155393] amdgpu: [powerplay]
[ 3110.071528] amdgpu: [powerplay]
[ 3110.529575] amdgpu: [powerplay]
[ 3110.989401] amdgpu: [powerplay]
[ 3111.447587] amdgpu: [powerplay]
[ 3112.363547] amdgpu: [powerplay]
[ 3112.824429] amdgpu: [powerplay]
[ 3113.741639] amdgpu: [powerplay]
[ 3114.202019] amdgpu: [powerplay]
[ 3114.665248] amdgpu: [powerplay]
[ 3115.127638] amdgpu: [powerplay]
[ 3116.045559] amdgpu: [powerplay]
[ 3116.503855] amdgpu: [powerplay]
[ 3116.966372] amdgpu: [powerplay]
[ 3117.424757] amdgpu: [powerplay]
[ 3118.351174] amdgpu: [powerplay]
[ 3118.814380] amdgpu: [powerplay]
[ 3119.740415] amdgpu: [powerplay]
[ 3120.204136] amdgpu: [powerplay]
[ 3120.414945] [drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR* amdgpu: ring 0 test failed (scratch(0xC040)=0xCAFEDEAD)
[ 3120.414962] [drm:amdgpu_device_ip_resume_phase2 [amdgpu]] *ERROR* resume of IP block <gfx_v8_0> failed -22
[ 3120.414978] [drm:amdgpu_device_resume [amdgpu]] *ERROR* amdgpu_device_ip_resume failed (-22).
[ 3125.907682] amdgpu: [powerplay]
[ 3126.371235] amdgpu: [powerplay]
[ 3127.636961] amdgpu: [powerplay]
[ 3128.109298] amdgpu: [powerplay]
[ 3129.044901] amdgpu: [powerplay]
[ 3129.504295] amdgpu: [powerplay]
[ 3130.429692] amdgpu: [powerplay]
[ 3130.893790] amdgpu: [powerplay]
[ 3131.815757] amdgpu: [powerplay]
[ 3132.274641] amdgpu: [powerplay]
[ 3133.193550] amdgpu: [powerplay]
[ 3133.651888] amdgpu: [powerplay]
[ 3134.568326] amdgpu: [powerplay]
[ 3135.028265] amdgpu: [powerplay]
[ 3135.957009] amdgpu: [powerplay]
[ 3136.437691] amdgpu: [powerplay]
[ 3136.658150] [drm:gfx_v8_0_hw_fini [amdgpu]] *ERROR* KCQ disabled failed (scratch(0xC040)=0xCAFEDEAD)
[ 3136.875305] [drm:gfx_v8_0_hw_fini [amdgpu]] *ERROR* KCQ disabled failed (scratch(0xC040)=0xCAFEDEAD)
[ 3137.092574] [drm:gfx_v8_0_hw_fini [amdgpu]] *ERROR* KCQ disabled failed (scratch(0xC040)=0xCAFEDEAD)
[ 3137.314115] [drm:gfx_v8_0_hw_fini [amdgpu]] *ERROR* KCQ disabled failed (scratch(0xC040)=0xCAFEDEAD)
[ 3137.540010] [drm:gfx_v8_0_hw_fini [amdgpu]] *ERROR* KCQ disabled failed (scratch(0xC040)=0xCAFEDEAD)
[ 3137.765326] [drm:gfx_v8_0_hw_fini [amdgpu]] *ERROR* KCQ disabled failed (scratch(0xC040)=0xCAFEDEAD)
[ 3137.982629] [drm:gfx_v8_0_hw_fini [amdgpu]] *ERROR* KCQ disabled failed (scratch(0xC040)=0xCAFEDEAD)
[ 3138.199617] [drm:gfx_v8_0_hw_fini [amdgpu]] *ERROR* KCQ disabled failed (scratch(0xC040)=0xCAFEDEAD)
[ 3138.678639] amdgpu: [powerplay]
[ 3139.171428] amdgpu: [powerplay]
[ 3139.668258] amdgpu: [powerplay]
[ 3140.164675] amdgpu: [powerplay]
[ 3140.657299] amdgpu: [powerplay]
[ 3141.151370] amdgpu: [powerplay]
[ 3142.138768] amdgpu: [powerplay]
[ 3142.602942] amdgpu: [powerplay]
[ 3143.062798] amdgpu: [powerplay]
[ 3143.521840] amdgpu: [powerplay]
[ 3144.442769] amdgpu: [powerplay]
[ 3144.906261] amdgpu: [powerplay]
[ 3145.858408] amdgpu: [powerplay]
[ 3146.320076] amdgpu: [powerplay]
--- Comment #2 from john-s-84@gmx.net ---
Created attachment 141300
  --> https://bugs.freedesktop.org/attachment.cgi?id=141300&action=edit
full dmesg log (includes closing lit)
--- Comment #3 from Andrey Grodzovsky andrey.grodzovsky@amd.com ---
(In reply to john-s-84 from comment #2)
> Created attachment 141300 [details]
> full dmesg log (includes closing lit)

I tried to reproduce the issues you report using a Lexa card but encountered other, different bugs. Using a Baffin ASIC, however, I was able to reproduce something that looks like what you are experiencing. Will investigate.
--- Comment #4 from Andrey Grodzovsky andrey.grodzovsky@amd.com ---
Created attachment 141310
  --> https://bugs.freedesktop.org/attachment.cgi?id=141310&action=edit
0001-drm-amdgpu-Only-retrieve-GPU-address-of-GART-table-a.patch

Please try with our latest kernel from here https://cgit.freedesktop.org/~agd5f/linux/log/?h=amd-staging-drm-next with the attached patch applied on top. Also, just to be sure, try to use the latest firmware for amdgpu from here https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/...
P.S. Don't forget to update your initramfs after you copy the firmware files to your /lib/firmware/XXX location.
john-s-84@gmx.net changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
 Attachment #141300|0                           |1
        is obsolete|                            |
--- Comment #5 from john-s-84@gmx.net ---
Created attachment 141365
  --> https://bugs.freedesktop.org/attachment.cgi?id=141365&action=edit
dmesg_applied_patch_0001-drm-amdgpu-Only-retrieve-GPU-address-of-GART-table

I tried the amd-staging-drm-next kernel and the latest firmware, and applied your patch. The patch does not have any effect. Shutdown still does not work. Please see the dmesg log.
--- Comment #6 from Andrey Grodzovsky andrey.grodzovsky@amd.com ---
I noticed an "amdgpu 0000:01:00.0: GPU pci config reset" print in the log, long before the suspend. Did you manually trigger a device reset before the suspend?
--- Comment #7 from john-s-84@gmx.net ---
What I have done:
- Applied the patches
- Pressed the power-on button for suspend
- Pressed the power-on button again for resume
- Copied and pasted the dmesg log

How critical are these errors, and which of them causes the error state?
1. amdgpu: [powerplay] Voltage value looks like a Leakage ID but it's not patched
2. amdgpu: [powerplay] failed to send message 254 ret is 0
3. [drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR* amdgpu: ring 0 test failed (scratch(0xC040)=0xCAFEDEAD)
   [drm:amdgpu_device_ip_resume_phase2 [amdgpu]] *ERROR* resume of IP block <gfx_v8_0> failed -22
   [drm:amdgpu_device_resume [amdgpu]] *ERROR* amdgpu_device_ip_resume failed (-22).
   [drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR* amdgpu: ring 9 test failed (scratch(0xC040)=0xCAFEDEAD)
   [drm:gfx_v8_0_hw_fini [amdgpu]] *ERROR* KCQ disable failed
4. amdgpu 0000:01:00.0: kfd not supported on this ASIC
5. amdgpu: [powerplay] Failed to retrieve minimum clocks.
   amdgpu: [powerplay] Error in phm_get_clock_info
6. [drm:dc_create [amdgpu]] *ERROR* DC: Number of connectors is zero!
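One way to get an overview of a mixed list like this is to group the messages by where they come from. The Python sketch below is an illustration only: the labels `powerplay`, `drm:<function>`, and `other`, and the helper names `source_of` and `group_by_source`, are assumptions made for this example, and the message strings are copied from the list above. Grouping does not show which message is the root cause; it only separates the `[powerplay]` notices from the `[drm:...]` errors so each group can be looked at on its own.

```python
import re
from collections import defaultdict


def source_of(line):
    """Return a coarse, made-up source label for one amdgpu kernel message."""
    match = re.match(r"\[drm:(\w+) \[amdgpu\]\]", line)
    if match:
        return "drm:" + match.group(1)
    if line.startswith("amdgpu: [powerplay]"):
        return "powerplay"
    return "other"


def group_by_source(lines):
    """Map each source label to the list of messages carrying it."""
    groups = defaultdict(list)
    for line in lines:
        line = line.strip()
        if line:
            groups[source_of(line)].append(line)
    return dict(groups)


MESSAGES = [
    "amdgpu: [powerplay] Voltage value looks like a Leakage ID but it's not patched",
    "amdgpu: [powerplay] failed to send message 254 ret is 0",
    "[drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR* amdgpu: ring 0 test failed (scratch(0xC040)=0xCAFEDEAD)",
    "[drm:amdgpu_device_ip_resume_phase2 [amdgpu]] *ERROR* resume of IP block <gfx_v8_0> failed -22",
    "[drm:amdgpu_device_resume [amdgpu]] *ERROR* amdgpu_device_ip_resume failed (-22).",
    "[drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR* amdgpu: ring 9 test failed (scratch(0xC040)=0xCAFEDEAD)",
    "[drm:gfx_v8_0_hw_fini [amdgpu]] *ERROR* KCQ disable failed",
    "amdgpu 0000:01:00.0: kfd not supported on this ASIC",
    "amdgpu: [powerplay] Failed to retrieve minimum clocks.",
    "amdgpu: [powerplay] Error in phm_get_clock_info",
    "[drm:dc_create [amdgpu]] *ERROR* DC: Number of connectors is zero!",
]

groups = group_by_source(MESSAGES)
for source, msgs in groups.items():
    print(f"{source}: {len(msgs)} message(s)")
```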
Martin Peres martin.peres@free.fr changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |MOVED
             Status|NEW                         |RESOLVED
--- Comment #8 from Martin Peres martin.peres@free.fr ---
-- GitLab Migration Automatic Message --
This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.
You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/drm/amd/issues/491.