https://bugs.freedesktop.org/show_bug.cgi?id=100979
Bug ID: 100979
Summary: Radeon r4 on a6-6310(BEEMA) APU hard lockup on hibernate and on second resume from suspend
Product: DRI
Version: DRI git
Hardware: x86-64 (AMD64)
OS: Linux (All)
Status: NEW
Severity: major
Priority: medium
Component: DRM/AMDgpu
Assignee: dri-devel@lists.freedesktop.org
Reporter: soprwa@gmail.com
I'm using kernel 4.11 on Gentoo with SI/CIK support enabled, on a Lenovo G50-45 notebook. The machine has an AMD A6-6310 APU with Radeon R4 graphics (Beema/Mullins). This CPU supports only AMD IOMMU v1. There is no discrete graphics card in it, only the APU.
When I try to hibernate this notebook, it does not turn off. I have to press the power button to reboot the machine. The situation is similar with the "radeon" driver, and I have submitted a bug report about that on the kernel's Bugzilla:
https://bugzilla.kernel.org/show_bug.cgi?id=191571
But in this case I'm unable to bisect because, as far as I remember, the problem has always occurred with the amdgpu driver, so I think the two may be related (dmesg from the hibernation process attached).
As for suspend: I can suspend and resume the machine successfully only once. The second time, the machine suspends correctly, but on resume I get a hard lockup, the fans spin at full speed, and I cannot do anything but press the power button to reboot (cold boot) the netbook. Moreover, in dmesg after the first suspend I get these error messages:
[drm:amdgpu_atombios_dp_link_train [amdgpu]] *ERROR* displayport link status failed
[drm:amdgpu_atombios_dp_link_train [amdgpu]] *ERROR* clock recovery failed
Attachments:
1. Log after clean start.
2. Kernel's config file.
3. Log after hibernation process.
4. Log after first suspend/resume.
5. Log after second suspend/resume.
--- Comment #1 from Przemek soprwa@gmail.com ---
Created attachment 131281 --> https://bugs.freedesktop.org/attachment.cgi?id=131281&action=edit
system log after clean boot
--- Comment #2 from Przemek soprwa@gmail.com ---
Created attachment 131282 --> https://bugs.freedesktop.org/attachment.cgi?id=131282&action=edit
system log after performing hibernate
--- Comment #3 from Przemek soprwa@gmail.com ---
Created attachment 131283 --> https://bugs.freedesktop.org/attachment.cgi?id=131283&action=edit
kernel config file
--- Comment #4 from Przemek soprwa@gmail.com ---
Created attachment 131284 --> https://bugs.freedesktop.org/attachment.cgi?id=131284&action=edit
system log after first suspend
--- Comment #5 from Przemek soprwa@gmail.com ---
Created attachment 131285 --> https://bugs.freedesktop.org/attachment.cgi?id=131285&action=edit
system log after second suspend and hard lockup
--- Comment #6 from Daniel daniel@danieltippmann.de ---
The situation over here is maybe related:
Hibernating/suspend-to-disk/s4 on my new AMD Carrizo box fails.
Upon hibernate, the system just reboots. No hints in the logs.
Suspend-to-RAM/S3 works fine.
When I disable amdgpu, hibernate & resume work fine.
Linux-4.11.3-gentoo-x86_64-AMD_A12-9800_RADEON_R7,_12_COMPUTE_CORES_4C+8G-with-gentoo-2.3, sys-kernel/linux-firmware-20170519
Cheers, Daniel
--- Comment #7 from Daniel daniel@danieltippmann.de ---
Now on kernel 4.11.7 and linux-firmware-20170622, still no good.
--- Comment #8 from Przemek soprwa@gmail.com ---
Created attachment 132411 --> https://bugs.freedesktop.org/attachment.cgi?id=132411&action=edit
dmesg after first suspend/resume process kernel 4.12
The situation still persists on kernel 4.12. After the second suspend, the machine cannot resume. I've attached dmesg from after the first suspend/resume cycle.
If I can help or test patches, please let me know. Thanks for your effort, Przemek.
--- Comment #9 from Przemek soprwa@gmail.com ---
I have just upgraded the kernel to 4.15. There is big progress: the laptop can now successfully suspend (S3) and resume many times in a row.
_Thank you very much for your hard work_.
Unfortunately, hibernate to disk (S4) still does not work as expected. The process causes a hard lockup (system freeze) on the very first attempt.
The display goes black (backlight stays on), the CPU gets hot (fans run at 100% rpm), and I can do nothing more than press the power button to hard-reset the machine.
The "amdgpu_atombios_dp_link_train" message no longer appears in dmesg. Instead there are messages related to "swiotlb buffer is full" and "swiotlb: coherent allocation failed", as in this bug: https://bugs.freedesktop.org/show_bug.cgi?id=104082.
Thanks, Przemek
--- Comment #10 from Przemek soprwa@gmail.com ---
After some research, I think that the messages "swiotlb buffer is full" and "swiotlb: coherent allocation failed" are not related to this bug:
https://lkml.org/lkml/2018/1/16/106
--- Comment #11 from Przemek soprwa@gmail.com ---
Created attachment 137085 --> https://bugs.freedesktop.org/attachment.cgi?id=137085&action=edit
kernel log during hibernate
Kernel log taken during the hibernate process. The netbook was booted with "initcall_debug" and "no_console_suspend" on the kernel command line.
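A minimal sketch, in Python, of checking that both parameters named above are present on a kernel command line before capturing such a log. The helper names `missing_debug_params` and `read_cmdline` are hypothetical, the sample command line is made up, and reading `/proc/cmdline` is an assumption about how one might verify the running kernel; none of this comes from the original report.

```python
from pathlib import Path
from typing import List

# The two parameters the attached log was captured with (from the comment above).
REQUIRED_PARAMS = ("initcall_debug", "no_console_suspend")


def missing_debug_params(cmdline: str) -> List[str]:
    """Return the required parameters that are absent from a kernel command line."""
    present = set(cmdline.split())
    return [param for param in REQUIRED_PARAMS if param not in present]


def read_cmdline(path: str = "/proc/cmdline") -> str:
    """Read the running kernel's command line (Linux only; path is an assumption)."""
    return Path(path).read_text()


# Made-up command line for illustration; on a real system one would pass
# read_cmdline() instead.
sample = "BOOT_IMAGE=/vmlinuz root=/dev/sda1 initcall_debug"
print(missing_debug_params(sample))
```

An empty list means both parameters are set; anything else names what still has to be added to the bootloader entry.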
--- Comment #12 from Przemek soprwa@gmail.com ---
The correct mailing list post regarding the messages "swiotlb buffer is full" and "swiotlb: coherent allocation failed" is: https://lkml.org/lkml/2018/1/10/132. Thanks to Alex Deucher for correcting me in another bug report.
--- Comment #13 from Przemek soprwa@gmail.com ---
Since I was working on another bug report involving the AMD DC kernel driver and had cloned the agd5f drm-next-4.17-wip branch, I took the opportunity to debug the hibernate process with this experimental kernel as well.
This time the hibernation image is created, and the laptop turns off afterwards.
However, during the hibernation process the screen goes completely white, and the process takes approximately 15-20 seconds on an SSD before the netbook powers off.
During resume, the hibernation image is read (the percentage is visible), then the screen goes white and the machine locks up. I can do nothing but press the power button to hard-reset the netbook.
Moreover, I found these error messages in the kernel log, but only once:
Mar 28 01:36:54 eclipse kernel: [ 86.665473] [drm:drm_atomic_helper_wait_for_flip_done] *ERROR* [CRTC:39:crtc-0] flip_done timed out
Mar 28 01:36:54 eclipse kernel: [ 86.828685] [drm:gfx_v7_0_ring_test_ring] *ERROR* amdgpu: ring 0 test failed (scratch(0xC040)=0xCAFEDEAD)
Mar 28 01:36:54 eclipse kernel: [ 86.828698] [drm:amdgpu_device_ip_resume_phase2] *ERROR* resume of IP block <gfx_v7_0> failed -22
Mar 28 01:36:54 eclipse kernel: [ 86.828703] [drm:amdgpu_device_resume] *ERROR* amdgpu_device_ip_resume failed (-22).
Mar 28 01:36:54 eclipse kernel: [ 86.828718] dpm_run_callback(): pci_pm_restore+0x0/0xa0 returns -22
Mar 28 01:36:54 eclipse kernel: [ 86.828742] PM: Device 0000:00:01.0 failed to restore async: error -22
Of course, I am unable to reproduce it. I am not sure how closely it is related, but it could be useful nonetheless.
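Because the errors above appeared only once, a small filter can help spot them in later logs. The following is a minimal sketch in Python that pulls `[drm:...] *ERROR*` entries out of kernel log lines. The function name `drm_errors` and the regular expression are hypothetical and based only on the line shapes quoted in this report, so other kernels may format these messages differently.

```python
import re
from typing import Iterable, List, Tuple

# Matches "[drm:<function>] *ERROR* <message>" as well as the older
# "[drm:<function> [<module>]] *ERROR* <message>" shape seen earlier in this report.
DRM_ERROR = re.compile(
    r"\[drm:(?P<func>\w+)(?: \[(?P<module>\w+)\])?\] \*ERROR\* (?P<msg>.*)"
)


def drm_errors(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Return (function, message) pairs for every DRM error line found."""
    found = []
    for line in lines:
        match = DRM_ERROR.search(line)
        if match:
            found.append((match.group("func"), match.group("msg").strip()))
    return found


# Two lines copied from the log quoted above; the second is not a DRM error.
sample_log = [
    "Mar 28 01:36:54 eclipse kernel: [ 86.828685] [drm:gfx_v7_0_ring_test_ring] "
    "*ERROR* amdgpu: ring 0 test failed (scratch(0xC040)=0xCAFEDEAD)",
    "Mar 28 01:36:54 eclipse kernel: [ 86.828718] dpm_run_callback(): "
    "pci_pm_restore+0x0/0xa0 returns -22",
]

for func, msg in drm_errors(sample_log):
    print(f"{func}: {msg}")
```

The same function could be fed the lines of a saved dmesg file, which makes it easier to compare attempts across kernel versions.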
Kernel was booted up with amdgpu.dc=1.
Any help is appreciated.
Thanks, Przemek.
Przemek soprwa@gmail.com changed:
What       |Removed |Added
----------------------------------------------------------------------------
Status     |NEW     |RESOLVED
Resolution |---     |FIXED
--- Comment #14 from Przemek soprwa@gmail.com ---
I have tested hibernation on amd-staging-drm-next (git status - commit fa16d1eb6a78b265480bd4c2b8739c1ea261cdd8) and it works as it should (with a minor glitch).
Moreover, I can suspend and resume the machine many times in a row without a freeze or lockup.
I have no idea which commit made things work again, because I haven't checked this feature lately.
The only problem is that the second monitor, connected to the HDMI output, turns off after suspend/hibernate, and the eDP screen brightness is maxed out after resume from hibernate, but that is outside the scope of this report.
Given the above, I'm closing this bug report as RESOLVED/FIXED.
Thank you very much, Przemek.
--- Comment #15 from Przemek soprwa@gmail.com ---
Just to correct myself: after resume from hibernate, both screens (eDP and HDMI) light up as they should.