https://bugs.freedesktop.org/show_bug.cgi?id=68178
Priority: medium
Bug ID: 68178
Assignee: dri-devel@lists.freedesktop.org
Summary: evergreen: hard lockup on suspend and resume with current firmware
Severity: major
Classification: Unclassified
OS: Linux (All)
Reporter: nine@detonation.org
Hardware: x86-64 (AMD64)
Status: NEW
Version: unspecified
Component: DRM/Radeon
Product: DRI

Created attachment 84133
  --> https://bugs.freedesktop.org/attachment.cgi?id=84133&action=edit
dmesg of my system for information
As soon as the up-to-date firmware package is installed, my system hangs on suspend and again on resume, with the screen turned off and the system not reacting to anything. I tested this on various kernels, the earliest being 3.7.10 (the current openSUSE kernel) and the latest being 3.11-rc5.
I tried to get more information, but there are no logs, the screen is turned off and even netconsole did not show more. Do you have suggestions about how to debug this or things that I can try to narrow it down?
My GPU is a:
01:00.0 VGA compatible controller: Advanced Micro Devices [AMD] nee ATI Redwood [Radeon HD 5670] (prog-if 00 [VGA controller])
        Subsystem: PC Partner Limited Device e151
        Flags: bus master, fast devsel, latency 0, IRQ 48
        Memory at e0000000 (64-bit, prefetchable) [size=256M]
        Memory at f4420000 (64-bit, non-prefetchable) [size=128K]
        I/O ports at e000 [size=256]
        Expansion ROM at f4400000 [disabled] [size=128K]
        Capabilities: [50] Power Management version 3
        Capabilities: [58] Express Legacy Endpoint, MSI 00
        Capabilities: [a0] MSI: Enable+ Count=1/1 Maskable- 64bit+
        Capabilities: [100] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
        Capabilities: [150] Advanced Error Reporting
        Kernel driver in use: radeon
Attaching dmesg of a running system just for info.
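One possible way to narrow down a suspend hang that leaves no logs is the kernel's staged suspend test interface, /sys/power/pm_test, which exists only on kernels built with PM debugging support. The following is a minimal sketch, not something taken from this report: it only reads the file and reports which test levels the running kernel offers. The function names are hypothetical helpers chosen for illustration.

```python
from pathlib import Path

# Present only on kernels built with PM debugging support.
PM_TEST = Path("/sys/power/pm_test")


def parse_pm_test(text):
    """Split the pm_test contents into (current_level, all_levels).

    The kernel lists every level on one line and marks the active
    one with square brackets, e.g. "[none] core devices freezer".
    """
    current = None
    available = []
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            token = token[1:-1]
            current = token
        available.append(token)
    return current, available


def read_pm_test(path=PM_TEST):
    """Read and parse pm_test; return (None, []) if it is unavailable."""
    try:
        return parse_pm_test(Path(path).read_text())
    except OSError:
        return None, []


if __name__ == "__main__":
    current, available = read_pm_test()
    if not available:
        print("pm_test not available on this kernel")
    else:
        print("current test level:", current)
        print("available levels:", ", ".join(available))
```

Writing a level such as devices to that file as root and then starting a suspend makes the kernel stop after the corresponding stage and resume again after a short delay, which may show whether the hang happens in the device suspend path or later.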
Alex Deucher agd5f@yahoo.com changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|evergreen: hard lockup on   |evergreen: hard lockup on
                   |suspend and resume with     |suspend and resume with dpm
                   |current firmware            |
--- Comment #1 from Alex Deucher agd5f@yahoo.com ---
It's not the firmware per se. It's probably the new dpm code. Do you still get the lockup when dpm is disabled?
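On kernels whose radeon driver has dpm support, dpm is controlled by the module's dpm parameter (for example radeon.dpm=0 on the kernel command line). As a rough sketch for confirming which setting is active before testing suspend, the script below reads that parameter from sysfs. The sysfs path is the standard location for module parameters, but the helper names and the wording of the descriptions are assumptions made for this example.

```python
from pathlib import Path

# Standard sysfs location for the radeon module's "dpm" parameter.
# The file is absent if radeon is not loaded or predates dpm support.
DPM_PARAM = Path("/sys/module/radeon/parameters/dpm")


def read_dpm_setting(param_path=DPM_PARAM):
    """Return the dpm parameter as an int, or None if it cannot be read."""
    try:
        return int(Path(param_path).read_text().strip())
    except (OSError, ValueError):
        return None


def describe_dpm(value):
    """Turn the raw parameter value into a short human-readable label."""
    if value is None:
        return "unknown (radeon not loaded or no dpm support)"
    if value == 0:
        return "disabled"
    if value == 1:
        return "enabled"
    return "driver default (value %d)" % value


if __name__ == "__main__":
    print("radeon dpm:", describe_dpm(read_dpm_setting()))
```

Running it once with dpm on and once with radeon.dpm=0 makes it easier to be sure which configuration a given suspend test actually used.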
--- Comment #2 from nine@detonation.org ---
(In reply to comment #1)
> It's not the firmware per se. It's probably the new dpm code. Do you still
> get the lockup when dpm is disabled?
Indeed. If I turn off dpm, I can suspend successfully and more or less successfully resume. But a few seconds after resume my X server dies. May be related or not. Attaching a dmesg taken after that.
It's strange, though, that with older kernels I still get hangs on suspend and resume even though they do not support dpm at all.
So how can I proceed?
--- Comment #3 from nine@detonation.org ---
Created attachment 84144
  --> https://bugs.freedesktop.org/attachment.cgi?id=84144&action=edit
dmesg after X server crash after suspend/resume without dpm
--- Comment #4 from Alex Deucher agd5f@yahoo.com ---
Sounds like this may be a problem independent of dpm. Have you ever had successful suspend and resume? When you say X crash, do you mean X hangs? System hangs? Segfault?
--- Comment #5 from nine@detonation.org ---
(In reply to comment #4)
> Sounds like this may be a problem independent of dpm. Have you ever had
> successful suspend and resume?

I've always (meaning > 5 years) had successful suspend and resume on this machine, until I updated the firmware files to test UVD and then dpm. With the original firmware contained in openSUSE's kernel-firmware-20130114git-1.2.1 package, suspend/resume works just fine. I started having problems immediately after updating the radeon firmware files from your FTP site. In the meantime, openSUSE shipped an update to the kernel-firmware package with which I see the same problems.

> When you say X crash, do you mean X hangs? system hangs? segfault?

I mean the X server terminated unexpectedly and I got thrown back to the login screen. Other than this message and the part about GPU lockup in the dmesg dump I posted, I could not find any messages.
--- Comment #6 from nine@detonation.org ---
It seems like I have two independent problems, which may explain the confusing results I got:
* With DPM enabled I get hard locks on suspend and sometimes on resume. After suspend I have to turn off power manually, but it seems to write the suspend image to disk successfully. On resume it sometimes locks with disabled output, sometimes it works.
* Regardless of whether DPM is enabled, I get X server crashes within minutes after a suspend/resume cycle. This happens with kernel 3.11.1 with current firmware. When I downgrade my kernel-firmware package to 20130114git-1.2.1, this problem vanishes. But the firmware may be the original cause. With the old firmware, for example, I do not have direct rendering or acceleration.
At least I found a logfile giving more information about the X server crash. Attaching.
Is there anything else I can do to debug these problems?
--- Comment #7 from nine@detonation.org ---
Created attachment 86476
  --> https://bugs.freedesktop.org/attachment.cgi?id=86476&action=edit
Xorg.0.log after X server crash after resume
nine@detonation.org changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #84144|0                           |1
        is obsolete|                            |

--- Comment #8 from nine@detonation.org ---
Created attachment 86477
  --> https://bugs.freedesktop.org/attachment.cgi?id=86477&action=edit
dmesg after X server crashed
--- Comment #9 from Alex Deucher agd5f@yahoo.com ---
The only thing that has changed in the ucode is the addition of new ucode for UVD and SMC. If you use the newer firmware package but remove the UVD and/or SMC ucode images, you should get the same behavior as with the old firmware package. Since dpm is not enabled by default, I think the problem is probably with UVD.
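The test suggested above, removing only the UVD and/or SMC images from the newer firmware package, could be sketched roughly as follows. This is an illustration under assumptions, not a procedure from the thread: it assumes the images live in /lib/firmware/radeon and are named with a _uvd.bin or _smc.bin suffix (CYPRESS_uvd.bin is the one file name that appears in this report), and the helper names and backup directory are hypothetical. It defaults to a dry run and moves files aside rather than deleting them.

```python
import shutil
from pathlib import Path

# Assumed install location of the radeon firmware images.
FIRMWARE_DIR = Path("/lib/firmware/radeon")


def find_ucode(firmware_dir=FIRMWARE_DIR, kinds=("uvd", "smc")):
    """Return firmware files whose names end in _<kind>.bin."""
    fw = Path(firmware_dir)
    found = []
    for kind in kinds:
        found.extend(sorted(fw.glob("*_%s.bin" % kind)))
    return found


def move_aside(files, backup_dir, dry_run=True):
    """Move files into backup_dir; with dry_run=True only report the plan.

    Returns a list of (source, target) pairs.
    """
    backup = Path(backup_dir)
    plan = []
    for src in files:
        target = backup / src.name
        if not dry_run:
            backup.mkdir(parents=True, exist_ok=True)
            shutil.move(str(src), str(target))
        plan.append((src, target))
    return plan


if __name__ == "__main__":
    # Dry run: list what would be moved without touching anything.
    for src, target in move_aside(find_ucode(kinds=("uvd",)),
                                  "/root/radeon-fw-backup"):
        print("would move", src, "->", target)
```

Moving the files back restores the original package state. If the radeon module is loaded from an initramfs, the initramfs probably needs to be regenerated as well, since it may carry its own copy of the firmware.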
--- Comment #10 from nine@detonation.org ---
Indeed! After removing CYPRESS_uvd.bin the X server crashes vanish. Only the hard locks with dpm remain.
--- Comment #11 from Lars lars+freedesktop@6xq.net ---
(In reply to comment #6)
> * Regardless of DPM enabled or not I get X server crashes within minutes
> after a suspend/resume cycle. This happens with kernel 3.11.1 with current
> firmware. When I downgrade my kernel-firmware package to 20130114git-1.2.1,
> this problem vanishes. But the firmware may be the original cause. With the
> old firmware I for example do not have direct rendering or acceleration.
I have the same problem with kernel 3.11.6 on Gentoo Linux (stable) with a Radeon HD 4650. Hibernation itself works, but shortly after resuming the machine Xorg crashes with a bus error. Additionally dmesg contains messages about GPU lockups before or after hibernating. Removing the uvd/smc blobs stops Xorg from crashing, but disables direct rendering (including XVideo, …). Note that Xorg does *not* crash after resuming from suspend to RAM.
Downgrading to 3.10 (with the same firmware package) does not solve the problem, so I’m back to 3.4.67 for now, which works (both hibernate/direct rendering) for me.
Alex Deucher agd5f@yahoo.com changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|evergreen: hard lockup on   |xserver crashes with uvd
                   |suspend and resume with dpm |
--- Comment #12 from nine@detonation.org ---
Good news: somewhere between kernel 3.12.1 and 3.13-rc2 the hard lockup on suspend got fixed! I've suspended several times now without a single lockup. On resume, though, I still get lockups about half of the time. This is without UVD firmware but with DPM active.
Martin Peres martin.peres@free.fr changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |MOVED

--- Comment #13 from Martin Peres martin.peres@free.fr ---
-- GitLab Migration Automatic Message --
This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.
You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/drm/amd/issues/375.