https://bugzilla.kernel.org/show_bug.cgi?id=87791
Bug ID: 87791
Summary: radeonsi lockup and oops
Product: Drivers
Version: 2.5
Kernel Version: 3.17
Hardware: x86-64
OS: Linux
Tree: Mainline
Status: NEW
Severity: normal
Priority: P1
Component: Video(DRI - non Intel)
Assignee: drivers_video-dri@kernel-bugs.osdl.org
Reporter: acab@digitalfuture.it
Regression: No
Created attachment 156761 --> https://bugzilla.kernel.org/attachment.cgi?id=156761&action=edit lockup example (no oops)
After upgrading mesa from mesa-10.3.0 to mesa-10.3.1 the Radeon HD 7700 card locks up several times a day without any specific trigger (or reliable way to reproduce it). Xorg appears frozen with just the mouse pointer moving. It's not even possible to switch to a VT, however everything else works just fine.

At any given time the X bt looks like this:

Thread 2 (Thread 0x7fe8147be700 (LWP 2415)):
#0 0x00007fe81b4d911c in pthread_cond_wait () from /lib64/libpthread.so.0
#1 0x00007fe816a807a3 in ?? () from /usr/lib64/dri/radeonsi_dri.so
#2 0x00007fe816a7ffc7 in ?? () from /usr/lib64/dri/radeonsi_dri.so
#3 0x00007fe81b4d5083 in start_thread () from /lib64/libpthread.so.0
#4 0x00007fe81b9da3ad in clone () from /lib64/libc.so.6

Thread 1 (Thread 0x7fe81d607880 (LWP 2380)):
#0 0x00007fe81b9836e9 in __memcpy_sse2_unaligned () from /lib64/libc.so.6
#1 0x00007fe816857a46 in ?? () from /usr/lib64/dri/radeonsi_dri.so
#2 0x00007fe816858742 in ?? () from /usr/lib64/dri/radeonsi_dri.so
#3 0x00007fe816859452 in ?? () from /usr/lib64/dri/radeonsi_dri.so
#4 0x00007fe8168abf13 in ?? () from /usr/lib64/dri/radeonsi_dri.so
#5 0x00007fe8168ac993 in ?? () from /usr/lib64/dri/radeonsi_dri.so
#6 0x00007fe81684ad74 in ?? () from /usr/lib64/dri/radeonsi_dri.so
#7 0x00007fe81684c510 in ?? () from /usr/lib64/dri/radeonsi_dri.so
#8 0x00007fe81a58b883 in ?? () from /usr/lib64/xorg/modules/libglamoregl.so
#9 0x00007fe81a58c64e in ?? () from /usr/lib64/xorg/modules/libglamoregl.so
#10 0x00007fe81a58d160 in ?? () from /usr/lib64/xorg/modules/libglamoregl.so
#11 0x00007fe81a58d81c in ?? () from /usr/lib64/xorg/modules/libglamoregl.so
#12 0x00007fe81a56c674 in ?? () from /usr/lib64/xorg/modules/libglamoregl.so
#13 0x00007fe81a56cd84 in ?? () from /usr/lib64/xorg/modules/libglamoregl.so
#14 0x00007fe81a56dd5e in ?? () from /usr/lib64/xorg/modules/libglamoregl.so
#15 0x00007fe81a56e6ba in ?? () from /usr/lib64/xorg/modules/libglamoregl.so
#16 0x0000000000563e2d in miCopyRegion ()
#17 0x00000000005643b6 in miDoCopy ()
#18 0x00007fe81a56e6fd in ?? () from /usr/lib64/xorg/modules/libglamoregl.so
#19 0x0000000000511828 in ?? ()
#20 0x0000000000432291 in ?? ()
#21 0x0000000000435d3e in ?? ()
#22 0x0000000000439b6a in ?? ()
#23 0x00007fe81b913a65 in __libc_start_main () from /lib64/libc.so.6
#24 0x000000000042531e in _start ()
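
Because most frames in the backtrace are unresolved (`??`), the per-library distribution is the main thing it tells us. The following is a minimal sketch, not part of the original report, that counts gdb frames per shared object. The function name `frames_per_module` and the `(main executable)` label are made up for illustration, and the regular expression assumes frames in exactly the `#N 0xADDR in FUNC () from PATH` shape shown above.

```python
import re
from collections import Counter

# Matches gdb frames such as:
#   #1 0x00007fe816857a46 in ?? () from /usr/lib64/dri/radeonsi_dri.so
#   #16 0x0000000000563e2d in miCopyRegion ()
# The "from PATH" part is optional: frames without it belong to the binary itself.
FRAME_RE = re.compile(
    r"#(?P<index>\d+)\s+0x[0-9a-fA-F]+\s+in\s+(?P<func>\S+)\s+\(\)"
    r"(?:\s+from\s+(?P<module>\S+))?"
)


def frames_per_module(backtrace):
    """Count frames per shared object (hypothetical helper, for illustration)."""
    counts = Counter()
    for match in FRAME_RE.finditer(backtrace):
        module = match.group("module") or "(main executable)"
        # Keep only the file name, e.g. "radeonsi_dri.so".
        counts[module.rsplit("/", 1)[-1]] += 1
    return counts


# A few frames copied from Thread 1 above.
sample = """
#0 0x00007fe81b9836e9 in __memcpy_sse2_unaligned () from /lib64/libc.so.6
#1 0x00007fe816857a46 in ?? () from /usr/lib64/dri/radeonsi_dri.so
#8 0x00007fe81a58b883 in ?? () from /usr/lib64/xorg/modules/libglamoregl.so
#16 0x0000000000563e2d in miCopyRegion ()
"""

for module, count in frames_per_module(sample).most_common():
    print(f"{module}: {count}")
```

Run against the full Thread 1 trace, this kind of summary makes it easier to see that the X server is inside a memcpy reached through glamor and radeonsi_dri.so. Installing debug symbols for those libraries would resolve the `??` frames and is more informative than any post-processing.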
--- Comment #1 from aCaB acab@digitalfuture.it ---
Created attachment 156771 --> https://bugzilla.kernel.org/attachment.cgi?id=156771&action=edit lockup example (with oops)
aCaB acab@digitalfuture.it changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
 Attachment #156771|application/octet-stream    |text/plain
          mime type|                            |
aCaB acab@digitalfuture.it changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
 Attachment #156761|application/octet-stream    |text/plain
          mime type|                            |
Alex Deucher alexdeucher@gmail.com changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |alexdeucher@gmail.com
--- Comment #2 from Alex Deucher alexdeucher@gmail.com ---
(In reply to aCaB from comment #0)
> Created attachment 156761 [details]
> lockup example (no oops)
>
> After upgrading mesa from mesa-10.3.0 to mesa-10.3.1 the Radeon HD 7700 card locks up several times a day without any specific trigger (or reliable way to reproduce it).
This sounds like a mesa regression rather than a kernel driver bug. Can you bisect mesa?
--- Comment #3 from aCaB acab@digitalfuture.it ---
(In reply to Alex Deucher from comment #2)
> This sounds like a mesa regression rather than a kernel driver bug. Can you bisect mesa?
I understand mesa may be sending crap to the kernel space but that doesn't sound like a good reason to deref a NULL.
As for bisecting mesa, I am certainly willing to do that, but I need a reliable way to trigger the lockup rather than just logging in and waiting for it to occur. I will see if I get more hints over the next few days.
--- Comment #4 from Michel Dänzer michel@daenzer.net ---
(In reply to aCaB from comment #3)
> I understand mesa may be sending crap to the kernel space but that doesn't sound like a good reason to deref a NULL.
AFAICT that should be fixed by the changes to radeon_ttm.c in https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=5... .
--- Comment #5 from aCaB acab@digitalfuture.it ---
Michel,
Thanks for your pointer and sorry for the late answer.
I'll try harder to find a reproducible case (firefox with some large animation seems to trigger it sometimes). In the meantime, if the lockup is a mesa bug then feel free to close this ticket.
dri-devel@lists.freedesktop.org