https://bugs.freedesktop.org/show_bug.cgi?id=94242
Bug ID: 94242
Summary: [radeonsi] Crash while running Fedora mock tool for prompting root (gtksu)
Product: Mesa
Version: unspecified
Hardware: Other
OS: All
Status: NEW
Severity: normal
Priority: medium
Component: Drivers/Gallium/radeonsi
Assignee: dri-devel@lists.freedesktop.org
Reporter: shawn.starr@rogers.com
QA Contact: dri-devel@lists.freedesktop.org

Kernel: 4.5.0-0.rc4.git2.2.fc24.x86_64
MESA: git master, Feb 19-20th builds
LLVM: trunk 3.9, Feb 19-20th builds
DDX:
For some reason, X crashes with radeonsi triggering a VM fault in GNOME or KDE environments:
If you open a terminal (gnome-terminal, konsole, etc.) and run mock as non-root, the system locks up as soon as mock tries to prompt for root (with gtksu).
[ 38.111551] radeon 0000:01:00.0: GPU fault detected: 146 0x0008480c
[ 38.111861] radeon 0000:01:00.0: VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x00000000
[ 38.112219] radeon 0000:01:00.0: VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x0804800C
[ 38.112576] VM fault (0x0c, vmid 4) at page 0, read from 'TC2' (0x54433200) (72)
[ 38.112931] radeon 0000:01:00.0: GPU fault detected: 146 0x0008440c
[ 38.113229] radeon 0000:01:00.0: VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x08000000
[ 38.113587] radeon 0000:01:00.0: VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x08044001
[ 38.113945] VM fault (0x01, vmid 4) at page 134217728, read from 'TC3' (0x54433300) (68)
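For readers who want to cross-check the log above, here is a minimal Python sketch that decodes the two logged VM_CONTEXT1_PROTECTION_FAULT_STATUS values and the client tags. The bit layout (protections in bits 0-3, memory client ID in bits 12-19, a read/write flag in bit 24, VMID in bits 25-28) is an assumption chosen because it reproduces the summary lines the kernel printed; the function names are hypothetical and not part of any driver or tool.

```python
def decode_fault_status(status: int) -> dict:
    """Split a fault status word into fields (assumed bit layout)."""
    return {
        "protections": status & 0xF,           # assumed: bits 0-3
        "client_id": (status >> 12) & 0xFF,    # assumed: bits 12-19
        "is_write": bool((status >> 24) & 1),  # assumed: bit 24, 0 = read
        "vmid": (status >> 25) & 0xF,          # assumed: bits 25-28
    }


def decode_client_tag(tag: int) -> str:
    """Interpret a 32-bit client tag as big-endian ASCII, dropping NUL padding."""
    return tag.to_bytes(4, "big").rstrip(b"\x00").decode("ascii")


if __name__ == "__main__":
    # Values copied from the two faults in the log.
    for status, tag in ((0x0804800C, 0x54433200), (0x08044001, 0x54433300)):
        fields = decode_fault_status(status)
        print(hex(status), fields, decode_client_tag(tag))
```

Under that assumed layout, both status words decode to vmid 4 and a read access, with client IDs 72 and 68, which agrees with the "VM fault" lines. The logged page 134217728 is also just the decimal form of the fault address 0x08000000.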
I've attached a VM fault debug capture of the crash.
--- Comment #1 from Shawn Starr shawn.starr@rogers.com ---
Created attachment 121873
  --> https://bugs.freedesktop.org/attachment.cgi?id=121873&action=edit
R600_DEBUG=check_vm capture of VM fault
--- Comment #2 from Michel Dänzer michel@daenzer.net ---
(In reply to Shawn Starr from comment #0)
> If you open up a shell console (gnome-terminal, konsole etc), run mock as non-root, as soon as an attempt to prompt for root happens (with gtksu) it locks up system.
Since you can retrieve the GPUVM fault messages, it's hard to believe that the system locks up completely. Can you try logging in via ssh after the problem occurs and getting a gdb backtrace of the Xorg process?
--- Comment #3 from Michel Dänzer michel@daenzer.net ---
This isn't limited to gtksu. I can reproduce it fairly quickly by playing around with a MATE desktop session (which seems to use GTK2). OTOH I haven't run into it with this GNOME3 session, which mostly uses GTK3, though with some GTK2 apps as well.
I bisected it to 9aaf28da ("radeonsi: enable compiling one variant per shader"). I also confirmed that it happens with Marek's current si-one-variant branch as well as an older snapshot of that branch.
Now the "fun" part will be tracking down which glamor shaders are broken by this and why. Meanwhile, it might be better to disable the single shader variant by default, especially on the 11.2 branch.
--- Comment #4 from Shawn Starr shawn.starr@rogers.com ---
Attempting to attach gdb to X, I am unable to break out of gdb.

X info:
X.Org X Server 1.18.0
Release Date: 2015-11-09
X Protocol Version 11, Revision 0
--- Comment #5 from Michel Dänzer michel@daenzer.net ---
Created attachment 121931
  --> https://bugs.freedesktop.org/attachment.cgi?id=121931&action=edit
apitrace reproducing the problem
This apitrace reproduces the problem for me on Kaveri and Tonga.
--- Comment #6 from Marek Olšák maraeo@gmail.com ---
Sadly, I can't reproduce this on Verde, Bonaire, or Tonga using the apitrace.
Could you please get a new check_vm report with this branch?
https://cgit.freedesktop.org/~mareko/mesa/log/?h=ddebug-shader-dump
--- Comment #7 from Michel Dänzer michel@daenzer.net ---
Created attachment 121979
  --> https://bugs.freedesktop.org/attachment.cgi?id=121979&action=edit
check_vm dump from ddebug-shader-dump branch
--- Comment #8 from Marek Olšák maraeo@gmail.com ---
The bad news is that the check_vm report probably doesn't contain the problematic shaders. The good news is that I can reproduce this after updating LLVM, so this is an LLVM bug. I'm bisecting.
--- Comment #9 from Marek Olšák maraeo@gmail.com ---
The first bad commit:

commit 98ef4478258fda9028cd1786841eca952c136319
Author: Tom Stellard thomas.stellard@amd.com
Date: Fri Feb 12 23:45:29 2016 +0000
AMDGPU/SI: Detect uniform branches and emit s_cbranch instructions
Reviewers: arsenm
Subscribers: mareko, MatzeB, qcolombet, arsenm, llvm-commits
Differential Revision: http://reviews.llvm.org/D16603
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260765 91177308-0d34-0410-b5e6-96231b3b80d8
--- Comment #10 from Marek Olšák maraeo@gmail.com ---
Created attachment 121988
  --> https://bugs.freedesktop.org/attachment.cgi?id=121988&action=edit
problematic shader
--- Comment #11 from Marek Olšák maraeo@gmail.com ---
The problematic shader is attached. It has "s_branch" at the end and "ret" somewhere in the middle. My initial theory is that the shader fails to jump to the epilog, which is outside of the binary, and jumps somewhere else instead. It may even be stuck in an infinite loop due to an incorrect jump.
Ilia Mirkin imirkin@alum.mit.edu changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
 Attachment #121931|text/plain                  |application/octet-stream
          mime type|                            |
--- Comment #12 from Oleg Suchilov thevoidnnos@gmail.com ---
I've got the same issue.

Gigabyte HD7870
Arch Linux x64
mesa-git - from mesa-git repo (http://pkgbuild.com/~lcarlier/mesa-git/)
kernel - linux-mainline 4.5.0-rc7-mainline

radeon 0000:01:00.0: GPU fault detected: 147 0x0c024801
radeon 0000:01:00.0: VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x0FFFF860
radeon 0000:01:00.0: VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x02048001

The last working version was at commit 89d25a8 (mesa-11.2).
The problems started after commit ff360a5 (mesa-11.3).

When I launch 'plank' or 'mate-system-monitor', everything freezes (on some versions/commits my mouse still works and on others it doesn't) and my Xorg server crashes. Sometimes the session restarts, but after login OpenGL is not available (glxinfo shows some errors).
--- Comment #13 from Marek Olšák maraeo@gmail.com ---
The fix is under review: http://reviews.llvm.org/D17964
Michel Dänzer michel@daenzer.net changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #14 from Michel Dänzer michel@daenzer.net ---
Fixed in LLVM SVN r263441.