https://bugs.freedesktop.org/show_bug.cgi?id=83416
Priority: medium
Bug ID: 83416
Assignee: dri-devel@lists.freedesktop.org
Summary: [radeonsi] Serious Sam 3 lockup during its start
Severity: major
Classification: Unclassified
OS: Linux (All)
Reporter: lordheavym@gmail.com
Hardware: x86-64 (AMD64)
Status: NEW
Version: git
Component: Drivers/Gallium/radeonsi
Product: Mesa
Created attachment 105638 --> https://bugs.freedesktop.org/attachment.cgi?id=105638&action=edit kernel.log file with kernel 3.17rc3
* Tested with both kernel 3.16.1 and kernel 3.17rc3, with and without hyperz
* OpenGL renderer string: Gallium 0.4 on AMD PITCAIRN
* OpenGL core profile version string: 3.3 (Core Profile) Mesa 10.4.0-devel (git-021e84f)
I can reproduce the lockup with the trace: http://pkgbuild.com/~lcarlier/trace/Sam3.tar.xz
--- Comment #1 from smoki smoki00790@gmail.com ---
Cannot reproduce it on Kabini with the same git version, 021e84f.
With Mesa built against current llvm-3.6 svn the trace just passes fine, and when I build Mesa against 3.5 this apitrace just segfaults... In both cases there is no lockup.
Debian.
--- Comment #2 from Michel Dänzer michel@daenzer.net --- I get no lockup either, but I do see the same GPUVM protection faults:
radeon 0000:01:00.0: VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x0FF00819
The FF bits make me suspect bits 32-4x of the GPUVM address are getting clobbered, maybe because of the LLVM backend generating invalid shader code.
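The reasoning about the FF bits can be checked with a little arithmetic. The following is only a sketch: it assumes the register reports a 4 KiB page number rather than a byte address (which is consistent with the mention of bits 32 and up), and the function name is hypothetical.

```python
PAGE_SHIFT = 12  # assumption: the register holds a 4 KiB page number


def decode_fault_addr(reg_value):
    """Return (byte address, bits 32 and above) for a fault register value."""
    byte_addr = reg_value << PAGE_SHIFT
    high_bits = byte_addr >> 32
    return byte_addr, high_bits


byte_addr, high_bits = decode_fault_addr(0x0FF00819)
print(hex(byte_addr), hex(high_bits))
```

Under that assumption, the FF byte of the logged value ends up in bits 32-39 of the byte address, which is the part of the address suspected of being clobbered.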
--- Comment #3 from Laurent carlier lordheavym@gmail.com --- Created attachment 105674 --> https://bugs.freedesktop.org/attachment.cgi?id=105674&action=edit ouput of 'R600_DEBUG=ps,vs glretrace Sam3.trace'
LLVM is 3.6svn r216889
--- Comment #4 from Laurent carlier lordheavym@gmail.com --- Link to the trace in google drive: https://drive.google.com/file/d/0B1WCo3k21FK3dTZmaFFmU2wwQzQ/edit?usp=sharin...
--- Comment #5 from smoki smoki00790@gmail.com --- (In reply to comment #2)
> I get no lockup either, but I do see the same GPUVM protection faults:
> radeon 0000:01:00.0: VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x0FF00819
> The FF bits make me suspect bits 32-4x of the GPUVM address are getting clobbered, maybe because of the LLVM backend generating invalid shader code.
For me there is nothing new in dmesg, but something very interesting happens here. When radeonsi.so is stripped, this trace segfaults for me; if it is not stripped, it passes fine with no segfault. What could that be? Hmm...
--- Comment #6 from Laurent carlier lordheavym@gmail.com --- Just to note that this trace is produced with apitrace 5.0 and with the following commandline: GALLIUM_HUD=num-bytes-moved apitrace32 trace %command%
--- Comment #7 from smoki smoki00790@gmail.com --- Created attachment 105676 --> https://bugs.freedesktop.org/attachment.cgi?id=105676&action=edit segfault
(In reply to comment #5)
> For me there is nothing new in dmesg, but something very interesting happens here. When radeonsi.so is stripped, this trace segfaults for me; if it is not stripped, it passes fine with no segfault. What could that be? Hmm...
After a restart it works, but then it segfaults again, weird... This one was tried on a pure 32-bit OS.
--- Comment #8 from smoki smoki00790@gmail.com --- @Laurent carlier
Is this a new issue, or maybe a regression?
I don't have the SSAM3 game, but I remember from earlier versions that Serious Sam has a bunch of different settings. Maybe you can try some different settings, starting with Low or something; maybe only some of the settings trigger the issue, etc.
--- Comment #9 from smoki smoki00790@gmail.com ---
Also try some stable Mesa releases if you can, 10.2 or 10.3. I have very strange issues with 32-bit Mesa and apps; in particular, the build system in current git seems very broken for me. Make install, the SSE41 macro compile needs much more CPU time, stripping does not work properly, the default optimization level is not good (-O3 fixes it), etc.
--- Comment #10 from Laurent carlier lordheavym@gmail.com --- Just tried with mesa-10.2.6/llvm-3.4.2 and the trace works fine except the following from LLVM:
LLVM ERROR: ran out of registers during register allocation
Here are the flags used:
CPPFLAGS="-D_FORTIFY_SOURCE=2"
CFLAGS="-march=x86-64 -mtune=generic -O2 -pipe -fstack-protector-strong --param=ssp-buffer-size=4"
CXXFLAGS="-march=x86-64 -mtune=generic -O2 -pipe -fstack-protector-strong --param=ssp-buffer-size=4"
LDFLAGS="-Wl,-O1,--sort-common,--as-needed,-z,relro"
DEBUG_CFLAGS="-g -fvar-tracking-assignments"
DEBUG_CXXFLAGS="-g -fvar-tracking-assignments"
--- Comment #11 from Vadim Girlin ptpzz@yandex.ru --- (In reply to comment #2)
> I get no lockup either, but I do see the same GPUVM protection faults:
> radeon 0000:01:00.0: VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x0FF00819
> The FF bits make me suspect bits 32-4x of the GPUVM address are getting clobbered, maybe because of the LLVM backend generating invalid shader code.
I've found a similar bug with an incorrect high part of the address. The problem was that the LLVM backend uses S_ADD/SUB_I32 for lowering 64-bit integer add/sub, but it should use the _U32 versions instead. I was going to send a patch, but the fix is trivial: basically just replace all uses of S_ADD/SUB_I32 with S_ADD/SUB_U32. I'm not sure if you are hitting the same issue, though.
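To illustrate why the choice of low-half instruction can matter for the high part of a 64-bit address, here is a minimal Python model. It is a sketch under an assumption not stated in this thread: that the _I32 add reports signed overflow in its status flag while the _U32 add reports the unsigned carry-out, and that the high-half add consumes that flag as its carry-in. All function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def add_u32(a, b):
    """32-bit add; the flag is the unsigned carry-out (assumed _U32 behaviour)."""
    total = (a & MASK32) + (b & MASK32)
    return total & MASK32, total >> 32


def add_i32(a, b):
    """32-bit add; the flag is signed overflow (assumed _I32 behaviour)."""
    a &= MASK32
    b &= MASK32
    result = (a + b) & MASK32
    sign_a, sign_b, sign_r = a >> 31, b >> 31, result >> 31
    overflow = int(sign_a == sign_b and sign_r != sign_a)
    return result, overflow


def add64(a, b, low_add):
    """64-bit add built from two 32-bit halves, using low_add for the low half."""
    lo, flag = low_add(a & MASK32, b & MASK32)
    hi = ((a >> 32) + (b >> 32) + flag) & MASK32
    return (hi << 32) | lo


# The low half wraps around: a carry is needed, but signed overflow does not fire.
print(hex(add64(0xFFFFF000, 0x2000, add_u32)))
print(hex(add64(0xFFFFF000, 0x2000, add_i32)))
# No carry is needed, but signed overflow fires and bumps the high half.
print(hex(add64(0x7FFFF000, 0x2000, add_i32)))
```

In this model the signed variant either drops a needed carry or adds a spurious one, so only the high 32 bits of the result are wrong, which matches the kind of symptom described above.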
--- Comment #12 from Tom Stellard tstellar@gmail.com --- Created attachment 105709 --> https://bugs.freedesktop.org/attachment.cgi?id=105709&action=edit Fix suggested by Vadim
Can you try this patch?
--- Comment #13 from Michel Dänzer michel@daenzer.net --- (In reply to comment #12)
> Can you try this patch?
The patch fixes the GPUVM faults for me while replaying the apitrace.
--- Comment #14 from Laurent carlier lordheavym@gmail.com --- (In reply to comment #12)
> Created attachment 105709 [details] [review] Fix suggested by Vadim
> Can you try this patch?
It doesn't fix the lockup for me. I've tested mesa-git with llvm 3.4.3, both the trace and the game, and they both failed with the following error:
LLVM ERROR: Cannot select: 0x1671def0: i32 = truncate 0x16716ff4 [ORD=21] [ID=121]
0x16716ff4: i128 = srl 0x1671cb14, 0x16717198 [ORD=21] [ID=102]
0x1671cb14: i128,ch = load 0x166a9484, 0x167123bc, 0x16712e20<LD16[%32](tbaa=!"const")> [ORD=21] [ID=90]
0x167123bc: i64,ch = CopyFromReg 0x166a9484, 0x16712330 [ID=81]
0x16712330: i64 = Register %vreg66 [ID=2]
0x16712e20: i64 = undef [ID=8]
0x16717198: i32 = Constant<96> [ID=76]
In function: main
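Reading the dump: the node that cannot be selected is a truncate to i32 of an i128 value shifted right by the constant 96, where the i128 comes from a 16-byte load. In other words, it extracts the highest 32-bit word of a 128-bit load. A small Python model of that arithmetic follows; the function name is hypothetical and little-endian byte order is an assumption.

```python
import struct


def top_dword(value128):
    """Model of truncate(srl(x, 96)): keep bits 96..127 of a 128-bit value."""
    return (value128 >> 96) & 0xFFFFFFFF


raw = bytes(range(16))                # stand-in for the 16 loaded bytes
value128 = int.from_bytes(raw, "little")
dwords = struct.unpack("<4I", raw)    # the same bytes viewed as four 32-bit words

print(hex(top_dword(value128)), hex(dwords[3]))
```

Both views give the same word, so the operation itself is just "take the fourth dword"; the error only says that this LLVM version had no pattern for expressing it through i128.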
Laurent carlier lordheavym@gmail.com changed:
What      |Removed |Added
----------------------------------------------------------------------------
Status    |NEW     |RESOLVED
Resolution|---     |FIXED
--- Comment #15 from Laurent carlier lordheavym@gmail.com --- I can confirm that 8bd67231797e5d79d72a4e91b37ea81da30c6df3 is fixing the hang.
Thanks Marek, closing!
Laurent carlier lordheavym@gmail.com changed:
What      |Removed  |Added
----------------------------------------------------------------------------
Status    |RESOLVED |REOPENED
Resolution|FIXED    |---
--- Comment #16 from Laurent carlier lordheavym@gmail.com --- Bad luck, it's hanging again! -> reopened
--- Comment #17 from Grigori Goronzy greg@chown.ath.cx --- Does this Mesa patch help?
https://bugs.freedesktop.org/attachment.cgi?id=105755
--- Comment #18 from Laurent carlier lordheavym@gmail.com --- (In reply to comment #17)
> Does this Mesa patch help?
No, it doesn't help
--- Comment #19 from Laurent carlier lordheavym@gmail.com --- Fixed with current mesa trunk, so closing
Laurent carlier lordheavym@gmail.com changed:
What      |Removed  |Added
----------------------------------------------------------------------------
Status    |REOPENED |RESOLVED
Resolution|---      |FIXED