https://bugs.freedesktop.org/show_bug.cgi?id=88978
Bug ID: 88978
Summary: [bisected] [SI Scheduler] Graphical corruption in Dota 2
Product: Mesa
Version: git
Hardware: All
OS: All
Status: NEW
Severity: normal
Priority: medium
Component: Drivers/Gallium/radeonsi
Assignee: dri-devel@lists.freedesktop.org
Reporter: commendsarnex@gmail.com
QA Contact: dri-devel@lists.freedesktop.org
CC: tstellar@gmail.com
Hi guys. If I use LLVM git, I get these graphical glitches in Dota 2 native.
https://i.imgur.com/I4vyWFt.jpg
The bug has been bisected to LLVM commit 51a3c27d6e0c66cc8d2d1da8e9205fec7b74ca5c, "R600/SI: Define a schedule model and enable the generic machine scheduler".
I'm using Mesa git, Kernel 3.18.5 and Linux Mint.
Thanks a lot,
sarnex
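The bisection mentioned in this report amounts to a binary search over an ordered revision history. A minimal sketch of that idea follows. It assumes the history flips from good to bad exactly once, and the revision labels and the `is_bad` check are hypothetical stand-ins for "build LLVM and Mesa at that revision, run the game or replay a trace, and look for glitches".

```python
from typing import Callable, Sequence


def first_bad(revisions: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the first revision for which is_bad() is true.

    Assumes revisions are ordered oldest to newest, the first one is known
    good, the last one is known bad, and good turns into bad exactly once.
    """
    lo, hi = 0, len(revisions) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(revisions[mid]):
            hi = mid  # the regression is at mid or earlier
        else:
            lo = mid  # the regression is after mid
    return revisions[hi]


if __name__ == "__main__":
    # Hypothetical revision labels, made up for illustration only.
    revs = [f"rev{n:02d}" for n in range(20)]
    # Hypothetical check: pretend everything from rev11 onwards glitches.
    print(first_bad(revs, lambda r: int(r[3:]) >= 11))
```

In practice `git bisect` performs this search and keeps track of the good/bad state between builds; the sketch only shows why the number of builds grows with the logarithm of the number of candidate commits.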
--- Comment #1 from smoki smoki00790@gmail.com --- Which card is used? I already saw those artifacts in bug 88758 and tried to reproduce them on Kabini with the two apitraces I have, but I can't. Those are from bug 67887 and bug 88301.
Can you also reproduce the artifacts with either of those two? Or make a new one; it might be that only some recent game update shows the issue, or that it arises only with particular settings.
--- Comment #2 from sarnex commendsarnex@gmail.com --- Hi smoki. I am on a Radeon HD 7950 (TAHITI). If I try the trace from https://bugs.freedesktop.org/show_bug.cgi?id=67887, I do see the same graphical glitches that I get in the game.
Here is an apitrace I just took: https://idontevenlift.no-ip.org/sarnex_dota_linux.trace.xz
The other apitrace is too large, I'll try it tomorrow.
Thanks, sarnex
--- Comment #3 from smoki smoki00790@gmail.com --- OK, thanks for the trace, but I can't reproduce it with your trace on Kabini either.
So as it happens on the R7 265 and the HD 7950, I guess this is likely a GCN 1.0-only bug.
--- Comment #4 from Tom Stellard tstellar@gmail.com --- (In reply to sarnex from comment #0)
Can you run the game with the environment variable R600_DEBUG=ps,vs,gs set and post the output?
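One way to collect such a dump is to launch the game from a small wrapper that sets R600_DEBUG for the child process and redirects its output to a file. The following is a sketch under assumptions: the function name and log file name are made up for illustration, and both stdout and stderr are captured so that it does not matter which stream the driver writes to.

```python
import os
import subprocess
from typing import Sequence


def run_with_shader_dump(cmd: Sequence[str], log_path: str,
                         flags: str = "ps,vs,gs") -> int:
    """Run cmd with R600_DEBUG=flags and write all of its output to log_path.

    Returns the exit code of the child process.
    """
    # Copy the current environment and add or override R600_DEBUG only for
    # the child process; the caller's own environment is left untouched.
    env = dict(os.environ, R600_DEBUG=flags)
    with open(log_path, "wb") as log:
        # Merge stderr into stdout so the dump ends up in one file.
        result = subprocess.run(cmd, env=env, stdout=log,
                                stderr=subprocess.STDOUT)
    return result.returncode


# Example call (the launcher path is a placeholder, not taken from this report):
# run_with_shader_dump(["./your-game-launcher"], "shader_dump.log")
```

The resulting file can get large, since it also contains the shaders compiled for the menus before the glitches show up in-game.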
--- Comment #5 from sarnex commendsarnex@gmail.com ---
(In reply to Tom Stellard from comment #4)
Hi Tom, thanks for replying.
The log is linked below, since it is too big to attach. Skip to near the end for the in-game portion where the bug appears; the beginning is mostly the menu.
Log: http://paste.ubuntu.com/10075950/
--- Comment #6 from Tom Stellard tstellar@gmail.com --- (In reply to sarnex from comment #5)
Thanks. Would you also be able to get a dump using the last good commit?
--- Comment #7 from sarnex commendsarnex@gmail.com --- (In reply to Tom Stellard from comment #6)
Hi Tom,
Here is the log from the commit directly before "R600/SI: Define a schedule model and enable the generic machine scheduler". It has no graphical issues.
Log: http://paste.ubuntu.com/10143733/
Thanks again, sarnex
--- Comment #8 from Daniel Scharrer daniel@constexpr.org --- The Mesa patch from bug 88561 comment 6 fixes this for me - at least the glitches with the posted apitrace.
--- Comment #9 from sarnex commendsarnex@gmail.com --- (In reply to Daniel Scharrer from comment #8)
Hi Daniel,
Thanks for the information. The patch from Marek significantly reduces the number of artifacts in Dota 2, but it does not completely fix the issue and I still see a few artifacts per second. It seems that this bug and the Portal bug are related, but there is still an underlying bug somewhere.
Thanks, sarnex
Daniel Scharrer daniel@constexpr.org changed:
What    |Removed |Added
----------------------------------------------------------------------------
See Also|        |https://bugs.freedesktop.org/show_bug.cgi?id=88561
--- Comment #10 from sarnex commendsarnex@gmail.com --- This issue is still present on LLVM git and Mesa git, although the frequency of the corruption is significantly lowered with Marek's patch from https://bugs.freedesktop.org/show_bug.cgi?id=88561#c6
--- Comment #11 from Daniel Scharrer daniel@constexpr.org --- Created attachment 115995 --> https://bugs.freedesktop.org/attachment.cgi?id=115995&action=edit patch to disable the machine scheduler for SI
I can confirm that these glitches are still present on current LLVM + Mesa git with a 7950 (TAHITI).
Glitches happen in various games with different engines (Source, Unity, …). Here is a trace of The Talos Principle (first posted in bug #88561 comment 9), that still produces more than just occasional glitches (even with Marek's patch): http://constexpr.org/tmp/Talos-radeonsi.3.trace.xz (147 MiB)
Like sarnex, I have bisected this to LLVM 51a3c27d6e0c66cc8d2d1da8e9205fec7b74ca5c (r227461). I had to revert b8797a7 and a99a16a in current Mesa git for it to build against that LLVM revision.
Some Source engine games (L4D2, Nuclear Dawn, maybe others) don't just produce graphical glitches but also frequently lock up the GPU since a later change to the machine scheduler (r233366) - see bug #90378.
Disabling the machine scheduler for SI on current LLVM (see attached patch) also fixes both the lockups and the graphical glitches.
Additionally, using R600_DEBUG=switch_on_eop with unpatched LLVM also works around both the graphical glitches and the GPU lockups.
--- Comment #12 from Daniel Scharrer daniel@constexpr.org --- Created attachment 115996 --> https://bugs.freedesktop.org/attachment.cgi?id=115996&action=edit R600_DEBUG=ps,vs,gs output for the Talos trace with r227460 (no lockups)
--- Comment #13 from Daniel Scharrer daniel@constexpr.org --- Created attachment 115997 --> https://bugs.freedesktop.org/attachment.cgi?id=115997&action=edit R600_DEBUG=ps,vs,gs output for the Talos trace with r227461 (lockups)
--- Comment #14 from Tom Stellard tstellar@gmail.com --- Can you post your dmesg log too?
--- Comment #15 from Daniel Scharrer daniel@constexpr.org --- Created attachment 116176 --> https://bugs.freedesktop.org/attachment.cgi?id=116176&action=edit dmesg log
Here is the dmesg log with Linux 4.0.4-gentoo and LLVM patched to disable the machine scheduler for SI, after replaying both sarnex's trace and mine. I don't have an unpatched LLVM build right now, but I don't remember the dmesg output being different.
The log is compressed because there are lots of GPU faults at the end (bug #87278) which pushed the uncompressed log over the attachment size limit - not sure if you wanted those or just the startup part.
Bug #90378 has the dmesg output for L4D2, including GPU lockups, with an unpatched (but older revision of) LLVM and 4.0.1-gentoo in attachment 115653.
--- Comment #16 from Tom Stellard tstellar@gmail.com --- (In reply to Daniel Scharrer from comment #15)
Would you be able to post an API trace of one of the games that is locking up?
--- Comment #17 from Tom Stellard tstellar@gmail.com --- (In reply to Tom Stellard from comment #16)
Never mind, I think the one you posted already should be enough.
--- Comment #18 from Tom Stellard tstellar@gmail.com --- Created attachment 116227 --> https://bugs.freedesktop.org/attachment.cgi?id=116227&action=edit sdiff output for suspected bad shader
Here is a dump from sdiff of a good shader with no GPU protection faults (left side) and a bad shader that causes GPU protection faults (right side). Search for the pipe character '|' to find the only difference between the two shaders.
I'm not sure yet why this difference would lead to GPU protection faults.
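A comparison like the sdiff one in this comment can also be scripted, which helps when many shader dumps have to be checked. The sketch below uses Python's difflib rather than sdiff, so it is a substitute technique and not what was used for the attachment. The function name and the placeholder instruction lines are made up for illustration.

```python
import difflib
from typing import List, Tuple


def changed_regions(good: List[str],
                    bad: List[str]) -> List[Tuple[str, List[str], List[str]]]:
    """Return (tag, good_lines, bad_lines) for every region that differs.

    tag is one of 'replace', 'delete' or 'insert', as reported by
    difflib.SequenceMatcher.get_opcodes().
    """
    # autojunk=False keeps difflib from ignoring lines that repeat often,
    # which is common in long instruction listings.
    matcher = difflib.SequenceMatcher(a=good, b=bad, autojunk=False)
    return [
        (tag, good[i1:i2], bad[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


if __name__ == "__main__":
    # Placeholder lines, not real shader instructions.
    good_dump = ["insn A", "insn B", "insn C", "insn D"]
    bad_dump = ["insn A", "insn C", "insn B", "insn D"]
    for tag, old, new in changed_regions(good_dump, bad_dump):
        print(tag, old, new)
```

Since the scheduler change reorders instructions rather than rewriting them, a reordering typically shows up as a pair of insert and delete regions around the moved lines.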
--- Comment #19 from sarnex commendsarnex@gmail.com --- I can no longer reproduce this on Mesa master. It must have been fixed by the commit "radeonsi: completely rework updating descriptors without CP DMA".
Michel Dänzer michel@daenzer.net changed:
What      |Removed |Added
----------------------------------------------------------------------------
Status    |NEW     |RESOLVED
Resolution|---     |FIXED
--- Comment #20 from Michel Dänzer michel@daenzer.net --- Resolving per comment 19, thanks for the update.