https://bugs.freedesktop.org/show_bug.cgi?id=89034
Bug ID: 89034
Summary: Firefox crashing xserver and some major rendering bugs
Product: Mesa
Version: git
Hardware: x86-64 (AMD64)
OS: Linux (All)
Status: NEW
Severity: critical
Priority: medium
Component: Drivers/Gallium/radeonsi
Assignee: dri-devel@lists.freedesktop.org
Reporter: smoki00790@gmail.com
QA Contact: dri-devel@lists.freedesktop.org
Created attachment 113269 --> https://bugs.freedesktop.org/attachment.cgi?id=113269&action=edit valley_artifacts
Well, I filed this as a radeonsi bug because I am not sure whether it happens elsewhere. It is actually an LLVM bug that appears once subreg liveness is enabled; svn 228228 is bisected as bad. I am running current LLVM with subreg liveness disabled, as this is a major/grave bug for me.
The same issue was present in Tom's perf branches last month, also once subreg liveness was enabled.
Hardware is Kabini (Athlon 5350), current Debian Sid 64-bit, kernel 3.19.0, xserver git, mesa git, etc.
Attached is a screenshot from Unigine Valley as an example; there are major rendering issues in many other GL apps.
As for Firefox crashing the X server, I am not sure how to debug that (btw, it crashes X immediately when starting FF). If I build LLVM with debug and assertions, the monitor appears to go into sleep mode (without any messages in the logs) and only a hard reset helps. If I build LLVM without those, it just crashes the X server, but there is not enough info in Xorg.0.log :(
smoki smoki00790@gmail.com changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
 Attachment #113269|text/plain                  |image/png
          mime type|                            |
--- Comment #1 from Michel Dänzer michel@daenzer.net ---
(In reply to smoki from comment #0)
> Attached is a screenshot from Unigine Valley as an example; there are major rendering issues in many other GL apps.
Haven't seen such artifacts on Kaveri. Can you attach the stderr output of running Valley or another affected app with R600_DEBUG=vs,ps with and without the bisected commit?
> As for Firefox crashing the X server, I am not sure how to debug that [...]
Is there something about it in the Xorg stderr output? It should be captured in a gdm log file.
--- Comment #2 from smoki smoki00790@gmail.com ---
Well, I can, but first I need to make a bottle of coffee, because one LLVM build takes 30 minutes on Kabini :D. But OK, I will do something... later...
I don't have a gdm.log as I don't use gdm; I use plain startx. I can only attach Xorg.0.log from a non-debug build... but well, later...
--- Comment #3 from Michel Dänzer michel@daenzer.net ---
(In reply to smoki from comment #2)
> Well, I can, but first I need to make a bottle of coffee, because one LLVM build takes 30 minutes on Kabini :D.
It shouldn't take that long after switching between the bisected commit and the one before it (or just re-applying/reverting it on top of whatever later commit you may have built last). If you're not using ccache yet, that might help a little as well.
> I don't have a gdm.log as I don't use gdm; I use plain startx.
Then something like
  startx [...] 2>stderr.txt
should capture the stderr output in a file.
--- Comment #4 from smoki smoki00790@gmail.com ---
Created attachment 113272 --> https://bugs.freedesktop.org/attachment.cgi?id=113272&action=edit xcrash
I am using a patched LLVM (subreg liveness disabled) to post this, as I can't use proper LLVM with a browser :D
This is with a debug LLVM build but without asserts enabled. Likely an assertion is keeping something from being logged, but I don't know; anyway, it might be useful.
--- Comment #5 from Michel Dänzer michel@daenzer.net ---
Please get a backtrace of the crash with gdb by attaching gdb to the Xorg process via ssh before starting Firefox.
--- Comment #6 from smoki smoki00790@gmail.com ---
(In reply to Michel Dänzer from comment #3)
> Then something like
>   startx [...] 2>stderr.txt
> should capture the stderr output in a file.
Actually, that did capture something. This is with a debug+assertions-enabled LLVM, when the monitor just goes to "sleep" after starting Firefox... seems useful :) ?
X: TargetRegisterInfo.cpp:189: virtual const llvm::TargetRegisterClass* llvm::TargetRegisterInfo::getMatchingSuperRegClass(const llvm::TargetRegisterClass*, const llvm::TargetRegisterClass*, unsigned int) const: Assertion `A && B && "Missing register class"' failed.
--- Comment #7 from Michel Dänzer michel@daenzer.net ---
Please get a gdb backtrace (bt full) for the assertion failure then. Bonus points for running Xorg with R600_DEBUG=vs,ps and grabbing its stderr output leading up to the assertion failure as well.
--- Comment #8 from smoki smoki00790@gmail.com ---
Well, I think I can't do that, because I don't have another machine right now.
--- Comment #9 from Michel Dänzer michel@daenzer.net ---
Alternatively, you can try using a script for gdb's --command option, something like:
  set logging on
  set logging redirect
  handle SIGPIPE nostop noprint
  continue
  bt full
  continue
  quit
That should capture the backtrace in a file called gdb.txt. See http://wiki.x.org/wiki/Development/Documentation/ServerDebugging/#index6h2 for more background.
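A rough sketch of how the gdb commands from this comment could be put into a command file is shown here. The helper name write_xorg_gdb_script and the file name xorg.gdb are made up for illustration; the gdb commands themselves are copied from the comment, and the invocation in the trailing comment assumes a single running process named Xorg and sufficient privileges to attach to it.

```shell
# Sketch only: write the gdb commands quoted in this comment to a file.
# write_xorg_gdb_script is a hypothetical helper name.
write_xorg_gdb_script() {
    # $1: path of the gdb command file to create
    cat > "$1" <<'EOF'
set logging on
set logging redirect
handle SIGPIPE nostop noprint
continue
bt full
continue
quit
EOF
}

# Possible use (not executed here; assumes one process named Xorg):
#   write_xorg_gdb_script xorg.gdb
#   gdb --command=xorg.gdb -p "$(pidof Xorg)"
```

Because the script ends with quit, gdb exits once the backtrace has been written, so this can be started from a wrapper script on the same machine when no second machine is available for ssh.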
--- Comment #10 from smoki smoki00790@gmail.com ---
Created attachment 113309 --> https://bugs.freedesktop.org/attachment.cgi?id=113309&action=edit gdb.txt
Hopefully I did it right :) gdb.txt attached...
--- Comment #11 from smoki smoki00790@gmail.com ---
Created attachment 113310 --> https://bugs.freedesktop.org/attachment.cgi?id=113310&action=edit stderr.txt
...and stderr.txt
--- Comment #12 from smoki smoki00790@gmail.com ---
These seem interesting:
  err = 0x7f6cbf5538f0 <error: Cannot access memory at address 0x7f6cbf5538f0>
  buffer_data = 0x25 <error: Cannot access memory at address 0x25>
--- Comment #13 from sarnex commendsarnex@gmail.com ---
I'm getting this issue also. I thought it was my own fault because it was so strange. I'm on Linux Mint, Kernel 3.19 and the crash happens when using LLVM 3.7 on Xserver 1.16 and git.
--- Comment #14 from Lorenzo Bona lorenz.bona@gmail.com ---
I'm using LLVM from the debian/ubuntu nightly builds and I'm experiencing much the same problem.
I built mesa against llvm3.7svn228689 yesterday evening. KDE starts OK with startx, but as soon as I open a window (terminal, dolphin, firefox or whatever) X crashes.
You can see my xserver crash log. (attachment 1)
The last good build was on llvm3.7~svn227765, which is from around the 2nd of February (the 32-bit nightly builds were broken from then until yesterday afternoon). I'm also seeing a lot of corruption in Dota2 on the 227765 build. (attachment 2)
Switched back to llvm3.6rc2-2 and it's OK now: X doesn't crash and the Dota2 corruption is gone.
Sorry, but I can't bisect using nightly packages (I'm not able to build a .deb from svn).
--- Comment #15 from Lorenzo Bona lorenz.bona@gmail.com ---
Created attachment 113343 --> https://bugs.freedesktop.org/attachment.cgi?id=113343&action=edit attachment_1_Xorg
--- Comment #16 from Lorenzo Bona lorenz.bona@gmail.com ---
Created attachment 113344 --> https://bugs.freedesktop.org/attachment.cgi?id=113344&action=edit attachment_2_Dota2
--- Comment #17 from Lorenzo Bona lorenz.bona@gmail.com ---
Sorry, I forgot some info:
mesa/xserver/ddx/drm from git
kernel drm-fixes-3.19
GPU: R7-265
--- Comment #18 from Tom Stellard tstellar@gmail.com ---
I have just committed a change to llvm svn that disables sub-reg liveness and filed an LLVM bug for this:
http://www.llvm.org/bugs/show_bug.cgi?id=22548
--- Comment #19 from Lorenzo Bona lorenz.bona@gmail.com ---
Thank you Tom, with the latest changes the crashes and the Valley rendering issue are gone for me. BTW, I'm still seeing a rendering issue in Dota2.
Performance with LLVM-3.7 is great, about 30 FPS in Valley. Nice.
--- Comment #20 from Daniel Scharrer daniel@constexpr.org ---
(In reply to Lorenzo Bona from comment #19)
> BTW, I'm still seeing a rendering issue in Dota2.
That looks a lot like bug #88978. You could try the trace posted there to see if you are experiencing the same issue.
Kai kai@dev.carbon-project.org changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |kai@dev.carbon-project.org
--- Comment #21 from Kai kai@dev.carbon-project.org ---
*** Bug 89045 has been marked as a duplicate of this bug. ***
--- Comment #22 from Tom Stellard tstellar@gmail.com ---
Created attachment 113709 --> https://bugs.freedesktop.org/attachment.cgi?id=113709&action=edit Patch to re-enable subreg liveness
I think this bug has been fixed. This patch re-enables subreg liveness. Can you see if the issue still exists with this patch applied to LLVM git?
--- Comment #23 from smoki smoki00790@gmail.com ---
Created attachment 113715 --> https://bugs.freedesktop.org/attachment.cgi?id=113715&action=edit valley.png
(In reply to Tom Stellard from comment #22)
> Created attachment 113709 [details] [review] Patch to re-enable subreg liveness
> I think this bug has been fixed. This patch re-enables subreg liveness. Can you see if the issue still exists with this patch applied to LLVM git?
Just tried it on top of svn230129... Firefox does not crash the X server anymore, and rendering is mostly fine now in Valley: the "half picture" broken rendering (https://bugs.freedesktop.org/attachment.cgi?id=113269) is also fixed... but black squares still appear here and there - see attachment. Basically, the Firefox X server crash is fixed, but rendering in games is not... and I have some other examples where rendering is much worse.
--- Comment #24 from smoki smoki00790@gmail.com ---
Created attachment 113716 --> https://bugs.freedesktop.org/attachment.cgi?id=113716&action=edit stacking.png
In the game Stacking (as another example) rendering is also borked, but differently, and so on...
--- Comment #25 from Lorenzo Bona lorenz.bona@gmail.com ---
(In reply to smoki from comment #24)
> Created attachment 113716 [details] stacking.png
> In the game Stacking (as another example) rendering is also borked, but differently, and so on...
Have you already tried this patch from Marek? http://cgit.freedesktop.org/mesa/mesa/commit/?id=7692704b144b2aa9a57767a4321...
The rendering issues in Dota2 are mostly gone now; you can still see a little glitch here and there, but very rarely.
--- Comment #26 from smoki smoki00790@gmail.com ---
@Lorenzo Bona
That is for SI; I am on CIK... this issue is a whole other one and probably affects all chips.
@Tom
Also, ~140 piglit quick tests fail once subreg liveness is enabled.
--- Comment #27 from Daniel Scharrer daniel@constexpr.org ---
Also no X server crashes here on TAHITI with LLVM r230124 + your patch.
--- Comment #28 from Tom Stellard tstellar@gmail.com ---
(In reply to smoki from comment #26)
> @Lorenzo Bona
> That is for SI; I am on CIK... this issue is a whole other one and probably affects all chips.
> @Tom
> Also, ~140 piglit quick tests fail once subreg liveness is enabled.
Which piglit tests regress and what GPU do you have?
--- Comment #29 from smoki smoki00790@gmail.com ---
(In reply to Tom Stellard from comment #28)
> Which piglit tests regress and what GPU do you have?
Kabini. I did a fresh piglit run just now and it shows 159 regressed tests... too many to list, so I uploaded the HTML summary:
https://dl.dropboxusercontent.com/u/74553632/compare.tar.bz2
--- Comment #30 from Michel Dänzer michel@daenzer.net ---
(In reply to smoki from comment #29)
> I did a fresh piglit run just now and it shows 159 regressed tests...
I think at least the piglit regressions aren't directly related to sub-register liveness and should be tracked in a separate bug report:
On my Kaveri, I've been seeing random failures of some (of the same as yours) piglit tests recently (with sub-register liveness disabled). The only way I've found to avoid those failures is to keep rebooting until I get lucky. It seems like some recent change (most likely in Mesa?) causes the hardware to go into a weird, semi-persistent state.
I'm afraid it might be tricky to bisect that, but it would be very helpful.
--- Comment #31 from smoki smoki00790@gmail.com ---
(In reply to Michel Dänzer from comment #30)
> I think at least the piglit regressions aren't directly related to sub-register liveness and should be tracked in a separate bug report:
Those regressions are only reproducible here with subreg liveness enabled.
> On my Kaveri, I've been seeing random failures of some (of the same as yours) piglit tests recently (with sub-register liveness disabled). The only way I've found to avoid those failures is to keep rebooting until I get lucky. It seems like some recent change (most likely in Mesa?) causes the hardware to go into a weird, semi-persistent state.
> I'm afraid it might be tricky to bisect that, but it would be very helpful.
That sounds like a separate issue, but I don't see it and can't reproduce it on Kabini. I only have a few known tests that sometimes fail at random, but those show up under "warn" and there are just a few of them (I am talking about 1-3 tests); no "fail" results happen here at random.
--- Comment #32 from Tom Stellard tstellar@gmail.com ---
(In reply to smoki from comment #31)
> (In reply to Michel Dänzer from comment #30)
> > I think at least the piglit regressions aren't directly related to sub-register liveness and should be tracked in a separate bug report:
> Those regressions are only reproducible here with subreg liveness enabled.
> > On my Kaveri, I've been seeing random failures of some (of the same as yours) piglit tests recently (with sub-register liveness disabled). The only way I've found to avoid those failures is to keep rebooting until I get lucky. It seems like some recent change (most likely in Mesa?) causes the hardware to go into a weird, semi-persistent state.
> > I'm afraid it might be tricky to bisect that, but it would be very helpful.
> That sounds like a separate issue, but I don't see it and can't reproduce it on Kabini. I only have a few known tests that sometimes fail at random, but those show up under "warn" and there are just a few of them (I am talking about 1-3 tests); no "fail" results happen here at random.
Would you be able to set the environment variable R600_DEBUG=ps,vs, run the glsl-fs-min test with the good and the bad commit, and post the output?
  R600_DEBUG=ps,vs ./bin/shader_runner tests/shaders/glsl-fs-min.shader_test -auto
--- Comment #33 from smoki smoki00790@gmail.com ---
glsl-fs-min is actually one of the randomly failing tests; it sometimes passes and sometimes fails, with or without subreg liveness, so I think that is not the problem here.
Currently I have 7 warns, 1 test that causes a GPU fault, 4 that crash/segfault and 22 that fail at random. 18 of the randomly failing ones (mostly on the second piglit run) are EXT_transform_feedback/xyz tests, 4 are some glsl tests, etc...
In total that is 34 potentially problematic tests. With all of those excluded from the run, this is what I get - 136 tests that fail with subreg liveness enabled:
https://dl.dropboxusercontent.com/u/74553632/compare2.tar.bz2
--- Comment #34 from Michel Dänzer michel@daenzer.net ---
(In reply to smoki from comment #33)
> glsl-fs-min is actually one of the randomly failing tests; it sometimes passes and sometimes fails, with or without subreg liveness, so I think that is not the problem here.
I can't reproduce random failures with glsl-fs-min nor any piglit regressions with sub-register liveness enabled, but sub-register liveness doesn't seem to result in any code difference for glsl-fs-min anyway.
Can you find another test which consistently passes without sub-register liveness and fails with it *and* shows a difference between them in the R600_DEBUG=vs,ps stderr output, and attach the latter for both cases?
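One possible way to collect the two stderr dumps asked for here is sketched below. The helper name dump_shaders and the output file names are made up for illustration; only R600_DEBUG=vs,ps comes from this thread, and the sketch assumes the driver prints its shader dumps to stderr, as the request implies.

```shell
# Sketch only: run a command with R600_DEBUG=vs,ps and save its stderr.
# dump_shaders is a hypothetical helper name, not part of Mesa or piglit.
dump_shaders() {
    # $1: output file; remaining arguments: the command to run
    out=$1
    shift
    R600_DEBUG=vs,ps "$@" 2>"$out"
}

# Possible use (not executed here), one run per LLVM build:
#   dump_shaders subreg_disabled.txt <test command>   # sub-register liveness off
#   dump_shaders subreg_enabled.txt  <test command>   # sub-register liveness on
#   diff -u subreg_disabled.txt subreg_enabled.txt
```

An empty diff would mean sub-register liveness made no code difference for that test, in which case another test has to be picked.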
--- Comment #35 from smoki smoki00790@gmail.com ---
Created attachment 113809 --> https://bugs.freedesktop.org/attachment.cgi?id=113809&action=edit subreg_disabled.txt
(In reply to Michel Dänzer from comment #34)
> I can't reproduce random failures with glsl-fs-min nor any piglit regressions with sub-register liveness enabled, but sub-register liveness doesn't seem to result in any code difference for glsl-fs-min anyway.
I can't either if I run it alone, so there is no difference; it just fails sometimes in a full piglit run.
> Can you find another test which consistently passes without sub-register liveness and fails with it *and* shows a difference between them in the R600_DEBUG=vs,ps stderr output, and attach the latter for both cases?
As I said yesterday in comment 33, I trimmed the list down to only the tests that show this regression, so you can pick any of those 136 tests that show a difference. Let's say:
  R600_DEBUG=vs,ps ./bin/copy-pixels -samples=8 -auto
Outputs attached, the first one without...
--- Comment #36 from smoki smoki00790@gmail.com ---
Created attachment 113810 --> https://bugs.freedesktop.org/attachment.cgi?id=113810&action=edit subreg_enabled.txt
...second with subreg liveness enabled.
--- Comment #37 from Tom Stellard tstellar@gmail.com ---
After examining the shader dumps, one thing that looks suspicious to me is that in the good dump, we have several instructions like this:
  image_load v[9:12], 15, 0, 0, 0, 0, 0, 0, 0, v[4:7], s[8:15]
But nothing is ever written to the last component of vaddr: v7
However, in the bad dumps we have:
  image_load v[8:11], 15, 0, 0, 0, 0, 0, 0, 0, v[1:4], s[8:15]
And a value is stored in the last component of vaddr: v4 before every image load.
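To repeat this kind of check on other dumps, a small filter along the following lines might help. It is only a sketch: the helper name list_image_load_vaddr is made up, and the pattern assumes image_load lines look exactly like the two quoted in this comment (destination range, immediate operands, vaddr range, then the s[...] resource).

```shell
# Sketch only: for each image_load line in a shader dump, print the last
# VGPR of its vaddr operand (e.g. v7 for v[4:7]).
# list_image_load_vaddr is a hypothetical helper name.
list_image_load_vaddr() {
    # $1: file holding the R600_DEBUG=vs,ps stderr output
    sed -n 's/^[[:space:]]*image_load v\[[0-9]*:[0-9]*],.*, v\[[0-9]*:\([0-9]*\)], s\[.*$/v\1/p' "$1"
}

# Possible use (not executed here):
#   list_image_load_vaddr subreg_enabled.txt | sort | uniq -c
```

The registers it prints can then be searched for in the same dump to see whether anything writes to them before each image load.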
--- Comment #38 from Tom Stellard tstellar@gmail.com ---
Created attachment 113825 --> https://bugs.freedesktop.org/attachment.cgi?id=113825&action=edit Possible fix
Can you try this patch and see if it helps?
--- Comment #39 from smoki smoki00790@gmail.com ---
Created attachment 113826 --> https://bugs.freedesktop.org/attachment.cgi?id=113826&action=edit subreg_enabled2.txt
(In reply to Tom Stellard from comment #38)
> Created attachment 113825 [details] [review] Possible fix
> Can you try this patch and see if it helps?
Still fails. As the dump is now very different, I attached it.
--- Comment #40 from Tom Stellard tstellar@gmail.com ---
There have been a few register allocator bugs fixed in LLVM recently, can you re-apply the "Patch to re-enable subreg liveness" to latest LLVM and test again?
--- Comment #41 from smoki smoki00790@gmail.com ---
Tried svn232842 with subreg liveness enabled + mesa a04b520890c669ce012b4b18165392dcabe0b27b
No change, the same bugs are still there.
--- Comment #42 from Tom Stellard tstellar@gmail.com ---
(In reply to smoki from comment #41)
> Tried svn232842 with subreg liveness enabled + mesa a04b520890c669ce012b4b18165392dcabe0b27b
> No change, the same bugs are still there.
I can't reproduce any of these failures on my Verde card. What is still failing for you? Piglit tests? If you still see corruption in Unigine Valley, can you post the command you use to launch the program and which scene the corruption occurs in?
--- Comment #43 from smoki smoki00790@gmail.com ---
Created attachment 114561 --> https://bugs.freedesktop.org/attachment.cgi?id=114561&action=edit hof.png
Yes, those piglit tests from comment 33 are still failing. Unigine Valley also still has corruption; I run it via the 'valley' script and then apply some settings via the interface. The squares happen regardless of settings in scenes 1, 2, 3 and 6. In scenes 2 and 3 there are not only black squares; fog also stops rendering correctly at some/far camera positions.
The game Stacking from comment 24 also has the same borked rendering. The game Hands of Fate also shows rendering issues (screenshot attached)... and so on; many apps are affected once I enable subreg liveness.
--- Comment #44 from Tom Stellard tstellar@gmail.com ---
(In reply to smoki from comment #43)
> Created attachment 114561 [details] hof.png
> Yes, those piglit tests from comment 33 are still failing. Unigine Valley also still has corruption; I run it via the 'valley' script and then apply some settings via the interface. The squares happen regardless of settings in scenes 1, 2, 3 and 6. In scenes 2 and 3 there are not only black squares; fog also stops rendering correctly at some/far camera positions.
Can you try running the piglit tests with no X server and with the environment variable PIGLIT_PLATFORM=gbm? You will need to install waffle from git with gbm support enabled and then rebuild piglit for this to work.
-Tom
> The game Stacking from comment 24 also has the same borked rendering. The game Hands of Fate also shows rendering issues (screenshot attached)... and so on; many apps are affected once I enable subreg liveness.
--- Comment #45 from smoki smoki00790@gmail.com ---
If you are asking whether the same tests fail there, then yes: the same tests fail with PIGLIT_PLATFORM=gbm and no X server. And the dump is the same for our example.
PIGLIT_PLATFORM=gbm R600_DEBUG=vs,ps ./bin/copy-pixels -samples=8 -auto
--- Comment #46 from smoki smoki00790@gmail.com ---
Ah, I forgot to add that comparison anyway... This is the piglit run with gbm and no X, just subreg liveness enabled vs. disabled:
https://dl.dropboxusercontent.com/u/74553632/compare11.tar.bz2
--- Comment #47 from Tom Stellard tstellar@gmail.com ---
If you enable sub-reg liveness in this branch: http://cgit.freedesktop.org/~tstellar/llvm/log/?h=sched-perf-Mar-27-2015, do you still see the bugs?
--- Comment #48 from smoki smoki00790@gmail.com ---
(In reply to Tom Stellard from comment #47)
> If you enable sub-reg liveness in this branch: http://cgit.freedesktop.org/~tstellar/llvm/log/?h=sched-perf-Mar-27-2015, do you still see the bugs?
In Unigine Valley there is no corruption with that anymore; perf goes down by 5%, just to mention...
But all the other bugs are still there, like the corruption in the games Stacking and Hands of Fate, and all the same piglit tests still fail.
smoki smoki00790@gmail.com changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED
--- Comment #49 from smoki smoki00790@gmail.com ---
Issue fixed in llvm:
R600/SI: Fix bug with v_interp_p1_f32 instructions on 16 bank lds chips
The src and dst register cannot be the same on chips with 16 lds banks.
Closing.