https://bugs.freedesktop.org/show_bug.cgi?id=86720
Bug ID: 86720
Summary: Europa Universalis 4 freezing during game start (10.3.3)
Product: Mesa
Version: 10.3
Hardware: x86-64 (AMD64)
OS: Linux (All)
Status: NEW
Severity: normal
Priority: medium
Component: Drivers/Gallium/r600
Assignee: dri-devel@lists.freedesktop.org
Reporter: buchno@riseup.net
Created attachment 110028 --> https://bugs.freedesktop.org/attachment.cgi?id=110028&action=edit
134220-line strace from loading the game to the freeze to the force quit.
Europa Universalis 4 stops responding while loading after "Play" is pressed on the nation select screen.
The problem is not present in Mesa 10.3.2; it was introduced in 10.3.3. It appears to be related to the radeon driver, as the Intel driver works without issue.
OS: Arch Linux
GPU: Radeon HD 6950
GPU Driver: xf86-video-ati 7.5.0
CPU: Intel i5-2500K
--- Comment #1 from Michel Dänzer michel@daenzer.net --- Can you bisect Mesa?
Felix Schwarz felix.schwarz@oss.schwarz.eu changed:
What    |Removed                     |Added
----------------------------------------------------------------------------
CC      |                            |felix.schwarz@oss.schwarz.eu
Blocks  |                            |77449
Eero Tamminen eero.t.tamminen@intel.com changed:
What    |Removed                     |Added
----------------------------------------------------------------------------
Summary |Europa Universalis 4 freezing during game start (10.3.3) |[radeon] Europa Universalis 4 freezing during game start (10.3.3)
--- Comment #2 from glwhieuhghfbjds@riseup.net --- After a couple of hours of compiling...
Commit e8c7affa66407932519fc6d82a449b453343d9fc works fine. Commit d26258166ca056da62536bebdf107e21d9ce92fb introduces the issue.
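The bisection reported above is essentially a binary search over the commit history between a known-good and a known-bad commit. A minimal Python sketch of the idea follows, assuming a linear history and a hypothetical `freezes` predicate standing in for "build Mesa at this commit and check whether the game freezes". The commit labels are placeholders, not real Mesa hashes.

```python
from typing import Callable, Sequence


def first_bad_commit(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the first commit for which is_bad() is true.

    Assumes commits are ordered oldest to newest, commits[0] is known good,
    commits[-1] is known bad, and every commit after the first bad one is
    also bad (the same assumption `git bisect` relies on).
    """
    good, bad = 0, len(commits) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(commits[mid]):
            bad = mid       # regression is at mid or earlier
        else:
            good = mid      # regression is after mid
    return commits[bad]


# Placeholder history: "c0" is the last known-good commit, "c7" the known-bad one.
history = [f"c{i}" for i in range(8)]
tested = []


def freezes(commit: str) -> bool:
    # Hypothetical test: pretend the regression landed in "c5".
    tested.append(commit)
    return int(commit[1:]) >= 5


culprit = first_bad_commit(history, freezes)
print(culprit, "found after testing", tested)
```

Each step halves the remaining range, so only about log2(N) builds are needed for N candidate commits.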
--- Comment #3 from Marek Olšák maraeo@gmail.com --- It looks like the R600 driver cannot handle some loops and hangs (in an infinite loop probably).
There are some new fixes for Cayman in the master branch. Can you apply them and see if they help?
http://cgit.freedesktop.org/mesa/mesa/log?qt=grep&q=r600
--- Comment #4 from glwhieuhghfbjds@riseup.net --- Commit 133280120b4bc714bbb7665e383f36ab262c280a, after the fixes for Cayman in master, still has the issue.
Or did you mean for me to cherry-pick specific fixes?
--- Comment #5 from Marek Olšák maraeo@gmail.com --- (In reply to glwhieuhghfbjds from comment #4)
> Commit 133280120b4bc714bbb7665e383f36ab262c280a, after the fixes for Cayman in master, still has the issue.
> Or did you mean for me to cherry-pick specific fixes?
No. It looks like Cayman still has some bugs.
--- Comment #6 from Joti Papadopoulos pan.papadopoulos80@gmail.com --- Actually, I'm having this issue as well with an HD5850, so it's probably not Cayman-specific.
OS: Arch Linux
GPU: Radeon HD5850
Mesa: 10.3.5 and 10.5git (last tested about a week ago)
CPU: Phenom 2 X3 720
--- Comment #7 from Artem Hluvchynskyi excieve@gmail.com --- Same issue with HD5730 and mesa git: Mesa 10.5.0-devel (git-8d2542f). I tried turning off every available graphics option in the game and using "notiling" (which helped with a similar GPU lockup in Tropico 5), but nothing helped.
--- Comment #8 from Artem Hluvchynskyi excieve@gmail.com --- Created attachment 111977 --> https://bugs.freedesktop.org/attachment.cgi?id=111977&action=edit
Part of the log during the lockup
Messages in the attached part of the log are repeating for a while, with monitors going off and on. Eventually the system is usable for a while and then just locks up for good, requiring a hard reset.
--- Comment #9 from João Grego jpgrego@gmail.com --- Having the same issue with HD5650. When using mesa 10.2.8 there was no problem, but when I upgraded to mesa 10.4.2 the entire system froze after loading a game, either new or saved.
OS: Gentoo amd64
GPU: Radeon HD5650
GPU driver: xf86-video-ati 7.5.0
Mesa: 10.4.2
CPU: Intel i3 M370
--- Comment #10 from Felix Schwarz felix.schwarz@oss.schwarz.eu --- I can confirm that the issue was introduced (on master) with this commit:
commit 6fcb5520b78cdf1e5013c125501932315a069955
Author: Marek Olšák marek.olsak@amd.com
Date: Tue Oct 28 19:49:44 2014 +0100
Revert "st/mesa: set MaxUnrollIterations = 255"
This reverts commit 20836c81851e0df29a8ee9c86e5e5388738c840b.
255 is a huge number. If you have a loop with 255 iterations, unrolling it will exceed the SM3 instruction limit. Let's use the default again.
The comment about a SM3 limit doesn't make sense. For SM3, we generally want 32 (default) or a lower number due to the SM3 instruction limit, which is 512 instructions. For SM4, we can try higher numbers if needed, but some shaders can end up being pretty huge and shader compilation can take more time.
This fixes a shader compile failure on R500/SM3. Reported on IRC.
Cc: 10.2 10.3 mesa-stable@lists.freedesktop.org
Reviewed-by: Brian Paul brianp@vmware.com
Marek: You mentioned that the r600 driver might be unable to handle certain loops. Is there anything the community can do to get this fixed? apitrace? Checking for piglit regressions related to the mentioned commit? I assume that you would be able to fix this much better if you could reproduce the problem easily.
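The instruction-limit argument in the quoted commit message comes down to simple arithmetic. Here is a rough Python sketch, assuming (for illustration only) that fully unrolling a loop costs about iterations × body instructions. The 512-instruction SM3 limit and the values 32 and 255 come from the commit message; the function names and the 4-instruction loop body are made up.

```python
SM3_INSTRUCTION_LIMIT = 512  # from the commit message quoted above


def unrolled_size(iterations: int, body_instructions: int) -> int:
    """Rough instruction count after fully unrolling a loop (illustrative model)."""
    return iterations * body_instructions


def fits_sm3(iterations: int, body_instructions: int) -> bool:
    """True if the unrolled loop stays within the SM3 instruction limit."""
    return unrolled_size(iterations, body_instructions) <= SM3_INSTRUCTION_LIMIT


# A hypothetical loop whose body compiles to 4 instructions:
for max_unroll in (32, 255):
    size = unrolled_size(max_unroll, 4)
    print(f"{max_unroll} iterations -> {size} instructions, "
          f"fits SM3: {fits_sm3(max_unroll, 4)}")
```

Under this model, even a small loop body overflows the SM3 limit at 255 iterations, while 32 iterations leaves room for the rest of the shader.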
--- Comment #11 from Marek Olšák maraeo@gmail.com --- I'd add a PIPE_CAP and allow the driver to set MaxUnrollIterations with it. For r600g, the old value should be used. For other drivers, the new value should be used.
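The proposal is to let each driver report its own unroll limit instead of hard-coding one value for all of them. Below is a toy Python model of that design. The real code is C inside Mesa, and every name here (`DRIVER_UNROLL_HINT`, `max_unroll_iterations`, `some_other_driver`) is hypothetical. It takes 255 as "the old value" and 32 as "the new value", based on the reverted commit quoted in comment #10.

```python
DEFAULT_MAX_UNROLL = 32  # "the new value": the default restored by the revert

# Hypothetical per-driver overrides; only r600g keeps "the old value".
DRIVER_UNROLL_HINT = {
    "r600g": 255,
}


def max_unroll_iterations(driver: str) -> int:
    """Return the loop-unroll limit the shader compiler should use for a driver."""
    return DRIVER_UNROLL_HINT.get(driver, DEFAULT_MAX_UNROLL)


print(max_unroll_iterations("r600g"))              # 255
print(max_unroll_iterations("some_other_driver"))  # 32
```

The point of the design is that the shared compiler code asks the driver for a limit, so a workaround for one driver does not change behaviour for the others.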
--- Comment #12 from Felix Schwarz felix.schwarz@oss.schwarz.eu --- (In reply to Marek Olšák from comment #11)
> I'd add a PIPE_CAP and allow the driver to set MaxUnrollIterations with it.
"I'd add" as in "I plan to add" or as in "someone else should fix that by ..."?
--- Comment #13 from Médéric Boquien mboquien@free.fr --- For what it is worth, a temporary workaround would be setting R600_DEBUG=nosb. I saw that in a comment about bug #88263 and it turns out to work.
--- Comment #14 from Marek Olšák maraeo@gmail.com --- Created attachment 113246 --> https://bugs.freedesktop.org/attachment.cgi?id=113246&action=edit
patch
Hi guys, could you please test this patch?
--- Comment #15 from Felix Schwarz felix.schwarz@oss.schwarz.eu --- I applied your patch on top of mesa 7ea1e3749738c63388d3bcca327e4e4dd28f17b8 with llvm 3.5 and linux 3.18.5, and I can confirm that this patch fixes the original problem as expected.
One thing that bothers me personally is that no piglit test failed when the initial change was committed. Marek, do you think it would be a good idea as a newbie project to come up with a minimal API trace and try to build a piglit test preventing that kind of problem, or is that likely to be a fruitless effort?
--- Comment #16 from Marek Olšák maraeo@gmail.com --- Having a piglit test reproducing the issue would be very useful.
--- Comment #17 from Joti Papadopoulos pan.papadopoulos80@gmail.com --- I can confirm the patch fixes the issue here as well
--- Comment #18 from Felix Schwarz felix.schwarz@oss.schwarz.eu --- I created an API trace which reproduces the problem for me (https://dl.dropboxusercontent.com/s/hc6v7gdcshj4ljd/eu4.trace-lockup.xz?dl=0 , 120 MB; unfortunately, trimming produces only invalid states). The shaders themselves don't look too bad to me, but I don't have any experience deducing a minimal test case from an apitrace (pointers welcome).
Marek Olšák maraeo@gmail.com changed:
What    |Removed                     |Added
----------------------------------------------------------------------------
CC      |maraeo@gmail.com            |
--- Comment #19 from Marek Olšák maraeo@gmail.com --- (removing myself from the Cc list, already getting emails from dri-devel)
Joe Glaser jpg84@drexel.edu changed:
What    |Removed                     |Added
----------------------------------------------------------------------------
Status  |NEW                         |NEEDINFO
--- Comment #20 from Joe Glaser jpg84@drexel.edu --- Hey everyone,
I am a bit new to working with Mesa and haven't needed to patch anything in it before. However, I am experiencing the GPU overload described in the original post (hitting Play at Nation Select causes the game to attempt to load, but it crashes when displaying the units and requires a system reboot). I'd like to apply Marek Olšák's patch file, but I am unfamiliar with the commands needed to apply it on Fedora 21.
Any ideas?
OS: Fedora 21 x86_64 [Gnome3.14.2]
GPU: Radeon HD 6520G
GPU Driver: Gallium 0.4 on AMD SUMO
CPU: AMD A6-3420M APU with Radeon HD Graphics x4
Joe Glaser jpg84@drexel.edu changed:
What    |Removed                     |Added
----------------------------------------------------------------------------
Status  |NEEDINFO                    |NEW
Joe Glaser jpg84@drexel.edu changed:
What     |Removed                     |Added
----------------------------------------------------------------------------
Severity |normal                      |critical
--- Comment #21 from Benjamin Bellec b.bellec@gmail.com --- @Joe Glaser As said above, an easy workaround is to disable the r600 shader optimizer. Just launch Steam like this:
$ R600_DEBUG=nosb steam
Performance will not be good, though, and you already have a weak GPU. If it's too slow, you can try to patch and build Mesa by following this tutorial: http://forums.fedora-fr.org/viewtopic.php?pid=532589 If you don't read French, you can translate the page with Google Chrome. The translation is still... understandable.
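For scripted launches, the same workaround can be expressed as an environment override. A small Python sketch follows. The helper name `r600_debug_env` is made up, and the comma-separated flag syntax is assumed from usages such as `R600_DEBUG=ps,vs` elsewhere in this report.

```python
import os
import subprocess
import sys


def r600_debug_env(*flags: str) -> dict:
    """Copy the current environment and set R600_DEBUG to the given flags."""
    env = dict(os.environ)
    env["R600_DEBUG"] = ",".join(flags)
    return env


# Demonstrate with a child Python process instead of Steam, so the sketch
# runs anywhere; replace the command with ["steam"] for the real workaround.
child = subprocess.run(
    [sys.executable, "-c", "import os; print(os.environ['R600_DEBUG'])"],
    env=r600_debug_env("nosb"),
    capture_output=True,
    text=True,
    check=True,
)
print(child.stdout.strip())
```

Setting the variable only for the child process keeps the workaround from leaking into other applications started from the same shell.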
--- Comment #22 from Médéric Boquien mboquien@free.fr --- Has the patch been committed? I have just tried with Mesa 10.5.1 and the freeze is still there. Thanks!
--- Comment #23 from Benjamin Bellec b.bellec@gmail.com --- (In reply to Médéric Boquien from comment #22)
> Has the patch been committed? I have just tried with Mesa 10.5.1 and the freeze is still there. Thanks!
No, the patch hasn't been committed. A developer said the patch is not a proper fix. See this thread for more information: http://lists.freedesktop.org/archives/mesa-dev/2015-February/076633.html
--- Comment #24 from noga.dany@gmail.com --- Still present in 10.6.1, and the game is still unplayable on Radeon.
--- Comment #25 from Felix Schwarz felix.schwarz@oss.schwarz.eu --- For Fedora I have a COPR with Fedora's mesa package plus the workaround proposed by Marek on top. Other than that, I think what is really needed as a first step is to extract a minimal (piglit) test case.
--- Comment #26 from noga.dany@gmail.com --- I tried running it with the nosb parameter as mentioned above ("R600_DEBUG=nosb force_s3tc_enable=true /usr/bin/steam %U") and the game works, but the textures look very bad.
--- Comment #27 from Benjamin Bellec b.bellec@gmail.com --- (In reply to noga.dany from comment #26)
> I tried running it with the nosb parameter as mentioned above ("R600_DEBUG=nosb force_s3tc_enable=true /usr/bin/steam %U") and the game works, but the textures look very bad.
Remove the "force_s3tc_enable=true" in your command.
And be sure to have the s3tc lib installed (libtxc_dxtn). You can check that with this command: $ glxinfo |grep s3tc
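The same check can be scripted. Here is a sketch in Python that filters `glxinfo`-style output for S3TC entries. The sample text is fabricated for illustration and is not real output from any of the systems in this report. Unlike the `grep` above, the match is case-insensitive.

```python
def s3tc_lines(glxinfo_output: str) -> list:
    """Return the lines of glxinfo output that mention s3tc (case-insensitive)."""
    return [
        line.strip()
        for line in glxinfo_output.splitlines()
        if "s3tc" in line.lower()
    ]


# Made-up sample standing in for `glxinfo` output.
sample = """\
OpenGL extensions:
    GL_EXT_texture_compression_rgtc, GL_EXT_texture_compression_s3tc,
    GL_EXT_texture_sRGB, GL_EXT_texture_swizzle,
"""

matches = s3tc_lines(sample)
print(matches if matches else "no S3TC support advertised")
```

On a real system the input could come from `subprocess.run(["glxinfo"], capture_output=True, text=True).stdout`, assuming `glxinfo` is installed.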
--- Comment #28 from noga.dany@gmail.com --- (In reply to Benjamin Bellec from comment #27)
> (In reply to noga.dany from comment #26)
> > I tried running it with the nosb parameter as mentioned above ("R600_DEBUG=nosb force_s3tc_enable=true /usr/bin/steam %U") and the game works, but the textures look very bad.
> Remove the "force_s3tc_enable=true" in your command.
> And be sure to have the s3tc lib installed (libtxc_dxtn). You can check that with this command: $ glxinfo |grep s3tc
OK, I have installed libtxc_dxtn (64-bit and 32-bit) and tried it with "R600_DEBUG=nosb /usr/bin/steam %U". Unfortunately it doesn't look good either. Screenshot attached.
Mesa 10.6.4
Kernel 4.1.1
AMD HD 6870
So with nosb it works but looks bad, and without nosb it looks good but the system freezes.
--- Comment #29 from noga.dany@gmail.com --- Created attachment 117697 --> https://bugs.freedesktop.org/attachment.cgi?id=117697&action=edit
screenshot with R600_DEBUG=nosb on Mesa 10.6.4 and AMD HD 6870
--- Comment #30 from a.t.martens@gmail.com --- (In reply to noga.dany from comment #29)
> Created attachment 117697 [details] screenshot with R600_DEBUG=nosb on Mesa 10.6.4 and AMD HD 6870
I think the graphics glitches that appear with nosb can be fixed if you disable some of the effects. I can't remember which at the moment. I was able to run with no apparent issues on my HD 6850 after turning on nosb.
noga.dany@gmail.com changed:
What    |Removed                     |Added
----------------------------------------------------------------------------
Version |10.3                        |11.0
Summary |[radeon] Europa Universalis 4 freezing during game start (10.3.3) |[radeon] Europa Universalis 4 freezing during game start (10.3.3+, still broken on 11.0.2)
--- Comment #31 from noga.dany@gmail.com --- Tested on 11.0.2 and it still freezes.
--- Comment #32 from Benjamin Bellec b.bellec@gmail.com --- Marek pushed the fix. So it's likely to be fixed in Mesa 11.0.4 which should be released in less than a week.
Fabio Pedretti fabio.ped@libero.it changed:
What       |Removed                     |Added
----------------------------------------------------------------------------
Status     |NEW                         |RESOLVED
Resolution |---                         |FIXED
Lukáš Krejza gryffus@hkfree.org changed:
What     |Removed                     |Added
----------------------------------------------------------------------------
See Also |                            |https://bugs.freedesktop.org/show_bug.cgi?id=93706
--- Comment #33 from Lukáš Krejza gryffus@hkfree.org --- I have found a possibly related bug/regression on current Mesa git: https://bugs.freedesktop.org/show_bug.cgi?id=93706
iive@yahoo.com changed:
What       |Removed                         |Added
----------------------------------------------------------------------------
Status     |RESOLVED                        |REOPENED
Assignee   |dri-devel@lists.freedesktop.org |iive@yahoo.com
Resolution |FIXED                           |---
CC         |                                |iive@yahoo.com
--- Comment #34 from iive@yahoo.com --- Created attachment 125183 --> https://bugs.freedesktop.org/attachment.cgi?id=125183&action=edit
EU4 shader #175 in TGSI, unoptimized disassembly, sbdump of all stages and optimized disassembly
While the committed workaround does work for this case, the bug in R600 Shader Backend is not fixed and it is triggered by other more complicated shaders. For example: https://bugs.freedesktop.org/show_bug.cgi?id=94900
I had locally reverted the unroll workaround in order to obtain the form that triggers this bug. If you need to test the bug with this shader, then in `r600_pipe.c:559` you have to set `PIPE_SHADER_CAP_MAX_UNROLL_ITERATIONS_HINT` to 32, instead of 255.
The buggy shader is the vertex shader of call 1024042 in the trace. When using `R600_DEBUG=ps,vs`, the shader is listed as #175.
As in the other bug report, this shader causes an assertion failure in the sb_checker (if Mesa is compiled with debugging), and the bug can also be worked around with `R600_DEBUG=sbsafemath`.
This works because it disables the call to `fold_assoc()` in `expr_handler::fold_alu_op2()` somewhere around `sb_expr.cpp:740`
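Judging by its name, `fold_assoc()` reassociates chains of an associative operation so that constants can be combined. The following Python sketch is a generic illustration of that kind of folding for addition, not a transcription of the Mesa code. The `Add` node and `fold_add` function are hypothetical.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Add:
    """Expression node: left + right."""
    left: "Expr"
    right: "Expr"


Expr = Union[Add, str, int, float]  # str = variable name, number = constant


def is_const(e) -> bool:
    return isinstance(e, (int, float))


def fold_add(e):
    """Rewrite (x + c1) + c2 into x + (c1 + c2), with c1 + c2 evaluated now."""
    if isinstance(e, Add):
        left, right = fold_add(e.left), fold_add(e.right)
        if isinstance(left, Add) and is_const(left.right) and is_const(right):
            return Add(left.left, left.right + right)
        return Add(left, right)
    return e


# Address-style computation: (counter + 4) + 8 becomes counter + 12.
folded = fold_add(Add(Add("counter", 4), 8))
print(folded)

# Reassociation is exact for integers but not for floats:
print((0.1 + 0.2) + 0.3 == 0.1 + (0.2 + 0.3))  # prints False
```

The float comparison shows a general reason compilers gate such folds behind a "safe math" option. Whether that is the motivation behind `sbsafemath` is an assumption here.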
In order to locate the bug, I've enabled the sbdump for all SB stages.
I'm also uploading a second log, with `fold_assoc()` disabled, so that a side-by-side comparison of both logs can show how the function affects the result through the different stages. (I recommend the `diffuse` program.)
This shader is easier to analyze, because it contains just one loop with 4 iterations and no other conditional branches and jumps. The loop counter register is used as index for memory access. The memory address calculations might be involved in triggering the bug as `fold_assoc()` works on them. The `sb_checker` complains about instructions that list the counter register, so it is possible that the instruction that increments it is somehow "optimized" out.
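To illustrate the hypothesis in the last sentence: if an optimization pass drops the instruction that increments the loop counter, the exit condition never becomes true and the loop spins forever, which would match the observed hang. The toy Python model below is not R600 code, and the step budget exists only so that the demonstration terminates.

```python
def run_loop(iterations: int, keep_increment: bool, step_budget: int = 1000):
    """Simulate `for (i = 0; i < iterations; i++)`, optionally without the i++.

    Returns the number of body executions, or None if the step budget ran
    out, which stands in for a GPU hang.
    """
    i = 0
    steps = 0
    while i < iterations:
        steps += 1              # loop body (e.g. a memory access indexed by i)
        if steps >= step_budget:
            return None         # never reached the exit condition
        if keep_increment:
            i += 1              # the instruction suspected of being removed
    return steps


print(run_loop(4, keep_increment=True))   # 4 iterations, as in shader #175
print(run_loop(4, keep_increment=False))  # None: the loop never exits
```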