https://bugs.freedesktop.org/show_bug.cgi?id=101739
Bug ID: 101739
Summary: An issue with alpha-to-coverage handling is causing Arma 3 64-bit Linux port to render trees incorrectly
Product: Mesa
Version: git
Hardware: Other
OS: All
Status: NEW
Severity: normal
Priority: medium
Component: Drivers/Gallium/radeonsi
Assignee: dri-devel@lists.freedesktop.org
Reporter: krystian.galaj@vpltd.com
QA Contact: dri-devel@lists.freedesktop.org
The game renders trees using multiple tree mesh models (LODs). When a tree gets close, the game switches between LODs by briefly rendering both models overlapping on a checkerboard-like grid of pixels: pixels of the old LOD in its black fields, and of the new one in the white fields.
As long as it's doing that using single-sampled buffers as render targets, all is fine. However, in higher quality modes the game switches to multisampled buffers. When using multisampled buffers, if ATOC is turned on in settings, the game uses the alpha-to-coverage technique, without polygon sorting, to make sure that the grass is rendered correctly. At the same time, it uses depth test and depth write to fill the depth buffer. In the same execution of the fragment shader it sets the output color and alpha value as well as the depth buffer value, and it expects the alpha value to cause the color value to end up in only some samples of the multisampled texture, and (the important bit) the depth value to end up in only the corresponding samples of the depth buffer.
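For illustration, here is a minimal Python model of the invariant this technique relies on. This is my own simplification, not driver or game code; the contiguous-low-bits mask and the rounding behavior are assumptions (real hardware uses dithered coverage patterns). The point is only that color and depth must be gated by the same per-sample mask.

```python
# Hedged sketch: a simplified model of alpha-to-coverage, NOT real GPU
# behavior. It only illustrates the invariant the game depends on.

def coverage_mask(alpha: float, num_samples: int) -> int:
    """Map alpha to a per-sample coverage mask (contiguous low bits here;
    real hardware may use dithered sample patterns)."""
    covered = min(num_samples, max(0, round(alpha * num_samples)))
    return (1 << covered) - 1

def shade_fragment(alpha: float, num_samples: int = 8):
    """Return (color_sample_mask, depth_sample_mask).
    Correct behavior: both writes use the same coverage mask."""
    mask = coverage_mask(alpha, num_samples)
    return mask, mask

color_mask, depth_mask = shade_fragment(0.5)
assert color_mask == depth_mask  # the invariant broken by the reported bug
```

The bug report below describes exactly this invariant failing: depth values landing in samples whose color was never written.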
This technique works on DX11 on Windows, on all drivers on Mac OS X, and on NVidia drivers on Linux, but on Mesa Radeonsi drivers on Linux it makes the rendering go bad, rendering most of the tree pixels white until the LOD transition ends.
The white is visible on the screen because it is the initial color of the multisampled render target, and for some reason this color is allowed to leak through, which means that in some cases the depth buffer value is set, when the corresponding color buffer value isn’t.
Both the depth and color buffer have the same number of samples (8), and both were created with fixedSampleLocations = true in the OpenGL call.
It looks as if the depth buffer values, in the case of the Mesa radeonsi driver, were:
- correctly not written when the depth test fails for the fragment,
- correctly written to all samples of the depth buffer when alpha coverage directs the draw to fill them all, but
- INCORRECTLY written not only to those samples to which alpha coverage directed the fragment output color, but either to all the samples in the depth buffer, or to samples in the depth buffer that don't correspond to the samples in the color attachment buffer.
We encountered the same issue in 2015 in fglrx drivers, contacted AMD team about it, and received confirmation that it is a bug, and that it was fixed. Unfortunately, shortly after that fglrx drivers went out of use.
The issue can be easily reproduced in Arma 3 by:
- launching the Arma 3 game,
- switching to High quality or higher to turn on multisampled buffers,
- enabling ATOC (alpha to coverage) in Video settings, or making sure it's enabled,
- launching the first level of the Drawdown 2035 campaign (i.e. starting a new campaign and bypassing the optional tutorial), and observing that, as the main character walks into the base, the bushes flicker white from time to time as they get closer.
The issue is present in Mesa 17.2.0-devel from padoka PPA, on at least Radeon R7 260X/360.
--- Comment #1 from Roland Scheidegger sroland@vmware.com --- My naive conclusion from the description would be that the hw is doing early-z optimizations when it shouldn't (so the depth updates happen before the final sample mask, as modified by the alpha value, is known). The driver doesn't directly set whether early z is enabled, since it just sets EARLY_Z_THEN_LATE_Z most of the time, and the hw should figure out if early z is possible (taking into account all state), otherwise use late z. If that's the case, then overriding the Z_ORDER to LATE_Z for the db_shader_control value might be necessary in this case. But that's just a guess, I could be completely wrong here - I've got only a very rough idea of radeonsi hw and driver...
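The early-z guess above can be sketched as a toy model (my own simplification for illustration, not hardware documentation): if depth is committed early, the shader-derived coverage mask does not exist yet, so the depth write cannot be gated by it; with late z it can.

```python
# Toy model of early-z vs. late-z interaction with alpha-to-coverage.
# Assumption (matching the guess above): early-z commits depth before the
# shader-derived coverage mask is available.

FULL_MASK = 0xFF  # 8 samples

def depth_write_mask(passes_depth: bool, coverage: int, early_z: bool) -> int:
    if not passes_depth:
        return 0  # depth test failed: nothing written (observed as correct)
    if early_z:
        return FULL_MASK  # mask not yet known: depth written to all samples
    return coverage  # late z: write gated by the alpha-to-coverage mask

coverage = 0b00001111  # alpha covered only 4 of 8 samples
assert depth_write_mask(True, coverage, early_z=True) == FULL_MASK  # the bug
assert depth_write_mask(True, coverage, early_z=False) == coverage  # expected
```

This matches the symptom pattern in the report: depth correctly skipped on a failed test, correct when coverage is full, wrong when coverage is partial.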
--- Comment #2 from jan.p57@gmx.de --- I also have this problem. Is there a way to force/override Z_ORDER to LATE_Z, ideally per application, so that I can try whether this has any effect? If not, what other way is there to test it? I am willing to, e.g., patch, compile and test some code if somebody tells me what to do; no promise when I'll find the time for that, though.
Vedran Miletić vedran@miletic.net changed:
What          |Removed      |Added
----------------------------------------------------------------------------
Blocks        |             |77449
Referenced Bugs:
https://bugs.freedesktop.org/show_bug.cgi?id=77449 [Bug 77449] Tracker bug for all bugs related to Steam titles
--- Comment #3 from Roland Scheidegger sroland@vmware.com --- (In reply to Jan from comment #2)
I also have this problem. Is there a way to force/override Z_ORDER to LATE_Z, ideally per application, so that I can try whether this has any effect? If not, what other way is there to test it? I am willing to, e.g., patch, compile and test some code if somebody tells me what to do; no promise when I'll find the time for that, though.
You can't override that, you'd need a Mesa patch looking something like this:

diff --git a/src/gallium/drivers/radeonsi/si_state.c b/src/gallium/drivers/radeonsi/si_state.c
index fcf4928e65..13e44dac16 100644
--- a/src/gallium/drivers/radeonsi/si_state.c
+++ b/src/gallium/drivers/radeonsi/si_state.c
@@ -1417,6 +1417,11 @@ static void si_emit_db_render_state(struct si_context *sctx, struct r600_atom *s
 		db_shader_control |= S_02880C_Z_ORDER(V_02880C_LATE_Z);
 	}
 
+	if (sctx->queued.named.blend->alpha_to_coverage) {
+		db_shader_control &= C_02880C_Z_ORDER;
+		db_shader_control |= S_02880C_Z_ORDER(V_02880C_LATE_Z);
+	}
+

Albeit it would probably need a blend state dependency like this too:

@@ -658,6 +658,10 @@ static void si_bind_blend_state(struct pipe_context *ctx, void *state)
 	    old_blend->dual_src_blend != blend->dual_src_blend)
 		si_mark_atom_dirty(sctx, &sctx->cb_render_state);
 
+	if (!old_blend ||
+	    old_blend->alpha_to_coverage != blend->alpha_to_coverage)
+		si_mark_atom_dirty(sctx, &sctx->db_render_state);
+
 	si_pm4_bind_state(sctx, blend, state);
 
 	if (!old_blend ||
But as said, I really don't have much knowledge of the driver.
--- Comment #4 from nadro-linux@wp.pl --- You can fix that issue by adding this entry to the drirc file:
-------
<application name="ARMA 3" executable="arma3.x86_64">
    <option name="glsl_correct_derivatives_after_discard" value="true"/>
</application>
-------
I hope that it will be permanently added to Mesa's official drirc file.
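For context, a standalone user drirc file (commonly ~/.drirc) wraps such an entry in driconf/device elements. This is a sketch following Mesa's drirc file format; the surrounding elements are assumed from that format rather than quoted from this thread:

```xml
<driconf>
    <device>
        <application name="ARMA 3" executable="arma3.x86_64">
            <option name="glsl_correct_derivatives_after_discard" value="true"/>
        </application>
    </device>
</driconf>
```

The executable attribute must match the binary name exactly for the option to apply.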
--- Comment #5 from ysblokje@gmail.com --- (In reply to nadro-linux from comment #4)
You can fix that issue by adding this entry to the drirc file:

<application name="ARMA 3" executable="arma3.x86_64">
    <option name="glsl_correct_derivatives_after_discard" value="true"/>
</application>

I hope that it will be permanently added to Mesa's official drirc file.
I just tried this; it has no effect for me, at least not on Mesa 17.3.1.
--- Comment #6 from ysblokje@gmail.com --- Addendum: using the glsl_correct_derivatives_after_discard=true command-line option does work.
One big "but", though: FPS took a nosedive.
--- Comment #7 from tom34 fura34@wp.pl --- I'm using Mesa 3D (17.3.6) from the stable padoka repo and I see white bushes in the game; here is an example:
(Headphone warning, mic volume too high) https://www.twitch.tv/videos/239136303?t=01m10s
--- Comment #8 from tom34 fura34@wp.pl --- I see that Bohemia mentioned this bug here: https://community.bistudio.com/wiki/Arma_3_Experimental_Ports#Known_Issues
"AMD Mesa drivers can cause graphical glitches, such as white blinking vegetation LODs."
"ATOC might cause rendering issues with AMD cards using MESA drivers."
Jan Havran havran.jan@email.cz changed:
What          |Removed      |Added
----------------------------------------------------------------------------
CC            |             |havran.jan@email.cz
--- Comment #9 from Jan Havran havran.jan@email.cz --- I can confirm this bug (and also other users are facing it, like [1]).
My spec:
Distribution: Antergos 64-bit
Linux: 4.15.9
Mesa: 17.3.6
Arma: 1.80
Processor: Intel(R) Core(TM) i5-4210M CPU @ 2.60GHz
GPU: AMD Radeon R7 M265
[1] https://www.gamingonlinux.com/articles/the-linux-beta-of-arma-3-has-been-upd...
Gregor Münch gr.muench@gmail.com changed:
What          |Removed      |Added
----------------------------------------------------------------------------
CC            |             |maraeo@gmail.com
--- Comment #10 from Gregor Münch gr.muench@gmail.com --- Some comments from a VP dev: https://www.gamingonlinux.com/articles/the-linux-beta-of-arma-3-has-been-upd...
It seems unclear whether the bug is on Mesa's side or VP's. Maybe some Mesa dev could comment.
--- Comment #11 from Marek Olšák maraeo@gmail.com --- If "glsl_correct_derivatives_after_discard=true" fixes it, it's not an alpha-to-coverage issue. It's a problem with the use of discard in GLSL.
There is no difference in alpha-to-coverage behavior between DX and GL. Make sure you have this fix: https://cgit.freedesktop.org/mesa/mesa/commit/?id=f222cfww3c6d6fc5d9dee3742d...
If there is a difference between DX and GL, the GL specification can be corrected or a new GL extension can be added for the DX behavior.
--- Comment #12 from Ilia Mirkin imirkin@alum.mit.edu --- Probably meant this change:
https://cgit.freedesktop.org/mesa/mesa/commit/?id=f222cf3c6d6fc5d9dee3742d20...
--- Comment #13 from Marek Olšák maraeo@gmail.com --- Yes. Thanks.
--- Comment #14 from Krystian Gałaj krystian.galaj@vpltd.com --- (In reply to Gregor Münch from comment #10)
Some comments from a VP dev: https://www.gamingonlinux.com/articles/the-linux-beta-of-arma-3-has-been-updated-to-180-compatible-with-windows-again-for-a-time.11349/ comment_id=118838
Seems to be that it's not clear if the bug is on Mesa or VPs side. Maybe some Mesa dev could comment.
I am not sure what we could do on VP side to work around the bug. It happens in a single execution of fragment shader on a multisampled color buffer and depth buffer. The same execution is writing a color value, and it's supposed to write a depth value into the corresponding sample of depth buffer. I don't know of any additional keywords in GLSL that we could specify to make sure this is the case. If anyone knows about something we're specifying wrong, please advise.
As for the rendering techniques used, we are only converting the rendering technique used by the original Arma 3 developer team from the Direct3D API to OpenGL. So one way of working around the problem would be to ask them to do LOD switching in another way in a future release - and then we could port that new version. But since it's not happening on the same graphics cards on Windows or Mac, only on Linux, it isn't likely this rework would be given any high priority. And while we are good at API knowledge and conversion between APIs, inventing a replacement technique in a game that isn't ours requires a slightly different approach and, above all, good knowledge of the entire complicated rendering engine used in the game, so as not to break anything.
I don't think that working around the problem is a good outcome for a bug ticket... this might be happening in other games, maybe not such high-profile ones, so it would make sense to fix it in the driver. It should be possible, given that it's working on the same cards using Windows drivers...
--- Comment #15 from Marek Olšák maraeo@gmail.com --- We can add "glsl_correct_derivatives_after_discard=true" to Mesa's drirc and call it a day.
--- Comment #16 from Gregor Münch gr.muench@gmail.com --- Tested today and the bug is still there with current git (also with nir). The workaround also still works and at least on my Tahiti card I actually experience no performance drop.
Marek Olšák maraeo@gmail.com changed:
What          |Removed      |Added
----------------------------------------------------------------------------
Resolution    |---          |FIXED
Status        |NEW          |RESOLVED
--- Comment #17 from Marek Olšák maraeo@gmail.com --- I pushed the workaround as commit 8e0b4cb8a1fcb1572be8eca16a806520aac08a61. Closing.