https://bugs.freedesktop.org/show_bug.cgi?id=104492
Bug ID: 104492 Summary: Compute Shader: Wrong alignment when assigning struct value to structured SSBO Product: Mesa Version: 17.2 Hardware: x86-64 (AMD64) OS: Linux (All) Status: NEW Severity: normal Priority: medium Component: Drivers/Gallium/radeonsi Assignee: dri-devel@lists.freedesktop.org Reporter: florian.will@googlemail.com QA Contact: dri-devel@lists.freedesktop.org
Created attachment 136547 --> https://bugs.freedesktop.org/attachment.cgi?id=136547&action=edit Code to reproduce this issue. Requires SDL2.
Hi,
I'm trying to get the Banshee 3D engine https://github.com/BearishSun/BansheeEngine working on my hardware (HD 7870) using Mesa radeonsi (amdgpu kernel module). I use 17.2.4-0ubuntu1~17.10.1 from the Ubuntu artful-proposed archive.
It uses a compute shader that reads a texture cube and writes 6 sets of coefficients into a shader storage buffer object. Details are not important as I have isolated the (probably) incorrect driver behaviour and created a much simpler program that triggers the same issue.
In short: Let's say we have a GLSL struct and access the SSBO through a dynamically-sized array of that struct type using the std430 layout, like this:
struct ResultRecord { float a[10]; float b[10]; float c[10]; float weight; };
layout(std430) buffer gOutput { ResultRecord ssb[]; };
Then a simple assignment of a local variable of the same struct type to ssb[i] writes the floats into incorrect buffer offsets, because b, c and "weight" are placed as if the array elements in a, b and c were vec4-aligned instead of float-aligned, tripling the size of a, b and c.
Code that triggers it is like this, which should hopefully be valid GLSL (and makes more sense the way it is used in Banshee 3D): ResultRecord result; // ... populate result ... ssb[gl_LocalInvocationIndex] = result;
The test program is available here and contains a (maybe too) detailed explanation and three program variations that fix the issue: https://gist.github.com/w-flo/b1a5791749eea5f36cb54628037ba2bf But I'll also attach it to this bug report.
Looking at the RADEON_DUMP_SHADERS=1 output, I think that the bug is already visible in the TGSI dump, as explained in the comment at the top of my reproducer program. I'll attach the output (it's possibly based on an older version of the reproducer).
https://bugs.freedesktop.org/show_bug.cgi?id=104492
--- Comment #1 from florian.will@googlemail.com --- Created attachment 136548 --> https://bugs.freedesktop.org/attachment.cgi?id=136548&action=edit RADEON_DUMP_SHADERS=1 output.
https://bugs.freedesktop.org/show_bug.cgi?id=104492
--- Comment #2 from Ilia Mirkin imirkin@alum.mit.edu --- Can you see if this is fixed on master? I fixed a seemingly related issue with
https://cgit.freedesktop.org/mesa/mesa/commit/?id=0332c7484b712e56ce1a6648c5...
https://bugs.freedesktop.org/show_bug.cgi?id=104492
--- Comment #3 from florian.will@googlemail.com --- Hi Ilia,
unfortunately updating to 17.4~git1801031930.ad2187~oibaf~a (apparently a Jan 3 commit) did not fix this test case. I'll downgrade again because that package breaks renderdoc and the gnome login screen (wayland-based) on my system, probably packaging-related.
https://bugs.freedesktop.org/show_bug.cgi?id=104492
florian.will@googlemail.com changed:
What |Removed |Added ---------------------------------------------------------------------------- Version|17.2 |git
https://bugs.freedesktop.org/show_bug.cgi?id=104492
--- Comment #4 from florian.will@googlemail.com --- Maybe the bug is in compiler/glsl/lower_buffer_access.cpp? https://cgit.freedesktop.org/mesa/mesa/tree/src/compiler/glsl/lower_buffer_a...
I'm not sure about the control flow for radeonsi, but if that pass is used in my setup, and this emit_access() code is used to break down struct derefs into multiple scalar/array/vec derefs, then it seems like the check for packing == GLSL_INTERFACE_STD430 is missing in lines 77/78 and 85, and the std140 layout is assumed instead. I think that might explain the observed incorrect behaviour.
In that same file, the check for std430 was added in a few places, like in line 338.
I'd love to give this a try and add checks to use std430_base_alignment() and std430_size() if appropriate, but I'm not really prepared to compile Mesa myself right now.
So, if someone who knows this code feels like my suggested change is correct and required, I'd be more than happy if they could take care of this. Otherwise I might give it a try myself in some time.
https://bugs.freedesktop.org/show_bug.cgi?id=104492
--- Comment #5 from florian.will@googlemail.com --- I'm preparing a patch right now with the changes described in my last comment. The patch fixes my test case and I see no piglit regressions (only some unstable tests unrelated to my patch). If I get git send-email to work, I'll send it to the list.
https://bugs.freedesktop.org/show_bug.cgi?id=104492
Timothy Arceri t_arceri@yahoo.com.au changed:
What |Removed |Added ---------------------------------------------------------------------------- Status|NEW |RESOLVED Resolution|--- |FIXED
--- Comment #6 from Timothy Arceri t_arceri@yahoo.com.au --- Fixed by:
commit 7e025def6d7d3d6bf94facd6ec6d956f40cbb31e Author: Florian Will florian.will@gmail.com Date: Fri Jan 5 15:33:31 2018 +0100
glsl: Respect std430 layout in lower_buffer_access
Respect the std430 rules for determining offset and size of struct members when using a std430 buffer. std140 rules lead to wrong buffer offsets in that case.
Fixes my test case attached in Bugzilla. No piglit changes.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104492 Reviewed-by: Timothy Arceri tarceri@itsqueeze.com
dri-devel@lists.freedesktop.org