On Tue., Mar. 17, 2020, 06:02 Michel Dänzer, michel@daenzer.net wrote:
On 2020-03-16 7:33 p.m., Marek Olšák wrote:
On Mon, Mar 16, 2020 at 5:57 AM Michel Dänzer michel@daenzer.net wrote:
On 2020-03-16 4:50 a.m., Marek Olšák wrote:
The synchronization works because the Mesa driver waits for idle (drains the GFX pipeline) at the end of command buffers and there is only 1 graphics queue, so everything is ordered.
The GFX pipeline runs asynchronously to the command buffer, meaning the command buffer only starts draws and doesn't wait for completion. If the Mesa driver didn't wait at the end of the command buffer, the command buffer would finish and a different process could start execution of its own command buffer while shaders of the previous process are still running.
If the Mesa driver submits a command buffer internally (because it's full), it doesn't wait, so the GFX pipeline doesn't notice that a command buffer ended and a new one started.
The waiting at the end of command buffers happens only when the flush is external (SwapBuffers, glFlush).
It's a performance problem, because the GFX queue is blocked until the GFX pipeline is drained at least at the end of every frame.
So explicit fences for SwapBuffers would help.
Not sure what difference it would make, since the same thing needs to be done for explicit fences as well, doesn't it?
No. Explicit fences don't require userspace to wait for idle in the command buffer. Fences are signalled when the last draw is complete and caches are flushed. Before that happens, any command buffer that is not dependent on the fence can start execution. There is never a need for the GPU to be idle if there is enough independent work to do.
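To make the difference concrete, here is a toy timeline model in Python. It is only a sketch: `CommandBuffer`, `run_wait_for_idle`, `run_with_fences` and the time units are made up for illustration and are not Mesa or amdgpu interfaces, and it assumes the pipeline can overlap shaders from different command buffers freely. The queue is kept strictly in order.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CommandBuffer:
    name: str
    launch_time: int   # time the queue spends issuing the draws
    shader_time: int   # time the pipeline keeps running after the last draw is issued
    depends_on: Optional[str] = None  # buffer whose fence must signal first (hypothetical field)


def run_wait_for_idle(buffers):
    """Every buffer ends with a wait for idle: the queue is blocked until its shaders finish."""
    now = 0
    done = {}
    for cb in buffers:
        now += cb.launch_time + cb.shader_time
        done[cb.name] = now
    return done


def run_with_fences(buffers):
    """The queue is released once the draws are issued; only dependent buffers wait on a fence."""
    now = 0   # time at which the queue is free again
    done = {}  # time at which each buffer's fence signals
    for cb in buffers:
        start = now
        if cb.depends_on is not None:
            start = max(start, done[cb.depends_on])
        now = start + cb.launch_time
        done[cb.name] = now + cb.shader_time
    return done


buffers = [
    CommandBuffer("A", launch_time=1, shader_time=10),
    CommandBuffer("B", launch_time=1, shader_time=10),
    CommandBuffer("C", launch_time=1, shader_time=2, depends_on="A"),
]
print(run_wait_for_idle(buffers))
print(run_with_fences(buffers))
```

In this model the independent buffer B starts right after A's draws are issued instead of after A's shaders have drained; only C, which depends on A's fence, waits.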
I don't think explicit fences in the context of this discussion imply using that different fence signalling mechanism though. My understanding is that the API proposed by Jason allows implicit fences to be used as explicit ones and vice versa, so presumably they have to use the same signalling mechanism.
Anyway, maybe the different fence signalling mechanism you describe could be used by the amdgpu kernel driver in general, then Mesa could drop the waits for idle and get the benefits with implicit sync as well?
Yes. If there is any waiting, it should be done in the GPU scheduler, not in the command buffer, so that independent command buffers can use the GFX queue.
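A rough Python sketch of why the place of the wait matters. This is a toy model, not the amdgpu scheduler: `Job`, `run_queue` and the time units are hypothetical, and dependencies are assumed to point only at jobs submitted earlier.

```python
import math
from typing import NamedTuple, Optional


class Job(NamedTuple):
    name: str
    launch_time: int   # time the queue spends issuing the draws
    shader_time: int   # time until the job's fence signals after launch
    depends_on: Optional[str] = None  # earlier job whose fence must signal first


def run_queue(jobs, wait_in_scheduler):
    """Return (execution order, fence signal times) for one GFX queue."""
    pending = list(jobs)
    now = 0
    signalled_at = {}
    order = []

    def is_ready(job):
        return job.depends_on is None or signalled_at.get(job.depends_on, math.inf) <= now

    while pending:
        if wait_in_scheduler:
            # The scheduler skips blocked jobs and runs the first ready one.
            ready = [j for j in pending if is_ready(j)]
            if not ready:
                # Nothing can run: idle until the next fence that unblocks a job.
                now = min(signalled_at[j.depends_on]
                          for j in pending if j.depends_on in signalled_at)
                continue
            job = ready[0]
        else:
            # The wait sits in the command buffer: the head of the queue
            # blocks everything behind it until its dependency signals.
            job = pending[0]
            if job.depends_on is not None:
                now = max(now, signalled_at[job.depends_on])
        pending.remove(job)
        now += job.launch_time
        signalled_at[job.name] = now + job.shader_time
        order.append(job.name)
    return order, signalled_at


jobs = [Job("A", 1, 10), Job("B", 1, 2, depends_on="A"), Job("C", 1, 2)]
print(run_queue(jobs, wait_in_scheduler=False))
print(run_queue(jobs, wait_in_scheduler=True))
```

With the wait in the command buffer, the independent job C is stuck behind B; with the wait in the scheduler, C runs while B is still waiting for A's fence.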
Marek
-- 
Earthling Michel Dänzer | https://redhat.com
Libre software enthusiast | Mesa and X developer