[ Dropping dri-devel list ]
On Mit, 2012-02-01 at 15:01 +0000, Simon Farnsworth wrote:
r300g is able to sleep until a fence completes rather than busywait because it creates a special buffer object and relocation that stays busy until the CS containing the fence is finished.
Copy the idea into r600g, and use it to sleep if the user asked for an infinite wait, falling back to busywaiting if the user provided a timeout.
Signed-off-by: Simon Farnsworth simon.farnsworth@onelan.co.uk
This is a userspace only fix for my problem, as suggested by Mark Olšák. It doesn't fix the case with a timeout (which doesn't matter to me), but works on existing kernels.
Given that my previous suggested fix opened a can of worms (mostly because I don't fully understand modern Radeon hardware), this is probably enough of a fix to apply for now, leaving a proper fix for someone with more understanding of the way multiple rings interact.
[...]
diff --git a/src/gallium/drivers/r600/r600_hw_context.c b/src/gallium/drivers/r600/r600_hw_context.c index 8eb8e6d..cc81c48 100644 --- a/src/gallium/drivers/r600/r600_hw_context.c +++ b/src/gallium/drivers/r600/r600_hw_context.c @@ -1623,6 +1623,20 @@ void r600_context_emit_fence(struct r600_context *ctx, struct r600_resource *fen ctx->pm4[ctx->pm4_cdwords++] = 0; /* DATA_HI */ ctx->pm4[ctx->pm4_cdwords++] = PKT3(PKT3_NOP, 0, 0); ctx->pm4[ctx->pm4_cdwords++] = r600_context_bo_reloc(ctx, fence_bo, RADEON_USAGE_WRITE);
- if (sleep_bo) {
unsigned reloc_index;
/* Create a dummy BO so that fence_finish without a timeout can sleep waiting for completion */
*sleep_bo = ctx->ws->buffer_create(ctx->ws, 1, 1,
PIPE_BIND_CUSTOM,
RADEON_DOMAIN_GTT);
/* Add the fence as a dummy relocation. */
reloc_index = ctx->ws->cs_add_reloc(ctx->cs,
ctx->ws->buffer_get_cs_handle(*sleep_bo),
RADEON_USAGE_READWRITE, RADEON_DOMAIN_GTT);
if (reloc_index >= ctx->creloc)
ctx->creloc = reloc_index+1;
- }
Is there a point in making sleep_bo optional?
diff --git a/src/gallium/drivers/r600/r600_pipe.c b/src/gallium/drivers/r600/r600_pipe.c index c38fbc5..71e31b1 100644 --- a/src/gallium/drivers/r600/r600_pipe.c +++ b/src/gallium/drivers/r600/r600_pipe.c @@ -605,6 +605,14 @@ static boolean r600_fence_finish(struct pipe_screen *pscreen, }
while (rscreen->fences.data[rfence->index] == 0) {
/* Special-case infinite timeout */
if (timeout == PIPE_TIMEOUT_INFINITE &&
rfence->sleep_bo) {
rscreen->ws->buffer_wait(rfence->sleep_bo, RADEON_USAGE_READWRITE);
pb_reference(&rfence->sleep_bo, NULL);
continue;
}
I think rfence->sleep_bo should only be unreferenced in r600_fence_reference() when the fence is recycled, otherwise it'll be leaked if r600_fence_finish() is never called for some reason.
If r600_fence_finish() only ever called os_time_sleep(), never sched_yield() (like r300_fence_finish()), would that avoid your problem even with a finite timeout?