The function radeon_bo_list_validate can cause a bo to move, resulting in a different sync_obj and a dependency to wait for this move to finish.
Signed-off-by: Christian König deathsimple@vodafone.de Reviewed-by: Alex Deucher alexander.deucher@amd.com --- drivers/gpu/drm/radeon/radeon.h | 1 - drivers/gpu/drm/radeon/radeon_cs.c | 21 ++++++++++++++------- 2 files changed, 14 insertions(+), 8 deletions(-)
diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h index 884e0d4..4c1b981 100644 --- a/drivers/gpu/drm/radeon/radeon.h +++ b/drivers/gpu/drm/radeon/radeon.h @@ -836,7 +836,6 @@ struct radeon_cs_parser { struct radeon_cs_reloc *relocs; struct radeon_cs_reloc **relocs_ptr; struct list_head validated; - bool sync_to_ring[RADEON_NUM_RINGS]; /* indices of various chunks */ int chunk_ib_idx; int chunk_relocs_idx; diff --git a/drivers/gpu/drm/radeon/radeon_cs.c b/drivers/gpu/drm/radeon/radeon_cs.c index 435a3d9..7fd0987 100644 --- a/drivers/gpu/drm/radeon/radeon_cs.c +++ b/drivers/gpu/drm/radeon/radeon_cs.c @@ -85,12 +85,6 @@ int radeon_cs_parser_relocs(struct radeon_cs_parser *p) radeon_bo_list_add_object(&p->relocs[i].lobj, &p->validated);
- if (p->relocs[i].robj->tbo.sync_obj && !(r->flags & RADEON_RELOC_DONT_SYNC)) { - struct radeon_fence *fence = p->relocs[i].robj->tbo.sync_obj; - if (!radeon_fence_signaled(fence)) { - p->sync_to_ring[fence->ring] = true; - } - } } else p->relocs[i].handle = 0; } @@ -118,11 +112,24 @@ static int radeon_cs_get_ring(struct radeon_cs_parser *p, u32 ring, s32 priority
static int radeon_cs_sync_rings(struct radeon_cs_parser *p) { + bool sync_to_ring[RADEON_NUM_RINGS] = { }; int i, r;
+ for (i = 0; i < p->nrelocs; i++) { + if (!p->relocs[i].robj || !p->relocs[i].robj->tbo.sync_obj) + continue; + + if (!(p->relocs[i].flags & RADEON_RELOC_DONT_SYNC)) { + struct radeon_fence *fence = p->relocs[i].robj->tbo.sync_obj; + if (!radeon_fence_signaled(fence)) { + sync_to_ring[fence->ring] = true; + } + } + } + for (i = 0; i < RADEON_NUM_RINGS; ++i) { /* no need to sync to our own or unused rings */ - if (i == p->ring || !p->sync_to_ring[i] || !p->rdev->ring[i].ready) + if (i == p->ring || !sync_to_ring[i] || !p->rdev->ring[i].ready) continue;
if (!p->ib->fence->semaphore) {
So don't confuse devs by doing so.
Signed-off-by: Christian König deathsimple@vodafone.de --- drivers/gpu/drm/radeon/r600.c | 15 +-------------- 1 files changed, 1 insertions(+), 14 deletions(-)
diff --git a/drivers/gpu/drm/radeon/r600.c b/drivers/gpu/drm/radeon/r600.c index 4f08e5e..4a4ac8f 100644 --- a/drivers/gpu/drm/radeon/r600.c +++ b/drivers/gpu/drm/radeon/r600.c @@ -2719,20 +2719,7 @@ int r600_ib_test(struct radeon_device *rdev, int ring) ib->ptr[0] = PACKET3(PACKET3_SET_CONFIG_REG, 1); ib->ptr[1] = ((scratch - PACKET3_SET_CONFIG_REG_OFFSET) >> 2); ib->ptr[2] = 0xDEADBEEF; - ib->ptr[3] = PACKET2(0); - ib->ptr[4] = PACKET2(0); - ib->ptr[5] = PACKET2(0); - ib->ptr[6] = PACKET2(0); - ib->ptr[7] = PACKET2(0); - ib->ptr[8] = PACKET2(0); - ib->ptr[9] = PACKET2(0); - ib->ptr[10] = PACKET2(0); - ib->ptr[11] = PACKET2(0); - ib->ptr[12] = PACKET2(0); - ib->ptr[13] = PACKET2(0); - ib->ptr[14] = PACKET2(0); - ib->ptr[15] = PACKET2(0); - ib->length_dw = 16; + ib->length_dw = 3; r = radeon_ib_schedule(rdev, ib); if (r) { radeon_scratch_free(rdev, scratch);
2012/2/23 Christian König deathsimple@vodafone.de
So don't confuse devs by doing so.
Signed-off-by: Christian König deathsimple@vodafone.de
drivers/gpu/drm/radeon/r600.c | 15 +-------------- 1 files changed, 1 insertions(+), 14 deletions(-)
diff --git a/drivers/gpu/drm/radeon/r600.c b/drivers/gpu/drm/radeon/r600.c index 4f08e5e..4a4ac8f 100644 --- a/drivers/gpu/drm/radeon/r600.c +++ b/drivers/gpu/drm/radeon/r600.c @@ -2719,20 +2719,7 @@ int r600_ib_test(struct radeon_device *rdev, int ring) ib->ptr[0] = PACKET3(PACKET3_SET_CONFIG_REG, 1); ib->ptr[1] = ((scratch - PACKET3_SET_CONFIG_REG_OFFSET) >> 2); ib->ptr[2] = 0xDEADBEEF;
ib->ptr[3] = PACKET2(0);
ib->ptr[4] = PACKET2(0);
ib->ptr[5] = PACKET2(0);
ib->ptr[6] = PACKET2(0);
ib->ptr[7] = PACKET2(0);
ib->ptr[8] = PACKET2(0);
ib->ptr[9] = PACKET2(0);
ib->ptr[10] = PACKET2(0);
ib->ptr[11] = PACKET2(0);
ib->ptr[12] = PACKET2(0);
ib->ptr[13] = PACKET2(0);
ib->ptr[14] = PACKET2(0);
ib->ptr[15] = PACKET2(0);
ib->length_dw = 16;
ib->length_dw = 3; r = radeon_ib_schedule(rdev, ib); if (r) { radeon_scratch_free(rdev, scratch);
-- 1.7.5.4
You sure about that ? I remember this helped with GPU lockup and i also seen fglrx aligning IB.
Cheers, Jerome
On 23.02.2012 18:00, Jerome Glisse wrote:
2012/2/23 Christian König <deathsimple@vodafone.de mailto:deathsimple@vodafone.de>
So don't confuse devs by doing so. Signed-off-by: Christian König <deathsimple@vodafone.de <mailto:deathsimple@vodafone..de>> --- drivers/gpu/drm/radeon/r600.c | 15 +-------------- 1 files changed, 1 insertions(+), 14 deletions(-) diff --git a/drivers/gpu/drm/radeon/r600.c b/drivers/gpu/drm/radeon/r600.c index 4f08e5e..4a4ac8f 100644 --- a/drivers/gpu/drm/radeon/r600.c +++ b/drivers/gpu/drm/radeon/r600.c @@ -2719,20 +2719,7 @@ int r600_ib_test(struct radeon_device *rdev, int ring) ib->ptr[0] = PACKET3(PACKET3_SET_CONFIG_REG, 1); ib->ptr[1] = ((scratch - PACKET3_SET_CONFIG_REG_OFFSET) >> 2); ib->ptr[2] = 0xDEADBEEF; - ib->ptr[3] = PACKET2(0); - ib->ptr[4] = PACKET2(0); - ib->ptr[5] = PACKET2(0); - ib->ptr[6] = PACKET2(0); - ib->ptr[7] = PACKET2(0); - ib->ptr[8] = PACKET2(0); - ib->ptr[9] = PACKET2(0); - ib->ptr[10] = PACKET2(0); - ib->ptr[11] = PACKET2(0); - ib->ptr[12] = PACKET2(0); - ib->ptr[13] = PACKET2(0); - ib->ptr[14] = PACKET2(0); - ib->ptr[15] = PACKET2(0); - ib->length_dw = 16; + ib->length_dw = 3; r = radeon_ib_schedule(rdev, ib); if (r) { radeon_scratch_free(rdev, scratch); -- 1.7.5.4
You sure about that ? I remember this helped with GPU lockup and i also seen fglrx aligning IB.
Yeah, pretty much. Well I searched for halve an hour for the corresponding IB alignment in mesa/the CS ioctl until I finally figured out that there isn't any.
So IBs submitted by usermode aren't aligned in any way.... So it really seems to work fine and I couldn't find any reason why we should align an IB for the GFX ring in our docs also.
Christian.
Not all rings use PM4, so the cs_parser also needs to be per ring.
Signed-off-by: Christian König deathsimple@vodafone.de --- drivers/gpu/drm/radeon/radeon.h | 4 +- drivers/gpu/drm/radeon/radeon_asic.c | 38 +++++++++++++++++---------------- drivers/gpu/drm/radeon/radeon_cs.c | 2 +- 3 files changed, 23 insertions(+), 21 deletions(-)
diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h index 4c1b981..ca98772 100644 --- a/drivers/gpu/drm/radeon/radeon.h +++ b/drivers/gpu/drm/radeon/radeon.h @@ -1147,13 +1147,13 @@ struct radeon_asic { void (*emit_fence)(struct radeon_device *rdev, struct radeon_fence *fence); void (*emit_semaphore)(struct radeon_device *rdev, struct radeon_ring *cp, struct radeon_semaphore *semaphore, bool emit_wait); + int (*cs_parse)(struct radeon_cs_parser *p); } ring[RADEON_NUM_RINGS];
int (*ring_test)(struct radeon_device *rdev, struct radeon_ring *cp); int (*irq_set)(struct radeon_device *rdev); int (*irq_process)(struct radeon_device *rdev); u32 (*get_vblank_counter)(struct radeon_device *rdev, int crtc); - int (*cs_parse)(struct radeon_cs_parser *p); int (*copy_blit)(struct radeon_device *rdev, uint64_t src_offset, uint64_t dst_offset, @@ -1650,7 +1650,7 @@ void radeon_ring_write(struct radeon_ring *ring, uint32_t v); #define radeon_fini(rdev) (rdev)->asic->fini((rdev)) #define radeon_resume(rdev) (rdev)->asic->resume((rdev)) #define radeon_suspend(rdev) (rdev)->asic->suspend((rdev)) -#define radeon_cs_parse(p) rdev->asic->cs_parse((p)) +#define radeon_cs_parse(rdev, r, p) (rdev)->asic->ring[(r)].cs_parse((p)) #define radeon_vga_set_state(rdev, state) (rdev)->asic->vga_set_state((rdev), (state)) #define radeon_gpu_is_lockup(rdev, cp) (rdev)->asic->gpu_is_lockup((rdev), (cp)) #define radeon_asic_reset(rdev) (rdev)->asic->asic_reset((rdev)) diff --git a/drivers/gpu/drm/radeon/radeon_asic.c b/drivers/gpu/drm/radeon/radeon_asic.c index 36a6192..636c68f 100644 --- a/drivers/gpu/drm/radeon/radeon_asic.c +++ b/drivers/gpu/drm/radeon/radeon_asic.c @@ -145,12 +145,12 @@ static struct radeon_asic r100_asic = { .ib_execute = &r100_ring_ib_execute, .emit_fence = &r100_fence_ring_emit, .emit_semaphore = &r100_semaphore_ring_emit, + .cs_parse = &r100_cs_parse, } }, .irq_set = &r100_irq_set, .irq_process = &r100_irq_process, .get_vblank_counter = &r100_get_vblank_counter, - .cs_parse = &r100_cs_parse, .copy_blit = &r100_copy_blit, .copy_dma = NULL, .copy = &r100_copy_blit, @@ -197,12 +197,12 @@ static struct radeon_asic r200_asic = { .ib_execute = &r100_ring_ib_execute, .emit_fence = &r100_fence_ring_emit, .emit_semaphore = &r100_semaphore_ring_emit, + .cs_parse = &r100_cs_parse, } }, .irq_set = &r100_irq_set, .irq_process = &r100_irq_process, .get_vblank_counter = &r100_get_vblank_counter, - .cs_parse = &r100_cs_parse, .copy_blit = &r100_copy_blit, .copy_dma = &r200_copy_dma, .copy = &r100_copy_blit, @@ -248,12 +248,12 @@ static struct radeon_asic r300_asic = { .ib_execute = &r100_ring_ib_execute, .emit_fence = &r300_fence_ring_emit, .emit_semaphore = &r100_semaphore_ring_emit, + .cs_parse = &r300_cs_parse, } }, .irq_set = &r100_irq_set, .irq_process = &r100_irq_process, .get_vblank_counter = &r100_get_vblank_counter, - .cs_parse = &r300_cs_parse, .copy_blit = &r100_copy_blit, .copy_dma = &r200_copy_dma, .copy = &r100_copy_blit, @@ -300,12 +300,12 @@ static struct radeon_asic r300_asic_pcie = { .ib_execute = &r100_ring_ib_execute, .emit_fence = &r300_fence_ring_emit, .emit_semaphore = &r100_semaphore_ring_emit, + .cs_parse = &r300_cs_parse, } }, .irq_set = &r100_irq_set, .irq_process = &r100_irq_process, .get_vblank_counter = &r100_get_vblank_counter, - .cs_parse = &r300_cs_parse, .copy_blit = &r100_copy_blit, .copy_dma = &r200_copy_dma, .copy = &r100_copy_blit, @@ -351,12 +351,12 @@ static struct radeon_asic r420_asic = { .ib_execute = &r100_ring_ib_execute, .emit_fence = &r300_fence_ring_emit, .emit_semaphore = &r100_semaphore_ring_emit, + .cs_parse = &r300_cs_parse, } }, .irq_set = &r100_irq_set, .irq_process = &r100_irq_process, .get_vblank_counter = &r100_get_vblank_counter, - .cs_parse = &r300_cs_parse, .copy_blit = &r100_copy_blit, .copy_dma = &r200_copy_dma, .copy = &r100_copy_blit, @@ -403,12 +403,12 @@ static struct radeon_asic rs400_asic = { .ib_execute = &r100_ring_ib_execute, .emit_fence = &r300_fence_ring_emit, .emit_semaphore = &r100_semaphore_ring_emit, + .cs_parse = &r300_cs_parse, } }, .irq_set = &r100_irq_set, .irq_process = &r100_irq_process, .get_vblank_counter = &r100_get_vblank_counter, - .cs_parse = &r300_cs_parse, .copy_blit = &r100_copy_blit, .copy_dma = &r200_copy_dma, .copy = &r100_copy_blit, @@ -455,12 +455,12 @@ static struct radeon_asic rs600_asic = { .ib_execute = &r100_ring_ib_execute, .emit_fence = &r300_fence_ring_emit, .emit_semaphore = &r100_semaphore_ring_emit, + .cs_parse = &r300_cs_parse, } }, .irq_set = &rs600_irq_set, .irq_process = &rs600_irq_process, .get_vblank_counter = &rs600_get_vblank_counter, - .cs_parse = &r300_cs_parse, .copy_blit = &r100_copy_blit, .copy_dma = &r200_copy_dma, .copy = &r100_copy_blit, @@ -507,12 +507,12 @@ static struct radeon_asic rs690_asic = { .ib_execute = &r100_ring_ib_execute, .emit_fence = &r300_fence_ring_emit, .emit_semaphore = &r100_semaphore_ring_emit, + .cs_parse = &r300_cs_parse, } }, .irq_set = &rs600_irq_set, .irq_process = &rs600_irq_process, .get_vblank_counter = &rs600_get_vblank_counter, - .cs_parse = &r300_cs_parse, .copy_blit = &r100_copy_blit, .copy_dma = &r200_copy_dma, .copy = &r200_copy_dma, @@ -559,12 +559,12 @@ static struct radeon_asic rv515_asic = { .ib_execute = &r100_ring_ib_execute, .emit_fence = &r300_fence_ring_emit, .emit_semaphore = &r100_semaphore_ring_emit, + .cs_parse = &r300_cs_parse, } }, .irq_set = &rs600_irq_set, .irq_process = &rs600_irq_process, .get_vblank_counter = &rs600_get_vblank_counter, - .cs_parse = &r300_cs_parse, .copy_blit = &r100_copy_blit, .copy_dma = &r200_copy_dma, .copy = &r100_copy_blit, @@ -611,12 +611,12 @@ static struct radeon_asic r520_asic = { .ib_execute = &r100_ring_ib_execute, .emit_fence = &r300_fence_ring_emit, .emit_semaphore = &r100_semaphore_ring_emit, + .cs_parse = &r300_cs_parse, } }, .irq_set = &rs600_irq_set, .irq_process = &rs600_irq_process, .get_vblank_counter = &rs600_get_vblank_counter, - .cs_parse = &r300_cs_parse, .copy_blit = &r100_copy_blit, .copy_dma = &r200_copy_dma, .copy = &r100_copy_blit, @@ -662,12 +662,12 @@ static struct radeon_asic r600_asic = { .ib_execute = &r600_ring_ib_execute, .emit_fence = &r600_fence_ring_emit, .emit_semaphore = &r600_semaphore_ring_emit, + .cs_parse = &r600_cs_parse, } }, .irq_set = &r600_irq_set, .irq_process = &r600_irq_process, .get_vblank_counter = &rs600_get_vblank_counter, - .cs_parse = &r600_cs_parse, .copy_blit = &r600_copy_blit, .copy_dma = NULL, .copy = &r600_copy_blit, @@ -713,12 +713,12 @@ static struct radeon_asic rs780_asic = { .ib_execute = &r600_ring_ib_execute, .emit_fence = &r600_fence_ring_emit, .emit_semaphore = &r600_semaphore_ring_emit, + .cs_parse = &r600_cs_parse, } }, .irq_set = &r600_irq_set, .irq_process = &r600_irq_process, .get_vblank_counter = &rs600_get_vblank_counter, - .cs_parse = &r600_cs_parse, .copy_blit = &r600_copy_blit, .copy_dma = NULL, .copy = &r600_copy_blit, @@ -764,12 +764,12 @@ static struct radeon_asic rv770_asic = { .ib_execute = &r600_ring_ib_execute, .emit_fence = &r600_fence_ring_emit, .emit_semaphore = &r600_semaphore_ring_emit, + .cs_parse = &r600_cs_parse, } }, .irq_set = &r600_irq_set, .irq_process = &r600_irq_process, .get_vblank_counter = &rs600_get_vblank_counter, - .cs_parse = &r600_cs_parse, .copy_blit = &r600_copy_blit, .copy_dma = NULL, .copy = &r600_copy_blit, @@ -815,12 +815,12 @@ static struct radeon_asic evergreen_asic = { .ib_execute = &evergreen_ring_ib_execute, .emit_fence = &r600_fence_ring_emit, .emit_semaphore = &r600_semaphore_ring_emit, + .cs_parse = &evergreen_cs_parse, } }, .irq_set = &evergreen_irq_set, .irq_process = &evergreen_irq_process, .get_vblank_counter = &evergreen_get_vblank_counter, - .cs_parse = &evergreen_cs_parse, .copy_blit = &r600_copy_blit, .copy_dma = NULL, .copy = &r600_copy_blit, @@ -866,12 +866,12 @@ static struct radeon_asic sumo_asic = { .ib_execute = &evergreen_ring_ib_execute, .emit_fence = &r600_fence_ring_emit, .emit_semaphore = &r600_semaphore_ring_emit, - } + .cs_parse = &evergreen_cs_parse, + }, }, .irq_set = &evergreen_irq_set, .irq_process = &evergreen_irq_process, .get_vblank_counter = &evergreen_get_vblank_counter, - .cs_parse = &evergreen_cs_parse, .copy_blit = &r600_copy_blit, .copy_dma = NULL, .copy = &r600_copy_blit, @@ -917,12 +917,12 @@ static struct radeon_asic btc_asic = { .ib_execute = &evergreen_ring_ib_execute, .emit_fence = &r600_fence_ring_emit, .emit_semaphore = &r600_semaphore_ring_emit, + .cs_parse = &evergreen_cs_parse, } }, .irq_set = &evergreen_irq_set, .irq_process = &evergreen_irq_process, .get_vblank_counter = &evergreen_get_vblank_counter, - .cs_parse = &evergreen_cs_parse, .copy_blit = &r600_copy_blit, .copy_dma = NULL, .copy = &r600_copy_blit, @@ -979,24 +979,26 @@ static struct radeon_asic cayman_asic = { .ib_parse = &evergreen_ib_parse, .emit_fence = &cayman_fence_ring_emit, .emit_semaphore = &r600_semaphore_ring_emit, + .cs_parse = &evergreen_cs_parse, }, [CAYMAN_RING_TYPE_CP1_INDEX] = { .ib_execute = &cayman_ring_ib_execute, .ib_parse = &evergreen_ib_parse, .emit_fence = &cayman_fence_ring_emit, .emit_semaphore = &r600_semaphore_ring_emit, + .cs_parse = &evergreen_cs_parse, }, [CAYMAN_RING_TYPE_CP2_INDEX] = { .ib_execute = &cayman_ring_ib_execute, .ib_parse = &evergreen_ib_parse, .emit_fence = &cayman_fence_ring_emit, .emit_semaphore = &r600_semaphore_ring_emit, + .cs_parse = &evergreen_cs_parse, } }, .irq_set = &evergreen_irq_set, .irq_process = &evergreen_irq_process, .get_vblank_counter = &evergreen_get_vblank_counter, - .cs_parse = &evergreen_cs_parse, .copy_blit = &r600_copy_blit, .copy_dma = NULL, .copy = &r600_copy_blit, diff --git a/drivers/gpu/drm/radeon/radeon_cs.c b/drivers/gpu/drm/radeon/radeon_cs.c index 7fd0987..dc79d08 100644 --- a/drivers/gpu/drm/radeon/radeon_cs.c +++ b/drivers/gpu/drm/radeon/radeon_cs.c @@ -348,7 +348,7 @@ static int radeon_cs_ib_chunk(struct radeon_device *rdev, return r; } parser->ib->length_dw = ib_chunk->length_dw; - r = radeon_cs_parse(parser); + r = radeon_cs_parse(rdev, parser->ring, parser); if (r || parser->parser_error) { DRM_ERROR("Invalid command stream !\n"); return r;
On Thu, 2012-02-23 at 15:18 +0100, Christian König wrote:
Not all rings use PM4, so the cs_parser also needs to be per ring.
Signed-off-by: Christian König deathsimple@vodafone.de
Reviewed-by: Jerome Glisse jglisse@redhat.com
drivers/gpu/drm/radeon/radeon.h | 4 +- drivers/gpu/drm/radeon/radeon_asic.c | 38 +++++++++++++++++---------------- drivers/gpu/drm/radeon/radeon_cs.c | 2 +- 3 files changed, 23 insertions(+), 21 deletions(-)
diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h index 4c1b981..ca98772 100644 --- a/drivers/gpu/drm/radeon/radeon.h +++ b/drivers/gpu/drm/radeon/radeon.h @@ -1147,13 +1147,13 @@ struct radeon_asic { void (*emit_fence)(struct radeon_device *rdev, struct radeon_fence *fence); void (*emit_semaphore)(struct radeon_device *rdev, struct radeon_ring *cp, struct radeon_semaphore *semaphore, bool emit_wait);
int (*cs_parse)(struct radeon_cs_parser *p);
} ring[RADEON_NUM_RINGS];
int (*ring_test)(struct radeon_device *rdev, struct radeon_ring *cp); int (*irq_set)(struct radeon_device *rdev); int (*irq_process)(struct radeon_device *rdev); u32 (*get_vblank_counter)(struct radeon_device *rdev, int crtc);
- int (*cs_parse)(struct radeon_cs_parser *p); int (*copy_blit)(struct radeon_device *rdev, uint64_t src_offset, uint64_t dst_offset,
@@ -1650,7 +1650,7 @@ void radeon_ring_write(struct radeon_ring *ring, uint32_t v); #define radeon_fini(rdev) (rdev)->asic->fini((rdev)) #define radeon_resume(rdev) (rdev)->asic->resume((rdev)) #define radeon_suspend(rdev) (rdev)->asic->suspend((rdev)) -#define radeon_cs_parse(p) rdev->asic->cs_parse((p)) +#define radeon_cs_parse(rdev, r, p) (rdev)->asic->ring[(r)].cs_parse((p)) #define radeon_vga_set_state(rdev, state) (rdev)->asic->vga_set_state((rdev), (state)) #define radeon_gpu_is_lockup(rdev, cp) (rdev)->asic->gpu_is_lockup((rdev), (cp)) #define radeon_asic_reset(rdev) (rdev)->asic->asic_reset((rdev)) diff --git a/drivers/gpu/drm/radeon/radeon_asic.c b/drivers/gpu/drm/radeon/radeon_asic.c index 36a6192..636c68f 100644 --- a/drivers/gpu/drm/radeon/radeon_asic.c +++ b/drivers/gpu/drm/radeon/radeon_asic.c @@ -145,12 +145,12 @@ static struct radeon_asic r100_asic = { .ib_execute = &r100_ring_ib_execute, .emit_fence = &r100_fence_ring_emit, .emit_semaphore = &r100_semaphore_ring_emit,
} }, .irq_set = &r100_irq_set, .irq_process = &r100_irq_process, .get_vblank_counter = &r100_get_vblank_counter,.cs_parse = &r100_cs_parse,
- .cs_parse = &r100_cs_parse, .copy_blit = &r100_copy_blit, .copy_dma = NULL, .copy = &r100_copy_blit,
@@ -197,12 +197,12 @@ static struct radeon_asic r200_asic = { .ib_execute = &r100_ring_ib_execute, .emit_fence = &r100_fence_ring_emit, .emit_semaphore = &r100_semaphore_ring_emit,
} }, .irq_set = &r100_irq_set, .irq_process = &r100_irq_process, .get_vblank_counter = &r100_get_vblank_counter,.cs_parse = &r100_cs_parse,
- .cs_parse = &r100_cs_parse, .copy_blit = &r100_copy_blit, .copy_dma = &r200_copy_dma, .copy = &r100_copy_blit,
@@ -248,12 +248,12 @@ static struct radeon_asic r300_asic = { .ib_execute = &r100_ring_ib_execute, .emit_fence = &r300_fence_ring_emit, .emit_semaphore = &r100_semaphore_ring_emit,
} }, .irq_set = &r100_irq_set, .irq_process = &r100_irq_process, .get_vblank_counter = &r100_get_vblank_counter,.cs_parse = &r300_cs_parse,
- .cs_parse = &r300_cs_parse, .copy_blit = &r100_copy_blit, .copy_dma = &r200_copy_dma, .copy = &r100_copy_blit,
@@ -300,12 +300,12 @@ static struct radeon_asic r300_asic_pcie = { .ib_execute = &r100_ring_ib_execute, .emit_fence = &r300_fence_ring_emit, .emit_semaphore = &r100_semaphore_ring_emit,
} }, .irq_set = &r100_irq_set, .irq_process = &r100_irq_process, .get_vblank_counter = &r100_get_vblank_counter,.cs_parse = &r300_cs_parse,
- .cs_parse = &r300_cs_parse, .copy_blit = &r100_copy_blit, .copy_dma = &r200_copy_dma, .copy = &r100_copy_blit,
@@ -351,12 +351,12 @@ static struct radeon_asic r420_asic = { .ib_execute = &r100_ring_ib_execute, .emit_fence = &r300_fence_ring_emit, .emit_semaphore = &r100_semaphore_ring_emit,
} }, .irq_set = &r100_irq_set, .irq_process = &r100_irq_process, .get_vblank_counter = &r100_get_vblank_counter,.cs_parse = &r300_cs_parse,
- .cs_parse = &r300_cs_parse, .copy_blit = &r100_copy_blit, .copy_dma = &r200_copy_dma, .copy = &r100_copy_blit,
@@ -403,12 +403,12 @@ static struct radeon_asic rs400_asic = { .ib_execute = &r100_ring_ib_execute, .emit_fence = &r300_fence_ring_emit, .emit_semaphore = &r100_semaphore_ring_emit,
} }, .irq_set = &r100_irq_set, .irq_process = &r100_irq_process, .get_vblank_counter = &r100_get_vblank_counter,.cs_parse = &r300_cs_parse,
- .cs_parse = &r300_cs_parse, .copy_blit = &r100_copy_blit, .copy_dma = &r200_copy_dma, .copy = &r100_copy_blit,
@@ -455,12 +455,12 @@ static struct radeon_asic rs600_asic = { .ib_execute = &r100_ring_ib_execute, .emit_fence = &r300_fence_ring_emit, .emit_semaphore = &r100_semaphore_ring_emit,
} }, .irq_set = &rs600_irq_set, .irq_process = &rs600_irq_process, .get_vblank_counter = &rs600_get_vblank_counter,.cs_parse = &r300_cs_parse,
- .cs_parse = &r300_cs_parse, .copy_blit = &r100_copy_blit, .copy_dma = &r200_copy_dma, .copy = &r100_copy_blit,
@@ -507,12 +507,12 @@ static struct radeon_asic rs690_asic = { .ib_execute = &r100_ring_ib_execute, .emit_fence = &r300_fence_ring_emit, .emit_semaphore = &r100_semaphore_ring_emit,
} }, .irq_set = &rs600_irq_set, .irq_process = &rs600_irq_process, .get_vblank_counter = &rs600_get_vblank_counter,.cs_parse = &r300_cs_parse,
- .cs_parse = &r300_cs_parse, .copy_blit = &r100_copy_blit, .copy_dma = &r200_copy_dma, .copy = &r200_copy_dma,
@@ -559,12 +559,12 @@ static struct radeon_asic rv515_asic = { .ib_execute = &r100_ring_ib_execute, .emit_fence = &r300_fence_ring_emit, .emit_semaphore = &r100_semaphore_ring_emit,
} }, .irq_set = &rs600_irq_set, .irq_process = &rs600_irq_process, .get_vblank_counter = &rs600_get_vblank_counter,.cs_parse = &r300_cs_parse,
- .cs_parse = &r300_cs_parse, .copy_blit = &r100_copy_blit, .copy_dma = &r200_copy_dma, .copy = &r100_copy_blit,
@@ -611,12 +611,12 @@ static struct radeon_asic r520_asic = { .ib_execute = &r100_ring_ib_execute, .emit_fence = &r300_fence_ring_emit, .emit_semaphore = &r100_semaphore_ring_emit,
} }, .irq_set = &rs600_irq_set, .irq_process = &rs600_irq_process, .get_vblank_counter = &rs600_get_vblank_counter,.cs_parse = &r300_cs_parse,
- .cs_parse = &r300_cs_parse, .copy_blit = &r100_copy_blit, .copy_dma = &r200_copy_dma, .copy = &r100_copy_blit,
@@ -662,12 +662,12 @@ static struct radeon_asic r600_asic = { .ib_execute = &r600_ring_ib_execute, .emit_fence = &r600_fence_ring_emit, .emit_semaphore = &r600_semaphore_ring_emit,
} }, .irq_set = &r600_irq_set, .irq_process = &r600_irq_process, .get_vblank_counter = &rs600_get_vblank_counter,.cs_parse = &r600_cs_parse,
- .cs_parse = &r600_cs_parse, .copy_blit = &r600_copy_blit, .copy_dma = NULL, .copy = &r600_copy_blit,
@@ -713,12 +713,12 @@ static struct radeon_asic rs780_asic = { .ib_execute = &r600_ring_ib_execute, .emit_fence = &r600_fence_ring_emit, .emit_semaphore = &r600_semaphore_ring_emit,
} }, .irq_set = &r600_irq_set, .irq_process = &r600_irq_process, .get_vblank_counter = &rs600_get_vblank_counter,.cs_parse = &r600_cs_parse,
- .cs_parse = &r600_cs_parse, .copy_blit = &r600_copy_blit, .copy_dma = NULL, .copy = &r600_copy_blit,
@@ -764,12 +764,12 @@ static struct radeon_asic rv770_asic = { .ib_execute = &r600_ring_ib_execute, .emit_fence = &r600_fence_ring_emit, .emit_semaphore = &r600_semaphore_ring_emit,
} }, .irq_set = &r600_irq_set, .irq_process = &r600_irq_process, .get_vblank_counter = &rs600_get_vblank_counter,.cs_parse = &r600_cs_parse,
- .cs_parse = &r600_cs_parse, .copy_blit = &r600_copy_blit, .copy_dma = NULL, .copy = &r600_copy_blit,
@@ -815,12 +815,12 @@ static struct radeon_asic evergreen_asic = { .ib_execute = &evergreen_ring_ib_execute, .emit_fence = &r600_fence_ring_emit, .emit_semaphore = &r600_semaphore_ring_emit,
} }, .irq_set = &evergreen_irq_set, .irq_process = &evergreen_irq_process, .get_vblank_counter = &evergreen_get_vblank_counter,.cs_parse = &evergreen_cs_parse,
- .cs_parse = &evergreen_cs_parse, .copy_blit = &r600_copy_blit, .copy_dma = NULL, .copy = &r600_copy_blit,
@@ -866,12 +866,12 @@ static struct radeon_asic sumo_asic = { .ib_execute = &evergreen_ring_ib_execute, .emit_fence = &r600_fence_ring_emit, .emit_semaphore = &r600_semaphore_ring_emit,
}
.cs_parse = &evergreen_cs_parse,
}, .irq_set = &evergreen_irq_set, .irq_process = &evergreen_irq_process, .get_vblank_counter = &evergreen_get_vblank_counter,},
- .cs_parse = &evergreen_cs_parse, .copy_blit = &r600_copy_blit, .copy_dma = NULL, .copy = &r600_copy_blit,
@@ -917,12 +917,12 @@ static struct radeon_asic btc_asic = { .ib_execute = &evergreen_ring_ib_execute, .emit_fence = &r600_fence_ring_emit, .emit_semaphore = &r600_semaphore_ring_emit,
} }, .irq_set = &evergreen_irq_set, .irq_process = &evergreen_irq_process, .get_vblank_counter = &evergreen_get_vblank_counter,.cs_parse = &evergreen_cs_parse,
- .cs_parse = &evergreen_cs_parse, .copy_blit = &r600_copy_blit, .copy_dma = NULL, .copy = &r600_copy_blit,
@@ -979,24 +979,26 @@ static struct radeon_asic cayman_asic = { .ib_parse = &evergreen_ib_parse, .emit_fence = &cayman_fence_ring_emit, .emit_semaphore = &r600_semaphore_ring_emit,
}, [CAYMAN_RING_TYPE_CP1_INDEX] = { .ib_execute = &cayman_ring_ib_execute, .ib_parse = &evergreen_ib_parse, .emit_fence = &cayman_fence_ring_emit, .emit_semaphore = &r600_semaphore_ring_emit,.cs_parse = &evergreen_cs_parse,
}, [CAYMAN_RING_TYPE_CP2_INDEX] = { .ib_execute = &cayman_ring_ib_execute, .ib_parse = &evergreen_ib_parse, .emit_fence = &cayman_fence_ring_emit, .emit_semaphore = &r600_semaphore_ring_emit,.cs_parse = &evergreen_cs_parse,
} }, .irq_set = &evergreen_irq_set, .irq_process = &evergreen_irq_process, .get_vblank_counter = &evergreen_get_vblank_counter,.cs_parse = &evergreen_cs_parse,
- .cs_parse = &evergreen_cs_parse, .copy_blit = &r600_copy_blit, .copy_dma = NULL, .copy = &r600_copy_blit,
diff --git a/drivers/gpu/drm/radeon/radeon_cs.c b/drivers/gpu/drm/radeon/radeon_cs.c index 7fd0987..dc79d08 100644 --- a/drivers/gpu/drm/radeon/radeon_cs.c +++ b/drivers/gpu/drm/radeon/radeon_cs.c @@ -348,7 +348,7 @@ static int radeon_cs_ib_chunk(struct radeon_device *rdev, return r; } parser->ib->length_dw = ib_chunk->length_dw;
- r = radeon_cs_parse(parser);
- r = radeon_cs_parse(rdev, parser->ring, parser); if (r || parser->parser_error) { DRM_ERROR("Invalid command stream !\n"); return r;
Storing pointers to the IBs in a static var just leads to giving the same content back for all cards in the system.
Signed-off-by: Christian König deathsimple@vodafone.de --- drivers/gpu/drm/radeon/radeon_ring.c | 8 ++++++-- 1 files changed, 6 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/radeon/radeon_ring.c b/drivers/gpu/drm/radeon/radeon_ring.c index 30a4c50..32c83d8 100644 --- a/drivers/gpu/drm/radeon/radeon_ring.c +++ b/drivers/gpu/drm/radeon/radeon_ring.c @@ -478,7 +478,9 @@ static struct drm_info_list radeon_debugfs_ring_info_list[] = { static int radeon_debugfs_ib_info(struct seq_file *m, void *data) { struct drm_info_node *node = (struct drm_info_node *) m->private; - struct radeon_ib *ib = node->info_ent->data; + struct drm_device *dev = node->minor->dev; + struct radeon_device *rdev = dev->dev_private; + struct radeon_ib *ib = &rdev->ib_pool.ibs[*((unsigned*)node->info_ent->data)]; unsigned i;
if (ib == NULL) { @@ -495,6 +497,7 @@ static int radeon_debugfs_ib_info(struct seq_file *m, void *data)
static struct drm_info_list radeon_debugfs_ib_list[RADEON_IB_POOL_SIZE]; static char radeon_debugfs_ib_names[RADEON_IB_POOL_SIZE][32]; +static unsigned radeon_debugfs_ib_idx[RADEON_IB_POOL_SIZE]; #endif
int radeon_debugfs_ring_init(struct radeon_device *rdev) @@ -514,10 +517,11 @@ int radeon_debugfs_ib_init(struct radeon_device *rdev)
for (i = 0; i < RADEON_IB_POOL_SIZE; i++) { sprintf(radeon_debugfs_ib_names[i], "radeon_ib_%04u", i); + radeon_debugfs_ib_idx[i] = i; radeon_debugfs_ib_list[i].name = radeon_debugfs_ib_names[i]; radeon_debugfs_ib_list[i].show = &radeon_debugfs_ib_info; radeon_debugfs_ib_list[i].driver_features = 0; - radeon_debugfs_ib_list[i].data = &rdev->ib_pool.ibs[i]; + radeon_debugfs_ib_list[i].data = &radeon_debugfs_ib_idx[i]; } return radeon_debugfs_add_files(rdev, radeon_debugfs_ib_list, RADEON_IB_POOL_SIZE);
2012/2/23 Christian König deathsimple@vodafone.de:
The function radeon_bo_list_validate can cause a bo to move, resulting in a different sync_obj and a dependency to wait for this move to finish.
Signed-off-by: Christian König deathsimple@vodafone.de Reviewed-by: Alex Deucher alexander.deucher@amd.com
For the series:
Reviewed-by: Alex Deucher alexander.deucher@amd.com
drivers/gpu/drm/radeon/radeon.h | 1 - drivers/gpu/drm/radeon/radeon_cs.c | 21 ++++++++++++++------- 2 files changed, 14 insertions(+), 8 deletions(-)
diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h index 884e0d4..4c1b981 100644 --- a/drivers/gpu/drm/radeon/radeon.h +++ b/drivers/gpu/drm/radeon/radeon.h @@ -836,7 +836,6 @@ struct radeon_cs_parser { struct radeon_cs_reloc *relocs; struct radeon_cs_reloc **relocs_ptr; struct list_head validated;
- bool sync_to_ring[RADEON_NUM_RINGS];
/* indices of various chunks */ int chunk_ib_idx; int chunk_relocs_idx; diff --git a/drivers/gpu/drm/radeon/radeon_cs.c b/drivers/gpu/drm/radeon/radeon_cs.c index 435a3d9..7fd0987 100644 --- a/drivers/gpu/drm/radeon/radeon_cs.c +++ b/drivers/gpu/drm/radeon/radeon_cs.c @@ -85,12 +85,6 @@ int radeon_cs_parser_relocs(struct radeon_cs_parser *p) radeon_bo_list_add_object(&p->relocs[i].lobj, &p->validated);
- if (p->relocs[i].robj->tbo.sync_obj && !(r->flags & RADEON_RELOC_DONT_SYNC)) {
- struct radeon_fence *fence = p->relocs[i].robj->tbo.sync_obj;
- if (!radeon_fence_signaled(fence)) {
- p->sync_to_ring[fence->ring] = true;
- }
- }
} else p->relocs[i].handle = 0; } @@ -118,11 +112,24 @@ static int radeon_cs_get_ring(struct radeon_cs_parser *p, u32 ring, s32 priority
static int radeon_cs_sync_rings(struct radeon_cs_parser *p) {
- bool sync_to_ring[RADEON_NUM_RINGS] = { };
int i, r;
- for (i = 0; i < p->nrelocs; i++) {
- if (!p->relocs[i].robj || !p->relocs[i].robj->tbo.sync_obj)
- continue;
- if (!(p->relocs[i].flags & RADEON_RELOC_DONT_SYNC)) {
- struct radeon_fence *fence = p->relocs[i].robj->tbo.sync_obj;
- if (!radeon_fence_signaled(fence)) {
- sync_to_ring[fence->ring] = true;
- }
- }
- }
for (i = 0; i < RADEON_NUM_RINGS; ++i) { /* no need to sync to our own or unused rings */
- if (i == p->ring || !p->sync_to_ring[i] || !p->rdev->ring[i].ready)
- if (i == p->ring || !sync_to_ring[i] || !p->rdev->ring[i].ready)
continue;
if (!p->ib->fence->semaphore) {
1.7.5.4
dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
On Thu, 2012-02-23 at 10:17 -0500, Alex Deucher wrote:
2012/2/23 Christian König deathsimple@vodafone.de:
The function radeon_bo_list_validate can cause a bo to move, resulting in a different sync_obj and a dependency to wait for this move to finish.
Signed-off-by: Christian König deathsimple@vodafone.de Reviewed-by: Alex Deucher alexander.deucher@amd.com
For the series:
Reviewed-by: Alex Deucher alexander.deucher@amd.com
Same Reviewed-by: Jerome Glisse jglisse@redhat.com
drivers/gpu/drm/radeon/radeon.h | 1 - drivers/gpu/drm/radeon/radeon_cs.c | 21 ++++++++++++++------- 2 files changed, 14 insertions(+), 8 deletions(-)
diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h index 884e0d4..4c1b981 100644 --- a/drivers/gpu/drm/radeon/radeon.h +++ b/drivers/gpu/drm/radeon/radeon.h @@ -836,7 +836,6 @@ struct radeon_cs_parser { struct radeon_cs_reloc *relocs; struct radeon_cs_reloc **relocs_ptr; struct list_head validated;
bool sync_to_ring[RADEON_NUM_RINGS]; /* indices of various chunks */ int chunk_ib_idx; int chunk_relocs_idx;
diff --git a/drivers/gpu/drm/radeon/radeon_cs.c b/drivers/gpu/drm/radeon/radeon_cs.c index 435a3d9..7fd0987 100644 --- a/drivers/gpu/drm/radeon/radeon_cs.c +++ b/drivers/gpu/drm/radeon/radeon_cs.c @@ -85,12 +85,6 @@ int radeon_cs_parser_relocs(struct radeon_cs_parser *p) radeon_bo_list_add_object(&p->relocs[i].lobj, &p->validated);
if (p->relocs[i].robj->tbo.sync_obj && !(r->flags & RADEON_RELOC_DONT_SYNC)) {
struct radeon_fence *fence = p->relocs[i].robj->tbo.sync_obj;
if (!radeon_fence_signaled(fence)) {
p->sync_to_ring[fence->ring] = true;
}
} } else p->relocs[i].handle = 0; }
@@ -118,11 +112,24 @@ static int radeon_cs_get_ring(struct radeon_cs_parser *p, u32 ring, s32 priority
static int radeon_cs_sync_rings(struct radeon_cs_parser *p) {
bool sync_to_ring[RADEON_NUM_RINGS] = { }; int i, r;
for (i = 0; i < p->nrelocs; i++) {
if (!p->relocs[i].robj || !p->relocs[i].robj->tbo.sync_obj)
continue;
if (!(p->relocs[i].flags & RADEON_RELOC_DONT_SYNC)) {
struct radeon_fence *fence = p->relocs[i].robj->tbo.sync_obj;
if (!radeon_fence_signaled(fence)) {
sync_to_ring[fence->ring] = true;
}
}
}
for (i = 0; i < RADEON_NUM_RINGS; ++i) { /* no need to sync to our own or unused rings */
if (i == p->ring || !p->sync_to_ring[i] || !p->rdev->ring[i].ready)
if (i == p->ring || !sync_to_ring[i] || !p->rdev->ring[i].ready) continue; if (!p->ib->fence->semaphore) {
-- 1.7.5.4
dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
Christian,
On Thursday, February 23, 2012 15:18:42 Christian König wrote:
The function radeon_bo_list_validate can cause a bo to move, resulting in a different sync_obj and a dependency to wait for this move to finish.
Signed-off-by: Christian König deathsimple@vodafone.de Reviewed-by: Alex Deucher alexander.deucher@amd.com
I am not sure, but to me this looks like this could fix these kind of gpu lockups that I experience since some time every now and then. The usual symptom is that I get the
radeon 0000:01:00.0: GPU lockup CP stall for more than 10000msec GPU lockup (waiting for 0x00682AC3 last fence id 0x00682AC2) [...]
kernel message. Each time with the fence being off by one like in the example above.
If this change has the potential to fix this issue I think this particular patch should be considered for the current upstream kernel release.
Mathias
2012/2/23 Mathias Fröhlich Mathias.Froehlich@gmx.net
Christian,
On Thursday, February 23, 2012 15:18:42 Christian König wrote:
The function radeon_bo_list_validate can cause a bo to move, resulting in a different sync_obj and a dependency to wait for this move to finish.
Signed-off-by: Christian König deathsimple@vodafone.de Reviewed-by: Alex Deucher alexander.deucher@amd.com
I am not sure, but to me this looks like this could fix these kind of gpu lockups that I experience since some time every now and then. The usual symptom is that I get the
radeon 0000:01:00.0: GPU lockup CP stall for more than 10000msec GPU lockup (waiting for 0x00682AC3 last fence id 0x00682AC2) [...]
kernel message. Each time with the fence being off by one like in the example above.
If this change has the potential to fix this issue I think this particular patch should be considered for the current upstream kernel release.
Mathias
No this patch doesn't. This patch is all about getting proper sync btw different rings.
Sorry for the deception.
Cheers, Jerome
On 23.02.2012 18:32, Jerome Glisse wrote:
2012/2/23 Mathias Fröhlich <Mathias.Froehlich@gmx.net mailto:Mathias.Froehlich@gmx.net>
Christian, On Thursday, February 23, 2012 15:18:42 Christian König wrote: > The function radeon_bo_list_validate can cause a > bo to move, resulting in a different sync_obj > and a dependency to wait for this move to finish. > > Signed-off-by: Christian König <deathsimple@vodafone.de <mailto:deathsimple@vodafone.de>> > Reviewed-by: Alex Deucher <alexander.deucher@amd.com <mailto:alexander.deucher@amd.com>> I am not sure, but to me this looks like this could fix these kind of gpu lockups that I experience since some time every now and then. The usual symptom is that I get the radeon 0000:01:00.0: GPU lockup CP stall for more than 10000msec GPU lockup (waiting for 0x00682AC3 last fence id 0x00682AC2) [...] kernel message. Each time with the fence being off by one like in the example above. If this change has the potential to fix this issue I think this particular patch should be considered for the current upstream kernel release. Mathias
No this patch doesn't. This patch is all about getting proper sync btw different rings.
That's unfortunately true, since we wasn't able to release any code that makes direct use of the different rings (yet) it shouldn't really matter in practice. I Just wanted to have that fix upstream since it is an obvious bug.
Christian.
Sorry for the deception.
Cheers, Jerome
dri-devel@lists.freedesktop.org