Hi Thomas,
Am 23.08.21 um 09:56 schrieb Thomas Hellström:
Hi, Christian,
Still working through some backlog and this series appears to have slipped under the radar.
Still I think a cover letter would benefit reviewers...
On Mon, 2021-07-19 at 13:51 +0200, Christian König wrote:
Keep track for which BO a resource was allocated. This is necessary to move the LRU handling into the resources.
So is this needed, when looking up a resource from the LRU, to find the bo the resource is currently attached to?
yes, correct. The main motivation is to fix resource handling during allocations and moves.
Essentially we currently have the following steps during resource allocation: 1. Locking the BO. 2. Figuring out we need resources. 2. Allocating the resource. 3. Moving the BO to the correct LRU. 4. Unlocking the BO.
The problem is now that we have a short time where we allocated the resource, but can't evict it again. In other words we don't know to which object a resource belongs.
Buffer moves are even worse since we have an old resource as well as a new one at the same time and so potentially would need the BO on two LRUs at the same time.
This has caused a whole bunch of problems in the past because we ran into out of resource situations and implemented quite a number of workarounds for this.
A bit problematic is i915 since it tries to use the resource interface without a BO which is illegal from the conceptional
s/conceptional/conceptual/ ?
point of view.
v2: Document that this is a weak reference and add a workaround for i915
Hmm. As pointed out in my previous review the weak pointer should really be cleared under a lookup lock to avoid use-after-free bugs. I don't see that happening here. Doesn't this series aim for a future direction of early destruction of bos and ditching the ghost objects? In that case IMHO you need to allow for a NULL bo pointer and any bo information needed at resource destruction needs to be copied on the resource... (That would also ofc help with the i915 problem).
Yes, I'm going back and force how to do this cleanly as well.
Ok going to give the NULL pointer a try.
Signed-off-by: Christian König christian.koenig@amd.com
drivers/gpu/drm/i915/intel_region_ttm.c | 5 +++++ drivers/gpu/drm/ttm/ttm_bo_util.c | 7 +++++-- drivers/gpu/drm/ttm/ttm_resource.c | 1 + include/drm/ttm/ttm_resource.h | 2 ++ 4 files changed, 13 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/i915/intel_region_ttm.c b/drivers/gpu/drm/i915/intel_region_ttm.c index 27fe0668d094..980b10a7debf 100644 --- a/drivers/gpu/drm/i915/intel_region_ttm.c +++ b/drivers/gpu/drm/i915/intel_region_ttm.c @@ -88,6 +88,7 @@ intel_region_ttm_node_reserve(struct intel_memory_region *mem, place.fpfn = offset >> PAGE_SHIFT; place.lpfn = place.fpfn + (size >> PAGE_SHIFT); mock_bo.base.size = size; + mock_bo.bdev = &mem->i915->bdev; ret = man->func->alloc(man, &mock_bo, &place, &res); if (ret == -ENOSPC) ret = -ENXIO; @@ -104,7 +105,11 @@ void intel_region_ttm_node_free(struct intel_memory_region *mem, struct ttm_resource *res) { struct ttm_resource_manager *man = mem->region_private; + struct ttm_buffer_object mock_bo = {};
+ mock_bo.base.size = res->num_pages * PAGE_SIZE; + mock_bo.bdev = &mem->i915->bdev; + res->bo = &mock_bo; man->func->free(man, res); }
diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c b/drivers/gpu/drm/ttm/ttm_bo_util.c index 2f57f824e6db..a1570aa8ff56 100644 --- a/drivers/gpu/drm/ttm/ttm_bo_util.c +++ b/drivers/gpu/drm/ttm/ttm_bo_util.c @@ -239,6 +239,11 @@ static int ttm_buffer_object_transfer(struct ttm_buffer_object *bo, if (bo->type != ttm_bo_type_sg) fbo->base.base.resv = &fbo->base.base._resv;
+ if (fbo->base.resource) { + fbo->base.resource->bo = &fbo->base;
This is scary: What if someone else (LRU lookup) just dereferenced the previous resource->bo pointer? I think this needs proper weak reference locking and lifetime handling, as mentioned above.
Ah, yes good point. Missed this occasion.
Thanks, Christian.
+ bo->resource = NULL; + }
/Thomas