Am 04.06.22 um 00:44 schrieb Felix Kuehling:
[+amd-gfx]
On 2022-06-03 15:37, Felix Kuehling wrote:
On 2022-06-03 06:46, Christian König wrote:
Resources about to be destructed are not tied to BOs any more.
I've been seeing a backtrace in that area with a patch series I'm working on, but didn't have enough time to track it down yet. I'll try if this patch fixes it.
The patch doesn't apply on amd-staging-drm-next. I made the following change instead, which fixes my problem (and I do see the pr_err being triggered):
--- a/drivers/gpu/drm/ttm/ttm_device.c +++ b/drivers/gpu/drm/ttm/ttm_device.c @@ -157,6 +157,10 @@ int ttm_device_swapout(struct ttm_device *bdev, struct ttm_operation_ctx *ctx, list_for_each_entry(bo, &man->lru[j], lru) { uint32_t num_pages = PFN_UP(bo->base.size);
+ if (!bo->resource) { + pr_err("### bo->resource is NULL\n"); + continue; + }
Yeah, that should be functional identical.
Can I get an rb for that? Going to provide backports to older kernels as well then.
Regards, Christian.
ret = ttm_bo_swapout(bo, ctx, gfp_flags); /* ttm_bo_swapout has dropped the lru_lock */ if (!ret)
Regards, Felix
Signed-off-by: Christian König christian.koenig@amd.com
drivers/gpu/drm/ttm/ttm_device.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/ttm/ttm_device.c b/drivers/gpu/drm/ttm/ttm_device.c index a0562ab386f5..e7147e304637 100644 --- a/drivers/gpu/drm/ttm/ttm_device.c +++ b/drivers/gpu/drm/ttm/ttm_device.c @@ -156,8 +156,12 @@ int ttm_device_swapout(struct ttm_device *bdev, struct ttm_operation_ctx *ctx, ttm_resource_manager_for_each_res(man, &cursor, res) { struct ttm_buffer_object *bo = res->bo; - uint32_t num_pages = PFN_UP(bo->base.size); + uint32_t num_pages; + if (!bo) + continue;
+ num_pages = PFN_UP(bo->base.size); ret = ttm_bo_swapout(bo, ctx, gfp_flags); /* ttm_bo_swapout has dropped the lru_lock */ if (!ret)