On 27.05.21 at 17:01, Thomas Hellström wrote:
On Thu, 2021-05-27 at 16:54 +0200, Christian König wrote:
On 27.05.21 at 16:19, Thomas Hellström wrote:
The swapping code was dereferencing bo->ttm pointers without having the dma-resv lock held. Also it might try to swap out unpopulated bos.
Fix this by deferring the bo->ttm dereference until we hold the reservation lock. Check that the ttm_tt is populated after the swap_notify callback.
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
 drivers/gpu/drm/ttm/ttm_bo.c     | 16 +++++++++++++++-
 drivers/gpu/drm/ttm/ttm_device.c |  8 +++-----
 2 files changed, 18 insertions(+), 6 deletions(-)
diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index 9f53506a82fc..86213d37657b 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -1163,6 +1163,16 @@ int ttm_bo_swapout(struct ttm_buffer_object *bo, struct ttm_operation_ctx *ctx,
 	if (!ttm_bo_evict_swapout_allowable(bo, ctx, &place, &locked, NULL))
 		return -EBUSY;
+	dma_resv_assert_held(bo->base.resv);
+	if (!bo->ttm ||
+	    bo->ttm->page_flags & TTM_PAGE_FLAG_SG ||
+	    bo->ttm->page_flags & TTM_PAGE_FLAG_SWAPPED) {
+		if (locked)
+			dma_resv_unlock(bo->base.resv);
+		return -EBUSY;
+	}
 	if (!ttm_bo_get_unless_zero(bo)) {
 		if (locked)
 			dma_resv_unlock(bo->base.resv);
@@ -1215,7 +1225,8 @@ int ttm_bo_swapout(struct ttm_buffer_object *bo, struct ttm_operation_ctx *ctx,
 	if (bo->bdev->funcs->swap_notify)
 		bo->bdev->funcs->swap_notify(bo);
-	ret = ttm_tt_swapout(bo->bdev, bo->ttm, gfp_flags);
+	if (ttm_tt_is_populated(bo->ttm))
+		ret = ttm_tt_swapout(bo->bdev, bo->ttm, gfp_flags);
That is exactly what I wouldn't recommend. We would try to swap out the same BO over and over again with that.
But we wouldn't since the BO is taken off the LRU and never re-added,
Well then that would be a bug in itself.
Why not move that to the check above as well?
Because the BO may become unpopulated in swap_notify(). i915, like vmwgfx, sometimes sets up GPU bindings from system memory, and when we get a notification from user-space that those are purgeable, we don't want to purge immediately but wait for a potential swapout.
Uff, good point. But then we need to check that at both locations I think.
Because populating the TT object currently doesn't put the BO back on the LRU eventually.
Christian.
/Thomas
Christian.
 out:
 	/*
@@ -1225,6 +1236,9 @@ int ttm_bo_swapout(struct ttm_buffer_object *bo, struct ttm_operation_ctx *ctx,
 	if (locked)
 		dma_resv_unlock(bo->base.resv);
 	ttm_bo_put(bo);
+	/* Don't break locking rules. */
+	WARN_ON(ret == -EBUSY);
 	return ret;
 }
diff --git a/drivers/gpu/drm/ttm/ttm_device.c b/drivers/gpu/drm/ttm/ttm_device.c
index 460953dcad11..eaa7487ae404 100644
--- a/drivers/gpu/drm/ttm/ttm_device.c
+++ b/drivers/gpu/drm/ttm/ttm_device.c
@@ -143,14 +143,12 @@ int ttm_device_swapout(struct ttm_device *bdev, struct ttm_operation_ctx *ctx,
 		for (j = 0; j < TTM_MAX_BO_PRIORITY; ++j) {
 			list_for_each_entry(bo, &man->lru[j], lru) {
-				uint32_t num_pages;
+				pgoff_t num_pages;
-				if (!bo->ttm ||
-				    bo->ttm->page_flags & TTM_PAGE_FLAG_SG ||
-				    bo->ttm->page_flags & TTM_PAGE_FLAG_SWAPPED)
+				if (!READ_ONCE(bo->ttm))
 					continue;
-				num_pages = bo->ttm->num_pages;
+				num_pages = bo->base.size >> PAGE_SHIFT;
 				ret = ttm_bo_swapout(bo, ctx, gfp_flags);
 				/* ttm_bo_swapout has dropped the lru_lock */
 				if (!ret)