[PATCH v5 01/13] drm/ttm: stop calling tt_swapin in vm_access


Matthew Auld

27 Sep 2021

11:41 a.m.

In commit:

commit 09ac4fcb3f255e9225967c75f5893325c116cdbe
Author: Felix Kuehling <Felix.Kuehling@amd.com>
Date:   Thu Jul 13 17:01:16 2017 -0400

drm/ttm: Implement vm_operations_struct.access v2

we added the vm_access hook, where we also directly call tt_swapin for some reason. If something is swapped-out then the ttm_tt must also be unpopulated, and since access_kmap should also call tt_populate, if needed, then swapping-in will already be handled there.

If anything, calling tt_swapin directly here would likely always fail, since the tt->pages won't yet be populated. Worse, since the tt->pages array is never actually cleared in unpopulate, this might lead to a nasty use-after-free.

Fixes: 09ac4fcb3f25 ("drm/ttm: Implement vm_operations_struct.access v2")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Christian König <christian.koenig@amd.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/ttm/ttm_bo_vm.c | 5 -----
 1 file changed, 5 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c b/drivers/gpu/drm/ttm/ttm_bo_vm.c
index f56be5bc0861..5b9b7fd01a69 100644
--- a/drivers/gpu/drm/ttm/ttm_bo_vm.c
+++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c
@@ -519,11 +519,6 @@ int ttm_bo_vm_access(struct vm_area_struct *vma, unsigned long addr,

 	switch (bo->resource->mem_type) {
 	case TTM_PL_SYSTEM:
-		if (unlikely(bo->ttm->page_flags & TTM_PAGE_FLAG_SWAPPED)) {
-			ret = ttm_tt_swapin(bo->ttm);
-			if (unlikely(ret != 0))
-				return ret;
-		}
 		fallthrough;
 	case TTM_PL_TT:
 		ret = ttm_bo_vm_access_kmap(bo, offset, buf, len, write);

-- 2.26.3
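
The reasoning in patch 01 can be illustrated with a small userspace model. This is only a sketch under stated assumptions: the toy_* names, the flag values and the struct are hypothetical and are not TTM code. It models the two properties the commit message relies on, namely that swap-out always unpopulates, and that populate performs the swap-in itself when needed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag values for this model only; not the kernel's. */
#define TOY_FLAG_SWAPPED	(1u << 0)
#define TOY_FLAG_POPULATED	(1u << 1)

struct toy_tt {
	uint32_t flags;
	int swapin_calls;	/* number of successful swap-ins */
};

/* Swap-in needs pages to copy into, so it fails on an unpopulated tt. */
static int toy_swapin(struct toy_tt *tt)
{
	if (!(tt->flags & TOY_FLAG_POPULATED))
		return -1;
	tt->flags &= ~TOY_FLAG_SWAPPED;
	tt->swapin_calls++;
	return 0;
}

/* Populate allocates pages, then swaps the contents back in if needed. */
static int toy_populate(struct toy_tt *tt)
{
	if (tt->flags & TOY_FLAG_POPULATED)
		return 0;
	tt->flags |= TOY_FLAG_POPULATED;
	if (tt->flags & TOY_FLAG_SWAPPED)
		return toy_swapin(tt);
	return 0;
}

/* Swap-out unpopulates, so SWAPPED implies !POPULATED afterwards. */
static void toy_swapout(struct toy_tt *tt)
{
	assert(tt->flags & TOY_FLAG_POPULATED);
	tt->flags &= ~TOY_FLAG_POPULATED;
	tt->flags |= TOY_FLAG_SWAPPED;
}
```

In this model a direct toy_swapin() on a swapped-out object fails because nothing is populated yet, while toy_populate() both repopulates and swaps in. That mirrors why the extra call in ttm_bo_vm_access() is described as redundant at best.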


Matthew Auld

27 Sep

11:41 a.m.

New subject: [PATCH v5 02/13] drm/ttm: stop setting page->index for the ttm_tt

In commit:

commit 58aa6622d32af7d2c08d45085f44c54554a16ed7
Author: Thomas Hellstrom <thellstrom@vmware.com>
Date:   Fri Jan 3 11:47:23 2014 +0100

drm/ttm: Correctly set page mapping and -index members

we started setting the page->mapping and page->index to point to the virtual address space, if the pages were faulted with TTM. Apparently this was needed for core-mm to be able to reverse lookup the virtual address given the struct page, and potentially unmap it from the page tables. However, as pointed out by Thomas, since we are now using PFN_MAP, instead of say PFN_MIXED, this should no longer be the case.

There was also apparently some use case in vmwgfx which needed this for dirty tracking, but that also doesn't appear to be the case anymore, as pointed out by Thomas.

We still need to keep the page->mapping for now, since that is still needed for different reasons, but we try to address that in the next patch.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Christian König <christian.koenig@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/ttm/ttm_bo_vm.c | 2 --
 drivers/gpu/drm/ttm/ttm_tt.c    | 4 +---
 2 files changed, 1 insertion(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c b/drivers/gpu/drm/ttm/ttm_bo_vm.c
index 5b9b7fd01a69..9a2119fe4bdd 100644
--- a/drivers/gpu/drm/ttm/ttm_bo_vm.c
+++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c
@@ -346,8 +346,6 @@ vm_fault_t ttm_bo_vm_fault_reserved(struct vm_fault *vmf,
 			} else if (unlikely(!page)) {
 				break;
 			}
-			page->index = drm_vma_node_start(&bo->base.vma_node) +
-				page_offset;
 			pfn = page_to_pfn(page);
 		}

diff --git a/drivers/gpu/drm/ttm/ttm_tt.c b/drivers/gpu/drm/ttm/ttm_tt.c
index dae52433beeb..1cc04c224988 100644
--- a/drivers/gpu/drm/ttm/ttm_tt.c
+++ b/drivers/gpu/drm/ttm/ttm_tt.c
@@ -367,10 +367,8 @@ static void ttm_tt_clear_mapping(struct ttm_tt *ttm)
 	if (ttm->page_flags & TTM_PAGE_FLAG_SG)
 		return;

-	for (i = 0; i < ttm->num_pages; ++i) {
+	for (i = 0; i < ttm->num_pages; ++i)
 		(*page)->mapping = NULL;
-		(*page++)->index = 0;
-	}
 }

 void ttm_tt_unpopulate(struct ttm_device *bdev, struct ttm_tt *ttm)

-- 2.26.3

Matthew Auld

11:41 a.m.

New subject: [PATCH v5 03/13] drm/ttm: move ttm_tt_{add, clear}_mapping into amdgpu

Now that setting page->index shouldn't be needed anymore, we are just left with setting page->mapping, and here it looks like amdgpu is the only user, where pointing the page->mapping at the dev_mapping is used to verify that the pages do indeed belong to the device, if userspace later tries to touch them.

v2(Christian):
  - Drop the functions altogether and just inline modifying the
    page->mapping

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Christian König <christian.koenig@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 15 ++++++++++++++-
 drivers/gpu/drm/ttm/ttm_tt.c            | 25 -------------------------
 2 files changed, 14 insertions(+), 26 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index 820fcb24231f..438377a89aa3 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -1119,6 +1119,8 @@ static int amdgpu_ttm_tt_populate(struct ttm_device *bdev,
 {
 	struct amdgpu_device *adev = amdgpu_ttm_adev(bdev);
 	struct amdgpu_ttm_tt *gtt = (void *)ttm;
+	pgoff_t i;
+	int ret;

 	/* user pages are bound by amdgpu_ttm_tt_pin_userptr() */
 	if (gtt->userptr) {
@@ -1131,7 +1133,14 @@ static int amdgpu_ttm_tt_populate(struct ttm_device *bdev,
 	if (ttm->page_flags & TTM_PAGE_FLAG_SG)
 		return 0;

-	return ttm_pool_alloc(&adev->mman.bdev.pool, ttm, ctx);
+	ret = ttm_pool_alloc(&adev->mman.bdev.pool, ttm, ctx);
+	if (ret)
+		return ret;
+
+	for (i = 0; i < ttm->num_pages; ++i)
+		ttm->pages[i]->mapping = bdev->dev_mapping;
+
+	return 0;
 }

 /*
@@ -1145,6 +1154,7 @@ static void amdgpu_ttm_tt_unpopulate(struct ttm_device *bdev,
 {
 	struct amdgpu_ttm_tt *gtt = (void *)ttm;
 	struct amdgpu_device *adev;
+	pgoff_t i;

 	amdgpu_ttm_backend_unbind(bdev, ttm);

@@ -1158,6 +1168,9 @@ static void amdgpu_ttm_tt_unpopulate(struct ttm_device *bdev,
 	if (ttm->page_flags & TTM_PAGE_FLAG_SG)
 		return;

+	for (i = 0; i < ttm->num_pages; ++i)
+		ttm->pages[i]->mapping = NULL;
+
 	adev = amdgpu_ttm_adev(bdev);
 	return ttm_pool_free(&adev->mman.bdev.pool, ttm);
 }
diff --git a/drivers/gpu/drm/ttm/ttm_tt.c b/drivers/gpu/drm/ttm/ttm_tt.c
index 1cc04c224988..980ecb079b2c 100644
--- a/drivers/gpu/drm/ttm/ttm_tt.c
+++ b/drivers/gpu/drm/ttm/ttm_tt.c
@@ -289,17 +289,6 @@ int ttm_tt_swapout(struct ttm_device *bdev, struct ttm_tt *ttm,
 	return ret;
 }

-static void ttm_tt_add_mapping(struct ttm_device *bdev, struct ttm_tt *ttm)
-{
-	pgoff_t i;
-
-	if (ttm->page_flags & TTM_PAGE_FLAG_SG)
-		return;
-
-	for (i = 0; i < ttm->num_pages; ++i)
-		ttm->pages[i]->mapping = bdev->dev_mapping;
-}
-
 int ttm_tt_populate(struct ttm_device *bdev,
 		    struct ttm_tt *ttm, struct ttm_operation_ctx *ctx)
 {
@@ -336,7 +325,6 @@ int ttm_tt_populate(struct ttm_device *bdev,
 	if (ret)
 		goto error;

-	ttm_tt_add_mapping(bdev, ttm);
 	ttm->page_flags |= TTM_PAGE_FLAG_PRIV_POPULATED;
 	if (unlikely(ttm->page_flags & TTM_PAGE_FLAG_SWAPPED)) {
 		ret = ttm_tt_swapin(ttm);
@@ -359,24 +347,11 @@ int ttm_tt_populate(struct ttm_device *bdev,
 }
 EXPORT_SYMBOL(ttm_tt_populate);

-static void ttm_tt_clear_mapping(struct ttm_tt *ttm)
-{
-	pgoff_t i;
-	struct page **page = ttm->pages;
-
-	if (ttm->page_flags & TTM_PAGE_FLAG_SG)
-		return;
-
-	for (i = 0; i < ttm->num_pages; ++i)
-		(*page)->mapping = NULL;
-}
-
 void ttm_tt_unpopulate(struct ttm_device *bdev, struct ttm_tt *ttm)
 {
 	if (!ttm_tt_is_populated(ttm))
 		return;

-	ttm_tt_clear_mapping(ttm);
 	if (bdev->funcs->ttm_tt_unpopulate)
 		bdev->funcs->ttm_tt_unpopulate(bdev, ttm);
 	else

-- 2.26.3
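
The idea behind patch 03, tagging each page with the device's mapping so that ownership can be verified later, can be sketched in plain C. Everything here is a hypothetical stand-in: toy_mapping and toy_page replace struct address_space and struct page, and toy_page_belongs_to() is an assumed shape for the check, since the thread does not show how amdgpu performs it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct address_space and struct page. */
struct toy_mapping { int id; };
struct toy_page { struct toy_mapping *mapping; };

/* After allocation: tag every page with the device's mapping. */
static void toy_tag_pages(struct toy_page **pages, size_t n,
			  struct toy_mapping *dev_mapping)
{
	assert(dev_mapping != NULL);
	for (size_t i = 0; i < n; ++i)
		pages[i]->mapping = dev_mapping;
}

/* Before freeing: clear the tag so stale pages no longer match. */
static void toy_untag_pages(struct toy_page **pages, size_t n)
{
	for (size_t i = 0; i < n; ++i)
		pages[i]->mapping = NULL;
}

/* Ownership test: does this page currently belong to the device? */
static bool toy_page_belongs_to(const struct toy_page *page,
				const struct toy_mapping *dev_mapping)
{
	return page->mapping == dev_mapping;
}
```

The tag and untag loops correspond to the two loops the patch adds to amdgpu_ttm_tt_populate() and amdgpu_ttm_tt_unpopulate(). Note that both index every page in the array.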

Matthew Auld

11:41 a.m.

New subject: [PATCH v5 04/13] drm/ttm: remove TTM_PAGE_FLAG_NO_RETRY

No longer used it seems.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Christian König <christian.koenig@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
---
 include/drm/ttm/ttm_tt.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/include/drm/ttm/ttm_tt.h b/include/drm/ttm/ttm_tt.h
index 89b15d673b22..842ce756213c 100644
--- a/include/drm/ttm/ttm_tt.h
+++ b/include/drm/ttm/ttm_tt.h
@@ -41,7 +41,6 @@ struct ttm_operation_ctx;
 #define TTM_PAGE_FLAG_SWAPPED		(1 << 4)
 #define TTM_PAGE_FLAG_ZERO_ALLOC	(1 << 6)
 #define TTM_PAGE_FLAG_SG		(1 << 8)
-#define TTM_PAGE_FLAG_NO_RETRY		(1 << 9)

 #define TTM_PAGE_FLAG_PRIV_POPULATED	(1 << 31)

-- 2.26.3

Matthew Auld

11:41 a.m.

New subject: [PATCH v5 05/13] drm/ttm: s/FLAG_SG/FLAG_EXTERNAL/

The flag covers more than just ttm_bo_type_sg usage (say, with dma-buf): userptr in amdgpu is another user, and we might have more in the future. Hence EXTERNAL is likely a more suitable name.

v2(Christian):
  - Rename these to TTM_TT_FLAG_*
  - Fix up all the holes in the flag values

Suggested-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Christian König <christian.koenig@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 10 +++++-----
 drivers/gpu/drm/i915/gem/i915_gem_ttm.c |  6 +++---
 drivers/gpu/drm/nouveau/nouveau_bo.c    |  4 ++--
 drivers/gpu/drm/radeon/radeon_ttm.c     |  8 ++++----
 drivers/gpu/drm/ttm/ttm_bo.c            |  4 ++--
 drivers/gpu/drm/ttm/ttm_bo_util.c       |  4 ++--
 drivers/gpu/drm/ttm/ttm_bo_vm.c         |  2 +-
 drivers/gpu/drm/ttm/ttm_pool.c          |  2 +-
 drivers/gpu/drm/ttm/ttm_tt.c            | 24 ++++++++++++------------
 include/drm/ttm/ttm_device.h            |  2 +-
 include/drm/ttm/ttm_tt.h                | 18 +++++++++---------
 11 files changed, 42 insertions(+), 42 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c index 438377a89aa3..0cf94421665f 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c @@ -894,7 +894,7 @@ static int amdgpu_ttm_backend_bind(struct ttm_device *bdev, DRM_ERROR("failed to pin userptr\n"); return r; } - } else if (ttm->page_flags & TTM_PAGE_FLAG_SG) { + } else if (ttm->page_flags & TTM_TT_FLAG_EXTERNAL) { if (!ttm->sg) { struct dma_buf_attachment *attach; struct sg_table *sgt; @@ -1130,7 +1130,7 @@ static int amdgpu_ttm_tt_populate(struct ttm_device *bdev, return 0; }

- if (ttm->page_flags & TTM_PAGE_FLAG_SG) + if (ttm->page_flags & TTM_TT_FLAG_EXTERNAL) return 0;

ret = ttm_pool_alloc(&adev->mman.bdev.pool, ttm, ctx); @@ -1165,7 +1165,7 @@ static void amdgpu_ttm_tt_unpopulate(struct ttm_device *bdev, return; }

- if (ttm->page_flags & TTM_PAGE_FLAG_SG) + if (ttm->page_flags & TTM_TT_FLAG_EXTERNAL) return;

for (i = 0; i < ttm->num_pages; ++i) @@ -1198,8 +1198,8 @@ int amdgpu_ttm_tt_set_userptr(struct ttm_buffer_object *bo, return -ENOMEM; }

- /* Set TTM_PAGE_FLAG_SG before populate but after create. */ - bo->ttm->page_flags |= TTM_PAGE_FLAG_SG; + /* Set TTM_TT_FLAG_EXTERNAL before populate but after create. */ + bo->ttm->page_flags |= TTM_TT_FLAG_EXTERNAL;

gtt = (void *)bo->ttm; gtt->userptr = addr; diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c index b94497989995..a77e90f300fe 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c @@ -194,7 +194,7 @@ static struct ttm_tt *i915_ttm_tt_create(struct ttm_buffer_object *bo,

if (obj->flags & I915_BO_ALLOC_CPU_CLEAR && man->use_tt) - page_flags |= TTM_PAGE_FLAG_ZERO_ALLOC; + page_flags |= TTM_TT_FLAG_ZERO_ALLOC;

ret = ttm_tt_init(&i915_tt->ttm, bo, page_flags, i915_ttm_select_tt_caching(obj)); @@ -562,7 +562,7 @@ static int i915_ttm_move(struct ttm_buffer_object *bo, bool evict, }

/* Populate ttm with pages if needed. Typically system memory. */ - if (ttm && (dst_man->use_tt || (ttm->page_flags & TTM_PAGE_FLAG_SWAPPED))) { + if (ttm && (dst_man->use_tt || (ttm->page_flags & TTM_TT_FLAG_SWAPPED))) { ret = ttm_tt_populate(bo->bdev, ttm, ctx); if (ret) return ret; @@ -573,7 +573,7 @@ static int i915_ttm_move(struct ttm_buffer_object *bo, bool evict, return PTR_ERR(dst_st);

clear = !cpu_maps_iomem(bo->resource) && (!ttm || !ttm_tt_is_populated(ttm)); - if (!(clear && ttm && !(ttm->page_flags & TTM_PAGE_FLAG_ZERO_ALLOC))) + if (!(clear && ttm && !(ttm->page_flags & TTM_TT_FLAG_ZERO_ALLOC))) __i915_ttm_move(bo, clear, dst_mem, bo->ttm, dst_st, true);

ttm_bo_move_sync_cleanup(bo, dst_mem); diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.c b/drivers/gpu/drm/nouveau/nouveau_bo.c index d3b21d318b42..12b107acb6ee 100644 --- a/drivers/gpu/drm/nouveau/nouveau_bo.c +++ b/drivers/gpu/drm/nouveau/nouveau_bo.c @@ -1250,7 +1250,7 @@ nouveau_ttm_tt_populate(struct ttm_device *bdev, struct ttm_tt *ttm_dma = (void *)ttm; struct nouveau_drm *drm; struct device *dev; - bool slave = !!(ttm->page_flags & TTM_PAGE_FLAG_SG); + bool slave = !!(ttm->page_flags & TTM_TT_FLAG_EXTERNAL);

if (ttm_tt_is_populated(ttm)) return 0; @@ -1273,7 +1273,7 @@ nouveau_ttm_tt_unpopulate(struct ttm_device *bdev, { struct nouveau_drm *drm; struct device *dev; - bool slave = !!(ttm->page_flags & TTM_PAGE_FLAG_SG); + bool slave = !!(ttm->page_flags & TTM_TT_FLAG_EXTERNAL);

if (slave) return; diff --git a/drivers/gpu/drm/radeon/radeon_ttm.c b/drivers/gpu/drm/radeon/radeon_ttm.c index 7793249bc549..11b21d605584 100644 --- a/drivers/gpu/drm/radeon/radeon_ttm.c +++ b/drivers/gpu/drm/radeon/radeon_ttm.c @@ -545,14 +545,14 @@ static int radeon_ttm_tt_populate(struct ttm_device *bdev, { struct radeon_device *rdev = radeon_get_rdev(bdev); struct radeon_ttm_tt *gtt = radeon_ttm_tt_to_gtt(rdev, ttm); - bool slave = !!(ttm->page_flags & TTM_PAGE_FLAG_SG); + bool slave = !!(ttm->page_flags & TTM_TT_FLAG_EXTERNAL);

if (gtt && gtt->userptr) { ttm->sg = kzalloc(sizeof(struct sg_table), GFP_KERNEL); if (!ttm->sg) return -ENOMEM;

- ttm->page_flags |= TTM_PAGE_FLAG_SG; + ttm->page_flags |= TTM_TT_FLAG_EXTERNAL; return 0; }

@@ -569,13 +569,13 @@ static void radeon_ttm_tt_unpopulate(struct ttm_device *bdev, struct ttm_tt *ttm { struct radeon_device *rdev = radeon_get_rdev(bdev); struct radeon_ttm_tt *gtt = radeon_ttm_tt_to_gtt(rdev, ttm); - bool slave = !!(ttm->page_flags & TTM_PAGE_FLAG_SG); + bool slave = !!(ttm->page_flags & TTM_TT_FLAG_EXTERNAL);

radeon_ttm_tt_unbind(bdev, ttm);

if (gtt && gtt->userptr) { kfree(ttm->sg); - ttm->page_flags &= ~TTM_PAGE_FLAG_SG; + ttm->page_flags &= ~TTM_TT_FLAG_EXTERNAL; return; }

diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c index 3b22c0013dbf..d62b2013c367 100644 --- a/drivers/gpu/drm/ttm/ttm_bo.c +++ b/drivers/gpu/drm/ttm/ttm_bo.c @@ -1115,8 +1115,8 @@ int ttm_bo_swapout(struct ttm_buffer_object *bo, struct ttm_operation_ctx *ctx, return -EBUSY;

if (!bo->ttm || !ttm_tt_is_populated(bo->ttm) || - bo->ttm->page_flags & TTM_PAGE_FLAG_SG || - bo->ttm->page_flags & TTM_PAGE_FLAG_SWAPPED || + bo->ttm->page_flags & TTM_TT_FLAG_EXTERNAL || + bo->ttm->page_flags & TTM_TT_FLAG_SWAPPED || !ttm_bo_get_unless_zero(bo)) { if (locked) dma_resv_unlock(bo->base.resv); diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c b/drivers/gpu/drm/ttm/ttm_bo_util.c index c893c3db2623..a342d701c91c 100644 --- a/drivers/gpu/drm/ttm/ttm_bo_util.c +++ b/drivers/gpu/drm/ttm/ttm_bo_util.c @@ -147,7 +147,7 @@ int ttm_bo_move_memcpy(struct ttm_buffer_object *bo, bool clear; int ret = 0;

- if (ttm && ((ttm->page_flags & TTM_PAGE_FLAG_SWAPPED) || + if (ttm && ((ttm->page_flags & TTM_TT_FLAG_SWAPPED) || dst_man->use_tt)) { ret = ttm_tt_populate(bdev, ttm, ctx); if (ret) @@ -169,7 +169,7 @@ int ttm_bo_move_memcpy(struct ttm_buffer_object *bo, }

clear = src_iter->ops->maps_tt && (!ttm || !ttm_tt_is_populated(ttm)); - if (!(clear && ttm && !(ttm->page_flags & TTM_PAGE_FLAG_ZERO_ALLOC))) + if (!(clear && ttm && !(ttm->page_flags & TTM_TT_FLAG_ZERO_ALLOC))) ttm_move_memcpy(clear, dst_mem->num_pages, dst_iter, src_iter);

if (!src_iter->ops->maps_tt) diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c b/drivers/gpu/drm/ttm/ttm_bo_vm.c index 9a2119fe4bdd..950f4f132802 100644 --- a/drivers/gpu/drm/ttm/ttm_bo_vm.c +++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c @@ -162,7 +162,7 @@ vm_fault_t ttm_bo_vm_reserve(struct ttm_buffer_object *bo, * Refuse to fault imported pages. This should be handled * (if at all) by redirecting mmap to the exporter. */ - if (bo->ttm && (bo->ttm->page_flags & TTM_PAGE_FLAG_SG)) { + if (bo->ttm && (bo->ttm->page_flags & TTM_TT_FLAG_EXTERNAL)) { dma_resv_unlock(bo->base.resv); return VM_FAULT_SIGBUS; } diff --git a/drivers/gpu/drm/ttm/ttm_pool.c b/drivers/gpu/drm/ttm/ttm_pool.c index c961a788b519..1bba0a0ed3f9 100644 --- a/drivers/gpu/drm/ttm/ttm_pool.c +++ b/drivers/gpu/drm/ttm/ttm_pool.c @@ -371,7 +371,7 @@ int ttm_pool_alloc(struct ttm_pool *pool, struct ttm_tt *tt, WARN_ON(!num_pages || ttm_tt_is_populated(tt)); WARN_ON(dma_addr && !pool->dev);

- if (tt->page_flags & TTM_PAGE_FLAG_ZERO_ALLOC) + if (tt->page_flags & TTM_TT_FLAG_ZERO_ALLOC) gfp_flags |= __GFP_ZERO;

if (ctx->gfp_retry_mayfail) diff --git a/drivers/gpu/drm/ttm/ttm_tt.c b/drivers/gpu/drm/ttm/ttm_tt.c index 980ecb079b2c..86f31fde6e35 100644 --- a/drivers/gpu/drm/ttm/ttm_tt.c +++ b/drivers/gpu/drm/ttm/ttm_tt.c @@ -68,12 +68,12 @@ int ttm_tt_create(struct ttm_buffer_object *bo, bool zero_alloc) switch (bo->type) { case ttm_bo_type_device: if (zero_alloc) - page_flags |= TTM_PAGE_FLAG_ZERO_ALLOC; + page_flags |= TTM_TT_FLAG_ZERO_ALLOC; break; case ttm_bo_type_kernel: break; case ttm_bo_type_sg: - page_flags |= TTM_PAGE_FLAG_SG; + page_flags |= TTM_TT_FLAG_EXTERNAL; break; default: pr_err("Illegal buffer object type\n"); @@ -156,7 +156,7 @@ EXPORT_SYMBOL(ttm_tt_init);

void ttm_tt_fini(struct ttm_tt *ttm) { - WARN_ON(ttm->page_flags & TTM_PAGE_FLAG_PRIV_POPULATED); + WARN_ON(ttm->page_flags & TTM_TT_FLAG_PRIV_POPULATED);

if (ttm->swap_storage) fput(ttm->swap_storage); @@ -178,7 +178,7 @@ int ttm_sg_tt_init(struct ttm_tt *ttm, struct ttm_buffer_object *bo,

ttm_tt_init_fields(ttm, bo, page_flags, caching);

- if (page_flags & TTM_PAGE_FLAG_SG) + if (page_flags & TTM_TT_FLAG_EXTERNAL) ret = ttm_sg_tt_alloc_page_directory(ttm); else ret = ttm_dma_tt_alloc_page_directory(ttm); @@ -224,7 +224,7 @@ int ttm_tt_swapin(struct ttm_tt *ttm)

fput(swap_storage); ttm->swap_storage = NULL; - ttm->page_flags &= ~TTM_PAGE_FLAG_SWAPPED; + ttm->page_flags &= ~TTM_TT_FLAG_SWAPPED;

return 0;

@@ -279,7 +279,7 @@ int ttm_tt_swapout(struct ttm_device *bdev, struct ttm_tt *ttm,

ttm_tt_unpopulate(bdev, ttm); ttm->swap_storage = swap_storage; - ttm->page_flags |= TTM_PAGE_FLAG_SWAPPED; + ttm->page_flags |= TTM_TT_FLAG_SWAPPED;

return ttm->num_pages;

@@ -300,7 +300,7 @@ int ttm_tt_populate(struct ttm_device *bdev, if (ttm_tt_is_populated(ttm)) return 0;

- if (!(ttm->page_flags & TTM_PAGE_FLAG_SG)) { + if (!(ttm->page_flags & TTM_TT_FLAG_EXTERNAL)) { atomic_long_add(ttm->num_pages, &ttm_pages_allocated); if (bdev->pool.use_dma32) atomic_long_add(ttm->num_pages, @@ -325,8 +325,8 @@ int ttm_tt_populate(struct ttm_device *bdev, if (ret) goto error;

- ttm->page_flags |= TTM_PAGE_FLAG_PRIV_POPULATED; - if (unlikely(ttm->page_flags & TTM_PAGE_FLAG_SWAPPED)) { + ttm->page_flags |= TTM_TT_FLAG_PRIV_POPULATED; + if (unlikely(ttm->page_flags & TTM_TT_FLAG_SWAPPED)) { ret = ttm_tt_swapin(ttm); if (unlikely(ret != 0)) { ttm_tt_unpopulate(bdev, ttm); @@ -337,7 +337,7 @@ int ttm_tt_populate(struct ttm_device *bdev, return 0;

error: - if (!(ttm->page_flags & TTM_PAGE_FLAG_SG)) { + if (!(ttm->page_flags & TTM_TT_FLAG_EXTERNAL)) { atomic_long_sub(ttm->num_pages, &ttm_pages_allocated); if (bdev->pool.use_dma32) atomic_long_sub(ttm->num_pages, @@ -357,14 +357,14 @@ void ttm_tt_unpopulate(struct ttm_device *bdev, struct ttm_tt *ttm) else ttm_pool_free(&bdev->pool, ttm);

- if (!(ttm->page_flags & TTM_PAGE_FLAG_SG)) { + if (!(ttm->page_flags & TTM_TT_FLAG_EXTERNAL)) { atomic_long_sub(ttm->num_pages, &ttm_pages_allocated); if (bdev->pool.use_dma32) atomic_long_sub(ttm->num_pages, &ttm_dma32_pages_allocated); }

- ttm->page_flags &= ~TTM_PAGE_FLAG_PRIV_POPULATED; + ttm->page_flags &= ~TTM_TT_FLAG_PRIV_POPULATED; }

#ifdef CONFIG_DEBUG_FS diff --git a/include/drm/ttm/ttm_device.h b/include/drm/ttm/ttm_device.h index cbe03d45e883..0a4ddec78d8f 100644 --- a/include/drm/ttm/ttm_device.h +++ b/include/drm/ttm/ttm_device.h @@ -65,7 +65,7 @@ struct ttm_device_funcs { * ttm_tt_create * * @bo: The buffer object to create the ttm for. - * @page_flags: Page flags as identified by TTM_PAGE_FLAG_XX flags. + * @page_flags: Page flags as identified by TTM_TT_FLAG_XX flags. * * Create a struct ttm_tt to back data with system memory pages. * No pages are actually allocated. diff --git a/include/drm/ttm/ttm_tt.h b/include/drm/ttm/ttm_tt.h index 842ce756213c..b023cd58ff38 100644 --- a/include/drm/ttm/ttm_tt.h +++ b/include/drm/ttm/ttm_tt.h @@ -38,17 +38,17 @@ struct ttm_resource; struct ttm_buffer_object; struct ttm_operation_ctx;

-#define TTM_PAGE_FLAG_SWAPPED (1 << 4) -#define TTM_PAGE_FLAG_ZERO_ALLOC (1 << 6) -#define TTM_PAGE_FLAG_SG (1 << 8) +#define TTM_TT_FLAG_SWAPPED (1 << 0) +#define TTM_TT_FLAG_ZERO_ALLOC (1 << 1) +#define TTM_TT_FLAG_EXTERNAL (1 << 2)

-#define TTM_PAGE_FLAG_PRIV_POPULATED (1 << 31) +#define TTM_TT_FLAG_PRIV_POPULATED (1 << 31)

/** * struct ttm_tt * * @pages: Array of pages backing the data. - * @page_flags: see TTM_PAGE_FLAG_* + * @page_flags: see TTM_TT_FLAG_* * @num_pages: Number of pages in the page array. * @sg: for SG objects via dma-buf * @dma_address: The DMA (bus) addresses of the pages @@ -84,7 +84,7 @@ struct ttm_kmap_iter_tt {

static inline bool ttm_tt_is_populated(struct ttm_tt *tt) { - return tt->page_flags & TTM_PAGE_FLAG_PRIV_POPULATED; + return tt->page_flags & TTM_TT_FLAG_PRIV_POPULATED; }

/** @@ -103,7 +103,7 @@ int ttm_tt_create(struct ttm_buffer_object *bo, bool zero_alloc); * * @ttm: The struct ttm_tt. * @bo: The buffer object we create the ttm for. - * @page_flags: Page flags as identified by TTM_PAGE_FLAG_XX flags. + * @page_flags: Page flags as identified by TTM_TT_FLAG_XX flags. * @caching: the desired caching state of the pages * * Create a struct ttm_tt to back data with system memory pages. @@ -178,7 +178,7 @@ void ttm_tt_unpopulate(struct ttm_device *bdev, struct ttm_tt *ttm); */ static inline void ttm_tt_mark_for_clear(struct ttm_tt *ttm) { - ttm->page_flags |= TTM_PAGE_FLAG_ZERO_ALLOC; + ttm->page_flags |= TTM_TT_FLAG_ZERO_ALLOC; }

void ttm_tt_mgr_init(unsigned long num_pages, unsigned long num_dma32_pages); @@ -194,7 +194,7 @@ struct ttm_kmap_iter *ttm_kmap_iter_tt_init(struct ttm_kmap_iter_tt *iter_tt, * * @bo: Buffer object we allocate the ttm for. * @bridge: The agp bridge this device is sitting on. - * @page_flags: Page flags as identified by TTM_PAGE_FLAG_XX flags. + * @page_flags: Page flags as identified by TTM_TT_FLAG_XX flags. * * * Create a TTM backend that uses the indicated AGP bridge as an aperture

-- 2.26.3

Matthew Auld

11:41 a.m.

New subject: [PATCH v5 06/13] drm/ttm: add some kernel-doc for TTM_TT_FLAG_*

Move it to inline kernel-doc, otherwise we can't add empty lines it seems. Also drop the kernel-doc for pages_list, which doesn't seem to exist.

v2(Christian):
  - Add a note that FLAG_SWAPPED shouldn't need to be touched by drivers.
  - Mention what FLAG_POPULATED does.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Christian König <christian.koenig@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
---
 include/drm/ttm/ttm_tt.h | 60 +++++++++++++++++++++++++++-------------
 1 file changed, 41 insertions(+), 19 deletions(-)

diff --git a/include/drm/ttm/ttm_tt.h b/include/drm/ttm/ttm_tt.h
index b023cd58ff38..86d74069be3e 100644
--- a/include/drm/ttm/ttm_tt.h
+++ b/include/drm/ttm/ttm_tt.h
@@ -38,35 +38,57 @@ struct ttm_resource;
 struct ttm_buffer_object;
 struct ttm_operation_ctx;

-#define TTM_TT_FLAG_SWAPPED	(1 << 0)
-#define TTM_TT_FLAG_ZERO_ALLOC	(1 << 1)
-#define TTM_TT_FLAG_EXTERNAL	(1 << 2)
-
-#define TTM_TT_FLAG_PRIV_POPULATED	(1 << 31)
-
 /**
- * struct ttm_tt
- *
- * @pages: Array of pages backing the data.
- * @page_flags: see TTM_TT_FLAG_*
- * @num_pages: Number of pages in the page array.
- * @sg: for SG objects via dma-buf
- * @dma_address: The DMA (bus) addresses of the pages
- * @swap_storage: Pointer to shmem struct file for swap storage.
- * @pages_list: used by some page allocation backend
- * @caching: The current caching state of the pages, see enum ttm_caching.
- *
- * This is a structure holding the pages, caching- and aperture binding
- * status for a buffer object that isn't backed by fixed (VRAM / AGP)
+ * struct ttm_tt - This is a structure holding the pages, caching- and aperture
+ * binding status for a buffer object that isn't backed by fixed (VRAM / AGP)
  * memory.
  */
 struct ttm_tt {
+	/** @pages: Array of pages backing the data. */
 	struct page **pages;
+	/**
+	 * @page_flags: The page flags.
+	 *
+	 * Supported values:
+	 *
+	 * TTM_TT_FLAG_SWAPPED: Set by TTM when the pages have been unpopulated
+	 * and swapped out by TTM. Calling ttm_tt_populate() will then swap the
+	 * pages back in, and unset the flag. Drivers should in general never
+	 * need to touch this.
+	 *
+	 * TTM_TT_FLAG_ZERO_ALLOC: Set if the pages will be zeroed on
+	 * allocation.
+	 *
+	 * TTM_TT_FLAG_EXTERNAL: Set if the underlying pages were allocated
+	 * externally, like with dma-buf or userptr. This effectively disables
+	 * TTM swapping out such pages. Also important is to prevent TTM from
+	 * ever directly mapping these pages.
+	 *
+	 * Note that enum ttm_bo_type.ttm_bo_type_sg objects will always enable
+	 * this flag.
+	 *
+	 * TTM_TT_FLAG_PRIV_POPULATED: TTM internal only. DO NOT USE. This is
+	 * set by TTM after ttm_tt_populate() has successfully returned, and is
+	 * then unset when TTM calls ttm_tt_unpopulate().
+	 */
+#define TTM_TT_FLAG_SWAPPED	(1 << 0)
+#define TTM_TT_FLAG_ZERO_ALLOC	(1 << 1)
+#define TTM_TT_FLAG_EXTERNAL	(1 << 2)
+
+#define TTM_TT_FLAG_PRIV_POPULATED	(1 << 31)
 	uint32_t page_flags;
+	/** @num_pages: Number of pages in the page array. */
 	uint32_t num_pages;
+	/** @sg: for SG objects via dma-buf. */
 	struct sg_table *sg;
+	/** @dma_address: The DMA (bus) addresses of the pages. */
 	dma_addr_t *dma_address;
+	/** @swap_storage: Pointer to shmem struct file for swap storage. */
 	struct file *swap_storage;
+	/**
+	 * @caching: The current caching state of the pages, see enum
+	 * ttm_caching.
+	 */
 	enum ttm_caching caching;
 };

-- 2.26.3
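
As a rough illustration of how the flags documented in patch 06 interact, the swap-out eligibility test from ttm_bo_swapout() (visible in the patch 05 diff) can be reduced to a pure function of page_flags. This is a sketch, not TTM code: sketch_can_swapout() and sketch_selfcheck() are hypothetical names, the flag values are copied from the series but written with 1u to avoid shifting into the sign bit, and the real function additionally checks that bo->ttm exists and takes a reference on the buffer object.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Same bit positions as the series; 1u keeps bit 31 well defined. */
#define TTM_TT_FLAG_SWAPPED		(1u << 0)
#define TTM_TT_FLAG_ZERO_ALLOC		(1u << 1)
#define TTM_TT_FLAG_EXTERNAL		(1u << 2)
#define TTM_TT_FLAG_PRIV_POPULATED	(1u << 31)

/* Hypothetical helper: may TTM swap out a tt with these flags? */
static bool sketch_can_swapout(uint32_t page_flags)
{
	if (!(page_flags & TTM_TT_FLAG_PRIV_POPULATED))
		return false;	/* nothing resident to swap out */
	if (page_flags & TTM_TT_FLAG_EXTERNAL)
		return false;	/* pages were not allocated by TTM */
	if (page_flags & TTM_TT_FLAG_SWAPPED)
		return false;	/* already swapped out */
	return true;
}

/* Example checks; call from a test or main() to exercise the helper. */
static void sketch_selfcheck(void)
{
	const uint32_t resident = TTM_TT_FLAG_PRIV_POPULATED;

	assert(sketch_can_swapout(resident));
	assert(sketch_can_swapout(resident | TTM_TT_FLAG_ZERO_ALLOC));
	assert(!sketch_can_swapout(resident | TTM_TT_FLAG_EXTERNAL));
	assert(!sketch_can_swapout(TTM_TT_FLAG_SWAPPED));
}
```

TTM_TT_FLAG_ZERO_ALLOC plays no part in the decision. Only the populated, external and swapped bits matter here.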

Matthew Auld

11:41 a.m.

New subject: [PATCH v5 07/13] drm/ttm: add TTM_TT_FLAG_EXTERNAL_MAPPABLE

In commit:

commit 667a50db0477d47fdff01c666f5ee1ce26b5264c
Author: Thomas Hellstrom <thellstrom@vmware.com>
Date:   Fri Jan 3 11:17:18 2014 +0100

drm/ttm: Refuse to fault (prime-) imported pages

we introduced the restriction that imported pages should not be directly mappable through TTM (this also extends to userptr). In the next patch we want to introduce a shmem_tt backend, which should follow all the existing rules with TTM_TT_FLAG_EXTERNAL, since it will need to handle swapping itself, but with the above mapping restriction lifted.

v2(Christian):
  - Don't OR together EXTERNAL and EXTERNAL_MAPPABLE in the definition of
    EXTERNAL_MAPPABLE, just leave it to the caller to handle this correctly,
    otherwise we might encounter subtle issues.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/ttm/ttm_bo_vm.c |  6 ++++--
 drivers/gpu/drm/ttm/ttm_tt.c    |  3 +++
 include/drm/ttm/ttm_tt.h        | 19 ++++++++++++++++---
 3 files changed, 23 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c b/drivers/gpu/drm/ttm/ttm_bo_vm.c
index 950f4f132802..33680c94127c 100644
--- a/drivers/gpu/drm/ttm/ttm_bo_vm.c
+++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c
@@ -163,8 +163,10 @@ vm_fault_t ttm_bo_vm_reserve(struct ttm_buffer_object *bo,
 	 * (if at all) by redirecting mmap to the exporter.
 	 */
 	if (bo->ttm && (bo->ttm->page_flags & TTM_TT_FLAG_EXTERNAL)) {
-		dma_resv_unlock(bo->base.resv);
-		return VM_FAULT_SIGBUS;
+		if (!(bo->ttm->page_flags & TTM_TT_FLAG_EXTERNAL_MAPPABLE)) {
+			dma_resv_unlock(bo->base.resv);
+			return VM_FAULT_SIGBUS;
+		}
 	}

 	return 0;
diff --git a/drivers/gpu/drm/ttm/ttm_tt.c b/drivers/gpu/drm/ttm/ttm_tt.c
index 86f31fde6e35..7e83c00a3f48 100644
--- a/drivers/gpu/drm/ttm/ttm_tt.c
+++ b/drivers/gpu/drm/ttm/ttm_tt.c
@@ -84,6 +84,9 @@ int ttm_tt_create(struct ttm_buffer_object *bo, bool zero_alloc)
 	if (unlikely(bo->ttm == NULL))
 		return -ENOMEM;

+	WARN_ON(bo->ttm->page_flags & TTM_TT_FLAG_EXTERNAL_MAPPABLE &&
+		!(bo->ttm->page_flags & TTM_TT_FLAG_EXTERNAL));
+
 	return 0;
 }

diff --git a/include/drm/ttm/ttm_tt.h b/include/drm/ttm/ttm_tt.h
index 86d74069be3e..f20832139815 100644
--- a/include/drm/ttm/ttm_tt.h
+++ b/include/drm/ttm/ttm_tt.h
@@ -67,13 +67,26 @@ struct ttm_tt {
 	 * Note that enum ttm_bo_type.ttm_bo_type_sg objects will always enable
 	 * this flag.
 	 *
+	 * TTM_TT_FLAG_EXTERNAL_MAPPABLE: Same behaviour as
+	 * TTM_TT_FLAG_EXTERNAL, but with the reduced restriction that it is
+	 * still valid to use TTM to map the pages directly. This is useful when
+	 * implementing a ttm_tt backend which still allocates driver owned
+	 * pages underneath(say with shmem).
+	 *
+	 * Note that since this also implies TTM_TT_FLAG_EXTERNAL, the usage
+	 * here should always be:
+	 *
+	 *   page_flags = TTM_TT_FLAG_EXTERNAL |
+	 *		  TTM_TT_FLAG_EXTERNAL_MAPPABLE;
+	 *
 	 * TTM_TT_FLAG_PRIV_POPULATED: TTM internal only. DO NOT USE. This is
 	 * set by TTM after ttm_tt_populate() has successfully returned, and is
 	 * then unset when TTM calls ttm_tt_unpopulate().
 	 */
-#define TTM_TT_FLAG_SWAPPED	(1 << 0)
-#define TTM_TT_FLAG_ZERO_ALLOC	(1 << 1)
-#define TTM_TT_FLAG_EXTERNAL	(1 << 2)
+#define TTM_TT_FLAG_SWAPPED		(1 << 0)
+#define TTM_TT_FLAG_ZERO_ALLOC		(1 << 1)
+#define TTM_TT_FLAG_EXTERNAL		(1 << 2)
+#define TTM_TT_FLAG_EXTERNAL_MAPPABLE	(1 << 3)

 #define TTM_TT_FLAG_PRIV_POPULATED	(1 << 31)
 	uint32_t page_flags;

-- 2.26.3
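
The two rules patch 07 introduces can be written as pure functions of page_flags, which may make the intended flag combinations easier to see. This is a minimal sketch, not TTM code: the sketch_* helpers are hypothetical, the flag values are copied from the patch but written with 1u, and the real fault path also checks that bo->ttm is non-NULL and drops the reservation lock before returning VM_FAULT_SIGBUS.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TTM_TT_FLAG_EXTERNAL		(1u << 2)
#define TTM_TT_FLAG_EXTERNAL_MAPPABLE	(1u << 3)

/* Mirrors the WARN_ON() condition: MAPPABLE without EXTERNAL is invalid. */
static bool sketch_flags_valid(uint32_t page_flags)
{
	if (page_flags & TTM_TT_FLAG_EXTERNAL_MAPPABLE)
		return (page_flags & TTM_TT_FLAG_EXTERNAL) != 0;
	return true;
}

/* Mirrors the fault check: external pages fault only if marked mappable. */
static bool sketch_may_fault(uint32_t page_flags)
{
	if (!(page_flags & TTM_TT_FLAG_EXTERNAL))
		return true;
	return (page_flags & TTM_TT_FLAG_EXTERNAL_MAPPABLE) != 0;
}

/* The documented way for a backend to request the relaxed behaviour. */
static uint32_t sketch_mappable_flags(void)
{
	uint32_t page_flags = TTM_TT_FLAG_EXTERNAL |
			      TTM_TT_FLAG_EXTERNAL_MAPPABLE;

	assert(sketch_flags_valid(page_flags));
	return page_flags;
}
```

Keeping the two bits separate, rather than folding EXTERNAL into the definition of EXTERNAL_MAPPABLE, follows the v2 note above: the caller sets both, and the WARN_ON() catches a caller that forgets one.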

Matthew Auld

11:41 a.m.

New subject: [PATCH v5 08/13] drm/i915/gem: Break out some shmem backend utils

From: Thomas Hellström <thomas.hellstrom@linux.intel.com>

Break out some shmem backend utils for future reuse by the TTM backend: shmem_alloc_st(), shmem_free_st() and __shmem_writeback() which we can use to provide a shmem-backed TTM page pool for cached-only TTM buffer objects.

Main functional change here is that we now compute the page sizes using the dma segments rather than using the physical page address segments.

v2(Reported-by: kernel test robot <lkp@intel.com>)
  - Make sure we initialise the mapping on the error path in
    shmem_get_pages()

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_shmem.c | 181 +++++++++++++---------
 1 file changed, 106 insertions(+), 75 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c index 11f072193f3b..36b711ae9e28 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c @@ -25,46 +25,61 @@ static void check_release_pagevec(struct pagevec *pvec) cond_resched(); }

-static int shmem_get_pages(struct drm_i915_gem_object *obj) +static void shmem_free_st(struct sg_table *st, struct address_space *mapping, + bool dirty, bool backup) { - struct drm_i915_private *i915 = to_i915(obj->base.dev); - struct intel_memory_region *mem = obj->mm.region; - const unsigned long page_count = obj->base.size / PAGE_SIZE; + struct sgt_iter sgt_iter; + struct pagevec pvec; + struct page *page; + + mapping_clear_unevictable(mapping); + + pagevec_init(&pvec); + for_each_sgt_page(page, sgt_iter, st) { + if (dirty) + set_page_dirty(page); + + if (backup) + mark_page_accessed(page); + + if (!pagevec_add(&pvec, page)) + check_release_pagevec(&pvec); + } + if (pagevec_count(&pvec)) + check_release_pagevec(&pvec); + + sg_free_table(st); + kfree(st); +} + +static struct sg_table *shmem_alloc_st(struct drm_i915_private *i915, + size_t size, struct intel_memory_region *mr, + struct address_space *mapping, + unsigned int max_segment) +{ + const unsigned long page_count = size / PAGE_SIZE; unsigned long i; - struct address_space *mapping; struct sg_table *st; struct scatterlist *sg; - struct sgt_iter sgt_iter; struct page *page; unsigned long last_pfn = 0; /* suppress gcc warning */ - unsigned int max_segment = i915_sg_segment_size(); - unsigned int sg_page_sizes; gfp_t noreclaim; int ret;

- /* - * Assert that the object is not currently in any GPU domain. As it - * wasn't in the GTT, there shouldn't be any way it could have been in - * a GPU cache - */ - GEM_BUG_ON(obj->read_domains & I915_GEM_GPU_DOMAINS); - GEM_BUG_ON(obj->write_domain & I915_GEM_GPU_DOMAINS); - /* * If there's no chance of allocating enough pages for the whole * object, bail early. */ - if (obj->base.size > resource_size(&mem->region)) - return -ENOMEM; + if (size > resource_size(&mr->region)) + return ERR_PTR(-ENOMEM);

st = kmalloc(sizeof(*st), GFP_KERNEL); if (!st) - return -ENOMEM; + return ERR_PTR(-ENOMEM);

-rebuild_st: if (sg_alloc_table(st, page_count, GFP_KERNEL)) { kfree(st); - return -ENOMEM; + return ERR_PTR(-ENOMEM); }

/* @@ -73,14 +88,12 @@ static int shmem_get_pages(struct drm_i915_gem_object *obj) * * Fail silently without starting the shrinker */ - mapping = obj->base.filp->f_mapping; mapping_set_unevictable(mapping); noreclaim = mapping_gfp_constraint(mapping, ~__GFP_RECLAIM); noreclaim |= __GFP_NORETRY | __GFP_NOWARN;

sg = st->sgl; st->nents = 0; - sg_page_sizes = 0; for (i = 0; i < page_count; i++) { const unsigned int shrink[] = { I915_SHRINK_BOUND | I915_SHRINK_UNBOUND, @@ -135,10 +148,9 @@ static int shmem_get_pages(struct drm_i915_gem_object *obj) if (!i || sg->length >= max_segment || page_to_pfn(page) != last_pfn + 1) { - if (i) { - sg_page_sizes |= sg->length; + if (i) sg = sg_next(sg); - } + st->nents++; sg_set_page(sg, page, PAGE_SIZE, 0); } else { @@ -149,14 +161,65 @@ static int shmem_get_pages(struct drm_i915_gem_object *obj) /* Check that the i965g/gm workaround works. */ GEM_BUG_ON(gfp & __GFP_DMA32 && last_pfn >= 0x00100000UL); } - if (sg) { /* loop terminated early; short sg table */ - sg_page_sizes |= sg->length; + if (sg) /* loop terminated early; short sg table */ sg_mark_end(sg); - }

/* Trim unused sg entries to avoid wasting memory. */ i915_sg_trim(st);

+ return st; +err_sg: + sg_mark_end(sg); + if (sg != st->sgl) { + shmem_free_st(st, mapping, false, false); + } else { + mapping_clear_unevictable(mapping); + sg_free_table(st); + kfree(st); + } + + /* + * shmemfs first checks if there is enough memory to allocate the page + * and reports ENOSPC should there be insufficient, along with the usual + * ENOMEM for a genuine allocation failure. + * + * We use ENOSPC in our driver to mean that we have run out of aperture + * space and so want to translate the error from shmemfs back to our + * usual understanding of ENOMEM. + */ + if (ret == -ENOSPC) + ret = -ENOMEM; + + return ERR_PTR(ret); +} + +static int shmem_get_pages(struct drm_i915_gem_object *obj) +{ + struct drm_i915_private *i915 = to_i915(obj->base.dev); + struct intel_memory_region *mem = obj->mm.region; + struct address_space *mapping = obj->base.filp->f_mapping; + const unsigned long page_count = obj->base.size / PAGE_SIZE; + unsigned int max_segment = i915_sg_segment_size(); + struct sg_table *st; + struct sgt_iter sgt_iter; + struct page *page; + int ret; + + /* + * Assert that the object is not currently in any GPU domain. As it + * wasn't in the GTT, there shouldn't be any way it could have been in + * a GPU cache + */ + GEM_BUG_ON(obj->read_domains & I915_GEM_GPU_DOMAINS); + GEM_BUG_ON(obj->write_domain & I915_GEM_GPU_DOMAINS); + +rebuild_st: + st = shmem_alloc_st(i915, obj->base.size, mem, mapping, max_segment); + if (IS_ERR(st)) { + ret = PTR_ERR(st); + goto err_st; + } + ret = i915_gem_gtt_prepare_pages(obj, st); if (ret) { /* @@ -168,6 +231,7 @@ static int shmem_get_pages(struct drm_i915_gem_object *obj) for_each_sgt_page(page, sgt_iter, st) put_page(page); sg_free_table(st); + kfree(st);

max_segment = PAGE_SIZE; goto rebuild_st; @@ -200,28 +264,12 @@ static int shmem_get_pages(struct drm_i915_gem_object *obj) if (IS_JSL_EHL(i915) && obj->flags & I915_BO_ALLOC_USER) obj->cache_dirty = true;

- __i915_gem_object_set_pages(obj, st, sg_page_sizes); + __i915_gem_object_set_pages(obj, st, i915_sg_dma_sizes(st->sgl));

return 0;

-err_sg: - sg_mark_end(sg); err_pages: - mapping_clear_unevictable(mapping); - if (sg != st->sgl) { - struct pagevec pvec; - - pagevec_init(&pvec); - for_each_sgt_page(page, sgt_iter, st) { - if (!pagevec_add(&pvec, page)) - check_release_pagevec(&pvec); - } - if (pagevec_count(&pvec)) - check_release_pagevec(&pvec); - } - sg_free_table(st); - kfree(st); - + shmem_free_st(st, mapping, false, false); /* * shmemfs first checks if there is enough memory to allocate the page * and reports ENOSPC should there be insufficient, along with the usual @@ -231,6 +279,7 @@ static int shmem_get_pages(struct drm_i915_gem_object *obj) * space and so want to translate the error from shmemfs back to our * usual understanding of ENOMEM. */ +err_st: if (ret == -ENOSPC) ret = -ENOMEM;

@@ -251,10 +300,8 @@ shmem_truncate(struct drm_i915_gem_object *obj) obj->mm.pages = ERR_PTR(-EFAULT); }

-static void -shmem_writeback(struct drm_i915_gem_object *obj) +static void __shmem_writeback(size_t size, struct address_space *mapping) { - struct address_space *mapping; struct writeback_control wbc = { .sync_mode = WB_SYNC_NONE, .nr_to_write = SWAP_CLUSTER_MAX, @@ -270,10 +317,9 @@ shmem_writeback(struct drm_i915_gem_object *obj) * instead of invoking writeback so they are aged and paged out * as normal. */ - mapping = obj->base.filp->f_mapping;

/* Begin writeback on each dirty page */ - for (i = 0; i < obj->base.size >> PAGE_SHIFT; i++) { + for (i = 0; i < size >> PAGE_SHIFT; i++) { struct page *page;

page = find_lock_page(mapping, i); @@ -296,6 +342,12 @@ shmem_writeback(struct drm_i915_gem_object *obj) } }

+static void +shmem_writeback(struct drm_i915_gem_object *obj) +{ + __shmem_writeback(obj->base.size, obj->base.filp->f_mapping); +} + void __i915_gem_object_release_shmem(struct drm_i915_gem_object *obj, struct sg_table *pages, @@ -316,11 +368,6 @@ __i915_gem_object_release_shmem(struct drm_i915_gem_object *obj,

void i915_gem_object_put_pages_shmem(struct drm_i915_gem_object *obj, struct sg_table *pages) { - struct sgt_iter sgt_iter; - struct pagevec pvec; - struct page *page; - - GEM_WARN_ON(IS_DGFX(to_i915(obj->base.dev))); __i915_gem_object_release_shmem(obj, pages, true);

i915_gem_gtt_finish_pages(obj, pages); @@ -328,25 +375,9 @@ void i915_gem_object_put_pages_shmem(struct drm_i915_gem_object *obj, struct sg_ if (i915_gem_object_needs_bit17_swizzle(obj)) i915_gem_object_save_bit_17_swizzle(obj, pages);

- mapping_clear_unevictable(file_inode(obj->base.filp)->i_mapping); - - pagevec_init(&pvec); - for_each_sgt_page(page, sgt_iter, pages) { - if (obj->mm.dirty) - set_page_dirty(page); - - if (obj->mm.madv == I915_MADV_WILLNEED) - mark_page_accessed(page); - - if (!pagevec_add(&pvec, page)) - check_release_pagevec(&pvec); - } - if (pagevec_count(&pvec)) - check_release_pagevec(&pvec); + shmem_free_st(pages, file_inode(obj->base.filp)->i_mapping, + obj->mm.dirty, obj->mm.madv == I915_MADV_WILLNEED); obj->mm.dirty = false; - - sg_free_table(pages); - kfree(pages); }

static void

-- 2.26.3
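
The refactor above turns the allocation step into a helper that returns either a valid sg_table pointer or an errno encoded in the pointer, and folds the -ENOSPC to -ENOMEM translation into that helper. A minimal userspace sketch of the same convention follows. The ERR_PTR()/PTR_ERR()/IS_ERR() helpers are re-created here along the lines of the kernel's linux/err.h; struct table, table_alloc() and table_get() are hypothetical stand-ins for illustration only, not i915 code.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Userspace re-creation of the kernel's <linux/err.h> helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static inline int IS_ERR(const void *ptr)
{
	/* The top MAX_ERRNO addresses are reserved for encoded errnos. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical stand-in for struct sg_table. */
struct table {
	size_t nents;
};

/*
 * Hypothetical allocator in the style of shmem_alloc_st(): returns a valid
 * pointer on success or an errno encoded with ERR_PTR() on failure.
 * backing_err simulates an error reported by the backing store.
 */
static struct table *table_alloc(size_t size, size_t region_size,
				 int backing_err)
{
	struct table *t;
	int ret;

	/* Bail early if the request can never fit. */
	if (size > region_size)
		return ERR_PTR(-ENOMEM);

	t = malloc(sizeof(*t));
	if (!t)
		return ERR_PTR(-ENOMEM);

	if (backing_err) {
		ret = backing_err;
		free(t);
		/* Translate "no space" from the backing store to -ENOMEM. */
		if (ret == -ENOSPC)
			ret = -ENOMEM;
		return ERR_PTR(ret);
	}

	t->nents = size;
	return t;
}

/* Caller side: one IS_ERR() check recovers the errno from the pointer. */
static int table_get(size_t size, size_t region_size, int backing_err,
		     struct table **out)
{
	struct table *t;

	assert(out);
	t = table_alloc(size, region_size, backing_err);
	if (IS_ERR(t))
		return (int)PTR_ERR(t);

	*out = t;
	return 0;
}
```

Keeping the errno translation inside the allocator means every caller sees the same error codes, which is presumably why the patch duplicates that comment in the new helper.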

Matthew Auld

11:41 a.m.

New subject: [PATCH v5 09/13] drm/i915/ttm: add tt shmem backend

For cached objects we can allocate our pages directly in shmem. This should make it possible (in a later patch) to utilise the existing i915-gem shrinker code for such objects. For now this is still disabled.

v2(Thomas):
  - Add optional try_to_writeback hook for objects. Importantly we need to check if the object is even still shrinkable; in between us dropping the shrinker LRU lock and acquiring the object lock it could for example have been moved. Also we need to differentiate between "lazy" shrinking and the immediate writeback mode. Also later we need to handle objects which don't even have mm.pages, so bundling this into put_pages() would require somehow handling that edge case, hence just letting the ttm backend handle everything in try_to_writeback doesn't seem too bad.
v3(Thomas):
  - Likely a bad idea to touch the object from the unpopulate hook, since it's not possible to hold a reference, without also creating circular dependency, so likely this is too fragile. For now just ensure we at least mark the pages as dirty/accessed when called from the shrinker on WILLNEED objects.
  - s/try_to_writeback/shrinker_release_pages, since this can do more than just writeback.
  - Get rid of do_backup boolean and just set the SWAPPED flag prior to calling unpopulate.
  - Keep shmem_tt as lowest priority for the TTM LRU bo_swapout walk, since these just get skipped anyway. We can try to come up with something better later.

Signed-off-by: Matthew Auld matthew.auld@intel.com
Cc: Thomas Hellström thomas.hellstrom@linux.intel.com
Cc: Christian König christian.koenig@amd.com
---
 drivers/gpu/drm/i915/gem/i915_gem_object.h    |   8 +
 .../gpu/drm/i915/gem/i915_gem_object_types.h  |   2 +
 drivers/gpu/drm/i915/gem/i915_gem_shmem.c     |  14 +-
 drivers/gpu/drm/i915/gem/i915_gem_shrinker.c  |  17 +-
 drivers/gpu/drm/i915/gem/i915_gem_ttm.c       | 240 ++++++++++++++++--
 5 files changed, 245 insertions(+), 36 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h index 3043fcbd31bd..1c9a1d8d3434 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object.h +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h @@ -601,6 +601,14 @@ int i915_gem_object_wait_migration(struct drm_i915_gem_object *obj, bool i915_gem_object_placement_possible(struct drm_i915_gem_object *obj, enum intel_memory_type type);

+struct sg_table *shmem_alloc_st(struct drm_i915_private *i915, + size_t size, struct intel_memory_region *mr, + struct address_space *mapping, + unsigned int max_segment); +void shmem_free_st(struct sg_table *st, struct address_space *mapping, + bool dirty, bool backup); +void __shmem_writeback(size_t size, struct address_space *mapping); + #ifdef CONFIG_MMU_NOTIFIER static inline bool i915_gem_object_is_userptr(struct drm_i915_gem_object *obj) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h index fa2ba9e2a4d0..f0fb17be2f7a 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h +++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h @@ -56,6 +56,8 @@ struct drm_i915_gem_object_ops { struct sg_table *pages); void (*truncate)(struct drm_i915_gem_object *obj); void (*writeback)(struct drm_i915_gem_object *obj); + int (*shrinker_release_pages)(struct drm_i915_gem_object *obj, + bool should_writeback);

int (*pread)(struct drm_i915_gem_object *obj, const struct drm_i915_gem_pread *arg); diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c index 36b711ae9e28..19e55cc29a15 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c @@ -25,8 +25,8 @@ static void check_release_pagevec(struct pagevec *pvec) cond_resched(); }

-static void shmem_free_st(struct sg_table *st, struct address_space *mapping, - bool dirty, bool backup) +void shmem_free_st(struct sg_table *st, struct address_space *mapping, + bool dirty, bool backup) { struct sgt_iter sgt_iter; struct pagevec pvec; @@ -52,10 +52,10 @@ static void shmem_free_st(struct sg_table *st, struct address_space *mapping, kfree(st); }

-static struct sg_table *shmem_alloc_st(struct drm_i915_private *i915, - size_t size, struct intel_memory_region *mr, - struct address_space *mapping, - unsigned int max_segment) +struct sg_table *shmem_alloc_st(struct drm_i915_private *i915, + size_t size, struct intel_memory_region *mr, + struct address_space *mapping, + unsigned int max_segment) { const unsigned long page_count = size / PAGE_SIZE; unsigned long i; @@ -300,7 +300,7 @@ shmem_truncate(struct drm_i915_gem_object *obj) obj->mm.pages = ERR_PTR(-EFAULT); }

-static void __shmem_writeback(size_t size, struct address_space *mapping) +void __shmem_writeback(size_t size, struct address_space *mapping) { struct writeback_control wbc = { .sync_mode = WB_SYNC_NONE, diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c index e382b7f2353b..cc80bd23d323 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c @@ -56,19 +56,24 @@ static bool unsafe_drop_pages(struct drm_i915_gem_object *obj, return false; }

-static void try_to_writeback(struct drm_i915_gem_object *obj, - unsigned int flags) +static int try_to_writeback(struct drm_i915_gem_object *obj, unsigned int flags) { + if (obj->ops->shrinker_release_pages) + return obj->ops->shrinker_release_pages(obj, + flags & I915_SHRINK_WRITEBACK); + switch (obj->mm.madv) { case I915_MADV_DONTNEED: i915_gem_object_truncate(obj); - return; + return 0; case __I915_MADV_PURGED: - return; + return 0; }

if (flags & I915_SHRINK_WRITEBACK) i915_gem_object_writeback(obj); + + return 0; }

/** @@ -222,8 +227,8 @@ i915_gem_shrink(struct i915_gem_ww_ctx *ww, }

if (!__i915_gem_object_put_pages(obj)) { - try_to_writeback(obj, shrink); - count += obj->base.size >> PAGE_SHIFT; + if (!try_to_writeback(obj, shrink)) + count += obj->base.size >> PAGE_SHIFT; } if (!ww) i915_gem_object_unlock(obj); diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c index a77e90f300fe..c7402995a8f9 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c @@ -35,6 +35,8 @@ * @ttm: The base TTM page vector. * @dev: The struct device used for dma mapping and unmapping. * @cached_st: The cached scatter-gather table. + * @is_shmem: Set if using shmem. + * @filp: The shmem file, if using shmem backend. * * Note that DMA may be going on right up to the point where the page- * vector is unpopulated in delayed destroy. Hence keep the @@ -46,6 +48,9 @@ struct i915_ttm_tt { struct ttm_tt ttm; struct device *dev; struct sg_table *cached_st; + + bool is_shmem; + struct file *filp; };

static const struct ttm_place sys_placement_flags = { @@ -179,12 +184,90 @@ i915_ttm_placement_from_obj(const struct drm_i915_gem_object *obj, placement->busy_placement = busy; }

+static int i915_ttm_tt_shmem_populate(struct ttm_device *bdev, + struct ttm_tt *ttm, + struct ttm_operation_ctx *ctx) +{ + struct drm_i915_private *i915 = container_of(bdev, typeof(*i915), bdev); + struct intel_memory_region *mr = i915->mm.regions[INTEL_MEMORY_SYSTEM]; + struct i915_ttm_tt *i915_tt = container_of(ttm, typeof(*i915_tt), ttm); + const unsigned int max_segment = i915_sg_segment_size(); + const size_t size = ttm->num_pages << PAGE_SHIFT; + struct file *filp = i915_tt->filp; + struct sgt_iter sgt_iter; + struct sg_table *st; + struct page *page; + unsigned long i; + int err; + + if (!filp) { + struct address_space *mapping; + gfp_t mask; + + filp = shmem_file_setup("i915-shmem-tt", size, VM_NORESERVE); + if (IS_ERR(filp)) + return PTR_ERR(filp); + + mask = GFP_HIGHUSER | __GFP_RECLAIMABLE; + + mapping = filp->f_mapping; + mapping_set_gfp_mask(mapping, mask); + GEM_BUG_ON(!(mapping_gfp_mask(mapping) & __GFP_RECLAIM)); + + i915_tt->filp = filp; + } + + st = shmem_alloc_st(i915, size, mr, filp->f_mapping, max_segment); + if (IS_ERR(st)) + return PTR_ERR(st); + + err = dma_map_sg_attrs(i915_tt->dev, + st->sgl, st->nents, + PCI_DMA_BIDIRECTIONAL, + DMA_ATTR_SKIP_CPU_SYNC | + DMA_ATTR_NO_KERNEL_MAPPING | + DMA_ATTR_NO_WARN); + if (err <= 0) { + err = -EINVAL; + goto err_free_st; + } + + i = 0; + for_each_sgt_page(page, sgt_iter, st) + ttm->pages[i++] = page; + + if (ttm->page_flags & TTM_TT_FLAG_SWAPPED) + ttm->page_flags &= ~TTM_TT_FLAG_SWAPPED; + + i915_tt->cached_st = st; + return 0; + +err_free_st: + shmem_free_st(st, filp->f_mapping, false, false); + return err; +} + +static void i915_ttm_tt_shmem_unpopulate(struct ttm_tt *ttm) +{ + struct i915_ttm_tt *i915_tt = container_of(ttm, typeof(*i915_tt), ttm); + bool backup = ttm->page_flags & TTM_TT_FLAG_SWAPPED; + + dma_unmap_sg(i915_tt->dev, i915_tt->cached_st->sgl, + i915_tt->cached_st->nents, + PCI_DMA_BIDIRECTIONAL); + + shmem_free_st(fetch_and_zero(&i915_tt->cached_st), + 
file_inode(i915_tt->filp)->i_mapping, + backup, backup); +} + static struct ttm_tt *i915_ttm_tt_create(struct ttm_buffer_object *bo, uint32_t page_flags) { struct ttm_resource_manager *man = ttm_manager_type(bo->bdev, bo->resource->mem_type); struct drm_i915_gem_object *obj = i915_ttm_to_gem(bo); + enum ttm_caching caching = i915_ttm_select_tt_caching(obj); struct i915_ttm_tt *i915_tt; int ret;

@@ -196,36 +279,62 @@ static struct ttm_tt *i915_ttm_tt_create(struct ttm_buffer_object *bo, man->use_tt) page_flags |= TTM_TT_FLAG_ZERO_ALLOC;

- ret = ttm_tt_init(&i915_tt->ttm, bo, page_flags, - i915_ttm_select_tt_caching(obj)); - if (ret) { - kfree(i915_tt); - return NULL; + if (i915_gem_object_is_shrinkable(obj) && caching == ttm_cached) { + page_flags |= TTM_TT_FLAG_EXTERNAL | + TTM_TT_FLAG_EXTERNAL_MAPPABLE; + i915_tt->is_shmem = true; }

+ ret = ttm_tt_init(&i915_tt->ttm, bo, page_flags, caching); + if (ret) + goto err_free; + i915_tt->dev = obj->base.dev->dev;

return &i915_tt->ttm; + +err_free: + kfree(i915_tt); + return NULL; +} + +static int i915_ttm_tt_populate(struct ttm_device *bdev, + struct ttm_tt *ttm, + struct ttm_operation_ctx *ctx) +{ + struct i915_ttm_tt *i915_tt = container_of(ttm, typeof(*i915_tt), ttm); + + if (i915_tt->is_shmem) + return i915_ttm_tt_shmem_populate(bdev, ttm, ctx); + + return ttm_pool_alloc(&bdev->pool, ttm, ctx); }

static void i915_ttm_tt_unpopulate(struct ttm_device *bdev, struct ttm_tt *ttm) { struct i915_ttm_tt *i915_tt = container_of(ttm, typeof(*i915_tt), ttm);

- if (i915_tt->cached_st) { - dma_unmap_sgtable(i915_tt->dev, i915_tt->cached_st, - DMA_BIDIRECTIONAL, 0); - sg_free_table(i915_tt->cached_st); - kfree(i915_tt->cached_st); - i915_tt->cached_st = NULL; + if (i915_tt->is_shmem) { + i915_ttm_tt_shmem_unpopulate(ttm); + } else { + if (i915_tt->cached_st) { + dma_unmap_sgtable(i915_tt->dev, i915_tt->cached_st, + DMA_BIDIRECTIONAL, 0); + sg_free_table(i915_tt->cached_st); + kfree(i915_tt->cached_st); + i915_tt->cached_st = NULL; + } + ttm_pool_free(&bdev->pool, ttm); } - ttm_pool_free(&bdev->pool, ttm); }

static void i915_ttm_tt_destroy(struct ttm_device *bdev, struct ttm_tt *ttm) { struct i915_ttm_tt *i915_tt = container_of(ttm, typeof(*i915_tt), ttm);

+ if (i915_tt->filp) + fput(i915_tt->filp); + ttm_tt_fini(ttm); kfree(i915_tt); } @@ -235,6 +344,14 @@ static bool i915_ttm_eviction_valuable(struct ttm_buffer_object *bo, { struct drm_i915_gem_object *obj = i915_ttm_to_gem(bo);

+ /* + * EXTERNAL objects should never be swapped out by TTM, instead we need + * to handle that ourselves. TTM will already skip such objects for us, + * but we would like to avoid grabbing locks for no good reason. + */ + if (bo->ttm && bo->ttm->page_flags & TTM_TT_FLAG_EXTERNAL) + return -EBUSY; + /* Will do for now. Our pinned objects are still on TTM's LRU lists */ return i915_gem_object_evictable(obj); } @@ -328,9 +445,11 @@ static void i915_ttm_adjust_gem_after_move(struct drm_i915_gem_object *obj) i915_gem_object_set_cache_coherency(obj, cache_level); }

-static void i915_ttm_purge(struct drm_i915_gem_object *obj) +static int __i915_ttm_purge(struct drm_i915_gem_object *obj) { struct ttm_buffer_object *bo = i915_gem_to_ttm(obj); + struct i915_ttm_tt *i915_tt = + container_of(bo->ttm, typeof(*i915_tt), ttm); struct ttm_operation_ctx ctx = { .interruptible = true, .no_wait_gpu = false, @@ -339,17 +458,79 @@ static void i915_ttm_purge(struct drm_i915_gem_object *obj) int ret;

if (obj->mm.madv == __I915_MADV_PURGED) - return; + return 0;

- /* TTM's purge interface. Note that we might be reentering. */ ret = ttm_bo_validate(bo, &place, &ctx); - if (!ret) { - obj->write_domain = 0; - obj->read_domains = 0; - i915_ttm_adjust_gem_after_move(obj); - i915_ttm_free_cached_io_st(obj); - obj->mm.madv = __I915_MADV_PURGED; + if (ret) + return ret; + + if (bo->ttm && i915_tt->filp) { + /* + * The below fput(which eventually calls shmem_truncate) might + * be delayed by worker, so when directly called to purge the + * pages(like by the shrinker) we should try to be more + * aggressive and release the pages immediately. + */ + shmem_truncate_range(file_inode(i915_tt->filp), + 0, (loff_t)-1); + fput(fetch_and_zero(&i915_tt->filp)); + } + + obj->write_domain = 0; + obj->read_domains = 0; + i915_ttm_adjust_gem_after_move(obj); + i915_ttm_free_cached_io_st(obj); + obj->mm.madv = __I915_MADV_PURGED; + return 0; +} + +static void i915_ttm_purge(struct drm_i915_gem_object *obj) +{ + __i915_ttm_purge(obj); +} + +static int i915_ttm_shrinker_release_pages(struct drm_i915_gem_object *obj, + bool should_writeback) +{ + struct ttm_buffer_object *bo = i915_gem_to_ttm(obj); + struct i915_ttm_tt *i915_tt = + container_of(bo->ttm, typeof(*i915_tt), ttm); + struct ttm_operation_ctx ctx = { + .interruptible = true, + .no_wait_gpu = false, + }; + struct ttm_placement place = {}; + int ret; + + if (!bo->ttm || bo->resource->mem_type != TTM_PL_SYSTEM) + return 0; + + GEM_BUG_ON(!i915_tt->is_shmem); + + if (!i915_tt->filp) + return 0; + + switch (obj->mm.madv) { + case I915_MADV_DONTNEED: + return __i915_ttm_purge(obj); + case __I915_MADV_PURGED: + return 0; + } + + if (bo->ttm->page_flags & TTM_TT_FLAG_SWAPPED) + return 0; + + bo->ttm->page_flags |= TTM_TT_FLAG_SWAPPED; + ret = ttm_bo_validate(bo, &place, &ctx); + if (ret) { + bo->ttm->page_flags &= ~TTM_TT_FLAG_SWAPPED; + return ret; } + + if (should_writeback) + __shmem_writeback(obj->base.size, i915_tt->filp->f_mapping); + + return 0; }

static void i915_ttm_swap_notify(struct ttm_buffer_object *bo) @@ -618,6 +799,7 @@ static unsigned long i915_ttm_io_mem_pfn(struct ttm_buffer_object *bo,

static struct ttm_device_funcs i915_ttm_bo_driver = { .ttm_tt_create = i915_ttm_tt_create, + .ttm_tt_populate = i915_ttm_tt_populate, .ttm_tt_unpopulate = i915_ttm_tt_unpopulate, .ttm_tt_destroy = i915_ttm_tt_destroy, .eviction_valuable = i915_ttm_eviction_valuable, @@ -685,12 +867,17 @@ static int __i915_ttm_get_pages(struct drm_i915_gem_object *obj, }

if (!i915_gem_object_has_pages(obj)) { + struct i915_ttm_tt *i915_tt = + container_of(bo->ttm, typeof(*i915_tt), ttm); + /* Object either has a page vector or is an iomem object */ st = bo->ttm ? i915_ttm_tt_get_st(bo->ttm) : obj->ttm.cached_io_st; if (IS_ERR(st)) return PTR_ERR(st);

__i915_gem_object_set_pages(obj, st, i915_sg_dma_sizes(st->sgl)); + if (!bo->ttm || !i915_tt->is_shmem) + i915_gem_object_make_unshrinkable(obj); }

return ret; @@ -770,6 +957,8 @@ static void i915_ttm_put_pages(struct drm_i915_gem_object *obj, static void i915_ttm_adjust_lru(struct drm_i915_gem_object *obj) { struct ttm_buffer_object *bo = i915_gem_to_ttm(obj); + struct i915_ttm_tt *i915_tt = + container_of(bo->ttm, typeof(*i915_tt), ttm);

/* * Don't manipulate the TTM LRUs while in TTM bo destruction. @@ -782,7 +971,10 @@ static void i915_ttm_adjust_lru(struct drm_i915_gem_object *obj) * Put on the correct LRU list depending on the MADV status */ spin_lock(&bo->bdev->lru_lock); - if (obj->mm.madv != I915_MADV_WILLNEED) { + if (bo->ttm && i915_tt->filp) { + /* Try to keep shmem_tt from being considered for shrinking. */ + bo->priority = TTM_MAX_BO_PRIORITY - 1; + } else if (obj->mm.madv != I915_MADV_WILLNEED) { bo->priority = I915_TTM_PRIO_PURGE; } else if (!i915_gem_object_has_pages(obj)) { if (bo->priority < I915_TTM_PRIO_HAS_PAGES) @@ -887,9 +1079,12 @@ static const struct drm_i915_gem_object_ops i915_gem_ttm_obj_ops = { .get_pages = i915_ttm_get_pages, .put_pages = i915_ttm_put_pages, .truncate = i915_ttm_purge, + .shrinker_release_pages = i915_ttm_shrinker_release_pages, + .adjust_lru = i915_ttm_adjust_lru, .delayed_free = i915_ttm_delayed_free, .migrate = i915_ttm_migrate, + .mmap_offset = i915_ttm_mmap_offset, .mmap_ops = &vm_ops_ttm, }; @@ -937,7 +1132,6 @@ int __i915_gem_ttm_object_init(struct intel_memory_region *mem, drm_gem_private_object_init(&i915->drm, &obj->base, size); i915_gem_object_init(obj, &i915_gem_ttm_obj_ops, &lock_class, flags); i915_gem_object_init_memory_region(obj, mem); - i915_gem_object_make_unshrinkable(obj); INIT_RADIX_TREE(&obj->ttm.get_io_page.radix, GFP_KERNEL | __GFP_NOWARN); mutex_init(&obj->ttm.get_io_page.lock); bo_type = (obj->flags & I915_BO_ALLOC_USER) ? ttm_bo_type_device :

-- 2.26.3
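
The shrinker change in this patch follows a common pattern: an optional callback in the ops table takes over when present, a default path runs otherwise, and pages are only counted as reclaimed when the release reports success. A minimal userspace sketch of that control flow follows. struct object, struct object_ops, release_pages(), shrink() and busy_release() are hypothetical, trimmed-down stand-ins and do not reproduce the real i915 types.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct object;

/* Hypothetical, trimmed-down ops table with one optional hook. */
struct object_ops {
	int (*shrinker_release_pages)(struct object *obj,
				      bool should_writeback);
};

struct object {
	const struct object_ops *ops;
	size_t npages;
	bool written_back;
};

/* In the spirit of try_to_writeback(): prefer the backend hook if set. */
static int release_pages(struct object *obj, bool should_writeback)
{
	if (obj->ops->shrinker_release_pages)
		return obj->ops->shrinker_release_pages(obj, should_writeback);

	/* Default path: optionally start writeback, never fails here. */
	if (should_writeback)
		obj->written_back = true;
	return 0;
}

/* Only count pages as reclaimed when the release actually succeeded. */
static size_t shrink(struct object **objs, size_t n, bool should_writeback)
{
	size_t count = 0;

	for (size_t i = 0; i < n; i++) {
		assert(objs[i]->ops);
		if (!release_pages(objs[i], should_writeback))
			count += objs[i]->npages;
	}
	return count;
}

/* Hypothetical backend whose hook refuses, e.g. because the object moved. */
static int busy_release(struct object *obj, bool should_writeback)
{
	(void)obj;
	(void)should_writeback;
	return -EBUSY;
}

/* No hook set: release_pages() falls back to the default path. */
static const struct object_ops default_ops = { NULL };

static const struct object_ops busy_ops = {
	.shrinker_release_pages = busy_release,
};
```

With one object on each ops table, shrink() reports only the pages of the object whose release returned 0, which matches the reason the patch changes try_to_writeback() from void to int.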

Thomas Hellström

29 Sep 2021

11:07 a.m.

New subject: [PATCH v5 09/13] drm/i915/ttm: add tt shmem backend

On Mon, 2021-09-27 at 12:41 +0100, Matthew Auld wrote:

...

For cached objects we can allocate our pages directly in shmem. This should make it possible (in a later patch) to utilise the existing i915-gem shrinker code for such objects. For now this is still disabled.

Some minor comments below, with those either fixed or deemed unnecessary, Reviewed-by: Thomas Hellström thomas.hellstrom@linux.intel.com

...

+ err = dma_map_sg_attrs(i915_tt->dev, + st->sgl, st->nents, + PCI_DMA_BIDIRECTIONAL,

nit: Since this is a dma api call, should we use DMA_BIDIRECTIONAL instead of PCI_DMA_BIDIRECTIONAL? DMA_BIDIRECTIONAL is used elsewhere in this file, but not throughout the driver IIRC.

...

+                              DMA_ATTR_SKIP_CPU_SYNC | +                              DMA_ATTR_NO_KERNEL_MAPPING | +                              DMA_ATTR_NO_WARN); +       if (err <= 0) { +               err = -EINVAL; +               goto err_free_st; +       }

+       i = 0; +       for_each_sgt_page(page, sgt_iter, st) +               ttm->pages[i++] = page;

+       if (ttm->page_flags & TTM_TT_FLAG_SWAPPED) +               ttm->page_flags &= ~TTM_TT_FLAG_SWAPPED;

+       i915_tt->cached_st = st; +       return 0;

+err_free_st: +       shmem_free_st(st, filp->f_mapping, false, false); +       return err; +}

+static void i915_ttm_tt_shmem_unpopulate(struct ttm_tt *ttm) +{ +       struct i915_ttm_tt *i915_tt = container_of(ttm, typeof(*i915_tt), ttm); +       bool backup = ttm->page_flags & TTM_TT_FLAG_SWAPPED;

+       dma_unmap_sg(i915_tt->dev, i915_tt->cached_st->sgl, +                    i915_tt->cached_st->nents, +                    PCI_DMA_BIDIRECTIONAL);

Same here.

...

+       shmem_free_st(fetch_and_zero(&i915_tt->cached_st), +                     file_inode(i915_tt->filp)->i_mapping, +                     backup, backup); +}

static struct ttm_tt *i915_ttm_tt_create(struct ttm_buffer_object *bo,                                          uint32_t page_flags) {         struct ttm_resource_manager *man =                 ttm_manager_type(bo->bdev, bo->resource->mem_type);         struct drm_i915_gem_object *obj = i915_ttm_to_gem(bo); +       enum ttm_caching caching = i915_ttm_select_tt_caching(obj);         struct i915_ttm_tt *i915_tt;         int ret; @@ -196,36 +279,62 @@ static struct ttm_tt *i915_ttm_tt_create(struct ttm_buffer_object *bo,             man->use_tt)                 page_flags |= TTM_TT_FLAG_ZERO_ALLOC; -       ret = ttm_tt_init(&i915_tt->ttm, bo, page_flags, -                         i915_ttm_select_tt_caching(obj)); -       if (ret) { -               kfree(i915_tt); -               return NULL; +       if (i915_gem_object_is_shrinkable(obj) && caching == ttm_cached) { +               page_flags |= TTM_TT_FLAG_EXTERNAL | +                             TTM_TT_FLAG_EXTERNAL_MAPPABLE; +               i915_tt->is_shmem = true;         } +       ret = ttm_tt_init(&i915_tt->ttm, bo, page_flags, caching); +       if (ret) +               goto err_free;

i915_tt->dev = obj->base.dev->dev;         return &i915_tt->ttm;

+err_free: +       kfree(i915_tt); +       return NULL; +}

+static int i915_ttm_tt_populate(struct ttm_device *bdev, +                               struct ttm_tt *ttm, +                               struct ttm_operation_ctx *ctx) +{ +       struct i915_ttm_tt *i915_tt = container_of(ttm, typeof(*i915_tt), ttm);

+       if (i915_tt->is_shmem) +               return i915_ttm_tt_shmem_populate(bdev, ttm, ctx);

+       return ttm_pool_alloc(&bdev->pool, ttm, ctx); } static void i915_ttm_tt_unpopulate(struct ttm_device *bdev, struct ttm_tt *ttm) {         struct i915_ttm_tt *i915_tt = container_of(ttm, typeof(*i915_tt), ttm); -       if (i915_tt->cached_st) { -               dma_unmap_sgtable(i915_tt->dev, i915_tt->cached_st, -                                 DMA_BIDIRECTIONAL, 0); -               sg_free_table(i915_tt->cached_st); -               kfree(i915_tt->cached_st); -               i915_tt->cached_st = NULL; +       if (i915_tt->is_shmem) { +               i915_ttm_tt_shmem_unpopulate(ttm); +       } else { +               if (i915_tt->cached_st) { +                       dma_unmap_sgtable(i915_tt->dev, i915_tt->cached_st, +                                         DMA_BIDIRECTIONAL, 0); +                       sg_free_table(i915_tt->cached_st); +                       kfree(i915_tt->cached_st); +                       i915_tt->cached_st = NULL; +               } +               ttm_pool_free(&bdev->pool, ttm);         } -       ttm_pool_free(&bdev->pool, ttm); } static void i915_ttm_tt_destroy(struct ttm_device *bdev, struct ttm_tt *ttm) {         struct i915_ttm_tt *i915_tt = container_of(ttm, typeof(*i915_tt), ttm); +       if (i915_tt->filp) +               fput(i915_tt->filp);

ttm_tt_fini(ttm);         kfree(i915_tt); } @@ -235,6 +344,14 @@ static bool i915_ttm_eviction_valuable(struct ttm_buffer_object *bo, {         struct drm_i915_gem_object *obj = i915_ttm_to_gem(bo); +       /* +        * EXTERNAL objects should never be swapped out by TTM, instead we need +        * to handle that ourselves. TTM will already skip such objects for us, +        * but we would like to avoid grabbing locks for no good reason. +        */ +       if (bo->ttm && bo->ttm->page_flags & TTM_TT_FLAG_EXTERNAL) +               return -EBUSY;
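
One detail in the hunk above may be worth double-checking: i915_ttm_eviction_valuable() is declared as returning bool, so the early "return -EBUSY" is converted to true, since any non-zero value converts to true in C. If the intent is to tell TTM to skip EXTERNAL objects, "return false" would probably express that more directly. A minimal userspace sketch of the conversion follows; valuable_errno() and valuable_bool() are hypothetical stand-ins, not the driver function.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Any non-zero integer, including a negative errno, converts to true. */
static_assert((bool)-EBUSY == true, "negative errno converts to true");

/* Hypothetical bool callback that returns an errno on its early path. */
static bool valuable_errno(bool external, bool evictable)
{
	if (external)
		return -EBUSY;	/* silently becomes true */
	return evictable;
}

/* Same shape, but the early path states the intended answer explicitly. */
static bool valuable_bool(bool external, bool evictable)
{
	if (external)
		return false;	/* "not worth evicting" */
	return evictable;
}
```

For an external object the first variant reports the object as valuable for eviction, while the second reports that it is not. For non-external objects both behave the same.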

/* Will do for now. Our pinned objects are still on TTM's LRU lists */         return i915_gem_object_evictable(obj); } @@ -328,9 +445,11 @@ static void i915_ttm_adjust_gem_after_move(struct drm_i915_gem_object *obj)         i915_gem_object_set_cache_coherency(obj, cache_level); } -static void i915_ttm_purge(struct drm_i915_gem_object *obj) +static int __i915_ttm_purge(struct drm_i915_gem_object *obj) {         struct ttm_buffer_object *bo = i915_gem_to_ttm(obj); +       struct i915_ttm_tt *i915_tt = +               container_of(bo->ttm, typeof(*i915_tt), ttm);         struct ttm_operation_ctx ctx = {                 .interruptible = true,                 .no_wait_gpu = false, @@ -339,17 +458,79 @@ static void i915_ttm_purge(struct drm_i915_gem_object *obj)         int ret;         if (obj->mm.madv == __I915_MADV_PURGED) -               return; +               return 0; -       /* TTM's purge interface. Note that we might be reentering. */         ret = ttm_bo_validate(bo, &place, &ctx); -       if (!ret) { -               obj->write_domain = 0; -               obj->read_domains = 0; -               i915_ttm_adjust_gem_after_move(obj); -               i915_ttm_free_cached_io_st(obj); -               obj->mm.madv = __I915_MADV_PURGED; +       if (ret) +               return ret;

+       if (bo->ttm && i915_tt->filp) { +               /* +                * The below fput(which eventually calls shmem_truncate) might +                * be delayed by worker, so when directly called to purge the +                * pages(like by the shrinker) we should try to be more +                * aggressive and release the pages immediately. +                */ +               shmem_truncate_range(file_inode(i915_tt->filp), +                                    0, (loff_t)-1); +               fput(fetch_and_zero(&i915_tt->filp)); +       }

+       obj->write_domain = 0; +       obj->read_domains = 0; +       i915_ttm_adjust_gem_after_move(obj); +       i915_ttm_free_cached_io_st(obj); +       obj->mm.madv = __I915_MADV_PURGED; +       return 0; +}

+static void i915_ttm_purge(struct drm_i915_gem_object *obj)
+{
+	__i915_ttm_purge(obj);

Do we need a comment here as to why we choose to ignore the return value? I typically use a void cast (void)__i915_ttm_purge(obj); to indicate that ignoring the return value is intentional. Not sure if that's common practice with i915?
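
For readers unfamiliar with the idiom mentioned above, the following is a minimal userspace sketch of the `(void)` cast. All names here (`purge_demo_obj`, `purge_demo`, `purge_demo_noerr`) are hypothetical stand-ins and not i915 functions; the sketch only assumes an int-returning helper wrapped by a void entry point, as in the quoted hunk.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for an object that can be purged. */
struct purge_demo_obj {
	bool purged;
	bool fail_validate;	/* simulate a failing validate step */
};

/* Returns 0 on success or a negative errno, like an int-returning helper. */
static int purge_demo(struct purge_demo_obj *obj)
{
	if (obj->purged)
		return 0;	/* nothing left to do */
	if (obj->fail_validate)
		return -EBUSY;
	obj->purged = true;
	return 0;
}

/*
 * void wrapper for callers that cannot act on the error.
 * The (void) cast documents that the result is dropped on purpose.
 */
static void purge_demo_noerr(struct purge_demo_obj *obj)
{
	(void)purge_demo(obj);
}
```

The cast does not change behaviour; it only tells the reader (and tools that warn about unused results) that the return value was considered and deliberately ignored.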


+}

+static int i915_ttm_shrinker_release_pages(struct drm_i915_gem_object *obj,
+					   bool should_writeback)
+{
+	struct ttm_buffer_object *bo = i915_gem_to_ttm(obj);
+	struct i915_ttm_tt *i915_tt =
+		container_of(bo->ttm, typeof(*i915_tt), ttm);
+	struct ttm_operation_ctx ctx = {
+		.interruptible = true,
+		.no_wait_gpu = false,
+	};
+	struct ttm_placement place = {};
+	int ret;
+
+	if (!bo->ttm || bo->resource->mem_type != TTM_PL_SYSTEM)
+		return 0;
+
+	GEM_BUG_ON(!i915_tt->is_shmem);
+
+	if (!i915_tt->filp)
+		return 0;
+
+	switch (obj->mm.madv) {
+	case I915_MADV_DONTNEED:
+		return __i915_ttm_purge(obj);
+	case __I915_MADV_PURGED:
+		return 0;
+	}
+
+	if (bo->ttm->page_flags & TTM_TT_FLAG_SWAPPED)
+		return 0;
+
+	bo->ttm->page_flags |= TTM_TT_FLAG_SWAPPED;
+	ret = ttm_bo_validate(bo, &place, &ctx);
+	if (ret) {
+		bo->ttm->page_flags &= ~TTM_TT_FLAG_SWAPPED;
+		return ret;
 	}
+
+	if (should_writeback)
+		__shmem_writeback(obj->base.size, i915_tt->filp->f_mapping);

+       return 0; } static void i915_ttm_swap_notify(struct ttm_buffer_object *bo) @@ -618,6 +799,7 @@ static unsigned long i915_ttm_io_mem_pfn(struct ttm_buffer_object *bo, static struct ttm_device_funcs i915_ttm_bo_driver = {         .ttm_tt_create = i915_ttm_tt_create, +       .ttm_tt_populate = i915_ttm_tt_populate,         .ttm_tt_unpopulate = i915_ttm_tt_unpopulate,         .ttm_tt_destroy = i915_ttm_tt_destroy,         .eviction_valuable = i915_ttm_eviction_valuable, @@ -685,12 +867,17 @@ static int __i915_ttm_get_pages(struct drm_i915_gem_object *obj,         }         if (!i915_gem_object_has_pages(obj)) { +               struct i915_ttm_tt *i915_tt = +                       container_of(bo->ttm, typeof(*i915_tt), ttm);

		/* Object either has a page vector or is an iomem object */
		st = bo->ttm ? i915_ttm_tt_get_st(bo->ttm) : obj->ttm.cached_io_st;

if (IS_ERR(st))                         return PTR_ERR(st);                 __i915_gem_object_set_pages(obj, st, i915_sg_dma_sizes(st->sgl)); +               if (!bo->ttm || !i915_tt->is_shmem) +                       i915_gem_object_make_unshrinkable(obj);         }         return ret; @@ -770,6 +957,8 @@ static void i915_ttm_put_pages(struct drm_i915_gem_object *obj, static void i915_ttm_adjust_lru(struct drm_i915_gem_object *obj) {         struct ttm_buffer_object *bo = i915_gem_to_ttm(obj); +       struct i915_ttm_tt *i915_tt = +               container_of(bo->ttm, typeof(*i915_tt), ttm);         /*          * Don't manipulate the TTM LRUs while in TTM bo destruction. @@ -782,7 +971,10 @@ static void i915_ttm_adjust_lru(struct drm_i915_gem_object *obj)          * Put on the correct LRU list depending on the MADV status          */         spin_lock(&bo->bdev->lru_lock); -       if (obj->mm.madv != I915_MADV_WILLNEED) { +       if (bo->ttm && i915_tt->filp) { +               /* Try to keep shmem_tt from being considered for shrinking. */ +               bo->priority = TTM_MAX_BO_PRIORITY - 1; +       } else if (obj->mm.madv != I915_MADV_WILLNEED) {                 bo->priority = I915_TTM_PRIO_PURGE;         } else if (!i915_gem_object_has_pages(obj)) {                 if (bo->priority < I915_TTM_PRIO_HAS_PAGES) @@ -887,9 +1079,12 @@ static const struct drm_i915_gem_object_ops i915_gem_ttm_obj_ops = {         .get_pages = i915_ttm_get_pages,         .put_pages = i915_ttm_put_pages,         .truncate = i915_ttm_purge, +       .shrinker_release_pages = i915_ttm_shrinker_release_pages,

.adjust_lru = i915_ttm_adjust_lru,         .delayed_free = i915_ttm_delayed_free,         .migrate = i915_ttm_migrate,

.mmap_offset = i915_ttm_mmap_offset,         .mmap_ops = &vm_ops_ttm, }; @@ -937,7 +1132,6 @@ int __i915_gem_ttm_object_init(struct intel_memory_region *mem,         drm_gem_private_object_init(&i915->drm, &obj->base, size);         i915_gem_object_init(obj, &i915_gem_ttm_obj_ops, &lock_class, flags);         i915_gem_object_init_memory_region(obj, mem); -       i915_gem_object_make_unshrinkable(obj);         INIT_RADIX_TREE(&obj->ttm.get_io_page.radix, GFP_KERNEL | __GFP_NOWARN);         mutex_init(&obj->ttm.get_io_page.lock);         bo_type = (obj->flags & I915_BO_ALLOC_USER) ? ttm_bo_type_device :

Zeng, Oak

5 Oct

2:05 a.m.

New subject: [Intel-gfx] [PATCH v5 09/13] drm/i915/ttm: add tt shmem backend

Hi Matthew/Thomas,

See one question inline

Regards, Oak

-----Original Message----- From: Intel-gfx intel-gfx-bounces@lists.freedesktop.org On Behalf Of Matthew Auld Sent: September 27, 2021 7:41 AM To: intel-gfx@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org; Thomas Hellström thomas.hellstrom@linux.intel.com; Christian König christian.koenig@amd.com Subject: [Intel-gfx] [PATCH v5 09/13] drm/i915/ttm: add tt shmem backend

Signed-off-by: Matthew Auld matthew.auld@intel.com Cc: Thomas Hellström thomas.hellstrom@linux.intel.com Cc: Christian König christian.koenig@amd.com --- drivers/gpu/drm/i915/gem/i915_gem_object.h | 8 + .../gpu/drm/i915/gem/i915_gem_object_types.h | 2 + drivers/gpu/drm/i915/gem/i915_gem_shmem.c | 14 +- drivers/gpu/drm/i915/gem/i915_gem_shrinker.c | 17 +- drivers/gpu/drm/i915/gem/i915_gem_ttm.c | 240 ++++++++++++++++-- 5 files changed, 245 insertions(+), 36 deletions(-)

-static void try_to_writeback(struct drm_i915_gem_object *obj, - unsigned int flags) +static int try_to_writeback(struct drm_i915_gem_object *obj, unsigned +int flags) { + if (obj->ops->shrinker_release_pages) + return obj->ops->shrinker_release_pages(obj, + flags & I915_SHRINK_WRITEBACK); + switch (obj->mm.madv) { case I915_MADV_DONTNEED: i915_gem_object_truncate(obj); - return; + return 0; case __I915_MADV_PURGED: - return; + return 0; }

if (flags & I915_SHRINK_WRITEBACK) i915_gem_object_writeback(obj); + + return 0; }

/** @@ -222,8 +227,8 @@ i915_gem_shrink(struct i915_gem_ww_ctx *ww, }

static const struct ttm_place sys_placement_flags = { @@ -179,12 +184,90 @@ i915_ttm_placement_from_obj(const struct drm_i915_gem_object *obj, placement->busy_placement = busy; }

+static int i915_ttm_tt_shmem_populate(struct ttm_device *bdev, + struct ttm_tt *ttm, + struct ttm_operation_ctx *ctx) { + struct drm_i915_private *i915 = container_of(bdev, typeof(*i915), bdev); + struct intel_memory_region *mr = i915->mm.regions[INTEL_MEMORY_SYSTEM]; + struct i915_ttm_tt *i915_tt = container_of(ttm, typeof(*i915_tt), ttm); + const unsigned int max_segment = i915_sg_segment_size(); + const size_t size = ttm->num_pages << PAGE_SHIFT; + struct file *filp = i915_tt->filp; + struct sgt_iter sgt_iter; + struct sg_table *st; + struct page *page; + unsigned long i; + int err; + + if (!filp) { + struct address_space *mapping; + gfp_t mask; + + filp = shmem_file_setup("i915-shmem-tt", size, VM_NORESERVE); + if (IS_ERR(filp)) + return PTR_ERR(filp); + + mask = GFP_HIGHUSER | __GFP_RECLAIMABLE; + + mapping = filp->f_mapping; + mapping_set_gfp_mask(mapping, mask); + GEM_BUG_ON(!(mapping_gfp_mask(mapping) & __GFP_RECLAIM)); + + i915_tt->filp = filp; + } + + st = shmem_alloc_st(i915, size, mr, filp->f_mapping, max_segment); + if (IS_ERR(st)) + return PTR_ERR(st); + + err = dma_map_sg_attrs(i915_tt->dev, + st->sgl, st->nents, + PCI_DMA_BIDIRECTIONAL, + DMA_ATTR_SKIP_CPU_SYNC | + DMA_ATTR_NO_KERNEL_MAPPING | + DMA_ATTR_NO_WARN); + if (err <= 0) { + err = -EINVAL; + goto err_free_st; + } + + i = 0; + for_each_sgt_page(page, sgt_iter, st) + ttm->pages[i++] = page; + + if (ttm->page_flags & TTM_TT_FLAG_SWAPPED) + ttm->page_flags &= ~TTM_TT_FLAG_SWAPPED; + + i915_tt->cached_st = st; + return 0; + +err_free_st: + shmem_free_st(st, filp->f_mapping, false, false); + return err; +} + +static void i915_ttm_tt_shmem_unpopulate(struct ttm_tt *ttm) { + struct i915_ttm_tt *i915_tt = container_of(ttm, typeof(*i915_tt), ttm); + bool backup = ttm->page_flags & TTM_TT_FLAG_SWAPPED; + + dma_unmap_sg(i915_tt->dev, i915_tt->cached_st->sgl, + i915_tt->cached_st->nents, + PCI_DMA_BIDIRECTIONAL); + + shmem_free_st(fetch_and_zero(&i915_tt->cached_st), + 
file_inode(i915_tt->filp)->i_mapping, + backup, backup);

Should we do something here to undo the shmem_file_setup() call? Judging from its implementation, it takes a reference on the inode and allocates a file: https://elixir.bootlin.com/linux/latest/source/mm/shmem.c#L4084

Regards, Oak

+} + static struct ttm_tt *i915_ttm_tt_create(struct ttm_buffer_object *bo, uint32_t page_flags) { struct ttm_resource_manager *man = ttm_manager_type(bo->bdev, bo->resource->mem_type); struct drm_i915_gem_object *obj = i915_ttm_to_gem(bo); + enum ttm_caching caching = i915_ttm_select_tt_caching(obj); struct i915_ttm_tt *i915_tt; int ret;

@@ -196,36 +279,62 @@ static struct ttm_tt *i915_ttm_tt_create(struct ttm_buffer_object *bo, man->use_tt) page_flags |= TTM_TT_FLAG_ZERO_ALLOC;

+ ret = ttm_tt_init(&i915_tt->ttm, bo, page_flags, caching); + if (ret) + goto err_free; + i915_tt->dev = obj->base.dev->dev;

return &i915_tt->ttm; + +err_free: + kfree(i915_tt); + return NULL; +} + +static int i915_ttm_tt_populate(struct ttm_device *bdev, + struct ttm_tt *ttm, + struct ttm_operation_ctx *ctx) +{ + struct i915_ttm_tt *i915_tt = container_of(ttm, typeof(*i915_tt), +ttm); + + if (i915_tt->is_shmem) + return i915_ttm_tt_shmem_populate(bdev, ttm, ctx); + + return ttm_pool_alloc(&bdev->pool, ttm, ctx); }

static void i915_ttm_tt_unpopulate(struct ttm_device *bdev, struct ttm_tt *ttm) { struct i915_ttm_tt *i915_tt = container_of(ttm, typeof(*i915_tt), ttm);

static void i915_ttm_tt_destroy(struct ttm_device *bdev, struct ttm_tt *ttm) { struct i915_ttm_tt *i915_tt = container_of(ttm, typeof(*i915_tt), ttm);

if (obj->mm.madv == __I915_MADV_PURGED) - return; + return 0;

- /* TTM's purge interface. Note that we might be reentering. */ ret = ttm_bo_validate(bo, &place, &ctx); - if (!ret) { - obj->write_domain = 0; - obj->read_domains = 0; - i915_ttm_adjust_gem_after_move(obj); - i915_ttm_free_cached_io_st(obj); - obj->mm.madv = __I915_MADV_PURGED; + if (ret) + return ret; + + if (bo->ttm && i915_tt->filp) { + /* + * The below fput(which eventually calls shmem_truncate) might + * be delayed by worker, so when directly called to purge the + * pages(like by the shrinker) we should try to be more + * aggressive and release the pages immediately. + */ + shmem_truncate_range(file_inode(i915_tt->filp), + 0, (loff_t)-1); + fput(fetch_and_zero(&i915_tt->filp)); + } + + obj->write_domain = 0; + obj->read_domains = 0; + i915_ttm_adjust_gem_after_move(obj); + i915_ttm_free_cached_io_st(obj); + obj->mm.madv = __I915_MADV_PURGED; + return 0; +} + +static void i915_ttm_purge(struct drm_i915_gem_object *obj) { + __i915_ttm_purge(obj); +} + +static int i915_ttm_shrinker_release_pages(struct drm_i915_gem_object *obj, + bool should_writeback) +{ + struct ttm_buffer_object *bo = i915_gem_to_ttm(obj); + struct i915_ttm_tt *i915_tt = + container_of(bo->ttm, typeof(*i915_tt), ttm); + struct ttm_operation_ctx ctx = { + .interruptible = true, + .no_wait_gpu = false, + }; + struct ttm_placement place = {}; + int ret; + + if (!bo->ttm || bo->resource->mem_type != TTM_PL_SYSTEM) + return 0; + + GEM_BUG_ON(!i915_tt->is_shmem); + + if (!i915_tt->filp) + return 0; + + switch (obj->mm.madv) { + case I915_MADV_DONTNEED: + return __i915_ttm_purge(obj); + case __I915_MADV_PURGED: + return 0; + } + + if (bo->ttm->page_flags & TTM_TT_FLAG_SWAPPED) + return 0; + + bo->ttm->page_flags |= TTM_TT_FLAG_SWAPPED; + ret = ttm_bo_validate(bo, &place, &ctx); + if (ret) { + bo->ttm->page_flags &= ~TTM_TT_FLAG_SWAPPED; + return ret; } + + if (should_writeback) + __shmem_writeback(obj->base.size, i915_tt->filp->f_mapping); + + return 0; }

static void i915_ttm_swap_notify(struct ttm_buffer_object *bo) @@ -618,6 +799,7 @@ static unsigned long i915_ttm_io_mem_pfn(struct ttm_buffer_object *bo,

__i915_gem_object_set_pages(obj, st, i915_sg_dma_sizes(st->sgl)); + if (!bo->ttm || !i915_tt->is_shmem) + i915_gem_object_make_unshrinkable(obj); }

Thomas Hellström

1:48 p.m.

New subject: [Intel-gfx] [PATCH v5 09/13] drm/i915/ttm: add tt shmem backend

On 10/5/21 04:05, Zeng, Oak wrote:

...

Hi Matthew/Thomas,

See one question inline

Regards, Oak

-----Original Message----- From: Intel-gfx intel-gfx-bounces@lists.freedesktop.org On Behalf Of Matthew Auld Sent: September 27, 2021 7:41 AM To: intel-gfx@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org; Thomas Hellström thomas.hellstrom@linux.intel.com; Christian König christian.koenig@amd.com Subject: [Intel-gfx] [PATCH v5 09/13] drm/i915/ttm: add tt shmem backend

For cached objects we can allocate our pages directly in shmem. This should make it possible (in a later patch) to utilise the existing i915-gem shrinker code for such objects. For now this is still disabled.

v2(Thomas):
  - Add optional try_to_writeback hook for objects. Importantly we need to check if the object is even still shrinkable; in between us dropping the shrinker LRU lock and acquiring the object lock it could for example have been moved. Also we need to differentiate between "lazy" shrinking and the immediate writeback mode. Also later we need to handle objects which don't even have mm.pages, so bundling this into put_pages() would require somehow handling that edge case, hence just letting the ttm backend handle everything in try_to_writeback doesn't seem too bad.

v3(Thomas):
  - Likely a bad idea to touch the object from the unpopulate hook, since it's not possible to hold a reference, without also creating circular dependency, so likely this is too fragile. For now just ensure we at least mark the pages as dirty/accessed when called from the shrinker on WILLNEED objects.
  - s/try_to_writeback/shrinker_release_pages, since this can do more than just writeback.
  - Get rid of do_backup boolean and just set the SWAPPED flag prior to calling unpopulate.
  - Keep shmem_tt as lowest priority for the TTM LRU bo_swapout walk, since these just get skipped anyway. We can try to come up with something better later.
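
The v3 note about setting the SWAPPED flag before calling unpopulate, instead of passing a do_backup boolean, can be sketched in plain userspace C as follows. This is an illustration under stated assumptions, not the driver code: `swap_demo_tt`, `SWAP_DEMO_FLAG_SWAPPED`, `swap_demo_validate` and the `fail` parameter are all hypothetical stand-ins for the ttm_tt, `TTM_TT_FLAG_SWAPPED` and the validate call seen in the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define SWAP_DEMO_FLAG_SWAPPED (1u << 0)	/* stand-in for TTM_TT_FLAG_SWAPPED */

struct swap_demo_tt {
	unsigned int page_flags;
	bool populated;
	bool backed_up;		/* true if unpopulate kept the contents */
};

/* Unpopulate reads the flag to decide whether contents must be kept. */
static void swap_demo_unpopulate(struct swap_demo_tt *tt)
{
	tt->backed_up = (tt->page_flags & SWAP_DEMO_FLAG_SWAPPED) != 0;
	tt->populated = false;
}

/* Stand-in for the validate step that ends up calling unpopulate. */
static int swap_demo_validate(struct swap_demo_tt *tt, bool fail)
{
	if (fail)
		return -EBUSY;
	swap_demo_unpopulate(tt);
	return 0;
}

/* Set the flag first, then roll it back if the validate step fails. */
static int swap_demo_release_pages(struct swap_demo_tt *tt, bool fail)
{
	int ret;

	if (tt->page_flags & SWAP_DEMO_FLAG_SWAPPED)
		return 0;	/* already swapped out */

	tt->page_flags |= SWAP_DEMO_FLAG_SWAPPED;
	ret = swap_demo_validate(tt, fail);
	if (ret) {
		tt->page_flags &= ~SWAP_DEMO_FLAG_SWAPPED;
		return ret;
	}
	return 0;
}
```

Carrying the intent in the flag means the unpopulate hook needs no extra argument, at the cost of having to undo the flag on the error path.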

Signed-off-by: Matthew Auld matthew.auld@intel.com Cc: Thomas Hellström thomas.hellstrom@linux.intel.com Cc: Christian König christian.koenig@amd.com

 drivers/gpu/drm/i915/gem/i915_gem_object.h    |   8 +
 .../gpu/drm/i915/gem/i915_gem_object_types.h  |   2 +
 drivers/gpu/drm/i915/gem/i915_gem_shmem.c     |  14 +-
 drivers/gpu/drm/i915/gem/i915_gem_shrinker.c  |  17 +-
 drivers/gpu/drm/i915/gem/i915_gem_ttm.c       | 240 ++++++++++++++++--
 5 files changed, 245 insertions(+), 36 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h index 3043fcbd31bd..1c9a1d8d3434 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object.h +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h @@ -601,6 +601,14 @@ int i915_gem_object_wait_migration(struct drm_i915_gem_object *obj, bool i915_gem_object_placement_possible(struct drm_i915_gem_object *obj, enum intel_memory_type type);

+struct sg_table *shmem_alloc_st(struct drm_i915_private *i915,
+				size_t size, struct intel_memory_region *mr,
+				struct address_space *mapping,
+				unsigned int max_segment);
+void shmem_free_st(struct sg_table *st, struct address_space *mapping,
+		   bool dirty, bool backup);
+void __shmem_writeback(size_t size, struct address_space *mapping);

#ifdef CONFIG_MMU_NOTIFIER static inline bool i915_gem_object_is_userptr(struct drm_i915_gem_object *obj) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h

index fa2ba9e2a4d0..f0fb17be2f7a 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h +++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h @@ -56,6 +56,8 @@ struct drm_i915_gem_object_ops { struct sg_table *pages); void (*truncate)(struct drm_i915_gem_object *obj); void (*writeback)(struct drm_i915_gem_object *obj);
+	int (*shrinker_release_pages)(struct drm_i915_gem_object *obj,
+				      bool should_writeback);
int (*pread)(struct drm_i915_gem_object *obj, const struct drm_i915_gem_pread *arg); diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
index 36b711ae9e28..19e55cc29a15 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c @@ -25,8 +25,8 @@ static void check_release_pagevec(struct pagevec *pvec) cond_resched(); }

-static void shmem_free_st(struct sg_table *st, struct address_space *mapping,
-			  bool dirty, bool backup)
+void shmem_free_st(struct sg_table *st, struct address_space *mapping,
+		   bool dirty, bool backup)
{ struct sgt_iter sgt_iter; struct pagevec pvec;
@@ -52,10 +52,10 @@ static void shmem_free_st(struct sg_table *st, struct address_space *mapping, kfree(st); }

-static struct sg_table *shmem_alloc_st(struct drm_i915_private *i915,
-				       size_t size, struct intel_memory_region *mr,
-				       struct address_space *mapping,
-				       unsigned int max_segment)
+struct sg_table *shmem_alloc_st(struct drm_i915_private *i915,
+				size_t size, struct intel_memory_region *mr,
+				struct address_space *mapping,
+				unsigned int max_segment)
{ const unsigned long page_count = size / PAGE_SIZE; unsigned long i;
@@ -300,7 +300,7 @@ shmem_truncate(struct drm_i915_gem_object *obj) obj->mm.pages = ERR_PTR(-EFAULT); }

-static void __shmem_writeback(size_t size, struct address_space *mapping) +void __shmem_writeback(size_t size, struct address_space *mapping) { struct writeback_control wbc = { .sync_mode = WB_SYNC_NONE, diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c index e382b7f2353b..cc80bd23d323 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c @@ -56,19 +56,24 @@ static bool unsafe_drop_pages(struct drm_i915_gem_object *obj, return false; }

-static void try_to_writeback(struct drm_i915_gem_object *obj,
-			     unsigned int flags)
+static int try_to_writeback(struct drm_i915_gem_object *obj, unsigned int flags)
 {
+	if (obj->ops->shrinker_release_pages)
+		return obj->ops->shrinker_release_pages(obj,
+							flags & I915_SHRINK_WRITEBACK);
+
 	switch (obj->mm.madv) {
 	case I915_MADV_DONTNEED:
 		i915_gem_object_truncate(obj);
-		return;
+		return 0;
 	case __I915_MADV_PURGED:
-		return;
+		return 0;
 	}

 	if (flags & I915_SHRINK_WRITEBACK)
 		i915_gem_object_writeback(obj);
+
+	return 0;
 }

/**
@@ -222,8 +227,8 @@ i915_gem_shrink(struct i915_gem_ww_ctx *ww, }
 		if (!__i915_gem_object_put_pages(obj)) {
-			try_to_writeback(obj, shrink);
-			count += obj->base.size >> PAGE_SHIFT;
+			if (!try_to_writeback(obj, shrink))
+				count += obj->base.size >> PAGE_SHIFT;
 		}
	if (!ww)
		i915_gem_object_unlock(obj);
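
The effect of the hunk above, counting an object's pages only when the release hook reports success, can be sketched in userspace C. Everything here is an assumption made for illustration: `shrink_demo_obj`, `shrink_demo_release` and `shrink_demo_count` are hypothetical, and `SHRINK_DEMO_PAGE_SHIFT` assumes 4 KiB pages.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define SHRINK_DEMO_PAGE_SHIFT 12	/* assumed 4 KiB pages for the sketch */

/* Hypothetical object: a size in bytes and a canned release result. */
struct shrink_demo_obj {
	size_t size;
	int release_ret;	/* 0 on success, negative errno on failure */
};

/* Stand-in for a release hook that now returns int instead of void. */
static int shrink_demo_release(const struct shrink_demo_obj *obj)
{
	return obj->release_ret;
}

/* Count reclaimed pages, skipping objects whose release failed. */
static unsigned long shrink_demo_count(const struct shrink_demo_obj *objs,
				       size_t n)
{
	unsigned long count = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (!shrink_demo_release(&objs[i]))
			count += objs[i].size >> SHRINK_DEMO_PAGE_SHIFT;
	}
	return count;
}
```

With the old void signature every object would have been counted, so a failed release would have inflated the number of pages reported as reclaimed.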
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c index a77e90f300fe..c7402995a8f9 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c @@ -35,6 +35,8 @@

@ttm: The base TTM page vector.

@dev: The struct device used for dma mapping and unmapping.

@cached_st: The cached scatter-gather table.

@is_shmem: Set if using shmem.

@filp: The shmem file, if using shmem backend.

Note that DMA may be going on right up to the point where the page-

vector is unpopulated in delayed destroy. Hence keep the @@ -46,6 +48,9 @@ struct i915_ttm_tt {

struct ttm_tt ttm; struct device *dev; struct sg_table *cached_st;

bool is_shmem;

struct file *filp; };

static const struct ttm_place sys_placement_flags = { @@ -179,12 +184,90 @@ i915_ttm_placement_from_obj(const struct drm_i915_gem_object *obj, placement->busy_placement = busy; }

+static int i915_ttm_tt_shmem_populate(struct ttm_device *bdev,
+				      struct ttm_tt *ttm,
+				      struct ttm_operation_ctx *ctx)
+{
+	struct drm_i915_private *i915 = container_of(bdev, typeof(*i915), bdev);
+	struct intel_memory_region *mr = i915->mm.regions[INTEL_MEMORY_SYSTEM];
+	struct i915_ttm_tt *i915_tt = container_of(ttm, typeof(*i915_tt), ttm);
+	const unsigned int max_segment = i915_sg_segment_size();
+	const size_t size = ttm->num_pages << PAGE_SHIFT;
+	struct file *filp = i915_tt->filp;
+	struct sgt_iter sgt_iter;
+	struct sg_table *st;
+	struct page *page;
+	unsigned long i;
+	int err;
+
+	if (!filp) {
+		struct address_space *mapping;
+		gfp_t mask;
+
+		filp = shmem_file_setup("i915-shmem-tt", size, VM_NORESERVE);
+		if (IS_ERR(filp))
+			return PTR_ERR(filp);
+
+		mask = GFP_HIGHUSER | __GFP_RECLAIMABLE;
+
+		mapping = filp->f_mapping;
+		mapping_set_gfp_mask(mapping, mask);
+		GEM_BUG_ON(!(mapping_gfp_mask(mapping) & __GFP_RECLAIM));
+
+		i915_tt->filp = filp;
+	}
+
+	st = shmem_alloc_st(i915, size, mr, filp->f_mapping, max_segment);
+	if (IS_ERR(st))
+		return PTR_ERR(st);
+
+	err = dma_map_sg_attrs(i915_tt->dev,
+			       st->sgl, st->nents,
+			       PCI_DMA_BIDIRECTIONAL,
+			       DMA_ATTR_SKIP_CPU_SYNC |
+			       DMA_ATTR_NO_KERNEL_MAPPING |
+			       DMA_ATTR_NO_WARN);
+	if (err <= 0) {
+		err = -EINVAL;
+		goto err_free_st;
+	}
+
+	i = 0;
+	for_each_sgt_page(page, sgt_iter, st)
+		ttm->pages[i++] = page;
+
+	if (ttm->page_flags & TTM_TT_FLAG_SWAPPED)
+		ttm->page_flags &= ~TTM_TT_FLAG_SWAPPED;
+
+	i915_tt->cached_st = st;
+	return 0;
+
+err_free_st:
+	shmem_free_st(st, filp->f_mapping, false, false);
+	return err;
+}

+static void i915_ttm_tt_shmem_unpopulate(struct ttm_tt *ttm)
+{
+	struct i915_ttm_tt *i915_tt = container_of(ttm, typeof(*i915_tt), ttm);
+	bool backup = ttm->page_flags & TTM_TT_FLAG_SWAPPED;
+
+	dma_unmap_sg(i915_tt->dev, i915_tt->cached_st->sgl,
+		     i915_tt->cached_st->nents,
+		     PCI_DMA_BIDIRECTIONAL);
+
+	shmem_free_st(fetch_and_zero(&i915_tt->cached_st),
+		      file_inode(i915_tt->filp)->i_mapping,
+		      backup, backup);

Should we do something here to undo the shmem_file_setup() call? Judging from its implementation, it takes a reference on the inode and allocates a file: https://elixir.bootlin.com/linux/latest/source/mm/shmem.c#L4084

Regards, Oak

Hi, Oak,

That's done in i915_ttm_tt_destroy() afaict.

/Thomas
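
The lifetime being discussed (file created lazily on first populate, kept across unpopulate, released in destroy) can be modelled with a small userspace sketch. This is only an illustration under stated assumptions: `file_demo`, `file_demo_tt`, `file_demo_setup`, `file_demo_put` and the `file_demo_live` counter are hypothetical and merely mimic the roles of `struct file`, `shmem_file_setup()` and `fput()` in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical refcounted file, standing in for struct file. */
struct file_demo {
	int refcount;
};

static int file_demo_live;	/* number of files not yet released */

static struct file_demo *file_demo_setup(void)
{
	struct file_demo *f = malloc(sizeof(*f));

	if (!f)
		return NULL;
	f->refcount = 1;
	file_demo_live++;
	return f;
}

static void file_demo_put(struct file_demo *f)
{
	if (--f->refcount == 0) {
		file_demo_live--;
		free(f);
	}
}

struct file_demo_tt {
	struct file_demo *filp;
	bool populated;
};

/* The file is created on first populate and reused afterwards. */
static int file_demo_populate(struct file_demo_tt *tt)
{
	if (!tt->filp) {
		tt->filp = file_demo_setup();
		if (!tt->filp)
			return -1;
	}
	tt->populated = true;
	return 0;
}

/* Unpopulate drops the pages but deliberately keeps the file. */
static void file_demo_unpopulate(struct file_demo_tt *tt)
{
	tt->populated = false;
}

/* Destroy is where the reference taken at setup is dropped. */
static void file_demo_destroy(struct file_demo_tt *tt)
{
	if (tt->filp)
		file_demo_put(tt->filp);
	tt->filp = NULL;
}
```

Keeping the file across unpopulate is what lets a later populate find the swapped-out contents again; dropping it in unpopulate would discard them.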

...

+}

static struct ttm_tt *i915_ttm_tt_create(struct ttm_buffer_object *bo, uint32_t page_flags) { struct ttm_resource_manager *man = ttm_manager_type(bo->bdev, bo->resource->mem_type); struct drm_i915_gem_object *obj = i915_ttm_to_gem(bo);

enum ttm_caching caching = i915_ttm_select_tt_caching(obj); struct i915_ttm_tt *i915_tt; int ret;

@@ -196,36 +279,62 @@ static struct ttm_tt *i915_ttm_tt_create(struct ttm_buffer_object *bo, man->use_tt) page_flags |= TTM_TT_FLAG_ZERO_ALLOC;
-	ret = ttm_tt_init(&i915_tt->ttm, bo, page_flags,
-			  i915_ttm_select_tt_caching(obj));
-	if (ret) {
-		kfree(i915_tt);
-		return NULL;
+	if (i915_gem_object_is_shrinkable(obj) && caching == ttm_cached) {
+		page_flags |= TTM_TT_FLAG_EXTERNAL |
+			      TTM_TT_FLAG_EXTERNAL_MAPPABLE;
+		i915_tt->is_shmem = true;
 	}

+	ret = ttm_tt_init(&i915_tt->ttm, bo, page_flags, caching);
+	if (ret)
+		goto err_free;
+
 	i915_tt->dev = obj->base.dev->dev;

 	return &i915_tt->ttm;
+
+err_free:
+	kfree(i915_tt);
+	return NULL;
+}

+static int i915_ttm_tt_populate(struct ttm_device *bdev,
		struct ttm_tt *ttm,
		struct ttm_operation_ctx *ctx)
+{

struct i915_ttm_tt *i915_tt = container_of(ttm, typeof(*i915_tt),

+ttm);
if (i915_tt->is_shmem)
return i915_ttm_tt_shmem_populate(bdev, ttm, ctx);
return ttm_pool_alloc(&bdev->pool, ttm, ctx); }

static void i915_ttm_tt_unpopulate(struct ttm_device *bdev, struct ttm_tt *ttm) { struct i915_ttm_tt *i915_tt = container_of(ttm, typeof(*i915_tt), ttm);
-	if (i915_tt->cached_st) {
-		dma_unmap_sgtable(i915_tt->dev, i915_tt->cached_st,
-				  DMA_BIDIRECTIONAL, 0);
-		sg_free_table(i915_tt->cached_st);
-		kfree(i915_tt->cached_st);
-		i915_tt->cached_st = NULL;
+	if (i915_tt->is_shmem) {
+		i915_ttm_tt_shmem_unpopulate(ttm);
+	} else {
+		if (i915_tt->cached_st) {
+			dma_unmap_sgtable(i915_tt->dev, i915_tt->cached_st,
+					  DMA_BIDIRECTIONAL, 0);
+			sg_free_table(i915_tt->cached_st);
+			kfree(i915_tt->cached_st);
+			i915_tt->cached_st = NULL;
+		}
+		ttm_pool_free(&bdev->pool, ttm);
 	}
-	ttm_pool_free(&bdev->pool, ttm);
 }

static void i915_ttm_tt_destroy(struct ttm_device *bdev, struct ttm_tt *ttm) { struct i915_ttm_tt *i915_tt = container_of(ttm, typeof(*i915_tt), ttm);
if (i915_tt->filp)
fput(i915_tt->filp);
ttm_tt_fini(ttm); kfree(i915_tt); }
@@ -235,6 +344,14 @@ static bool i915_ttm_eviction_valuable(struct ttm_buffer_object *bo, { struct drm_i915_gem_object *obj = i915_ttm_to_gem(bo);
/*
* EXTERNAL objects should never be swapped out by TTM, instead we need
* to handle that ourselves. TTM will already skip such objects for us,
* but we would like to avoid grabbing locks for no good reason.
*/
if (bo->ttm && bo->ttm->page_flags & TTM_TT_FLAG_EXTERNAL)
return -EBUSY;
/* Will do for now. Our pinned objects are still on TTM's LRU lists */ return i915_gem_object_evictable(obj); } @@ -328,9 +445,11 @@ static void i915_ttm_adjust_gem_after_move(struct drm_i915_gem_object *obj) i915_gem_object_set_cache_coherency(obj, cache_level); }
-static void i915_ttm_purge(struct drm_i915_gem_object *obj) +static int __i915_ttm_purge(struct drm_i915_gem_object *obj) { struct ttm_buffer_object *bo = i915_gem_to_ttm(obj);
struct i915_ttm_tt *i915_tt =
container_of(bo->ttm, typeof(*i915_tt), ttm);
struct ttm_operation_ctx ctx = { .interruptible = true, .no_wait_gpu = false,
@@ -339,17 +458,79 @@ static void i915_ttm_purge(struct drm_i915_gem_object *obj) int ret;

 	if (obj->mm.madv == __I915_MADV_PURGED)
-		return;
+		return 0;

-	/* TTM's purge interface. Note that we might be reentering. */
 	ret = ttm_bo_validate(bo, &place, &ctx);
-	if (!ret) {
-		obj->write_domain = 0;
-		obj->read_domains = 0;
-		i915_ttm_adjust_gem_after_move(obj);
-		i915_ttm_free_cached_io_st(obj);
-		obj->mm.madv = __I915_MADV_PURGED;
+	if (ret)
+		return ret;
+
+	if (bo->ttm && i915_tt->filp) {
+		/*
+		 * The below fput(which eventually calls shmem_truncate) might
+		 * be delayed by worker, so when directly called to purge the
+		 * pages(like by the shrinker) we should try to be more
+		 * aggressive and release the pages immediately.
+		 */
+		shmem_truncate_range(file_inode(i915_tt->filp),
+				     0, (loff_t)-1);
+		fput(fetch_and_zero(&i915_tt->filp));
+	}
+
+	obj->write_domain = 0;
+	obj->read_domains = 0;
+	i915_ttm_adjust_gem_after_move(obj);
+	i915_ttm_free_cached_io_st(obj);
+	obj->mm.madv = __I915_MADV_PURGED;
+	return 0;
+}
+
+static void i915_ttm_purge(struct drm_i915_gem_object *obj)
+{
+	__i915_ttm_purge(obj);
+}

+static int i915_ttm_shrinker_release_pages(struct drm_i915_gem_object *obj,
+					   bool should_writeback)
+{
+	struct ttm_buffer_object *bo = i915_gem_to_ttm(obj);
+	struct i915_ttm_tt *i915_tt =
+		container_of(bo->ttm, typeof(*i915_tt), ttm);
+	struct ttm_operation_ctx ctx = {
+		.interruptible = true,
+		.no_wait_gpu = false,
+	};
+	struct ttm_placement place = {};
+	int ret;
+
+	if (!bo->ttm || bo->resource->mem_type != TTM_PL_SYSTEM)
+		return 0;
+
+	GEM_BUG_ON(!i915_tt->is_shmem);
+
+	if (!i915_tt->filp)
+		return 0;
+
+	switch (obj->mm.madv) {
+	case I915_MADV_DONTNEED:
+		return __i915_ttm_purge(obj);
+	case __I915_MADV_PURGED:
+		return 0;
+	}
+
+	if (bo->ttm->page_flags & TTM_TT_FLAG_SWAPPED)
+		return 0;
+
+	bo->ttm->page_flags |= TTM_TT_FLAG_SWAPPED;
+	ret = ttm_bo_validate(bo, &place, &ctx);
+	if (ret) {
+		bo->ttm->page_flags &= ~TTM_TT_FLAG_SWAPPED;
+		return ret;
 	}
+
+	if (should_writeback)
+		__shmem_writeback(obj->base.size, i915_tt->filp->f_mapping);
+
+	return 0;
 }

static void i915_ttm_swap_notify(struct ttm_buffer_object *bo) @@ -618,6 +799,7 @@ static unsigned long i915_ttm_io_mem_pfn(struct ttm_buffer_object *bo,

static struct ttm_device_funcs i915_ttm_bo_driver = { .ttm_tt_create = i915_ttm_tt_create,

.ttm_tt_populate = i915_ttm_tt_populate, .ttm_tt_unpopulate = i915_ttm_tt_unpopulate, .ttm_tt_destroy = i915_ttm_tt_destroy, .eviction_valuable = i915_ttm_eviction_valuable, @@ -685,12 +867,17 @@ static int __i915_ttm_get_pages(struct drm_i915_gem_object *obj, }

if (!i915_gem_object_has_pages(obj)) {
struct i915_ttm_tt *i915_tt =
	container_of(bo->ttm, typeof(*i915_tt), ttm);
/* Object either has a page vector or is an iomem object */ st = bo->ttm ? i915_ttm_tt_get_st(bo->ttm) : obj->ttm.cached_io_st; if (IS_ERR(st)) return PTR_ERR(st);

__i915_gem_object_set_pages(obj, st, i915_sg_dma_sizes(st->sgl));
if (!bo->ttm || !i915_tt->is_shmem)
	i915_gem_object_make_unshrinkable(obj);
}

return ret;
@@ -770,6 +957,8 @@ static void i915_ttm_put_pages(struct drm_i915_gem_object *obj, static void i915_ttm_adjust_lru(struct drm_i915_gem_object *obj) { struct ttm_buffer_object *bo = i915_gem_to_ttm(obj);
struct i915_ttm_tt *i915_tt =
container_of(bo->ttm, typeof(*i915_tt), ttm);
/*

Don't manipulate the TTM LRUs while in TTM bo destruction.
@@ -782,7 +971,10 @@ static void i915_ttm_adjust_lru(struct drm_i915_gem_object *obj) * Put on the correct LRU list depending on the MADV status */ spin_lock(&bo->bdev->lru_lock);

if (obj->mm.madv != I915_MADV_WILLNEED) {
if (bo->ttm && i915_tt->filp) {
/* Try to keep shmem_tt from being considered for shrinking. */
bo->priority = TTM_MAX_BO_PRIORITY - 1;
} else if (obj->mm.madv != I915_MADV_WILLNEED) { bo->priority = I915_TTM_PRIO_PURGE; } else if (!i915_gem_object_has_pages(obj)) { if (bo->priority < I915_TTM_PRIO_HAS_PAGES) @@ -887,9 +1079,12 @@ static const struct drm_i915_gem_object_ops i915_gem_ttm_obj_ops = { .get_pages = i915_ttm_get_pages, .put_pages = i915_ttm_put_pages, .truncate = i915_ttm_purge,

.shrinker_release_pages = i915_ttm_shrinker_release_pages,

.adjust_lru = i915_ttm_adjust_lru, .delayed_free = i915_ttm_delayed_free, .migrate = i915_ttm_migrate,

.mmap_offset = i915_ttm_mmap_offset, .mmap_ops = &vm_ops_ttm, };
@@ -937,7 +1132,6 @@ int __i915_gem_ttm_object_init(struct intel_memory_region *mem, drm_gem_private_object_init(&i915->drm, &obj->base, size); i915_gem_object_init(obj, &i915_gem_ttm_obj_ops, &lock_class, flags); i915_gem_object_init_memory_region(obj, mem);

i915_gem_object_make_unshrinkable(obj); INIT_RADIX_TREE(&obj->ttm.get_io_page.radix, GFP_KERNEL | __GFP_NOWARN); mutex_init(&obj->ttm.get_io_page.lock); bo_type = (obj->flags & I915_BO_ALLOC_USER) ? ttm_bo_type_device :

-- 2.26.3

Zeng, Oak

2:23 p.m.

New subject: [Intel-gfx] [PATCH v5 09/13] drm/i915/ttm: add tt shmem backend

Regards, Oak

...

-----Original Message----- From: Thomas Hellström thomas.hellstrom@linux.intel.com Sent: October 5, 2021 9:48 AM To: Zeng, Oak oak.zeng@intel.com; Auld, Matthew matthew.auld@intel.com; intel-gfx@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org; Christian König christian.koenig@amd.com Subject: Re: [Intel-gfx] [PATCH v5 09/13] drm/i915/ttm: add tt shmem backend

On 10/5/21 04:05, Zeng, Oak wrote:

...
Hi Matthew/Thomas,

See one question inline

Regards, Oak

-----Original Message----- From: Intel-gfx intel-gfx-bounces@lists.freedesktop.org On Behalf Of Matthew Auld Sent: September 27, 2021 7:41 AM To: intel-gfx@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org; Thomas Hellström thomas.hellstrom@linux.intel.com; Christian König christian.koenig@amd.com Subject: [Intel-gfx] [PATCH v5 09/13] drm/i915/ttm: add tt shmem backend

For cached objects we can allocate our pages directly in shmem. This
should make it possible(in a later patch) to utilise the existing
i915-gem shrinker code for such objects. For now this is still disabled.

v2(Thomas):
  - Add optional try_to_writeback hook for objects. Importantly we need
    to check if the object is even still shrinkable; in between us
    dropping the shrinker LRU lock and acquiring the object lock it could
    for example have been moved. Also we need to differentiate between
    "lazy" shrinking and the immediate writeback mode. Also later we need
    to handle objects which don't even have mm.pages, so bundling this
    into put_pages() would require somehow handling that edge case, hence
    just letting the ttm backend handle everything in try_to_writeback
    doesn't seem too bad.
v3(Thomas):
  - Likely a bad idea to touch the object from the unpopulate hook,
    since it's not possible to hold a reference, without also creating
    circular dependency, so likely this is too fragile. For now just
    ensure we at least mark the pages as dirty/accessed when called from
    the shrinker on WILLNEED objects.
  - s/try_to_writeback/shrinker_release_pages, since this can do more
    than just writeback.
  - Get rid of do_backup boolean and just set the SWAPPED flag prior to
    calling unpopulate.
  - Keep shmem_tt as lowest priority for the TTM LRU bo_swapout walk,
    since these just get skipped anyway. We can try to come up with
    something better later.

Signed-off-by: Matthew Auld matthew.auld@intel.com
Cc: Thomas Hellström thomas.hellstrom@linux.intel.com
Cc: Christian König christian.koenig@amd.com
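
As a reading aid for the v3 notes above, here is a minimal userspace sketch of two of the ideas: the release hook sets a SWAPPED-style flag before the pages are dropped (so no separate do_backup boolean is needed) and clears it again if the move fails, and the shrinker only counts an object when the hook returns 0. Every name here (model_tt, MODEL_FLAG_SWAPPED, model_release_pages, model_shrink, the fail_validate knob) is hypothetical and only stands in for the kernel structures; this is not the driver code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for TTM_TT_FLAG_SWAPPED. */
#define MODEL_FLAG_SWAPPED (1u << 0)

/* Hypothetical, simplified stand-in for a shmem-backed ttm_tt. */
struct model_tt {
	unsigned int page_flags;
	bool populated;
	bool backed_up;		/* true if unpopulate kept the contents */
	bool fail_validate;	/* test knob: make the move fail */
};

/* Unpopulate reads the flag instead of taking a do_backup argument. */
static void model_unpopulate(struct model_tt *tt)
{
	assert(tt->populated);
	tt->backed_up = tt->page_flags & MODEL_FLAG_SWAPPED;
	tt->populated = false;
}

/* Stand-in for validating to an empty placement, which unpopulates. */
static int model_validate_empty(struct model_tt *tt)
{
	if (tt->fail_validate)
		return -1;
	model_unpopulate(tt);
	return 0;
}

/* Models the hook: set the flag first, roll it back if the move fails. */
static int model_release_pages(struct model_tt *tt)
{
	int ret;

	if (tt->page_flags & MODEL_FLAG_SWAPPED)
		return 0;

	tt->page_flags |= MODEL_FLAG_SWAPPED;
	ret = model_validate_empty(tt);
	if (ret)
		tt->page_flags &= ~MODEL_FLAG_SWAPPED;
	return ret;
}

/* Shrinker-side accounting: only count objects whose hook succeeded. */
static size_t model_shrink(struct model_tt *tts, size_t n, size_t pages_each)
{
	size_t count = 0;
	size_t i;

	for (i = 0; i < n; i++)
		if (!model_release_pages(&tts[i]))
			count += pages_each;
	return count;
}
```

In this model an object whose move fails keeps its pages and its flag is cleared again, so it does not inflate the count that the shrinker reports.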

 drivers/gpu/drm/i915/gem/i915_gem_object.h    |   8 +
 .../gpu/drm/i915/gem/i915_gem_object_types.h  |   2 +
 drivers/gpu/drm/i915/gem/i915_gem_shmem.c     |  14 +-
 drivers/gpu/drm/i915/gem/i915_gem_shrinker.c  |  17 +-
 drivers/gpu/drm/i915/gem/i915_gem_ttm.c       | 240 ++++++++++++++++-
 5 files changed, 245 insertions(+), 36 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h

b/drivers/gpu/drm/i915/gem/i915_gem_object.h

...
index 3043fcbd31bd..1c9a1d8d3434 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object.h +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h @@ -601,6 +601,14 @@ int i915_gem_object_wait_migration(struct

drm_i915_gem_object *obj, bool i915_gem_object_placement_possible(struct drm_i915_gem_object *obj,

...
			enum intel_memory_type type);
+struct sg_table *shmem_alloc_st(struct drm_i915_private *i915,
		size_t size, struct intel_memory_region *mr,
		struct address_space *mapping,
		unsigned int max_segment);
+void shmem_free_st(struct sg_table *st, struct address_space *mapping,
   bool dirty, bool backup);
+void __shmem_writeback(size_t size, struct address_space *mapping);

#ifdef CONFIG_MMU_NOTIFIER static inline bool i915_gem_object_is_userptr(struct drm_i915_gem_object *obj) diff --git
a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h

...
index fa2ba9e2a4d0..f0fb17be2f7a 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h +++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h @@ -56,6 +56,8 @@ struct drm_i915_gem_object_ops { struct sg_table *pages); void (*truncate)(struct drm_i915_gem_object *obj); void (*writeback)(struct drm_i915_gem_object *obj);
int (*shrinker_release_pages)(struct drm_i915_gem_object *obj,
		      bool should_writeback);
int (*pread)(struct drm_i915_gem_object *obj, const struct drm_i915_gem_pread *arg); diff --git
a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c

...
index 36b711ae9e28..19e55cc29a15 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c @@ -25,8 +25,8 @@ static void check_release_pagevec(struct pagevec

*pvec)

...
cond_resched(); }

-static void shmem_free_st(struct sg_table *st, struct address_space

*mapping,

...
	  bool dirty, bool backup)
+void shmem_free_st(struct sg_table *st, struct address_space *mapping,
   bool dirty, bool backup)
{ struct sgt_iter sgt_iter; struct pagevec pvec;
@@ -52,10 +52,10 @@ static void shmem_free_st(struct sg_table *st, struct
address_space *mapping,

...
kfree(st); }

-static struct sg_table *shmem_alloc_st(struct drm_i915_private *i915,
		       size_t size, struct intel_memory_region
*mr,

...
		       struct address_space *mapping,
		       unsigned int max_segment)
+struct sg_table *shmem_alloc_st(struct drm_i915_private *i915,
		size_t size, struct intel_memory_region *mr,
		struct address_space *mapping,
		unsigned int max_segment)
{ const unsigned long page_count = size / PAGE_SIZE; unsigned long i;
@@ -300,7 +300,7 @@ shmem_truncate(struct drm_i915_gem_object *obj) obj->mm.pages = ERR_PTR(-EFAULT); }

-static void __shmem_writeback(size_t size, struct address_space
*mapping)

...
+void __shmem_writeback(size_t size, struct address_space *mapping) { struct writeback_control wbc = { .sync_mode = WB_SYNC_NONE, diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c

b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c

...
index e382b7f2353b..cc80bd23d323 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c @@ -56,19 +56,24 @@ static bool unsafe_drop_pages(struct

drm_i915_gem_object *obj,

...
return false; }

-static void try_to_writeback(struct drm_i915_gem_object *obj,
-			     unsigned int flags)
+static int try_to_writeback(struct drm_i915_gem_object *obj, unsigned int flags)
 {
+	if (obj->ops->shrinker_release_pages)
+		return obj->ops->shrinker_release_pages(obj,
+							flags & I915_SHRINK_WRITEBACK);
+
 	switch (obj->mm.madv) {
 	case I915_MADV_DONTNEED:
 		i915_gem_object_truncate(obj);
-		return;
+		return 0;
 	case __I915_MADV_PURGED:
-		return;
+		return 0;
 	}

 	if (flags & I915_SHRINK_WRITEBACK)
 		i915_gem_object_writeback(obj);
+	return 0;
 }

 /**
@@ -222,8 +227,8 @@ i915_gem_shrink(struct i915_gem_ww_ctx *ww,
 		}
 		if (!__i915_gem_object_put_pages(obj)) {
-			try_to_writeback(obj, shrink);
-			count += obj->base.size >> PAGE_SHIFT;
+			if (!try_to_writeback(obj, shrink))
+				count += obj->base.size >> PAGE_SHIFT;
 		}
 		if (!ww)
 			i915_gem_object_unlock(obj);
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c

...
index a77e90f300fe..c7402995a8f9 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c @@ -35,6 +35,8 @@

@ttm: The base TTM page vector.

@dev: The struct device used for dma mapping and unmapping.

@cached_st: The cached scatter-gather table.

@is_shmem: Set if using shmem.

@filp: The shmem file, if using shmem backend.

Note that DMA may be going on right up to the point where the page-

vector is unpopulated in delayed destroy. Hence keep the @@ -46,6

+48,9 @@ struct i915_ttm_tt {

...
struct ttm_tt ttm; struct device *dev; struct sg_table *cached_st;

bool is_shmem;

struct file *filp; };

static const struct ttm_place sys_placement_flags = { @@ -179,12 +184,90

@@ i915_ttm_placement_from_obj(const struct drm_i915_gem_object *obj,

...
placement->busy_placement = busy; }

+static int i915_ttm_tt_shmem_populate(struct ttm_device *bdev,
+				      struct ttm_tt *ttm,
+				      struct ttm_operation_ctx *ctx)
+{
+	struct drm_i915_private *i915 = container_of(bdev, typeof(*i915), bdev);
+	struct intel_memory_region *mr = i915->mm.regions[INTEL_MEMORY_SYSTEM];
+	struct i915_ttm_tt *i915_tt = container_of(ttm, typeof(*i915_tt), ttm);
+	const unsigned int max_segment = i915_sg_segment_size();
+	const size_t size = ttm->num_pages << PAGE_SHIFT;
+	struct file *filp = i915_tt->filp;
+	struct sgt_iter sgt_iter;
+	struct sg_table *st;
+	struct page *page;
+	unsigned long i;
+	int err;
+
+	if (!filp) {
+		struct address_space *mapping;
+		gfp_t mask;
+
+		filp = shmem_file_setup("i915-shmem-tt", size, VM_NORESERVE);
+		if (IS_ERR(filp))
+			return PTR_ERR(filp);
+
+		mask = GFP_HIGHUSER | __GFP_RECLAIMABLE;
+
+		mapping = filp->f_mapping;
+		mapping_set_gfp_mask(mapping, mask);
+		GEM_BUG_ON(!(mapping_gfp_mask(mapping) & __GFP_RECLAIM));
+
+		i915_tt->filp = filp;
+	}
+
+	st = shmem_alloc_st(i915, size, mr, filp->f_mapping, max_segment);
+	if (IS_ERR(st))
+		return PTR_ERR(st);
+
+	err = dma_map_sg_attrs(i915_tt->dev,
+			       st->sgl, st->nents,
+			       PCI_DMA_BIDIRECTIONAL,
+			       DMA_ATTR_SKIP_CPU_SYNC |
+			       DMA_ATTR_NO_KERNEL_MAPPING |
+			       DMA_ATTR_NO_WARN);
+	if (err <= 0) {
+		err = -EINVAL;
+		goto err_free_st;
+	}
+
+	i = 0;
+	for_each_sgt_page(page, sgt_iter, st)
+		ttm->pages[i++] = page;
+
+	if (ttm->page_flags & TTM_TT_FLAG_SWAPPED)
+		ttm->page_flags &= ~TTM_TT_FLAG_SWAPPED;
+
+	i915_tt->cached_st = st;
+	return 0;
+
+err_free_st:
+	shmem_free_st(st, filp->f_mapping, false, false);
+	return err;
+}

+static void i915_ttm_tt_shmem_unpopulate(struct ttm_tt *ttm)
+{
+	struct i915_ttm_tt *i915_tt = container_of(ttm, typeof(*i915_tt), ttm);
+	bool backup = ttm->page_flags & TTM_TT_FLAG_SWAPPED;
+
+	dma_unmap_sg(i915_tt->dev, i915_tt->cached_st->sgl,
+		     i915_tt->cached_st->nents,
+		     PCI_DMA_BIDIRECTIONAL);
+
+	shmem_free_st(fetch_and_zero(&i915_tt->cached_st),
+		      file_inode(i915_tt->filp)->i_mapping,
+		      backup, backup);

Should we do something here to undo the shmem_file_setup operation? From its implementation, it takes a reference on the inode and allocates a file: https://elixir.bootlin.com/linux/latest/source/mm/shmem.c#L4084

Regards, Oak

Hi, Oak,

That's done in i915_ttm_tt_destroy() afaict.

/Thomas

Do we know at create time whether this tt is backed by shmem? If yes, I think a better place to do the shmem_file_setup is in i915_ttm_tt_create, since this pairs with the fput in i915_ttm_tt_destroy. If we don't have such information at tt create time, I agree with the current approach.

Regards, Oak

+}

static struct ttm_tt *i915_ttm_tt_create(struct ttm_buffer_object *bo, uint32_t page_flags) { struct ttm_resource_manager *man = ttm_manager_type(bo->bdev, bo->resource->mem_type); struct drm_i915_gem_object *obj = i915_ttm_to_gem(bo);

enum ttm_caching caching = i915_ttm_select_tt_caching(obj); struct i915_ttm_tt *i915_tt; int ret;

@@ -196,36 +279,62 @@ static struct ttm_tt *i915_ttm_tt_create(struct

ttm_buffer_object *bo,

...
   man->use_tt)
page_flags |= TTM_TT_FLAG_ZERO_ALLOC;
ret = ttm_tt_init(&i915_tt->ttm, bo, page_flags,
	  i915_ttm_select_tt_caching(obj));
if (ret) {
kfree(i915_tt);
return NULL;
if (i915_gem_object_is_shrinkable(obj) && caching == ttm_cached) {
page_flags |= TTM_TT_FLAG_EXTERNAL |
	      TTM_TT_FLAG_EXTERNAL_MAPPABLE;
i915_tt->is_shmem = true;
}
ret = ttm_tt_init(&i915_tt->ttm, bo, page_flags, caching);

if (ret)
goto err_free;
i915_tt->dev = obj->base.dev->dev;

return &i915_tt->ttm;
+err_free:

kfree(i915_tt);

return NULL;

+}

+static int i915_ttm_tt_populate(struct ttm_device *bdev,
		struct ttm_tt *ttm,
		struct ttm_operation_ctx *ctx)
+{

struct i915_ttm_tt *i915_tt = container_of(ttm, typeof(*i915_tt),

+ttm);
if (i915_tt->is_shmem)
return i915_ttm_tt_shmem_populate(bdev, ttm, ctx);
return ttm_pool_alloc(&bdev->pool, ttm, ctx); }

static void i915_ttm_tt_unpopulate(struct ttm_device *bdev, struct ttm_tt
*ttm) {

...
struct i915_ttm_tt *i915_tt = container_of(ttm, typeof(*i915_tt),

ttm);

...
if (i915_tt->cached_st) {
dma_unmap_sgtable(i915_tt->dev, i915_tt->cached_st,
		  DMA_BIDIRECTIONAL, 0);
sg_free_table(i915_tt->cached_st);
kfree(i915_tt->cached_st);
i915_tt->cached_st = NULL;
if (i915_tt->is_shmem) {
i915_ttm_tt_shmem_unpopulate(ttm);
} else {
if (i915_tt->cached_st) {
	dma_unmap_sgtable(i915_tt->dev, i915_tt->cached_st,
			  DMA_BIDIRECTIONAL, 0);
	sg_free_table(i915_tt->cached_st);
	kfree(i915_tt->cached_st);
	i915_tt->cached_st = NULL;
}
ttm_pool_free(&bdev->pool, ttm);
}
ttm_pool_free(&bdev->pool, ttm); }

static void i915_ttm_tt_destroy(struct ttm_device *bdev, struct ttm_tt
*ttm) {

...
struct i915_ttm_tt *i915_tt = container_of(ttm, typeof(*i915_tt),

ttm);

...
if (i915_tt->filp)
fput(i915_tt->filp);
ttm_tt_fini(ttm); kfree(i915_tt); }
@@ -235,6 +344,14 @@ static bool i915_ttm_eviction_valuable(struct
ttm_buffer_object *bo, {

...
struct drm_i915_gem_object *obj = i915_ttm_to_gem(bo);
/*
* EXTERNAL objects should never be swapped out by TTM, instead
we need

...
* to handle that ourselves. TTM will already skip such objects for us,
* but we would like to avoid grabbing locks for no good reason.
*/
if (bo->ttm && bo->ttm->page_flags & TTM_TT_FLAG_EXTERNAL)
return -EBUSY;
/* Will do for now. Our pinned objects are still on TTM's LRU lists */ return i915_gem_object_evictable(obj); } @@ -328,9 +445,11 @@
static void i915_ttm_adjust_gem_after_move(struct drm_i915_gem_object *obj)

...
i915_gem_object_set_cache_coherency(obj, cache_level); }

-static void i915_ttm_purge(struct drm_i915_gem_object *obj) +static int __i915_ttm_purge(struct drm_i915_gem_object *obj) { struct ttm_buffer_object *bo = i915_gem_to_ttm(obj);
struct i915_ttm_tt *i915_tt =
container_of(bo->ttm, typeof(*i915_tt), ttm);
struct ttm_operation_ctx ctx = { .interruptible = true, .no_wait_gpu = false,
@@ -339,17 +458,79 @@ static void i915_ttm_purge(struct
drm_i915_gem_object *obj)

...
int ret;

if (obj->mm.madv == __I915_MADV_PURGED)
return;
return 0;
/* TTM's purge interface. Note that we might be reentering. */ ret = ttm_bo_validate(bo, &place, &ctx);

if (!ret) {
obj->write_domain = 0;
obj->read_domains = 0;
i915_ttm_adjust_gem_after_move(obj);
i915_ttm_free_cached_io_st(obj);
obj->mm.madv = __I915_MADV_PURGED;
if (ret)
return ret;
if (bo->ttm && i915_tt->filp) {
/*
 * The below fput(which eventually calls shmem_truncate)
might

...
 * be delayed by worker, so when directly called to purge the
 * pages(like by the shrinker) we should try to be more
 * aggressive and release the pages immediately.
 */
shmem_truncate_range(file_inode(i915_tt->filp),
		     0, (loff_t)-1);
fput(fetch_and_zero(&i915_tt->filp));
}

obj->write_domain = 0;

obj->read_domains = 0;

i915_ttm_adjust_gem_after_move(obj);

i915_ttm_free_cached_io_st(obj);

obj->mm.madv = __I915_MADV_PURGED;

return 0;
+}

+static void i915_ttm_purge(struct drm_i915_gem_object *obj) {

__i915_ttm_purge(obj);

+}

+static int i915_ttm_shrinker_release_pages(struct drm_i915_gem_object *obj,
+					   bool should_writeback)
+{
+	struct ttm_buffer_object *bo = i915_gem_to_ttm(obj);
+	struct i915_ttm_tt *i915_tt =
+		container_of(bo->ttm, typeof(*i915_tt), ttm);
+	struct ttm_operation_ctx ctx = {
+		.interruptible = true,
+		.no_wait_gpu = false,
+	};
+	struct ttm_placement place = {};
+	int ret;
+
+	if (!bo->ttm || bo->resource->mem_type != TTM_PL_SYSTEM)
+		return 0;
+
+	GEM_BUG_ON(!i915_tt->is_shmem);
+
+	if (!i915_tt->filp)
+		return 0;
+
+	switch (obj->mm.madv) {
+	case I915_MADV_DONTNEED:
+		return __i915_ttm_purge(obj);
+	case __I915_MADV_PURGED:
+		return 0;
+	}
+
+	if (bo->ttm->page_flags & TTM_TT_FLAG_SWAPPED)
+		return 0;
+
+	bo->ttm->page_flags |= TTM_TT_FLAG_SWAPPED;
+	ret = ttm_bo_validate(bo, &place, &ctx);
+	if (ret) {
+		bo->ttm->page_flags &= ~TTM_TT_FLAG_SWAPPED;
+		return ret;
+	}
+
+	if (should_writeback)
+		__shmem_writeback(obj->base.size, i915_tt->filp->f_mapping);
+
+	return 0;
 }

static void i915_ttm_swap_notify(struct ttm_buffer_object *bo) @@ -
618,6 +799,7 @@ static unsigned long i915_ttm_io_mem_pfn(struct ttm_buffer_object *bo,

...
static struct ttm_device_funcs i915_ttm_bo_driver = { .ttm_tt_create = i915_ttm_tt_create,

.ttm_tt_populate = i915_ttm_tt_populate, .ttm_tt_unpopulate = i915_ttm_tt_unpopulate, .ttm_tt_destroy = i915_ttm_tt_destroy, .eviction_valuable = i915_ttm_eviction_valuable, @@ -685,12 +867,17

@@ static int __i915_ttm_get_pages(struct drm_i915_gem_object *obj,

...
}

if (!i915_gem_object_has_pages(obj)) {
struct i915_ttm_tt *i915_tt =
	container_of(bo->ttm, typeof(*i915_tt), ttm);
/* Object either has a page vector or is an iomem object */
st = bo->ttm ? i915_ttm_tt_get_st(bo->ttm) : obj->ttm.cached_io_st;
if (IS_ERR(st))
	return PTR_ERR(st);
__i915_gem_object_set_pages(obj, st,
i915_sg_dma_sizes(st->sgl));

...
if (!bo->ttm || !i915_tt->is_shmem)
	i915_gem_object_make_unshrinkable(obj);
}

return ret;
@@ -770,6 +957,8 @@ static void i915_ttm_put_pages(struct
drm_i915_gem_object *obj, static void i915_ttm_adjust_lru(struct drm_i915_gem_object *obj) {

...
struct ttm_buffer_object *bo = i915_gem_to_ttm(obj);
struct i915_ttm_tt *i915_tt =
container_of(bo->ttm, typeof(*i915_tt), ttm);
/*

Don't manipulate the TTM LRUs while in TTM bo destruction.
@@ -782,7 +971,10 @@ static void i915_ttm_adjust_lru(struct
drm_i915_gem_object *obj)

...
* Put on the correct LRU list depending on the MADV status
*/
spin_lock(&bo->bdev->lru_lock);

if (obj->mm.madv != I915_MADV_WILLNEED) {
if (bo->ttm && i915_tt->filp) {
/* Try to keep shmem_tt from being considered for shrinking.
*/

...
bo->priority = TTM_MAX_BO_PRIORITY - 1;
} else if (obj->mm.madv != I915_MADV_WILLNEED) { bo->priority = I915_TTM_PRIO_PURGE; } else if (!i915_gem_object_has_pages(obj)) { if (bo->priority < I915_TTM_PRIO_HAS_PAGES) @@ -887,9
+1079,12 @@ static const struct drm_i915_gem_object_ops i915_gem_ttm_obj_ops = {

...
.get_pages = i915_ttm_get_pages, .put_pages = i915_ttm_put_pages, .truncate = i915_ttm_purge,

.shrinker_release_pages = i915_ttm_shrinker_release_pages,

.adjust_lru = i915_ttm_adjust_lru, .delayed_free = i915_ttm_delayed_free, .migrate = i915_ttm_migrate,

.mmap_offset = i915_ttm_mmap_offset, .mmap_ops = &vm_ops_ttm, };

@@ -937,7 +1132,6 @@ int __i915_gem_ttm_object_init(struct

intel_memory_region *mem,

...
drm_gem_private_object_init(&i915->drm, &obj->base, size); i915_gem_object_init(obj, &i915_gem_ttm_obj_ops, &lock_class,

flags);

...
i915_gem_object_init_memory_region(obj, mem);

i915_gem_object_make_unshrinkable(obj); INIT_RADIX_TREE(&obj->ttm.get_io_page.radix, GFP_KERNEL |

__GFP_NOWARN);

...
mutex_init(&obj->ttm.get_io_page.lock); bo_type = (obj->flags & I915_BO_ALLOC_USER) ?

ttm_bo_type_device :

...
-- 2.26.3

Matthew Auld

5:07 p.m.

New subject: [Intel-gfx] [PATCH v5 09/13] drm/i915/ttm: add tt shmem backend

On 05/10/2021 15:23, Zeng, Oak wrote:

...

Regards, Oak

-----Original Message-----
From: Thomas Hellström thomas.hellstrom@linux.intel.com
Sent: October 5, 2021 9:48 AM
To: Zeng, Oak oak.zeng@intel.com; Auld, Matthew matthew.auld@intel.com; intel-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org; Christian König christian.koenig@amd.com
Subject: Re: [Intel-gfx] [PATCH v5 09/13] drm/i915/ttm: add tt shmem backend

On 10/5/21 04:05, Zeng, Oak wrote:

Hi Matthew/Thomas,

See one question inline

Regards, Oak

-----Original Message-----
From: Intel-gfx intel-gfx-bounces@lists.freedesktop.org On Behalf Of Matthew Auld
Sent: September 27, 2021 7:41 AM
To: intel-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org; Thomas Hellström thomas.hellstrom@linux.intel.com; Christian König christian.koenig@amd.com
Subject: [Intel-gfx] [PATCH v5 09/13] drm/i915/ttm: add tt shmem backend

For cached objects we can allocate our pages directly in shmem. This
should make it possible(in a later patch) to utilise the existing
i915-gem shrinker code for such objects. For now this is still disabled.

v2(Thomas):
  - Add optional try_to_writeback hook for objects. Importantly we need
    to check if the object is even still shrinkable; in between us
    dropping the shrinker LRU lock and acquiring the object lock it could
    for example have been moved. Also we need to differentiate between
    "lazy" shrinking and the immediate writeback mode. Also later we need
    to handle objects which don't even have mm.pages, so bundling this
    into put_pages() would require somehow handling that edge case, hence
    just letting the ttm backend handle everything in try_to_writeback
    doesn't seem too bad.
v3(Thomas):
  - Likely a bad idea to touch the object from the unpopulate hook,
    since it's not possible to hold a reference, without also creating
    circular dependency, so likely this is too fragile. For now just
    ensure we at least mark the pages as dirty/accessed when called from
    the shrinker on WILLNEED objects.
  - s/try_to_writeback/shrinker_release_pages, since this can do more
    than just writeback.
  - Get rid of do_backup boolean and just set the SWAPPED flag prior to
    calling unpopulate.
  - Keep shmem_tt as lowest priority for the TTM LRU bo_swapout walk,
    since these just get skipped anyway. We can try to come up with
    something better later.

Signed-off-by: Matthew Auld matthew.auld@intel.com
Cc: Thomas Hellström thomas.hellstrom@linux.intel.com
Cc: Christian König christian.koenig@amd.com

 drivers/gpu/drm/i915/gem/i915_gem_object.h    |   8 +
 .../gpu/drm/i915/gem/i915_gem_object_types.h  |   2 +
 drivers/gpu/drm/i915/gem/i915_gem_shmem.c     |  14 +-
 drivers/gpu/drm/i915/gem/i915_gem_shrinker.c  |  17 +-
 drivers/gpu/drm/i915/gem/i915_gem_ttm.c       | 240 ++++++++++++++++-
 5 files changed, 245 insertions(+), 36 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h

b/drivers/gpu/drm/i915/gem/i915_gem_object.h

...
index 3043fcbd31bd..1c9a1d8d3434 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object.h +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h @@ -601,6 +601,14 @@ int i915_gem_object_wait_migration(struct

drm_i915_gem_object *obj, bool i915_gem_object_placement_possible(struct drm_i915_gem_object *obj,

...
                                 enum intel_memory_type type);
+struct sg_table *shmem_alloc_st(struct drm_i915_private *i915,
                      size_t size, struct intel_memory_region *mr,
                      struct address_space *mapping,
                      unsigned int max_segment);
+void shmem_free_st(struct sg_table *st, struct address_space *mapping,
         bool dirty, bool backup);
+void __shmem_writeback(size_t size, struct address_space *mapping);

#ifdef CONFIG_MMU_NOTIFIER static inline bool i915_gem_object_is_userptr(struct drm_i915_gem_object *obj) diff --git
a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h

...
index fa2ba9e2a4d0..f0fb17be2f7a 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h +++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h @@ -56,6 +56,8 @@ struct drm_i915_gem_object_ops { struct sg_table *pages); void (*truncate)(struct drm_i915_gem_object *obj); void (*writeback)(struct drm_i915_gem_object *obj);
int (*shrinker_release_pages)(struct drm_i915_gem_object *obj,
                            bool should_writeback);
int (*pread)(struct drm_i915_gem_object *obj, const struct drm_i915_gem_pread *arg); diff --git
a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c

...
index 36b711ae9e28..19e55cc29a15 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c @@ -25,8 +25,8 @@ static void check_release_pagevec(struct pagevec

*pvec)

...
 cond_resched();
}

-static void shmem_free_st(struct sg_table *st, struct address_space
*mapping,

...
                bool dirty, bool backup)
+void shmem_free_st(struct sg_table *st, struct address_space *mapping,
         bool dirty, bool backup)
{ struct sgt_iter sgt_iter; struct pagevec pvec;
@@ -52,10 +52,10 @@ static void shmem_free_st(struct sg_table *st, struct
address_space *mapping,

...
 kfree(st);
}

-static struct sg_table *shmem_alloc_st(struct drm_i915_private *i915,
                             size_t size, struct intel_memory_region
*mr,

...
                             struct address_space *mapping,
                             unsigned int max_segment)
+struct sg_table *shmem_alloc_st(struct drm_i915_private *i915,
                      size_t size, struct intel_memory_region *mr,
                      struct address_space *mapping,
                      unsigned int max_segment)
{ const unsigned long page_count = size / PAGE_SIZE; unsigned long i;
@@ -300,7 +300,7 @@ shmem_truncate(struct drm_i915_gem_object *obj) obj->mm.pages = ERR_PTR(-EFAULT); }

-static void __shmem_writeback(size_t size, struct address_space
*mapping)

...
+void __shmem_writeback(size_t size, struct address_space *mapping) { struct writeback_control wbc = { .sync_mode = WB_SYNC_NONE, diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c

b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c

...
index e382b7f2353b..cc80bd23d323 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c @@ -56,19 +56,24 @@ static bool unsafe_drop_pages(struct

drm_i915_gem_object *obj,

...
 return false;
}

-static void try_to_writeback(struct drm_i915_gem_object *obj,
-			     unsigned int flags)
+static int try_to_writeback(struct drm_i915_gem_object *obj, unsigned int flags)
 {
+	if (obj->ops->shrinker_release_pages)
+		return obj->ops->shrinker_release_pages(obj,
+							flags & I915_SHRINK_WRITEBACK);
+
 	switch (obj->mm.madv) {
 	case I915_MADV_DONTNEED:
 		i915_gem_object_truncate(obj);
-		return;
+		return 0;
 	case __I915_MADV_PURGED:
-		return;
+		return 0;
 	}

 	if (flags & I915_SHRINK_WRITEBACK)
 		i915_gem_object_writeback(obj);
+	return 0;
 }

 /**
@@ -222,8 +227,8 @@ i915_gem_shrink(struct i915_gem_ww_ctx *ww,
 		}
 		if (!__i915_gem_object_put_pages(obj)) {
-			try_to_writeback(obj, shrink);
-			count += obj->base.size >> PAGE_SHIFT;
+			if (!try_to_writeback(obj, shrink))
+				count += obj->base.size >> PAGE_SHIFT;
 		}
 		if (!ww)
 			i915_gem_object_unlock(obj);
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c

...
index a77e90f300fe..c7402995a8f9 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c @@ -35,6 +35,8 @@ * @ttm: The base TTM page vector. * @dev: The struct device used for dma mapping and unmapping. * @cached_st: The cached scatter-gather table.

@is_shmem: Set if using shmem.

@filp: The shmem file, if using shmem backend.

Note that DMA may be going on right up to the point where the page-

vector is unpopulated in delayed destroy. Hence keep the @@ -46,6

+48,9 @@ struct i915_ttm_tt {

...
 struct ttm_tt ttm;
 struct device *dev;
 struct sg_table *cached_st;
bool is_shmem;

struct file *filp; };

static const struct ttm_place sys_placement_flags = { @@ -179,12 +184,90
@@ i915_ttm_placement_from_obj(const struct drm_i915_gem_object *obj,

...
 placement->busy_placement = busy;
}

+static int i915_ttm_tt_shmem_populate(struct ttm_device *bdev,
+				      struct ttm_tt *ttm,
+				      struct ttm_operation_ctx *ctx)
+{
+	struct drm_i915_private *i915 = container_of(bdev, typeof(*i915), bdev);
+	struct intel_memory_region *mr = i915->mm.regions[INTEL_MEMORY_SYSTEM];
+	struct i915_ttm_tt *i915_tt = container_of(ttm, typeof(*i915_tt), ttm);
+	const unsigned int max_segment = i915_sg_segment_size();
+	const size_t size = ttm->num_pages << PAGE_SHIFT;
+	struct file *filp = i915_tt->filp;
+	struct sgt_iter sgt_iter;
+	struct sg_table *st;
+	struct page *page;
+	unsigned long i;
+	int err;
+
+	if (!filp) {
+		struct address_space *mapping;
+		gfp_t mask;
+
+		filp = shmem_file_setup("i915-shmem-tt", size, VM_NORESERVE);
+		if (IS_ERR(filp))
+			return PTR_ERR(filp);
+
+		mask = GFP_HIGHUSER | __GFP_RECLAIMABLE;
+
+		mapping = filp->f_mapping;
+		mapping_set_gfp_mask(mapping, mask);
+		GEM_BUG_ON(!(mapping_gfp_mask(mapping) & __GFP_RECLAIM));
+
+		i915_tt->filp = filp;
+	}
+
+	st = shmem_alloc_st(i915, size, mr, filp->f_mapping, max_segment);
+	if (IS_ERR(st))
+		return PTR_ERR(st);
+
+	err = dma_map_sg_attrs(i915_tt->dev,
+			       st->sgl, st->nents,
+			       PCI_DMA_BIDIRECTIONAL,
+			       DMA_ATTR_SKIP_CPU_SYNC |
+			       DMA_ATTR_NO_KERNEL_MAPPING |
+			       DMA_ATTR_NO_WARN);
+	if (err <= 0) {
+		err = -EINVAL;
+		goto err_free_st;
+	}
+
+	i = 0;
+	for_each_sgt_page(page, sgt_iter, st)
+		ttm->pages[i++] = page;
+
+	if (ttm->page_flags & TTM_TT_FLAG_SWAPPED)
+		ttm->page_flags &= ~TTM_TT_FLAG_SWAPPED;
+
+	i915_tt->cached_st = st;
+	return 0;
+
+err_free_st:
+	shmem_free_st(st, filp->f_mapping, false, false);
+	return err;
+}

+static void i915_ttm_tt_shmem_unpopulate(struct ttm_tt *ttm)
+{
+	struct i915_ttm_tt *i915_tt = container_of(ttm, typeof(*i915_tt), ttm);
+	bool backup = ttm->page_flags & TTM_TT_FLAG_SWAPPED;
+
+	dma_unmap_sg(i915_tt->dev, i915_tt->cached_st->sgl,
+		     i915_tt->cached_st->nents,
+		     PCI_DMA_BIDIRECTIONAL);
+
+	shmem_free_st(fetch_and_zero(&i915_tt->cached_st),
+		      file_inode(i915_tt->filp)->i_mapping,
+		      backup, backup);

Should we do something here to undo the shmem_file_setup operation? From its implementation, it takes a reference on the inode and allocates a file: https://elixir.bootlin.com/linux/latest/source/mm/shmem.c#L4084

Regards, Oak

Hi, Oak,

That's done in i915_ttm_tt_destroy() afaict.

/Thomas

Do we know at create time whether this tt is backed by shmem? If yes, I think a better place to do the shmem_file_setup is in i915_ttm_tt_create, since this pairs with the fput in i915_ttm_tt_destroy. If we don't have such information at tt create time, I agree with the current approach.

IIRC tt_create is called even if the object will most likely just end up being placed in VRAM, so calling shmem_file_setup there seemed potentially wasteful, since the shmem file might never even be needed. Hence keeping it in populate instead.
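
To make the lifetime in this exchange concrete, here is a small userspace sketch of the pattern being discussed: the file is set up lazily on first populate, survives unpopulate, and is released in destroy. All names (model_tt, model_file, model_files_alive and the model_tt_* helpers) are hypothetical, and calloc/free merely stand in for shmem_file_setup/fput. The purge path, which in the patch also drops the file, is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the file returned by shmem_file_setup(). */
struct model_file {
	int unused;
};

/* Number of live model files, to make leaks and waste observable. */
static int model_files_alive;

/* Hypothetical, simplified stand-in for struct i915_ttm_tt. */
struct model_tt {
	bool is_shmem;
	bool populated;
	struct model_file *filp;
};

/* Create: only record the backend choice, no file yet. */
static struct model_tt *model_tt_create(bool is_shmem)
{
	struct model_tt *tt = calloc(1, sizeof(*tt));

	if (tt)
		tt->is_shmem = is_shmem;
	return tt;
}

/* Populate: set up the file on first use and reuse it afterwards. */
static int model_tt_populate(struct model_tt *tt)
{
	if (tt->is_shmem && !tt->filp) {
		tt->filp = calloc(1, sizeof(*tt->filp));
		if (!tt->filp)
			return -1;
		model_files_alive++;
	}
	tt->populated = true;
	return 0;
}

/* Unpopulate: drop the pages but keep the file. */
static void model_tt_unpopulate(struct model_tt *tt)
{
	assert(tt->populated);
	tt->populated = false;
}

/* Destroy: the place where the file reference is dropped. */
static void model_tt_destroy(struct model_tt *tt)
{
	if (tt->filp) {
		free(tt->filp);
		model_files_alive--;
	}
	free(tt);
}
```

A tt that is created but never populated in system memory (for example, one whose object stays in VRAM) never allocates a file in this model, which is the waste the lazy setup avoids.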

Regards, Oak

+}

static struct ttm_tt *i915_ttm_tt_create(struct ttm_buffer_object *bo, uint32_t page_flags) { struct ttm_resource_manager *man = ttm_manager_type(bo->bdev, bo->resource->mem_type); struct drm_i915_gem_object *obj = i915_ttm_to_gem(bo);

enum ttm_caching caching = i915_ttm_select_tt_caching(obj); struct i915_ttm_tt *i915_tt; int ret;

@@ -196,36 +279,62 @@ static struct ttm_tt *i915_ttm_tt_create(struct

ttm_buffer_object *bo,

...
     man->use_tt)
         page_flags |= TTM_TT_FLAG_ZERO_ALLOC;
ret = ttm_tt_init(&i915_tt->ttm, bo, page_flags,
                i915_ttm_select_tt_caching(obj));
if (ret) {
      kfree(i915_tt);
      return NULL;
if (i915_gem_object_is_shrinkable(obj) && caching == ttm_cached) {
      page_flags |= TTM_TT_FLAG_EXTERNAL |
                    TTM_TT_FLAG_EXTERNAL_MAPPABLE;
      i915_tt->is_shmem = true;
}
ret = ttm_tt_init(&i915_tt->ttm, bo, page_flags, caching);

if (ret)
      goto err_free;
i915_tt->dev = obj->base.dev->dev;

return &i915_tt->ttm;
+err_free:

kfree(i915_tt);

return NULL;

+}

+static int i915_ttm_tt_populate(struct ttm_device *bdev,
                      struct ttm_tt *ttm,
                      struct ttm_operation_ctx *ctx)
+{

struct i915_ttm_tt *i915_tt = container_of(ttm, typeof(*i915_tt),

+ttm);
if (i915_tt->is_shmem)
      return i915_ttm_tt_shmem_populate(bdev, ttm, ctx);
return ttm_pool_alloc(&bdev->pool, ttm, ctx); }

static void i915_ttm_tt_unpopulate(struct ttm_device *bdev, struct ttm_tt
*ttm) {

...
 struct i915_ttm_tt *i915_tt = container_of(ttm, typeof(*i915_tt),
ttm);

...
if (i915_tt->cached_st) {
      dma_unmap_sgtable(i915_tt->dev, i915_tt->cached_st,
                        DMA_BIDIRECTIONAL, 0);
      sg_free_table(i915_tt->cached_st);
      kfree(i915_tt->cached_st);
      i915_tt->cached_st = NULL;
if (i915_tt->is_shmem) {
      i915_ttm_tt_shmem_unpopulate(ttm);
} else {
      if (i915_tt->cached_st) {
              dma_unmap_sgtable(i915_tt->dev, i915_tt->cached_st,
                                DMA_BIDIRECTIONAL, 0);
              sg_free_table(i915_tt->cached_st);
              kfree(i915_tt->cached_st);
              i915_tt->cached_st = NULL;
      }
      ttm_pool_free(&bdev->pool, ttm);
}
ttm_pool_free(&bdev->pool, ttm); }

static void i915_ttm_tt_destroy(struct ttm_device *bdev, struct ttm_tt
*ttm) {

...
 struct i915_ttm_tt *i915_tt = container_of(ttm, typeof(*i915_tt),
ttm);

...
if (i915_tt->filp)
      fput(i915_tt->filp);
ttm_tt_fini(ttm); kfree(i915_tt); }
@@ -235,6 +344,14 @@ static bool i915_ttm_eviction_valuable(struct
ttm_buffer_object *bo, {

...
 struct drm_i915_gem_object *obj = i915_ttm_to_gem(bo);
/*
 * EXTERNAL objects should never be swapped out by TTM, instead we need
 * to handle that ourselves. TTM will already skip such objects for us,
 * but we would like to avoid grabbing locks for no good reason.
 */

if (bo->ttm && bo->ttm->page_flags & TTM_TT_FLAG_EXTERNAL)
      return -EBUSY;
/* Will do for now. Our pinned objects are still on TTM's LRU lists */ return i915_gem_object_evictable(obj); } @@ -328,9 +445,11 @@
static void i915_ttm_adjust_gem_after_move(struct drm_i915_gem_object *obj)

...
 i915_gem_object_set_cache_coherency(obj, cache_level);  }
-static void i915_ttm_purge(struct drm_i915_gem_object *obj) +static int __i915_ttm_purge(struct drm_i915_gem_object *obj) { struct ttm_buffer_object *bo = i915_gem_to_ttm(obj);
struct i915_ttm_tt *i915_tt =
      container_of(bo->ttm, typeof(*i915_tt), ttm);
struct ttm_operation_ctx ctx = { .interruptible = true, .no_wait_gpu = false,
@@ -339,17 +458,79 @@ static void i915_ttm_purge(struct
drm_i915_gem_object *obj)

...
 int ret;

 if (obj->mm.madv == __I915_MADV_PURGED)
      return;
      return 0;
/* TTM's purge interface. Note that we might be reentering. */ ret = ttm_bo_validate(bo, &place, &ctx);

if (!ret) {
      obj->write_domain = 0;
      obj->read_domains = 0;
      i915_ttm_adjust_gem_after_move(obj);
      i915_ttm_free_cached_io_st(obj);
      obj->mm.madv = __I915_MADV_PURGED;
if (ret)
      return ret;
if (bo->ttm && i915_tt->filp) {
      /*
       * The below fput (which eventually calls shmem_truncate) might
       * be delayed by worker, so when directly called to purge the
       * pages (like by the shrinker) we should try to be more
       * aggressive and release the pages immediately.
       */
      shmem_truncate_range(file_inode(i915_tt->filp),
                           0, (loff_t)-1);
      fput(fetch_and_zero(&i915_tt->filp));
}

obj->write_domain = 0;

obj->read_domains = 0;

i915_ttm_adjust_gem_after_move(obj);

i915_ttm_free_cached_io_st(obj);

obj->mm.madv = __I915_MADV_PURGED;

return 0;
+}

+static void i915_ttm_purge(struct drm_i915_gem_object *obj) {

__i915_ttm_purge(obj);

+}

+static int i915_ttm_shrinker_release_pages(struct drm_i915_gem_object *obj,
                                 bool should_writeback)
+{
struct ttm_buffer_object *bo = i915_gem_to_ttm(obj);

struct i915_ttm_tt *i915_tt =
      container_of(bo->ttm, typeof(*i915_tt), ttm);
struct ttm_operation_ctx ctx = {
      .interruptible = true,
      .no_wait_gpu = false,
};

struct ttm_placement place = {};

int ret;

if (!bo->ttm || bo->resource->mem_type != TTM_PL_SYSTEM)
      return 0;
GEM_BUG_ON(!i915_tt->is_shmem);

if (!i915_tt->filp)
      return 0;
switch (obj->mm.madv) {

case I915_MADV_DONTNEED:
      return __i915_ttm_purge(obj);
case __I915_MADV_PURGED:
      return 0;
}

if (bo->ttm->page_flags & TTM_TT_FLAG_SWAPPED)
      return 0;
bo->ttm->page_flags |= TTM_TT_FLAG_SWAPPED;

ret = ttm_bo_validate(bo, &place, &ctx);

if (ret) {
      bo->ttm->page_flags &= ~TTM_TT_FLAG_SWAPPED;
      return ret;
}
if (should_writeback)
      __shmem_writeback(obj->base.size, i915_tt->filp->f_mapping);

return 0; }

static void i915_ttm_swap_notify(struct ttm_buffer_object *bo) @@ -
618,6 +799,7 @@ static unsigned long i915_ttm_io_mem_pfn(struct ttm_buffer_object *bo,

...
static struct ttm_device_funcs i915_ttm_bo_driver = { .ttm_tt_create = i915_ttm_tt_create,

.ttm_tt_populate = i915_ttm_tt_populate, .ttm_tt_unpopulate = i915_ttm_tt_unpopulate, .ttm_tt_destroy = i915_ttm_tt_destroy, .eviction_valuable = i915_ttm_eviction_valuable, @@ -685,12 +867,17

@@ static int __i915_ttm_get_pages(struct drm_i915_gem_object *obj,

...
 }

 if (!i915_gem_object_has_pages(obj)) {
      struct i915_ttm_tt *i915_tt =
              container_of(bo->ttm, typeof(*i915_tt), ttm);
       /* Object either has a page vector or is an iomem object */
       st = bo->ttm ? i915_ttm_tt_get_st(bo->ttm) : obj->ttm.cached_io_st;
       if (IS_ERR(st))
               return PTR_ERR(st);
       __i915_gem_object_set_pages(obj, st, i915_sg_dma_sizes(st->sgl));

...
      if (!bo->ttm || !i915_tt->is_shmem)
              i915_gem_object_make_unshrinkable(obj);
}

return ret;
@@ -770,6 +957,8 @@ static void i915_ttm_put_pages(struct
drm_i915_gem_object *obj, static void i915_ttm_adjust_lru(struct drm_i915_gem_object *obj) {

...
 struct ttm_buffer_object *bo = i915_gem_to_ttm(obj);
struct i915_ttm_tt *i915_tt =
      container_of(bo->ttm, typeof(*i915_tt), ttm);
/* * Don't manipulate the TTM LRUs while in TTM bo destruction.
@@ -782,7 +971,10 @@ static void i915_ttm_adjust_lru(struct
drm_i915_gem_object *obj)

...
  * Put on the correct LRU list depending on the MADV status
  */
 spin_lock(&bo->bdev->lru_lock);
if (obj->mm.madv != I915_MADV_WILLNEED) {
if (bo->ttm && i915_tt->filp) {
      /* Try to keep shmem_tt from being considered for shrinking. */

...
      bo->priority = TTM_MAX_BO_PRIORITY - 1;
} else if (obj->mm.madv != I915_MADV_WILLNEED) { bo->priority = I915_TTM_PRIO_PURGE; } else if (!i915_gem_object_has_pages(obj)) { if (bo->priority < I915_TTM_PRIO_HAS_PAGES) @@ -887,9
+1079,12 @@ static const struct drm_i915_gem_object_ops i915_gem_ttm_obj_ops = {

...
 .get_pages = i915_ttm_get_pages,
 .put_pages = i915_ttm_put_pages,
 .truncate = i915_ttm_purge,
.shrinker_release_pages = i915_ttm_shrinker_release_pages,

.adjust_lru = i915_ttm_adjust_lru, .delayed_free = i915_ttm_delayed_free, .migrate = i915_ttm_migrate,

.mmap_offset = i915_ttm_mmap_offset, .mmap_ops = &vm_ops_ttm, };

@@ -937,7 +1132,6 @@ int __i915_gem_ttm_object_init(struct
intel_memory_region *mem,

...
 drm_gem_private_object_init(&i915->drm, &obj->base, size);
 i915_gem_object_init(obj, &i915_gem_ttm_obj_ops, &lock_class,
flags);

...
 i915_gem_object_init_memory_region(obj, mem);
i915_gem_object_make_unshrinkable(obj); INIT_RADIX_TREE(&obj->ttm.get_io_page.radix, GFP_KERNEL |
__GFP_NOWARN);

...
 mutex_init(&obj->ttm.get_io_page.lock);
 bo_type = (obj->flags & I915_BO_ALLOC_USER) ?
ttm_bo_type_device :

...
-- 2.26.3

Zeng, Oak

6:33 p.m.

New subject: [Intel-gfx] [PATCH v5 09/13] drm/i915/ttm: add tt shmem backend

Thanks for explanation. This patch is Acked-by: Oak Zeng Oak.Zeng@intel.com

Regards, Oak

...

-----Original Message----- From: Auld, Matthew matthew.auld@intel.com Sent: October 5, 2021 1:07 PM To: Zeng, Oak oak.zeng@intel.com; Thomas Hellström thomas.hellstrom@linux.intel.com; intel-gfx@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org; Christian König christian.koenig@amd.com Subject: Re: [Intel-gfx] [PATCH v5 09/13] drm/i915/ttm: add tt shmem backend

On 05/10/2021 15:23, Zeng, Oak wrote:

...
Regards, Oak

...
-----Original Message----- From: Thomas Hellström thomas.hellstrom@linux.intel.com Sent: October 5, 2021 9:48 AM To: Zeng, Oak oak.zeng@intel.com; Auld, Matthew matthew.auld@intel.com; intel-gfx@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org; Christian König christian.koenig@amd.com Subject: Re: [Intel-gfx] [PATCH v5 09/13] drm/i915/ttm: add tt shmem backend

On 10/5/21 04:05, Zeng, Oak wrote:

...
Hi Matthew/Thomas,

See one question inline

Regards, Oak

-----Original Message----- From: Intel-gfx intel-gfx-bounces@lists.freedesktop.org On Behalf Of Matthew Auld Sent: September 27, 2021 7:41 AM To: intel-gfx@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org; Thomas Hellström thomas.hellstrom@linux.intel.com; Christian König christian.koenig@amd.com Subject: [Intel-gfx] [PATCH v5 09/13] drm/i915/ttm: add tt shmem backend

...
...
...
For cached objects we can allocate our pages directly in shmem. This should make it possible (in a later patch) to utilise the existing i915-gem shrinker code for such objects. For now this is still disabled.

...
v2(Thomas):
- Add optional try_to_writeback hook for objects. Importantly we need
  to check if the object is even still shrinkable; in between us
  dropping the shrinker LRU lock and acquiring the object lock it could
  for example have been moved. Also we need to differentiate between
  "lazy" shrinking and the immediate writeback mode. Also later we need
  to handle objects which don't even have mm.pages, so bundling this
  into put_pages() would require somehow handling that edge case, hence
  just letting the ttm backend handle everything in try_to_writeback
  doesn't seem too bad.
v3(Thomas):
- Likely a bad idea to touch the object from the unpopulate hook, since
  it's not possible to hold a reference, without also creating circular
  dependency, so likely this is too fragile. For now just ensure we at
  least mark the pages as dirty/accessed when called from the shrinker
  on WILLNEED objects.
- s/try_to_writeback/shrinker_release_pages, since this can do more
  than just writeback.
- Get rid of do_backup boolean and just set the SWAPPED flag prior to
  calling unpopulate.
- Keep shmem_tt as lowest priority for the TTM LRU bo_swapout walk,
  since these just get skipped anyway. We can try to come up with
  something better later.
Signed-off-by: Matthew Auld matthew.auld@intel.com Cc: Thomas Hellström thomas.hellstrom@linux.intel.com Cc: Christian König christian.koenig@amd.com
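The v3 note about setting the SWAPPED flag prior to calling unpopulate can be illustrated with a small userspace model. This is only a sketch under stated assumptions: every demo_* name below is hypothetical and is not part of TTM or i915, and the eviction step merely stands in for the ttm_bo_validate() call in the patch. It shows the flag being set before eviction, cleared again if eviction fails, and consulted by unpopulate to decide whether the contents are preserved.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins; none of these names exist in TTM or i915. */
#define DEMO_FLAG_SWAPPED (1u << 0)

enum demo_madv { DEMO_WILLNEED, DEMO_DONTNEED, DEMO_PURGED };

struct demo_tt {
	unsigned int page_flags;
	bool populated;	/* pages are currently attached */
	bool backed_up;	/* contents were preserved when pages were dropped */
};

struct demo_obj {
	enum demo_madv madv;
	struct demo_tt tt;
	int fail_evict;	/* illustration knob: error the eviction returns, or 0 */
};

/* Models unpopulate: the SWAPPED flag alone decides whether data is kept. */
static void demo_unpopulate(struct demo_tt *tt)
{
	tt->backed_up = (tt->page_flags & DEMO_FLAG_SWAPPED) != 0;
	tt->populated = false;
}

/* Models eviction to an empty placement; it may fail before unpopulating. */
static int demo_evict(struct demo_obj *obj)
{
	if (obj->fail_evict)
		return obj->fail_evict;
	demo_unpopulate(&obj->tt);
	return 0;
}

/* Purge: drop the pages without preserving their contents. */
static int demo_purge(struct demo_obj *obj)
{
	int ret;

	if (obj->madv == DEMO_PURGED)
		return 0;
	ret = demo_evict(obj);
	if (ret)
		return ret;
	obj->madv = DEMO_PURGED;
	return 0;
}

/* Shrinker entry point: purge DONTNEED objects, swap out WILLNEED ones. */
static int demo_release_pages(struct demo_obj *obj)
{
	int ret;

	if (!obj->tt.populated)
		return 0;

	switch (obj->madv) {
	case DEMO_DONTNEED:
		return demo_purge(obj);
	case DEMO_PURGED:
		return 0;
	default:
		break;
	}

	if (obj->tt.page_flags & DEMO_FLAG_SWAPPED)
		return 0;

	/* Set the flag first so unpopulate knows to preserve the contents. */
	obj->tt.page_flags |= DEMO_FLAG_SWAPPED;
	ret = demo_evict(obj);
	if (ret) {
		/* Eviction failed: the pages are still resident, so roll back. */
		obj->tt.page_flags &= ~DEMO_FLAG_SWAPPED;
		return ret;
	}
	return 0;
}
```

In this model a failed eviction leaves the object exactly as it was, which is the property the rollback in i915_ttm_shrinker_release_pages() appears intended to provide.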

drivers/gpu/drm/i915/gem/i915_gem_object.h | 8 + .../gpu/drm/i915/gem/i915_gem_object_types.h | 2 + drivers/gpu/drm/i915/gem/i915_gem_shmem.c | 14 +- drivers/gpu/drm/i915/gem/i915_gem_shrinker.c | 17 +- drivers/gpu/drm/i915/gem/i915_gem_ttm.c | 240
++++++++++++++++-

...
...
...
5 files changed, 245 insertions(+), 36 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h

b/drivers/gpu/drm/i915/gem/i915_gem_object.h

...
index 3043fcbd31bd..1c9a1d8d3434 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object.h +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h @@ -601,6 +601,14 @@ int i915_gem_object_wait_migration(struct

drm_i915_gem_object *obj, bool i915_gem_object_placement_possible(struct drm_i915_gem_object *obj,

...
                                 enum intel_memory_type type);
+struct sg_table *shmem_alloc_st(struct drm_i915_private *i915,
                      size_t size, struct intel_memory_region *mr,
                      struct address_space *mapping,
                      unsigned int max_segment);
+void shmem_free_st(struct sg_table *st, struct address_space
*mapping,

...
...
...
         bool dirty, bool backup);
+void __shmem_writeback(size_t size, struct address_space *mapping);

#ifdef CONFIG_MMU_NOTIFIER static inline bool i915_gem_object_is_userptr(struct drm_i915_gem_object *obj) diff --
git

...
...
a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h

...
index fa2ba9e2a4d0..f0fb17be2f7a 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h +++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h @@ -56,6 +56,8 @@ struct drm_i915_gem_object_ops { struct sg_table *pages); void (*truncate)(struct drm_i915_gem_object *obj); void (*writeback)(struct drm_i915_gem_object *obj);
int (*shrinker_release_pages)(struct drm_i915_gem_object *obj,
                            bool should_writeback);
int (*pread)(struct drm_i915_gem_object *obj, const struct drm_i915_gem_pread *arg); diff --git
a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c

...
index 36b711ae9e28..19e55cc29a15 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c @@ -25,8 +25,8 @@ static void check_release_pagevec(struct pagevec

*pvec)

...
 cond_resched();
}

-static void shmem_free_st(struct sg_table *st, struct address_space
*mapping,

...
                bool dirty, bool backup)
+void shmem_free_st(struct sg_table *st, struct address_space
*mapping,

...
...
...
         bool dirty, bool backup)
{ struct sgt_iter sgt_iter; struct pagevec pvec;
@@ -52,10 +52,10 @@ static void shmem_free_st(struct sg_table *st,
struct

...
...
address_space *mapping,

...
 kfree(st);
}

-static struct sg_table *shmem_alloc_st(struct drm_i915_private *i915,
                             size_t size, struct intel_memory_region
*mr,

...
                             struct address_space *mapping,
                             unsigned int max_segment)
+struct sg_table *shmem_alloc_st(struct drm_i915_private *i915,
                      size_t size, struct intel_memory_region *mr,
                      struct address_space *mapping,
                      unsigned int max_segment)
{ const unsigned long page_count = size / PAGE_SIZE; unsigned long i;
@@ -300,7 +300,7 @@ shmem_truncate(struct drm_i915_gem_object
*obj)

...
...
...
 obj->mm.pages = ERR_PTR(-EFAULT);
}

-static void __shmem_writeback(size_t size, struct address_space
*mapping)

...
+void __shmem_writeback(size_t size, struct address_space *mapping) { struct writeback_control wbc = { .sync_mode = WB_SYNC_NONE, diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c

b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c

...
index e382b7f2353b..cc80bd23d323 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c @@ -56,19 +56,24 @@ static bool unsafe_drop_pages(struct

drm_i915_gem_object *obj,

...
 return false;
}

-static void try_to_writeback(struct drm_i915_gem_object *obj,
                   unsigned int flags)
+static int try_to_writeback(struct drm_i915_gem_object *obj, unsigned +int flags) {
if (obj->ops->shrinker_release_pages)
      return obj->ops->shrinker_release_pages(obj,
                                              flags &
I915_SHRINK_WRITEBACK);

...
switch (obj->mm.madv) { case I915_MADV_DONTNEED: i915_gem_object_truncate(obj);
      return;
      return 0;
case __I915_MADV_PURGED:
      return;
      return 0;
}

if (flags & I915_SHRINK_WRITEBACK) i915_gem_object_writeback(obj);
return 0; }

/**
@@ -222,8 +227,8 @@ i915_gem_shrink(struct i915_gem_ww_ctx *ww, }
                         if (!__i915_gem_object_put_pages(obj)) {
                              try_to_writeback(obj, shrink);
                              count += obj->base.size >>
PAGE_SHIFT;

...
                              if (!try_to_writeback(obj, shrink))
                                      count += obj->base.size >>
PAGE_SHIFT;

...
                         }
                         if (!ww)
                                 i915_gem_object_unlock(obj);
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c

...
index a77e90f300fe..c7402995a8f9 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c @@ -35,6 +35,8 @@ * @ttm: The base TTM page vector. * @dev: The struct device used for dma mapping and unmapping. * @cached_st: The cached scatter-gather table.

@is_shmem: Set if using shmem.

@filp: The shmem file, if using shmem backend.

Note that DMA may be going on right up to the point where the page-

vector is unpopulated in delayed destroy. Hence keep the @@ -46,6

+48,9 @@ struct i915_ttm_tt {

...
 struct ttm_tt ttm;
 struct device *dev;
 struct sg_table *cached_st;
bool is_shmem;

struct file *filp; };

static const struct ttm_place sys_placement_flags = { @@ -179,12
+184,90

...
...
@@ i915_ttm_placement_from_obj(const struct drm_i915_gem_object

*obj,

...
...
...
 placement->busy_placement = busy;
}

+static int i915_ttm_tt_shmem_populate(struct ttm_device *bdev,
                            struct ttm_tt *ttm,
                            struct ttm_operation_ctx *ctx) {
struct drm_i915_private *i915 = container_of(bdev, typeof(*i915),
bdev);

...

struct intel_memory_region *mr = i915->mm.regions[INTEL_MEMORY_SYSTEM];

struct i915_ttm_tt *i915_tt = container_of(ttm, typeof(*i915_tt), ttm);

...
const unsigned int max_segment = i915_sg_segment_size();

const size_t size = ttm->num_pages << PAGE_SHIFT;

struct file *filp = i915_tt->filp;

struct sgt_iter sgt_iter;

struct sg_table *st;

struct page *page;

unsigned long i;

int err;

if (!filp) {
      struct address_space *mapping;
      gfp_t mask;
      filp = shmem_file_setup("i915-shmem-tt", size, VM_NORESERVE);
      if (IS_ERR(filp))
              return PTR_ERR(filp);
      mask = GFP_HIGHUSER | __GFP_RECLAIMABLE;
      mapping = filp->f_mapping;
      mapping_set_gfp_mask(mapping, mask);
      GEM_BUG_ON(!(mapping_gfp_mask(mapping) & __GFP_RECLAIM));

...
      i915_tt->filp = filp;
}

st = shmem_alloc_st(i915, size, mr, filp->f_mapping, max_segment);

if (IS_ERR(st))
      return PTR_ERR(st);
err = dma_map_sg_attrs(i915_tt->dev,
                     st->sgl, st->nents,
                     PCI_DMA_BIDIRECTIONAL,
                     DMA_ATTR_SKIP_CPU_SYNC |
                     DMA_ATTR_NO_KERNEL_MAPPING |
                     DMA_ATTR_NO_WARN);
if (err <= 0) {
      err = -EINVAL;
      goto err_free_st;
}

i = 0;

for_each_sgt_page(page, sgt_iter, st)
      ttm->pages[i++] = page;
if (ttm->page_flags & TTM_TT_FLAG_SWAPPED)
      ttm->page_flags &= ~TTM_TT_FLAG_SWAPPED;
i915_tt->cached_st = st;

return 0;
+err_free_st:

shmem_free_st(st, filp->f_mapping, false, false);

return err;

+}

+static void i915_ttm_tt_shmem_unpopulate(struct ttm_tt *ttm) {

struct i915_ttm_tt *i915_tt = container_of(ttm, typeof(*i915_tt),
ttm);

...
bool backup = ttm->page_flags & TTM_TT_FLAG_SWAPPED;

dma_unmap_sg(i915_tt->dev, i915_tt->cached_st->sgl,
           i915_tt->cached_st->nents,
           PCI_DMA_BIDIRECTIONAL);
shmem_free_st(fetch_and_zero(&i915_tt->cached_st),
            file_inode(i915_tt->filp)->i_mapping,
            backup, backup);
Should we do something to undo the shmem_file_setup operation here? From its implementation, it takes a reference on the inode and allocates a file: https://elixir.bootlin.com/linux/latest/source/mm/shmem.c#L4084

...
Regards, Oak

Hi, Oak,

That's done in i915_ttm_tt_destroy() afaict.

/Thomas

Do we know whether this tt is backed by shmem at create time? If yes, I think a better place to do the shmem_file_setup is in i915_ttm_tt_create, which pairs with the fput in i915_ttm_tt_destroy. If we don't have such information at tt create time, I agree with the current approach.

IIRC tt_create is called even if the object will most likely just end up being placed in VRAM, so calling shmem_file_setup in there seemed potentially wasteful, since the shmem file might never even be needed. Hence keeping it in populate instead.
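The trade-off discussed above (creating the backing file lazily in populate, and releasing it in destroy) can be sketched in plain userspace C. This is a hedged illustration only: the demo_* types and functions are hypothetical stand-ins, not the i915 or shmem API, and the refcounted demo_file merely models the reference that shmem_file_setup() hands back.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a refcounted backing file. */
struct demo_file {
	int refcount;
};

/* Number of live backing files, kept only so the effect is observable. */
static int demo_files_alive;

static struct demo_file *demo_file_setup(void)
{
	struct demo_file *f = calloc(1, sizeof(*f));

	if (!f)
		return NULL;
	f->refcount = 1;
	demo_files_alive++;
	return f;
}

static void demo_fput(struct demo_file *f)
{
	if (--f->refcount == 0) {
		demo_files_alive--;
		free(f);
	}
}

struct demo_tt {
	bool is_shmem;
	bool populated;
	struct demo_file *filp;	/* created lazily, on first populate */
};

/* Create is cheap: it records the backend type but allocates no file. */
static struct demo_tt *demo_tt_create(bool is_shmem)
{
	struct demo_tt *tt = calloc(1, sizeof(*tt));

	if (tt)
		tt->is_shmem = is_shmem;
	return tt;
}

/* Populate sets up the file only the first time system pages are needed. */
static int demo_tt_populate(struct demo_tt *tt)
{
	if (tt->is_shmem && !tt->filp) {
		tt->filp = demo_file_setup();
		if (!tt->filp)
			return -1;
	}
	tt->populated = true;
	return 0;
}

/* Unpopulate drops the pages but keeps the file for a later populate. */
static void demo_tt_unpopulate(struct demo_tt *tt)
{
	tt->populated = false;
}

/* Destroy pairs with the lazy setup and drops the file reference, if any. */
static void demo_tt_destroy(struct demo_tt *tt)
{
	if (tt->filp)
		demo_fput(tt->filp);
	free(tt);
}
```

In this model an object that is created and destroyed without ever being populated in system memory never allocates a file at all, which is the waste the populate-time setup is meant to avoid.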

...
Regards, Oak

...
...
+}


Matthew Auld

27 Sep 27 Sep

11:41 a.m.

New subject: [PATCH v5 10/13] drm/i915: try to simplify make_{un}shrinkable

Drop the atomic shrink_pin stuff, and just have make_{un}shrinkable update the shrinker-visible lists immediately. This at least simplifies the next patch, and does make the behaviour more obvious. The potential downside is that make_unshrinkable now grabs a global lock even when the object itself is no longer shrinkable (transitioning from purgeable <-> shrinkable doesn't seem to be a thing); for example, in the ppGTT insertion paths we should now be careful not to needlessly call make_unshrinkable multiple times. Outside of that there is some fallout in intel_context, which relies on nesting calls to shrink_pin.
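As a rough illustration of the "update the lists immediately" behaviour described above, here is a minimal userspace sketch. It is an assumption-laden model, not the driver code: the demo_* names are hypothetical, the tiny list helpers only imitate the kernel's list_head, and the global lock (i915->mm.obj_lock in the patch) is omitted entirely.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular doubly linked list, loosely modelled on list_head. */
struct demo_list {
	struct demo_list *prev, *next;
};

static void demo_list_init(struct demo_list *l)
{
	l->prev = l->next = l;
}

static int demo_list_empty(const struct demo_list *l)
{
	return l->next == l;
}

static void demo_list_del_init(struct demo_list *e)
{
	e->prev->next = e->next;
	e->next->prev = e->prev;
	demo_list_init(e);
}

static void demo_list_add_tail(struct demo_list *e, struct demo_list *head)
{
	e->prev = head->prev;
	e->next = head;
	head->prev->next = e;
	head->prev = e;
}

/* Hypothetical shrinker bookkeeping: two lists plus running totals. */
struct demo_mm {
	struct demo_list shrink_list, purge_list;
	unsigned long shrink_count;
	size_t shrink_memory;
};

struct demo_obj {
	struct demo_list link;	/* empty when hidden from the shrinker */
	size_t size;
};

static void demo_mm_init(struct demo_mm *mm)
{
	demo_list_init(&mm->shrink_list);
	demo_list_init(&mm->purge_list);
	mm->shrink_count = 0;
	mm->shrink_memory = 0;
}

static void demo_obj_init(struct demo_obj *obj, size_t size)
{
	demo_list_init(&obj->link);
	obj->size = size;
}

/* Hide the object; a second call finds an empty link and does nothing. */
static void demo_make_unshrinkable(struct demo_mm *mm, struct demo_obj *obj)
{
	if (!demo_list_empty(&obj->link)) {
		demo_list_del_init(&obj->link);
		mm->shrink_count--;
		mm->shrink_memory -= obj->size;
	}
}

/* Put the object at the tail of @head, accounting for it only once. */
static void demo_make_visible(struct demo_mm *mm, struct demo_obj *obj,
			      struct demo_list *head)
{
	if (demo_list_empty(&obj->link)) {
		mm->shrink_count++;
		mm->shrink_memory += obj->size;
	} else {
		/* Already tracked: just move it, like list_move_tail(). */
		demo_list_del_init(&obj->link);
	}
	demo_list_add_tail(&obj->link, head);
}
```

Because list membership is the only state, repeated calls are idempotent for the counters, but each call would still take the global lock in the real driver, which is the cost the commit message warns about.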

Signed-off-by: Matthew Auld matthew.auld@intel.com Cc: Thomas Hellström thomas.hellstrom@linux.intel.com --- drivers/gpu/drm/i915/gem/i915_gem_object.c | 9 ---- .../gpu/drm/i915/gem/i915_gem_object_types.h | 3 +- drivers/gpu/drm/i915/gem/i915_gem_pages.c | 16 +----- drivers/gpu/drm/i915/gem/i915_gem_shrinker.c | 52 +++++++++++++------ drivers/gpu/drm/i915/gt/gen6_ppgtt.c | 1 - drivers/gpu/drm/i915/gt/gen8_ppgtt.c | 1 - drivers/gpu/drm/i915/gt/intel_context.c | 9 +--- 7 files changed, 41 insertions(+), 50 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c b/drivers/gpu/drm/i915/gem/i915_gem_object.c index 6fb9afb65034..e8265a432fcb 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c @@ -305,15 +305,6 @@ static void i915_gem_free_object(struct drm_gem_object *gem_obj) */ atomic_inc(&i915->mm.free_count);

- /* - * This serializes freeing with the shrinker. Since the free - * is delayed, first by RCU then by the workqueue, we want the - * shrinker to be able to free pages of unreferenced objects, - * or else we may oom whilst there are plenty of deferred - * freed objects. - */ - i915_gem_object_make_unshrinkable(obj); - /* * Since we require blocking on struct_mutex to unbind the freed * object from the GPU before releasing resources back to the diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h index f0fb17be2f7a..e4f8a6774da8 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h +++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h @@ -461,7 +461,6 @@ struct drm_i915_gem_object { * instead go through the pin/unpin interfaces. */ atomic_t pages_pin_count; - atomic_t shrink_pin;

/** * Priority list of potential placements for this object. @@ -522,7 +521,7 @@ struct drm_i915_gem_object { struct i915_gem_object_page_iter get_dma_page;

/** - * Element within i915->mm.unbound_list or i915->mm.bound_list, + * Element within i915->mm.shrink_list or i915->mm.purge_list, * locked by i915->mm.obj_lock. */ struct list_head link; diff --git a/drivers/gpu/drm/i915/gem/i915_gem_pages.c b/drivers/gpu/drm/i915/gem/i915_gem_pages.c index 8eb1c3a6fc9c..f0df1394d7f6 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_pages.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_pages.c @@ -64,28 +64,16 @@ void __i915_gem_object_set_pages(struct drm_i915_gem_object *obj, GEM_BUG_ON(i915_gem_object_has_tiling_quirk(obj)); i915_gem_object_set_tiling_quirk(obj); GEM_BUG_ON(!list_empty(&obj->mm.link)); - atomic_inc(&obj->mm.shrink_pin); shrinkable = false; }

if (shrinkable) { - struct list_head *list; - unsigned long flags; - assert_object_held(obj); - spin_lock_irqsave(&i915->mm.obj_lock, flags); - - i915->mm.shrink_count++; - i915->mm.shrink_memory += obj->base.size;

if (obj->mm.madv != I915_MADV_WILLNEED) - list = &i915->mm.purge_list; + i915_gem_object_make_purgeable(obj); else - list = &i915->mm.shrink_list; - list_add_tail(&obj->mm.link, list); - - atomic_set(&obj->mm.shrink_pin, 0); - spin_unlock_irqrestore(&i915->mm.obj_lock, flags); + i915_gem_object_make_shrinkable(obj); } }

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c index cc80bd23d323..0440696f786a 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c @@ -460,23 +460,26 @@ void i915_gem_shrinker_taints_mutex(struct drm_i915_private *i915,

#define obj_to_i915(obj__) to_i915((obj__)->base.dev)

+/** + * i915_gem_object_make_unshrinkable - Hide the object from the shrinker. By + * default all object types that support shrinking(see IS_SHRINKABLE), will also + * make the object visible to the shrinker after allocating the system memory + * pages. + * @obj: The GEM object. + * + * This is typically used for special kernel internal objects that can't be + * easily processed by the shrinker, like if they are perma-pinned. + */ void i915_gem_object_make_unshrinkable(struct drm_i915_gem_object *obj) { struct drm_i915_private *i915 = obj_to_i915(obj); unsigned long flags;

- /* - * We can only be called while the pages are pinned or when - * the pages are released. If pinned, we should only be called - * from a single caller under controlled conditions; and on release - * only one caller may release us. Neither the two may cross. - */ - if (atomic_add_unless(&obj->mm.shrink_pin, 1, 0)) + if (!i915_gem_object_is_shrinkable(obj)) return;

spin_lock_irqsave(&i915->mm.obj_lock, flags); - if (!atomic_fetch_inc(&obj->mm.shrink_pin) && - !list_empty(&obj->mm.link)) { + if (!list_empty(&obj->mm.link)) { list_del_init(&obj->mm.link); i915->mm.shrink_count--; i915->mm.shrink_memory -= obj->base.size; @@ -494,28 +497,45 @@ static void __i915_gem_object_make_shrinkable(struct drm_i915_gem_object *obj, if (!i915_gem_object_is_shrinkable(obj)) return;

- if (atomic_add_unless(&obj->mm.shrink_pin, -1, 1)) - return; - spin_lock_irqsave(&i915->mm.obj_lock, flags); + GEM_BUG_ON(!kref_read(&obj->base.refcount)); - if (atomic_dec_and_test(&obj->mm.shrink_pin)) { - GEM_BUG_ON(!list_empty(&obj->mm.link));

- list_add_tail(&obj->mm.link, head); + if (list_empty(&obj->mm.link)) { i915->mm.shrink_count++; i915->mm.shrink_memory += obj->base.size; - + list_add_tail(&obj->mm.link, head); + } else { + list_move_tail(&obj->mm.link, head); } + spin_unlock_irqrestore(&i915->mm.obj_lock, flags); }

+ +/** + * i915_gem_object_make_shrinkable - Move the object to the tail of the + * shrinkable list. Objects on this list might be swapped out. Used with + * WILLNEED objects. + * @obj: The GEM object. + * + * Should only be called on objects which have backing pages. + */ void i915_gem_object_make_shrinkable(struct drm_i915_gem_object *obj) { __i915_gem_object_make_shrinkable(obj, &obj_to_i915(obj)->mm.shrink_list); }

+/** + * i915_gem_object_make_purgeable - Move the object to the tail of the purgeable + * list. Used with DONTNEED objects. Unlike with shrinkable objects, the + * shrinker will attempt to discard the backing pages, instead of trying to swap + * them out. + * @obj: The GEM object. + * + * Should only be called on objects which have backing pages. + */ void i915_gem_object_make_purgeable(struct drm_i915_gem_object *obj) { __i915_gem_object_make_shrinkable(obj, diff --git a/drivers/gpu/drm/i915/gt/gen6_ppgtt.c b/drivers/gpu/drm/i915/gt/gen6_ppgtt.c index 890191f286e3..baea9770200a 100644 --- a/drivers/gpu/drm/i915/gt/gen6_ppgtt.c +++ b/drivers/gpu/drm/i915/gt/gen6_ppgtt.c @@ -185,7 +185,6 @@ static void gen6_alloc_va_range(struct i915_address_space *vm,

pt = stash->pt[0]; __i915_gem_object_pin_pages(pt->base); - i915_gem_object_make_unshrinkable(pt->base);

fill32_px(pt, vm->scratch[0]->encode);

diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c index 037a9a6e4889..8af2f709571c 100644 --- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c +++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c @@ -301,7 +301,6 @@ static void __gen8_ppgtt_alloc(struct i915_address_space * const vm,

pt = stash->pt[!!lvl]; __i915_gem_object_pin_pages(pt->base); - i915_gem_object_make_unshrinkable(pt->base);

fill_px(pt, vm->scratch[lvl]->encode);

diff --git a/drivers/gpu/drm/i915/gt/intel_context.c b/drivers/gpu/drm/i915/gt/intel_context.c index ff637147b1a9..1b7dc57e6ec1 100644 --- a/drivers/gpu/drm/i915/gt/intel_context.c +++ b/drivers/gpu/drm/i915/gt/intel_context.c @@ -111,11 +111,6 @@ static int __context_pin_state(struct i915_vma *vma, struct i915_gem_ww_ctx *ww) if (err) goto err_unpin;

- /* - * And mark it as a globally pinned object to let the shrinker know - * it cannot reclaim the object until we release it. - */ - i915_vma_make_unshrinkable(vma); vma->obj->mm.dirty = true;

return 0; @@ -127,7 +122,6 @@ static int __context_pin_state(struct i915_vma *vma, struct i915_gem_ww_ctx *ww)

static void __context_unpin_state(struct i915_vma *vma) { - i915_vma_make_shrinkable(vma); i915_active_release(&vma->active); __i915_vma_unpin(vma); } @@ -180,7 +174,6 @@ static int intel_context_pre_pin(struct intel_context *ce, if (err) goto err_timeline;

- return 0;

err_timeline: @@ -338,6 +331,8 @@ static void __intel_context_retire(struct i915_active *active)

set_bit(CONTEXT_VALID_BIT, &ce->flags); intel_context_post_unpin(ce); + if (ce->state) + i915_vma_make_shrinkable(ce->state); intel_context_put(ce); }

-- 2.26.3
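
The list handling that this patch introduces in make_{un}shrinkable can be sketched in userspace as follows. This is only a rough model, not the driver code: `struct model_mm`, `struct model_obj` and the `model_*` helpers are hypothetical stand-ins for i915->mm and the GEM object, the list helpers are re-implemented locally, and the i915->mm.obj_lock spinlock is left out entirely.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular doubly linked list, modelled on the kernel's list_head. */
struct list_head { struct list_head *next, *prev; };

static void list_init(struct list_head *h) { h->next = h; h->prev = h; }
static int list_empty(const struct list_head *h) { return h->next == h; }

static void list_del_init(struct list_head *e)
{
	e->prev->next = e->next;
	e->next->prev = e->prev;
	list_init(e);
}

static void list_add_tail(struct list_head *e, struct list_head *head)
{
	e->prev = head->prev;
	e->next = head;
	head->prev->next = e;
	head->prev = e;
}

/* Hypothetical stand-ins for i915->mm and a GEM object. */
struct model_mm {
	struct list_head shrink_list;
	struct list_head purge_list;
	unsigned long shrink_count;
	unsigned long shrink_memory;
};

struct model_obj {
	struct list_head link;
	unsigned long size;
};

static void model_mm_init(struct model_mm *mm)
{
	list_init(&mm->shrink_list);
	list_init(&mm->purge_list);
	mm->shrink_count = 0;
	mm->shrink_memory = 0;
}

static void model_obj_init(struct model_obj *obj, unsigned long size)
{
	list_init(&obj->link);
	obj->size = size;
}

/* Account the object once, then only move it between or within lists. */
static void model_make_shrinkable(struct model_mm *mm, struct model_obj *obj,
				  struct list_head *head)
{
	if (list_empty(&obj->link)) {
		mm->shrink_count++;
		mm->shrink_memory += obj->size;
		list_add_tail(&obj->link, head);
	} else {
		/* Equivalent of list_move_tail(): no accounting change. */
		list_del_init(&obj->link);
		list_add_tail(&obj->link, head);
	}
}

/* Safe to call repeatedly: only a tracked object is unaccounted. */
static void model_make_unshrinkable(struct model_mm *mm, struct model_obj *obj)
{
	if (!list_empty(&obj->link)) {
		assert(mm->shrink_count > 0);
		list_del_init(&obj->link);
		mm->shrink_count--;
		mm->shrink_memory -= obj->size;
	}
}
```

In this model, calling `model_make_shrinkable()` twice with different list heads moves the object rather than counting it twice, and a repeated `model_make_unshrinkable()` is a no-op. Both properties are what let the patch drop the shrink_pin counter.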

Thomas Hellström

29 Sep 29 Sep

1 p.m.

New subject: [PATCH v5 10/13] drm/i915: try to simplify make_{un}shrinkable

On Mon, 2021-09-27 at 12:41 +0100, Matthew Auld wrote:

...

Drop the atomic shrink_pin stuff, and just have make_{un}shrinkable update the shrinker visible lists immediately. This at least simplifies the next patch, and does make the behaviour more obvious. The potential downside is that make_unshrinkable now grabs a global lock even when the object itself is no longer shrinkable (transitioning from purgeable <-> shrinkable doesn't seem to be a thing), for example in the ppGTT insertion paths we should now be careful not to needlessly call make_unshrinkable multiple times. Outside of that there is some fallout in intel_context which relies on nesting calls to shrink_pin.

Signed-off-by: Matthew Auld matthew.auld@intel.com Cc: Thomas Hellström thomas.hellstrom@linux.intel.com

Hmm. One thing that worries me a bit here: let's say we have, for example, an LMEM context state, and TTM has made it unshrinkable. Then the context becomes active and calls _make_unshrinkable again, and when it retires it calls _make_shrinkable. Doesn't it end up on the shrinker list at that point, even if still in LMEM?

/Thomas
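
The scenario raised above can be illustrated with a deliberately simplified model. It assumes the old scheme behaves like a plain nesting counter (the real shrink_pin logic used atomic_add_unless and was more subtle) and that the new scheme is "last call wins". All `counted_*` and `immediate_*` names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified old scheme: nested make_unshrinkable calls are counted. */
struct counted_obj { int shrink_pin; };

static void counted_make_unshrinkable(struct counted_obj *o)
{
	o->shrink_pin++;
}

static void counted_make_shrinkable(struct counted_obj *o)
{
	assert(o->shrink_pin > 0);
	o->shrink_pin--;
}

static bool counted_visible(const struct counted_obj *o)
{
	return o->shrink_pin == 0;
}

/* New scheme: each call takes effect immediately, with no nesting. */
struct immediate_obj { bool on_list; };

static void immediate_make_unshrinkable(struct immediate_obj *o)
{
	o->on_list = false;
}

static void immediate_make_shrinkable(struct immediate_obj *o)
{
	o->on_list = true;
}

static bool immediate_visible(const struct immediate_obj *o)
{
	return o->on_list;
}
```

With two unshrinkable calls (one from TTM, one from the context pin) followed by a single shrinkable call on retire, the counted model keeps the object hidden, while the immediate model makes it visible again. That difference is the behaviour being questioned.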

Matthew Auld

27 Sep 27 Sep

11:41 a.m.

New subject: [PATCH v5 11/13] drm/i915/ttm: make evicted shmem pages visible to the shrinker

We currently just evict lmem objects to system memory when under memory pressure, and in the next patch we want to use the shmem backend even for this case. For this case we lack the usual object mm.pages, which effectively hides the pages from the i915-gem shrinker, until we actually "attach" the TT to the object, or in the case of lmem-only objects it just gets migrated back to lmem when touched again.

For all cases we can just adjust the i915 shrinker LRU each time we also adjust the TTM LRU. The two cases we care about are:

1) When something is moved by TTM, including when initially populating an object. Importantly this covers the case where TTM moves something from lmem <-> smem, outside of the normal get_pages() interface, which should still ensure the shmem pages underneath are reclaimable.

2) When calling into i915_gem_object_unlock(). The unlock should ensure the object is removed from the shrinker LRU, if it was indeed swapped out, or just purged, when the shrinker drops the object lock.

We can optimise this(if needed) by tracking if the object is already visible to the shrinker(protected by the object lock), so we don't touch the shrinker LRU more than needed.

v2(Thomas) - Handle managing the shrinker LRU in adjust_lru, where it is always safe to touch the object.

Signed-off-by: Matthew Auld matthew.auld@intel.com Cc: Thomas Hellström thomas.hellstrom@linux.intel.com --- drivers/gpu/drm/i915/gem/i915_gem_object.h | 1 + drivers/gpu/drm/i915/gem/i915_gem_shrinker.c | 29 +++++++++++++++----- drivers/gpu/drm/i915/gem/i915_gem_ttm.c | 28 +++++++++++++++---- 3 files changed, 46 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h index 1c9a1d8d3434..640dfbf1f01e 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object.h +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h @@ -523,6 +523,7 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,

void i915_gem_object_make_unshrinkable(struct drm_i915_gem_object *obj); void i915_gem_object_make_shrinkable(struct drm_i915_gem_object *obj); +void __i915_gem_object_make_shrinkable(struct drm_i915_gem_object *obj); void i915_gem_object_make_purgeable(struct drm_i915_gem_object *obj);

static inline bool cpu_write_needs_clflush(struct drm_i915_gem_object *obj) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c index 0440696f786a..4b6b2bb6f180 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c @@ -487,13 +487,12 @@ void i915_gem_object_make_unshrinkable(struct drm_i915_gem_object *obj) spin_unlock_irqrestore(&i915->mm.obj_lock, flags); }

-static void __i915_gem_object_make_shrinkable(struct drm_i915_gem_object *obj, - struct list_head *head) +static void ___i915_gem_object_make_shrinkable(struct drm_i915_gem_object *obj, + struct list_head *head) { struct drm_i915_private *i915 = obj_to_i915(obj); unsigned long flags;

- GEM_BUG_ON(!i915_gem_object_has_pages(obj)); if (!i915_gem_object_is_shrinkable(obj)) return;

@@ -512,6 +511,21 @@ static void __i915_gem_object_make_shrinkable(struct drm_i915_gem_object *obj, spin_unlock_irqrestore(&i915->mm.obj_lock, flags); }

+/** + * __i915_gem_object_make_shrinkable - Move the object to the tail of the + * shrinkable list. Objects on this list might be swapped out. Used with + * WILLNEED objects. + * @obj: The GEM object. + * + * DO NOT USE. This is intended to be called on very special objects that don't + * yet have mm.pages, but are guaranteed to have potentially reclaimable pages + * underneath. + */ +void __i915_gem_object_make_shrinkable(struct drm_i915_gem_object *obj) +{ + ___i915_gem_object_make_shrinkable(obj, + &obj_to_i915(obj)->mm.shrink_list); +}

/** * i915_gem_object_make_shrinkable - Move the object to the tail of the @@ -523,8 +537,8 @@ static void __i915_gem_object_make_shrinkable(struct drm_i915_gem_object *obj, */ void i915_gem_object_make_shrinkable(struct drm_i915_gem_object *obj) { - __i915_gem_object_make_shrinkable(obj, - &obj_to_i915(obj)->mm.shrink_list); + GEM_BUG_ON(!i915_gem_object_has_pages(obj)); + __i915_gem_object_make_shrinkable(obj); }

/** @@ -538,6 +552,7 @@ void i915_gem_object_make_shrinkable(struct drm_i915_gem_object *obj) */ void i915_gem_object_make_purgeable(struct drm_i915_gem_object *obj) { - __i915_gem_object_make_shrinkable(obj, - &obj_to_i915(obj)->mm.purge_list); + GEM_BUG_ON(!i915_gem_object_has_pages(obj)); + ___i915_gem_object_make_shrinkable(obj, + &obj_to_i915(obj)->mm.purge_list); } diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c index c7402995a8f9..194e5f1deda8 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c @@ -749,6 +749,8 @@ static int i915_ttm_move(struct ttm_buffer_object *bo, bool evict, return ret; }

+ i915_ttm_adjust_lru(obj); + dst_st = i915_ttm_resource_get_st(obj, dst_mem); if (IS_ERR(dst_st)) return PTR_ERR(dst_st); @@ -856,7 +858,6 @@ static int __i915_ttm_get_pages(struct drm_i915_gem_object *obj, return i915_ttm_err_to_gem(ret); }

- i915_ttm_adjust_lru(obj); if (bo->ttm && !ttm_tt_is_populated(bo->ttm)) { ret = ttm_tt_populate(bo->bdev, bo->ttm, &ctx); if (ret) @@ -876,10 +877,10 @@ static int __i915_ttm_get_pages(struct drm_i915_gem_object *obj, return PTR_ERR(st);

__i915_gem_object_set_pages(obj, st, i915_sg_dma_sizes(st->sgl)); - if (!bo->ttm || !i915_tt->is_shmem) - i915_gem_object_make_unshrinkable(obj); }

+ i915_ttm_adjust_lru(obj); + return ret; }

@@ -950,8 +951,6 @@ static void i915_ttm_put_pages(struct drm_i915_gem_object *obj, * If the object is not destroyed next, The TTM eviction logic * and shrinkers will move it out if needed. */ - - i915_ttm_adjust_lru(obj); }

static void i915_ttm_adjust_lru(struct drm_i915_gem_object *obj) @@ -967,6 +966,17 @@ static void i915_ttm_adjust_lru(struct drm_i915_gem_object *obj) if (!kref_read(&bo->kref)) return;

+ /* + * Even if we lack mm.pages for this object(which will be the case when + * something is evicted to system memory by TTM), we still want to make + * this object visible to the shrinker, since the underlying ttm_tt + * still has the real shmem pages. + */ + if (bo->ttm && i915_tt->filp && ttm_tt_is_populated(bo->ttm)) + __i915_gem_object_make_shrinkable(obj); + else + i915_gem_object_make_unshrinkable(obj); + /* * Put on the correct LRU list depending on the MADV status */ @@ -1006,6 +1016,14 @@ static void i915_ttm_adjust_lru(struct drm_i915_gem_object *obj) static void i915_ttm_delayed_free(struct drm_i915_gem_object *obj) { if (obj->ttm.created) { + /* + * We freely manage the shrinker LRU outide of the mm.pages life + * cycle. As a result when destroying the object it's up to us + * to ensure we remove it from the LRU, before we free the + * object. + */ + i915_gem_object_make_unshrinkable(obj); + ttm_bo_put(i915_gem_to_ttm(obj)); } else { __i915_gem_free_object(obj);

-- 2.26.3
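
The visibility rule added to i915_ttm_adjust_lru() in this patch, together with the optional "track whether the object is already visible" optimisation mentioned in the commit message, might look roughly like this in isolation. The structs and `model_*` names are hypothetical, `has_filp` and `populated` stand in for i915_tt->filp and ttm_tt_is_populated(), and the tracking flag is an assumption about how the optimisation could be done. It is not part of the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the shmem-backed ttm_tt. */
struct model_tt {
	bool has_filp;   /* backed by a shmem file */
	bool populated;  /* pages currently allocated */
};

struct model_obj {
	struct model_tt *ttm;      /* NULL while there is no system-memory tt */
	bool shrinker_visible;     /* assumed to be protected by the object lock */
	unsigned int lru_updates;  /* counts how often the shrinker LRU is touched */
};

/* Mirrors the condition: bo->ttm && i915_tt->filp && populated. */
static bool model_wants_shrinker(const struct model_obj *obj)
{
	assert(obj != NULL);
	return obj->ttm && obj->ttm->has_filp && obj->ttm->populated;
}

/* Only touch the (modelled) shrinker LRU when the state actually changes. */
static void model_adjust_lru(struct model_obj *obj)
{
	bool want = model_wants_shrinker(obj);

	if (want == obj->shrinker_visible)
		return;

	obj->shrinker_visible = want;
	obj->lru_updates++;
}
```

An object without a populated shmem tt stays hidden, an evicted object with real shmem pages becomes visible even though it has no mm.pages, and repeated calls in the same state leave the LRU alone.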

Thomas Hellström

29 Sep 29 Sep

11:47 a.m.

New subject: [PATCH v5 11/13] drm/i915/ttm: make evicted shmem pages visible to the shrinker

On Mon, 2021-09-27 at 12:41 +0100, Matthew Auld wrote:

...

We currently just evict lmem objects to system memory when under memory pressure, and in the next patch we want to use the shmem backend even for this case. For this case we lack the usual object mm.pages, which effectively hides the pages from the i915-gem shrinker, until we actually "attach" the TT to the object, or in the case of lmem-only objects it just gets migrated back to lmem when touched again.

For all cases we can just adjust the i915 shrinker LRU each time we also adjust the TTM LRU. The two cases we care about are:

1) When something is moved by TTM, including when initially populating an object. Importantly this covers the case where TTM moves something from lmem <-> smem, outside of the normal get_pages() interface, which should still ensure the shmem pages underneath are reclaimable.

2) When calling into i915_gem_object_unlock(). The unlock should ensure the object is removed from the shrinker LRU, if it was indeed swapped out, or just purged, when the shrinker drops the object lock.

We can optimise this(if needed) by tracking if the object is already visible to the shrinker(protected by the object lock), so we don't touch the shrinker LRU more than needed.

v2(Thomas) - Handle managing the shrinker LRU in adjust_lru, where it is always safe to touch the object.

Signed-off-by: Matthew Auld matthew.auld@intel.com Cc: Thomas Hellström thomas.hellstrom@linux.intel.com

drivers/gpu/drm/i915/gem/i915_gem_object.h | 1 + drivers/gpu/drm/i915/gem/i915_gem_shrinker.c | 29 +++++++++++++++----- drivers/gpu/drm/i915/gem/i915_gem_ttm.c | 28 +++++++++++++++---- 3 files changed, 46 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h index 1c9a1d8d3434..640dfbf1f01e 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object.h +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h @@ -523,6 +523,7 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, void i915_gem_object_make_unshrinkable(struct drm_i915_gem_object *obj); void i915_gem_object_make_shrinkable(struct drm_i915_gem_object *obj); +void __i915_gem_object_make_shrinkable(struct drm_i915_gem_object *obj); void i915_gem_object_make_purgeable(struct drm_i915_gem_object *obj); static inline bool cpu_write_needs_clflush(struct drm_i915_gem_object *obj) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c index 0440696f786a..4b6b2bb6f180 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c @@ -487,13 +487,12 @@ void i915_gem_object_make_unshrinkable(struct drm_i915_gem_object *obj)         spin_unlock_irqrestore(&i915->mm.obj_lock, flags); } -static void __i915_gem_object_make_shrinkable(struct drm_i915_gem_object *obj, -                                             struct list_head *head) +static void ___i915_gem_object_make_shrinkable(struct drm_i915_gem_object *obj, +                                              struct list_head *head) {         struct drm_i915_private *i915 = obj_to_i915(obj);         unsigned long flags; -       GEM_BUG_ON(!i915_gem_object_has_pages(obj));         if (!i915_gem_object_is_shrinkable(obj))                 return; @@ -512,6 +511,21 @@ static void __i915_gem_object_make_shrinkable(struct drm_i915_gem_object *obj,         spin_unlock_irqrestore(&i915->mm.obj_lock, flags); } +/**

+ * __i915_gem_object_make_shrinkable - Move the object to the tail of the + * shrinkable list. Objects on this list might be swapped out. Used with + * WILLNEED objects. + * @obj: The GEM object. + * + * DO NOT USE. This is intended to be called on very special objects that don't + * yet have mm.pages, but are guaranteed to have potentially reclaimable pages + * underneath. + */ +void __i915_gem_object_make_shrinkable(struct drm_i915_gem_object *obj) +{ +       ___i915_gem_object_make_shrinkable(obj, +                                          &obj_to_i915(obj)->mm.shrink_list); +} /** * i915_gem_object_make_shrinkable - Move the object to the tail of the @@ -523,8 +537,8 @@ static void __i915_gem_object_make_shrinkable(struct drm_i915_gem_object *obj, */ void i915_gem_object_make_shrinkable(struct drm_i915_gem_object *obj) { -       __i915_gem_object_make_shrinkable(obj, -                                         &obj_to_i915(obj)->mm.shrink_list); +       GEM_BUG_ON(!i915_gem_object_has_pages(obj)); +       __i915_gem_object_make_shrinkable(obj); } /** @@ -538,6 +552,7 @@ void i915_gem_object_make_shrinkable(struct drm_i915_gem_object *obj) */ void i915_gem_object_make_purgeable(struct drm_i915_gem_object *obj) { -       __i915_gem_object_make_shrinkable(obj, -                                         &obj_to_i915(obj)->mm.purge_list); +       GEM_BUG_ON(!i915_gem_object_has_pages(obj)); +       ___i915_gem_object_make_shrinkable(obj, +                                          &obj_to_i915(obj)->mm.purge_list);

} diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c index c7402995a8f9..194e5f1deda8 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c @@ -749,6 +749,8 @@ static int i915_ttm_move(struct ttm_buffer_object *bo, bool evict,                         return ret;         } +       i915_ttm_adjust_lru(obj);

This will put the object on the shrinker list a little earlier than if we rely on the adjust_lru() from object_unlock() only, but is that strictly necessary? I figure even if the shrinker picks the object up, it will fail in the object trylock and ignore the object, until we call object_unlock() anyway?

...

dst_st = i915_ttm_resource_get_st(obj, dst_mem);         if (IS_ERR(dst_st))                 return PTR_ERR(dst_st); @@ -856,7 +858,6 @@ static int __i915_ttm_get_pages(struct drm_i915_gem_object *obj,                         return i915_ttm_err_to_gem(ret);         } -       i915_ttm_adjust_lru(obj);         if (bo->ttm && !ttm_tt_is_populated(bo->ttm)) {                 ret = ttm_tt_populate(bo->bdev, bo->ttm, &ctx);                 if (ret) @@ -876,10 +877,10 @@ static int __i915_ttm_get_pages(struct drm_i915_gem_object *obj,                         return PTR_ERR(st);                 __i915_gem_object_set_pages(obj, st, i915_sg_dma_sizes(st->sgl)); -               if (!bo->ttm || !i915_tt->is_shmem) -                       i915_gem_object_make_unshrinkable(obj);         } +       i915_ttm_adjust_lru(obj);

return ret; } @@ -950,8 +951,6 @@ static void i915_ttm_put_pages(struct drm_i915_gem_object *obj,          * If the object is not destroyed next, The TTM eviction logic          * and shrinkers will move it out if needed.          */

-       i915_ttm_adjust_lru(obj); } static void i915_ttm_adjust_lru(struct drm_i915_gem_object *obj) @@ -967,6 +966,17 @@ static void i915_ttm_adjust_lru(struct drm_i915_gem_object *obj)         if (!kref_read(&bo->kref))                 return; +       /* +        * Even if we lack mm.pages for this object(which will be the case when +        * something is evicted to system memory by TTM), we still want to make +        * this object visible to the shrinker, since the underlying ttm_tt +        * still has the real shmem pages. +        */ +       if (bo->ttm && i915_tt->filp && ttm_tt_is_populated(bo->ttm)) +               __i915_gem_object_make_shrinkable(obj); +       else +               i915_gem_object_make_unshrinkable(obj);

/*          * Put on the correct LRU list depending on the MADV status          */ @@ -1006,6 +1016,14 @@ static void i915_ttm_adjust_lru(struct drm_i915_gem_object *obj) static void i915_ttm_delayed_free(struct drm_i915_gem_object *obj) {         if (obj->ttm.created) { +               /* +                * We freely manage the shrinker LRU outide of the mm.pages life +                * cycle. As a result when destroying the object it's up to us +                * to ensure we remove it from the LRU, before we free the +                * object. +                */ +               i915_gem_object_make_unshrinkable(obj);

I guess this is not *strictly* necessary at this point, since the shrinker has a kref_get_unless_zero() guard, but I guess we need to remove the object from the shrinker LRU at some point during destruction anyway.

Reviewed-by: Thomas Hellström thomas.hellstrom@linux.intel.com

...

ttm_bo_put(i915_gem_to_ttm(obj)); } else { __i915_gem_free_object(obj);
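
The two guards mentioned in the review comments above (the shrinker skips an object when the object trylock fails, and it skips an object whose refcount has already dropped to zero) can be sketched in userspace with C11 atomics. This is an illustrative model only: `struct model_obj` and the `model_*` helpers are hypothetical, an `atomic_flag` stands in for the object lock, and the actual reclaim step is left as a comment.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct model_obj {
	atomic_uint refcount;
	atomic_flag lock;  /* stands in for the object lock */
};

static void model_obj_init(struct model_obj *obj, unsigned int refs)
{
	atomic_init(&obj->refcount, refs);
	atomic_flag_clear(&obj->lock);
}

/* Take a reference only if the object is not already being destroyed. */
static bool model_get_unless_zero(struct model_obj *obj)
{
	unsigned int old = atomic_load(&obj->refcount);

	while (old != 0) {
		/* On failure, old is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&obj->refcount, &old, old + 1))
			return true;
	}
	return false;
}

static void model_put(struct model_obj *obj)
{
	unsigned int old = atomic_fetch_sub(&obj->refcount, 1);

	assert(old > 0);
}

static bool model_trylock(struct model_obj *obj)
{
	return !atomic_flag_test_and_set(&obj->lock);
}

static void model_unlock(struct model_obj *obj)
{
	atomic_flag_clear(&obj->lock);
}

/* Returns true only if the scan was able to process the object. */
static bool model_shrinker_try_scan(struct model_obj *obj)
{
	if (!model_get_unless_zero(obj))
		return false;  /* object is on its way out */

	if (!model_trylock(obj)) {
		model_put(obj);
		return false;  /* someone else holds the lock, skip for now */
	}

	/* Reclaim would happen here, under the object lock. */

	model_unlock(obj);
	model_put(obj);
	return true;
}
```

In this model an object that is on the list early, while its mover still holds the lock, is simply skipped until the lock is dropped, and a dying object is never touched.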

Matthew Auld

27 Sep 27 Sep

11:41 a.m.

New subject: [PATCH v5 12/13] drm/i915/ttm: use cached system pages when evicting lmem

This should let us do an accelerated copy directly to the shmem pages when temporarily moving lmem-only objects, where the i915-gem shrinker can later kick in to swap out the pages, if needed.

Signed-off-by: Matthew Auld matthew.auld@intel.com Cc: Thomas Hellström thomas.hellstrom@linux.intel.com --- drivers/gpu/drm/i915/gem/i915_gem_ttm.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c index 194e5f1deda8..46d57541c0b2 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c @@ -134,11 +134,11 @@ static enum ttm_caching i915_ttm_select_tt_caching(const struct drm_i915_gem_object *obj) { /* - * Objects only allowed in system get cached cpu-mappings. - * Other objects get WC mapping for now. Even if in system. + * Objects only allowed in system get cached cpu-mappings, or when + * evicting lmem-only buffers to system for swapping. Other objects get + * WC mapping for now. Even if in system. */ - if (obj->mm.region->type == INTEL_MEMORY_SYSTEM && - obj->mm.n_placements <= 1) + if (obj->mm.n_placements <= 1) return ttm_cached;

return ttm_write_combined;

-- 2.26.3
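
The effect of dropping the region check can be seen by putting the old and new conditions side by side. The sketch below is a standalone model: the enums and `struct model_obj` are hypothetical stand-ins for ttm_caching, the memory region type and the GEM object, and only the two conditions themselves are taken from the diff.

```c
#include <assert.h>

enum model_caching { MODEL_WRITE_COMBINED, MODEL_CACHED };
enum model_region { MODEL_REGION_SYSTEM, MODEL_REGION_LOCAL };

struct model_obj {
	enum model_region region;   /* stands in for obj->mm.region->type */
	unsigned int n_placements;  /* stands in for obj->mm.n_placements */
};

/* Condition before the patch: cached only for system-only objects. */
static enum model_caching select_caching_before(const struct model_obj *obj)
{
	assert(obj);
	if (obj->region == MODEL_REGION_SYSTEM && obj->n_placements <= 1)
		return MODEL_CACHED;
	return MODEL_WRITE_COMBINED;
}

/* Condition after the patch: any single-placement object gets cached pages. */
static enum model_caching select_caching_after(const struct model_obj *obj)
{
	assert(obj);
	if (obj->n_placements <= 1)
		return MODEL_CACHED;
	return MODEL_WRITE_COMBINED;
}
```

Only the lmem-only case changes: it moves from write-combined to cached. System-only objects stay cached and multi-placement objects stay write-combined.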

Thomas Hellström

29 Sep 29 Sep

11:54 a.m.

New subject: [PATCH v5 12/13] drm/i915/ttm: use cached system pages when evicting lmem

On Mon, 2021-09-27 at 12:41 +0100, Matthew Auld wrote:

...

This should let us do an accelerated copy directly to the shmem pages when temporarily moving lmem-only objects, where the i915-gem shrinker can later kick in to swap out the pages, if needed.

Signed-off-by: Matthew Auld matthew.auld@intel.com Cc: Thomas Hellström thomas.hellstrom@linux.intel.com

drivers/gpu/drm/i915/gem/i915_gem_ttm.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c index 194e5f1deda8..46d57541c0b2 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c @@ -134,11 +134,11 @@ static enum ttm_caching i915_ttm_select_tt_caching(const struct drm_i915_gem_object *obj) { /* - * Objects only allowed in system get cached cpu-mappings. - * Other objects get WC mapping for now. Even if in system. + * Objects only allowed in system get cached cpu-mappings, or when + * evicting lmem-only buffers to system for swapping. Other objects get + * WC mapping for now. Even if in system. */ - if (obj->mm.region->type == INTEL_MEMORY_SYSTEM && - obj->mm.n_placements <= 1) + if (obj->mm.n_placements <= 1) return ttm_cached; return ttm_write_combined;

We should be aware that with TTM, even evicted bos can be mapped by user-space while evicted, and this will appear to user-space like the WC-mapped object suddenly became WB-mapped. But it appears like mesa doesn't care about this as long as the mappings are fully coherent.

Reviewed-by: Thomas Hellström thomas.hellstrom@linux.intel.com

Michel Dänzer

30 Sep 30 Sep

10:04 a.m.

New subject: [PATCH v5 12/13] drm/i915/ttm: use cached system pages when evicting lmem

On 2021-09-29 13:54, Thomas Hellström wrote:

...

We should be aware that with TTM, even evicted bos can be mapped by user-space while evicted, and this will appear to user-space like the WC-mapped object suddenly became WB-mapped. But it appears like mesa doesn't care about this as long as the mappings are fully coherent.

FWIW, the Mesa radeonsi driver avoids surprises due to this (e.g. some path which involves CPU access suddenly goes faster if the BO was evicted from VRAM) by asking for WC mapping of BOs intended to be in VRAM even while they're evicted (via the AMDGPU_GEM_CREATE_CPU_GTT_USWC flag).

-- Earthling Michel Dänzer | https://redhat.com Libre software enthusiast | Mesa and Xwayland developer

Matthew Auld

12:27 p.m.

New subject: [PATCH v5 12/13] drm/i915/ttm: use cached system pages when evicting lmem

On 30/09/2021 11:04, Michel Dänzer wrote:

...

FWIW, the Mesa radeonsi driver avoids surprises due to this (e.g. some path which involves CPU access suddenly goes faster if the BO was evicted from VRAM) by asking for WC mapping of BOs intended to be in VRAM even while they're evicted (via the AMDGPU_GEM_CREATE_CPU_GTT_USWC flag).

Ok, so amdgpu just defaults to cached system memory, even for evicted VRAM, unless userspace requests USWC, in which case it will use WC?

...

Michel Dänzer

12:55 p.m.

New subject: [PATCH v5 12/13] drm/i915/ttm: use cached system pages when evicting lmem

On 2021-09-30 14:27, Matthew Auld wrote:

...

On 30/09/2021 11:04, Michel Dänzer wrote:

...
FWIW, the Mesa radeonsi driver avoids surprises due to this (e.g. some path which involves CPU access suddenly goes faster if the BO was evicted from VRAM) by asking for WC mapping of BOs intended to be in VRAM even while they're evicted (via the AMDGPU_GEM_CREATE_CPU_GTT_USWC flag).

Ok, so amdgpu just defaults to cached system memory, even for evicted VRAM, unless userspace requests USWC, in which case it will use WC?

Right.

-- Earthling Michel Dänzer | https://redhat.com Libre software enthusiast | Mesa and Xwayland developer

Matthew Auld

27 Sep 27 Sep

11:41 a.m.

New subject: [PATCH v5 13/13] drm/i915/ttm: enable shmem tt backend

Turn on the shmem tt backend, and enable shrinking.

Signed-off-by: Matthew Auld matthew.auld@intel.com Cc: Thomas Hellström thomas.hellstrom@linux.intel.com --- drivers/gpu/drm/i915/gem/i915_gem_ttm.c | 1 + 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
index 46d57541c0b2..4ae630fbc5cd 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
@@ -1093,6 +1093,7 @@ static u64 i915_ttm_mmap_offset(struct drm_i915_gem_object *obj)
 
 static const struct drm_i915_gem_object_ops i915_gem_ttm_obj_ops = {
 	.name = "i915_gem_object_ttm",
+	.flags = I915_GEM_OBJECT_IS_SHRINKABLE,
 
 	.get_pages = i915_ttm_get_pages,
 	.put_pages = i915_ttm_put_pages,

-- 2.26.3
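
As a rough illustration of what the one-line change above does, the sketch below models an ops table that carries a capability bit and a helper that tests it. Every name here (`MODEL_OBJECT_IS_SHRINKABLE`, `struct object_ops_model`, `object_model_is_shrinkable`) is hypothetical; how the real shrinker consults I915_GEM_OBJECT_IS_SHRINKABLE is not shown in this thread.

```c
#include <stdbool.h>

/* Hypothetical flag bit; the real value lives in the i915 headers. */
#define MODEL_OBJECT_IS_SHRINKABLE (1u << 0)

/* Hypothetical, reduced model of a per-backend ops table. */
struct object_ops_model {
	const char *name;
	unsigned int flags;
};

struct object_model {
	const struct object_ops_model *ops;
};

/* Models the ops table after the patch: the flag is now set. */
static const struct object_ops_model ttm_ops_model = {
	.name = "i915_gem_object_ttm",
	.flags = MODEL_OBJECT_IS_SHRINKABLE,
};

/* Models the ops table before the patch: no flags set. */
static const struct object_ops_model ttm_ops_model_before = {
	.name = "i915_gem_object_ttm",
};

/* A shrinker-like caller would skip objects for which this is false. */
static bool object_model_is_shrinkable(const struct object_model *obj)
{
	return (obj->ops->flags & MODEL_OBJECT_IS_SHRINKABLE) != 0;
}
```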

Thomas Hellström

29 Sep

noon

New subject: [PATCH v5 13/13] drm/i915/ttm: enable shmem tt backend

On Mon, 2021-09-27 at 12:41 +0100, Matthew Auld wrote:

...

Turn on the shmem tt backend, and enable shrinking.


Reviewed-by: Thomas Hellström thomas.hellstrom@linux.intel.com

Now that BAT is running a DG1 again, it might be worth giving the series a rerun, perhaps with the "rework object initialization slightly" patch included as a HAX patch to unblock the mman and following selftests.

/Thomas

Christian König

27 Sep

11:47 a.m.

Any objections that I just push patches 1-7 to drm-misc-next?

Christian.

On 27.09.21 at 13:41, Matthew Auld wrote:

...

In commit:

commit 09ac4fcb3f255e9225967c75f5893325c116cdbe
Author: Felix Kuehling Felix.Kuehling@amd.com
Date: Thu Jul 13 17:01:16 2017 -0400

    drm/ttm: Implement vm_operations_struct.access v2

we added the vm_access hook, where we also directly call tt_swapin for some reason. If something is swapped-out then the ttm_tt must also be unpopulated, and since access_kmap should also call tt_populate, if needed, then swapping-in will already be handled there.

If anything, calling tt_swapin directly here would likely always fail since the tt->pages won't yet be populated, or worse since the tt->pages array is never actually cleared in unpopulate this might lead to a nasty uaf.

Fixes: 09ac4fcb3f25 ("drm/ttm: Implement vm_operations_struct.access v2")
Signed-off-by: Matthew Auld matthew.auld@intel.com
Cc: Thomas Hellström thomas.hellstrom@linux.intel.com
Cc: Christian König christian.koenig@amd.com
Reviewed-by: Thomas Hellström thomas.hellstrom@linux.intel.com
Reviewed-by: Christian König christian.koenig@amd.com
---
 drivers/gpu/drm/ttm/ttm_bo_vm.c | 5 -----
 1 file changed, 5 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c b/drivers/gpu/drm/ttm/ttm_bo_vm.c
index f56be5bc0861..5b9b7fd01a69 100644
--- a/drivers/gpu/drm/ttm/ttm_bo_vm.c
+++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c
@@ -519,11 +519,6 @@ int ttm_bo_vm_access(struct vm_area_struct *vma, unsigned long addr,
 
 	switch (bo->resource->mem_type) {
 	case TTM_PL_SYSTEM:
-		if (unlikely(bo->ttm->page_flags & TTM_PAGE_FLAG_SWAPPED)) {
-			ret = ttm_tt_swapin(bo->ttm);
-			if (unlikely(ret != 0))
-				return ret;
-		}
 		fallthrough;
 	case TTM_PL_TT:
 		ret = ttm_bo_vm_access_kmap(bo, offset, buf, len, write);
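
The reasoning in the quoted commit message can be sketched with a small standalone model. This is not the TTM code: `struct tt_model` and the `tt_model_*` / `access_kmap_model` functions are hypothetical, and the behaviour they encode (swap-in needs populated pages, populate performs the swap-in when needed, the access path calls populate) is taken only from the description in the commit message.

```c
#include <stdbool.h>

/* Hypothetical model of the relevant ttm_tt state; not the real struct. */
struct tt_model {
	bool populated;   /* the pages array is currently backed by pages */
	bool swapped;     /* the contents currently live in swap storage */
	int swapin_calls; /* counts successful swap-ins, for illustration */
};

/* Swap-in copies data back into the pages, so it needs them populated. */
static int tt_model_swapin(struct tt_model *tt)
{
	if (!tt->populated)
		return -1; /* models the "would likely always fail" case */
	tt->swapped = false;
	tt->swapin_calls++;
	return 0;
}

/* Populate allocates the pages and then swaps in if that is needed. */
static int tt_model_populate(struct tt_model *tt)
{
	if (tt->populated)
		return 0;
	tt->populated = true;
	if (tt->swapped)
		return tt_model_swapin(tt);
	return 0;
}

/* The access path after the patch: it relies on populate alone. */
static int access_kmap_model(struct tt_model *tt)
{
	return tt_model_populate(tt);
}
```

In this model, calling tt_model_swapin() directly on a swapped-out and therefore unpopulated object fails, while access_kmap_model() on the same object succeeds and performs exactly one swap-in, which is why the explicit call in the vm_access path is redundant.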

Matthew Auld

4:14 p.m.

New subject: [Intel-gfx] [PATCH v5 01/13] drm/ttm: stop calling tt_swapin in vm_access

On Mon, 27 Sept 2021 at 12:47, Christian König christian.koenig@amd.com wrote:

...

Any objections that I just push patches 1-7 to drm-misc-next?

Please go ahead Christian. Thanks.


Christian König

29 Sep

12:01 p.m.

New subject: [Intel-gfx] [PATCH v5 01/13] drm/ttm: stop calling tt_swapin in vm_access

On 27.09.21 at 18:14, Matthew Auld wrote:

...

On Mon, 27 Sept 2021 at 12:47, Christian König christian.koenig@amd.com wrote:

...
Any objections that I just push patches 1-7 to drm-misc-next?

Please go ahead Christian. Thanks.

Well, I've pushed patches #1-#4 because #5 won't apply on current drm-misc-next (some conflict in i915).

Could you rebase this and/or request backmerging of drm-next into drm-misc-next when potential i915 prerequisites have landed there?

Thanks, Christian.


Matthew Auld

1:45 p.m.

New subject: [Intel-gfx] [PATCH v5 01/13] drm/ttm: stop calling tt_swapin in vm_access

On Wed, 29 Sept 2021 at 13:01, Christian König christian.koenig@amd.com wrote:

...

On 27.09.21 at 18:14, Matthew Auld wrote:

...
On Mon, 27 Sept 2021 at 12:47, Christian König christian.koenig@amd.com wrote:

...
Any objections that I just push patches 1-7 to drm-misc-next?

Please go ahead Christian. Thanks.

Well, I've pushed patches #1-#4 because #5 won't apply on current drm-misc-next (some conflict in i915).

Could you rebase this and/or request backmerging of drm-next into drm-misc-next when potential i915 prerequisites have landed there?

Version which should apply to drm-misc-next: https://patchwork.freedesktop.org/series/95219/


dri-devel@lists.freedesktop.org

29 comments

6 participants

participants (6)

Christian König
Matthew Auld
Michel Dänzer
Thomas Hellström
Zeng, Oak