On 18/10/2021 10:10, Matthew Auld wrote:
For cached objects we can allocate our pages directly in shmem. This should make it possible(in a later patch) to utilise the existing i915-gem shrinker code for such objects. For now this is still disabled.
v2(Thomas):
- Add optional try_to_writeback hook for objects. Importantly we need to check if the object is even still shrinkable; in between us dropping the shrinker LRU lock and acquiring the object lock it could for example have been moved. Also we need to differentiate between "lazy" shrinking and the immediate writeback mode. Also later we need to handle objects which don't even have mm.pages, so bundling this into put_pages() would require somehow handling that edge case, hence just letting the ttm backend handle everything in try_to_writeback doesn't seem too bad.
v3(Thomas):
- Likely a bad idea to touch the object from the unpopulate hook, since it's not possible to hold a reference, without also creating circular dependency, so likely this is too fragile. For now just ensure we at least mark the pages as dirty/accessed when called from the shrinker on WILLNEED objects.
- s/try_to_writeback/shrinker_release_pages, since this can do more than just writeback.
- Get rid of do_backup boolean and just set the SWAPPED flag prior to calling unpopulate.
- Keep shmem_tt as lowest priority for the TTM LRU bo_swapout walk, since these just get skipped anyway. We can try to come up with something better later.
v4(Thomas):
- s/PCI_DMA/DMA/. Also drop NO_KERNEL_MAPPING and NO_WARN, which apparently doesn't do anything with streaming mappings.
- Just pass along the error for ->truncate, and assume nothing.
Signed-off-by: Matthew Auld matthew.auld@intel.com Cc: Thomas Hellström thomas.hellstrom@linux.intel.com Cc: Christian König christian.koenig@amd.com Cc: Oak Zeng oak.zeng@intel.com Reviewed-by: Thomas Hellström thomas.hellstrom@linux.intel.com Acked-by: Oak Zeng oak.zeng@intel.com
[snip]
-static void try_to_writeback(struct drm_i915_gem_object *obj,
unsigned int flags)
+static int try_to_writeback(struct drm_i915_gem_object *obj, unsigned int flags) {
- if (obj->ops->shrinker_release_pages)
return obj->ops->shrinker_release_pages(obj,
flags & I915_SHRINK_WRITEBACK);
I have a high level question about how does this new vfunc fit with the existing code.
Because I notice in the implementation (i915_ttm_shrinker_release_pages) it ends up doing: ...
switch (obj->mm.madv) { case I915_MADV_DONTNEED: return i915_ttm_purge(obj); case __I915_MADV_PURGED: return 0; }
... and then...
if (should_writeback) __shmem_writeback(obj->base.size, i915_tt->filp->f_mapping);
Which on a glance looks like a possible conceptual duplication of the concepts in this very function (try_to_writeback):
- switch (obj->mm.madv) { case I915_MADV_DONTNEED: i915_gem_object_truncate(obj);
return;
case __I915_MADV_PURGED:return 0;
return;
return 0;
}
if (flags & I915_SHRINK_WRITEBACK) i915_gem_object_writeback(obj);
So question is this the final design or some futher tidy is possible/planned?
Background to my question is that I will float a patch which proposes to remove writeback altogether. There are reports from the fields that it causes deadlocks and looking at 2d6692e642e7 ("drm/i915: Start writeback from the shrinker") and its history it seems like it was a known risk.
Apart from the code organisation questions, on the practical level - do you need writeback from the TTM backend or while I am proposing to remove it from the "legacy" paths, I can propose removing it from the TTM flow as well?
Regards,
Tvrtko