Hi,
On Mon, 2022-01-17 at 21:07 +0000, Robert Beckett wrote:
On 10/01/2022 17:22, Thomas Hellström wrote:
Implement async (non-blocking) unbinding by not syncing the vma before calling unbind on the vma_resource. Add the resulting unbind fence to the object's dma_resv from where it is picked up by the ttm migration code. Ideally these unbind fences should be coalesced with the migration blit fence to avoid stalling the migration blit waiting for unbind, as they can certainly go on in parallel, but since we don't yet have a reasonable data structure to use to coalesce fences and attach the resulting fence to a timeline, we defer that for now.
Note that with async unbinding, even while the unbind waits for the preceding bind to complete before unbinding, the vma itself might have been destroyed in the process, clearing the vma pages. Therefore we can only allow async unbinding if we have a refcounted sg-list and keep a refcount on that for the vma resource pages to stay intact until binding occurs. If this condition is not met, a request for an async unbind is diverted to a sync unbind.
v2:
- Use a separate kmem_cache for vma resources for now to isolate
their memory allocation and aid debugging.
- Move the check for vm closed to the actual unbinding thread.
Regardless of whether the vm is closed, we need the unbind fence to properly wait for capture.
- Clear vma_res::vm on unbind and update its documentation.
v4:
- Take cache coloring into account when searching for vma resources
pending unbind. (Matthew Auld) v5:
- Fix timeout and error check in
i915_vma_resource_bind_dep_await().
- Avoid taking a reference on the object for async binding if
async unbind capable.
- Fix braces around a single-line if statement.
v6:
- Fix up the cache coloring adjustment. (Kernel test robot
- Don't allow async unbinding if the vma_res pages are not the same
as the object pages. (Matthew Auld) v7:
- s/unsigned long/u64/ in a number of places (Matthew Auld)
Signed-off-by: Thomas Hellström thomas.hellstrom@linux.intel.com Reviewed-by: Matthew Auld matthew.auld@intel.com
drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c | 11 +- drivers/gpu/drm/i915/gt/intel_ggtt.c | 2 +- drivers/gpu/drm/i915/gt/intel_gtt.c | 4 + drivers/gpu/drm/i915/gt/intel_gtt.h | 3 + drivers/gpu/drm/i915/i915_drv.h | 1 + drivers/gpu/drm/i915/i915_gem.c | 12 +- drivers/gpu/drm/i915/i915_module.c | 3 + drivers/gpu/drm/i915/i915_vma.c | 205 +++++++++-- drivers/gpu/drm/i915/i915_vma.h | 3 +- drivers/gpu/drm/i915/i915_vma_resource.c | 354 +++++++++++++++++-- drivers/gpu/drm/i915/i915_vma_resource.h | 48 +++ 11 files changed, 579 insertions(+), 67 deletions(-)
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c b/drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c index 8653855d808b..1de306c03aaf 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c @@ -142,7 +142,16 @@ int i915_ttm_move_notify(struct ttm_buffer_object *bo) struct drm_i915_gem_object *obj = i915_ttm_to_gem(bo); int ret; - ret = i915_gem_object_unbind(obj, I915_GEM_OBJECT_UNBIND_ACTIVE); + /* + * Note: The async unbinding here will actually transform the + * blocking wait for unbind into a wait before finally submitting + * evict / migration blit and thus stall the migration timeline + * which may not be good for overall throughput. We should make + * sure we await the unbind fences *after* the migration blit + * instead of *before* as we currently do. + */ + ret = i915_gem_object_unbind(obj, I915_GEM_OBJECT_UNBIND_ACTIVE | + I915_GEM_OBJECT_UNBIND_ASYNC); if (ret) return ret; diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c b/drivers/gpu/drm/i915/gt/intel_ggtt.c index e49b6250c4b7..a1b2761bc16e 100644 --- a/drivers/gpu/drm/i915/gt/intel_ggtt.c +++ b/drivers/gpu/drm/i915/gt/intel_ggtt.c @@ -142,7 +142,7 @@ void i915_ggtt_suspend_vm(struct i915_address_space *vm) continue; if (!i915_vma_is_bound(vma, I915_VMA_GLOBAL_BIND)) { - __i915_vma_evict(vma); + __i915_vma_evict(vma, false); drm_mm_remove_node(&vma->node); } } diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c b/drivers/gpu/drm/i915/gt/intel_gtt.c index a94be0306464..46be4197b93f 100644 --- a/drivers/gpu/drm/i915/gt/intel_gtt.c +++ b/drivers/gpu/drm/i915/gt/intel_gtt.c @@ -161,6 +161,9 @@ static void __i915_vm_release(struct work_struct *work) struct i915_address_space *vm = container_of(work, struct i915_address_space, release_work); + /* Synchronize async unbinds. */ + i915_vma_resource_bind_dep_sync_all(vm);
vm->cleanup(vm); i915_address_space_fini(vm); @@ -189,6 +192,7 @@ void i915_address_space_init(struct i915_address_space *vm, int subclass) if (!kref_read(&vm->resv_ref)) kref_init(&vm->resv_ref); + vm->pending_unbind = RB_ROOT_CACHED; INIT_WORK(&vm->release_work, __i915_vm_release); atomic_set(&vm->open, 1); diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h b/drivers/gpu/drm/i915/gt/intel_gtt.h index 676b839d1a34..8073438b67c8 100644 --- a/drivers/gpu/drm/i915/gt/intel_gtt.h +++ b/drivers/gpu/drm/i915/gt/intel_gtt.h @@ -265,6 +265,9 @@ struct i915_address_space { /* Flags used when creating page-table objects for this vm */ unsigned long lmem_pt_obj_flags; + /* Interval tree for pending unbind vma resources */ + struct rb_root_cached pending_unbind;
struct drm_i915_gem_object * (*alloc_pt_dma)(struct i915_address_space *vm, int sz); struct drm_i915_gem_object * diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index a44e0c3298fc..6caec2eca8cd 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1661,6 +1661,7 @@ int i915_gem_object_unbind(struct drm_i915_gem_object *obj, #define I915_GEM_OBJECT_UNBIND_BARRIER BIT(1) #define I915_GEM_OBJECT_UNBIND_TEST BIT(2) #define I915_GEM_OBJECT_UNBIND_VM_TRYLOCK BIT(3) +#define I915_GEM_OBJECT_UNBIND_ASYNC BIT(4) void i915_gem_runtime_suspend(struct drm_i915_private *dev_priv); diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 3729ec01b5bc..7188db42c85b 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -157,10 +157,16 @@ int i915_gem_object_unbind(struct drm_i915_gem_object *obj, spin_unlock(&obj->vma.lock); if (vma) { + bool vm_trylock = !!(flags & I915_GEM_OBJECT_UNBIND_VM_TRYLOCK); ret = -EBUSY; - if (flags & I915_GEM_OBJECT_UNBIND_ACTIVE || - !i915_vma_is_active(vma)) { - if (flags & I915_GEM_OBJECT_UNBIND_VM_TRYLOCK) { + if (flags & I915_GEM_OBJECT_UNBIND_ASYNC) { + assert_object_held(vma->obj); + ret = i915_vma_unbind_async(vma, vm_trylock); + }
+ if (ret == -EBUSY && (flags & I915_GEM_OBJECT_UNBIND_ACTIVE || + !i915_vma_is_active(vma))) { + if (vm_trylock) { if (mutex_trylock(&vma->vm-
mutex)) {
ret = __i915_vma_unbind(vma); mutex_unlock(&vma-
vm->mutex);
diff --git a/drivers/gpu/drm/i915/i915_module.c b/drivers/gpu/drm/i915/i915_module.c index f6bcd2f89257..a8f175960b34 100644 --- a/drivers/gpu/drm/i915/i915_module.c +++ b/drivers/gpu/drm/i915/i915_module.c @@ -17,6 +17,7 @@ #include "i915_scheduler.h" #include "i915_selftest.h" #include "i915_vma.h" +#include "i915_vma_resource.h" static int i915_check_nomodeset(void) { @@ -64,6 +65,8 @@ static const struct { .exit = i915_scheduler_module_exit }, { .init = i915_vma_module_init, .exit = i915_vma_module_exit }, + { .init = i915_vma_resource_module_init, + .exit = i915_vma_resource_module_exit }, { .init = i915_mock_selftests }, { .init = i915_pmu_init, .exit = i915_pmu_exit }, diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c index 29c770a764aa..5881b0713b1a 100644 --- a/drivers/gpu/drm/i915/i915_vma.c +++ b/drivers/gpu/drm/i915/i915_vma.c @@ -286,9 +286,10 @@ struct i915_vma_work { struct dma_fence_work base; struct i915_address_space *vm; struct i915_vm_pt_stash stash; - struct i915_vma *vma; + struct i915_vma_resource *vma_res; struct drm_i915_gem_object *pinned; struct i915_sw_dma_fence_cb cb; + struct i915_refct_sgt *rsgt; enum i915_cache_level cache_level; unsigned int flags; }; @@ -296,10 +297,11 @@ struct i915_vma_work { static void __vma_bind(struct dma_fence_work *work) { struct i915_vma_work *vw = container_of(work, typeof(*vw), base); - struct i915_vma *vma = vw->vma; + struct i915_vma_resource *vma_res = vw->vma_res;
+ vma_res->ops->bind_vma(vma_res->vm, &vw->stash, + vma_res, vw->cache_level, vw-
flags);
- vma->ops->bind_vma(vw->vm, &vw->stash, - vma->resource, vw->cache_level, vw-
flags);
} static void __vma_release(struct dma_fence_work *work) @@ -311,6 +313,10 @@ static void __vma_release(struct dma_fence_work *work) i915_vm_free_pt_stash(vw->vm, &vw->stash); i915_vm_put(vw->vm); + if (vw->vma_res) + i915_vma_resource_put(vw->vma_res); + if (vw->rsgt) + i915_refct_sgt_put(vw->rsgt); } static const struct dma_fence_work_ops bind_ops = { @@ -380,13 +386,11 @@ i915_vma_resource_init_from_vma(struct i915_vma_resource *vma_res, { struct drm_i915_gem_object *obj = vma->obj; - i915_vma_resource_init(vma_res, vma->pages, &vma-
page_sizes,
+ i915_vma_resource_init(vma_res, vma->vm, vma->pages, &vma-
page_sizes,
i915_gem_object_is_readonly(obj), i915_gem_object_is_lmem(obj), - vma->private, - vma->node.start, - vma->node.size, - vma->size); + vma->ops, vma->private, vma-
node.start,
+ vma->node.size, vma->size); } /** @@ -410,6 +414,7 @@ int i915_vma_bind(struct i915_vma *vma, { u32 bind_flags; u32 vma_flags; + int ret; lockdep_assert_held(&vma->vm->mutex); GEM_BUG_ON(!drm_mm_node_allocated(&vma->node)); @@ -418,12 +423,12 @@ int i915_vma_bind(struct i915_vma *vma, if (GEM_DEBUG_WARN_ON(range_overflows(vma->node.start, vma->node.size, vma->vm->total))) { - kfree(vma_res); + i915_vma_resource_free(vma_res); return -ENODEV; } if (GEM_DEBUG_WARN_ON(!flags)) { - kfree(vma_res); + i915_vma_resource_free(vma_res); return -EINVAL; } @@ -435,12 +440,30 @@ int i915_vma_bind(struct i915_vma *vma, bind_flags &= ~vma_flags; if (bind_flags == 0) { - kfree(vma_res); + i915_vma_resource_free(vma_res); return 0; } GEM_BUG_ON(!atomic_read(&vma->pages_count)); + /* Wait for or await async unbinds touching our range */ + if (work && bind_flags & vma->vm->bind_async_flags) + ret = i915_vma_resource_bind_dep_await(vma->vm, + &work-
base.chain,
+ vma-
node.start,
+ vma-
node.size,
+ true, + GFP_NOWAIT | + __GFP_RETRY_MAYFAIL | + __GFP_NOWARN); + else + ret = i915_vma_resource_bind_dep_sync(vma->vm, vma-
node.start,
+ vma-
node.size, true);
+ if (ret) { + i915_vma_resource_free(vma_res); + return ret; + }
if (vma->resource || !vma_res) { /* Rebinding with an additional I915_VMA_*_BIND */ GEM_WARN_ON(!vma_flags);
You missed an extra `kfree(vma_res)` on the next line from here (not in patch context), which would presumably need to be `i915_vma_resource_free(vma_res)`
It should, thanks. I see I've missed that in an error path as well, as you mention below.
Further to that, you are freeing it where it can already be known to be null, which, while relying on known zero side effect of double free will work, looks very unintuitive.
This stems from NULL checking before kfree() is typically not done. IIRC even checkpatch.pl warns if doing that, but with transition to kmem_cache_free() that doesn't hold anymore so I need to fix this up.
Firther to that, once this series is applied, `i915_vma_bind` only appears to be called from `eb_relocate_entry()` with `vma_res` always null, and from `i915_vma_pin_ww()` where it allocates `vma_res`.
It would be much cleaner to just allocate the `vma_res` in `i915_vma_bind` if required. That way it will avoid the null asignment on return to `i915_vma_pin_ww` with the equally unintuitive `kfree(vma_res)` during the error fallthrough path at the end of `i915_vma_pin_ww` on a knowingly null pointer (also that would need to be a `i915_vma_resource_free` if you keep it the way it is).
We can't allocate (GFP_KERNEL) under the vm mutex, hence need to pass it as an argument.
Thanks, Thomas