On Wed, 2022-01-05 at 15:52 +0000, Matthew Auld wrote:
On 04/01/2022 12:51, Thomas Hellström wrote:
Implement async (non-blocking) unbinding by not syncing the vma before calling unbind on the vma_resource. Add the resulting unbind fence to the object's dma_resv from where it is picked up by the ttm migration code. Ideally these unbind fences should be coalesced with the migration blit fence to avoid stalling the migration blit waiting for unbind, as they can certainly go on in parallel, but since we don't yet have a reasonable data structure to use to coalesce fences and attach the resulting fence to a timeline, we defer that for now.
Note that with async unbinding, even while the unbind waits for the preceding bind to complete before unbinding, the vma itself might have been destroyed in the process, clearing the vma pages. Therefore we can only allow async unbinding if we have a refcounted sg-list and keep a refcount on that for the vma resource pages to stay intact until binding occurs. If this condition is not met, a request for an async unbind is diverted to a sync unbind.
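The rule in the paragraph above — async unbind is only safe when a refcounted sg-list can be pinned, otherwise divert to sync — can be modelled in a few lines. This is a userspace sketch, not the i915 code; all names (refct_sgt, vma_resource, unbind_async_if_safe) and the int stand-ins for kernel types are illustrative:

```c
#include <stdbool.h>
#include <stddef.h>

struct refct_sgt {
	int refcount;		/* models i915_refct_sgt refcounting */
	int table;		/* stand-in for the embedded sg_table */
};

struct vma_resource {
	struct refct_sgt *rsgt;	/* refcounted backing, may be NULL */
	int *pages;		/* page list the async unbind will touch */
};

void rsgt_get(struct refct_sgt *rsgt) { rsgt->refcount++; }
void rsgt_put(struct refct_sgt *rsgt) { rsgt->refcount--; }

/* Returns true if the unbind may run asynchronously, false if the
 * request must be diverted to a synchronous unbind. */
bool unbind_async_if_safe(struct vma_resource *res)
{
	/*
	 * Async is only safe when the page list is exactly the
	 * refcounted table we can pin; a vma-private page list could
	 * be nuked while the async unbind is still queued.
	 */
	if (res->rsgt && res->pages == &res->rsgt->table) {
		rsgt_get(res->rsgt);	/* keep pages intact until done;
					 * the worker drops this ref with
					 * rsgt_put() when it completes. */
		return true;
	}
	return false;	/* divert to sync unbind, no extra ref needed */
}
```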
v2:
- Use a separate kmem_cache for vma resources for now to isolate
their memory allocation and aid debugging.
- Move the check for vm closed to the actual unbinding thread.
  Regardless of whether the vm is closed, we need the unbind fence to
  properly wait for capture.
- Clear vma_res::vm on unbind and update its documentation.
v4:
- Take cache coloring into account when searching for vma resources
  pending unbind. (Matthew Auld)
v5:
- Fix timeout and error check in
  i915_vma_resource_bind_dep_await().
- Avoid taking a reference on the object for async binding if
  async unbind capable.
- Fix braces around a single-line if statement.
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
<snip>
@@ -434,12 +439,30 @@ int i915_vma_bind(struct i915_vma *vma,
 	bind_flags &= ~vma_flags;
 	if (bind_flags == 0) {
-		kfree(vma_res);
+		i915_vma_resource_free(vma_res);
 		return 0;
 	}
 
 	GEM_BUG_ON(!atomic_read(&vma->pages_count));
 
+	/* Wait for or await async unbinds touching our range */
+	if (work && bind_flags & vma->vm->bind_async_flags)
+		ret = i915_vma_resource_bind_dep_await(vma->vm,
+						       &work->base.chain,
+						       vma->node.start,
+						       vma->node.size,
+						       true,
+						       GFP_NOWAIT |
+						       __GFP_RETRY_MAYFAIL |
+						       __GFP_NOWARN);
+	else
+		ret = i915_vma_resource_bind_dep_sync(vma->vm, vma->node.start,
+						      vma->node.size, true);
+
+	if (ret) {
+		i915_vma_resource_free(vma_res);
+		return ret;
+	}
+
 	if (vma->resource || !vma_res) {
 		/* Rebinding with an additional I915_VMA_*_BIND */
 		GEM_WARN_ON(!vma_flags);
@@ -452,9 +475,11 @@ int i915_vma_bind(struct i915_vma *vma,
 	if (work && bind_flags & vma->vm->bind_async_flags) {
 		struct dma_fence *prev;
 
-		work->vma = vma;
+		work->vma_res = i915_vma_resource_get(vma->resource);
 		work->cache_level = cache_level;
 		work->flags = bind_flags;
+		if (vma->obj->mm.rsgt)
+			work->rsgt = i915_refct_sgt_get(vma->obj->mm.rsgt);
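The wait-vs-await decision in the first hunk reduces to a simple predicate: only an async-capable bind with a worker may merely await conflicting unbinds through the fence chain; anything else must wait synchronously. A minimal userspace sketch of just that branch (flag names and values are illustrative, not the driver's):

```c
#include <stdbool.h>

/* Illustrative bind-flag values, not the real I915_VMA_* constants. */
#define VMA_LOCAL_BIND  (1u << 0)
#define VMA_GLOBAL_BIND (1u << 1)

enum dep_mode {
	DEP_SYNC,	/* block until pending unbind fences signal */
	DEP_AWAIT,	/* chain on them via the work's fence chain */
};

/* Mirrors: if (work && bind_flags & vma->vm->bind_async_flags) */
enum dep_mode bind_dep_mode(bool have_async_worker, unsigned int bind_flags,
			    unsigned int vm_async_flags)
{
	if (have_async_worker && (bind_flags & vm_async_flags))
		return DEP_AWAIT;
	return DEP_SYNC;
}
```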
Hmmm, at a glance I would have expected this to use the vma->pages. I think with the GGTT the vma will often create its own sg layout which != obj->mm.sgt. IIUC the async unbind will still call vma_unbind_pages which might nuke the vma sgt? Or is something else going on here?
Yes, the binding code is only using vma_res->pages, which should have been copied from vma->pages, and keeps a reference to the rsgt just in case we do an async unbind.
However, good point: we should refuse async unbind for now if vma_res->pages != &rsgt->table, because the former might otherwise be nuked before the async unbind actually happens. Will fix that for the next version.
/Thomas