Re: [RFC PATCH 3/5] drm/amdgpu: Allow explicit sync for VM ops.

3 Jun 2022

On Fri, Jun 3, 2022 at 12:16 PM Christian König
christian.koenig@amd.com wrote:
...
Am 03.06.22 um 12:08 schrieb Bas Nieuwenhuizen:
...
[SNIP]
...
...
I do have to fix some stuff indeed, especially for the GEM close but
with that we should be able to keep the same basic approach?
Nope, not even remotely.
What we need is the following:

Rolling out my drm_exec patch set, so that we can lock buffers as needed.
When we get a VM operation we not only lock the VM page tables, but

also all buffers we potentially need to unmap.
3. Nuking the freed list in the amdgpu_vm structure by updating freed
areas directly when they are unmapped.
4. Tracking those updates inside the bo_va structure for the BO+VM
combination.
5. When the bo_va structure is destroy because of closing the handle
move the last clear operation over to the VM as implicit sync.
Hi Christian, isn't that a different problem though (that we're also
trying to solve, but in your series)?
What this patch tries to achieve:
(t+0) CS submission setting BOOKKEEP fences (i.e. no implicit sync)
(t+1) a VM operation on a BO/VM accessed by the CS.
to run concurrently. What it *doesn't* try is
(t+0) a VM operation on a BO/VM accessed by the CS.
(t+1) CS submission setting BOOKKEEP fences (i.e. no implicit sync)
to run concurrently. When you write
...
Only when all this is done we then can resolve the dependency that the
CS currently must wait for any clear operation on the VM.
isn't that all about the second problem?
No, it's the same.
See what we do in the VM code is to artificially insert a bubble so that
all VM clear operations wait for all CS operations and then use the
clear fence to indicate when the backing store of the BO can be freed.
Isn't that remediated with something like the code below? At least the
gem_close case should be handled with this, and the move case was
already handled by the copy operation.

--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
@@ -187,6 +187,39 @@ static int amdgpu_gem_object_open(struct
drm_gem_object *obj,
       return 0;
}
+static void dma_resv_copy(struct dma_resv *src, struct dma_resv *dst)
+{
+       struct dma_resv_iter cursor;
+       struct dma_fence *f;
+       int r;
+       unsigned num_fences = 0;
+
+       if (src == dst)
+               return;
+
+       /* We assume the later loops get the same fences as the caller should
+        * lock the resv. */
+       dma_resv_for_each_fence(&cursor, src, DMA_RESV_USAGE_BOOKKEEP, f) {
+               ++num_fences;
+               dma_fence_put(f);
+       }
+
+       r = dma_resv_reserve_fences(dst, num_fences);
+       if (r) {
+               /* As last resort on OOM we block for the fence */
+               dma_resv_for_each_fence(&cursor, src,
DMA_RESV_USAGE_BOOKKEEP, f) {
+                       dma_fence_wait(f, false);
+                       dma_fence_put(f);
+               }
+       }
+
+       dma_resv_for_each_fence(&cursor, src, DMA_RESV_USAGE_BOOKKEEP, f) {
+               dma_resv_add_fence(dst, f, dma_resv_iter_usage(&cursor));
+               dma_fence_put(f);
+       }
+}
+
+
static void amdgpu_gem_object_close(struct drm_gem_object *obj,
                                   struct drm_file *file_priv)
{
@@ -233,6 +266,8 @@ static void amdgpu_gem_object_close(struct
drm_gem_object *obj,
       amdgpu_bo_fence(bo, fence, true);
       dma_fence_put(fence);
+       dma_resv_copy(vm->root.bo->tbo.base.resv, bo->tbo.base.resv);
+
out_unlock:
       if (unlikely(r < 0))
               dev_err(adev->dev, "failed to clear page "
...
When you want to remove this bubble (which is certainly a good idea) you
need to first come up with a different approach to handle the clear
operations.
Regards,
Christian.
...
...
Regards,
Christian.

    

2025

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

2010

Re: [RFC PATCH 3/5] drm/amdgpu: Allow explicit sync for VM ops.