On Mon, 29 Nov 2021 at 13:58, Maarten Lankhorst maarten.lankhorst@linux.intel.com wrote:
We want to remove more members of i915_vma, which requires the locking to be held more often.
Start requiring gem object lock for i915_vma_unbind, as it's one of the callers that may unpin pages.
Some special care is needed when evicting, because the last reference to the object may be held by the VMA, so after __i915_vma_unbind, vma may be garbage, and we need to cache vma->obj before unlocking.
Signed-off-by: Maarten Lankhorst maarten.lankhorst@linux.intel.com
<snip>
@@ -129,22 +129,47 @@ void i915_ggtt_suspend_vm(struct i915_address_space *vm)
drm_WARN_ON(&vm->i915->drm, !vm->is_ggtt && !vm->is_dpt);
+retry:
i915_gem_drain_freed_objects(vm->i915);
mutex_lock(&vm->mutex); /* Skip rewriting PTE on VMA unbind. */ open = atomic_xchg(&vm->open, 0); list_for_each_entry_safe(vma, vn, &vm->bound_list, vm_link) {
struct drm_i915_gem_object *obj = vma->obj;
GEM_BUG_ON(!drm_mm_node_allocated(&vma->node));
i915_vma_wait_for_bind(vma);
if (i915_vma_is_pinned(vma))
if (i915_vma_is_pinned(vma) || !i915_vma_is_bound(vma, I915_VMA_GLOBAL_BIND)) continue;
if (!i915_vma_is_bound(vma, I915_VMA_GLOBAL_BIND)) {
__i915_vma_evict(vma);
drm_mm_remove_node(&vma->node);
/* unlikely to race when GPU is idle, so no worry about slowpath.. */
if (!i915_gem_object_trylock(obj, NULL)) {
atomic_set(&vm->open, open);
Does this need a comment about barriers?
i915_gem_object_get(obj);
Should this not be kref_get_unless_zero? Assuming the vm->mutex is the only thing keeping the object alive here, won't this lead to potential uaf/double-free or something? Also should we not plonk this before the trylock? Or maybe I'm missing something here?
mutex_unlock(&vm->mutex);
i915_gem_object_lock(obj, NULL);
open = i915_vma_unbind(vma);
i915_gem_object_unlock(obj);
GEM_WARN_ON(open);
i915_gem_object_put(obj);
goto retry; }
i915_vma_wait_for_bind(vma);
We also call wait_for_bind above, is that intentional?