The commit "drm/ttm: make ttm reservation calls behave like reservation calls" introduced a false lockdep warning on vmwgfx, hit on surface eviction due to a recursive reservation. This fix silences the warning by using a tryreserve, which is not optimal but will do for now.
If no reservation ticket is given to the execbuf reservation utilities, attempt the reservations with non-blocking semantics. This is intended for eviction paths that use the execbuf reservation utilities for convenience rather than for deadlock avoidance.
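As a rough illustration, a minimal sketch of a hypothetical eviction-path caller after this change (evict_single_bo is illustrative and not part of this patch); a NULL ticket turns every reserve into a tryreserve, so -EBUSY must be treated as a transient condition rather than relying on ww_mutex deadlock avoidance:

static int evict_single_bo(struct ttm_buffer_object *bo)
{
	struct ttm_validate_buffer val_buf = { .bo = bo };
	struct list_head val_list;
	int ret;

	INIT_LIST_HEAD(&val_list);
	list_add_tail(&val_buf.head, &val_list);

	/* NULL ticket: every reserve is a non-blocking tryreserve. */
	ret = ttm_eu_reserve_buffers(NULL, &val_list);
	if (unlikely(ret != 0))
		return ret;	/* typically -EBUSY; caller retries later */

	/* ... move or unbind the buffer here ... */

	ttm_eu_backoff_reservation(NULL, &val_list);
	return 0;
}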
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
---
 drivers/gpu/drm/ttm/ttm_execbuf_util.c | 32 +++++++++++++++++++-------------
 include/drm/ttm/ttm_execbuf_util.h     |  3 ++-
 2 files changed, 21 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_execbuf_util.c b/drivers/gpu/drm/ttm/ttm_execbuf_util.c
index 6c91178..479e941 100644
--- a/drivers/gpu/drm/ttm/ttm_execbuf_util.c
+++ b/drivers/gpu/drm/ttm/ttm_execbuf_util.c
@@ -32,8 +32,7 @@
 #include <linux/sched.h>
 #include <linux/module.h>

-static void ttm_eu_backoff_reservation_locked(struct list_head *list,
-					      struct ww_acquire_ctx *ticket)
+static void ttm_eu_backoff_reservation_locked(struct list_head *list)
 {
 	struct ttm_validate_buffer *entry;

@@ -93,8 +92,9 @@ void ttm_eu_backoff_reservation(struct ww_acquire_ctx *ticket,
 	entry = list_first_entry(list, struct ttm_validate_buffer, head);
 	glob = entry->bo->glob;
 	spin_lock(&glob->lru_lock);
-	ttm_eu_backoff_reservation_locked(list, ticket);
-	ww_acquire_fini(ticket);
+	ttm_eu_backoff_reservation_locked(list);
+	if (ticket)
+		ww_acquire_fini(ticket);
 	spin_unlock(&glob->lru_lock);
 }
 EXPORT_SYMBOL(ttm_eu_backoff_reservation);
@@ -130,7 +130,8 @@ int ttm_eu_reserve_buffers(struct ww_acquire_ctx *ticket,
 	entry = list_first_entry(list, struct ttm_validate_buffer, head);
 	glob = entry->bo->glob;

-	ww_acquire_init(ticket, &reservation_ww_class);
+	if (ticket)
+		ww_acquire_init(ticket, &reservation_ww_class);
 retry:
 	list_for_each_entry(entry, list, head) {
 		struct ttm_buffer_object *bo = entry->bo;
@@ -139,16 +140,17 @@ retry:
 		if (entry->reserved)
 			continue;

-
-		ret = ttm_bo_reserve_nolru(bo, true, false, true, ticket);
+		ret = ttm_bo_reserve_nolru(bo, true, (ticket == NULL), true,
+					   ticket);

 		if (ret == -EDEADLK) {
 			/* uh oh, we lost out, drop every reservation and try
 			 * to only reserve this buffer, then start over if
 			 * this succeeds.
 			 */
+			BUG_ON(ticket == NULL);
 			spin_lock(&glob->lru_lock);
-			ttm_eu_backoff_reservation_locked(list, ticket);
+			ttm_eu_backoff_reservation_locked(list);
 			spin_unlock(&glob->lru_lock);
 			ttm_eu_list_ref_sub(list);
 			ret = ww_mutex_lock_slow_interruptible(&bo->resv->lock,
@@ -175,7 +177,8 @@ retry:
 		}
 	}

-	ww_acquire_done(ticket);
+	if (ticket)
+		ww_acquire_done(ticket);
 	spin_lock(&glob->lru_lock);
 	ttm_eu_del_from_lru_locked(list);
 	spin_unlock(&glob->lru_lock);
@@ -184,12 +187,14 @@ retry:

 err:
 	spin_lock(&glob->lru_lock);
-	ttm_eu_backoff_reservation_locked(list, ticket);
+	ttm_eu_backoff_reservation_locked(list);
 	spin_unlock(&glob->lru_lock);
 	ttm_eu_list_ref_sub(list);
 err_fini:
-	ww_acquire_done(ticket);
-	ww_acquire_fini(ticket);
+	if (ticket) {
+		ww_acquire_done(ticket);
+		ww_acquire_fini(ticket);
+	}
 	return ret;
 }
 EXPORT_SYMBOL(ttm_eu_reserve_buffers);
@@ -224,7 +229,8 @@ void ttm_eu_fence_buffer_objects(struct ww_acquire_ctx *ticket,
 	}
 	spin_unlock(&bdev->fence_lock);
 	spin_unlock(&glob->lru_lock);
-	ww_acquire_fini(ticket);
+	if (ticket)
+		ww_acquire_fini(ticket);

 	list_for_each_entry(entry, list, head) {
 		if (entry->old_sync_obj)
diff --git a/include/drm/ttm/ttm_execbuf_util.h b/include/drm/ttm/ttm_execbuf_util.h
index ec8a1d3..16db7d0 100644
--- a/include/drm/ttm/ttm_execbuf_util.h
+++ b/include/drm/ttm/ttm_execbuf_util.h
@@ -70,7 +70,8 @@ extern void ttm_eu_backoff_reservation(struct ww_acquire_ctx *ticket,
 /**
  * function ttm_eu_reserve_buffers
  *
- * @ticket:  [out] ww_acquire_ctx returned by call.
+ * @ticket:  [out] ww_acquire_ctx filled in by call, or NULL if only
+ *           non-blocking reserves should be tried.
  * @list:    thread private list of ttm_validate_buffer structs.
  *
  * Tries to reserve bos pointed to by the list entries for validation.
A lockdep warning is hit when evicting surfaces and reserving the backup buffer. Since this buffer can only be reserved by the process holding the surface reservation or by the buffer eviction processes that use tryreserve, there is no real deadlock here, but there's no other way to silence lockdep than to use a tryreserve. This means the reservation might fail if the buffer is about to be evicted or swapped out, but we now have code in place to handle that reasonably well.
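For reference, the sequence lockdep reports looks roughly like this (a simplified sketch, not an actual trace):

/*
 *   hold the surface reservation (surface path)
 *     vmw_resource_do_evict()
 *       vmw_resource_check_buffer()
 *         reserve the backup bo           <- same ww class, lockdep
 *                                            warns about recursion
 *
 * With a NULL ticket the inner reserve becomes a trylock, which lockdep
 * does not flag, at the cost of a possible -EBUSY if the backup buffer
 * is concurrently being evicted or swapped out.
 */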
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
---
 drivers/gpu/drm/vmwgfx/vmwgfx_resource.c | 16 +++++++---------
 1 file changed, 7 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c b/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c
index 0e67cf4..4ea0be2 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c
@@ -992,7 +992,6 @@ void vmw_resource_unreserve(struct vmw_resource *res,
  */
 static int
 vmw_resource_check_buffer(struct vmw_resource *res,
-			  struct ww_acquire_ctx *ticket,
 			  bool interruptible,
 			  struct ttm_validate_buffer *val_buf)
 {
@@ -1009,7 +1008,7 @@ vmw_resource_check_buffer(struct vmw_resource *res,
 	INIT_LIST_HEAD(&val_list);
 	val_buf->bo = ttm_bo_reference(&res->backup->base);
 	list_add_tail(&val_buf->head, &val_list);
-	ret = ttm_eu_reserve_buffers(ticket, &val_list);
+	ret = ttm_eu_reserve_buffers(NULL, &val_list);
 	if (unlikely(ret != 0))
 		goto out_no_reserve;

@@ -1027,12 +1026,13 @@ vmw_resource_check_buffer(struct vmw_resource *res,
 	return 0;

 out_no_validate:
-	ttm_eu_backoff_reservation(ticket, &val_list);
+	ttm_eu_backoff_reservation(NULL, &val_list);
 out_no_reserve:
 	ttm_bo_unref(&val_buf->bo);
 	if (backup_dirty)
 		vmw_dmabuf_unreference(&res->backup);

+	DRM_INFO("Check buffer ret %d\n", ret);
 	return ret;
 }

@@ -1072,8 +1072,7 @@ int vmw_resource_reserve(struct vmw_resource *res, bool no_backup)
  * @val_buf:        Backup buffer information.
  */
 static void
-vmw_resource_backoff_reservation(struct ww_acquire_ctx *ticket,
-				 struct ttm_validate_buffer *val_buf)
+vmw_resource_backoff_reservation(struct ttm_validate_buffer *val_buf)
 {
 	struct list_head val_list;

@@ -1082,7 +1081,7 @@ vmw_resource_backoff_reservation(struct ww_acquire_ctx *ticket,

 	INIT_LIST_HEAD(&val_list);
 	list_add_tail(&val_buf->head, &val_list);
-	ttm_eu_backoff_reservation(ticket, &val_list);
+	ttm_eu_backoff_reservation(NULL, &val_list);
 	ttm_bo_unref(&val_buf->bo);
 }

@@ -1096,13 +1095,12 @@ int vmw_resource_do_evict(struct vmw_resource *res)
 {
 	struct ttm_validate_buffer val_buf;
 	const struct vmw_res_func *func = res->func;
-	struct ww_acquire_ctx ticket;
 	int ret;

 	BUG_ON(!func->may_evict);

 	val_buf.bo = NULL;
-	ret = vmw_resource_check_buffer(res, &ticket, true, &val_buf);
+	ret = vmw_resource_check_buffer(res, true, &val_buf);
 	if (unlikely(ret != 0))
 		return ret;

@@ -1117,7 +1115,7 @@ int vmw_resource_do_evict(struct vmw_resource *res)
 	res->backup_dirty = true;
 	res->res_dirty = false;
 out_no_unbind:
-	vmw_resource_backoff_reservation(&ticket, &val_buf);
+	vmw_resource_backoff_reservation(&val_buf);

 	return ret;
 }
On Fri, Nov 15, 2013 at 12:24:32AM -0800, Thomas Hellstrom wrote:
> A lockdep warning is hit when evicting surfaces and reserving the
> backup buffer. Since this buffer can only be reserved by the process
> holding the surface reservation or by the buffer eviction processes
> that use tryreserve, there is no real deadlock here, but there's no
> other way to silence lockdep than to use a tryreserve. This means the
> reservation might fail if the buffer is about to be evicted or swapped
> out, but we now have code in place to handle that reasonably well.
>
> Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Hm, for similar cases where there's an additional hierarchy imposed onto the locking order, lockdep supports subclasses. Block devices use that to nest partitions within the overall block device.

Have you looked into wiring this up for ww_mutexes, i.e. fixing lockdep instead of working around it in the code? I'm thinking of a ww_mutex_lock_nested similar to mutex_lock_nested.
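Roughly like this, as a sketch: the block-layer calls below exist today (whole_disk and partition are illustrative names), while the ww_mutex variant is purely hypothetical:

/* Existing pattern for plain mutexes, used to nest a partition's
 * bd_mutex inside the whole disk's bd_mutex: */
mutex_lock(&whole_disk->bd_mutex);		/* subclass 0 (default) */
mutex_lock_nested(&partition->bd_mutex, 1);	/* subclass 1, nested */

/* A hypothetical ww_mutex counterpart could mirror that: */
int ww_mutex_lock_nested(struct ww_mutex *lock, struct ww_acquire_ctx *ctx,
			 unsigned int subclass);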
Cheers, Daniel
On 11/15/2013 10:29 AM, Daniel Vetter wrote:
> On Fri, Nov 15, 2013 at 12:24:32AM -0800, Thomas Hellstrom wrote:
>> A lockdep warning is hit when evicting surfaces and reserving the
>> backup buffer. Since this buffer can only be reserved by the process
>> holding the surface reservation or by the buffer eviction processes
>> that use tryreserve, there is no real deadlock here, but there's no
>> other way to silence lockdep than to use a tryreserve. This means the
>> reservation might fail if the buffer is about to be evicted or swapped
>> out, but we now have code in place to handle that reasonably well.
>>
>> Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
>
> Hm, for similar cases where there's an additional hierarchy imposed onto
> the locking order, lockdep supports subclasses. Block devices use that to
> nest partitions within the overall block device.
>
> Have you looked into wiring this up for ww_mutexes, i.e. fixing lockdep
> instead of working around it in the code? I'm thinking of a
> ww_mutex_lock_nested similar to mutex_lock_nested.
>
> Cheers, Daniel
Yeah, I've thought of that, and it would definitely come in handy for the upcoming page-table bos, where reservation must not fail and where I otherwise need to play other tricks. I might have a chance to look at that before merging that code.
However, in this particular case I'm not sure it will work, because the same buffers can be reserved at different nesting levels, either as part of command submission or as part of eviction. In the block device analogy, a mutex could be taken both as a device mutex and as a partition mutex. I haven't looked into the lock_nested semantics closely enough to figure out whether that would work or confuse lockdep beyond recovery :).
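To illustrate (sketch only, using the hypothetical ww_mutex_lock_nested from above):

/*
 * The same bo lock would need a different subclass depending on the
 * path that takes it:
 *
 *   command submission:  ww_mutex_lock(&bo->resv->lock, ctx);
 *                        backup bo taken at the top level (subclass 0)
 *
 *   surface eviction:    ww_mutex_lock_nested(&bo->resv->lock, ctx, 1);
 *                        same backup bo, now nested (subclass 1)
 *
 * Since lockdep keys on class + subclass at the call site rather than
 * on the object, annotating by path could mask real inversions between
 * the two paths.
 */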
In any case, this needs a quick workaround for now.
/Thomas