On Fri, Jan 15, 2021 at 02:35:50PM +0100, Christian König wrote:
DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT can't be used when a reservation object lock is held, otherwise we can deadlock with page faults.
Make lockdep complain badly about that.
Signed-off-by: Christian König <christian.koenig@amd.com>
drivers/gpu/drm/drm_syncobj.c | 14 ++++++++++++++
1 file changed, 14 insertions(+)
diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c
index 6e74e6745eca..6228e9cd089a 100644
--- a/drivers/gpu/drm/drm_syncobj.c
+++ b/drivers/gpu/drm/drm_syncobj.c
@@ -387,6 +387,20 @@ int drm_syncobj_find_fence(struct drm_file *file_private,
if (!syncobj)
return -ENOENT;
- if (flags & DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT &&
IS_ENABLED(CONFIG_LOCKDEP)) {
struct dma_resv robj;
/* Waiting for userspace with a reservation lock held is illegal
* because that can deadlock with page faults. Make lockdep
* complain about it early on.
Not sure this is a good enough explanation, since anything that holds up userspace can result in a functional deadlock (i.e. the user observes no forward progress, gets angry and decides that our gpu driver stack is garbage). Page faults are by far not the only case.
I'd put something like
/* We must not impede forward progress of userspace in any
 * way, for otherwise the future fence never materializes
 * and the application grinds to a full halt. Check for
 * the worst offenders in terms of locking issues.
 */
Feel free to bikeshed further.
*/
dma_resv_init(&robj);
dma_resv_lock(&robj, NULL);
dma_resv_unlock(&robj);
dma_resv_fini(&robj);
I think you want to go stronger, since it's not just dma_resv: holding anything that might hold up userspace is illegal here. A lockdep_assert_no_locks_held might be ideal, but a good second-best option would be to grab mmap_lock. Since dma_resv (and a lot of other things, like gup in general) nest within that, it would be a substantially stronger assertion.
Specifically this should also go boom when you do it in places like serving (hmm) page faults, which I think we want. Just taking dma_resv_lock won't go boom like that (since taking the dma_resv_lock from a page fault handler is explicitly allowed, it nests within mmap_lock).
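To illustrate the difference in strength, here is a toy userspace model in plain C. It is not kernel code: every toy_* name is made up for this sketch, and the two-level lock order (mmap_lock outside, dma_resv inside) is a deliberate simplification. It flags an acquisition whenever the same lock, or one that nests inside it, is already held, which is roughly the kind of ordering violation lockdep would report.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical lock classes, listed outermost first. */
enum toy_lock {
	TOY_MMAP_LOCK = 0,	/* outer lock, held while serving page faults */
	TOY_DMA_RESV = 1,	/* assumed to nest within TOY_MMAP_LOCK */
	TOY_NUM_LOCKS
};

/* Which locks the calling context currently holds. */
struct toy_ctx {
	bool held[TOY_NUM_LOCKS];
};

/*
 * Return true if acquiring 'want' breaks the toy ordering, i.e. if
 * 'want' itself or any lock nesting inside it is already held.
 */
static bool toy_would_complain(const struct toy_ctx *ctx, enum toy_lock want)
{
	int i;

	for (i = want; i < TOY_NUM_LOCKS; i++)
		if (ctx->held[i])
			return true;
	return false;
}

/* Compare the two priming choices across three calling contexts. */
static void toy_demo(void)
{
	struct toy_ctx idle = { .held = { false, false } };
	struct toy_ctx fault = { .held = { [TOY_MMAP_LOCK] = true } };
	struct toy_ctx resv = { .held = { [TOY_DMA_RESV] = true } };

	/* Priming with dma_resv only catches callers holding dma_resv. */
	assert(!toy_would_complain(&idle, TOY_DMA_RESV));
	assert(!toy_would_complain(&fault, TOY_DMA_RESV));
	assert(toy_would_complain(&resv, TOY_DMA_RESV));

	/* Priming with mmap_lock also catches the page fault context. */
	assert(!toy_would_complain(&idle, TOY_MMAP_LOCK));
	assert(toy_would_complain(&fault, TOY_MMAP_LOCK));
	assert(toy_would_complain(&resv, TOY_MMAP_LOCK));
}
```

Calling toy_demo() from a build without NDEBUG runs all six checks. The only row that differs between the two choices is the page fault context, which is exactly the case the dma_resv-only check would miss.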
Conceptually I think it's otherwise all fine and at the right spot. -Daniel
- }
- *fence = drm_syncobj_fence_get(syncobj);
drm_syncobj_put(syncobj);
-- 2.25.1