Hi all,
While looking at (dynamic) dma-buf issues and ideas I've spotted a bit more room for locking down the cross-driver dma_resv rules.
Only functional fallout is a tiny patch for msm, assuming I didn't botch anything in the auditing of all relevant code.
Comments, review and testing very much appreciated, as usual.
Cheers, Daniel
Daniel Vetter (3): drm/modeset: Prime modeset lock vs dma_resv dma-resv: Also prime acquire ctx for lockdep drm/msm: Don't init ww_mutec acquire ctx before needed
drivers/dma-buf/dma-resv.c | 8 +++++++- drivers/gpu/drm/drm_mode_config.c | 28 ++++++++++++++++++++++++++++ drivers/gpu/drm/msm/msm_gem_submit.c | 2 +- 3 files changed, 36 insertions(+), 2 deletions(-)
It's kinda really hard to get this wrong on a driver with both display and dma_resv locking. But who ever knows, so better to make sure that really all drivers nest these the same way.
For actual lock semantics the acquire context nesting doesn't matter. But to teach lockdep what's going on with ww_mutex the acquire ctx is a fake lockdep lock, hence from a lockdep pov it does matter. That's why I figured better to include it.
Cc: Maarten Lankhorst maarten.lankhorst@linux.intel.com Cc: Chris Wilson chris@chris-wilson.co.uk Cc: Christian König christian.koenig@amd.com Signed-off-by: Daniel Vetter daniel.vetter@intel.com --- drivers/gpu/drm/drm_mode_config.c | 28 ++++++++++++++++++++++++++++ 1 file changed, 28 insertions(+)
diff --git a/drivers/gpu/drm/drm_mode_config.c b/drivers/gpu/drm/drm_mode_config.c index 3b570a404933..08e6eff6a179 100644 --- a/drivers/gpu/drm/drm_mode_config.c +++ b/drivers/gpu/drm/drm_mode_config.c @@ -27,6 +27,7 @@ #include <drm/drm_file.h> #include <drm/drm_mode_config.h> #include <drm/drm_print.h> +#include <linux/dma-resv.h>
#include "drm_crtc_internal.h" #include "drm_internal.h" @@ -415,6 +416,33 @@ void drm_mode_config_init(struct drm_device *dev) dev->mode_config.num_crtc = 0; dev->mode_config.num_encoder = 0; dev->mode_config.num_total_plane = 0; + + if (IS_ENABLED(CONFIG_LOCKDEP)) { + struct drm_modeset_acquire_ctx modeset_ctx; + struct ww_acquire_ctx resv_ctx; + struct dma_resv resv; + int ret; + + dma_resv_init(&resv); + + drm_modeset_acquire_init(&modeset_ctx, 0); + ret = drm_modeset_lock(&dev->mode_config.connection_mutex, + &modeset_ctx); + if (ret == -EDEADLK) + ret = drm_modeset_backoff(&modeset_ctx); + + ww_acquire_init(&resv_ctx, &reservation_ww_class); + ret = dma_resv_lock(&resv, &resv_ctx); + if (ret == -EDEADLK) + dma_resv_lock_slow(&resv, &resv_ctx); + + dma_resv_unlock(&resv); + ww_acquire_fini(&resv_ctx); + + drm_modeset_drop_locks(&modeset_ctx); + drm_modeset_acquire_fini(&modeset_ctx); + dma_resv_fini(&resv); + } } EXPORT_SYMBOL(drm_mode_config_init);
Am 19.11.19 um 22:08 schrieb Daniel Vetter:
It's kinda really hard to get this wrong on a driver with both display and dma_resv locking. But who ever knows, so better to make sure that really all drivers nest these the same way.
For actual lock semantics the acquire context nesting doesn't matter. But to teach lockdep what's going on with ww_mutex the acquire ctx is a fake lockdep lock, hence from a lockdep pov it does matter. That's why I figured better to include it.
Cc: Maarten Lankhorst maarten.lankhorst@linux.intel.com Cc: Chris Wilson chris@chris-wilson.co.uk Cc: Christian König christian.koenig@amd.com Signed-off-by: Daniel Vetter daniel.vetter@intel.com
Why not add another __init function like we did for dma_resv? That looked rather clean to me.
Either why feel free to add an Acked-by: Christian König christian.koenig@amd.com to the patch.
Regards, Christian.
drivers/gpu/drm/drm_mode_config.c | 28 ++++++++++++++++++++++++++++ 1 file changed, 28 insertions(+)
diff --git a/drivers/gpu/drm/drm_mode_config.c b/drivers/gpu/drm/drm_mode_config.c index 3b570a404933..08e6eff6a179 100644 --- a/drivers/gpu/drm/drm_mode_config.c +++ b/drivers/gpu/drm/drm_mode_config.c @@ -27,6 +27,7 @@ #include <drm/drm_file.h> #include <drm/drm_mode_config.h> #include <drm/drm_print.h> +#include <linux/dma-resv.h>
#include "drm_crtc_internal.h" #include "drm_internal.h" @@ -415,6 +416,33 @@ void drm_mode_config_init(struct drm_device *dev) dev->mode_config.num_crtc = 0; dev->mode_config.num_encoder = 0; dev->mode_config.num_total_plane = 0;
- if (IS_ENABLED(CONFIG_LOCKDEP)) {
struct drm_modeset_acquire_ctx modeset_ctx;
struct ww_acquire_ctx resv_ctx;
struct dma_resv resv;
int ret;
dma_resv_init(&resv);
drm_modeset_acquire_init(&modeset_ctx, 0);
ret = drm_modeset_lock(&dev->mode_config.connection_mutex,
&modeset_ctx);
if (ret == -EDEADLK)
ret = drm_modeset_backoff(&modeset_ctx);
ww_acquire_init(&resv_ctx, &reservation_ww_class);
ret = dma_resv_lock(&resv, &resv_ctx);
if (ret == -EDEADLK)
dma_resv_lock_slow(&resv, &resv_ctx);
dma_resv_unlock(&resv);
ww_acquire_fini(&resv_ctx);
drm_modeset_drop_locks(&modeset_ctx);
drm_modeset_acquire_fini(&modeset_ctx);
dma_resv_fini(&resv);
- } } EXPORT_SYMBOL(drm_mode_config_init);
On Wed, Nov 20, 2019 at 09:34:03AM +0100, Christian König wrote:
Am 19.11.19 um 22:08 schrieb Daniel Vetter:
It's kinda really hard to get this wrong on a driver with both display and dma_resv locking. But who ever knows, so better to make sure that really all drivers nest these the same way.
For actual lock semantics the acquire context nesting doesn't matter. But to teach lockdep what's going on with ww_mutex the acquire ctx is a fake lockdep lock, hence from a lockdep pov it does matter. That's why I figured better to include it.
Cc: Maarten Lankhorst maarten.lankhorst@linux.intel.com Cc: Chris Wilson chris@chris-wilson.co.uk Cc: Christian König christian.koenig@amd.com Signed-off-by: Daniel Vetter daniel.vetter@intel.com
Why not add another __init function like we did for dma_resv? That looked rather clean to me.
Hm, I've only done that because callers of dma_resv_init where holding lots of locks already (ttm ghost objects). At least in i915 we try to do all lockdep priming as close as possible to the mutex_init calls, so it's all together. Since often there's no separate obj_init function, and you need to use the same mutex_init to make sure you have the same static lockdep key.
No strong opinion here from me, just wanted to explain why I'm biased to this way of doing things.
Either why feel free to add an Acked-by: Christian König christian.koenig@amd.com to the patch.
Thanks. Can you pls also look at patch 2, at least from a ttm/amdgpu pov?
Cheers, Daniel
Regards, Christian.
drivers/gpu/drm/drm_mode_config.c | 28 ++++++++++++++++++++++++++++ 1 file changed, 28 insertions(+)
diff --git a/drivers/gpu/drm/drm_mode_config.c b/drivers/gpu/drm/drm_mode_config.c index 3b570a404933..08e6eff6a179 100644 --- a/drivers/gpu/drm/drm_mode_config.c +++ b/drivers/gpu/drm/drm_mode_config.c @@ -27,6 +27,7 @@ #include <drm/drm_file.h> #include <drm/drm_mode_config.h> #include <drm/drm_print.h> +#include <linux/dma-resv.h> #include "drm_crtc_internal.h" #include "drm_internal.h" @@ -415,6 +416,33 @@ void drm_mode_config_init(struct drm_device *dev) dev->mode_config.num_crtc = 0; dev->mode_config.num_encoder = 0; dev->mode_config.num_total_plane = 0;
- if (IS_ENABLED(CONFIG_LOCKDEP)) {
struct drm_modeset_acquire_ctx modeset_ctx;
struct ww_acquire_ctx resv_ctx;
struct dma_resv resv;
int ret;
dma_resv_init(&resv);
drm_modeset_acquire_init(&modeset_ctx, 0);
ret = drm_modeset_lock(&dev->mode_config.connection_mutex,
&modeset_ctx);
if (ret == -EDEADLK)
ret = drm_modeset_backoff(&modeset_ctx);
ww_acquire_init(&resv_ctx, &reservation_ww_class);
ret = dma_resv_lock(&resv, &resv_ctx);
if (ret == -EDEADLK)
dma_resv_lock_slow(&resv, &resv_ctx);
dma_resv_unlock(&resv);
ww_acquire_fini(&resv_ctx);
drm_modeset_drop_locks(&modeset_ctx);
drm_modeset_acquire_fini(&modeset_ctx);
dma_resv_fini(&resv);
- } } EXPORT_SYMBOL(drm_mode_config_init);
Semnatically it really doesn't matter where we grab the ticket. But since the ticket is a fake lockdep lock, it matters for lockdep validation purposes.
This means stuff like grabbing a ticket and then doing copy_from/to_user isn't allowed anymore. This is a changed compared to the current ttm fault handler, which doesn't bother with having a full reservation. Since I'm looking into fixing the TODO entry in ttm_mem_evict_wait_busy() I think that'll have to change sooner or later anyway, better get started. A bit more context on why I'm looking into this: For backwards compat with existing i915 gem code I think we'll have to do full slowpath locking in the i915 equivalent of the eviction code. And with dynamic dma-buf that will leak across drivers, so another thing we need to standardize and make sure it's done the same way everyway.
Unfortunately this means another full audit of all drivers:
- gem helpers: acquire_init is done right before taking locks, so no problem. Same for acquire_fini and unlocking, which means nothing that's not already covered by the dma_resv_lock rules will be caught with this extension here to the acquire_ctx.
- etnaviv: An absolute massive amount of code is run between the acquire_init and the first lock acquisition in submit_lock_objects. But nothing that would touch user memory and could cause a fault. Furthermore nothing that uses the ticket, so even if I missed something, it would be easy to fix by pushing the acquire_init right before the first use. Similar on the unlock/acquire_fini side.
- i915: Right now (and this will likely change a lot rsn) the acquire ctx and actual locks are right next to each another. No problem.
- msm has a problem: submit_create calls acquire_init, but then submit_lookup_objects() has a bunch of copy_from_user to do the object lookups. That's the only thing before submit_lock_objects call dma_resv_lock(). Despite all the copypasta to etnaviv, etnaviv does not have this issue since it copies all the userspace structs earlier. submit_cleanup does not have any such issues.
With the prep patch to pull out the acquire_ctx and reorder it msm is going to be safe too.
- nouveau: acquire_init is right next to ttm_bo_reserve, so all good. Similar on the acquire_fini/ttm_bo_unreserve side.
- ttm execbuf utils: acquire context and locking are even in the same functions here (one function to reserve everything, the other to unreserve), so all good.
- vc4: Another case where acquire context and locking are handled in the same functions (one function to lock everything, the other to unlock).
Cc: Maarten Lankhorst maarten.lankhorst@linux.intel.com Cc: Chris Wilson chris@chris-wilson.co.uk Cc: Christian König christian.koenig@amd.com Cc: Sumit Semwal sumit.semwal@linaro.org Cc: linux-media@vger.kernel.org Cc: linaro-mm-sig@lists.linaro.org Cc: Huang Rui ray.huang@amd.com Cc: Eric Anholt eric@anholt.net Cc: Ben Skeggs bskeggs@redhat.com Cc: Alex Deucher alexander.deucher@amd.com Cc: Rob Herring robh@kernel.org Cc: Lucas Stach l.stach@pengutronix.de Cc: Russell King linux+etnaviv@armlinux.org.uk Cc: Christian Gmeiner christian.gmeiner@gmail.com Cc: Rob Clark robdclark@gmail.com Cc: Sean Paul sean@poorly.run Signed-off-by: Daniel Vetter daniel.vetter@intel.com --- drivers/dma-buf/dma-resv.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c index d3c760e19991..079e38fde33a 100644 --- a/drivers/dma-buf/dma-resv.c +++ b/drivers/dma-buf/dma-resv.c @@ -100,7 +100,9 @@ static void dma_resv_list_free(struct dma_resv_list *list) static void __init dma_resv_lockdep(void) { struct mm_struct *mm = mm_alloc(); + struct ww_acquire_ctx ctx; struct dma_resv obj; + int ret;
if (!mm) return; @@ -108,10 +110,14 @@ static void __init dma_resv_lockdep(void) dma_resv_init(&obj);
down_read(&mm->mmap_sem); - ww_mutex_lock(&obj.lock, NULL); + ww_acquire_init(&ctx, &reservation_ww_class); + ret = dma_resv_lock(&obj, &ctx); + if (ret == -EDEADLK) + dma_resv_lock_slow(&obj, &ctx); fs_reclaim_acquire(GFP_KERNEL); fs_reclaim_release(GFP_KERNEL); ww_mutex_unlock(&obj.lock); + ww_acquire_fini(&ctx); up_read(&mm->mmap_sem); mmput(mm);
Am 19.11.19 um 22:08 schrieb Daniel Vetter:
Semnatically it really doesn't matter where we grab the ticket. But since the ticket is a fake lockdep lock, it matters for lockdep validation purposes.
This means stuff like grabbing a ticket and then doing copy_from/to_user isn't allowed anymore. This is a changed compared to the current ttm fault handler, which doesn't bother with having a full reservation. Since I'm looking into fixing the TODO entry in ttm_mem_evict_wait_busy() I think that'll have to change sooner or later anyway, better get started. A bit more context on why I'm looking into this: For backwards compat with existing i915 gem code I think we'll have to do full slowpath locking in the i915 equivalent of the eviction code. And with dynamic dma-buf that will leak across drivers, so another thing we need to standardize and make sure it's done the same way everyway.
Unfortunately this means another full audit of all drivers:
gem helpers: acquire_init is done right before taking locks, so no problem. Same for acquire_fini and unlocking, which means nothing that's not already covered by the dma_resv_lock rules will be caught with this extension here to the acquire_ctx.
etnaviv: An absolute massive amount of code is run between the acquire_init and the first lock acquisition in submit_lock_objects. But nothing that would touch user memory and could cause a fault. Furthermore nothing that uses the ticket, so even if I missed something, it would be easy to fix by pushing the acquire_init right before the first use. Similar on the unlock/acquire_fini side.
i915: Right now (and this will likely change a lot rsn) the acquire ctx and actual locks are right next to each another. No problem.
msm has a problem: submit_create calls acquire_init, but then submit_lookup_objects() has a bunch of copy_from_user to do the object lookups. That's the only thing before submit_lock_objects call dma_resv_lock(). Despite all the copypasta to etnaviv, etnaviv does not have this issue since it copies all the userspace structs earlier. submit_cleanup does not have any such issues.
With the prep patch to pull out the acquire_ctx and reorder it msm is going to be safe too.
nouveau: acquire_init is right next to ttm_bo_reserve, so all good. Similar on the acquire_fini/ttm_bo_unreserve side.
ttm execbuf utils: acquire context and locking are even in the same functions here (one function to reserve everything, the other to unreserve), so all good.
vc4: Another case where acquire context and locking are handled in the same functions (one function to lock everything, the other to unlock).
Cc: Maarten Lankhorst maarten.lankhorst@linux.intel.com Cc: Chris Wilson chris@chris-wilson.co.uk Cc: Christian König christian.koenig@amd.com Cc: Sumit Semwal sumit.semwal@linaro.org Cc: linux-media@vger.kernel.org Cc: linaro-mm-sig@lists.linaro.org Cc: Huang Rui ray.huang@amd.com Cc: Eric Anholt eric@anholt.net Cc: Ben Skeggs bskeggs@redhat.com Cc: Alex Deucher alexander.deucher@amd.com Cc: Rob Herring robh@kernel.org Cc: Lucas Stach l.stach@pengutronix.de Cc: Russell King linux+etnaviv@armlinux.org.uk Cc: Christian Gmeiner christian.gmeiner@gmail.com Cc: Rob Clark robdclark@gmail.com Cc: Sean Paul sean@poorly.run Signed-off-by: Daniel Vetter daniel.vetter@intel.com
Acked-by: Christian König christian.koenig@amd.com
drivers/dma-buf/dma-resv.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c index d3c760e19991..079e38fde33a 100644 --- a/drivers/dma-buf/dma-resv.c +++ b/drivers/dma-buf/dma-resv.c @@ -100,7 +100,9 @@ static void dma_resv_list_free(struct dma_resv_list *list) static void __init dma_resv_lockdep(void) { struct mm_struct *mm = mm_alloc();
struct ww_acquire_ctx ctx; struct dma_resv obj;
int ret;
if (!mm) return;
@@ -108,10 +110,14 @@ static void __init dma_resv_lockdep(void) dma_resv_init(&obj);
down_read(&mm->mmap_sem);
- ww_mutex_lock(&obj.lock, NULL);
ww_acquire_init(&ctx, &reservation_ww_class);
ret = dma_resv_lock(&obj, &ctx);
if (ret == -EDEADLK)
dma_resv_lock_slow(&obj, &ctx);
fs_reclaim_acquire(GFP_KERNEL); fs_reclaim_release(GFP_KERNEL); ww_mutex_unlock(&obj.lock);
ww_acquire_fini(&ctx); up_read(&mm->mmap_sem);
mmput(mm);
Op 20-11-2019 om 12:30 schreef Christian König:
Am 19.11.19 um 22:08 schrieb Daniel Vetter:
Semnatically it really doesn't matter where we grab the ticket. But since the ticket is a fake lockdep lock, it matters for lockdep validation purposes.
This means stuff like grabbing a ticket and then doing copy_from/to_user isn't allowed anymore. This is a changed compared to the current ttm fault handler, which doesn't bother with having a full reservation. Since I'm looking into fixing the TODO entry in ttm_mem_evict_wait_busy() I think that'll have to change sooner or later anyway, better get started. A bit more context on why I'm looking into this: For backwards compat with existing i915 gem code I think we'll have to do full slowpath locking in the i915 equivalent of the eviction code. And with dynamic dma-buf that will leak across drivers, so another thing we need to standardize and make sure it's done the same way everyway.
Unfortunately this means another full audit of all drivers:
- gem helpers: acquire_init is done right before taking locks, so no
problem. Same for acquire_fini and unlocking, which means nothing that's not already covered by the dma_resv_lock rules will be caught with this extension here to the acquire_ctx.
- etnaviv: An absolute massive amount of code is run between the
acquire_init and the first lock acquisition in submit_lock_objects. But nothing that would touch user memory and could cause a fault. Furthermore nothing that uses the ticket, so even if I missed something, it would be easy to fix by pushing the acquire_init right before the first use. Similar on the unlock/acquire_fini side.
- i915: Right now (and this will likely change a lot rsn) the acquire
ctx and actual locks are right next to each another. No problem.
- msm has a problem: submit_create calls acquire_init, but then
submit_lookup_objects() has a bunch of copy_from_user to do the object lookups. That's the only thing before submit_lock_objects call dma_resv_lock(). Despite all the copypasta to etnaviv, etnaviv does not have this issue since it copies all the userspace structs earlier. submit_cleanup does not have any such issues.
With the prep patch to pull out the acquire_ctx and reorder it msm is going to be safe too.
- nouveau: acquire_init is right next to ttm_bo_reserve, so all good.
Similar on the acquire_fini/ttm_bo_unreserve side.
- ttm execbuf utils: acquire context and locking are even in the same
functions here (one function to reserve everything, the other to unreserve), so all good.
- vc4: Another case where acquire context and locking are handled in
the same functions (one function to lock everything, the other to unlock).
Cc: Maarten Lankhorst maarten.lankhorst@linux.intel.com Cc: Chris Wilson chris@chris-wilson.co.uk Cc: Christian König christian.koenig@amd.com Cc: Sumit Semwal sumit.semwal@linaro.org Cc: linux-media@vger.kernel.org Cc: linaro-mm-sig@lists.linaro.org Cc: Huang Rui ray.huang@amd.com Cc: Eric Anholt eric@anholt.net Cc: Ben Skeggs bskeggs@redhat.com Cc: Alex Deucher alexander.deucher@amd.com Cc: Rob Herring robh@kernel.org Cc: Lucas Stach l.stach@pengutronix.de Cc: Russell King linux+etnaviv@armlinux.org.uk Cc: Christian Gmeiner christian.gmeiner@gmail.com Cc: Rob Clark robdclark@gmail.com Cc: Sean Paul sean@poorly.run Signed-off-by: Daniel Vetter daniel.vetter@intel.com
Acked-by: Christian König christian.koenig@amd.com
drivers/dma-buf/dma-resv.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c index d3c760e19991..079e38fde33a 100644 --- a/drivers/dma-buf/dma-resv.c +++ b/drivers/dma-buf/dma-resv.c @@ -100,7 +100,9 @@ static void dma_resv_list_free(struct dma_resv_list *list) static void __init dma_resv_lockdep(void) { struct mm_struct *mm = mm_alloc(); + struct ww_acquire_ctx ctx; struct dma_resv obj; + int ret; if (!mm) return; @@ -108,10 +110,14 @@ static void __init dma_resv_lockdep(void) dma_resv_init(&obj); down_read(&mm->mmap_sem); - ww_mutex_lock(&obj.lock, NULL); + ww_acquire_init(&ctx, &reservation_ww_class); + ret = dma_resv_lock(&obj, &ctx); + if (ret == -EDEADLK) + dma_resv_lock_slow(&obj, &ctx); fs_reclaim_acquire(GFP_KERNEL); fs_reclaim_release(GFP_KERNEL); ww_mutex_unlock(&obj.lock); + ww_acquire_fini(&ctx); up_read(&mm->mmap_sem); mmput(mm);
For whole series:
Reviewed-by: Maarten Lankhorst maarten.lankhorst@linux.intel.com
typo in patch 3 btw :)
For locking semantics it really doesn't matter when we grab the ticket. But for lockdep validation it does: the acquire ctx is a fake lockdep. Since other drivers might want to do a full multi-lock dance in their fault-handler, not just lock a single dma_resv. Therefore we must init the acquire_ctx only after we've done all the copy_*_user or anything else that might trigger a pagefault. For msm this means we need to move it past submit_lookup_objects.
Aside: Why is msm still using struct_mutex, it seems to be using dma_resv_lock for general buffer state protection?
Signed-off-by: Daniel Vetter daniel.vetter@intel.com Cc: Rob Clark robdclark@gmail.com Cc: Sean Paul sean@poorly.run Cc: linux-arm-msm@vger.kernel.org Cc: freedreno@lists.freedesktop.org --- drivers/gpu/drm/msm/msm_gem_submit.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c b/drivers/gpu/drm/msm/msm_gem_submit.c index fb1fdd7fa3ef..126b2f62bfe7 100644 --- a/drivers/gpu/drm/msm/msm_gem_submit.c +++ b/drivers/gpu/drm/msm/msm_gem_submit.c @@ -54,7 +54,6 @@ static struct msm_gem_submit *submit_create(struct drm_device *dev,
INIT_LIST_HEAD(&submit->node); INIT_LIST_HEAD(&submit->bo_list); - ww_acquire_init(&submit->ticket, &reservation_ww_class);
return submit; } @@ -489,6 +488,7 @@ int msm_ioctl_gem_submit(struct drm_device *dev, void *data, if (ret) goto out;
+ ww_acquire_init(&submit->ticket, &reservation_ww_class); ret = submit_lock_objects(submit); if (ret) goto out;
On Tue, Nov 19, 2019 at 1:08 PM Daniel Vetter daniel.vetter@ffwll.ch wrote:
For locking semantics it really doesn't matter when we grab the ticket. But for lockdep validation it does: the acquire ctx is a fake lockdep. Since other drivers might want to do a full multi-lock dance in their fault-handler, not just lock a single dma_resv. Therefore we must init the acquire_ctx only after we've done all the copy_*_user or anything else that might trigger a pagefault. For msm this means we need to move it past submit_lookup_objects.
seems reasonable.. but maybe a comment would be useful to prevent future-me from "cleaning-this-up" back to the way it was
with that, r-b
Aside: Why is msm still using struct_mutex, it seems to be using dma_resv_lock for general buffer state protection?
only because I (or anyone else) hasn't had time to revisit the struct_mutex usage since we moved to per-object-locks.. the downside, I suppose, of kernel generally working ok and this not being a big enough fire to bubble up to the top of my todo list
BR, -R
Signed-off-by: Daniel Vetter daniel.vetter@intel.com Cc: Rob Clark robdclark@gmail.com Cc: Sean Paul sean@poorly.run Cc: linux-arm-msm@vger.kernel.org Cc: freedreno@lists.freedesktop.org
drivers/gpu/drm/msm/msm_gem_submit.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c b/drivers/gpu/drm/msm/msm_gem_submit.c index fb1fdd7fa3ef..126b2f62bfe7 100644 --- a/drivers/gpu/drm/msm/msm_gem_submit.c +++ b/drivers/gpu/drm/msm/msm_gem_submit.c @@ -54,7 +54,6 @@ static struct msm_gem_submit *submit_create(struct drm_device *dev,
INIT_LIST_HEAD(&submit->node); INIT_LIST_HEAD(&submit->bo_list);
ww_acquire_init(&submit->ticket, &reservation_ww_class); return submit;
} @@ -489,6 +488,7 @@ int msm_ioctl_gem_submit(struct drm_device *dev, void *data, if (ret) goto out;
ww_acquire_init(&submit->ticket, &reservation_ww_class); ret = submit_lock_objects(submit); if (ret) goto out;
-- 2.24.0
On Wed, Nov 20, 2019 at 3:07 AM Rob Clark robdclark@gmail.com wrote:
On Tue, Nov 19, 2019 at 1:08 PM Daniel Vetter daniel.vetter@ffwll.ch wrote:
For locking semantics it really doesn't matter when we grab the ticket. But for lockdep validation it does: the acquire ctx is a fake lockdep. Since other drivers might want to do a full multi-lock dance in their fault-handler, not just lock a single dma_resv. Therefore we must init the acquire_ctx only after we've done all the copy_*_user or anything else that might trigger a pagefault. For msm this means we need to move it past submit_lookup_objects.
seems reasonable.. but maybe a comment would be useful to prevent future-me from "cleaning-this-up" back to the way it was
I'll add a comment.
with that, r-b
Well I spotted a bug for the error handling, I'll need to respin. -Daniel
Aside: Why is msm still using struct_mutex, it seems to be using dma_resv_lock for general buffer state protection?
only because I (or anyone else) hasn't had time to revisit the struct_mutex usage since we moved to per-object-locks.. the downside, I suppose, of kernel generally working ok and this not being a big enough fire to bubble up to the top of my todo list
BR, -R
Signed-off-by: Daniel Vetter daniel.vetter@intel.com Cc: Rob Clark robdclark@gmail.com Cc: Sean Paul sean@poorly.run Cc: linux-arm-msm@vger.kernel.org Cc: freedreno@lists.freedesktop.org
drivers/gpu/drm/msm/msm_gem_submit.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c b/drivers/gpu/drm/msm/msm_gem_submit.c index fb1fdd7fa3ef..126b2f62bfe7 100644 --- a/drivers/gpu/drm/msm/msm_gem_submit.c +++ b/drivers/gpu/drm/msm/msm_gem_submit.c @@ -54,7 +54,6 @@ static struct msm_gem_submit *submit_create(struct drm_device *dev,
INIT_LIST_HEAD(&submit->node); INIT_LIST_HEAD(&submit->bo_list);
ww_acquire_init(&submit->ticket, &reservation_ww_class); return submit;
} @@ -489,6 +488,7 @@ int msm_ioctl_gem_submit(struct drm_device *dev, void *data, if (ret) goto out;
ww_acquire_init(&submit->ticket, &reservation_ww_class); ret = submit_lock_objects(submit); if (ret) goto out;
-- 2.24.0
For locking semantics it really doesn't matter when we grab the ticket. But for lockdep validation it does: the acquire ctx is a fake lockdep. Since other drivers might want to do a full multi-lock dance in their fault-handler, not just lock a single dma_resv. Therefore we must init the acquire_ctx only after we've done all the copy_*_user or anything else that might trigger a pagefault. For msm this means we need to move it past submit_lookup_objects.
Aside: Why is msm still using struct_mutex, it seems to be using dma_resv_lock for general buffer state protection?
v2: - Add comment to explain why the ww ticket setup is separate (Rob) - Fix up error handling, we need to make sure we don't call ww_acquire_fini without _init.
Signed-off-by: Daniel Vetter daniel.vetter@intel.com Cc: Rob Clark robdclark@gmail.com Cc: Sean Paul sean@poorly.run Cc: linux-arm-msm@vger.kernel.org Cc: freedreno@lists.freedesktop.org --- drivers/gpu/drm/msm/msm_gem_submit.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c b/drivers/gpu/drm/msm/msm_gem_submit.c index fb1fdd7fa3ef..385d4965a8d0 100644 --- a/drivers/gpu/drm/msm/msm_gem_submit.c +++ b/drivers/gpu/drm/msm/msm_gem_submit.c @@ -54,7 +54,6 @@ static struct msm_gem_submit *submit_create(struct drm_device *dev,
INIT_LIST_HEAD(&submit->node); INIT_LIST_HEAD(&submit->bo_list); - ww_acquire_init(&submit->ticket, &reservation_ww_class);
return submit; } @@ -390,8 +389,6 @@ static void submit_cleanup(struct msm_gem_submit *submit) list_del_init(&msm_obj->submit_entry); drm_gem_object_put(&msm_obj->base); } - - ww_acquire_fini(&submit->ticket); }
int msm_ioctl_gem_submit(struct drm_device *dev, void *data, @@ -408,6 +405,7 @@ int msm_ioctl_gem_submit(struct drm_device *dev, void *data, struct msm_ringbuffer *ring; int out_fence_fd = -1; struct pid *pid = get_pid(task_pid(current)); + bool has_ww_ticket = false; unsigned i; int ret, submitid; if (!gpu) @@ -489,6 +487,9 @@ int msm_ioctl_gem_submit(struct drm_device *dev, void *data, if (ret) goto out;
+ /* copy_*_user while holding a ww ticket upsets lockdep */ + ww_acquire_init(&submit->ticket, &reservation_ww_class); + has_ww_ticket = true; ret = submit_lock_objects(submit); if (ret) goto out; @@ -588,6 +589,8 @@ int msm_ioctl_gem_submit(struct drm_device *dev, void *data,
out: submit_cleanup(submit); + if (has_ww_ticket) + ww_acquire_fini(&submit->ticket); if (ret) msm_gem_submit_free(submit); out_unlock:
On Wed, Nov 20, 2019 at 2:56 AM Daniel Vetter daniel.vetter@ffwll.ch wrote:
For locking semantics it really doesn't matter when we grab the ticket. But for lockdep validation it does: the acquire ctx is a fake lockdep. Since other drivers might want to do a full multi-lock dance in their fault-handler, not just lock a single dma_resv. Therefore we must init the acquire_ctx only after we've done all the copy_*_user or anything else that might trigger a pagefault. For msm this means we need to move it past submit_lookup_objects.
Aside: Why is msm still using struct_mutex, it seems to be using dma_resv_lock for general buffer state protection?
v2:
- Add comment to explain why the ww ticket setup is separate (Rob)
- Fix up error handling, we need to make sure we don't call ww_acquire_fini without _init.
Signed-off-by: Daniel Vetter daniel.vetter@intel.com Cc: Rob Clark robdclark@gmail.com Cc: Sean Paul sean@poorly.run Cc: linux-arm-msm@vger.kernel.org Cc: freedreno@lists.freedesktop.org
found a few minutes to take this for a spin and seems fine.. t-b && r-b
BR, -R
drivers/gpu/drm/msm/msm_gem_submit.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c b/drivers/gpu/drm/msm/msm_gem_submit.c index fb1fdd7fa3ef..385d4965a8d0 100644 --- a/drivers/gpu/drm/msm/msm_gem_submit.c +++ b/drivers/gpu/drm/msm/msm_gem_submit.c @@ -54,7 +54,6 @@ static struct msm_gem_submit *submit_create(struct drm_device *dev,
INIT_LIST_HEAD(&submit->node); INIT_LIST_HEAD(&submit->bo_list);
ww_acquire_init(&submit->ticket, &reservation_ww_class); return submit;
} @@ -390,8 +389,6 @@ static void submit_cleanup(struct msm_gem_submit *submit) list_del_init(&msm_obj->submit_entry); drm_gem_object_put(&msm_obj->base); }
ww_acquire_fini(&submit->ticket);
}
int msm_ioctl_gem_submit(struct drm_device *dev, void *data, @@ -408,6 +405,7 @@ int msm_ioctl_gem_submit(struct drm_device *dev, void *data, struct msm_ringbuffer *ring; int out_fence_fd = -1; struct pid *pid = get_pid(task_pid(current));
bool has_ww_ticket = false; unsigned i; int ret, submitid; if (!gpu)
@@ -489,6 +487,9 @@ int msm_ioctl_gem_submit(struct drm_device *dev, void *data, if (ret) goto out;
/* copy_*_user while holding a ww ticket upsets lockdep */
ww_acquire_init(&submit->ticket, &reservation_ww_class);
has_ww_ticket = true; ret = submit_lock_objects(submit); if (ret) goto out;
@@ -588,6 +589,8 @@ int msm_ioctl_gem_submit(struct drm_device *dev, void *data,
out: submit_cleanup(submit);
if (has_ww_ticket)
ww_acquire_fini(&submit->ticket); if (ret) msm_gem_submit_free(submit);
out_unlock:
2.24.0
On Wed, Nov 20, 2019 at 04:24:48PM -0800, Rob Clark wrote:
On Wed, Nov 20, 2019 at 2:56 AM Daniel Vetter daniel.vetter@ffwll.ch wrote:
For locking semantics it really doesn't matter when we grab the ticket. But for lockdep validation it does: the acquire ctx is a fake lockdep. Since other drivers might want to do a full multi-lock dance in their fault-handler, not just lock a single dma_resv. Therefore we must init the acquire_ctx only after we've done all the copy_*_user or anything else that might trigger a pagefault. For msm this means we need to move it past submit_lookup_objects.
Aside: Why is msm still using struct_mutex, it seems to be using dma_resv_lock for general buffer state protection?
v2:
- Add comment to explain why the ww ticket setup is separate (Rob)
- Fix up error handling, we need to make sure we don't call ww_acquire_fini without _init.
Signed-off-by: Daniel Vetter daniel.vetter@intel.com Cc: Rob Clark robdclark@gmail.com Cc: Sean Paul sean@poorly.run Cc: linux-arm-msm@vger.kernel.org Cc: freedreno@lists.freedesktop.org
found a few minutes to take this for a spin and seems fine.. t-b && r-b
Thanks for taking this for a spin, entire series applied. -Daniel
BR, -R
drivers/gpu/drm/msm/msm_gem_submit.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c b/drivers/gpu/drm/msm/msm_gem_submit.c index fb1fdd7fa3ef..385d4965a8d0 100644 --- a/drivers/gpu/drm/msm/msm_gem_submit.c +++ b/drivers/gpu/drm/msm/msm_gem_submit.c @@ -54,7 +54,6 @@ static struct msm_gem_submit *submit_create(struct drm_device *dev,
INIT_LIST_HEAD(&submit->node); INIT_LIST_HEAD(&submit->bo_list);
ww_acquire_init(&submit->ticket, &reservation_ww_class); return submit;
} @@ -390,8 +389,6 @@ static void submit_cleanup(struct msm_gem_submit *submit) list_del_init(&msm_obj->submit_entry); drm_gem_object_put(&msm_obj->base); }
ww_acquire_fini(&submit->ticket);
}
int msm_ioctl_gem_submit(struct drm_device *dev, void *data, @@ -408,6 +405,7 @@ int msm_ioctl_gem_submit(struct drm_device *dev, void *data, struct msm_ringbuffer *ring; int out_fence_fd = -1; struct pid *pid = get_pid(task_pid(current));
bool has_ww_ticket = false; unsigned i; int ret, submitid; if (!gpu)
@@ -489,6 +487,9 @@ int msm_ioctl_gem_submit(struct drm_device *dev, void *data, if (ret) goto out;
/* copy_*_user while holding a ww ticket upsets lockdep */
ww_acquire_init(&submit->ticket, &reservation_ww_class);
has_ww_ticket = true; ret = submit_lock_objects(submit); if (ret) goto out;
@@ -588,6 +589,8 @@ int msm_ioctl_gem_submit(struct drm_device *dev, void *data,
out: submit_cleanup(submit);
if (has_ww_ticket)
ww_acquire_fini(&submit->ticket); if (ret) msm_gem_submit_free(submit);
out_unlock:
2.24.0
dri-devel@lists.freedesktop.org