From: Rob Clark <robdclark@chromium.org>
This doesn't remove *all* of the struct_mutex usage, but it covers the worst of it, i.e. shrinker/madvise/free/retire. The submit path still uses struct_mutex, but it still needs *something* to serialize a portion of the submit path, and lock_stat mostly just shows that the contention there is with other submits. And there are a few other bits of struct_mutex usage in less critical paths (debugfs, etc). But this seems like a reasonable step in the right direction.
Rob Clark (14):
  drm/msm: Use correct drm_gem_object_put() in fail case
  drm/msm: Drop chatty trace
  drm/msm: Move update_fences()
  drm/msm: Add priv->mm_lock to protect active/inactive lists
  drm/msm: Document and rename preempt_lock
  drm/msm: Protect ring->submits with its own lock
  drm/msm: Refcount submits
  drm/msm: Remove obj->gpu
  drm/msm: Drop struct_mutex from the retire path
  drm/msm: Drop struct_mutex in free_object() path
  drm/msm: remove msm_gem_free_work
  drm/msm: drop struct_mutex in madvise path
  drm/msm: Drop struct_mutex in shrinker path
  drm/msm: Don't implicit-sync if only a single ring
 drivers/gpu/drm/msm/adreno/a5xx_gpu.c     |  4 +-
 drivers/gpu/drm/msm/adreno/a5xx_preempt.c | 12 +--
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c     |  4 +-
 drivers/gpu/drm/msm/msm_debugfs.c         |  7 ++
 drivers/gpu/drm/msm/msm_drv.c             | 15 +---
 drivers/gpu/drm/msm/msm_drv.h             | 19 +++--
 drivers/gpu/drm/msm/msm_gem.c             | 76 ++++++------------
 drivers/gpu/drm/msm/msm_gem.h             | 53 +++++++++----
 drivers/gpu/drm/msm/msm_gem_shrinker.c    | 58 ++------------
 drivers/gpu/drm/msm/msm_gem_submit.c      | 17 ++--
 drivers/gpu/drm/msm/msm_gpu.c             | 96 ++++++++++++++--------
 drivers/gpu/drm/msm/msm_gpu.h             |  5 +-
 drivers/gpu/drm/msm/msm_ringbuffer.c      |  3 +-
 drivers/gpu/drm/msm/msm_ringbuffer.h      | 13 ++-
 14 files changed, 188 insertions(+), 194 deletions(-)
From: Rob Clark <robdclark@chromium.org>
We only want to use the _unlocked() variant in the unlocked case.
Signed-off-by: Rob Clark <robdclark@chromium.org>
---
 drivers/gpu/drm/msm/msm_gem.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c
index 14e14caf90f9..a870b3ad129d 100644
--- a/drivers/gpu/drm/msm/msm_gem.c
+++ b/drivers/gpu/drm/msm/msm_gem.c
@@ -1115,7 +1115,11 @@ static struct drm_gem_object *_msm_gem_new(struct drm_device *dev,
 	return obj;

 fail:
-	drm_gem_object_put(obj);
+	if (struct_mutex_locked) {
+		drm_gem_object_put_locked(obj);
+	} else {
+		drm_gem_object_put(obj);
+	}
 	return ERR_PTR(ret);
 }
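For context, the locked vs. unlocked choice is driven by the caller; a minimal sketch of the two call patterns, assuming the remaining _msm_gem_new() parameters are (size, flags, struct_mutex_locked):

	/* caller already holds dev->struct_mutex: */
	obj = _msm_gem_new(dev, size, flags, true);

	/* caller does not hold dev->struct_mutex: */
	obj = _msm_gem_new(dev, size, flags, false);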
From: Rob Clark <robdclark@chromium.org>
It is somewhat redundant with the gpu tracepoints, and anyway not useful enough to justify spamming the log when debug traces are enabled.
Signed-off-by: Rob Clark <robdclark@chromium.org>
---
 drivers/gpu/drm/msm/msm_gpu.c | 1 -
 1 file changed, 1 deletion(-)
diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c
index 55d16489d0f3..31fce3ac0cdc 100644
--- a/drivers/gpu/drm/msm/msm_gpu.c
+++ b/drivers/gpu/drm/msm/msm_gpu.c
@@ -535,7 +535,6 @@ static void recover_worker(struct work_struct *work)

 static void hangcheck_timer_reset(struct msm_gpu *gpu)
 {
-	DBG("%s", gpu->name);
 	mod_timer(&gpu->hangcheck_timer,
 			round_jiffies_up(jiffies + DRM_MSM_HANGCHECK_JIFFIES));
 }
On Sun, Oct 04, 2020 at 12:21:34PM -0700, Rob Clark wrote:
> From: Rob Clark <robdclark@chromium.org>
>
> It is somewhat redundant with the gpu tracepoints, and anyway not
> useful enough to justify spamming the log when debug traces are
> enabled.
Reviewed-by: Jordan Crouse <jcrouse@codeaurora.org>
From: Rob Clark <robdclark@chromium.org>
Small cleanup: update_fences() is used not only in the hangcheck path, but also in the normal retire path.
Signed-off-by: Rob Clark <robdclark@chromium.org>
---
 drivers/gpu/drm/msm/msm_gpu.c | 28 ++++++++++++++--------------
 1 file changed, 14 insertions(+), 14 deletions(-)
diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c
index 31fce3ac0cdc..ca8c95b32c8b 100644
--- a/drivers/gpu/drm/msm/msm_gpu.c
+++ b/drivers/gpu/drm/msm/msm_gpu.c
@@ -265,6 +265,20 @@ int msm_gpu_hw_init(struct msm_gpu *gpu)
 	return ret;
 }

+static void update_fences(struct msm_gpu *gpu, struct msm_ringbuffer *ring,
+		uint32_t fence)
+{
+	struct msm_gem_submit *submit;
+
+	list_for_each_entry(submit, &ring->submits, node) {
+		if (submit->seqno > fence)
+			break;
+
+		msm_update_fence(submit->ring->fctx,
+			submit->fence->seqno);
+	}
+}
+
 #ifdef CONFIG_DEV_COREDUMP
 static ssize_t msm_gpu_devcoredump_read(char *buffer, loff_t offset,
 		size_t count, void *data, size_t datalen)
@@ -411,20 +425,6 @@ static void msm_gpu_crashstate_capture(struct msm_gpu *gpu,
  * Hangcheck detection for locked gpu:
  */

-static void update_fences(struct msm_gpu *gpu, struct msm_ringbuffer *ring,
-		uint32_t fence)
-{
-	struct msm_gem_submit *submit;
-
-	list_for_each_entry(submit, &ring->submits, node) {
-		if (submit->seqno > fence)
-			break;
-
-		msm_update_fence(submit->ring->fctx,
-			submit->fence->seqno);
-	}
-}
-
 static struct msm_gem_submit *
 find_submit(struct msm_ringbuffer *ring, uint32_t fence)
 {
On Sun, Oct 04, 2020 at 12:21:35PM -0700, Rob Clark wrote:
> From: Rob Clark <robdclark@chromium.org>
>
> Small cleanup: update_fences() is used not only in the hangcheck path,
> but also in the normal retire path.
Reviewed-by: Jordan Crouse <jcrouse@codeaurora.org>
From: Rob Clark <robdclark@chromium.org>
Rather than relying on the big dev->struct_mutex hammer, introduce a more specific lock for protecting the bo lists.
Signed-off-by: Rob Clark <robdclark@chromium.org>
---
 drivers/gpu/drm/msm/msm_debugfs.c      |  7 +++++++
 drivers/gpu/drm/msm/msm_drv.c          |  1 +
 drivers/gpu/drm/msm/msm_drv.h          | 13 +++++++++++-
 drivers/gpu/drm/msm/msm_gem.c          | 28 +++++++++++++++-----------
 drivers/gpu/drm/msm/msm_gem_shrinker.c | 12 +++++++++++
 drivers/gpu/drm/msm/msm_gpu.h          |  5 ++++-
 6 files changed, 52 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_debugfs.c b/drivers/gpu/drm/msm/msm_debugfs.c
index ee2e270f464c..64afbed89821 100644
--- a/drivers/gpu/drm/msm/msm_debugfs.c
+++ b/drivers/gpu/drm/msm/msm_debugfs.c
@@ -112,6 +112,11 @@ static int msm_gem_show(struct drm_device *dev, struct seq_file *m)
 {
 	struct msm_drm_private *priv = dev->dev_private;
 	struct msm_gpu *gpu = priv->gpu;
+	int ret;
+
+	ret = mutex_lock_interruptible(&priv->mm_lock);
+	if (ret)
+		return ret;

 	if (gpu) {
 		seq_printf(m, "Active Objects (%s):\n", gpu->name);
@@ -121,6 +126,8 @@ static int msm_gem_show(struct drm_device *dev, struct seq_file *m)
 	seq_printf(m, "Inactive Objects:\n");
 	msm_gem_describe_objects(&priv->inactive_list, m);

+	mutex_unlock(&priv->mm_lock);
+
 	return 0;
 }

diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
index 49685571dc0e..dc6efc089285 100644
--- a/drivers/gpu/drm/msm/msm_drv.c
+++ b/drivers/gpu/drm/msm/msm_drv.c
@@ -441,6 +441,7 @@ static int msm_drm_init(struct device *dev, struct drm_driver *drv)
 	init_llist_head(&priv->free_list);

 	INIT_LIST_HEAD(&priv->inactive_list);
+	mutex_init(&priv->mm_lock);

 	drm_mode_config_init(ddev);

diff --git a/drivers/gpu/drm/msm/msm_drv.h b/drivers/gpu/drm/msm/msm_drv.h
index b9dd8f8f4887..50978e5db376 100644
--- a/drivers/gpu/drm/msm/msm_drv.h
+++ b/drivers/gpu/drm/msm/msm_drv.h
@@ -174,8 +174,19 @@ struct msm_drm_private {
 	struct msm_rd_state *hangrd;   /* debugfs to dump hanging submits */
 	struct msm_perf_state *perf;

-	/* list of GEM objects: */
+	/*
+	 * List of inactive GEM objects.  Every bo is either in the inactive_list
+	 * or gpu->active_list (for the gpu it is active on[1])
+	 *
+	 * These lists are protected by mm_lock.  If struct_mutex is involved, it
+	 * should be acquired prior to mm_lock.  One should *not* hold mm_lock in
+	 * get_pages()/vmap()/etc paths, as they can trigger the shrinker.
+	 *
+	 * [1] if someone ever added support for the old 2d cores, there could be
+	 * more than one gpu object
+	 */
 	struct list_head inactive_list;
+	struct mutex mm_lock;

 	/* worker for delayed free of objects: */
 	struct work_struct free_work;

diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c
index a870b3ad129d..b04ed8b52f9d 100644
--- a/drivers/gpu/drm/msm/msm_gem.c
+++ b/drivers/gpu/drm/msm/msm_gem.c
@@ -746,13 +746,17 @@ int msm_gem_sync_object(struct drm_gem_object *obj,
 void msm_gem_active_get(struct drm_gem_object *obj, struct msm_gpu *gpu)
 {
 	struct msm_gem_object *msm_obj = to_msm_bo(obj);
-	WARN_ON(!mutex_is_locked(&obj->dev->struct_mutex));
+	struct msm_drm_private *priv = obj->dev->dev_private;
+
+	might_sleep();
 	WARN_ON(msm_obj->madv != MSM_MADV_WILLNEED);

 	if (!atomic_fetch_inc(&msm_obj->active_count)) {
+		mutex_lock(&priv->mm_lock);
 		msm_obj->gpu = gpu;
 		list_del_init(&msm_obj->mm_list);
 		list_add_tail(&msm_obj->mm_list, &gpu->active_list);
+		mutex_unlock(&priv->mm_lock);
 	}
 }

@@ -761,12 +765,14 @@ void msm_gem_active_put(struct drm_gem_object *obj)
 	struct msm_gem_object *msm_obj = to_msm_bo(obj);
 	struct msm_drm_private *priv = obj->dev->dev_private;

-	WARN_ON(!mutex_is_locked(&obj->dev->struct_mutex));
+	might_sleep();

 	if (!atomic_dec_return(&msm_obj->active_count)) {
+		mutex_lock(&priv->mm_lock);
 		msm_obj->gpu = NULL;
 		list_del_init(&msm_obj->mm_list);
 		list_add_tail(&msm_obj->mm_list, &priv->inactive_list);
+		mutex_unlock(&priv->mm_lock);
 	}
 }

@@ -921,13 +927,16 @@ static void free_object(struct msm_gem_object *msm_obj)
 {
 	struct drm_gem_object *obj = &msm_obj->base;
 	struct drm_device *dev = obj->dev;
+	struct msm_drm_private *priv = dev->dev_private;

 	WARN_ON(!mutex_is_locked(&dev->struct_mutex));

 	/* object should not be on active list: */
 	WARN_ON(is_active(msm_obj));

+	mutex_lock(&priv->mm_lock);
 	list_del(&msm_obj->mm_list);
+	mutex_unlock(&priv->mm_lock);

 	mutex_lock(&msm_obj->lock);

@@ -1103,14 +1112,9 @@ static struct drm_gem_object *_msm_gem_new(struct drm_device *dev,
 		mapping_set_gfp_mask(obj->filp->f_mapping, GFP_HIGHUSER);
 	}

-	if (struct_mutex_locked) {
-		WARN_ON(!mutex_is_locked(&dev->struct_mutex));
-		list_add_tail(&msm_obj->mm_list, &priv->inactive_list);
-	} else {
-		mutex_lock(&dev->struct_mutex);
-		list_add_tail(&msm_obj->mm_list, &priv->inactive_list);
-		mutex_unlock(&dev->struct_mutex);
-	}
+	mutex_lock(&priv->mm_lock);
+	list_add_tail(&msm_obj->mm_list, &priv->inactive_list);
+	mutex_unlock(&priv->mm_lock);

 	return obj;

@@ -1178,9 +1182,9 @@ struct drm_gem_object *msm_gem_import(struct drm_device *dev,

 	mutex_unlock(&msm_obj->lock);

-	mutex_lock(&dev->struct_mutex);
+	mutex_lock(&priv->mm_lock);
 	list_add_tail(&msm_obj->mm_list, &priv->inactive_list);
-	mutex_unlock(&dev->struct_mutex);
+	mutex_unlock(&priv->mm_lock);

 	return obj;

diff --git a/drivers/gpu/drm/msm/msm_gem_shrinker.c b/drivers/gpu/drm/msm/msm_gem_shrinker.c
index 482576d7a39a..c41b84a3a484 100644
--- a/drivers/gpu/drm/msm/msm_gem_shrinker.c
+++ b/drivers/gpu/drm/msm/msm_gem_shrinker.c
@@ -51,11 +51,15 @@ msm_gem_shrinker_count(struct shrinker *shrinker, struct shrink_control *sc)
 	if (!msm_gem_shrinker_lock(dev, &unlock))
 		return 0;

+	mutex_lock(&priv->mm_lock);
+
 	list_for_each_entry(msm_obj, &priv->inactive_list, mm_list) {
 		if (is_purgeable(msm_obj))
 			count += msm_obj->base.size >> PAGE_SHIFT;
 	}

+	mutex_unlock(&priv->mm_lock);
+
 	if (unlock)
 		mutex_unlock(&dev->struct_mutex);

@@ -75,6 +79,8 @@ msm_gem_shrinker_scan(struct shrinker *shrinker, struct shrink_control *sc)
 	if (!msm_gem_shrinker_lock(dev, &unlock))
 		return SHRINK_STOP;

+	mutex_lock(&priv->mm_lock);
+
 	list_for_each_entry(msm_obj, &priv->inactive_list, mm_list) {
 		if (freed >= sc->nr_to_scan)
 			break;
@@ -84,6 +90,8 @@ msm_gem_shrinker_scan(struct shrinker *shrinker, struct shrink_control *sc)
 		}
 	}

+	mutex_unlock(&priv->mm_lock);
+
 	if (unlock)
 		mutex_unlock(&dev->struct_mutex);

@@ -106,6 +114,8 @@ msm_gem_shrinker_vmap(struct notifier_block *nb, unsigned long event, void *ptr)
 	if (!msm_gem_shrinker_lock(dev, &unlock))
 		return NOTIFY_DONE;

+	mutex_lock(&priv->mm_lock);
+
 	list_for_each_entry(msm_obj, &priv->inactive_list, mm_list) {
 		if (is_vunmapable(msm_obj)) {
 			msm_gem_vunmap(&msm_obj->base, OBJ_LOCK_SHRINKER);
@@ -118,6 +128,8 @@ msm_gem_shrinker_vmap(struct notifier_block *nb, unsigned long event, void *ptr)
 		}
 	}

+	mutex_unlock(&priv->mm_lock);
+
 	if (unlock)
 		mutex_unlock(&dev->struct_mutex);

diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h
index 6c9e1fdc1a76..1806e87600c0 100644
--- a/drivers/gpu/drm/msm/msm_gpu.h
+++ b/drivers/gpu/drm/msm/msm_gpu.h
@@ -94,7 +94,10 @@ struct msm_gpu {
 	struct msm_ringbuffer *rb[MSM_GPU_MAX_RINGS];
 	int nr_rings;

-	/* list of GEM active objects: */
+	/*
+	 * List of GEM active objects on this gpu.  Protected by
+	 * msm_drm_private::mm_lock
+	 */
 	struct list_head active_list;

 	/* does gpu need hw_init? */
On Sun, Oct 4, 2020 at 9:21 PM Rob Clark <robdclark@gmail.com> wrote:
> From: Rob Clark <robdclark@chromium.org>
>
> Rather than relying on the big dev->struct_mutex hammer, introduce a
> more specific lock for protecting the bo lists.
>
> Signed-off-by: Rob Clark <robdclark@chromium.org>

[...]

> @@ -441,6 +441,7 @@ static int msm_drm_init(struct device *dev, struct drm_driver *drv)
>  	init_llist_head(&priv->free_list);
>
>  	INIT_LIST_HEAD(&priv->inactive_list);
> +	mutex_init(&priv->mm_lock);
I highly recommend you drop a
	fs_reclaim_acquire(GFP_KERNEL);
	might_lock(&priv->mm_lock);
	fs_reclaim_release(GFP_KERNEL);
in here to teach lockdep about your ordering against the shrinker. Gives you full testing every boot, even if your shrinker never gets called.
-Daniel
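A minimal sketch of how that priming could look in msm_drm_init(), right after the mutex_init() quoted above (the IS_ENABLED() guard and exact placement are assumptions, not something spelled out in the mail):

	mutex_init(&priv->mm_lock);

	if (IS_ENABLED(CONFIG_LOCKDEP)) {
		/* Pretend we are under reclaim and take mm_lock once, so
		 * lockdep records the reclaim -> mm_lock ordering at init
		 * time, even if the shrinker never actually runs.
		 */
		fs_reclaim_acquire(GFP_KERNEL);
		might_lock(&priv->mm_lock);
		fs_reclaim_release(GFP_KERNEL);
	}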
On Sun, Oct 4, 2020 at 3:15 PM Daniel Vetter <daniel@ffwll.ch> wrote:
> I highly recommend you drop a
>
>	fs_reclaim_acquire(GFP_KERNEL);
>	might_lock(&priv->mm_lock);
>	fs_reclaim_release(GFP_KERNEL);
>
> in here to teach lockdep about your ordering against the shrinker.
> Gives you full testing every boot, even if your shrinker never gets
> called.
Good idea..
(tbf, I have tested this with android+lockdep, which is a pretty great shrinker exercise.. but immediate notification of future problems is a good plan)
BR,
-R
On Sun, Oct 04, 2020 at 12:21:36PM -0700, Rob Clark wrote:
> From: Rob Clark <robdclark@chromium.org>
>
> Rather than relying on the big dev->struct_mutex hammer, introduce a
> more specific lock for protecting the bo lists.
Most excellent.
Reviewed-by: Jordan Crouse <jcrouse@codeaurora.org>
From: Rob Clark <robdclark@chromium.org>
Before adding another lock, give ring->lock a more descriptive name.
Signed-off-by: Rob Clark <robdclark@chromium.org>
---
 drivers/gpu/drm/msm/adreno/a5xx_gpu.c     |  4 ++--
 drivers/gpu/drm/msm/adreno/a5xx_preempt.c | 12 ++++++------
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c     |  4 ++--
 drivers/gpu/drm/msm/msm_ringbuffer.c      |  2 +-
 drivers/gpu/drm/msm/msm_ringbuffer.h      |  7 ++++++-
 5 files changed, 17 insertions(+), 12 deletions(-)
diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
index c941c8138f25..543437a2186e 100644
--- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
@@ -36,7 +36,7 @@ void a5xx_flush(struct msm_gpu *gpu, struct msm_ringbuffer *ring,
 		OUT_RING(ring, upper_32_bits(shadowptr(a5xx_gpu, ring)));
 	}

-	spin_lock_irqsave(&ring->lock, flags);
+	spin_lock_irqsave(&ring->preempt_lock, flags);

 	/* Copy the shadow to the actual register */
 	ring->cur = ring->next;
@@ -44,7 +44,7 @@ void a5xx_flush(struct msm_gpu *gpu, struct msm_ringbuffer *ring,
 	/* Make sure to wrap wptr if we need to */
 	wptr = get_wptr(ring);

-	spin_unlock_irqrestore(&ring->lock, flags);
+	spin_unlock_irqrestore(&ring->preempt_lock, flags);

 	/* Make sure everything is posted before making a decision */
 	mb();

diff --git a/drivers/gpu/drm/msm/adreno/a5xx_preempt.c b/drivers/gpu/drm/msm/adreno/a5xx_preempt.c
index 7e04509c4e1f..183de1139eeb 100644
--- a/drivers/gpu/drm/msm/adreno/a5xx_preempt.c
+++ b/drivers/gpu/drm/msm/adreno/a5xx_preempt.c
@@ -45,9 +45,9 @@ static inline void update_wptr(struct msm_gpu *gpu, struct msm_ringbuffer *ring)
 	if (!ring)
 		return;

-	spin_lock_irqsave(&ring->lock, flags);
+	spin_lock_irqsave(&ring->preempt_lock, flags);
 	wptr = get_wptr(ring);
-	spin_unlock_irqrestore(&ring->lock, flags);
+	spin_unlock_irqrestore(&ring->preempt_lock, flags);

 	gpu_write(gpu, REG_A5XX_CP_RB_WPTR, wptr);
 }
@@ -62,9 +62,9 @@ static struct msm_ringbuffer *get_next_ring(struct msm_gpu *gpu)
 		bool empty;
 		struct msm_ringbuffer *ring = gpu->rb[i];

-		spin_lock_irqsave(&ring->lock, flags);
+		spin_lock_irqsave(&ring->preempt_lock, flags);
 		empty = (get_wptr(ring) == ring->memptrs->rptr);
-		spin_unlock_irqrestore(&ring->lock, flags);
+		spin_unlock_irqrestore(&ring->preempt_lock, flags);

 		if (!empty)
 			return ring;
@@ -132,9 +132,9 @@ void a5xx_preempt_trigger(struct msm_gpu *gpu)
 	}

 	/* Make sure the wptr doesn't update while we're in motion */
-	spin_lock_irqsave(&ring->lock, flags);
+	spin_lock_irqsave(&ring->preempt_lock, flags);
 	a5xx_gpu->preempt[ring->id]->wptr = get_wptr(ring);
-	spin_unlock_irqrestore(&ring->lock, flags);
+	spin_unlock_irqrestore(&ring->preempt_lock, flags);

 	/* Set the address of the incoming preemption record */
 	gpu_write64(gpu, REG_A5XX_CP_CONTEXT_SWITCH_RESTORE_ADDR_LO,

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index 8915882e4444..fc85f008d69d 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -65,7 +65,7 @@ static void a6xx_flush(struct msm_gpu *gpu, struct msm_ringbuffer *ring)
 		OUT_RING(ring, upper_32_bits(shadowptr(a6xx_gpu, ring)));
 	}

-	spin_lock_irqsave(&ring->lock, flags);
+	spin_lock_irqsave(&ring->preempt_lock, flags);

 	/* Copy the shadow to the actual register */
 	ring->cur = ring->next;
@@ -73,7 +73,7 @@ static void a6xx_flush(struct msm_gpu *gpu, struct msm_ringbuffer *ring)
 	/* Make sure to wrap wptr if we need to */
 	wptr = get_wptr(ring);

-	spin_unlock_irqrestore(&ring->lock, flags);
+	spin_unlock_irqrestore(&ring->preempt_lock, flags);

 	/* Make sure everything is posted before making a decision */
 	mb();

diff --git a/drivers/gpu/drm/msm/msm_ringbuffer.c b/drivers/gpu/drm/msm/msm_ringbuffer.c
index 935bf9b1d941..1b6958e908dc 100644
--- a/drivers/gpu/drm/msm/msm_ringbuffer.c
+++ b/drivers/gpu/drm/msm/msm_ringbuffer.c
@@ -46,7 +46,7 @@ struct msm_ringbuffer *msm_ringbuffer_new(struct msm_gpu *gpu, int id,
 	ring->memptrs_iova = memptrs_iova;

 	INIT_LIST_HEAD(&ring->submits);
-	spin_lock_init(&ring->lock);
+	spin_lock_init(&ring->preempt_lock);

 	snprintf(name, sizeof(name), "gpu-ring-%d", ring->id);

diff --git a/drivers/gpu/drm/msm/msm_ringbuffer.h b/drivers/gpu/drm/msm/msm_ringbuffer.h
index 0987d6bf848c..4956d1bc5d0e 100644
--- a/drivers/gpu/drm/msm/msm_ringbuffer.h
+++ b/drivers/gpu/drm/msm/msm_ringbuffer.h
@@ -46,7 +46,12 @@ struct msm_ringbuffer {
 	struct msm_rbmemptrs *memptrs;
 	uint64_t memptrs_iova;
 	struct msm_fence_context *fctx;
-	spinlock_t lock;
+
+	/*
+	 * preempt_lock protects preemption and serializes wptr updates against
+	 * preemption.  Can be acquired from irq context.
+	 */
+	spinlock_t preempt_lock;
 };

 struct msm_ringbuffer *msm_ringbuffer_new(struct msm_gpu *gpu, int id,
On Sun, Oct 04, 2020 at 12:21:37PM -0700, Rob Clark wrote:
> From: Rob Clark <robdclark@chromium.org>
>
> Before adding another lock, give ring->lock a more descriptive name.
Reviewed-by: Jordan Crouse <jcrouse@codeaurora.org>
From: Rob Clark <robdclark@chromium.org>
One less place to rely on dev->struct_mutex.
Signed-off-by: Rob Clark <robdclark@chromium.org>
---
 drivers/gpu/drm/msm/msm_gem_submit.c |  2 ++
 drivers/gpu/drm/msm/msm_gpu.c        | 37 ++++++++++++++++++++++------
 drivers/gpu/drm/msm/msm_ringbuffer.c |  1 +
 drivers/gpu/drm/msm/msm_ringbuffer.h |  6 +++++
 4 files changed, 39 insertions(+), 7 deletions(-)
diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c b/drivers/gpu/drm/msm/msm_gem_submit.c
index aa5c60a7132d..e1d1f005b3d4 100644
--- a/drivers/gpu/drm/msm/msm_gem_submit.c
+++ b/drivers/gpu/drm/msm/msm_gem_submit.c
@@ -63,7 +63,9 @@ static struct msm_gem_submit *submit_create(struct drm_device *dev,
 void msm_gem_submit_free(struct msm_gem_submit *submit)
 {
 	dma_fence_put(submit->fence);
+	spin_lock(&submit->ring->submit_lock);
 	list_del(&submit->node);
+	spin_unlock(&submit->ring->submit_lock);
 	put_pid(submit->pid);
 	msm_submitqueue_put(submit->queue);

diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c
index ca8c95b32c8b..8d1e254f964a 100644
--- a/drivers/gpu/drm/msm/msm_gpu.c
+++ b/drivers/gpu/drm/msm/msm_gpu.c
@@ -270,6 +270,7 @@ static void update_fences(struct msm_gpu *gpu, struct msm_ringbuffer *ring,
 {
 	struct msm_gem_submit *submit;

+	spin_lock(&ring->submit_lock);
 	list_for_each_entry(submit, &ring->submits, node) {
 		if (submit->seqno > fence)
 			break;
@@ -277,6 +278,7 @@ static void update_fences(struct msm_gpu *gpu, struct msm_ringbuffer *ring,
 		msm_update_fence(submit->ring->fctx,
 			submit->fence->seqno);
 	}
+	spin_unlock(&ring->submit_lock);
 }

 #ifdef CONFIG_DEV_COREDUMP
@@ -430,11 +432,14 @@ find_submit(struct msm_ringbuffer *ring, uint32_t fence)
 {
 	struct msm_gem_submit *submit;

-	WARN_ON(!mutex_is_locked(&ring->gpu->dev->struct_mutex));
-
-	list_for_each_entry(submit, &ring->submits, node)
-		if (submit->seqno == fence)
+	spin_lock(&ring->submit_lock);
+	list_for_each_entry(submit, &ring->submits, node) {
+		if (submit->seqno == fence) {
+			spin_unlock(&ring->submit_lock);
 			return submit;
+		}
+	}
+	spin_unlock(&ring->submit_lock);

 	return NULL;
 }
@@ -523,8 +528,10 @@ static void recover_worker(struct work_struct *work)
 		for (i = 0; i < gpu->nr_rings; i++) {
 			struct msm_ringbuffer *ring = gpu->rb[i];

+			spin_lock(&ring->submit_lock);
 			list_for_each_entry(submit, &ring->submits, node)
 				gpu->funcs->submit(gpu, submit);
+			spin_unlock(&ring->submit_lock);
 		}
 	}

@@ -711,7 +718,6 @@ static void retire_submit(struct msm_gpu *gpu, struct msm_ringbuffer *ring,
 static void retire_submits(struct msm_gpu *gpu)
 {
 	struct drm_device *dev = gpu->dev;
-	struct msm_gem_submit *submit, *tmp;
 	int i;

 	WARN_ON(!mutex_is_locked(&dev->struct_mutex));
@@ -720,9 +726,24 @@ static void retire_submits(struct msm_gpu *gpu)
 	for (i = 0; i < gpu->nr_rings; i++) {
 		struct msm_ringbuffer *ring = gpu->rb[i];

-		list_for_each_entry_safe(submit, tmp, &ring->submits, node) {
-			if (dma_fence_is_signaled(submit->fence))
+		while (true) {
+			struct msm_gem_submit *submit = NULL;
+
+			spin_lock(&ring->submit_lock);
+			submit = list_first_entry_or_null(&ring->submits,
+					struct msm_gem_submit, node);
+			spin_unlock(&ring->submit_lock);
+
+			/*
+			 * If no submit, we are done.  If submit->fence hasn't
+			 * been signalled, then later submits are not signalled
+			 * either, so we are also done.
+			 */
+			if (submit && dma_fence_is_signaled(submit->fence)) {
 				retire_submit(gpu, ring, submit);
+			} else {
+				break;
+			}
 		}
 	}
 }
@@ -765,7 +786,9 @@ void msm_gpu_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit)

 	submit->seqno = ++ring->seqno;

+	spin_lock(&ring->submit_lock);
 	list_add_tail(&submit->node, &ring->submits);
+	spin_unlock(&ring->submit_lock);

 	msm_rd_dump_submit(priv->rd, submit, NULL);

diff --git a/drivers/gpu/drm/msm/msm_ringbuffer.c b/drivers/gpu/drm/msm/msm_ringbuffer.c
index 1b6958e908dc..4d2a2a4abef8 100644
--- a/drivers/gpu/drm/msm/msm_ringbuffer.c
+++ b/drivers/gpu/drm/msm/msm_ringbuffer.c
@@ -46,6 +46,7 @@ struct msm_ringbuffer *msm_ringbuffer_new(struct msm_gpu *gpu, int id,
 	ring->memptrs_iova = memptrs_iova;

 	INIT_LIST_HEAD(&ring->submits);
+	spin_lock_init(&ring->submit_lock);
 	spin_lock_init(&ring->preempt_lock);

 	snprintf(name, sizeof(name), "gpu-ring-%d", ring->id);

diff --git a/drivers/gpu/drm/msm/msm_ringbuffer.h b/drivers/gpu/drm/msm/msm_ringbuffer.h
index 4956d1bc5d0e..fe55d4a1aa16 100644
--- a/drivers/gpu/drm/msm/msm_ringbuffer.h
+++ b/drivers/gpu/drm/msm/msm_ringbuffer.h
@@ -39,7 +39,13 @@ struct msm_ringbuffer {
 	int id;
 	struct drm_gem_object *bo;
 	uint32_t *start, *end, *cur, *next;
+
+	/*
+	 * List of in-flight submits on this ring.  Protected by submit_lock.
+	 */
 	struct list_head submits;
+	spinlock_t submit_lock;
+
 	uint64_t iova;
 	uint32_t seqno;
 	uint32_t hangcheck_fence;
On Sun, Oct 04, 2020 at 12:21:38PM -0700, Rob Clark wrote:
> From: Rob Clark <robdclark@chromium.org>
>
> One less place to rely on dev->struct_mutex.
Reviewed-by: Jordan Crouse <jcrouse@codeaurora.org>
From: Rob Clark <robdclark@chromium.org>
Before we remove dev->struct_mutex from the retire path, we have to deal with the situation of a submit retiring before the submit ioctl returns.
To deal with this, ring->submits will hold a reference to the submit, which is dropped when the submit is retired. And the submit ioctl path holds its own ref, which it drops when it is done with the submit.
Also, add to submit list *after* getting/pinning bo's, to prevent badness in case the completed fence is corrupted, and retire_worker mistakenly believes the submit is done too early.
Signed-off-by: Rob Clark <robdclark@chromium.org>
---
 drivers/gpu/drm/msm/msm_drv.h        |  1 -
 drivers/gpu/drm/msm/msm_gem.h        | 13 +++++++++++++
 drivers/gpu/drm/msm/msm_gem_submit.c | 12 ++++++------
 drivers/gpu/drm/msm/msm_gpu.c        | 21 ++++++++++++++++-----
 4 files changed, 35 insertions(+), 12 deletions(-)
diff --git a/drivers/gpu/drm/msm/msm_drv.h b/drivers/gpu/drm/msm/msm_drv.h
index 50978e5db376..535f9e718e2d 100644
--- a/drivers/gpu/drm/msm/msm_drv.h
+++ b/drivers/gpu/drm/msm/msm_drv.h
@@ -277,7 +277,6 @@ void msm_unregister_mmu(struct drm_device *dev, struct msm_mmu *mmu);

 bool msm_use_mmu(struct drm_device *dev);

-void msm_gem_submit_free(struct msm_gem_submit *submit);
 int msm_ioctl_gem_submit(struct drm_device *dev, void *data,
 		struct drm_file *file);

diff --git a/drivers/gpu/drm/msm/msm_gem.h b/drivers/gpu/drm/msm/msm_gem.h
index a1bf741b9b89..e05b1530aca6 100644
--- a/drivers/gpu/drm/msm/msm_gem.h
+++ b/drivers/gpu/drm/msm/msm_gem.h
@@ -136,6 +136,7 @@ void msm_gem_free_work(struct work_struct *work);
  * lasts for the duration of the submit-ioctl.
  */
 struct msm_gem_submit {
+	struct kref ref;
 	struct drm_device *dev;
 	struct msm_gpu *gpu;
 	struct msm_gem_address_space *aspace;
@@ -169,6 +170,18 @@ struct msm_gem_submit {
 	} bos[];
 };

+void __msm_gem_submit_destroy(struct kref *kref);
+
+static inline void msm_gem_submit_get(struct msm_gem_submit *submit)
+{
+	kref_get(&submit->ref);
+}
+
+static inline void msm_gem_submit_put(struct msm_gem_submit *submit)
+{
+	kref_put(&submit->ref, __msm_gem_submit_destroy);
+}
+
 /* helper to determine of a buffer in submit should be dumped, used for both
  * devcoredump and debugfs cmdstream dumping:
  */

diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c b/drivers/gpu/drm/msm/msm_gem_submit.c
index e1d1f005b3d4..7d653bdc92dc 100644
--- a/drivers/gpu/drm/msm/msm_gem_submit.c
+++ b/drivers/gpu/drm/msm/msm_gem_submit.c
@@ -42,6 +42,7 @@ static struct msm_gem_submit *submit_create(struct drm_device *dev,
 	if (!submit)
 		return NULL;

+	kref_init(&submit->ref);
 	submit->dev = dev;
 	submit->aspace = queue->ctx->aspace;
 	submit->gpu = gpu;
@@ -60,12 +61,12 @@ static struct msm_gem_submit *submit_create(struct drm_device *dev,
 	return submit;
 }

-void msm_gem_submit_free(struct msm_gem_submit *submit)
+void __msm_gem_submit_destroy(struct kref *kref)
 {
+	struct msm_gem_submit *submit =
+			container_of(kref, struct msm_gem_submit, ref);
+
 	dma_fence_put(submit->fence);
-	spin_lock(&submit->ring->submit_lock);
-	list_del(&submit->node);
-	spin_unlock(&submit->ring->submit_lock);
 	put_pid(submit->pid);
 	msm_submitqueue_put(submit->queue);

@@ -805,8 +806,7 @@ int msm_ioctl_gem_submit(struct drm_device *dev, void *data,
 	submit_cleanup(submit);
 	if (has_ww_ticket)
 		ww_acquire_fini(&submit->ticket);
-	if (ret)
-		msm_gem_submit_free(submit);
+	msm_gem_submit_put(submit);
 out_unlock:
 	if (ret && (out_fence_fd >= 0))
 		put_unused_fd(out_fence_fd);

diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c
index 8d1e254f964a..fd3fc6f36ab1 100644
--- a/drivers/gpu/drm/msm/msm_gpu.c
+++ b/drivers/gpu/drm/msm/msm_gpu.c
@@ -712,7 +712,12 @@ static void retire_submit(struct msm_gpu *gpu, struct msm_ringbuffer *ring,

 	pm_runtime_mark_last_busy(&gpu->pdev->dev);
 	pm_runtime_put_autosuspend(&gpu->pdev->dev);
-	msm_gem_submit_free(submit);
+
+	spin_lock(&ring->submit_lock);
+	list_del(&submit->node);
+	spin_unlock(&ring->submit_lock);
+
+	msm_gem_submit_put(submit);
 }

 static void retire_submits(struct msm_gpu *gpu)
@@ -786,10 +791,6 @@ void msm_gpu_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit)

 	submit->seqno = ++ring->seqno;

-	spin_lock(&ring->submit_lock);
-	list_add_tail(&submit->node, &ring->submits);
-	spin_unlock(&ring->submit_lock);
-
 	msm_rd_dump_submit(priv->rd, submit, NULL);

 	update_sw_cntrs(gpu);
@@ -816,6 +817,16 @@ void msm_gpu_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit)
 		msm_gem_active_get(drm_obj, gpu);
 	}

+	/*
+	 * ring->submits holds a ref to the submit, to deal with the case
+	 * that a submit completes before msm_ioctl_gem_submit() returns.
+	 */
+	msm_gem_submit_get(submit);
+
+	spin_lock(&ring->submit_lock);
+	list_add_tail(&submit->node, &ring->submits);
+	spin_unlock(&ring->submit_lock);
+
 	gpu->funcs->submit(gpu, submit);

 	priv->lastctx = submit->queue->ctx;
On Sun, Oct 04, 2020 at 12:21:39PM -0700, Rob Clark wrote:
> From: Rob Clark <robdclark@chromium.org>
>
> Before we remove dev->struct_mutex from the retire path, we have to deal
> with the situation of a submit retiring before the submit ioctl returns.
>
> To deal with this, ring->submits will hold a reference to the submit,
> which is dropped when the submit is retired.  And the submit ioctl path
> holds its own ref, which it drops when it is done with the submit.
>
> Also, add to submit list *after* getting/pinning bo's, to prevent
> badness in case the completed fence is corrupted, and retire_worker
> mistakenly believes the submit is done too early.
>
> Signed-off-by: Rob Clark <robdclark@chromium.org>
Why not embed the dma_fence instead of a pointer, and use that refcount? i915 does that, and imo it kinda makes sense instead of more refcounted things. But might not make sense for msm.
-Daniel
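For readers unfamiliar with the i915 approach Daniel refers to: the idea is roughly the sketch below, where the fence is embedded in the submit and the fence's own refcount controls the submit's lifetime. All names here are hypothetical; this is not what the msm patch does:

	/* hypothetical embedded-fence variant: */
	struct msm_gem_submit {
		struct dma_fence hw_fence;	/* embedded, not a pointer */
		/* ... */
	};

	static void submit_fence_release(struct dma_fence *f)
	{
		struct msm_gem_submit *submit =
				container_of(f, struct msm_gem_submit, hw_fence);

		/* the last dma_fence_put() on hw_fence tears down the submit */
		kfree(submit);
	}

	static const struct dma_fence_ops submit_fence_ops = {
		/* the usual required ops; stub definitions omitted */
		.get_driver_name = submit_fence_get_driver_name,
		.get_timeline_name = submit_fence_get_timeline_name,
		.release = submit_fence_release,
	};

ring->submits and the ioctl path would then take dma_fence_get(&submit->hw_fence) rather than a separate kref.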
On Mon, Oct 5, 2020 at 6:56 AM Daniel Vetter daniel@ffwll.ch wrote:
On Sun, Oct 04, 2020 at 12:21:39PM -0700, Rob Clark wrote:
[...]
Why not embed the dma_fence instead of pointer and use that refcount? i915 does that, and imo kinda makes sense instead of more refcounted things. But might not make sense for msm.
I guess that might work.. the one thing I'd be concerned about is that the submit (indirectly) holds reference to the file ctx, so userspace keeping around a fence fd by mistake could keep a set of pgtables live unnecessarily.. I suppose we could re-work where we drop that reference.
six of one, half-dozen of the other, I guess
BR, -R
On Sun, Oct 04, 2020 at 12:21:39PM -0700, Rob Clark wrote:
[...]
Reviewed-by: Jordan Crouse jcrouse@codeaurora.org
Signed-off-by: Rob Clark robdclark@chromium.org
From: Rob Clark robdclark@chromium.org
It cannot be atomically updated with obj->active_count, and the only purpose is a useless WARN_ON() (which becomes a buggy WARN_ON() once retire_submits() is not serialized with incoming submits via struct_mutex)
Signed-off-by: Rob Clark robdclark@chromium.org
---
 drivers/gpu/drm/msm/msm_gem.c | 2 --
 drivers/gpu/drm/msm/msm_gem.h | 1 -
 drivers/gpu/drm/msm/msm_gpu.c | 5 -----
 3 files changed, 8 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c
index b04ed8b52f9d..c52a3969e60b 100644
--- a/drivers/gpu/drm/msm/msm_gem.c
+++ b/drivers/gpu/drm/msm/msm_gem.c
@@ -753,7 +753,6 @@ void msm_gem_active_get(struct drm_gem_object *obj, struct msm_gpu *gpu)

 	if (!atomic_fetch_inc(&msm_obj->active_count)) {
 		mutex_lock(&priv->mm_lock);
-		msm_obj->gpu = gpu;
 		list_del_init(&msm_obj->mm_list);
 		list_add_tail(&msm_obj->mm_list, &gpu->active_list);
 		mutex_unlock(&priv->mm_lock);
@@ -769,7 +768,6 @@ void msm_gem_active_put(struct drm_gem_object *obj)

 	if (!atomic_dec_return(&msm_obj->active_count)) {
 		mutex_lock(&priv->mm_lock);
-		msm_obj->gpu = NULL;
 		list_del_init(&msm_obj->mm_list);
 		list_add_tail(&msm_obj->mm_list, &priv->inactive_list);
 		mutex_unlock(&priv->mm_lock);
diff --git a/drivers/gpu/drm/msm/msm_gem.h b/drivers/gpu/drm/msm/msm_gem.h
index e05b1530aca6..61147bd96b06 100644
--- a/drivers/gpu/drm/msm/msm_gem.h
+++ b/drivers/gpu/drm/msm/msm_gem.h
@@ -64,7 +64,6 @@ struct msm_gem_object {
 	 *
 	 */
 	struct list_head mm_list;
-	struct msm_gpu *gpu;		/* non-null if active */

 	/* Transiently in the process of submit ioctl, objects associated
 	 * with the submit are on submit->bo_list.. this only lasts for
diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c
index fd3fc6f36ab1..c9ff19a75169 100644
--- a/drivers/gpu/drm/msm/msm_gpu.c
+++ b/drivers/gpu/drm/msm/msm_gpu.c
@@ -800,11 +800,6 @@ void msm_gpu_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit)
 		struct drm_gem_object *drm_obj = &msm_obj->base;
 		uint64_t iova;

-		/* can't happen yet.. but when we add 2d support we'll have
-		 * to deal w/ cross-ring synchronization:
-		 */
-		WARN_ON(is_active(msm_obj) && (msm_obj->gpu != gpu));
-
 		/* submit takes a reference to the bo and iova until retired: */
 		drm_gem_object_get(&msm_obj->base);
 		msm_gem_get_and_pin_iova(&msm_obj->base, submit->aspace, &iova);
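To make the commit message concrete, the mixed state a reader could observe (an illustrative interleaving, not part of the patch):

/*
 * obj->active_count is updated with atomics, while obj->gpu is only
 * written under priv->mm_lock, so the pair can be observed in a mixed
 * state once submit/retire are no longer both serialized on
 * struct_mutex:
 *
 *   CPU0: msm_gpu_submit()              CPU1: msm_gem_active_get()
 *   ------------------------            ---------------------------
 *                                       atomic_fetch_inc(&active_count)
 *   is_active(msm_obj) -> true
 *   reads msm_obj->gpu -> still NULL
 *     => WARN_ON fires spuriously
 *                                       mutex_lock(&priv->mm_lock);
 *                                       msm_obj->gpu = gpu;
 *                                       mutex_unlock(&priv->mm_lock);
 */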
On Sun, Oct 04, 2020 at 12:21:40PM -0700, Rob Clark wrote:
[...]
Somebody will prove me wrong but the longer we go without 2D the less likely it is that we'll ever see it.
Reviewed-by: Jordan Crouse jcrouse@codeaurora.org
Signed-off-by: Rob Clark robdclark@chromium.org
From: Rob Clark robdclark@chromium.org
Now that we are not relying on dev->struct_mutex to protect the ring->submits lists, drop the struct_mutex lock.
Signed-off-by: Rob Clark robdclark@chromium.org
---
 drivers/gpu/drm/msm/msm_gpu.c | 8 +-------
 1 file changed, 1 insertion(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c
index c9ff19a75169..5e351d1c00e9 100644
--- a/drivers/gpu/drm/msm/msm_gpu.c
+++ b/drivers/gpu/drm/msm/msm_gpu.c
@@ -707,7 +707,7 @@ static void retire_submit(struct msm_gpu *gpu, struct msm_ringbuffer *ring,

 		msm_gem_active_put(&msm_obj->base);
 		msm_gem_unpin_iova(&msm_obj->base, submit->aspace);
-		drm_gem_object_put_locked(&msm_obj->base);
+		drm_gem_object_put(&msm_obj->base);
 	}

 	pm_runtime_mark_last_busy(&gpu->pdev->dev);
@@ -722,11 +722,8 @@ static void retire_submit(struct msm_gpu *gpu, struct msm_ringbuffer *ring,

 static void retire_submits(struct msm_gpu *gpu)
 {
-	struct drm_device *dev = gpu->dev;
 	int i;

-	WARN_ON(!mutex_is_locked(&dev->struct_mutex));
-
 	/* Retire the commits starting with highest priority */
 	for (i = 0; i < gpu->nr_rings; i++) {
 		struct msm_ringbuffer *ring = gpu->rb[i];
@@ -756,15 +753,12 @@ static void retire_submits(struct msm_gpu *gpu)
 static void retire_worker(struct work_struct *work)
 {
 	struct msm_gpu *gpu = container_of(work, struct msm_gpu, retire_work);
-	struct drm_device *dev = gpu->dev;
 	int i;

 	for (i = 0; i < gpu->nr_rings; i++)
 		update_fences(gpu, gpu->rb[i], gpu->rb[i]->memptrs->fence);

-	mutex_lock(&dev->struct_mutex);
 	retire_submits(gpu);
-	mutex_unlock(&dev->struct_mutex);
 }

 /* call from irq handler to schedule work to retire bo's */
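For context, once retire_submits() relies only on ring->submit_lock, the loop takes roughly the following shape (a sketch reconstructed from the hunks above, not a verbatim hunk from this series):

/* Sketch only; reconstructed, not a hunk from the patch. */
static void retire_submits(struct msm_gpu *gpu)
{
	int i;

	/* Retire the commits starting with highest priority */
	for (i = 0; i < gpu->nr_rings; i++) {
		struct msm_ringbuffer *ring = gpu->rb[i];

		while (true) {
			struct msm_gem_submit *submit;

			/* peek at the oldest submit on this ring: */
			spin_lock(&ring->submit_lock);
			submit = list_first_entry_or_null(&ring->submits,
					struct msm_gem_submit, node);
			spin_unlock(&ring->submit_lock);

			if (!submit || !dma_fence_is_signaled(submit->fence))
				break;

			/* drops the list's ref via msm_gem_submit_put(): */
			retire_submit(gpu, ring, submit);
		}
	}
}

The submit is guaranteed to stay alive between the peek and retire_submit() because the ring->submits list itself holds a reference, added by the refcounting patch earlier in the series.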
On Sun, Oct 04, 2020 at 12:21:41PM -0700, Rob Clark wrote:
[...]
Reviewed-by: Jordan Crouse jcrouse@codeaurora.org
Signed-off-by: Rob Clark robdclark@chromium.org
From: Rob Clark robdclark@chromium.org
Now that active_list/inactive_list is protected by mm_lock, we no longer need dev->struct_mutex in the free_object() path.
Signed-off-by: Rob Clark robdclark@chromium.org
---
 drivers/gpu/drm/msm/msm_gem.c | 8 --------
 1 file changed, 8 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c
index c52a3969e60b..126d92fd21cd 100644
--- a/drivers/gpu/drm/msm/msm_gem.c
+++ b/drivers/gpu/drm/msm/msm_gem.c
@@ -927,8 +927,6 @@ static void free_object(struct msm_gem_object *msm_obj)
 	struct drm_device *dev = obj->dev;
 	struct msm_drm_private *priv = dev->dev_private;

-	WARN_ON(!mutex_is_locked(&dev->struct_mutex));
-
 	/* object should not be on active list: */
 	WARN_ON(is_active(msm_obj));

@@ -965,20 +963,14 @@ void msm_gem_free_work(struct work_struct *work)
 {
 	struct msm_drm_private *priv =
 		container_of(work, struct msm_drm_private, free_work);
-	struct drm_device *dev = priv->dev;
 	struct llist_node *freed;
 	struct msm_gem_object *msm_obj, *next;

 	while ((freed = llist_del_all(&priv->free_list))) {
-
-		mutex_lock(&dev->struct_mutex);
-
 		llist_for_each_entry_safe(msm_obj, next,
 					  freed, freed)
 			free_object(msm_obj);

-		mutex_unlock(&dev->struct_mutex);
-
 		if (need_resched())
 			break;
 	}
From: Rob Clark robdclark@chromium.org
Now that we don't need struct_mutex in the free path, we can get rid of the asynchronous free altogether.
Signed-off-by: Rob Clark robdclark@chromium.org
---
 drivers/gpu/drm/msm/msm_drv.c |  3 ---
 drivers/gpu/drm/msm/msm_drv.h |  5 -----
 drivers/gpu/drm/msm/msm_gem.c | 27 ---------------------------
 drivers/gpu/drm/msm/msm_gem.h |  1 -
 4 files changed, 36 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
index dc6efc089285..e766c1f45045 100644
--- a/drivers/gpu/drm/msm/msm_drv.c
+++ b/drivers/gpu/drm/msm/msm_drv.c
@@ -437,9 +437,6 @@ static int msm_drm_init(struct device *dev, struct drm_driver *drv)

 	priv->wq = alloc_ordered_workqueue("msm", 0);

-	INIT_WORK(&priv->free_work, msm_gem_free_work);
-	init_llist_head(&priv->free_list);
-
 	INIT_LIST_HEAD(&priv->inactive_list);
 	mutex_init(&priv->mm_lock);

diff --git a/drivers/gpu/drm/msm/msm_drv.h b/drivers/gpu/drm/msm/msm_drv.h
index 535f9e718e2d..96f8009e247c 100644
--- a/drivers/gpu/drm/msm/msm_drv.h
+++ b/drivers/gpu/drm/msm/msm_drv.h
@@ -188,10 +188,6 @@ struct msm_drm_private {
 	struct list_head inactive_list;
 	struct mutex mm_lock;

-	/* worker for delayed free of objects: */
-	struct work_struct free_work;
-	struct llist_head free_list;
-
 	struct workqueue_struct *wq;

 	unsigned int num_planes;
@@ -340,7 +336,6 @@ void msm_gem_kernel_put(struct drm_gem_object *bo,
 		struct msm_gem_address_space *aspace, bool locked);
 struct drm_gem_object *msm_gem_import(struct drm_device *dev,
 		struct dma_buf *dmabuf, struct sg_table *sgt);
-void msm_gem_free_work(struct work_struct *work);

 __printf(2, 3)
 void msm_gem_object_set_name(struct drm_gem_object *bo, const char *fmt, ...);
diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c
index 126d92fd21cd..5e75d644ce41 100644
--- a/drivers/gpu/drm/msm/msm_gem.c
+++ b/drivers/gpu/drm/msm/msm_gem.c
@@ -917,16 +917,6 @@ void msm_gem_free_object(struct drm_gem_object *obj)
 	struct drm_device *dev = obj->dev;
 	struct msm_drm_private *priv = dev->dev_private;

-	if (llist_add(&msm_obj->freed, &priv->free_list))
-		queue_work(priv->wq, &priv->free_work);
-}
-
-static void free_object(struct msm_gem_object *msm_obj)
-{
-	struct drm_gem_object *obj = &msm_obj->base;
-	struct drm_device *dev = obj->dev;
-	struct msm_drm_private *priv = dev->dev_private;
-
 	/* object should not be on active list: */
 	WARN_ON(is_active(msm_obj));

@@ -959,23 +949,6 @@ static void free_object(struct msm_gem_object *msm_obj)
 	kfree(msm_obj);
 }

-void msm_gem_free_work(struct work_struct *work)
-{
-	struct msm_drm_private *priv =
-		container_of(work, struct msm_drm_private, free_work);
-	struct llist_node *freed;
-	struct msm_gem_object *msm_obj, *next;
-
-	while ((freed = llist_del_all(&priv->free_list))) {
-		llist_for_each_entry_safe(msm_obj, next,
-					  freed, freed)
-			free_object(msm_obj);
-
-		if (need_resched())
-			break;
-	}
-}
-
 /* convenience method to construct a GEM buffer object, and userspace handle */
 int msm_gem_new_handle(struct drm_device *dev, struct drm_file *file,
 		uint32_t size, uint32_t flags, uint32_t *handle,
diff --git a/drivers/gpu/drm/msm/msm_gem.h b/drivers/gpu/drm/msm/msm_gem.h
index 61147bd96b06..e98a8004813b 100644
--- a/drivers/gpu/drm/msm/msm_gem.h
+++ b/drivers/gpu/drm/msm/msm_gem.h
@@ -127,7 +127,6 @@ enum msm_gem_lock {

 void msm_gem_purge(struct drm_gem_object *obj, enum msm_gem_lock subclass);
 void msm_gem_vunmap(struct drm_gem_object *obj, enum msm_gem_lock subclass);
-void msm_gem_free_work(struct work_struct *work);

 /* Created per submit-ioctl, to track bo's and cmdstream bufs, etc,
  * associated with the cmdstream submission for synchronization (and
From: Rob Clark robdclark@chromium.org
The obj->lock is sufficient for what we need.
This *does* have the implication that userspace can try to shoot themselves in the foot by racing madvise(DONTNEED) with submit. But the result will be about the same if they did madvise(DONTNEED) before the submit ioctl, ie. they might not get what they want if they race with the shrinker. But iova fault handling is robust enough, and userspace is only shooting its own foot.
Signed-off-by: Rob Clark robdclark@chromium.org
---
 drivers/gpu/drm/msm/msm_drv.c          | 11 ++------
 drivers/gpu/drm/msm/msm_gem.c          |  6 ++--
 drivers/gpu/drm/msm/msm_gem.h          | 38 ++++++++++++++++++--------
 drivers/gpu/drm/msm/msm_gem_shrinker.c |  4 +--
 4 files changed, 32 insertions(+), 27 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
index e766c1f45045..d2488816ce48 100644
--- a/drivers/gpu/drm/msm/msm_drv.c
+++ b/drivers/gpu/drm/msm/msm_drv.c
@@ -906,14 +906,9 @@ static int msm_ioctl_gem_madvise(struct drm_device *dev, void *data,
 		return -EINVAL;
 	}

-	ret = mutex_lock_interruptible(&dev->struct_mutex);
-	if (ret)
-		return ret;
-
 	obj = drm_gem_object_lookup(file, args->handle);
 	if (!obj) {
-		ret = -ENOENT;
-		goto unlock;
+		return -ENOENT;
 	}

 	ret = msm_gem_madvise(obj, args->madv);
@@ -922,10 +917,8 @@ static int msm_ioctl_gem_madvise(struct drm_device *dev, void *data,
 		ret = 0;
 	}

-	drm_gem_object_put_locked(obj);
+	drm_gem_object_put(obj);

-unlock:
-	mutex_unlock(&dev->struct_mutex);
 	return ret;
 }

diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c
index 5e75d644ce41..9cdac4f7228c 100644
--- a/drivers/gpu/drm/msm/msm_gem.c
+++ b/drivers/gpu/drm/msm/msm_gem.c
@@ -639,8 +639,6 @@ int msm_gem_madvise(struct drm_gem_object *obj, unsigned madv)

 	mutex_lock(&msm_obj->lock);

-	WARN_ON(!mutex_is_locked(&obj->dev->struct_mutex));
-
 	if (msm_obj->madv != __MSM_MADV_PURGED)
 		msm_obj->madv = madv;

@@ -657,7 +655,7 @@ void msm_gem_purge(struct drm_gem_object *obj, enum msm_gem_lock subclass)
 	struct msm_gem_object *msm_obj = to_msm_bo(obj);

 	WARN_ON(!mutex_is_locked(&dev->struct_mutex));
-	WARN_ON(!is_purgeable(msm_obj));
+	WARN_ON(!is_purgeable(msm_obj, subclass));
 	WARN_ON(obj->import_attach);

 	mutex_lock_nested(&msm_obj->lock, subclass);
@@ -749,7 +747,7 @@ void msm_gem_active_get(struct drm_gem_object *obj, struct msm_gpu *gpu)
 	struct msm_drm_private *priv = obj->dev->dev_private;

 	might_sleep();
-	WARN_ON(msm_obj->madv != MSM_MADV_WILLNEED);
+	WARN_ON(msm_gem_madv(msm_obj, OBJ_LOCK_NORMAL) != MSM_MADV_WILLNEED);

 	if (!atomic_fetch_inc(&msm_obj->active_count)) {
 		mutex_lock(&priv->mm_lock);
diff --git a/drivers/gpu/drm/msm/msm_gem.h b/drivers/gpu/drm/msm/msm_gem.h
index e98a8004813b..bb8aa6b1b254 100644
--- a/drivers/gpu/drm/msm/msm_gem.h
+++ b/drivers/gpu/drm/msm/msm_gem.h
@@ -97,18 +97,6 @@ static inline bool is_active(struct msm_gem_object *msm_obj)
 	return atomic_read(&msm_obj->active_count);
 }

-static inline bool is_purgeable(struct msm_gem_object *msm_obj)
-{
-	WARN_ON(!mutex_is_locked(&msm_obj->base.dev->struct_mutex));
-	return (msm_obj->madv == MSM_MADV_DONTNEED) && msm_obj->sgt &&
-			!msm_obj->base.dma_buf && !msm_obj->base.import_attach;
-}
-
-static inline bool is_vunmapable(struct msm_gem_object *msm_obj)
-{
-	return (msm_obj->vmap_count == 0) && msm_obj->vaddr;
-}
-
 /* The shrinker can be triggered while we hold objA->lock, and need
  * to grab objB->lock to purge it.  Lockdep just sees these as a single
  * class of lock, so we use subclasses to teach it the difference.
@@ -125,6 +113,32 @@ enum msm_gem_lock {
 	OBJ_LOCK_SHRINKER,
 };

+/* Use this helper to read msm_obj->madv when msm_obj->lock not held: */
+static inline unsigned
+msm_gem_madv(struct msm_gem_object *msm_obj, enum msm_gem_lock subclass)
+{
+	unsigned madv;
+
+	mutex_lock_nested(&msm_obj->lock, subclass);
+	madv = msm_obj->madv;
+	mutex_unlock(&msm_obj->lock);
+
+	return madv;
+}
+
+static inline bool
+is_purgeable(struct msm_gem_object *msm_obj, enum msm_gem_lock subclass)
+{
+	return (msm_gem_madv(msm_obj, subclass) == MSM_MADV_DONTNEED) &&
+			msm_obj->sgt && !msm_obj->base.dma_buf &&
+			!msm_obj->base.import_attach;
+}
+
+static inline bool is_vunmapable(struct msm_gem_object *msm_obj)
+{
+	return (msm_obj->vmap_count == 0) && msm_obj->vaddr;
+}
+
 void msm_gem_purge(struct drm_gem_object *obj, enum msm_gem_lock subclass);
 void msm_gem_vunmap(struct drm_gem_object *obj, enum msm_gem_lock subclass);

diff --git a/drivers/gpu/drm/msm/msm_gem_shrinker.c b/drivers/gpu/drm/msm/msm_gem_shrinker.c
index c41b84a3a484..39a1b5327267 100644
--- a/drivers/gpu/drm/msm/msm_gem_shrinker.c
+++ b/drivers/gpu/drm/msm/msm_gem_shrinker.c
@@ -54,7 +54,7 @@ msm_gem_shrinker_count(struct shrinker *shrinker, struct shrink_control *sc)
 	mutex_lock(&priv->mm_lock);

 	list_for_each_entry(msm_obj, &priv->inactive_list, mm_list) {
-		if (is_purgeable(msm_obj))
+		if (is_purgeable(msm_obj, OBJ_LOCK_SHRINKER))
 			count += msm_obj->base.size >> PAGE_SHIFT;
 	}

@@ -84,7 +84,7 @@ msm_gem_shrinker_scan(struct shrinker *shrinker, struct shrink_control *sc)
 	list_for_each_entry(msm_obj, &priv->inactive_list, mm_list) {
 		if (freed >= sc->nr_to_scan)
 			break;
-		if (is_purgeable(msm_obj)) {
+		if (is_purgeable(msm_obj, OBJ_LOCK_SHRINKER)) {
 			msm_gem_purge(&msm_obj->base, OBJ_LOCK_SHRINKER);
 			freed += msm_obj->base.size >> PAGE_SHIFT;
 		}
From: Rob Clark robdclark@chromium.org
Now that the inactive_list is protected by mm_lock, and everything else on per-obj basis is protected by obj->lock, we no longer depend on struct_mutex.
Signed-off-by: Rob Clark robdclark@chromium.org
---
 drivers/gpu/drm/msm/msm_gem.c          |  1 -
 drivers/gpu/drm/msm/msm_gem_shrinker.c | 54 --------------------------
 2 files changed, 55 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c
index 9cdac4f7228c..e749a1c6f4e0 100644
--- a/drivers/gpu/drm/msm/msm_gem.c
+++ b/drivers/gpu/drm/msm/msm_gem.c
@@ -654,7 +654,6 @@ void msm_gem_purge(struct drm_gem_object *obj, enum msm_gem_lock subclass)
 	struct drm_device *dev = obj->dev;
 	struct msm_gem_object *msm_obj = to_msm_bo(obj);

-	WARN_ON(!mutex_is_locked(&dev->struct_mutex));
 	WARN_ON(!is_purgeable(msm_obj, subclass));
 	WARN_ON(obj->import_attach);

diff --git a/drivers/gpu/drm/msm/msm_gem_shrinker.c b/drivers/gpu/drm/msm/msm_gem_shrinker.c
index 39a1b5327267..2c7bda1e2bf9 100644
--- a/drivers/gpu/drm/msm/msm_gem_shrinker.c
+++ b/drivers/gpu/drm/msm/msm_gem_shrinker.c
@@ -8,48 +8,13 @@
 #include "msm_gem.h"
 #include "msm_gpu_trace.h"

-static bool msm_gem_shrinker_lock(struct drm_device *dev, bool *unlock)
-{
-	/* NOTE: we are *closer* to being able to get rid of
-	 * mutex_trylock_recursive().. the msm_gem code itself does
-	 * not need struct_mutex, although codepaths that can trigger
-	 * shrinker are still called in code-paths that hold the
-	 * struct_mutex.
-	 *
-	 * Also, msm_obj->madv is protected by struct_mutex.
-	 *
-	 * The next step is probably split out a seperate lock for
-	 * protecting inactive_list, so that shrinker does not need
-	 * struct_mutex.
-	 */
-	switch (mutex_trylock_recursive(&dev->struct_mutex)) {
-	case MUTEX_TRYLOCK_FAILED:
-		return false;
-
-	case MUTEX_TRYLOCK_SUCCESS:
-		*unlock = true;
-		return true;
-
-	case MUTEX_TRYLOCK_RECURSIVE:
-		*unlock = false;
-		return true;
-	}
-
-	BUG();
-}
-
 static unsigned long
 msm_gem_shrinker_count(struct shrinker *shrinker, struct shrink_control *sc)
 {
 	struct msm_drm_private *priv =
 		container_of(shrinker, struct msm_drm_private, shrinker);
-	struct drm_device *dev = priv->dev;
 	struct msm_gem_object *msm_obj;
 	unsigned long count = 0;
-	bool unlock;
-
-	if (!msm_gem_shrinker_lock(dev, &unlock))
-		return 0;

 	mutex_lock(&priv->mm_lock);

@@ -60,9 +25,6 @@ msm_gem_shrinker_count(struct shrinker *shrinker, struct shrink_control *sc)

 	mutex_unlock(&priv->mm_lock);

-	if (unlock)
-		mutex_unlock(&dev->struct_mutex);
-
 	return count;
 }

@@ -71,13 +33,8 @@ msm_gem_shrinker_scan(struct shrinker *shrinker, struct shrink_control *sc)
 {
 	struct msm_drm_private *priv =
 		container_of(shrinker, struct msm_drm_private, shrinker);
-	struct drm_device *dev = priv->dev;
 	struct msm_gem_object *msm_obj;
 	unsigned long freed = 0;
-	bool unlock;
-
-	if (!msm_gem_shrinker_lock(dev, &unlock))
-		return SHRINK_STOP;

 	mutex_lock(&priv->mm_lock);

@@ -92,9 +49,6 @@ msm_gem_shrinker_scan(struct shrinker *shrinker, struct shrink_control *sc)

 	mutex_unlock(&priv->mm_lock);

-	if (unlock)
-		mutex_unlock(&dev->struct_mutex);
-
 	if (freed > 0)
 		trace_msm_gem_purge(freed << PAGE_SHIFT);

@@ -106,13 +60,8 @@ msm_gem_shrinker_vmap(struct notifier_block *nb, unsigned long event, void *ptr)
 {
 	struct msm_drm_private *priv =
 		container_of(nb, struct msm_drm_private, vmap_notifier);
-	struct drm_device *dev = priv->dev;
 	struct msm_gem_object *msm_obj;
 	unsigned unmapped = 0;
-	bool unlock;
-
-	if (!msm_gem_shrinker_lock(dev, &unlock))
-		return NOTIFY_DONE;

 	mutex_lock(&priv->mm_lock);

@@ -130,9 +79,6 @@ msm_gem_shrinker_vmap(struct notifier_block *nb, unsigned long event, void *ptr)

 	mutex_unlock(&priv->mm_lock);

-	if (unlock)
-		mutex_unlock(&dev->struct_mutex);
-
 	*(unsigned long *)ptr += unmapped;

 	if (unmapped > 0)
On Sun, 4 Oct 2020 12:21:45 -0700, Rob Clark wrote:
From: Rob Clark robdclark@chromium.org
Now that the inactive_list is protected by mm_lock, and everything else on per-obj basis is protected by obj->lock, we no longer depend on struct_mutex.
Signed-off-by: Rob Clark robdclark@chromium.org
drivers/gpu/drm/msm/msm_gem.c | 1 - drivers/gpu/drm/msm/msm_gem_shrinker.c | 54 -------------------------- 2 files changed, 55 deletions(-)
[...]
@@ -71,13 +33,8 @@ msm_gem_shrinker_scan(struct shrinker *shrinker, struct shrink_control *sc)
 {
 	struct msm_drm_private *priv =
 		container_of(shrinker, struct msm_drm_private, shrinker);
-	struct drm_device *dev = priv->dev;
 	struct msm_gem_object *msm_obj;
 	unsigned long freed = 0;
-	bool unlock;
-
-	if (!msm_gem_shrinker_lock(dev, &unlock))
-		return SHRINK_STOP;

 	mutex_lock(&priv->mm_lock);
Better if the change in behavior is documented, i.e. that SHRINK_STOP will no longer be needed.
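For reference, SHRINK_STOP tells reclaim that the scan callback could not even attempt to make progress (e.g. lock contention) and that it should stop calling this shrinker for the current pass; returning a count reports actual progress. A minimal sketch, with invented helper names:

static unsigned long
example_scan(struct shrinker *shrinker, struct shrink_control *sc)
{
	/*
	 * Old msm behavior: bail without trying when struct_mutex
	 * could not be taken:
	 */
	if (!can_make_progress())		/* invented helper */
		return SHRINK_STOP;

	/*
	 * New msm behavior: mm_lock can always be taken here, so the
	 * callback simply reports how many pages it actually freed:
	 */
	return reclaim_some(sc->nr_to_scan);	/* invented helper */
}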
On Mon, Oct 05, 2020 at 05:24:19PM +0800, Hillf Danton wrote:
[...]
Better if the change in behavior is documented, i.e. that SHRINK_STOP will no longer be needed.
btw I read through this and noticed you have your own obj lock, plus mutex_lock_nested. I strongly recommend to just cut over to dma_resv_lock for all object lock needs (soc drivers have been terrible with this unfortunately), and in the shrinker just use dma_resv_trylock instead of trying to play clever games outsmarting lockdep.
I recently wrote an entire blog length rant on why I think mutex_lock_nested is too dangerous to be useful:
https://blog.ffwll.ch/2020/08/lockdep-false-positives.html
Not anything about this here, just general comment. The problem extends to shmem helpers and all that also having their own locks for everything. -Daniel
On Mon, Oct 5, 2020 at 4:02 PM Daniel Vetter daniel@ffwll.ch wrote:
[...]
This is definitely a tangible improvement though - very happy to see msm_gem_shrinker_lock() go.
Reviewed-by: Kristian H. Kristensen hoegsberg@google.com
On Mon, 5 Oct 2020 18:17:01 Kristian H. Kristensen wrote:
[...]
btw I read through this and noticed you have your own obj lock, plus mutex_lock_nested. I strongly recommend to just cut over to dma_resv_lock for all object lock needs (soc drivers have been terrible with this unfortunately), and in the shrinker just use dma_resv_trylock instead of trying to play clever games outsmarting lockdep.
The trylock makes page reclaimers turn to their next target e.g. inode cache instead of waiting for the mutex to be released. It makes sense for instance in scenarios of mild memory pressure.
On Mon, Oct 5, 2020 at 5:44 PM Hillf Danton hdanton@sina.com wrote:
[...]
The trylock makes page reclaimers turn to their next target e.g. inode cache instead of waiting for the mutex to be released. It makes sense for instance in scenarios of mild memory pressure.
is there some behind-the-scenes signalling for this, or is this just down to what the shrinker callbacks return? Generally when we get into shrinking, there are a big set of purgable bo's to consider, so the shrinker callback return wouldn't be considering just one potentially lock contended bo (buffer object). Ie failing one trylock, we just move on to the next.
fwiw, what I've seen on the userspace bo cache vs shrinker (anything that is shrinker potential is in userspace bo cache and MADV(WONTNEED)) is that in steady state I see a very strong recycling of bo's (which avoids allocating and mmap'ing or mapping to gpu a new buffer object), so it is definitely a win in mmap/realloc bandwidth.. in steady state there is a lot of free and realloc of same-sized buffers from frame to frame.
But in transient situations like moving to new game level when there is a heavy memory pressure and lots of freeing old buffers/textures/etc and then allocating new ones, I see shrinker kicking in hard (in android situations, not so much so with traditional linux userspace)
BR, -R
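A rough sketch of the userspace flow Rob describes; the helpers here are hypothetical (this is not the actual Mesa/libdrm bo cache), only the MSM_MADV_* values and the madvise ioctl are real UAPI:

#include <stdbool.h>
#include <stdint.h>
#include <drm/msm_drm.h>	/* MSM_MADV_WILLNEED / MSM_MADV_DONTNEED */

struct bo;
struct bo_cache;

/* hypothetical helpers, assumed to exist elsewhere: */
bool bo_madvise(struct bo *bo, uint32_t madv);	/* wraps DRM_MSM_GEM_MADVISE */
struct bo *bucket_pop(struct bo_cache *c, uint32_t size);
void bucket_push(struct bo_cache *c, struct bo *bo);
struct bo *bo_new(uint32_t size);
void bo_del(struct bo *bo);

void bo_cache_free(struct bo_cache *cache, struct bo *bo)
{
	/* kernel may now reclaim the pages under memory pressure: */
	bo_madvise(bo, MSM_MADV_DONTNEED);
	bucket_push(cache, bo);
}

struct bo *bo_cache_alloc(struct bo_cache *cache, uint32_t size)
{
	struct bo *bo = bucket_pop(cache, size);

	if (bo) {
		/* madvise reports whether the pages survived the shrinker: */
		if (bo_madvise(bo, MSM_MADV_WILLNEED))
			return bo;	/* steady state: recycled, no new mmap */
		bo_del(bo);		/* purged under pressure */
	}
	return bo_new(size);		/* transient state: fresh allocation */
}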
On Mon, 5 Oct 2020 20:40:12 Rob Clark robdclark@gmail.com wrote:
[...]
is there some behind-the-scenes signalling for this, or is this just down to what the shrinker callbacks return?
Let's see what Dave may have in mind about your questions.
On Mon, Oct 05, 2020 at 08:40:12PM -0700, Rob Clark wrote:
[...]
Yeah per-buffer trylock is fine. Trylock on the mm_lock (or anything else device-global, like struct_mutex and msm_gem_shrinker_lock) I think isn't fine, since if you're unlucky you're hogging a ton of memory and that's the only freeable resource in the system. Going to other shrinkers won't help when it's the gpu shrinker that has all the freeable memory.
Also other shrinkers (inode and all these) also do lots of per-object trylocking. I think there's a canonical threshold of shrinker rounds where you're supposed to try harder (if possible), but that doesn't apply to dma_resv_lock. -Daniel
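A sketch of the per-object trylock pattern being recommended here, against msm's inactive_list. dma_resv_trylock()/dma_resv_unlock() and the resv pointer in drm_gem_object are real; is_purgeable_locked() and purge_locked() are stand-ins for what msm's purge path would become:

static unsigned long
scan_inactive_trylock(struct msm_drm_private *priv, unsigned long nr_to_scan)
{
	struct msm_gem_object *msm_obj;
	unsigned long freed = 0;

	mutex_lock(&priv->mm_lock);
	list_for_each_entry(msm_obj, &priv->inactive_list, mm_list) {
		if (freed >= nr_to_scan)
			break;
		/* contended => object is in use, just move to the next one: */
		if (!dma_resv_trylock(msm_obj->base.resv))
			continue;
		if (is_purgeable_locked(msm_obj))	/* stand-in helper */
			freed += purge_locked(msm_obj);	/* stand-in helper */
		dma_resv_unlock(msm_obj->base.resv);
	}
	mutex_unlock(&priv->mm_lock);

	return freed;
}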
On Mon, Oct 5, 2020 at 7:02 AM Daniel Vetter daniel@ffwll.ch wrote:
[...]
btw I read through this and noticed you have your own obj lock, plus mutex_lock_nested. I strongly recommend to just cut over to dma_resv_lock for all object lock needs (soc drivers have been terrible with this unfortunately), and in the shrinker just use dma_resv_trylock instead of trying to play clever games outsmarting lockdep.
[...]
the shrinker lock class has existed for a while.. and is based on the idea that anything in the get-pages/vmap path cannot happen on a WONTNEED bo.. although perhaps there should be a few more 'if (WARN_ON(obj->madv != WILLNEED)) return -EBUSY'..
replacing obj->lock with dma_resv lock might be a nice cleanup.. but I think it will be a bit churny..
BR, -R
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
On Mon, Oct 5, 2020 at 6:49 PM Rob Clark robdclark@gmail.com wrote:
On Mon, Oct 5, 2020 at 7:02 AM Daniel Vetter daniel@ffwll.ch wrote:
On Mon, Oct 05, 2020 at 05:24:19PM +0800, Hillf Danton wrote:
On Sun, 4 Oct 2020 12:21:45
From: Rob Clark robdclark@chromium.org
Now that the inactive_list is protected by mm_lock, and everything else on a per-obj basis is protected by obj->lock, we no longer depend on struct_mutex.
Signed-off-by: Rob Clark robdclark@chromium.org
---
 drivers/gpu/drm/msm/msm_gem.c          |  1 -
 drivers/gpu/drm/msm/msm_gem_shrinker.c | 54 --------------------------
 2 files changed, 55 deletions(-)
[...]
@@ -71,13 +33,8 @@ msm_gem_shrinker_scan(struct shrinker *shrinker, struct shrink_control *sc)
 {
 	struct msm_drm_private *priv =
 		container_of(shrinker, struct msm_drm_private, shrinker);
 	struct drm_device *dev = priv->dev;
 	struct msm_gem_object *msm_obj;
 	unsigned long freed = 0;
-	bool unlock;

-	if (!msm_gem_shrinker_lock(dev, &unlock))
-		return SHRINK_STOP;
+	mutex_lock(&priv->mm_lock);
It would be better if this change in behavior were documented, i.e. that returning SHRINK_STOP will no longer be needed.
btw I read through this and noticed you have your own obj lock, plus mutex_lock_nested. I strongly recommend just cutting over to dma_resv_lock for all object lock needs (soc drivers have been terrible with this unfortunately), and in the shrinker just use dma_resv_trylock instead of trying to play clever games outsmarting lockdep.
I recently wrote an entire blog length rant on why I think mutex_lock_nested is too dangerous to be useful:
https://blog.ffwll.ch/2020/08/lockdep-false-positives.html
Not anything about this here, just general comment. The problem extends to shmem helpers and all that also having their own locks for everything.
the shrinker lock class has existed for a while.. and is based on the idea that anything in the get-pages/vmap path cannot happen on a WONTNEED bo.. although perhaps there should be a few more 'if (WARN_ON(obj->madv != WILLNEED)) return -EBUSY'..
Yeah it works, but it's the kind of really clever stuff that eventually bites again, and for pretty much no benefit: if the lock is held then you can assume someone else is using the object and you won't be able to shrink it anyway. So trylock is enough.
replacing obj->lock with dma_resv lock might be a nice cleanup.. but I think it will be a bit churny..
Oh fully agreed, I tried to push the helpers a bit in that direction for shmem/cma and gave up. Just something I think we should have in mind. And in case your gpu ever becomes discrete ... yes the churn is terrible :-/ -Daniel
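For reference, the shape of the cutover being discussed (sketch; msm still uses its own obj->lock at this point, and obj->resv here is the drm_gem_object's reservation object):

	/* regular paths: block on the object's reservation lock */
	ret = dma_resv_lock(obj->resv, NULL);
	if (ret)
		return ret;
	/* ... get_pages()/vmap/madvise state updates ... */
	dma_resv_unlock(obj->resv);

	/* shrinker: never block, a contended object is in use and
	 * not shrinkable right now anyway */
	if (!dma_resv_trylock(obj->resv))
		continue;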
From: Rob Clark robdclark@chromium.org
Any cross-device sync use-cases *must* use explicit sync. And if there is only a single ring (no preemption), everything is in FIFO order and there is no need to implicit-sync.
Mesa should probably just always use MSM_SUBMIT_NO_IMPLICIT, as behavior is undefined when fences are not used to synchronize buffer usage across contexts (which is the only case where multiple different priority rings could come into play).
Signed-off-by: Rob Clark robdclark@chromium.org
---
 drivers/gpu/drm/msm/msm_gem_submit.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c b/drivers/gpu/drm/msm/msm_gem_submit.c
index 7d653bdc92dc..b9b68153b7b2 100644
--- a/drivers/gpu/drm/msm/msm_gem_submit.c
+++ b/drivers/gpu/drm/msm/msm_gem_submit.c
@@ -219,7 +219,7 @@ static int submit_lock_objects(struct msm_gem_submit *submit)
 	return ret;
 }

-static int submit_fence_sync(struct msm_gem_submit *submit, bool no_implicit)
+static int submit_fence_sync(struct msm_gem_submit *submit, bool implicit_sync)
 {
 	int i, ret = 0;

@@ -239,7 +239,7 @@ static int submit_fence_sync(struct msm_gem_submit *submit, bool no_implicit)
 			return ret;
 		}

-		if (no_implicit)
+		if (!implicit_sync)
 			continue;

 		ret = msm_gem_sync_object(&msm_obj->base, submit->ring->fctx,
@@ -704,7 +704,8 @@ int msm_ioctl_gem_submit(struct drm_device *dev, void *data,
 	if (ret)
 		goto out;

-	ret = submit_fence_sync(submit, !!(args->flags & MSM_SUBMIT_NO_IMPLICIT));
+	ret = submit_fence_sync(submit, (gpu->nr_rings > 1) &&
+			!(args->flags & MSM_SUBMIT_NO_IMPLICIT));
 	if (ret)
 		goto out;
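On the Mesa side, always opting out of implicit sync would just mean unconditionally setting the flag in the submit ioctl, roughly like this (sketch, not the actual Mesa code; bos/cmds/queue_id are assumed to be set up by the caller):

	struct drm_msm_gem_submit req = {
		.flags   = MSM_PIPE_3D0 | MSM_SUBMIT_NO_IMPLICIT,
		.queueid = queue_id,
		.bos     = (uint64_t)(uintptr_t)bos,
		.nr_bos  = nr_bos,
		.cmds    = (uint64_t)(uintptr_t)cmds,
		.nr_cmds = nr_cmds,
	};

	/* cross-context/cross-device ordering handled via explicit
	 * fences instead of the implicit-sync path */
	ret = drmCommandWriteRead(fd, DRM_MSM_GEM_SUBMIT,
			&req, sizeof(req));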
On Sun, Oct 4, 2020 at 9:21 PM Rob Clark robdclark@gmail.com wrote:
From: Rob Clark robdclark@chromium.org
This doesn't remove *all* the struct_mutex, but it covers the worst of it, ie. shrinker/madvise/free/retire. The submit path still uses struct_mutex, but it still needs *something* to serialize a portion of the submit path, and lock_stat mostly just shows the lock contention there being with other submits. And there are a few other bits of struct_mutex usage in less critical paths (debugfs, etc). But this seems like a reasonable step in the right direction.
What a great patch set. Daniel has some good points and nothing that requires big changes, but on the other hand, I'm not sure it's something that needs to block this set either.
Either way, for the series
Reviewed-by: Kristian H. Kristensen hoegsberg@google.com
[...]
On Mon, Oct 5, 2020 at 6:24 PM Kristian Høgsberg hoegsberg@gmail.com wrote:
On Sun, Oct 4, 2020 at 9:21 PM Rob Clark robdclark@gmail.com wrote:
From: Rob Clark robdclark@chromium.org
This doesn't remove *all* the struct_mutex, but it covers the worst of it, ie. shrinker/madvise/free/retire. The submit path still uses struct_mutex, but it still needs *something* to serialize a portion of the submit path, and lock_stat mostly just shows the lock contention there being with other submits. And there are a few other bits of struct_mutex usage in less critical paths (debugfs, etc). But this seems like a reasonable step in the right direction.
What a great patch set. Daniel has some good points and nothing that requires big changes, but on the other hand, I'm not sure it's something that needs to block this set either.
Personally I'd throw the lockdep priming on top to make sure this stays correct (it's 3 lines), but yes imo this is all good to go. Just figured I'll sprinkle the latest in terms of gem locking over the series while it's here :-) -Daniel
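The priming in question is the fs_reclaim trick from the blog post linked above; for mm_lock it would look something like this at init time (sketch):

	/* tell lockdep up-front that mm_lock can be taken in the
	 * shrinker/reclaim path, so any inversion is reported right
	 * away instead of only when reclaim actually recurses into
	 * the driver under memory pressure */
	fs_reclaim_acquire(GFP_KERNEL);
	might_lock(&priv->mm_lock);
	fs_reclaim_release(GFP_KERNEL);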
Either way, for the series
Reviewed-by: Kristian H. Kristensen hoegsberg@google.com
[...]
On Mon, Oct 5, 2020 at 11:20 AM Daniel Vetter daniel@ffwll.ch wrote:
On Mon, Oct 5, 2020 at 6:24 PM Kristian Høgsberg hoegsberg@gmail.com wrote:
On Sun, Oct 4, 2020 at 9:21 PM Rob Clark robdclark@gmail.com wrote:
From: Rob Clark robdclark@chromium.org
This doesn't remove *all* the struct_mutex, but it covers the worst of it, ie. shrinker/madvise/free/retire. The submit path still uses struct_mutex, but it still needs *something* serialize a portion of the submit path, and lock_stat mostly just shows the lock contention there being with other submits. And there are a few other bits of struct_mutex usage in less critical paths (debugfs, etc). But this seems like a reasonable step in the right direction.
What a great patch set. Daniel has some good points and nothing that requires big changes, but on the other hand, I'm not sure it's something that needs to block this set either.
Personally I'd throw the lockdep priming on top to make sure this stays correct (it's 3 lines), but yes imo this is all good to go. Just figured I'll sprinkle the latest in terms of gem locking over the series while it's here :-)
Yeah, I'll defn throw the lockdep priming into v2.. and I've got using obj->resv for locking on the todo list, but it looks like enough churn that it will probably be its own series (but seems like there is room to introduce some lock/unlock helpers that don't really change anything but make an obj->lock transition easier)
BR, -R
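Such transition helpers could be as simple as the following (hypothetical sketch, not part of this series):

	/* today these just wrap the object's own mutex; later they
	 * can be cut over to dma_resv_lock()/dma_resv_unlock() in one
	 * spot without churning every call site a second time */
	static inline void msm_gem_lock(struct drm_gem_object *obj)
	{
		mutex_lock(&to_msm_bo(obj)->lock);
	}

	static inline void msm_gem_unlock(struct drm_gem_object *obj)
	{
		mutex_unlock(&to_msm_bo(obj)->lock);
	}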
-Daniel
Either way, for the series
Reviewed-by: Kristian H. Kristensen hoegsberg@google.com
[...]
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch