Re: [PATCH] drm/msm/kms: Make a lock_class_key for each crtc mutex

3 Feb 2021

On Wed, Feb 3, 2021 at 2:10 AM Daniel Vetter daniel@ffwll.ch wrote:
...
On Tue, Feb 02, 2021 at 08:51:25AM -0800, Rob Clark wrote:
...
On Tue, Feb 2, 2021 at 7:46 AM Daniel Vetter daniel@ffwll.ch wrote:
...
On Mon, Jan 25, 2021 at 03:49:01PM -0800, Stephen Boyd wrote:
...
Lockdep complains about an AA deadlock when rebooting the device.
============================================
WARNING: possible recursive locking detected
5.4.91 #1 Not tainted

reboot/5213 is trying to acquire lock:
ffffff80d13391b0 (&kms->commit_lock[i]){+.+.}, at: lock_crtcs+0x60/0xa4
but task is already holding lock:
ffffff80d1339110 (&kms->commit_lock[i]){+.+.}, at: lock_crtcs+0x60/0xa4
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0
lock(&kms->commit_lock[i]);
lock(&kms->commit_lock[i]);
*** DEADLOCK ***
May be due to missing lock nesting notation
6 locks held by reboot/5213:
__arm64_sys_reboot+0x148/0x2a0
device_shutdown+0x10c/0x2c4
drm_atomic_helper_shutdown+0x48/0xfc
modeset_lock+0x120/0x24c
lock_crtcs+0x60/0xa4
stack backtrace:
CPU: 4 PID: 5213 Comm: reboot Not tainted 5.4.91 #1
Hardware name: Google Pompom (rev1) with LTE (DT)
Call trace:
dump_backtrace+0x0/0x1dc
show_stack+0x24/0x30
dump_stack+0xfc/0x1a8
__lock_acquire+0xcd0/0x22b8
lock_acquire+0x1ec/0x240
__mutex_lock_common+0xe0/0xc84
mutex_lock_nested+0x48/0x58
lock_crtcs+0x60/0xa4
msm_atomic_commit_tail+0x348/0x570
commit_tail+0xdc/0x178
drm_atomic_helper_commit+0x160/0x168
drm_atomic_commit+0x68/0x80
This is because lockdep thinks all the locks taken in lock_crtcs() are
the same lock, when they actually aren't. That's because we call
mutex_init() in msm_kms_init() and that assigns on static key for every
lock initialized in this loop. Let's allocate a dynamic number of
lock_class_keys and assign them to each lock so that lockdep can figure
out an AA deadlock isn't possible here.
Fixes: b3d91800d9ac ("drm/msm: Fix race condition in msm driver with async layer updates")
Cc: Krishna Manikandan mkrishn@codeaurora.org
Signed-off-by: Stephen Boyd swboyd@chromium.org
This smells like throwing more bad after initial bad code ...
First a rant: https://blog.ffwll.ch/2020/08/lockdep-false-positives.html
Some technical on the patch itself: I think you want
mutex_lock_nested(crtc->lock, drm_crtc_index(crtc)), not your own locking
classes hand-rolled. It's defacto the same, but much more obviously
correct since self-documenting.
hmm, yeah, that is a bit cleaner.. but this patch is already on
msm-next, maybe I'll add a patch on top to change it
...
...
...
Yes I know the locking you're doing here is correct, but that goes to the
second issue: Why is this needed? atomic_async_update helpers are supposed
to take care of ordering fun like this, if they're not, we need to address
things there. The problem that
Maybe a better solution would be helper awareness of hw that has
double-buffered state and flush bits.. ie. something that looks a bit
more like the internal kms fxn ptrs. Currently the locking is
protecting something that the atomic helpers are not aware of, ie.
we've already written previous cursor updates to hw and are just
waiting until close to vblank to write the flush bits
But, we've been over this before. I'd tried various approaches.. the
current scheme replaces seanpaul's earlier attempts to do it the
"helper" way.  The current implementation does the best job of
avoiding fps drops when the legacy cursor uapi is in play.  (And yes,
legacy cursor + atomic ioctls is maybe not the greatest, but it is
what it is.)
I didn't read enough of the context and got confused, the flush handling
looks all reasonable and obviously needs some locks to avoid races with
updates.
It still looks a bit strange that you need multi-crtc locks for cursor
(generally this stuff is supposed to be solved with ordering) and why the
async helpers don't work since msm has something that's pretty close
itself. Atomic+cursor is a bit nasty, but if every driver hacks this
together themselves then there's not much chance of this ever really
working well across the board. And aside from the flush bit instead of
automatic double buffering (which you're just emulating) there's not
really anything special with msm afaics. So pretty sure that if this
doesn't work for msm, it doesn't work anywhere else.
I did dig out a hub/keyboard/mouse for the one mtk device I have to
see how it behaves with cursor updates.. the fps drops are pretty
bad.. roughly the same as they were with msm prior to the current
"defer the flush bits" scheme.  Maybe I'm just more picky than others.
Better helper support for this sort of hw would be useful, since I
think it is not uncommon.  (At least omap is similar, I didn't really
look at mtk/rockchip.)  If I had a clone to handle the display side of
things, my clone would work on that ;-)
BR,
-R
...
-Daniel
...
BR,
-R
...
commit b3d91800d9ac35014e0349292273a6fa7938d402
Author: Krishna Manikandan mkrishn@codeaurora.org
Date:   Fri Oct 16 19:40:43 2020 +0530
drm/msm: Fix race condition in msm driver with async layer updates


is _the_ reason we have drm_crtc_commit to track stuff, and Maxime has
recently rolled out a pile of changes to vc4 to use these things
correctly. Hacking some glorious hand-rolled locking for synchronization
of updates really should be the exception for kms drivers, not the rule.
And this one here doesn't look like an exception by far (the one legit I
know of is the locking issues amdgpu has between atomic_commit_tail and
gpu reset, and that one is really nasty, so not going to get fixed in
helpers, ever).
Cheers, Daniel
...

drivers/gpu/drm/msm/msm_kms.h | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_kms.h b/drivers/gpu/drm/msm/msm_kms.h
index d8151a89e163..4735251a394d 100644
--- a/drivers/gpu/drm/msm/msm_kms.h
+++ b/drivers/gpu/drm/msm/msm_kms.h
@@ -157,6 +157,7 @@ struct msm_kms {
       * from the crtc's pending_timer close to end of the frame:
       */
      struct mutex commit_lock[MAX_CRTCS];

struct lock_class_key commit_lock_keys[MAX_CRTCS];
unsigned pending_crtc_mask;
struct msm_pending_timer pending_timers[MAX_CRTCS];



};
@@ -166,8 +167,11 @@ static inline int msm_kms_init(struct msm_kms *kms,
 {
      unsigned i, ret;

for (i = 0; i < ARRAY_SIZE(kms->commit_lock); i++)


        mutex_init(&kms->commit_lock[i]);




for (i = 0; i < ARRAY_SIZE(kms->commit_lock); i++) {


        lockdep_register_key(&kms->commit_lock_keys[i]);


        __mutex_init(&kms->commit_lock[i], "&kms->commit_lock[i]",


                     &kms->commit_lock_keys[i]);


}

kms->funcs = funcs;





base-commit: 19c329f6808995b142b3966301f217c831e7cf31
https://chromeos.dev

dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

    

2025

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

2010

Re: [PATCH] drm/msm/kms: Make a lock_class_key for each crtc mutex

CPU0

base-commit: 19c329f6808995b142b3966301f217c831e7cf31