On Thu, Mar 03, 2022 at 08:03:21AM +0000, Hyeonggon Yoo wrote:
On Thu, Mar 03, 2022 at 09:18:13AM +0900, Byungchul Park wrote:
Hi Hyeonggon,
Dept also allows the following scenario when a user guarantees that each lock instance at a different depth is distinct from the others:
   lock A0 with depth
   lock A1 with depth + 1
   lock A2 with depth + 2
   lock A3 with depth + 3
   (and so on)
   ..
   unlock A3
   unlock A2
   unlock A1
   unlock A0
Look at this. Dept allows object->lock -> other_object->lock (with a different depth using *_lock_nested()), so it won't report it.
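For illustration, here is a minimal sketch of that safe pattern. struct obj and walk_pair() are hypothetical, made up for this example only, not actual kernel code:

  #include <linux/spinlock.h>
  #include <linux/lockdep.h>

  /* Hypothetical object with a per-instance lock. */
  struct obj {
          raw_spinlock_t lock;
          struct obj *child;
  };

  /*
   * Each level takes its own subclass via raw_spin_lock_nested(), and
   * the caller guarantees parent != child, so the two instances never
   * swap roles between contexts.
   */
  static void walk_pair(struct obj *parent)
  {
          struct obj *child = parent->child;

          raw_spin_lock(&parent->lock);                   /* depth */
          raw_spin_lock_nested(&child->lock,
                               SINGLE_DEPTH_NESTING);     /* depth + 1 */

          /* ... operate on both objects ... */

          raw_spin_unlock(&child->lock);
          raw_spin_unlock(&parent->lock);
  }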
However, Dept does not allow the following scenario, where another lock class cuts into the dependency chain:
   lock A0 with depth
   lock B
   lock A1 with depth + 1
   lock A2 with depth + 2
   lock A3 with depth + 3
   (and so on)
   ..
   unlock A3
   unlock A2
   unlock A1
   unlock B
   unlock A0
This scenario is clearly problematic. What do you think is going to happen with another context running the sequence shown below as context B?
First of all, I want to say I'm not an expert on locking primitives. I may be wrong.
It's okay. Thanks anyway for your feedback.
 * scan_mutex [-> object->lock] -> kmemleak_lock -> other_object->lock
 *   (SINGLE_DEPTH_NESTING)
 *
 * No kmemleak_lock and object->lock nesting is allowed outside scan_mutex
 * regions.
The lock order in kmemleak is described above, and DEPT detects the following two cases as deadlocks:
- object->lock -> other_object->lock
It's not a deadlock *IF* the two have different depths using *_lock_nested(). Dept also allows this case, so it wouldn't report it.
- object->lock -> kmemleak_lock, kmemleak_lock -> other_object->lock
But this usage is risky. I already explained it in the mail you replied to; I've copied it below.
context A
   lock A0 with depth
   lock B
   lock A1 with depth + 1
   lock A2 with depth + 2
   lock A3 with depth + 3
   (and so on)
   ..
   unlock A3
   unlock A2
   unlock A1
   unlock B
   unlock A0
...
context B
   lock A1 with depth
   lock B
   lock A2 with depth + 1
   lock A3 with depth + 2
   (and so on)
   ..
   unlock A3
   unlock A2
   unlock B
   unlock A1
where Ax : object->lock, B : kmemleak_lock.
A deadlock might occur if the two contexts run at the same time.
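To make the hazard concrete, here is a minimal sketch of that interleaving. a0, a1 and b are hypothetical locks standing in for two object->lock instances and kmemleak_lock; for simplicity they are separate static locks rather than two instances of one class:

  #include <linux/spinlock.h>
  #include <linux/lockdep.h>

  static DEFINE_RAW_SPINLOCK(a0);  /* an object->lock instance */
  static DEFINE_RAW_SPINLOCK(a1);  /* another object->lock instance */
  static DEFINE_RAW_SPINLOCK(b);   /* stands for kmemleak_lock */

  static void context_a(void)
  {
          raw_spin_lock(&a0);                             /* depth */
          raw_spin_lock(&b);                              /* B cuts in */
          raw_spin_lock_nested(&a1,
                               SINGLE_DEPTH_NESTING);     /* depth + 1 */
          /* ... */
          raw_spin_unlock(&a1);
          raw_spin_unlock(&b);
          raw_spin_unlock(&a0);
  }

  static void context_b(void)
  {
          raw_spin_lock(&a1);                             /* depth */
          raw_spin_lock(&b);
          /* ... */
          raw_spin_unlock(&b);
          raw_spin_unlock(&a1);
  }

If context_a() already holds b and waits for a1 while context_b() already holds a1 and waits for b, neither can make progress: a classic ABBA deadlock between b and the A1 instance, which is exactly the circle in the report below.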
And in the kmemleak case, neither of the two cases above is possible, because scan_mutex must be held first.
This is another issue. Let's focus on whether the order is okay for now.
I think the author of kmemleak intended lockdep to treat object->lock and other_object->lock as different classes, using raw_spin_lock_nested().
Yes. The author meant to assign a different class according to its depth using a Lockdep API. Strictly speaking, those are the same class anyway, but we assign a different class to each depth to avoid Lockdep splats *IF* the user guarantees the nesting lock usage is safe, IOW, guarantees each lock instance at a different depth is distinct.
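As a concrete sketch of that subclass trick, simplified rather than the actual mm/kmemleak.c code (scan_nested() is hypothetical, and kmemleak_object is reduced to the one field the example needs):

  #include <linux/spinlock.h>
  #include <linux/lockdep.h>

  static DEFINE_RAW_SPINLOCK(kmemleak_lock);

  struct kmemleak_object {
          raw_spinlock_t lock;
  };

  static void scan_nested(struct kmemleak_object *object,
                          struct kmemleak_object *other)
  {
          unsigned long flags;

          raw_spin_lock_irqsave(&object->lock, flags);    /* subclass 0 */
          raw_spin_lock(&kmemleak_lock);
          /*
           * The same class as object->lock, but subclass 1, so Lockdep
           * stays silent *provided* the caller guarantees
           * object != other.
           */
          raw_spin_lock_nested(&other->lock, SINGLE_DEPTH_NESTING);
          /* ... scan ... */
          raw_spin_unlock(&other->lock);
          raw_spin_unlock(&kmemleak_lock);
          raw_spin_unlock_irqrestore(&object->lock, flags);
  }

Note this is exactly the order from the mm/kmemleak.c comment quoted above: the subclasses keep Lockdep quiet, so the open question is whether the usage itself is safe.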
I was fundamentally asking you... so... is the nesting lock usage safe for real? I hope you distinguish between the safe case and the risky case when *_lock_nested() is involved. Thoughts?
Thanks,
Byungchul
Am I missing something?
Thanks.
   lock A1 with depth
   lock B
   lock A2 with depth + 1
   lock A3 with depth + 2
   (and so on)
   ..
   unlock A3
   unlock A2
   unlock B
   unlock A1
It's a deadlock. That's why Dept reports this case as a problem. Or am I missing something?
Thanks,
Byungchul
context A's detail
context A
    [S] __raw_spin_lock_irqsave(&object->lock:0)
    [W] __raw_spin_lock_irqsave(kmemleak_lock:0)
    [E] spin_unlock(&object->lock:0)
[S] __raw_spin_lock_irqsave(&object->lock:0):
[<ffffffc00810302c>] scan_gray_list+0x84/0x13c
stacktrace:
      dept_ecxt_enter+0x88/0xf4
      _raw_spin_lock_irqsave+0xf0/0x1c4
      scan_gray_list+0x84/0x13c
      kmemleak_scan+0x2d8/0x54c
      kmemleak_scan_thread+0xac/0xd4
      kthread+0xd4/0xe4
      ret_from_fork+0x10/0x20
[W] __raw_spin_lock_irqsave(kmemleak_lock:0):
[<ffffffc008102ebc>] scan_block+0x3c/0x128
stacktrace:
      __dept_wait+0x8c/0xa4
      dept_wait+0x6c/0x88
      _raw_spin_lock_irqsave+0xb8/0x1c4
      scan_block+0x3c/0x128
      scan_gray_list+0xc4/0x13c
      kmemleak_scan+0x2d8/0x54c
      kmemleak_scan_thread+0xac/0xd4
      kthread+0xd4/0xe4
      ret_from_fork+0x10/0x20
[E] spin_unlock(&object->lock:0):
[<ffffffc008102ee0>] scan_block+0x60/0x128
context B's detail
context B
    [S] __raw_spin_lock_irqsave(kmemleak_lock:0)
    [W] _raw_spin_lock_nested(&object->lock:0)
    [E] spin_unlock(kmemleak_lock:0)
[S] __raw_spin_lock_irqsave(kmemleak_lock:0):
[<ffffffc008102ebc>] scan_block+0x3c/0x128
stacktrace:
      dept_ecxt_enter+0x88/0xf4
      _raw_spin_lock_irqsave+0xf0/0x1c4
      scan_block+0x3c/0x128
      kmemleak_scan+0x19c/0x54c
      kmemleak_scan_thread+0xac/0xd4
      kthread+0xd4/0xe4
      ret_from_fork+0x10/0x20
[W] _raw_spin_lock_nested(&object->lock:0):
[<ffffffc008102f34>] scan_block+0xb4/0x128
stacktrace:
      dept_wait+0x74/0x88
      _raw_spin_lock_nested+0xa8/0x1b0
      scan_block+0xb4/0x128
      kmemleak_scan+0x19c/0x54c
      kmemleak_scan_thread+0xac/0xd4
      kthread+0xd4/0xe4
      ret_from_fork+0x10/0x20

[E] spin_unlock(kmemleak_lock:0):
[<ffffffc008102ee0>] scan_block+0x60/0x128
stacktrace:
      dept_event+0x7c/0xfc
      _raw_spin_unlock_irqrestore+0x8c/0x120
      scan_block+0x60/0x128
      kmemleak_scan+0x19c/0x54c
      kmemleak_scan_thread+0xac/0xd4
      kthread+0xd4/0xe4
      ret_from_fork+0x10/0x20
information that might be helpful
CPU: 1 PID: 38 Comm: kmemleak Tainted: G        W         5.17.0-rc1+ #1
Hardware name: linux,dummy-virt (DT)
Call trace:
 dump_backtrace.part.0+0x9c/0xc4
 show_stack+0x14/0x28
 dump_stack_lvl+0x9c/0xcc
 dump_stack+0x14/0x2c
 print_circle+0x2d4/0x438
 cb_check_dl+0x6c/0x70
 bfs+0xc0/0x168
 add_dep+0x88/0x11c
 add_wait+0x2d0/0x2dc
 __dept_wait+0x8c/0xa4
 dept_wait+0x6c/0x88
 _raw_spin_lock_irqsave+0xb8/0x1c4
 scan_block+0x3c/0x128
 scan_gray_list+0xc4/0x13c
 kmemleak_scan+0x2d8/0x54c
 kmemleak_scan_thread+0xac/0xd4
 kthread+0xd4/0xe4
 ret_from_fork+0x10/0x20
===================================================
DEPT: Circular dependency has been detected.
5.17.0-rc1+ #1 Tainted: G        W
summary
*** AA DEADLOCK ***
context A
    [S] __raw_spin_lock_irqsave(&object->lock:0)
    [W] _raw_spin_lock_nested(&object->lock:0)
    [E] spin_unlock(&object->lock:0)
[S]: start of the event context
[W]: the wait blocked
[E]: the event not reachable
context A's detail
context A
    [S] __raw_spin_lock_irqsave(&object->lock:0)
    [W] _raw_spin_lock_nested(&object->lock:0)
    [E] spin_unlock(&object->lock:0)
[S] __raw_spin_lock_irqsave(&object->lock:0):
[<ffffffc00810302c>] scan_gray_list+0x84/0x13c
stacktrace:
      dept_ecxt_enter+0x88/0xf4
      _raw_spin_lock_irqsave+0xf0/0x1c4
      scan_gray_list+0x84/0x13c
      kmemleak_scan+0x2d8/0x54c
      kmemleak_scan_thread+0xac/0xd4
      kthread+0xd4/0xe4
      ret_from_fork+0x10/0x20
[E] spin_unlock(&object->lock:0):
[<ffffffc008102ee0>] scan_block+0x60/0x128
information that might be helpful
CPU: 1 PID: 38 Comm: kmemleak Tainted: G        W         5.17.0-rc1+ #1
Hardware name: linux,dummy-virt (DT)
Call trace:
 dump_backtrace.part.0+0x9c/0xc4
 show_stack+0x14/0x28
 dump_stack_lvl+0x9c/0xcc
 dump_stack+0x14/0x2c
 print_circle+0x2d4/0x438
 cb_check_dl+0x44/0x70
 bfs+0x60/0x168
 add_dep+0x88/0x11c
 add_wait+0x2d0/0x2dc
 __dept_wait+0x8c/0xa4
 dept_wait+0x6c/0x88
 _raw_spin_lock_nested+0xa8/0x1b0
 scan_block+0xb4/0x128
 scan_gray_list+0xc4/0x13c
 kmemleak_scan+0x2d8/0x54c
 kmemleak_scan_thread+0xac/0xd4
 kthread+0xd4/0xe4
 ret_from_fork+0x10/0x20
[...]
--
Thank you, You are awesome!
Hyeonggon :-)