On Wed, Nov 23, 2016 at 02:00:46PM +0100, Peter Zijlstra wrote:
On Wed, Nov 23, 2016 at 12:25:22PM +0100, Nicolai Hähnle wrote:
From: Nicolai Hähnle Nicolai.Haehnle@amd.com
Fix a race condition involving 4 threads and 2 ww_mutexes as indicated in the following example. Acquire context stamps are ordered like the thread numbers, i.e. thread #1 should back off when it encounters a mutex locked by thread #0 etc.
Thread #0 Thread #1 Thread #2 Thread #3
lock(ww) success lock(ww') success lock(ww) lock(ww) . . . unlock(ww) part 1
lock(ww) . . . success . . . . . unlock(ww) part 2 . back off lock(ww') . . . (stuck) (stuck)
Here, unlock(ww) part 1 is the part that sets lock->base.count to 1 (without being protected by lock->base.wait_lock), meaning that thread #0 can acquire ww in the fast path or, much more likely, the medium path in mutex_optimistic_spin. Since lock->base.count == 0, thread #0 then won't wake up any of the waiters in ww_mutex_set_context_fastpath.
Then, unlock(ww) part 2 wakes up _only_the_first_ waiter of ww. This is thread #2, since waiters are added at the tail. Thread #2 wakes up and backs off since it sees ww owned by a context with a lower stamp.
Meanwhile, thread #1 is never woken up, and so it won't back off its lock on ww'. So thread #0 gets stuck waiting for ww' to be released.
This patch fixes the deadlock by waking up all waiters in the slow path of ww_mutex_unlock.
We have an internal test case for amdgpu which continuously submits command streams from tens of threads, where all command streams reference hundreds of GPU buffer objects with a lot of overlap in the buffer lists between command streams. This test reliably caused a deadlock, and while I haven't completely confirmed that it is exactly the scenario outlined above, this patch does fix the test case.
v2:
- use wake_q_add
- add additional explanations
Cc: Peter Zijlstra peterz@infradead.org Cc: Ingo Molnar mingo@redhat.com Cc: Chris Wilson chris@chris-wilson.co.uk Cc: Maarten Lankhorst maarten.lankhorst@canonical.com Cc: dri-devel@lists.freedesktop.org Cc: stable@vger.kernel.org Reviewed-by: Christian König christian.koenig@amd.com (v1) Signed-off-by: Nicolai Hähnle nicolai.haehnle@amd.com
Completely and utterly fails to apply; I think this patch is based on code prior to the mutex rewrite.
Please rebase on tip/locking/core.
Also, is this a regression, or has this been a 'feature' of the ww_mutex code from early on?
Sorry forgot to mention that, but I checked. Seems to have been broken since day 1, at least looking at the original code the wake-single-waiter stuff is as old as the mutex code added in 2006. -Daniel