Quoting Chris Wilson (2020-07-15 13:21:43)
Quoting Daniel Vetter (2020-07-15 13:10:22)
On Wed, Jul 15, 2020 at 11:49:05AM +0100, Chris Wilson wrote:
When waiting with a callback on the stack, we must remove the callback upon wait completion. Since this will be notified by the fence signal callback, the removal often contends with the fence->lock being held by the signaler. We can look at the list entry to see if the callback was already signaled before we take the contended lock.
Signed-off-by: Chris Wilson chris@chris-wilson.co.uk
drivers/dma-buf/dma-fence.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c index 8d5bdfce638e..b910d7bc0854 100644 --- a/drivers/dma-buf/dma-fence.c +++ b/drivers/dma-buf/dma-fence.c @@ -420,6 +420,9 @@ dma_fence_remove_callback(struct dma_fence *fence, struct dma_fence_cb *cb) unsigned long flags; bool ret;
if (list_empty(&cb->node))
I was about to say "but the races" but then noticed that Paul fixed list_empty to use READ_ONCE like 5 years ago :-)
I'm always going "when exactly do we need list_empty_careful()"?
We can rule out a concurrent dma_fence_add_callback() for the same dma_fence_cb, as that is a lost cause. So we only have to worry about the concurrent list_del_init() from dma_fence_signal_locked(). So it's the timing of list_del_init(): WRITE_ONCE(list->next, list) vs READ_ONCE(list->next) == list and we don't need to care about the trailing instructions in list_del_init()...
Wait that trailing instruction is actually important here if the dma_fence_cb is on the stack, or other imminent free.
Ok, this does need to be list_empty_careful!
There's a further problem in that we call INIT_LIST_HEAD on the dma_fence_cb before the signal callback. So even if list_empty_careful() confirms the dma_fence_cb to be completely decoupled, the containing struct may still be inuse. -Chris