Quoting Gustavo Padovan (2017-07-27 20:03:53)
From: Gustavo Padovan <gustavo.padovan@collabora.com>
If userspace already dropped its own reference by closing the sw_sync fence fd we might end up in a deadlock where dma_fence_is_signaled_locked() will trigger the release of the fence and thus try to hold the lock to remove the fence from the list.
So the issue here is that the call to dma_fence_is_signaled_locked() is triggering the unreference?
We need to grab a reference to the fence before calling into this chain if we want to avoid this issue.
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.com>

 drivers/dma-buf/sw_sync.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/drivers/dma-buf/sw_sync.c b/drivers/dma-buf/sw_sync.c
index af1bc84..8291434 100644
--- a/drivers/dma-buf/sw_sync.c
+++ b/drivers/dma-buf/sw_sync.c
@@ -144,11 +144,16 @@ static void sync_timeline_signal(struct sync_timeline *obj, unsigned int inc)
 	obj->value += inc;
 	list_for_each_entry_safe(pt, next, &obj->pt_list, link) {
-		if (!dma_fence_is_signaled_locked(&pt->base))
+		dma_fence_get(&pt->base);
This would need to be dma_fence_get_rcu() to avoid grabbing the fence when its refcount has hit 0.
+		if (!dma_fence_is_signaled_locked(&pt->base)) {
+			dma_fence_put(&pt->base);
 			break;
+		}
 		list_del_init(&pt->link);
 		rb_erase(&pt->node, &obj->pt_tree);
But if I understand correctly, we just need to unlink first, then signal.
	list_for_each_entry_safe() {
		if (!timeline_fence_signaled(&pt->base))
			break;

		list_del_init(&pt->link);
		rb_erase(&pt->node, &obj->pt_tree);

		dma_fence_signal_locked(&pt->base);
	}
The challenge is in writing the comment to explain the open-coding.
-Chris
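
For readers following the ordering argument, here is a minimal userspace sketch of why unlinking before signaling avoids the self-deadlock. It is a toy model, not kernel code: every toy_* name is hypothetical, the lock is a plain flag standing in for a non-recursive spinlock, the seqno comparison ignores wraparound, and it assumes that signaling drops the last reference and that the final put must take the timeline lock whenever the fence is still on the list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Toy userspace model: none of these names are kernel APIs. */
struct toy_timeline {
	struct toy_fence *head;   /* fences ordered by seqno */
	unsigned int value;
	bool locked;              /* stands in for a non-recursive spinlock */
	int would_deadlock;       /* releases that needed the held lock */
};

struct toy_fence {
	struct toy_fence *next;
	struct toy_timeline *tl;
	unsigned int seqno;
	int refcount;
	bool linked;              /* still on the timeline list? */
};

static void toy_unlink(struct toy_timeline *tl, struct toy_fence *f)
{
	struct toy_fence **pp = &tl->head;
	while (*pp && *pp != f)
		pp = &(*pp)->next;
	if (*pp)
		*pp = f->next;
	f->linked = false;
}

/* Drop a reference; the final put must unlink the fence under the lock. */
static void toy_fence_put(struct toy_fence *f)
{
	struct toy_timeline *tl = f->tl;
	if (--f->refcount > 0)
		return;
	if (f->linked) {
		if (tl->locked) {
			tl->would_deadlock++; /* a real spinlock would spin forever */
			return;               /* fence stays linked; caller cleans up */
		}
		tl->locked = true;
		toy_unlink(tl, f);
		tl->locked = false;
	}
	free(f);
}

/* Assumption: signaling drops the last reference (say, a waiter's). */
static void toy_fence_signal_locked(struct toy_fence *f)
{
	assert(f->tl->locked);
	toy_fence_put(f);
}

static void toy_timeline_signal(struct toy_timeline *tl, unsigned int inc,
				bool unlink_first)
{
	tl->locked = true;
	tl->value += inc;
	while (tl->head && tl->head->seqno <= tl->value) {
		struct toy_fence *f = tl->head;
		if (unlink_first) {
			tl->head = f->next;
			f->linked = false;
		}
		toy_fence_signal_locked(f); /* frees f if it was unlinked first */
		if (tl->would_deadlock)
			break;
	}
	tl->locked = false;
}

/* Three fences (seqno 1..3), advance by 2; returns lock re-entry count. */
int toy_run(bool unlink_first)
{
	struct toy_timeline tl = { 0 };
	struct toy_fence **tail = &tl.head;
	for (unsigned int i = 1; i <= 3; i++) {
		struct toy_fence *f = calloc(1, sizeof(*f));
		assert(f);
		f->tl = &tl;
		f->seqno = i;
		f->refcount = 1;
		f->linked = true;
		*tail = f;
		tail = &f->next;
	}
	toy_timeline_signal(&tl, 2, unlink_first);
	while (tl.head) {             /* free whatever is still linked */
		struct toy_fence *f = tl.head;
		tl.head = f->next;
		free(f);
	}
	return tl.would_deadlock;
}
```

In this model toy_run(true) takes the unlink-then-signal order and returns 0, since the release path never finds the fence on the list and so never needs the lock. toy_run(false) signals while the fence is still linked and returns 1, which is the point where a real spinlock would deadlock against its own holder.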