From: Gustavo Padovan gustavo.padovan@collabora.com
If userspace already dropped its own reference by closing the sw_sync fence fd, we might end up in a deadlock where dma_fence_is_signaled_locked() will trigger the release of the fence and thus try to hold the lock to remove the fence from the list.
We need to grab a reference to the fence before calling into this chain if we want to avoid this issue.
Cc: Chris Wilson chris@chris-wilson.co.uk
Signed-off-by: Gustavo Padovan gustavo.padovan@collabora.com
---
 drivers/dma-buf/sw_sync.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/dma-buf/sw_sync.c b/drivers/dma-buf/sw_sync.c
index af1bc84..8291434 100644
--- a/drivers/dma-buf/sw_sync.c
+++ b/drivers/dma-buf/sw_sync.c
@@ -144,11 +144,16 @@ static void sync_timeline_signal(struct sync_timeline *obj, unsigned int inc)
 	obj->value += inc;
 
 	list_for_each_entry_safe(pt, next, &obj->pt_list, link) {
-		if (!dma_fence_is_signaled_locked(&pt->base))
+		dma_fence_get(&pt->base);
+		if (!dma_fence_is_signaled_locked(&pt->base)) {
+			dma_fence_put(&pt->base);
 			break;
+		}
 
 		list_del_init(&pt->link);
 		rb_erase(&pt->node, &obj->pt_tree);
+
+		dma_fence_put(&pt->base);
 	}
 
 	spin_unlock_irq(&obj->lock);
Quoting Gustavo Padovan (2017-07-27 20:03:53)
> From: Gustavo Padovan gustavo.padovan@collabora.com
>
> If userspace already dropped its own reference by closing the sw_sync fence fd, we might end up in a deadlock where dma_fence_is_signaled_locked() will trigger the release of the fence and thus try to hold the lock to remove the fence from the list.

So the issue here is that the call to dma_fence_is_signaled_locked() is triggering the unreference?

> We need to grab a reference to the fence before calling into this chain if we want to avoid this issue.
>
> Cc: Chris Wilson chris@chris-wilson.co.uk
> Signed-off-by: Gustavo Padovan gustavo.padovan@collabora.com
> ---
>  drivers/dma-buf/sw_sync.c | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/dma-buf/sw_sync.c b/drivers/dma-buf/sw_sync.c
> index af1bc84..8291434 100644
> --- a/drivers/dma-buf/sw_sync.c
> +++ b/drivers/dma-buf/sw_sync.c
> @@ -144,11 +144,16 @@ static void sync_timeline_signal(struct sync_timeline *obj, unsigned int inc)
>  	obj->value += inc;
>
>  	list_for_each_entry_safe(pt, next, &obj->pt_list, link) {
> -		if (!dma_fence_is_signaled_locked(&pt->base))
> +		dma_fence_get(&pt->base);

This would need to be dma_fence_get_rcu() to avoid grabbing the fence when its refcount has hit 0.

> +		if (!dma_fence_is_signaled_locked(&pt->base)) {
> +			dma_fence_put(&pt->base);
>  			break;
> +		}
>
>  		list_del_init(&pt->link);
>  		rb_erase(&pt->node, &obj->pt_tree);

But if I understand correctly, we just need to unlink first, then signal.

	list_for_each_entry_safe() {
		if (!timeline_fence_signaled(&pt->base))
			break;

		list_del_init(&pt->link);
		rb_erase(&pt->node, &obj->pt_tree);

		dma_fence_signal_locked(&pt->base);
	}

The challenge is in writing the comment to explain the open-coding.
-Chris
2017-07-27 Chris Wilson chris@chris-wilson.co.uk:

> Quoting Gustavo Padovan (2017-07-27 20:03:53)
> > From: Gustavo Padovan gustavo.padovan@collabora.com
> >
> > If userspace already dropped its own reference by closing the sw_sync fence fd, we might end up in a deadlock where dma_fence_is_signaled_locked() will trigger the release of the fence and thus try to hold the lock to remove the fence from the list.
>
> So the issue here is that the call to dma_fence_is_signaled_locked() is triggering the unreference?

Exactly. I'll say that explicitly in the commit message.

> > We need to grab a reference to the fence before calling into this chain if we want to avoid this issue.
> >
> > Cc: Chris Wilson chris@chris-wilson.co.uk
> > Signed-off-by: Gustavo Padovan gustavo.padovan@collabora.com
> > ---
> >  drivers/dma-buf/sw_sync.c | 7 ++++++-
> >  1 file changed, 6 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/dma-buf/sw_sync.c b/drivers/dma-buf/sw_sync.c
> > index af1bc84..8291434 100644
> > --- a/drivers/dma-buf/sw_sync.c
> > +++ b/drivers/dma-buf/sw_sync.c
> > @@ -144,11 +144,16 @@ static void sync_timeline_signal(struct sync_timeline *obj, unsigned int inc)
> >  	obj->value += inc;
> >
> >  	list_for_each_entry_safe(pt, next, &obj->pt_list, link) {
> > -		if (!dma_fence_is_signaled_locked(&pt->base))
> > +		dma_fence_get(&pt->base);
>
> This would need to be dma_fence_get_rcu() to avoid grabbing the fence when its refcount has hit 0.
>
> > +		if (!dma_fence_is_signaled_locked(&pt->base)) {
> > +			dma_fence_put(&pt->base);
> >  			break;
> > +		}
> >
> >  		list_del_init(&pt->link);
> >  		rb_erase(&pt->node, &obj->pt_tree);
>
> But if I understand correctly, we just need to unlink first, then signal.
>
> 	list_for_each_entry_safe() {
> 		if (!timeline_fence_signaled(&pt->base))
> 			break;
>
> 		list_del_init(&pt->link);
> 		rb_erase(&pt->node, &obj->pt_tree);
>
> 		dma_fence_signal_locked(&pt->base);
> 	}
>
> The challenge is in writing the comment to explain the open-coding.

That is cleaner and doesn't need the get/put dance. I'll come up with a comment to explain it.

Gustavo
Quoting Gustavo Padovan (2017-07-28 02:57:25)
> 2017-07-27 Chris Wilson chris@chris-wilson.co.uk:
>
> > Quoting Gustavo Padovan (2017-07-27 20:03:53)
> > > From: Gustavo Padovan gustavo.padovan@collabora.com
> > >
> > > If userspace already dropped its own reference by closing the sw_sync fence fd, we might end up in a deadlock where dma_fence_is_signaled_locked() will trigger the release of the fence and thus try to hold the lock to remove the fence from the list.
> >
> > So the issue here is that the call to dma_fence_is_signaled_locked() is triggering the unreference?
>
> Exactly. I'll say that explicitly in the commit message.

:) It was more of a rhetorical question making sure that I understood correctly.

> > But if I understand correctly, we just need to unlink first, then signal.
> >
> > 	list_for_each_entry_safe() {
> > 		if (!timeline_fence_signaled(&pt->base))
> > 			break;
> >
> > 		list_del_init(&pt->link);
> > 		rb_erase(&pt->node, &obj->pt_tree);
> > 		dma_fence_signal_locked(&pt->base);
> > 	}
> >
> > The challenge is in writing the comment to explain the open-coding.
>
> That is cleaner and doesn't need the get/put dance. I'll come up with a comment to explain it.

	...
	/*
	 * A signal callback may release the last reference to this fence,
	 * causing it to be freed. That operation has to be last to avoid
	 * a use after free inside this loop, and must be after we remove
	 * the fence from the timeline in order to prevent deadlocking on
	 * timeline->lock inside timeline_fence_release().
	 */
	dma_fence_signal_locked(&pt->base);
-Chris
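
For anyone following along, here is a rough userspace model of the ordering argument above. It is only a sketch: every toy_* name is made up for illustration and is not a kernel API, the spinlock is reduced to a boolean, and the "signal callback" is assumed to simply drop the last reference. It counts how often the release path would have needed the lock that the signalling loop already holds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Toy userspace model; all toy_* names are hypothetical, not kernel APIs. */
struct toy_fence;

struct toy_timeline {
	unsigned int value;
	bool locked;            /* models a non-recursive spinlock being held */
	struct toy_fence *head; /* pending fences, ordered by seqno */
	int would_deadlock;     /* releases that needed the already-held lock */
	int freed;              /* fences released in total */
};

struct toy_fence {
	unsigned int seqno;
	int refcount;
	bool linked;            /* still on the timeline list? */
	struct toy_fence *next;
	struct toy_timeline *tl;
};

static void toy_fence_add(struct toy_timeline *tl, unsigned int seqno)
{
	struct toy_fence **pos = &tl->head;
	struct toy_fence *f = calloc(1, sizeof(*f));

	if (!f)
		abort();
	f->seqno = seqno;
	f->refcount = 1; /* only the waiter's ref: userspace closed its fd */
	f->linked = true;
	f->tl = tl;
	while (*pos)
		pos = &(*pos)->next;
	*pos = f;
}

static void toy_fence_put(struct toy_fence *f)
{
	struct toy_timeline *tl = f->tl;

	assert(f->refcount > 0);
	if (--f->refcount > 0)
		return;
	/*
	 * Release path: a fence that is still linked has to take the lock
	 * to unlink itself. A real spinlock would hang here if the lock is
	 * already held; the toy only counts the event and carries on.
	 */
	if (f->linked && tl->locked)
		tl->would_deadlock++;
	tl->freed++;
	free(f);
}

/* Assumed callback behaviour: signalling drops the last reference. */
static void toy_fence_signal_locked(struct toy_fence *f)
{
	toy_fence_put(f);
}

static void toy_timeline_signal(struct toy_timeline *tl, unsigned int inc,
				bool unlink_first)
{
	struct toy_fence *f, *next;

	tl->locked = true;
	tl->value += inc;
	for (f = tl->head; f && f->seqno <= tl->value; f = next) {
		next = f->next;
		if (unlink_first) {
			tl->head = next;
			f->linked = false;
			toy_fence_signal_locked(f); /* may free f: must be last */
		} else {
			toy_fence_signal_locked(f); /* release runs while linked */
			tl->head = next;            /* f is gone, do not touch it */
		}
	}
	tl->locked = false;
}

/* Returns how many releases would have deadlocked for this ordering. */
static int toy_run(bool unlink_first)
{
	struct toy_timeline tl = { 0 };

	toy_fence_add(&tl, 1);
	toy_fence_add(&tl, 2);
	toy_fence_add(&tl, 5); /* stays pending after the signal below */
	toy_timeline_signal(&tl, 2, unlink_first);

	toy_fence_put(tl.head); /* lock not held: the normal release path */
	tl.head = NULL;
	return tl.would_deadlock;
}
```

Calling toy_run(true) from a main() models the unlink-then-signal order and toy_run(false) models signalling while the fence is still on the list; only the latter makes the release path want the lock that is already held.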
On Thu, Jul 27, 2017 at 04:03:53PM -0300, Gustavo Padovan wrote:
> From: Gustavo Padovan gustavo.padovan@collabora.com
>
> If userspace already dropped its own reference by closing the sw_sync fence fd, we might end up in a deadlock where dma_fence_is_signaled_locked() will trigger the release of the fence and thus try to hold the lock to remove the fence from the list.
>
> We need to grab a reference to the fence before calling into this chain if we want to avoid this issue.
>
> Cc: Chris Wilson chris@chris-wilson.co.uk
> Signed-off-by: Gustavo Padovan gustavo.padovan@collabora.com

Do we have a testcase for this?
-Daniel

> ---
>  drivers/dma-buf/sw_sync.c | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/dma-buf/sw_sync.c b/drivers/dma-buf/sw_sync.c
> index af1bc84..8291434 100644
> --- a/drivers/dma-buf/sw_sync.c
> +++ b/drivers/dma-buf/sw_sync.c
> @@ -144,11 +144,16 @@ static void sync_timeline_signal(struct sync_timeline *obj, unsigned int inc)
>  	obj->value += inc;
>
>  	list_for_each_entry_safe(pt, next, &obj->pt_list, link) {
> -		if (!dma_fence_is_signaled_locked(&pt->base))
> +		dma_fence_get(&pt->base);
> +		if (!dma_fence_is_signaled_locked(&pt->base)) {
> +			dma_fence_put(&pt->base);
>  			break;
> +		}
>
>  		list_del_init(&pt->link);
>  		rb_erase(&pt->node, &obj->pt_tree);
> +
> +		dma_fence_put(&pt->base);
>  	}
>
>  	spin_unlock_irq(&obj->lock);
> --
> 2.9.4
dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel