When a timeout of zero is specified, the caller is only interested in the fence status.
In the current implementation, dma_fence_default_wait will always call schedule_timeout() at least once for an unsignaled fence. This adds a significant overhead to a fence status query.
Avoid this overhead by returning early if a zero timeout is specified.
Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
---
This heavily affects the performance of the Source2 engine running on radv.
This patch improves dota2 (radv) performance on an i7-6700k + RX480 system from 72 fps to 81 fps.
 drivers/dma-buf/dma-fence.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
index 0918d3f..348e9e2 100644
--- a/drivers/dma-buf/dma-fence.c
+++ b/drivers/dma-buf/dma-fence.c
@@ -380,6 +380,9 @@ dma_fence_default_wait(struct dma_fence *fence, bool intr, signed long timeout)
 	if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags))
 		return ret;
 
+	if (!timeout)
+		return 0;
+
 	spin_lock_irqsave(fence->lock, flags);
 
 	if (intr && signal_pending(current)) {
CC a few extra lists I missed.
Regards, Andres
NAK, I'm wondering how often I have to reject that change. We should probably add a comment here.
Even with a zero timeout we still need to enable signaling, otherwise some fences will never signal if userspace just polls on them.
If a caller is only interested in the fence status without enabling the signaling it should call dma_fence_is_signaled() instead.
Regards, Christian.
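The objection above can be illustrated with a small userspace model. This is only a sketch, not kernel code: every name in it (`model_fence`, `model_hw_complete`, `model_poll_early_return`, `model_poll_enable_first`) is hypothetical, and it assumes a driver whose fences are marked signaled only after signaling has been enabled, which is the situation the reply describes. The return values 1 and 0 are the model's own convention for "signaled" and "not yet signaled".

```c
#include <stdbool.h>

/* Hypothetical userspace model of a fence; not the kernel's struct dma_fence. */
struct model_fence {
	bool signaled;          /* stands in for DMA_FENCE_FLAG_SIGNALED_BIT */
	bool signaling_enabled; /* stands in for the "enable signaling" step */
};

/*
 * Models the hardware finishing its work. Assumption: completion is only
 * reported to the fence if signaling was enabled beforehand.
 */
static void model_hw_complete(struct model_fence *f)
{
	if (f->signaling_enabled)
		f->signaled = true;
}

/* Pure status query: reads the flag and changes nothing. */
static bool model_is_signaled(const struct model_fence *f)
{
	return f->signaled;
}

/* Zero-timeout wait with the early return from the rejected patch. */
static long model_poll_early_return(struct model_fence *f)
{
	if (f->signaled)
		return 1;
	return 0; /* returns before signaling is ever enabled */
}

/* Zero-timeout wait that enables signaling before reporting "not yet". */
static long model_poll_enable_first(struct model_fence *f)
{
	if (f->signaled)
		return 1;
	f->signaling_enabled = true;
	return 0;
}
```

In this model, a caller that only ever uses `model_poll_early_return()` keeps getting 0 even after `model_hw_complete()` has run, because nothing armed the signaling path. A caller that uses `model_poll_enable_first()` sees 1 on the poll that follows completion.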
On 26 April 2017 at 17:20, Christian König <deathsimple@vodafone.de> wrote:
> NAK, I'm wondering how often I have to reject that change. We should probably add a comment here.
> Even with a zero timeout we still need to enable signaling, otherwise some fence will never signal if userspace just polls on them.
> If a caller is only interested in the fence status without enabling the signaling it should call dma_fence_is_signaled() instead.
Can we not move the return 0 (with the spin unlock) down to after we enable signalling, but before we enter the schedule_timeout(1)?
Dave.
On 26.04.2017 at 11:59, Dave Airlie wrote:
> Can we not move the return 0 (with the spin unlock) down to after we enable signalling, but before we enter the schedule_timeout(1)?
Yes, that would be an option.
Christian.
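One possible reading of Dave's suggestion, sketched as a userspace model rather than as a kernel patch. The names (`sketch_fence`, `sketch_default_wait`, `sleep_calls`) are hypothetical, the `locked` flag merely stands in for `fence->lock`, and each modeled sleep is assumed to consume one tick of the timeout. Signal handling and the case where the fence signals while signaling is being enabled are left out.

```c
#include <stdbool.h>

/* Hypothetical userspace model; not the kernel's struct dma_fence. */
struct sketch_fence {
	bool signaled;
	bool signaling_enabled;
	bool locked;      /* stand-in for holding fence->lock */
	int  sleep_calls; /* counts modeled schedule_timeout() calls */
};

/*
 * Model of the proposed ordering: fast-path check, take the lock, enable
 * signaling, and only then bail out for a zero timeout, before any sleep.
 */
static long sketch_default_wait(struct sketch_fence *f, long timeout)
{
	long ret = timeout ? timeout : 1;

	if (f->signaled)
		return ret;

	f->locked = true;            /* spin_lock_irqsave() in the real code */
	f->signaling_enabled = true; /* done even when timeout == 0 */

	if (!timeout) {
		ret = 0;             /* status query: report "not signaled" */
		goto out;            /* leave through the common unlock path */
	}

	while (!f->signaled && ret > 0) {
		f->sleep_calls++;    /* models one schedule_timeout() call */
		ret--;               /* assumption: one tick per sleep */
	}
out:
	f->locked = false;           /* spin_unlock_irqrestore() */
	return ret;
}
```

With this ordering, a zero-timeout call on an unsignaled fence returns 0 with signaling enabled and no modeled sleep, while a non-zero timeout still goes through the sleep loop.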
On 2017-04-26 06:13 AM, Christian König wrote:
> On 26.04.2017 at 11:59, Dave Airlie wrote:
>> Can we not move the return 0 (with the spin unlock) down to after we enable signalling, but before we enter the schedule_timeout(1)?
> Yes, that would be an option.
I was actually arguing with Dave about this on IRC yesterday. Seems like I owe him a beer now.
-Andres
dri-devel@lists.freedesktop.org