On Wed, Aug 9, 2017 at 3:41 PM, Chris Wilson <chris@chris-wilson.co.uk> wrote:
Quoting Chris Wilson (2017-08-09 18:57:44)
So we are taking a snapshot here. It looks like this could have been done using a dma_fence_array + dma_fence_proxy for capturing the future fence.
A quick sketch of this idea looks like:
void drm_syncobj_replace_fence(struct drm_syncobj *syncobj,
			       struct dma_fence *fence)
{
-	struct dma_fence *old_fence;
+	unsigned long flags;

	if (fence)
		dma_fence_get(fence);

-	old_fence = xchg(&syncobj->fence, fence);
-	dma_fence_put(old_fence);
+	spin_lock_irqsave(&syncobj->lock, flags);
+	dma_fence_replace_proxy(&syncobj->fence, fence);
+	spin_unlock_irqrestore(&syncobj->lock, flags);
}
int
drm_syncobj_wait_ioctl(struct drm_device *dev, void *data,
		       struct drm_file *file_private)
{
	struct drm_syncobj_wait *args = data;
	struct dma_fence **fences;
	struct dma_fence_array *array;
	signed long timeout;
	unsigned int count;
	u32 *handles;
	int ret = 0;
	u32 i;

	BUILD_BUG_ON(sizeof(*fences) < sizeof(*handles));

	if (!drm_core_check_feature(dev, DRIVER_SYNCOBJ))
		return -ENODEV;

	if (args->flags & ~(DRM_SYNCOBJ_WAIT_FLAGS_WAIT_ALL |
			    DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT))
		return -EINVAL;

	count = args->count_handles;
	if (!count)
		return -EINVAL;

	/* Get the handles from userspace, packed into the tail of the
	 * fences array; each handle is consumed before its (larger)
	 * fence slot is written.
	 */
	fences = kmalloc_array(count,
			       sizeof(struct dma_fence *),
			       __GFP_NOWARN | GFP_KERNEL);
	if (!fences)
		return -ENOMEM;

	handles = (void *)fences + count * (sizeof(*fences) - sizeof(*handles));
	if (copy_from_user(handles,
			   u64_to_user_ptr(args->handles),
			   sizeof(u32) * count)) {
		ret = -EFAULT;
		goto err;
	}

	for (i = 0; i < count; i++) {
		struct drm_syncobj *s;

		ret = -ENOENT;
		s = drm_syncobj_find(file_private, handles[i]);
		if (s) {
			ret = 0;
			spin_lock_irq(&s->lock);
			if (!s->fence) {
				if (args->flags & DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT)
					s->fence = dma_fence_create_proxy();
				else
					ret = -EINVAL;
			}
			if (s->fence)
				fences[i] = dma_fence_get(s->fence);
			else if (!ret)
				ret = -ENOMEM; /* proxy allocation failed */
			spin_unlock_irq(&s->lock);
			drm_syncobj_put(s);
		}
		if (ret) {
			count = i;
			goto err_fences;
		}
	}

	array = dma_fence_array_create(count, fences, 0, 0,
				       !(args->flags &
					 DRM_SYNCOBJ_WAIT_FLAGS_WAIT_ALL));
	if (!array) {
		ret = -ENOMEM;
		goto err_fences;
	}

	timeout = drm_timeout_abs_to_jiffies(args->timeout_nsec);
	timeout = dma_fence_wait_timeout(&array->base, true, timeout);
	args->first_signaled = array->first_signaled;
	dma_fence_put(&array->base);

	return timeout < 0 ? timeout : 0;

err_fences:
	for (i = 0; i < count; i++)
		dma_fence_put(fences[i]);
err:
	kfree(fences);
	return ret;
}
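Neither dma_fence_create_proxy() nor dma_fence_replace_proxy() exists in the tree, so the sketch above assumes helpers roughly like the following. This is guesswork about what such primitives could look like, not an implementation: struct dma_fence_proxy, dma_fence_proxy_ops and dma_fence_proxy_signal_cb() are made-up names, the only grounded parts are the two entry points used above. The idea is that a proxy is an otherwise inert fence which the real fence signals once drm_syncobj_replace_fence() installs one.

#include <linux/dma-fence.h>
#include <linux/slab.h>

struct dma_fence_proxy {
	struct dma_fence base;
	spinlock_t lock;
	struct dma_fence_cb cb;
};

static const char *dma_fence_proxy_name(struct dma_fence *fence)
{
	return "dma_fence_proxy";
}

static bool dma_fence_proxy_enable_signaling(struct dma_fence *fence)
{
	return true;	/* nothing to kick; the real fence signals us */
}

static const struct dma_fence_ops dma_fence_proxy_ops = {
	.get_driver_name = dma_fence_proxy_name,
	.get_timeline_name = dma_fence_proxy_name,
	.enable_signaling = dma_fence_proxy_enable_signaling,
	.wait = dma_fence_default_wait,
};

struct dma_fence *dma_fence_create_proxy(void)
{
	struct dma_fence_proxy *p = kzalloc(sizeof(*p), GFP_KERNEL);

	if (!p)
		return NULL;

	spin_lock_init(&p->lock);
	/* one context per proxy keeps the sketch simple */
	dma_fence_init(&p->base, &dma_fence_proxy_ops, &p->lock,
		       dma_fence_context_alloc(1), 0);
	return &p->base;
}

static void dma_fence_proxy_signal_cb(struct dma_fence *real,
				      struct dma_fence_cb *cb)
{
	struct dma_fence_proxy *p = container_of(cb, struct dma_fence_proxy, cb);

	dma_fence_signal(&p->base);
	dma_fence_put(&p->base);
}

void dma_fence_replace_proxy(struct dma_fence **slot, struct dma_fence *fence)
{
	struct dma_fence *old = *slot;

	/* Caller (drm_syncobj_replace_fence) holds syncobj->lock and has
	 * already taken a reference on the new fence for the slot. */
	*slot = fence;

	if (old && old->ops == &dma_fence_proxy_ops && fence) {
		struct dma_fence_proxy *p =
			container_of(old, struct dma_fence_proxy, base);

		/* Hand the slot's reference on the proxy over to the
		 * callback; it is dropped once the real fence signals. */
		if (dma_fence_add_callback(fence, &p->cb,
					   dma_fence_proxy_signal_cb)) {
			/* Real fence already signalled. */
			dma_fence_signal(old);
			dma_fence_put(old);
		}
		return;
	}

	dma_fence_put(old);
}

The important property is that any waiter which grabbed a reference to the proxy still completes once the real fence arrives, even though the slot itself now points at the real fence.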
The key advantage is that this translates the ioctl into a dma-fence-array, which already has to deal with the mess, sharing the burden. (But it does require a trivial patch to dma-fence-array to record the first signaled fence.)
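That "trivial patch" might look roughly like the hunk below. The first_signaled field matches its use in the ioctl above, but the wiring is an assumption, and the context lines are quoted from memory of drivers/dma-buf/dma-fence-array.c, so treat them as approximate:

 static void dma_fence_array_cb_func(struct dma_fence *f,
 				    struct dma_fence_cb *cb)
 {
 	struct dma_fence_array_cb *array_cb =
 		container_of(cb, struct dma_fence_array_cb, cb);
 	struct dma_fence_array *array = array_cb->array;
 
+	/* The per-fence callbacks sit contiguously behind the array in
+	 * dma_fence_array_create(), so the index falls out of pointer
+	 * arithmetic; the first callback to fire claims the slot.
+	 * (first_signaled would be a new int field, initialised to -1
+	 * in dma_fence_array_create().)
+	 */
+	cmpxchg(&array->first_signaled, -1,
+		(int)(array_cb - (struct dma_fence_array_cb *)(&array[1])));
+
 	if (atomic_dec_and_test(&array->num_pending))
 		irq_work_queue(&array->work);
 	else
 		dma_fence_put(&array->base);
 }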
However, this installs the proxy into syncobj->fence, with the result that any concurrent wait also becomes a WAIT_FOR_SUBMIT. The behaviour of drm_syncobj is then quite inconsistent: sometimes it will wait for a future fence, sometimes it will report an error.
Yeah, that's not good. I thought about a variety of solutions to try and re-use more core dma_fence code. Ultimately I chose the current one because it takes a snapshot of the syncobjs and then, from that point forward, it's consistent with its snapshot. Nothing I was able to come up with based on core dma_fence wait routines does that.
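Purely for illustration (this is not the actual patch), the snapshot idea boils down to reading each syncobj's fence exactly once, under the same syncobj spinlock assumed in the sketch above, and never publishing anything back into syncobj->fence. drm_syncobj_snapshot_fence is a made-up name:

/* Hypothetical helper: snapshot whatever the syncobj currently holds.
 * A NULL return means "not yet submitted" and is handled locally by the
 * waiter (e.g. per WAIT_FOR_SUBMIT), without installing a proxy that
 * other waiters could observe.
 */
static struct dma_fence *
drm_syncobj_snapshot_fence(struct drm_syncobj *syncobj)
{
	struct dma_fence *fence;

	spin_lock_irq(&syncobj->lock);
	fence = dma_fence_get(syncobj->fence);	/* may be NULL */
	spin_unlock_irq(&syncobj->lock);

	return fence;
}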
--Jason