From: Gustavo Padovan gustavo.padovan@collabora.co.uk
Hi,
Currently the Linux Kernel only have an implicit fencing mechanism where the fence are attached directly to buffers and userspace is unaware of what is happening. On the other hand explicit fencing which is not supported yet by Linux but it expose fences to the userspace to handle fencing between producer/consumer explicitely.
For that we use the Android Sync Framework[1], a explicit fencing mechanism that help the userspace handles fences directly. It has the concept of sync_file (called sync_fence in Android) that expose the driver's fences to userspace via file descriptors. File descriptors are useful because we can pass them around between process.
The Sync Framework is currently in the staging tree and on the process to be de-staged[2].
With explicit fencing we have a global mechanism that optimizes the flow of buffers between consumers and producers, avoid a lot of waiting. So instead of waiting for a buffer to be processed by the GPU before sending it to DRM in an Atomic IOCTL we can get a sync_file fd from the GPU driver at the moment we submit the buffer processing. The compositor then passes these fds to DRM in a atomic commit request, that will not be displayed until the fences signal, i.e, the GPU finished processing the buffer and it is ready to display. In DRM the fences we wait on before displaying a buffer are called in-fences.
Vice-versa, we have out-fences, to sychronize the return of buffers to GPU (producer) to be processed again. When DRM receives an atomic request with a special flag set it generates one fence per-crtc and attach it to a per-crtc sync_file. It then returns the array of sync_file fds to userspace as an atomic_ioctl out arg. With the fences available userspace can forward these fences to the GPU, where it will wait the fence to signal before starting to process on buffer again.
Explicit fencing with Sync Framework allows buffer suballocation. Userspace get a large buffer and divides it into small ones and submit requests to process them, each subbuffer gets and sync_file fd and can be processed in parallel. This is not even possible with implicit fencing.
While these are out-fences in DRM (the consumer) they become in-fences once they get to the GPU (the producer).
DRM explicit fences are opt-in, as the default will still be implicit fencing. To enable explicit in-fences one just need to pass a sync_file fd in the FENCE_FD plane property. *In-fences are per-plane*, i.e., per framebuffer.
For out-fences, just enabling DRM_MODE_ATOMIC_OUT_FENCE flag is enough. *Out-fences are per-crtc*.
In-fences ---------
In the first discussions on #dri-devel on IRC we decided to hide the Sync Framework from DRM drivers to reduce complexity, so as soon we get the fd via FENCE_FD plane property we convert the sync_file fd to a struct fence. However a sync_file might contain more than one fence, so we created the fence_collection concept. struct fence_collection is a subclass of struct fence and stores a group of fences that needs to be waited together, in other words, all the fences in the sync_file.
Then we just use the already in place fence support to wait on those fences. Once the producer calls fence_signal() for all fences on wait we can proceed with the atomic commit and display the framebuffers. DRM drivers only needs to be converted to struct fence to make use of this feature.
Out-fences ----------
Passing the DRM_MODE_ATOMIC_OUT_FENCE flag to an atomic request enables out-fences. The kernel then creates a fence, attach it to a sync_file and install this file on a unused fd for each crtc. Userspace get the fence back as an array of per-crtc sync_file fds.
DRM core use the already in place drm_event infrastructure to help signal fences, we've added a fence pointer to struct drm_pending_event. If the atomic update received requested an PAGE_FLIP_EVENT we just use the same drm_pending_event and set our fence there, otherwise we just create an event with a NULL file_priv to set our fence. On vblank we just call fence_signal() to signal that the buffer related to this fence is *now* on the screen. Note that this is exactly the opposite behaviour from Android, where the fences are signaled when they are not on display anymore, so free to be reused.
No changes are required to DRM drivers to have out-fences support, apart from atomic support of course.
Kernel tree -----------
For those who want all patches on this RFC are in my tree. The tree includes all sync frameworks patches needed at the moment:
https://git.kernel.org/cgit/linux/kernel/git/padovan/linux.git/log/?h=fences
I also hacked some poor some fake fences support to modetest here:
https://git.collabora.com/cgit/user/padovan/libdrm.git/log/?h=atomic
v2 changes ----------
The changes from RFC v1 to RFC v2 are in the patches description.
TODO ----
* upstream Android Sync ABI changes * de-stage Android Sync Framework * Upstream sync selftests (Emilio Lopez) * create tests for DRM fences
Regards,
Gustavo ---
[1] https://source.android.com/devices/graphics/implement.html#vsync [2] https://git.kernel.org/cgit/linux/kernel/git/padovan/linux.git/log/?h=sync
Gustavo Padovan (8): dma-buf/fence: add fence_collection fences Documentation: add fence-collection to kernel DocBook dma-buf/sync_file: add sync_file_fences_get() drm/fence: allow fence waiting to be interrupted by userspace drm/fence: add in-fences support drm/fence: add fence to drm_pending_event drm/fence: add fence timeline to drm_crtc drm/fence: add out-fences support
Documentation/DocBook/device-drivers.tmpl | 2 + drivers/dma-buf/Makefile | 2 +- drivers/dma-buf/fence-collection.c | 159 ++++++++++++++++++++++++ drivers/dma-buf/fence.c | 2 +- drivers/dma-buf/sync_file.c | 60 +++++++++ drivers/gpu/drm/Kconfig | 1 + drivers/gpu/drm/drm_atomic.c | 196 ++++++++++++++++++++++++++++-- drivers/gpu/drm/drm_atomic_helper.c | 24 +++- drivers/gpu/drm/drm_crtc.c | 36 ++++++ drivers/gpu/drm/drm_fops.c | 8 +- include/drm/drmP.h | 2 + include/drm/drm_atomic_helper.h | 4 +- include/drm/drm_crtc.h | 30 +++++ include/linux/fence-collection.h | 73 +++++++++++ include/linux/fence.h | 9 ++ include/linux/sync_file.h | 3 + include/uapi/drm/drm_mode.h | 11 +- 17 files changed, 600 insertions(+), 22 deletions(-) create mode 100644 drivers/dma-buf/fence-collection.c create mode 100644 include/linux/fence-collection.h
From: Gustavo Padovan gustavo.padovan@collabora.co.uk
struct fence_collection inherits from struct fence and carries a collection of fences that needs to be waited together.
It is useful to translate a sync_file to a fence to remove the complexity of dealing with sync_files on DRM drivers. So even if there are many fences in the sync_file that needs to waited for a commit to happen, they all get added to the fence_collection and passed for DRM use as a standard struct fence.
That means that no changes needed to any driver besides supporting fences.
fence_collection's fence doesn't belong to any timeline context, so fence_is_later() and fence_later() are not meant to be called with fence_collections fences.
v2: Comments by Daniel Vetter: - merge fence_collection_init() and fence_collection_add() - only add callbacks at ->enable_signalling() - remove fence_collection_put() - check for type on to_fence_collection() - adjust fence_is_later() and fence_later() to WARN_ON() if they are used with collection fences.
Signed-off-by: Gustavo Padovan gustavo.padovan@collabora.co.uk --- drivers/dma-buf/Makefile | 2 +- drivers/dma-buf/fence-collection.c | 159 +++++++++++++++++++++++++++++++++++++ drivers/dma-buf/fence.c | 2 +- include/linux/fence-collection.h | 73 +++++++++++++++++ include/linux/fence.h | 9 +++ 5 files changed, 243 insertions(+), 2 deletions(-) create mode 100644 drivers/dma-buf/fence-collection.c create mode 100644 include/linux/fence-collection.h
diff --git a/drivers/dma-buf/Makefile b/drivers/dma-buf/Makefile index 4a424ec..52f818f 100644 --- a/drivers/dma-buf/Makefile +++ b/drivers/dma-buf/Makefile @@ -1,2 +1,2 @@ -obj-y := dma-buf.o fence.o reservation.o seqno-fence.o +obj-y := dma-buf.o fence.o reservation.o seqno-fence.o fence-collection.o obj-$(CONFIG_SYNC_FILE) += sync_file.o diff --git a/drivers/dma-buf/fence-collection.c b/drivers/dma-buf/fence-collection.c new file mode 100644 index 0000000..88872e5 --- /dev/null +++ b/drivers/dma-buf/fence-collection.c @@ -0,0 +1,159 @@ +/* + * fence-collection: aggregate fences to be waited together + * + * Copyright (C) 2016 Collabora Ltd + * Authors: + * Gustavo Padovan gustavo@padovan.org + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License version 2 as published by + * the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for + * more details. + */ + +#include <linux/export.h> +#include <linux/slab.h> +#include <linux/fence-collection.h> + +static const char *fence_collection_get_driver_name(struct fence *fence) +{ + struct fence_collection *collection = to_fence_collection(fence); + struct fence *f = collection->fences[0].fence; + + return f->ops->get_driver_name(fence); +} + +static const char *fence_collection_get_timeline_name(struct fence *fence) +{ + return "no context"; +} + +static void collection_check_cb_func(struct fence *fence, struct fence_cb *cb) +{ + struct fence_collection_cb *f_cb; + struct fence_collection *collection; + + f_cb = container_of(cb, struct fence_collection_cb, cb); + collection = f_cb->collection; + + if (atomic_dec_and_test(&collection->num_pending_fences)) + fence_signal(&collection->base); +} + +static bool fence_collection_enable_signaling(struct fence *fence) +{ + struct fence_collection *collection = to_fence_collection(fence); + int i; + + for (i = 0 ; i < collection->num_fences ; i++) { + if (fence_add_callback(collection->fences[i].fence, + &collection->fences[i].cb, + collection_check_cb_func)) { + atomic_dec(&collection->num_pending_fences); + return false; + } + } + + return !!atomic_read(&collection->num_pending_fences); +} + +static bool fence_collection_signaled(struct fence *fence) +{ + struct fence_collection *collection = to_fence_collection(fence); + + return (atomic_read(&collection->num_pending_fences) == 0); +} + +static void fence_collection_release(struct fence *fence) +{ + struct fence_collection *collection = to_fence_collection(fence); + int i; + + for (i = 0 ; i < collection->num_fences ; i++) + fence_put(collection->fences[i].fence); + + kfree(collection->fences); + fence_free(fence); +} + +static signed long fence_collection_wait(struct fence *fence, bool intr, + signed long timeout) +{ + struct fence_collection *collection = to_fence_collection(fence); + int i; + + if (test_bit(FENCE_FLAG_SIGNALED_BIT, &fence->flags)) + return timeout; + + fence_enable_sw_signaling(fence); + + for (i = 0 ; i < collection->num_fences ; i++) { + if (test_bit(FENCE_FLAG_SIGNALED_BIT, &fence->flags)) + return timeout; + + timeout = fence_wait(collection->fences[i].fence, intr); + if (timeout < 0) + return timeout; + } + + return timeout; +} + +const struct fence_ops fence_collection_ops = { + .get_driver_name = fence_collection_get_driver_name, + .get_timeline_name = fence_collection_get_timeline_name, + .enable_signaling = fence_collection_enable_signaling, + .signaled = fence_collection_signaled, + .wait = fence_collection_wait, + .release = fence_collection_release, +}; + +/** + * fence_collection_create - Create a custom fence collection + * is signaled + * @num_fences: [in] number of fences to add in the collection + * @cb: [in] cb array containing the fences for the collection + * + * Allocate a fence_collection and initialize the base fence with fence_init() + * and FENCE_NO_CONTEXT. It also inits all fence_collections specific fields + * and return the created fence_collection. In case of error it returns NULL. + * + * The caller should allocte the fence_collection_cb array with num_fences size + * and fill the fence fields with all the fences it wants to add to the + * collection. + * + * Collections have no context, thus fence_later() and fence_is_later() shall + * not be used with fence_collection. + * + * fence_put() on the base fence should be used to release the collection. + */ +struct fence_collection *fence_collection_create(int num_fences, + struct fence_collection_cb *cb) +{ + struct fence_collection *collection; + int i; + + collection = kzalloc(sizeof(*collection), GFP_KERNEL); + if (!collection) + return NULL; + + spin_lock_init(&collection->lock); + fence_init(&collection->base, &fence_collection_ops, &collection->lock, + FENCE_NO_CONTEXT, 0); + + collection->fences = cb; + collection->num_fences = num_fences; + atomic_set(&collection->num_pending_fences, num_fences); + + for (i = 0 ; i < collection->num_fences ; i++) { + fence_get(collection->fences[i].fence); + collection->fences[i].collection = collection; + } + + return collection; +} +EXPORT_SYMBOL(fence_collection_create); diff --git a/drivers/dma-buf/fence.c b/drivers/dma-buf/fence.c index 7b05dbe..486e95c 100644 --- a/drivers/dma-buf/fence.c +++ b/drivers/dma-buf/fence.c @@ -35,7 +35,7 @@ EXPORT_TRACEPOINT_SYMBOL(fence_emit); * context or not. One device can have multiple separate contexts, * and they're used if some engine can run independently of another. */ -static atomic_t fence_context_counter = ATOMIC_INIT(0); +static atomic_t fence_context_counter = ATOMIC_INIT(1);
/** * fence_context_alloc - allocate an array of fence contexts diff --git a/include/linux/fence-collection.h b/include/linux/fence-collection.h new file mode 100644 index 0000000..4087622 --- /dev/null +++ b/include/linux/fence-collection.h @@ -0,0 +1,73 @@ +/* + * fence-collection: aggregates fence to be waited together + * + * Copyright (C) 2016 Collabora Ltd + * Authors: + * Gustavo Padovan gustavo@padovan.org + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License version 2 as published by + * the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for + * more details. + */ + +#ifndef __LINUX_FENCE_COLLECTION_H +#define __LINUX_FENCE_COLLECTION_H + +#include <linux/fence.h> + +/** + * struct fence_collection_cb - callback for fence_add_callback + * @cb: the base fence_cb object + * @fence: fence we are storing in the collection + * @collection: collection the fence belongs to + * + * This struct is at the same time the storage for the fences in the + * collection and the struct we pass to fence_add_callback() + */ +struct fence_collection_cb { + struct fence_cb cb; + struct fence *fence; + struct fence_collection *collection; +}; + +/** + * struct fence_collection - fence to represent a collection of fences + * @base: fence base class + * @lock: spinlock for fence handling + * @num_pending_fences: fences in the collection not signaled yet + * @num_fences: number of fences in the collection + * @fences: ponteir to fence_collection_cb with data about the fences + */ +struct fence_collection { + struct fence base; + + spinlock_t lock; + atomic_t num_pending_fences; + int num_fences; + struct fence_collection_cb *fences; +}; + +extern const struct fence_ops fence_collection_ops; + +/** + * to_fence_collection - cast a fence to a fence_collection + * @fence: fence to cast to a fence_collection + * + * Returns NULL if the fence is not a fence_collection, + * or the fence_collection otherwise. + */ +static inline struct fence_collection *to_fence_collection(struct fence *fence) +{ + if (fence->ops != &fence_collection_ops) + return NULL; + return container_of(fence, struct fence_collection, base); +} + +struct fence_collection *fence_collection_create(int num_fences, + struct fence_collection_cb *cb); +#endif /* __LINUX_FENCE_COLLECTION_H */ diff --git a/include/linux/fence.h b/include/linux/fence.h index 5aa95eb..184ce79 100644 --- a/include/linux/fence.h +++ b/include/linux/fence.h @@ -30,6 +30,9 @@ #include <linux/printk.h> #include <linux/rcupdate.h>
+/* context number for fences without context, e.g: fence_collection */ +#define FENCE_NO_CONTEXT 0 + struct fence; struct fence_ops; struct fence_cb; @@ -292,6 +295,9 @@ static inline bool fence_is_later(struct fence *f1, struct fence *f2) if (WARN_ON(f1->context != f2->context)) return false;
+ if (WARN_ON(f1->context == FENCE_NO_CONTEXT)) + return false; + return (int)(f1->seqno - f2->seqno) > 0; }
@@ -309,6 +315,9 @@ static inline struct fence *fence_later(struct fence *f1, struct fence *f2) if (WARN_ON(f1->context != f2->context)) return NULL;
+ if (WARN_ON(f1->context == FENCE_NO_CONTEXT)) + return NULL; + /* * can't check just FENCE_FLAG_SIGNALED_BIT here, it may never have been * set if enable_signaling wasn't called, and enabling that here is
On Mon, Apr 25, 2016 at 07:33:21PM -0300, Gustavo Padovan wrote:
From: Gustavo Padovan gustavo.padovan@collabora.co.uk
struct fence_collection inherits from struct fence and carries a collection of fences that needs to be waited together.
It is useful to translate a sync_file to a fence to remove the complexity of dealing with sync_files on DRM drivers. So even if there are many fences in the sync_file that needs to waited for a commit to happen, they all get added to the fence_collection and passed for DRM use as a standard struct fence.
That means that no changes needed to any driver besides supporting fences.
fence_collection's fence doesn't belong to any timeline context, so fence_is_later() and fence_later() are not meant to be called with fence_collections fences.
v2: Comments by Daniel Vetter:
- merge fence_collection_init() and fence_collection_add()
- only add callbacks at ->enable_signalling()
- remove fence_collection_put()
- check for type on to_fence_collection()
- adjust fence_is_later() and fence_later() to WARN_ON() if they
are used with collection fences.
Signed-off-by: Gustavo Padovan gustavo.padovan@collabora.co.uk
FENCE_NO_CONTEXT semantics needs an ack from amdgpu maintainers. I'm not entirely sure they might not hit the new WARN_ON by accident now. Please cc Alex Deucher & Christian König. -Daniel
drivers/dma-buf/Makefile | 2 +- drivers/dma-buf/fence-collection.c | 159 +++++++++++++++++++++++++++++++++++++ drivers/dma-buf/fence.c | 2 +- include/linux/fence-collection.h | 73 +++++++++++++++++ include/linux/fence.h | 9 +++ 5 files changed, 243 insertions(+), 2 deletions(-) create mode 100644 drivers/dma-buf/fence-collection.c create mode 100644 include/linux/fence-collection.h
diff --git a/drivers/dma-buf/Makefile b/drivers/dma-buf/Makefile index 4a424ec..52f818f 100644 --- a/drivers/dma-buf/Makefile +++ b/drivers/dma-buf/Makefile @@ -1,2 +1,2 @@ -obj-y := dma-buf.o fence.o reservation.o seqno-fence.o +obj-y := dma-buf.o fence.o reservation.o seqno-fence.o fence-collection.o obj-$(CONFIG_SYNC_FILE) += sync_file.o diff --git a/drivers/dma-buf/fence-collection.c b/drivers/dma-buf/fence-collection.c new file mode 100644 index 0000000..88872e5 --- /dev/null +++ b/drivers/dma-buf/fence-collection.c @@ -0,0 +1,159 @@ +/*
- fence-collection: aggregate fences to be waited together
- Copyright (C) 2016 Collabora Ltd
- Authors:
- Gustavo Padovan gustavo@padovan.org
- This program is free software; you can redistribute it and/or modify it
- under the terms of the GNU General Public License version 2 as published by
- the Free Software Foundation.
- This program is distributed in the hope that it will be useful, but WITHOUT
- ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
- FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
- more details.
- */
+#include <linux/export.h> +#include <linux/slab.h> +#include <linux/fence-collection.h>
+static const char *fence_collection_get_driver_name(struct fence *fence) +{
- struct fence_collection *collection = to_fence_collection(fence);
- struct fence *f = collection->fences[0].fence;
- return f->ops->get_driver_name(fence);
+}
+static const char *fence_collection_get_timeline_name(struct fence *fence) +{
- return "no context";
+}
+static void collection_check_cb_func(struct fence *fence, struct fence_cb *cb) +{
- struct fence_collection_cb *f_cb;
- struct fence_collection *collection;
- f_cb = container_of(cb, struct fence_collection_cb, cb);
- collection = f_cb->collection;
- if (atomic_dec_and_test(&collection->num_pending_fences))
fence_signal(&collection->base);
+}
+static bool fence_collection_enable_signaling(struct fence *fence) +{
- struct fence_collection *collection = to_fence_collection(fence);
- int i;
- for (i = 0 ; i < collection->num_fences ; i++) {
if (fence_add_callback(collection->fences[i].fence,
&collection->fences[i].cb,
collection_check_cb_func)) {
atomic_dec(&collection->num_pending_fences);
return false;
}
- }
- return !!atomic_read(&collection->num_pending_fences);
+}
+static bool fence_collection_signaled(struct fence *fence) +{
- struct fence_collection *collection = to_fence_collection(fence);
- return (atomic_read(&collection->num_pending_fences) == 0);
+}
+static void fence_collection_release(struct fence *fence) +{
- struct fence_collection *collection = to_fence_collection(fence);
- int i;
- for (i = 0 ; i < collection->num_fences ; i++)
fence_put(collection->fences[i].fence);
- kfree(collection->fences);
- fence_free(fence);
+}
+static signed long fence_collection_wait(struct fence *fence, bool intr,
signed long timeout)
+{
- struct fence_collection *collection = to_fence_collection(fence);
- int i;
- if (test_bit(FENCE_FLAG_SIGNALED_BIT, &fence->flags))
return timeout;
- fence_enable_sw_signaling(fence);
- for (i = 0 ; i < collection->num_fences ; i++) {
if (test_bit(FENCE_FLAG_SIGNALED_BIT, &fence->flags))
return timeout;
timeout = fence_wait(collection->fences[i].fence, intr);
if (timeout < 0)
return timeout;
- }
- return timeout;
+}
+const struct fence_ops fence_collection_ops = {
- .get_driver_name = fence_collection_get_driver_name,
- .get_timeline_name = fence_collection_get_timeline_name,
- .enable_signaling = fence_collection_enable_signaling,
- .signaled = fence_collection_signaled,
- .wait = fence_collection_wait,
- .release = fence_collection_release,
+};
+/**
- fence_collection_create - Create a custom fence collection
- is signaled
- @num_fences: [in] number of fences to add in the collection
- @cb: [in] cb array containing the fences for the collection
- Allocate a fence_collection and initialize the base fence with fence_init()
- and FENCE_NO_CONTEXT. It also inits all fence_collections specific fields
- and return the created fence_collection. In case of error it returns NULL.
- The caller should allocte the fence_collection_cb array with num_fences size
- and fill the fence fields with all the fences it wants to add to the
- collection.
- Collections have no context, thus fence_later() and fence_is_later() shall
- not be used with fence_collection.
- fence_put() on the base fence should be used to release the collection.
- */
+struct fence_collection *fence_collection_create(int num_fences,
struct fence_collection_cb *cb)
+{
- struct fence_collection *collection;
- int i;
- collection = kzalloc(sizeof(*collection), GFP_KERNEL);
- if (!collection)
return NULL;
- spin_lock_init(&collection->lock);
- fence_init(&collection->base, &fence_collection_ops, &collection->lock,
FENCE_NO_CONTEXT, 0);
- collection->fences = cb;
- collection->num_fences = num_fences;
- atomic_set(&collection->num_pending_fences, num_fences);
- for (i = 0 ; i < collection->num_fences ; i++) {
fence_get(collection->fences[i].fence);
collection->fences[i].collection = collection;
- }
- return collection;
+} +EXPORT_SYMBOL(fence_collection_create); diff --git a/drivers/dma-buf/fence.c b/drivers/dma-buf/fence.c index 7b05dbe..486e95c 100644 --- a/drivers/dma-buf/fence.c +++ b/drivers/dma-buf/fence.c @@ -35,7 +35,7 @@ EXPORT_TRACEPOINT_SYMBOL(fence_emit);
- context or not. One device can have multiple separate contexts,
- and they're used if some engine can run independently of another.
*/ -static atomic_t fence_context_counter = ATOMIC_INIT(0); +static atomic_t fence_context_counter = ATOMIC_INIT(1);
/**
- fence_context_alloc - allocate an array of fence contexts
diff --git a/include/linux/fence-collection.h b/include/linux/fence-collection.h new file mode 100644 index 0000000..4087622 --- /dev/null +++ b/include/linux/fence-collection.h @@ -0,0 +1,73 @@ +/*
- fence-collection: aggregates fence to be waited together
- Copyright (C) 2016 Collabora Ltd
- Authors:
- Gustavo Padovan gustavo@padovan.org
- This program is free software; you can redistribute it and/or modify it
- under the terms of the GNU General Public License version 2 as published by
- the Free Software Foundation.
- This program is distributed in the hope that it will be useful, but WITHOUT
- ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
- FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
- more details.
- */
+#ifndef __LINUX_FENCE_COLLECTION_H +#define __LINUX_FENCE_COLLECTION_H
+#include <linux/fence.h>
+/**
- struct fence_collection_cb - callback for fence_add_callback
- @cb: the base fence_cb object
- @fence: fence we are storing in the collection
- @collection: collection the fence belongs to
- This struct is at the same time the storage for the fences in the
- collection and the struct we pass to fence_add_callback()
- */
+struct fence_collection_cb {
- struct fence_cb cb;
- struct fence *fence;
- struct fence_collection *collection;
+};
+/**
- struct fence_collection - fence to represent a collection of fences
- @base: fence base class
- @lock: spinlock for fence handling
- @num_pending_fences: fences in the collection not signaled yet
- @num_fences: number of fences in the collection
- @fences: ponteir to fence_collection_cb with data about the fences
- */
+struct fence_collection {
- struct fence base;
- spinlock_t lock;
- atomic_t num_pending_fences;
- int num_fences;
- struct fence_collection_cb *fences;
+};
+extern const struct fence_ops fence_collection_ops;
+/**
- to_fence_collection - cast a fence to a fence_collection
- @fence: fence to cast to a fence_collection
- Returns NULL if the fence is not a fence_collection,
- or the fence_collection otherwise.
- */
+static inline struct fence_collection *to_fence_collection(struct fence *fence) +{
- if (fence->ops != &fence_collection_ops)
return NULL;
- return container_of(fence, struct fence_collection, base);
+}
+struct fence_collection *fence_collection_create(int num_fences,
struct fence_collection_cb *cb);
+#endif /* __LINUX_FENCE_COLLECTION_H */ diff --git a/include/linux/fence.h b/include/linux/fence.h index 5aa95eb..184ce79 100644 --- a/include/linux/fence.h +++ b/include/linux/fence.h @@ -30,6 +30,9 @@ #include <linux/printk.h> #include <linux/rcupdate.h>
+/* context number for fences without context, e.g: fence_collection */ +#define FENCE_NO_CONTEXT 0
struct fence; struct fence_ops; struct fence_cb; @@ -292,6 +295,9 @@ static inline bool fence_is_later(struct fence *f1, struct fence *f2) if (WARN_ON(f1->context != f2->context)) return false;
- if (WARN_ON(f1->context == FENCE_NO_CONTEXT))
return false;
- return (int)(f1->seqno - f2->seqno) > 0;
}
@@ -309,6 +315,9 @@ static inline struct fence *fence_later(struct fence *f1, struct fence *f2) if (WARN_ON(f1->context != f2->context)) return NULL;
- if (WARN_ON(f1->context == FENCE_NO_CONTEXT))
return NULL;
- /*
- can't check just FENCE_FLAG_SIGNALED_BIT here, it may never have been
- set if enable_signaling wasn't called, and enabling that here is
-- 2.5.5
2016-04-26 Daniel Vetter daniel@ffwll.ch:
On Mon, Apr 25, 2016 at 07:33:21PM -0300, Gustavo Padovan wrote:
From: Gustavo Padovan gustavo.padovan@collabora.co.uk
struct fence_collection inherits from struct fence and carries a collection of fences that needs to be waited together.
It is useful to translate a sync_file to a fence to remove the complexity of dealing with sync_files on DRM drivers. So even if there are many fences in the sync_file that needs to waited for a commit to happen, they all get added to the fence_collection and passed for DRM use as a standard struct fence.
That means that no changes needed to any driver besides supporting fences.
fence_collection's fence doesn't belong to any timeline context, so fence_is_later() and fence_later() are not meant to be called with fence_collections fences.
v2: Comments by Daniel Vetter:
- merge fence_collection_init() and fence_collection_add()
- only add callbacks at ->enable_signalling()
- remove fence_collection_put()
- check for type on to_fence_collection()
- adjust fence_is_later() and fence_later() to WARN_ON() if they
are used with collection fences.
Signed-off-by: Gustavo Padovan gustavo.padovan@collabora.co.uk
FENCE_NO_CONTEXT semantics needs an ack from amdgpu maintainers. I'm not entirely sure they might not hit the new WARN_ON by accident now. Please cc Alex Deucher & Christian König.
Sure, I'll Cc then in the revision. But if they use fence_context_alloc() to get the context they should never hit any WARN_ON as context numbers now starts at 1. 0 is reserved for FENCE_NO_CONTEXT.
Gustavo
On Tue, Apr 26, 2016 at 12:02:08PM -0300, Gustavo Padovan wrote:
2016-04-26 Daniel Vetter daniel@ffwll.ch:
On Mon, Apr 25, 2016 at 07:33:21PM -0300, Gustavo Padovan wrote:
From: Gustavo Padovan gustavo.padovan@collabora.co.uk
struct fence_collection inherits from struct fence and carries a collection of fences that needs to be waited together.
It is useful to translate a sync_file to a fence to remove the complexity of dealing with sync_files on DRM drivers. So even if there are many fences in the sync_file that needs to waited for a commit to happen, they all get added to the fence_collection and passed for DRM use as a standard struct fence.
That means that no changes needed to any driver besides supporting fences.
fence_collection's fence doesn't belong to any timeline context, so fence_is_later() and fence_later() are not meant to be called with fence_collections fences.
v2: Comments by Daniel Vetter:
- merge fence_collection_init() and fence_collection_add()
- only add callbacks at ->enable_signalling()
- remove fence_collection_put()
- check for type on to_fence_collection()
- adjust fence_is_later() and fence_later() to WARN_ON() if they
are used with collection fences.
Signed-off-by: Gustavo Padovan gustavo.padovan@collabora.co.uk
FENCE_NO_CONTEXT semantics needs an ack from amdgpu maintainers. I'm not entirely sure they might not hit the new WARN_ON by accident now. Please cc Alex Deucher & Christian König.
Sure, I'll Cc then in the revision. But if they use fence_context_alloc() to get the context they should never hit any WARN_ON as context numbers now starts at 1. 0 is reserved for FENCE_NO_CONTEXT.
I was more concerned whether the codepaths could accidentally walk over non-amdgpu fences (through prime buffer sharing for example). Otoh that would be a preexisting bug I think ... -Daniel
On Mon, Apr 25, 2016 at 07:33:21PM -0300, Gustavo Padovan wrote:
+static const char *fence_collection_get_timeline_name(struct fence *fence) +{
- return "no context";
"unbound" to distinguish from fence contexts within a timeline?
+static bool fence_collection_enable_signaling(struct fence *fence) +{
- struct fence_collection *collection = to_fence_collection(fence);
- int i;
- for (i = 0 ; i < collection->num_fences ; i++) {
if (fence_add_callback(collection->fences[i].fence,
&collection->fences[i].cb,
collection_check_cb_func)) {
atomic_dec(&collection->num_pending_fences);
return false;
Don't stop, we need to enable all the others!
}
- }
- return !!atomic_read(&collection->num_pending_fences);
Redundant !!
+}
+static bool fence_collection_signaled(struct fence *fence) +{
- struct fence_collection *collection = to_fence_collection(fence);
- return (atomic_read(&collection->num_pending_fences) == 0);
Redundant ()
+static signed long fence_collection_wait(struct fence *fence, bool intr,
signed long timeout)
+{
What advantage does this have over fence_default_wait? You enable signaling on all, then wait sequentially. The code looks redundant and could just use fence_default_wait instead. -Chris
2016-04-26 Chris Wilson chris@chris-wilson.co.uk:
On Mon, Apr 25, 2016 at 07:33:21PM -0300, Gustavo Padovan wrote:
+static const char *fence_collection_get_timeline_name(struct fence *fence) +{
- return "no context";
"unbound" to distinguish from fence contexts within a timeline?
+static bool fence_collection_enable_signaling(struct fence *fence) +{
- struct fence_collection *collection = to_fence_collection(fence);
- int i;
- for (i = 0 ; i < collection->num_fences ; i++) {
if (fence_add_callback(collection->fences[i].fence,
&collection->fences[i].cb,
collection_check_cb_func)) {
atomic_dec(&collection->num_pending_fences);
return false;
Don't stop, we need to enable all the others!
}
- }
- return !!atomic_read(&collection->num_pending_fences);
Redundant !!
+}
+static bool fence_collection_signaled(struct fence *fence) +{
- struct fence_collection *collection = to_fence_collection(fence);
- return (atomic_read(&collection->num_pending_fences) == 0);
Redundant ()
+static signed long fence_collection_wait(struct fence *fence, bool intr,
signed long timeout)
+{
What advantage does this have over fence_default_wait? You enable signaling on all, then wait sequentially. The code looks redundant and could just use fence_default_wait instead.
None actually, I'll just replace it with fence_default_wait().
Gustavo
From: Gustavo Padovan gustavo.padovan@collabora.co.uk
Include fence-collection files in the DocBook.
Signed-off-by: Gustavo Padovan gustavo.padovan@collabora.co.uk --- Documentation/DocBook/device-drivers.tmpl | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/Documentation/DocBook/device-drivers.tmpl b/Documentation/DocBook/device-drivers.tmpl index 509a187..68539e5 100644 --- a/Documentation/DocBook/device-drivers.tmpl +++ b/Documentation/DocBook/device-drivers.tmpl @@ -132,8 +132,10 @@ X!Edrivers/base/interface.c !Edrivers/dma-buf/dma-buf.c !Edrivers/dma-buf/fence.c !Edrivers/dma-buf/seqno-fence.c +!Edrivers/dma-buf/fence-collection.c !Iinclude/linux/fence.h !Iinclude/linux/seqno-fence.h +!Iinclude/linux/fence-collection.h !Edrivers/dma-buf/reservation.c !Iinclude/linux/reservation.h !Edrivers/dma-buf/sync_file.c
From: Gustavo Padovan gustavo.padovan@collabora.co.uk
Creates a function that given an sync file descriptor returns a fence_collection containing all fences in the sync_file.
If there is only one fence in the sync_file this fence itself is returned, however if there is more than one, a fence_collection fence is returned.
v2: Comments by Daniel Vetter - Adapt to new version of fence_collection_init() - Hold a reference for the fence we return
Signed-off-by: Gustavo Padovan gustavo.padovan@collabora.co.uk --- drivers/dma-buf/sync_file.c | 60 +++++++++++++++++++++++++++++++++++++++++++++ include/linux/sync_file.h | 3 +++ 2 files changed, 63 insertions(+)
diff --git a/drivers/dma-buf/sync_file.c b/drivers/dma-buf/sync_file.c index 40ee098..5dd76c0 100644 --- a/drivers/dma-buf/sync_file.c +++ b/drivers/dma-buf/sync_file.c @@ -23,6 +23,7 @@ #include <linux/slab.h> #include <linux/uaccess.h> #include <linux/anon_inodes.h> +#include <linux/fence-collection.h> #include <linux/sync_file.h> #include <uapi/linux/sync_file.h>
@@ -147,6 +148,62 @@ void sync_file_install(struct sync_file *sync_file, int fd) } EXPORT_SYMBOL(sync_file_install);
+/** + * sync_file_fences_get - get the fence(s) related to the sync_file fd + * @fd: sync_file fd to get the fence(s) from + * + * Ensures @fd references a valid sync_file and returns a fence that + * represents all fence in the sync_file. + * + * If there is only one fence in the sync_file, this fence is returned. + * If there is more than one, a fence_collection containing all fences + * is created and its base fence object is returned. + * On both cases a reference to the returned fence is held. On error + * NULL is returned. + */ +struct fence *sync_file_fences_get(int fd) +{ + struct sync_file *sync_file; + struct fence_collection *collection; + struct fence_collection_cb *cb; + int i; + + sync_file = sync_file_fdget(fd); + if (!sync_file) + return NULL; + + if (sync_file->num_fences == 1) { + struct fence *fence = sync_file->cbs[0].fence; + + fence_get(fence); + sync_file_put(sync_file); + return fence; + } + + cb = kcalloc(sync_file->num_fences, sizeof(*cb), GFP_KERNEL); + if (!cb) { + sync_file_put(sync_file); + return NULL; + } + + for (i = 0 ; i < sync_file->num_fences ; i++) + cb[i].fence = sync_file->cbs[i].fence; + + collection = fence_collection_create(sync_file->num_fences, cb); + if (!collection) { + kfree(cb); + sync_file_put(sync_file); + return NULL; + } + + sync_file->collection = collection; + fence_get(&collection->base); + sync_file_put(sync_file); + + return &collection->base; +} +EXPORT_SYMBOL(sync_file_fences_get); + static void sync_file_add_pt(struct sync_file *sync_file, int *i, struct fence *fence) { @@ -234,6 +291,9 @@ static void sync_file_free(struct kref *kref) kref); int i;
+ if (sync_file->collection) + fence_put(&sync_file->collection->base); + for (i = 0; i < sync_file->num_fences; ++i) { fence_remove_callback(sync_file->cbs[i].fence, &sync_file->cbs[i].cb); diff --git a/include/linux/sync_file.h b/include/linux/sync_file.h index 900fa0c..3eb3aad 100644 --- a/include/linux/sync_file.h +++ b/include/linux/sync_file.h @@ -35,6 +35,7 @@ struct sync_file_cb { * @num_fences: number of sync_pts in the fence * @wq: wait queue for fence signaling * @status: 0: signaled, >0:active, <0: error + * @collection: collection with the fences in the sync_file * @cbs: sync_pts callback information */ struct sync_file { @@ -49,6 +50,7 @@ struct sync_file { wait_queue_head_t wq; atomic_t status;
+ struct fence_collection *collection; struct sync_file_cb cbs[]; };
@@ -59,4 +61,5 @@ void sync_file_install(struct sync_file *sync_file, int fd); struct sync_file *sync_file_merge(const char *name, struct sync_file *a, struct sync_file *b);
+struct fence *sync_file_fences_get(int fd); #endif /* _LINUX_SYNC_H */
From: Gustavo Padovan gustavo.padovan@collabora.co.uk
If userspace is running an synchronously atomic commit and interrupts the atomic operation during fence_wait() it will hang until the timer expires, so here we change the wait to be interruptible so it stop immediately when userspace wants to quit.
Also adds the necessary error checking for fence_wait().
v2: Comment by Daniel Vetter - Add error checking for fence_wait()
Signed-off-by: Gustavo Padovan gustavo.padovan@collabora.co.uk --- drivers/gpu/drm/drm_atomic_helper.c | 21 +++++++++++++++------ include/drm/drm_atomic_helper.h | 4 ++-- 2 files changed, 17 insertions(+), 8 deletions(-)
diff --git a/drivers/gpu/drm/drm_atomic_helper.c b/drivers/gpu/drm/drm_atomic_helper.c index 7bf678e..f1cfcce 100644 --- a/drivers/gpu/drm/drm_atomic_helper.c +++ b/drivers/gpu/drm/drm_atomic_helper.c @@ -993,13 +993,15 @@ EXPORT_SYMBOL(drm_atomic_helper_commit_modeset_enables); * incoming fb's and stash it in the drm_plane_state. This is called after * drm_atomic_helper_swap_state() so it uses the current plane state (and * just uses the atomic state to find the changed planes) + * + * Return 0 on sucess or < 0 on error. */ -void drm_atomic_helper_wait_for_fences(struct drm_device *dev, - struct drm_atomic_state *state) +int drm_atomic_helper_wait_for_fences(struct drm_device *dev, + struct drm_atomic_state *state) { struct drm_plane *plane; struct drm_plane_state *plane_state; - int i; + int i, ret;
for_each_plane_in_state(state, plane, plane_state, i) { if (!plane->state->fence) @@ -1007,10 +1009,14 @@ void drm_atomic_helper_wait_for_fences(struct drm_device *dev,
WARN_ON(!plane->state->fb);
- fence_wait(plane->state->fence, false); + ret = fence_wait(plane->state->fence, true); + if (ret) + return ret; fence_put(plane->state->fence); plane->state->fence = NULL; } + + return 0; } EXPORT_SYMBOL(drm_atomic_helper_wait_for_fences);
@@ -1174,7 +1180,9 @@ int drm_atomic_helper_commit(struct drm_device *dev, * current layout. */
- drm_atomic_helper_wait_for_fences(dev, state); + ret = drm_atomic_helper_wait_for_fences(dev, state); + if (ret) + goto out;
drm_atomic_helper_commit_modeset_disables(dev, state);
@@ -1184,11 +1192,12 @@ int drm_atomic_helper_commit(struct drm_device *dev,
drm_atomic_helper_wait_for_vblanks(dev, state);
+out: drm_atomic_helper_cleanup_planes(dev, state);
drm_atomic_state_free(state);
- return 0; + return ret; } EXPORT_SYMBOL(drm_atomic_helper_commit);
diff --git a/include/drm/drm_atomic_helper.h b/include/drm/drm_atomic_helper.h index fe9d89c..3a5d2e5 100644 --- a/include/drm/drm_atomic_helper.h +++ b/include/drm/drm_atomic_helper.h @@ -42,8 +42,8 @@ int drm_atomic_helper_commit(struct drm_device *dev, struct drm_atomic_state *state, bool async);
-void drm_atomic_helper_wait_for_fences(struct drm_device *dev, - struct drm_atomic_state *state); +int drm_atomic_helper_wait_for_fences(struct drm_device *dev, + struct drm_atomic_state *state); bool drm_atomic_helper_framebuffer_changed(struct drm_device *dev, struct drm_atomic_state *old_state, struct drm_crtc *crtc);
From: Gustavo Padovan gustavo.padovan@collabora.co.uk
There is now a new property called FENCE_FD attached to every plane state that receives the sync_file fd from userspace via the atomic commit IOCTL.
The fd is then translated to a fence (that may be a fence_collection subclass or just a normal fence) and then used by DRM to fence_wait() for all fences in the sync_file to signal. So it only commits when all framebuffers are ready to scanout.
Signed-off-by: Gustavo Padovan gustavo.padovan@collabora.co.uk
v2: Comments by Daniel Vetter: - remove set state->fence = NULL in destroy phase - accept fence -1 as valid and just return 0 - do not call fence_get() - sync_file_fences_get() already calls it - fence_put() if state->fence is already set, in case userspace set the property more than once. --- drivers/gpu/drm/Kconfig | 1 + drivers/gpu/drm/drm_atomic.c | 14 ++++++++++++++ drivers/gpu/drm/drm_atomic_helper.c | 3 +++ drivers/gpu/drm/drm_crtc.c | 7 +++++++ include/drm/drm_crtc.h | 1 + 5 files changed, 26 insertions(+)
diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig index f2a74d0..3c987e3 100644 --- a/drivers/gpu/drm/Kconfig +++ b/drivers/gpu/drm/Kconfig @@ -12,6 +12,7 @@ menuconfig DRM select I2C select I2C_ALGOBIT select DMA_SHARED_BUFFER + select SYNC_FILE help Kernel-level support for the Direct Rendering Infrastructure (DRI) introduced in XFree86 4.0. If you say Y here, you need to select diff --git a/drivers/gpu/drm/drm_atomic.c b/drivers/gpu/drm/drm_atomic.c index 8ee1db8..13674c7 100644 --- a/drivers/gpu/drm/drm_atomic.c +++ b/drivers/gpu/drm/drm_atomic.c @@ -30,6 +30,7 @@ #include <drm/drm_atomic.h> #include <drm/drm_mode.h> #include <drm/drm_plane_helper.h> +#include <linux/sync_file.h>
/** * drm_atomic_state_default_release - @@ -680,6 +681,17 @@ int drm_atomic_plane_set_property(struct drm_plane *plane, drm_atomic_set_fb_for_plane(state, fb); if (fb) drm_framebuffer_unreference(fb); + } else if (property == config->prop_fence_fd) { + if (U642I64(val) == -1) + return 0; + + if (state->fence) + fence_put(state->fence); + + state->fence = sync_file_fences_get(val); + if (!state->fence) + return -EINVAL; + } else if (property == config->prop_crtc_id) { struct drm_crtc *crtc = drm_crtc_find(dev, val); return drm_atomic_set_crtc_for_plane(state, crtc); @@ -737,6 +749,8 @@ drm_atomic_plane_get_property(struct drm_plane *plane,
if (property == config->prop_fb_id) { *val = (state->fb) ? state->fb->base.id : 0; + } else if (property == config->prop_fence_fd) { + *val = -1; } else if (property == config->prop_crtc_id) { *val = (state->crtc) ? state->crtc->base.id : 0; } else if (property == config->prop_crtc_x) { diff --git a/drivers/gpu/drm/drm_atomic_helper.c b/drivers/gpu/drm/drm_atomic_helper.c index f1cfcce..866f332 100644 --- a/drivers/gpu/drm/drm_atomic_helper.c +++ b/drivers/gpu/drm/drm_atomic_helper.c @@ -2696,6 +2696,9 @@ void __drm_atomic_helper_plane_destroy_state(struct drm_plane *plane, { if (state->fb) drm_framebuffer_unreference(state->fb); + + if (state->fence) + fence_put(state->fence); } EXPORT_SYMBOL(__drm_atomic_helper_plane_destroy_state);
diff --git a/drivers/gpu/drm/drm_crtc.c b/drivers/gpu/drm/drm_crtc.c index 55ffde5..65212ce 100644 --- a/drivers/gpu/drm/drm_crtc.c +++ b/drivers/gpu/drm/drm_crtc.c @@ -1278,6 +1278,7 @@ int drm_universal_plane_init(struct drm_device *dev, struct drm_plane *plane,
if (drm_core_check_feature(dev, DRIVER_ATOMIC)) { drm_object_attach_property(&plane->base, config->prop_fb_id, 0); + drm_object_attach_property(&plane->base, config->prop_fence_fd, -1); drm_object_attach_property(&plane->base, config->prop_crtc_id, 0); drm_object_attach_property(&plane->base, config->prop_crtc_x, 0); drm_object_attach_property(&plane->base, config->prop_crtc_y, 0); @@ -1533,6 +1534,12 @@ static int drm_mode_create_standard_properties(struct drm_device *dev) return -ENOMEM; dev->mode_config.prop_fb_id = prop;
+ prop = drm_property_create_signed_range(dev, DRM_MODE_PROP_ATOMIC, + "FENCE_FD", -1, INT_MAX); + if (!prop) + return -ENOMEM; + dev->mode_config.prop_fence_fd = prop; + prop = drm_property_create_object(dev, DRM_MODE_PROP_ATOMIC, "CRTC_ID", DRM_MODE_OBJECT_CRTC); if (!prop) diff --git a/include/drm/drm_crtc.h b/include/drm/drm_crtc.h index 8cb377c..5ba3cda 100644 --- a/include/drm/drm_crtc.h +++ b/include/drm/drm_crtc.h @@ -2122,6 +2122,7 @@ struct drm_mode_config { struct drm_property *prop_crtc_w; struct drm_property *prop_crtc_h; struct drm_property *prop_fb_id; + struct drm_property *prop_fence_fd; struct drm_property *prop_crtc_id; struct drm_property *prop_active; struct drm_property *prop_mode_id;
On Mon, Apr 25, 2016 at 07:33:25PM -0300, Gustavo Padovan wrote:
From: Gustavo Padovan gustavo.padovan@collabora.co.uk
There is now a new property called FENCE_FD attached to every plane state that receives the sync_file fd from userspace via the atomic commit IOCTL.
I still don't like this property abuse. Also with atomic, all passed fences must be waited upon before anything is done, so attaching them to planes seems like it might just give people the wrong idea.
The fd is then translated to a fence (that may be a fence_collection subclass or just a normal fence) and then used by DRM to fence_wait() for all fences in the sync_file to signal. So it only commits when all framebuffers are ready to scanout.
Signed-off-by: Gustavo Padovan gustavo.padovan@collabora.co.uk
v2: Comments by Daniel Vetter:
- remove set state->fence = NULL in destroy phase
- accept fence -1 as valid and just return 0
- do not call fence_get() - sync_file_fences_get() already calls it
- fence_put() if state->fence is already set, in case userspace
set the property more than once.
drivers/gpu/drm/Kconfig | 1 + drivers/gpu/drm/drm_atomic.c | 14 ++++++++++++++ drivers/gpu/drm/drm_atomic_helper.c | 3 +++ drivers/gpu/drm/drm_crtc.c | 7 +++++++ include/drm/drm_crtc.h | 1 + 5 files changed, 26 insertions(+)
diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig index f2a74d0..3c987e3 100644 --- a/drivers/gpu/drm/Kconfig +++ b/drivers/gpu/drm/Kconfig @@ -12,6 +12,7 @@ menuconfig DRM select I2C select I2C_ALGOBIT select DMA_SHARED_BUFFER
- select SYNC_FILE help Kernel-level support for the Direct Rendering Infrastructure (DRI) introduced in XFree86 4.0. If you say Y here, you need to select
diff --git a/drivers/gpu/drm/drm_atomic.c b/drivers/gpu/drm/drm_atomic.c index 8ee1db8..13674c7 100644 --- a/drivers/gpu/drm/drm_atomic.c +++ b/drivers/gpu/drm/drm_atomic.c @@ -30,6 +30,7 @@ #include <drm/drm_atomic.h> #include <drm/drm_mode.h> #include <drm/drm_plane_helper.h> +#include <linux/sync_file.h>
/**
- drm_atomic_state_default_release -
@@ -680,6 +681,17 @@ int drm_atomic_plane_set_property(struct drm_plane *plane, drm_atomic_set_fb_for_plane(state, fb); if (fb) drm_framebuffer_unreference(fb);
- } else if (property == config->prop_fence_fd) {
if (U642I64(val) == -1)
return 0;
if (state->fence)
fence_put(state->fence);
state->fence = sync_file_fences_get(val);
if (!state->fence)
return -EINVAL;
- } else if (property == config->prop_crtc_id) { struct drm_crtc *crtc = drm_crtc_find(dev, val); return drm_atomic_set_crtc_for_plane(state, crtc);
@@ -737,6 +749,8 @@ drm_atomic_plane_get_property(struct drm_plane *plane,
if (property == config->prop_fb_id) { *val = (state->fb) ? state->fb->base.id : 0;
- } else if (property == config->prop_fence_fd) {
} else if (property == config->prop_crtc_id) { *val = (state->crtc) ? state->crtc->base.id : 0; } else if (property == config->prop_crtc_x) {*val = -1;
diff --git a/drivers/gpu/drm/drm_atomic_helper.c b/drivers/gpu/drm/drm_atomic_helper.c index f1cfcce..866f332 100644 --- a/drivers/gpu/drm/drm_atomic_helper.c +++ b/drivers/gpu/drm/drm_atomic_helper.c @@ -2696,6 +2696,9 @@ void __drm_atomic_helper_plane_destroy_state(struct drm_plane *plane, { if (state->fb) drm_framebuffer_unreference(state->fb);
- if (state->fence)
fence_put(state->fence);
} EXPORT_SYMBOL(__drm_atomic_helper_plane_destroy_state);
diff --git a/drivers/gpu/drm/drm_crtc.c b/drivers/gpu/drm/drm_crtc.c index 55ffde5..65212ce 100644 --- a/drivers/gpu/drm/drm_crtc.c +++ b/drivers/gpu/drm/drm_crtc.c @@ -1278,6 +1278,7 @@ int drm_universal_plane_init(struct drm_device *dev, struct drm_plane *plane,
if (drm_core_check_feature(dev, DRIVER_ATOMIC)) { drm_object_attach_property(&plane->base, config->prop_fb_id, 0);
drm_object_attach_property(&plane->base, config->prop_crtc_id, 0); drm_object_attach_property(&plane->base, config->prop_crtc_x, 0); drm_object_attach_property(&plane->base, config->prop_crtc_y, 0);drm_object_attach_property(&plane->base, config->prop_fence_fd, -1);
@@ -1533,6 +1534,12 @@ static int drm_mode_create_standard_properties(struct drm_device *dev) return -ENOMEM; dev->mode_config.prop_fb_id = prop;
- prop = drm_property_create_signed_range(dev, DRM_MODE_PROP_ATOMIC,
"FENCE_FD", -1, INT_MAX);
- if (!prop)
return -ENOMEM;
- dev->mode_config.prop_fence_fd = prop;
- prop = drm_property_create_object(dev, DRM_MODE_PROP_ATOMIC, "CRTC_ID", DRM_MODE_OBJECT_CRTC); if (!prop)
diff --git a/include/drm/drm_crtc.h b/include/drm/drm_crtc.h index 8cb377c..5ba3cda 100644 --- a/include/drm/drm_crtc.h +++ b/include/drm/drm_crtc.h @@ -2122,6 +2122,7 @@ struct drm_mode_config { struct drm_property *prop_crtc_w; struct drm_property *prop_crtc_h; struct drm_property *prop_fb_id;
- struct drm_property *prop_fence_fd; struct drm_property *prop_crtc_id; struct drm_property *prop_active; struct drm_property *prop_mode_id;
-- 2.5.5
dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
2016-04-26 Ville Syrjälä ville.syrjala@linux.intel.com:
On Mon, Apr 25, 2016 at 07:33:25PM -0300, Gustavo Padovan wrote:
From: Gustavo Padovan gustavo.padovan@collabora.co.uk
There is now a new property called FENCE_FD attached to every plane state that receives the sync_file fd from userspace via the atomic commit IOCTL.
I still don't like this property abuse. Also with atomic, all passed fences must be waited upon before anything is done, so attaching them to planes seems like it might just give people the wrong idea.
I'm actually fine with this as property, but another solutions is use an array of {plane, fence_fd} and extend drm_atomic_ioctl args just like we have done for out fences. However the FENCE_FD property is easier to handle in userspace than the array. Any other idea?
Gustavo
On Tue, Apr 26, 2016 at 11:14:22AM -0300, Gustavo Padovan wrote:
2016-04-26 Ville Syrjälä ville.syrjala@linux.intel.com:
On Mon, Apr 25, 2016 at 07:33:25PM -0300, Gustavo Padovan wrote:
From: Gustavo Padovan gustavo.padovan@collabora.co.uk
There is now a new property called FENCE_FD attached to every plane state that receives the sync_file fd from userspace via the atomic commit IOCTL.
I still don't like this property abuse. Also with atomic, all passed fences must be waited upon before anything is done, so attaching them to planes seems like it might just give people the wrong idea.
I'm actually fine with this as property, but another solutions is use an array of {plane, fence_fd} and extend drm_atomic_ioctl args just like we have done for out fences. However the FENCE_FD property is easier to handle in userspace than the array. Any other idea?
Imo FENCE_FD is perfectly fine. But what's the concern around giving people the wrong idea with attaching fences to planes? For nonblocking commits we need to store them somewhere for the worker, drm_plane_state seems like an as good place as any other. -Daniel
On Tue, Apr 26, 2016 at 04:36:36PM +0200, Daniel Vetter wrote:
On Tue, Apr 26, 2016 at 11:14:22AM -0300, Gustavo Padovan wrote:
2016-04-26 Ville Syrjälä ville.syrjala@linux.intel.com:
On Mon, Apr 25, 2016 at 07:33:25PM -0300, Gustavo Padovan wrote:
From: Gustavo Padovan gustavo.padovan@collabora.co.uk
There is now a new property called FENCE_FD attached to every plane state that receives the sync_file fd from userspace via the atomic commit IOCTL.
I still don't like this property abuse. Also with atomic, all passed fences must be waited upon before anything is done, so attaching them to planes seems like it might just give people the wrong idea.
I'm actually fine with this as property, but another solutions is use an array of {plane, fence_fd} and extend drm_atomic_ioctl args just like we have done for out fences. However the FENCE_FD property is easier to handle in userspace than the array. Any other idea?
Imo FENCE_FD is perfectly fine. But what's the concern around giving people the wrong idea with attaching fences to planes? For nonblocking commits we need to store them somewhere for the worker, drm_plane_state seems like an as good place as any other.
It gives the impression that each plane might flip as soon as its fence signals.
On Tue, Apr 26, 2016 at 07:26:21PM +0300, Ville Syrjälä wrote:
On Tue, Apr 26, 2016 at 04:36:36PM +0200, Daniel Vetter wrote:
On Tue, Apr 26, 2016 at 11:14:22AM -0300, Gustavo Padovan wrote:
2016-04-26 Ville Syrjälä ville.syrjala@linux.intel.com:
On Mon, Apr 25, 2016 at 07:33:25PM -0300, Gustavo Padovan wrote:
From: Gustavo Padovan gustavo.padovan@collabora.co.uk
There is now a new property called FENCE_FD attached to every plane state that receives the sync_file fd from userspace via the atomic commit IOCTL.
I still don't like this property abuse. Also with atomic, all passed fences must be waited upon before anything is done, so attaching them to planes seems like it might just give people the wrong idea.
I'm actually fine with this as property, but another solutions is use an array of {plane, fence_fd} and extend drm_atomic_ioctl args just like we have done for out fences. However the FENCE_FD property is easier to handle in userspace than the array. Any other idea?
Imo FENCE_FD is perfectly fine. But what's the concern around giving people the wrong idea with attaching fences to planes? For nonblocking commits we need to store them somewhere for the worker, drm_plane_state seems like an as good place as any other.
It gives the impression that each plane might flip as soon as its fence signals.
That wouldn't be atomic. Not sure how someone could come up with that idea. I mean we could move FENCE_FD to the crtc (fence fds can be merged), but that's just a needless difference to what hwc expects. I think aligning with the only real-world users in this case here makes sense.
Plus docs in case someone has funny ideas. -Daniel
On Tue, Apr 26, 2016 at 07:20:49PM +0200, Daniel Vetter wrote:
On Tue, Apr 26, 2016 at 07:26:21PM +0300, Ville Syrjälä wrote:
On Tue, Apr 26, 2016 at 04:36:36PM +0200, Daniel Vetter wrote:
On Tue, Apr 26, 2016 at 11:14:22AM -0300, Gustavo Padovan wrote:
2016-04-26 Ville Syrjälä ville.syrjala@linux.intel.com:
On Mon, Apr 25, 2016 at 07:33:25PM -0300, Gustavo Padovan wrote:
From: Gustavo Padovan gustavo.padovan@collabora.co.uk
There is now a new property called FENCE_FD attached to every plane state that receives the sync_file fd from userspace via the atomic commit IOCTL.
I still don't like this property abuse. Also with atomic, all passed fences must be waited upon before anything is done, so attaching them to planes seems like it might just give people the wrong idea.
I'm actually fine with this as property, but another solutions is use an array of {plane, fence_fd} and extend drm_atomic_ioctl args just like we have done for out fences. However the FENCE_FD property is easier to handle in userspace than the array. Any other idea?
Imo FENCE_FD is perfectly fine. But what's the concern around giving people the wrong idea with attaching fences to planes? For nonblocking commits we need to store them somewhere for the worker, drm_plane_state seems like an as good place as any other.
It gives the impression that each plane might flip as soon as its fence signals.
That wouldn't be atomic. Not sure how someone could come up with that idea.
What else would it mean? It's attached to a specific plane, so why would it affect other planes?
I mean we could move FENCE_FD to the crtc (fence fds can be merged), but that's just a needless difference to what hwc expects. I think aligning with the only real-world users in this case here makes sense.
Well it doesn't belong on the crtc either. I would just stick in the ioctl as a separate thing, then it's clear it's related to the whole operation rather than any kms object.
Plus docs in case someone has funny ideas.
Weren't you just quoting rusty's API manifesto recently? ;)
Maybe it was someone else.
On Tue, Apr 26, 2016 at 08:40:45PM +0300, Ville Syrjälä wrote:
On Tue, Apr 26, 2016 at 07:20:49PM +0200, Daniel Vetter wrote:
On Tue, Apr 26, 2016 at 07:26:21PM +0300, Ville Syrjälä wrote:
On Tue, Apr 26, 2016 at 04:36:36PM +0200, Daniel Vetter wrote:
On Tue, Apr 26, 2016 at 11:14:22AM -0300, Gustavo Padovan wrote:
2016-04-26 Ville Syrjälä ville.syrjala@linux.intel.com:
On Mon, Apr 25, 2016 at 07:33:25PM -0300, Gustavo Padovan wrote: > From: Gustavo Padovan gustavo.padovan@collabora.co.uk > > There is now a new property called FENCE_FD attached to every plane > state that receives the sync_file fd from userspace via the atomic commit > IOCTL.
I still don't like this property abuse. Also with atomic, all passed fences must be waited upon before anything is done, so attaching them to planes seems like it might just give people the wrong idea.
I'm actually fine with this as property, but another solutions is use an array of {plane, fence_fd} and extend drm_atomic_ioctl args just like we have done for out fences. However the FENCE_FD property is easier to handle in userspace than the array. Any other idea?
Imo FENCE_FD is perfectly fine. But what's the concern around giving people the wrong idea with attaching fences to planes? For nonblocking commits we need to store them somewhere for the worker, drm_plane_state seems like an as good place as any other.
It gives the impression that each plane might flip as soon as its fence signals.
That wouldn't be atomic. Not sure how someone could come up with that idea.
What else would it mean? It's attached to a specific plane, so why would it affect other planes?
I mean we could move FENCE_FD to the crtc (fence fds can be merged), but that's just a needless difference to what hwc expects. I think aligning with the only real-world users in this case here makes sense.
Well it doesn't belong on the crtc either. I would just stick in the ioctl as a separate thing, then it's clear it's related to the whole operation rather than any kms object.
We want it per-crtc I'd say, so that you could flip each crtc individually. But really the reason for per-plane is hw composer from Android. I don't see any point in designing an api that's needlessly different from what the main user expects (even if it may be silly).
The other bit is that for implicit syncing you need one fence per fb/plane anyway, so this also fits nicely on the driver side I think.
Plus docs in case someone has funny ideas.
Weren't you just quoting rusty's API manifesto recently? ;)
I quote it all the time.
http://sweng.the-davies.net/Home/rustys-api-design-manifesto
I think current interface is scoring pretty high since as part of Gustavo's work there's no also a new atomic helper which will get the waiting right. There's still the problem that neither for drm_event nor the fence do we have anything idiot-proof. So for that the solution is testcases (which are also happening). -Daniel
On Tue, Apr 26, 2016 at 08:23:46PM +0200, Daniel Vetter wrote:
On Tue, Apr 26, 2016 at 08:40:45PM +0300, Ville Syrjälä wrote:
On Tue, Apr 26, 2016 at 07:20:49PM +0200, Daniel Vetter wrote:
On Tue, Apr 26, 2016 at 07:26:21PM +0300, Ville Syrjälä wrote:
On Tue, Apr 26, 2016 at 04:36:36PM +0200, Daniel Vetter wrote:
On Tue, Apr 26, 2016 at 11:14:22AM -0300, Gustavo Padovan wrote:
2016-04-26 Ville Syrjälä ville.syrjala@linux.intel.com:
> On Mon, Apr 25, 2016 at 07:33:25PM -0300, Gustavo Padovan wrote: > > From: Gustavo Padovan gustavo.padovan@collabora.co.uk > > > > There is now a new property called FENCE_FD attached to every plane > > state that receives the sync_file fd from userspace via the atomic commit > > IOCTL. > > I still don't like this property abuse. Also with atomic, all passed > fences must be waited upon before anything is done, so attaching them > to planes seems like it might just give people the wrong idea.
I'm actually fine with this as property, but another solutions is use an array of {plane, fence_fd} and extend drm_atomic_ioctl args just like we have done for out fences. However the FENCE_FD property is easier to handle in userspace than the array. Any other idea?
Imo FENCE_FD is perfectly fine. But what's the concern around giving people the wrong idea with attaching fences to planes? For nonblocking commits we need to store them somewhere for the worker, drm_plane_state seems like an as good place as any other.
It gives the impression that each plane might flip as soon as its fence signals.
That wouldn't be atomic. Not sure how someone could come up with that idea.
What else would it mean? It's attached to a specific plane, so why would it affect other planes?
I mean we could move FENCE_FD to the crtc (fence fds can be merged), but that's just a needless difference to what hwc expects. I think aligning with the only real-world users in this case here makes sense.
Well it doesn't belong on the crtc either. I would just stick in the ioctl as a separate thing, then it's clear it's related to the whole operation rather than any kms object.
We want it per-crtc I'd say, so that you could flip each crtc individually.
Then you could just issue multiple ioctls. For eg. those nasty 4k MST display (or just otherwise neatly synced displayes) you want to wait for all the fences upfront and the flip everything at once, otherwise you'll get nasty tears at the seams.
But really the reason for per-plane is hw composer from Android. I don't see any point in designing an api that's needlessly different from what the main user expects (even if it may be silly).
What are they doing that can't stuff the fences into an array instead of props?
The other bit is that for implicit syncing you need one fence per fb/plane anyway, so this also fits nicely on the driver side I think.
Plus docs in case someone has funny ideas.
Weren't you just quoting rusty's API manifesto recently? ;)
I quote it all the time.
http://sweng.the-davies.net/Home/rustys-api-design-manifesto
I think current interface is scoring pretty high since as part of Gustavo's work there's no also a new atomic helper which will get the waiting right. There's still the problem that neither for drm_event nor the fence do we have anything idiot-proof. So for that the solution is testcases (which are also happening).
-Daniel
Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
On Tue, Apr 26, 2016 at 09:55:06PM +0300, Ville Syrjälä wrote:
On Tue, Apr 26, 2016 at 08:23:46PM +0200, Daniel Vetter wrote:
On Tue, Apr 26, 2016 at 08:40:45PM +0300, Ville Syrjälä wrote:
On Tue, Apr 26, 2016 at 07:20:49PM +0200, Daniel Vetter wrote:
On Tue, Apr 26, 2016 at 07:26:21PM +0300, Ville Syrjälä wrote:
On Tue, Apr 26, 2016 at 04:36:36PM +0200, Daniel Vetter wrote:
On Tue, Apr 26, 2016 at 11:14:22AM -0300, Gustavo Padovan wrote: > 2016-04-26 Ville Syrjälä ville.syrjala@linux.intel.com: > > > On Mon, Apr 25, 2016 at 07:33:25PM -0300, Gustavo Padovan wrote: > > > From: Gustavo Padovan gustavo.padovan@collabora.co.uk > > > > > > There is now a new property called FENCE_FD attached to every plane > > > state that receives the sync_file fd from userspace via the atomic commit > > > IOCTL. > > > > I still don't like this property abuse. Also with atomic, all passed > > fences must be waited upon before anything is done, so attaching them > > to planes seems like it might just give people the wrong idea. > > I'm actually fine with this as property, but another solutions is use > an array of {plane, fence_fd} and extend drm_atomic_ioctl args just like > we have done for out fences. However the FENCE_FD property is easier to > handle in userspace than the array. Any other idea?
Imo FENCE_FD is perfectly fine. But what's the concern around giving people the wrong idea with attaching fences to planes? For nonblocking commits we need to store them somewhere for the worker, drm_plane_state seems like an as good place as any other.
It gives the impression that each plane might flip as soon as its fence signals.
That wouldn't be atomic. Not sure how someone could come up with that idea.
What else would it mean? It's attached to a specific plane, so why would it affect other planes?
I mean we could move FENCE_FD to the crtc (fence fds can be merged), but that's just a needless difference to what hwc expects. I think aligning with the only real-world users in this case here makes sense.
Well it doesn't belong on the crtc either. I would just stick in the ioctl as a separate thing, then it's clear it's related to the whole operation rather than any kms object.
We want it per-crtc I'd say, so that you could flip each crtc individually.
Then you could just issue multiple ioctls. For eg. those nasty 4k MST display (or just otherwise neatly synced displayes) you want to wait for all the fences upfront and the flip everything at once, otherwise you'll get nasty tears at the seams.
But really the reason for per-plane is hw composer from Android. I don't see any point in designing an api that's needlessly different from what the main user expects (even if it may be silly).
What are they doing that can't stuff the fences into an array instead of props?
The hw composer interface is one in-fence per plane. That's really the major reason why the kernel interface is built to match. And I really don't think we should diverge just because we have a slight different color preference ;-)
As long as you end up with a pile of fences somehow it'll work. -Daniel
On 04/26/2016 01:05 PM, Daniel Vetter wrote:
On Tue, Apr 26, 2016 at 09:55:06PM +0300, Ville Syrjälä wrote:
On Tue, Apr 26, 2016 at 08:23:46PM +0200, Daniel Vetter wrote:
On Tue, Apr 26, 2016 at 08:40:45PM +0300, Ville Syrjälä wrote: But really the reason for per-plane is hw composer from Android. I don't see any point in designing an api that's needlessly different from what the main user expects (even if it may be silly).
What are they doing that can't stuff the fences into an array instead of props?
The hw composer interface is one in-fence per plane. That's really the major reason why the kernel interface is built to match. And I really don't think we should diverge just because we have a slight different color preference ;-)
As long as you end up with a pile of fences somehow it'll work. -Daniel
The relationship between layers and fences is only fuzzy and indirect though. The relationship is really between the buffer you're displaying on that layer, and the fence representing the work done to render into that buffer. SurfaceFlinger just happens to bundle them together inside the same struct hwc_layer_1 as an API convenience.
Which is kind of splitting hairs as long as you have a 1-to-1 relationship between layers and DRM planes. But that's not always the case.
A (per-CRTC?) array of fences would be more flexible. And even in the cases where you could make a 1-to-1 mapping between planes and fences, it's not that much more work for userspace to assemble those fences into an array anyway.
On Tue, Apr 26, 2016 at 01:48:02PM -0700, Greg Hackmann wrote:
On 04/26/2016 01:05 PM, Daniel Vetter wrote:
On Tue, Apr 26, 2016 at 09:55:06PM +0300, Ville Syrjälä wrote:
On Tue, Apr 26, 2016 at 08:23:46PM +0200, Daniel Vetter wrote:
On Tue, Apr 26, 2016 at 08:40:45PM +0300, Ville Syrjälä wrote: But really the reason for per-plane is hw composer from Android. I don't see any point in designing an api that's needlessly different from what the main user expects (even if it may be silly).
What are they doing that can't stuff the fences into an array instead of props?
The hw composer interface is one in-fence per plane. That's really the major reason why the kernel interface is built to match. And I really don't think we should diverge just because we have a slight different color preference ;-)
As long as you end up with a pile of fences somehow it'll work. -Daniel
The relationship between layers and fences is only fuzzy and indirect though. The relationship is really between the buffer you're displaying on that layer, and the fence representing the work done to render into that buffer. SurfaceFlinger just happens to bundle them together inside the same struct hwc_layer_1 as an API convenience.
Which is kind of splitting hairs as long as you have a 1-to-1 relationship between layers and DRM planes. But that's not always the case.
A (per-CRTC?) array of fences would be more flexible. And even in the cases where you could make a 1-to-1 mapping between planes and fences, it's not that much more work for userspace to assemble those fences into an array anyway.
I'm ok with an array too if that's what you folks prefer (it's meant to be used by you after all). I just don't want just 1 fence for the entire op, forcing userspace to first merge them all together. That seems silly.
One side-effect of that is that we'd also have to rework all the internal bits and move fences around in atomic. Which means change a pile of drivers. Not sure that's worth it, but I'd be ok either way really. -Daniel
On Wed, Apr 27, 2016 at 2:39 AM, Daniel Vetter daniel@ffwll.ch wrote:
On Tue, Apr 26, 2016 at 01:48:02PM -0700, Greg Hackmann wrote:
On 04/26/2016 01:05 PM, Daniel Vetter wrote:
On Tue, Apr 26, 2016 at 09:55:06PM +0300, Ville Syrjälä wrote:
On Tue, Apr 26, 2016 at 08:23:46PM +0200, Daniel Vetter wrote:
On Tue, Apr 26, 2016 at 08:40:45PM +0300, Ville Syrjälä wrote: But really the reason for per-plane is hw composer from Android. I don't see any point in designing an api that's needlessly different from what the main user expects (even if it may be silly).
What are they doing that can't stuff the fences into an array instead of props?
The hw composer interface is one in-fence per plane. That's really the major reason why the kernel interface is built to match. And I really don't think we should diverge just because we have a slight different color preference ;-)
As long as you end up with a pile of fences somehow it'll work. -Daniel
The relationship between layers and fences is only fuzzy and indirect though. The relationship is really between the buffer you're displaying on that layer, and the fence representing the work done to render into that buffer. SurfaceFlinger just happens to bundle them together inside the same struct hwc_layer_1 as an API convenience.
Which is kind of splitting hairs as long as you have a 1-to-1 relationship between layers and DRM planes. But that's not always the case.
A (per-CRTC?) array of fences would be more flexible. And even in the cases where you could make a 1-to-1 mapping between planes and fences, it's not that much more work for userspace to assemble those fences into an array anyway.
I'm ok with an array too if that's what you folks prefer (it's meant to be used by you after all). I just don't want just 1 fence for the entire op, forcing userspace to first merge them all together. That seems silly.
I was kinda more a fan of array too, if for no other reason that to be consistent w/ how out-fences work. (And using property just for in-fence seemed slightly weird/abusive to me)
One side-effect of that is that we'd also have to rework all the internal bits and move fences around in atomic. Which means change a pile of drivers. Not sure that's worth it, but I'd be ok either way really.
hmm, well we could keep the array per-plane (and if one layer is using multiple planes, just list the same fd multiple times).. then it mostly comes down to changes in the ioctl fxn itself.
BR, -R
-Daniel
Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Hi,
On 28 April 2016 at 23:28, Rob Clark robdclark@gmail.com wrote:
On Wed, Apr 27, 2016 at 2:39 AM, Daniel Vetter daniel@ffwll.ch wrote:
On Tue, Apr 26, 2016 at 01:48:02PM -0700, Greg Hackmann wrote:
A (per-CRTC?) array of fences would be more flexible. And even in the cases where you could make a 1-to-1 mapping between planes and fences, it's not that much more work for userspace to assemble those fences into an array anyway.
I'm ok with an array too if that's what you folks prefer (it's meant to be used by you after all). I just don't want just 1 fence for the entire op, forcing userspace to first merge them all together. That seems silly.
I was kinda more a fan of array too, if for no other reason that to be consistent w/ how out-fences work. (And using property just for in-fence seemed slightly weird/abusive to me)
I don't think it's really useful to look for much consistency between the two, beyond the name. I'm more concerned with consistency between in-fences and the implicit fences on buffers/FBs, and between out-fences and the page_flip_events.
One side-effect of that is that we'd also have to rework all the internal bits and move fences around in atomic. Which means change a pile of drivers. Not sure that's worth it, but I'd be ok either way really.
hmm, well we could keep the array per-plane (and if one layer is using multiple planes, just list the same fd multiple times).. then it mostly comes down to changes in the ioctl fxn itself.
... and new API in libdrm, which is going to be a serious #ifdef and distribution pain. The core property API has been available since 2.4.62 last June, but for this we'd have to write the code, wait for the kernel code, wait for HWC, get everything together, and then merge and release. That gives minimum one year of libdrm releases which have had atomic but not in-fence API support, if we're adding a new array. And I just don't really see what it buys us, apart from the need for the core atomic_get_property helper to statically return -1 when requesting FENCE_FD.
Cheers, Daniel
On Fri, Apr 29, 2016 at 3:48 AM, Daniel Stone daniel@fooishbar.org wrote:
Hi,
On 28 April 2016 at 23:28, Rob Clark robdclark@gmail.com wrote:
On Wed, Apr 27, 2016 at 2:39 AM, Daniel Vetter daniel@ffwll.ch wrote:
On Tue, Apr 26, 2016 at 01:48:02PM -0700, Greg Hackmann wrote:
A (per-CRTC?) array of fences would be more flexible. And even in the cases where you could make a 1-to-1 mapping between planes and fences, it's not that much more work for userspace to assemble those fences into an array anyway.
I'm ok with an array too if that's what you folks prefer (it's meant to be used by you after all). I just don't want just 1 fence for the entire op, forcing userspace to first merge them all together. That seems silly.
I was kinda more a fan of array too, if for no other reason that to be consistent w/ how out-fences work. (And using property just for in-fence seemed slightly weird/abusive to me)
I don't think it's really useful to look for much consistency between the two, beyond the name. I'm more concerned with consistency between in-fences and the implicit fences on buffers/FBs, and between out-fences and the page_flip_events.
One side-effect of that is that we'd also have to rework all the internal bits and move fences around in atomic. Which means change a pile of drivers. Not sure that's worth it, but I'd be ok either way really.
hmm, well we could keep the array per-plane (and if one layer is using multiple planes, just list the same fd multiple times).. then it mostly comes down to changes in the ioctl fxn itself.
... and new API in libdrm, which is going to be a serious #ifdef and distribution pain. The core property API has been available since 2.4.62 last June, but for this we'd have to write the code, wait for the kernel code, wait for HWC, get everything together, and then merge and release. That gives minimum one year of libdrm releases which have had atomic but not in-fence API support, if we're adding a new array. And I just don't really see what it buys us, apart from the need for the core atomic_get_property helper to statically return -1 when requesting FENCE_FD.
don't we have the same issue for out-fences anyway?
ofc, I suspect we could handle making fences look like properties in userspace in libdrm (at least if there was a sane way that libdrm could track and eventually close() old out-fence fd's). I'm not entirely sure this matters, I mean how do we make implicit vs explicit fencing transparent to the compositor and the proto between compositor<->app?
Admittedly I haven't given *too* much thought yet about the implications to libdrm and it's users, but it seems like we need to make a v2 API rev anyway for out-fences, and the compositor is going to need different codepaths for explicit vs implicit (if it supports both). So I don't think in-fences as something other than property really costs us anything additional?
(Unless there is some sane reason to have an intermediate state w/ in-fences but pageflip events instead of out-fences? But that seems odd..)
BR, -R
Cheers, Daniel
Gustavo, you added fence_put() in __drm_atomic_helper_plane_destroy_state(), shouldn't we also add a corresponding fence_get() in __drm_atomic_helper_plane_duplicate_state() ?
On Fri, Apr 29, 2016 at 3:23 PM, Rob Clark robdclark@gmail.com wrote:
On Fri, Apr 29, 2016 at 3:48 AM, Daniel Stone daniel@fooishbar.org wrote:
Hi,
On 28 April 2016 at 23:28, Rob Clark robdclark@gmail.com wrote:
On Wed, Apr 27, 2016 at 2:39 AM, Daniel Vetter daniel@ffwll.ch wrote:
On Tue, Apr 26, 2016 at 01:48:02PM -0700, Greg Hackmann wrote:
A (per-CRTC?) array of fences would be more flexible. And even in
the cases
where you could make a 1-to-1 mapping between planes and fences, it's
not
that much more work for userspace to assemble those fences into an
array
anyway.
I'm ok with an array too if that's what you folks prefer (it's meant
to be
used by you after all). I just don't want just 1 fence for the entire
op,
forcing userspace to first merge them all together. That seems silly.
I was kinda more a fan of array too, if for no other reason that to be consistent w/ how out-fences work. (And using property just for in-fence seemed slightly weird/abusive to me)
I don't think it's really useful to look for much consistency between the two, beyond the name. I'm more concerned with consistency between in-fences and the implicit fences on buffers/FBs, and between out-fences and the page_flip_events.
One side-effect of that is that we'd also have to rework all the
internal
bits and move fences around in atomic. Which means change a pile of drivers. Not sure that's worth it, but I'd be ok either way really.
hmm, well we could keep the array per-plane (and if one layer is using multiple planes, just list the same fd multiple times).. then it mostly comes down to changes in the ioctl fxn itself.
... and new API in libdrm, which is going to be a serious #ifdef and distribution pain. The core property API has been available since 2.4.62 last June, but for this we'd have to write the code, wait for the kernel code, wait for HWC, get everything together, and then merge and release. That gives minimum one year of libdrm releases which have had atomic but not in-fence API support, if we're adding a new array. And I just don't really see what it buys us, apart from the need for the core atomic_get_property helper to statically return -1 when requesting FENCE_FD.
don't we have the same issue for out-fences anyway?
ofc, I suspect we could handle making fences look like properties in userspace in libdrm (at least if there was a sane way that libdrm could track and eventually close() old out-fence fd's). I'm not entirely sure this matters, I mean how do we make implicit vs explicit fencing transparent to the compositor and the proto between compositor<->app?
Admittedly I haven't given *too* much thought yet about the implications to libdrm and it's users, but it seems like we need to make a v2 API rev anyway for out-fences, and the compositor is going to need different codepaths for explicit vs implicit (if it supports both). So I don't think in-fences as something other than property really costs us anything additional?
(Unless there is some sane reason to have an intermediate state w/ in-fences but pageflip events instead of out-fences? But that seems odd..)
BR, -R
Cheers, Daniel
dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
2016-07-12 Dominik Behr dbehr@chromium.org:
Gustavo, you added fence_put() in __drm_atomic_helper_plane_destroy_state(), shouldn't we also add a corresponding fence_get() in __drm_atomic_helper_plane_duplicate_state() ?
Yes, my new version of the patches that already have an extra fence_get()
https://git.kernel.org/cgit/linux/kernel/git/padovan/linux.git/log/?h=fences
Gustavo
On 04/26/2016 11:39 PM, Daniel Vetter wrote:
A (per-CRTC?) array of fences would be more flexible. And even in the cases where you could make a 1-to-1 mapping between planes and fences, it's not that much more work for userspace to assemble those fences into an array anyway.
I'm ok with an array too if that's what you folks prefer (it's meant to be used by you after all). I just don't want just 1 fence for the entire op, forcing userspace to first merge them all together. That seems silly.
One side-effect of that is that we'd also have to rework all the internal bits and move fences around in atomic. Which means change a pile of drivers. Not sure that's worth it, but I'd be ok either way really. -Daniel
It's not a strong preference on my end. The 1:1 plane-to-layer mapping breaks down somewhat on hardware where you need to split large hwcomposer layers across multiple DRM planes.
That said, you can force that case to fit by just dup()ing the fence a bunch of times or arbitrarily picking one of the planes to assign the fence to. Either is kludgey, but I can't argue it's kludgey enough to justify refactoring a bunch of existing driver code.
Hi,
On 26 April 2016 at 21:48, Greg Hackmann ghackmann@google.com wrote:
On 04/26/2016 01:05 PM, Daniel Vetter wrote:
On Tue, Apr 26, 2016 at 09:55:06PM +0300, Ville Syrjälä wrote:
What are they doing that can't stuff the fences into an array instead of props?
The hw composer interface is one in-fence per plane. That's really the major reason why the kernel interface is built to match. And I really don't think we should diverge just because we have a slight different color preference ;-)
The relationship between layers and fences is only fuzzy and indirect though. The relationship is really between the buffer you're displaying on that layer, and the fence representing the work done to render into that buffer. SurfaceFlinger just happens to bundle them together inside the same struct hwc_layer_1 as an API convenience.
Right, and when using implicit fencing, this comes as a plane property, by virtue of plane -> fb -> buffer -> fence.
Which is kind of splitting hairs as long as you have a 1-to-1 relationship between layers and DRM planes. But that's not always the case.
Can you please elaborate?
A (per-CRTC?) array of fences would be more flexible. And even in the cases where you could make a 1-to-1 mapping between planes and fences, it's not that much more work for userspace to assemble those fences into an array anyway.
As Ville says, I don't want to go down the path of scheduling CRTC updates separately, because that breaks MST pretty badly. If you don't want your updates to display atomically, then don't schedule them atomically ... ? That's the only reason I can see for making fencing per-CRTC, rather than just a pile of unassociated fences appended to the request. Per-CRTC fences also forces userspace to merge fences before submission when using multiple planes per CRTC, which is pretty punitive.
I think having it semantically attached to the plane is a little bit nicer for tracing (why was this request delayed? -> a fence -> which buffer was that fence for?) at a glance. Also the 'pile of appended fences' model is a bit awkward for more generic userspace, which creates a libdrm request and builds it (add a plane, try it out, wind back) incrementally. Using properties makes that really easy, but without properties, we'd have to add separate codepaths - and thus separate ABI, which complicates distribution - to libdrm to account for a separate plane array which shares a cursor with the properties. So for that reason if none other, I'd really prefer not to go down that route.
Cheers, Daniel
2016-04-27 Daniel Stone daniel@fooishbar.org:
Hi,
On 26 April 2016 at 21:48, Greg Hackmann ghackmann@google.com wrote:
On 04/26/2016 01:05 PM, Daniel Vetter wrote:
On Tue, Apr 26, 2016 at 09:55:06PM +0300, Ville Syrjälä wrote:
What are they doing that can't stuff the fences into an array instead of props?
The hw composer interface is one in-fence per plane. That's really the major reason why the kernel interface is built to match. And I really don't think we should diverge just because we have a slight different color preference ;-)
The relationship between layers and fences is only fuzzy and indirect though. The relationship is really between the buffer you're displaying on that layer, and the fence representing the work done to render into that buffer. SurfaceFlinger just happens to bundle them together inside the same struct hwc_layer_1 as an API convenience.
Right, and when using implicit fencing, this comes as a plane property, by virtue of plane -> fb -> buffer -> fence.
Which is kind of splitting hairs as long as you have a 1-to-1 relationship between layers and DRM planes. But that's not always the case.
Can you please elaborate?
A (per-CRTC?) array of fences would be more flexible. And even in the cases where you could make a 1-to-1 mapping between planes and fences, it's not that much more work for userspace to assemble those fences into an array anyway.
As Ville says, I don't want to go down the path of scheduling CRTC updates separately, because that breaks MST pretty badly. If you don't want your updates to display atomically, then don't schedule them atomically ... ? That's the only reason I can see for making fencing per-CRTC, rather than just a pile of unassociated fences appended to the request. Per-CRTC fences also forces userspace to merge fences before submission when using multiple planes per CRTC, which is pretty punitive.
I think having it semantically attached to the plane is a little bit nicer for tracing (why was this request delayed? -> a fence -> which buffer was that fence for?) at a glance. Also the 'pile of appended fences' model is a bit awkward for more generic userspace, which creates a libdrm request and builds it (add a plane, try it out, wind back) incrementally. Using properties makes that really easy, but without properties, we'd have to add separate codepaths - and thus separate ABI, which complicates distribution - to libdrm to account for a separate plane array which shares a cursor with the properties. So for that reason if none other, I'd really prefer not to go down that route.
I also agree to have it as FENCE_FD prop on the plane. Summarizing the arguments on this thread, they are:
- implicit fences also needs one fence per plane/fb, so it will be good to match with that. - requires userspace to always merge fences - can use standard plane properties, making kernel and userspace life easier, an array brings more work to build the atomic request plus extra checkings on the kernel. - do not need to changes to drivers - better for tracing, can identify the buffer/fence promptly
Gustavo
On Thu, Apr 28, 2016 at 11:36:44AM -0300, Gustavo Padovan wrote:
2016-04-27 Daniel Stone daniel@fooishbar.org:
Hi,
On 26 April 2016 at 21:48, Greg Hackmann ghackmann@google.com wrote:
On 04/26/2016 01:05 PM, Daniel Vetter wrote:
On Tue, Apr 26, 2016 at 09:55:06PM +0300, Ville Syrjälä wrote:
What are they doing that can't stuff the fences into an array instead of props?
The hw composer interface is one in-fence per plane. That's really the major reason why the kernel interface is built to match. And I really don't think we should diverge just because we have a slight different color preference ;-)
The relationship between layers and fences is only fuzzy and indirect though. The relationship is really between the buffer you're displaying on that layer, and the fence representing the work done to render into that buffer. SurfaceFlinger just happens to bundle them together inside the same struct hwc_layer_1 as an API convenience.
Right, and when using implicit fencing, this comes as a plane property, by virtue of plane -> fb -> buffer -> fence.
Which is kind of splitting hairs as long as you have a 1-to-1 relationship between layers and DRM planes. But that's not always the case.
Can you please elaborate?
A (per-CRTC?) array of fences would be more flexible. And even in the cases where you could make a 1-to-1 mapping between planes and fences, it's not that much more work for userspace to assemble those fences into an array anyway.
As Ville says, I don't want to go down the path of scheduling CRTC updates separately, because that breaks MST pretty badly. If you don't want your updates to display atomically, then don't schedule them atomically ... ? That's the only reason I can see for making fencing per-CRTC, rather than just a pile of unassociated fences appended to the request. Per-CRTC fences also forces userspace to merge fences before submission when using multiple planes per CRTC, which is pretty punitive.
I think having it semantically attached to the plane is a little bit nicer for tracing (why was this request delayed? -> a fence -> which buffer was that fence for?) at a glance. Also the 'pile of appended fences' model is a bit awkward for more generic userspace, which creates a libdrm request and builds it (add a plane, try it out, wind back) incrementally. Using properties makes that really easy, but without properties, we'd have to add separate codepaths - and thus separate ABI, which complicates distribution - to libdrm to account for a separate plane array which shares a cursor with the properties. So for that reason if none other, I'd really prefer not to go down that route.
I also agree to have it as FENCE_FD prop on the plane. Summarizing the arguments on this thread, they are:
- implicit fences also needs one fence per plane/fb, so it will be good to match with that.
- requires userspace to always merge fences
"does not require" I presume?
- can use standard plane properties, making kernel and userspace life easier, an array brings more work to build the atomic request plus extra checkings on the kernel.
- do not need to changes to drivers
- better for tracing, can identify the buffer/fence promptly
- Fits in well with the libdrm atomic rollback support - no need to manage fences separately when incrementally building an atomic commit.
Gustavo _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
On Thu, Apr 28, 2016 at 11:36:44AM -0300, Gustavo Padovan wrote:
2016-04-27 Daniel Stone daniel@fooishbar.org:
Hi,
On 26 April 2016 at 21:48, Greg Hackmann ghackmann@google.com wrote:
On 04/26/2016 01:05 PM, Daniel Vetter wrote:
On Tue, Apr 26, 2016 at 09:55:06PM +0300, Ville Syrjälä wrote:
What are they doing that can't stuff the fences into an array instead of props?
The hw composer interface is one in-fence per plane. That's really the major reason why the kernel interface is built to match. And I really don't think we should diverge just because we have a slight different color preference ;-)
The relationship between layers and fences is only fuzzy and indirect though. The relationship is really between the buffer you're displaying on that layer, and the fence representing the work done to render into that buffer. SurfaceFlinger just happens to bundle them together inside the same struct hwc_layer_1 as an API convenience.
Right, and when using implicit fencing, this comes as a plane property, by virtue of plane -> fb -> buffer -> fence.
Which is kind of splitting hairs as long as you have a 1-to-1 relationship between layers and DRM planes. But that's not always the case.
Can you please elaborate?
A (per-CRTC?) array of fences would be more flexible. And even in the cases where you could make a 1-to-1 mapping between planes and fences, it's not that much more work for userspace to assemble those fences into an array anyway.
As Ville says, I don't want to go down the path of scheduling CRTC updates separately, because that breaks MST pretty badly. If you don't want your updates to display atomically, then don't schedule them atomically ... ? That's the only reason I can see for making fencing per-CRTC, rather than just a pile of unassociated fences appended to the request. Per-CRTC fences also forces userspace to merge fences before submission when using multiple planes per CRTC, which is pretty punitive.
I think having it semantically attached to the plane is a little bit nicer for tracing (why was this request delayed? -> a fence -> which buffer was that fence for?) at a glance. Also the 'pile of appended fences' model is a bit awkward for more generic userspace, which creates a libdrm request and builds it (add a plane, try it out, wind back) incrementally. Using properties makes that really easy, but without properties, we'd have to add separate codepaths - and thus separate ABI, which complicates distribution - to libdrm to account for a separate plane array which shares a cursor with the properties. So for that reason if none other, I'd really prefer not to go down that route.
I also agree to have it as FENCE_FD prop on the plane. Summarizing the arguments on this thread, they are:
Your "summary" forgot to include any counter arguments.
- implicit fences also needs one fence per plane/fb, so it will be good to match with that.
We would actually need a fence per object rather than per fb.
- requires userspace to always merge fences
"doesn't?" but that's not true if it's an array. It would be true if you had just one fence for the whole thing, or one per crtc.
- can use standard plane properties, making kernel and userspace life easier, an array brings more work to build the atomic request plus extra checkings on the kernel.
I don't really get this one. The objects and props are arrays too. Why is another array so problematic?
- do not need to changes to drivers
- better for tracing, can identify the buffer/fence promptly
Can fences be reused somehow while still attached to a plane, or ever? That might cause some oddness if you, say, leave a fence attached to one plane and then do a modeset on another crtc perhaps which needs to turn the first crtc off+on to reconfigure something.
On Thu, Apr 28, 2016 at 6:56 PM, Ville Syrjälä ville.syrjala@linux.intel.com wrote:
- better for tracing, can identify the buffer/fence promptly
Can fences be reused somehow while still attached to a plane, or ever? That might cause some oddness if you, say, leave a fence attached to one plane and then do a modeset on another crtc perhaps which needs to turn the first crtc off+on to reconfigure something.
Fences auto-disappear of course and don't stick around when you duplicate the drm_plane_state again. I still don't really get the real concerns though ... In the end it's purely a transport question, and both ABI ideas work out semantically exactly the same in the end. It's just that at least in my opinion FENCE_FD prop is a lot more convenient. -Daniel
On Thu, Apr 28, 2016 at 07:43:16PM +0200, Daniel Vetter wrote:
On Thu, Apr 28, 2016 at 6:56 PM, Ville Syrjälä ville.syrjala@linux.intel.com wrote:
- better for tracing, can identify the buffer/fence promptly
Can fences be reused somehow while still attached to a plane, or ever? That might cause some oddness if you, say, leave a fence attached to one plane and then do a modeset on another crtc perhaps which needs to turn the first crtc off+on to reconfigure something.
Fences auto-disappear of course and don't stick around when you duplicate the drm_plane_state again. I still don't really get the real concerns though ...
Properties that magically change values shouldn't exist IMO. I guess if you could have write-only properties or something it migth be sensible?
In the end it's purely a transport question, and both ABI ideas work out semantically exactly the same in the end. It's just that at least in my opinion FENCE_FD prop is a lot more convenient.
-Daniel
Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch
2016-04-28 Ville Syrjälä ville.syrjala@linux.intel.com:
On Thu, Apr 28, 2016 at 07:43:16PM +0200, Daniel Vetter wrote:
On Thu, Apr 28, 2016 at 6:56 PM, Ville Syrjälä ville.syrjala@linux.intel.com wrote:
- better for tracing, can identify the buffer/fence promptly
Can fences be reused somehow while still attached to a plane, or ever? That might cause some oddness if you, say, leave a fence attached to one plane and then do a modeset on another crtc perhaps which needs to turn the first crtc off+on to reconfigure something.
Fences auto-disappear of course and don't stick around when you duplicate the drm_plane_state again. I still don't really get the real concerns though ...
Properties that magically change values shouldn't exist IMO. I guess if you could have write-only properties or something it migth be sensible?
We can just not return FENCE_FD on get_props, that would make it write-only.
Gustavo
On Thu, Apr 28, 2016 at 7:55 PM, Gustavo Padovan gustavo.padovan@collabora.co.uk wrote:
2016-04-28 Ville Syrjälä ville.syrjala@linux.intel.com:
On Thu, Apr 28, 2016 at 07:43:16PM +0200, Daniel Vetter wrote:
On Thu, Apr 28, 2016 at 6:56 PM, Ville Syrjälä ville.syrjala@linux.intel.com wrote:
- better for tracing, can identify the buffer/fence promptly
Can fences be reused somehow while still attached to a plane, or ever? That might cause some oddness if you, say, leave a fence attached to one plane and then do a modeset on another crtc perhaps which needs to turn the first crtc off+on to reconfigure something.
Fences auto-disappear of course and don't stick around when you duplicate the drm_plane_state again. I still don't really get the real concerns though ...
Properties that magically change values shouldn't exist IMO. I guess if you could have write-only properties or something it migth be sensible?
We can just not return FENCE_FD on get_props, that would make it write-only.
We do actually return a value for get_props, but it's -1 which for fds means "no fd". That's to make sure userspace can save&restore any prop without causing harm. -Daniel
On Thu, Apr 28, 2016 at 07:56:19PM +0300, Ville Syrjälä wrote:
On Thu, Apr 28, 2016 at 11:36:44AM -0300, Gustavo Padovan wrote:
2016-04-27 Daniel Stone daniel@fooishbar.org:
Hi,
On 26 April 2016 at 21:48, Greg Hackmann ghackmann@google.com wrote:
On 04/26/2016 01:05 PM, Daniel Vetter wrote:
On Tue, Apr 26, 2016 at 09:55:06PM +0300, Ville Syrjälä wrote:
What are they doing that can't stuff the fences into an array instead of props?
The hw composer interface is one in-fence per plane. That's really the major reason why the kernel interface is built to match. And I really don't think we should diverge just because we have a slight different color preference ;-)
The relationship between layers and fences is only fuzzy and indirect though. The relationship is really between the buffer you're displaying on that layer, and the fence representing the work done to render into that buffer. SurfaceFlinger just happens to bundle them together inside the same struct hwc_layer_1 as an API convenience.
Right, and when using implicit fencing, this comes as a plane property, by virtue of plane -> fb -> buffer -> fence.
Which is kind of splitting hairs as long as you have a 1-to-1 relationship between layers and DRM planes. But that's not always the case.
Can you please elaborate?
A (per-CRTC?) array of fences would be more flexible. And even in the cases where you could make a 1-to-1 mapping between planes and fences, it's not that much more work for userspace to assemble those fences into an array anyway.
As Ville says, I don't want to go down the path of scheduling CRTC updates separately, because that breaks MST pretty badly. If you don't want your updates to display atomically, then don't schedule them atomically ... ? That's the only reason I can see for making fencing per-CRTC, rather than just a pile of unassociated fences appended to the request. Per-CRTC fences also forces userspace to merge fences before submission when using multiple planes per CRTC, which is pretty punitive.
I think having it semantically attached to the plane is a little bit nicer for tracing (why was this request delayed? -> a fence -> which buffer was that fence for?) at a glance. Also the 'pile of appended fences' model is a bit awkward for more generic userspace, which creates a libdrm request and builds it (add a plane, try it out, wind back) incrementally. Using properties makes that really easy, but without properties, we'd have to add separate codepaths - and thus separate ABI, which complicates distribution - to libdrm to account for a separate plane array which shares a cursor with the properties. So for that reason if none other, I'd really prefer not to go down that route.
I also agree to have it as FENCE_FD prop on the plane. Summarizing the arguments on this thread, they are:
Your "summary" forgot to include any counter arguments.
- implicit fences also needs one fence per plane/fb, so it will be good to match with that.
We would actually need a fence per object rather than per fb.
I guess you could overcome this by automagically creating a merged fence for a multi-obj fb?
On Thu, Apr 28, 2016 at 8:17 PM, Ville Syrjälä ville.syrjala@linux.intel.com wrote:
- implicit fences also needs one fence per plane/fb, so it will be good to match with that.
We would actually need a fence per object rather than per fb.
I guess you could overcome this by automagically creating a merged fence for a multi-obj fb?
Yeah, and the android hwc does this for you already. You get passed a surface (or whatever it's called exactly) plus a fence, and the surface contains the gralloc/native buffer thing, which would contain multiple dma-buf handles if your hw does planar stuff that way.
I think everyone else who wants explicit fencing will go with the same or similar model, so it's just about implicit fencing. And there we can easily construct the fence_collection from a drm_framebuffer ourselves with a small helper in each driver (or shared one in cma, although cma doesn't yet grok reserverations/implicitly attached fences). -Daniel
On Tue, Apr 26, 2016 at 11:14:22AM -0300, Gustavo Padovan wrote:
2016-04-26 Ville Syrjälä ville.syrjala@linux.intel.com:
On Mon, Apr 25, 2016 at 07:33:25PM -0300, Gustavo Padovan wrote:
From: Gustavo Padovan gustavo.padovan@collabora.co.uk
There is now a new property called FENCE_FD attached to every plane state that receives the sync_file fd from userspace via the atomic commit IOCTL.
I still don't like this property abuse. Also with atomic, all passed fences must be waited upon before anything is done, so attaching them to planes seems like it might just give people the wrong idea.
I'm actually fine with this as property, but another solutions is use an array of {plane, fence_fd} and extend drm_atomic_ioctl args just like we have done for out fences.
Why do you want to associate these with planes?
However the FENCE_FD property is easier to handle in userspace than the array. Any other idea?
Gustavo
From: Gustavo Padovan gustavo.padovan@collabora.co.uk
Now a drm_pending_event can either send a real drm_event or signal a fence, or both. It allow us to signal via fences when the buffer is displayed on the screen. Which in turn means that the previous buffer is not in use anymore and can be freed or sent back to another driver for processing.
v2: Comments from Daniel Vetter - call fence_signal in drm_send_event_locked() - remove unneeded !e->event check
Signed-off-by: Gustavo Padovan gustavo.padovan@collabora.co.uk --- drivers/gpu/drm/drm_atomic.c | 19 +++++++++++++------ drivers/gpu/drm/drm_fops.c | 8 +++++++- include/drm/drmP.h | 2 ++ 3 files changed, 22 insertions(+), 7 deletions(-)
diff --git a/drivers/gpu/drm/drm_atomic.c b/drivers/gpu/drm/drm_atomic.c index 13674c7..5f9d434 100644 --- a/drivers/gpu/drm/drm_atomic.c +++ b/drivers/gpu/drm/drm_atomic.c @@ -1437,7 +1437,8 @@ EXPORT_SYMBOL(drm_atomic_async_commit); */
static struct drm_pending_vblank_event *create_vblank_event( - struct drm_device *dev, struct drm_file *file_priv, uint64_t user_data) + struct drm_device *dev, struct drm_file *file_priv, + struct fence *fence, uint64_t user_data) { struct drm_pending_vblank_event *e = NULL; int ret; @@ -1450,12 +1451,17 @@ static struct drm_pending_vblank_event *create_vblank_event( e->event.base.length = sizeof(e->event); e->event.user_data = user_data;
- ret = drm_event_reserve_init(dev, file_priv, &e->base, &e->event.base); - if (ret) { - kfree(e); - return NULL; + if (file_priv) { + ret = drm_event_reserve_init(dev, file_priv, &e->base, + &e->event.base); + if (ret) { + kfree(e); + return NULL; + } }
+ e->base.fence = fence; + return e; }
@@ -1682,7 +1688,8 @@ retry: for_each_crtc_in_state(state, crtc, crtc_state, i) { struct drm_pending_vblank_event *e;
- e = create_vblank_event(dev, file_priv, arg->user_data); + e = create_vblank_event(dev, file_priv, NULL, + arg->user_data); if (!e) { ret = -ENOMEM; goto out; diff --git a/drivers/gpu/drm/drm_fops.c b/drivers/gpu/drm/drm_fops.c index aeef58e..7872635 100644 --- a/drivers/gpu/drm/drm_fops.c +++ b/drivers/gpu/drm/drm_fops.c @@ -801,8 +801,14 @@ void drm_send_event_locked(struct drm_device *dev, struct drm_pending_event *e) { assert_spin_locked(&dev->event_lock);
+ if (e->fence) { + fence_signal(e->fence); + fence_put(e->fence); + } + if (!e->file_priv) { - e->destroy(e); + if (e->destroy) + e->destroy(e); return; }
diff --git a/include/drm/drmP.h b/include/drm/drmP.h index 3c8422c..2ae6d7e 100644 --- a/include/drm/drmP.h +++ b/include/drm/drmP.h @@ -56,6 +56,7 @@ #include <linux/types.h> #include <linux/vmalloc.h> #include <linux/workqueue.h> +#include <linux/fence.h>
#include <asm/mman.h> #include <asm/pgalloc.h> @@ -282,6 +283,7 @@ struct drm_ioctl_desc { /* Event queued up for userspace to read */ struct drm_pending_event { struct drm_event *event; + struct fence *fence; struct list_head link; struct list_head pending_link; struct drm_file *file_priv;
From: Gustavo Padovan gustavo.padovan@collabora.co.uk
Create one timeline context for each CRTC to be able to handle out-fences and signal them. It adds a few members to struct drm_crtc: fence_context, where we store the context we get from fence_context_alloc(), the fence seqno and the fence lock, that we pass in fence_init() to be used by the fence.
Signed-off-by: Gustavo Padovan gustavo.padovan@collabora.co.uk --- drivers/gpu/drm/drm_crtc.c | 29 +++++++++++++++++++++++++++++ include/drm/drm_crtc.h | 19 +++++++++++++++++++ 2 files changed, 48 insertions(+)
diff --git a/drivers/gpu/drm/drm_crtc.c b/drivers/gpu/drm/drm_crtc.c index 65212ce..cf9750a 100644 --- a/drivers/gpu/drm/drm_crtc.c +++ b/drivers/gpu/drm/drm_crtc.c @@ -659,6 +659,32 @@ static unsigned int drm_num_crtcs(struct drm_device *dev) return num; }
+static const char *drm_crtc_fence_get_driver_name(struct fence *fence) +{ + struct drm_crtc *crtc = fence_to_crtc(fence); + + return crtc->dev->driver->name; +} + +static const char *drm_crtc_fence_get_timeline_name(struct fence *fence) +{ + struct drm_crtc *crtc = fence_to_crtc(fence); + + return crtc->name; +} + +static bool drm_crtc_fence_enable_signaling(struct fence *fence) +{ + return true; +} + +const struct fence_ops drm_crtc_fence_ops = { + .get_driver_name = drm_crtc_fence_get_driver_name, + .get_timeline_name = drm_crtc_fence_get_timeline_name, + .enable_signaling = drm_crtc_fence_enable_signaling, + .wait = fence_default_wait, +}; + /** * drm_crtc_init_with_planes - Initialise a new CRTC object with * specified primary and cursor planes. @@ -709,6 +735,9 @@ int drm_crtc_init_with_planes(struct drm_device *dev, struct drm_crtc *crtc, return -ENOMEM; }
+ crtc->fence_context = fence_context_alloc(1); + spin_lock_init(&crtc->fence_lock); + crtc->base.properties = &crtc->properties;
list_add_tail(&crtc->head, &config->crtc_list); diff --git a/include/drm/drm_crtc.h b/include/drm/drm_crtc.h index 5ba3cda..d8c32c8 100644 --- a/include/drm/drm_crtc.h +++ b/include/drm/drm_crtc.h @@ -32,6 +32,7 @@ #include <linux/fb.h> #include <linux/hdmi.h> #include <linux/media-bus-format.h> +#include <linux/fence.h> #include <uapi/drm/drm_mode.h> #include <uapi/drm/drm_fourcc.h> #include <drm/drm_modeset_lock.h> @@ -715,6 +716,9 @@ struct drm_crtc_funcs { * @helper_private: mid-layer private data * @properties: property tracking for this CRTC * @state: current atomic state for this CRTC + * @fence_context: context for fence signalling + * @fence_lock: fence lock for the fence context + * @fence_seqno: seqno variable to create fences * @acquire_ctx: per-CRTC implicit acquire context used by atomic drivers for * legacy IOCTLs * @@ -771,6 +775,11 @@ struct drm_crtc {
struct drm_crtc_state *state;
+ /* fence timelines info for DRM out-fences */ + unsigned int fence_context; + spinlock_t fence_lock; + unsigned long fence_seqno; + /* * For legacy crtc IOCTLs so that atomic drivers can get at the locking * acquire context. @@ -778,6 +787,16 @@ struct drm_crtc { struct drm_modeset_acquire_ctx *acquire_ctx; };
+extern const struct fence_ops drm_crtc_fence_ops; + +static inline struct drm_crtc *fence_to_crtc(struct fence *fence) +{ + if (fence->ops != &drm_crtc_fence_ops) + return NULL; + + return container_of(fence->lock, struct drm_crtc, fence_lock); +} + /** * struct drm_connector_state - mutable connector state * @connector: backpointer to the connector
On Mon, Apr 25, 2016 at 07:33:27PM -0300, Gustavo Padovan wrote:
From: Gustavo Padovan gustavo.padovan@collabora.co.uk
Create one timeline context for each CRTC to be able to handle out-fences and signal them. It adds a few members to struct drm_crtc: fence_context, where we store the context we get from fence_context_alloc(), the fence seqno and the fence lock, that we pass in fence_init() to be used by the fence.
Signed-off-by: Gustavo Padovan gustavo.padovan@collabora.co.uk
drivers/gpu/drm/drm_crtc.c | 29 +++++++++++++++++++++++++++++ include/drm/drm_crtc.h | 19 +++++++++++++++++++ 2 files changed, 48 insertions(+)
diff --git a/drivers/gpu/drm/drm_crtc.c b/drivers/gpu/drm/drm_crtc.c index 65212ce..cf9750a 100644 --- a/drivers/gpu/drm/drm_crtc.c +++ b/drivers/gpu/drm/drm_crtc.c @@ -659,6 +659,32 @@ static unsigned int drm_num_crtcs(struct drm_device *dev) return num; }
+static const char *drm_crtc_fence_get_driver_name(struct fence *fence) +{
- struct drm_crtc *crtc = fence_to_crtc(fence);
- return crtc->dev->driver->name;
+}
+static const char *drm_crtc_fence_get_timeline_name(struct fence *fence) +{
- struct drm_crtc *crtc = fence_to_crtc(fence);
- return crtc->name;
+}
Is that exported to userspace? crtc->name is an internal thing, not meant for outside consumption.
+static bool drm_crtc_fence_enable_signaling(struct fence *fence) +{
- return true;
+}
+const struct fence_ops drm_crtc_fence_ops = {
- .get_driver_name = drm_crtc_fence_get_driver_name,
- .get_timeline_name = drm_crtc_fence_get_timeline_name,
- .enable_signaling = drm_crtc_fence_enable_signaling,
- .wait = fence_default_wait,
+};
/**
- drm_crtc_init_with_planes - Initialise a new CRTC object with
- specified primary and cursor planes.
@@ -709,6 +735,9 @@ int drm_crtc_init_with_planes(struct drm_device *dev, struct drm_crtc *crtc, return -ENOMEM; }
crtc->fence_context = fence_context_alloc(1);
spin_lock_init(&crtc->fence_lock);
crtc->base.properties = &crtc->properties;
list_add_tail(&crtc->head, &config->crtc_list);
diff --git a/include/drm/drm_crtc.h b/include/drm/drm_crtc.h index 5ba3cda..d8c32c8 100644 --- a/include/drm/drm_crtc.h +++ b/include/drm/drm_crtc.h @@ -32,6 +32,7 @@ #include <linux/fb.h> #include <linux/hdmi.h> #include <linux/media-bus-format.h> +#include <linux/fence.h> #include <uapi/drm/drm_mode.h> #include <uapi/drm/drm_fourcc.h> #include <drm/drm_modeset_lock.h> @@ -715,6 +716,9 @@ struct drm_crtc_funcs {
- @helper_private: mid-layer private data
- @properties: property tracking for this CRTC
- @state: current atomic state for this CRTC
- @fence_context: context for fence signalling
- @fence_lock: fence lock for the fence context
- @fence_seqno: seqno variable to create fences
- @acquire_ctx: per-CRTC implicit acquire context used by atomic drivers for
- legacy IOCTLs
@@ -771,6 +775,11 @@ struct drm_crtc {
struct drm_crtc_state *state;
- /* fence timelines info for DRM out-fences */
- unsigned int fence_context;
- spinlock_t fence_lock;
- unsigned long fence_seqno;
- /*
- For legacy crtc IOCTLs so that atomic drivers can get at the locking
- acquire context.
@@ -778,6 +787,16 @@ struct drm_crtc { struct drm_modeset_acquire_ctx *acquire_ctx; };
+extern const struct fence_ops drm_crtc_fence_ops;
+static inline struct drm_crtc *fence_to_crtc(struct fence *fence) +{
- if (fence->ops != &drm_crtc_fence_ops)
return NULL;
- return container_of(fence->lock, struct drm_crtc, fence_lock);
+}
/**
- struct drm_connector_state - mutable connector state
- @connector: backpointer to the connector
-- 2.5.5
dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
2016-04-26 Ville Syrjälä ville.syrjala@linux.intel.com:
On Mon, Apr 25, 2016 at 07:33:27PM -0300, Gustavo Padovan wrote:
From: Gustavo Padovan gustavo.padovan@collabora.co.uk
Create one timeline context for each CRTC to be able to handle out-fences and signal them. It adds a few members to struct drm_crtc: fence_context, where we store the context we get from fence_context_alloc(), the fence seqno and the fence lock, that we pass in fence_init() to be used by the fence.
Signed-off-by: Gustavo Padovan gustavo.padovan@collabora.co.uk
drivers/gpu/drm/drm_crtc.c | 29 +++++++++++++++++++++++++++++ include/drm/drm_crtc.h | 19 +++++++++++++++++++ 2 files changed, 48 insertions(+)
diff --git a/drivers/gpu/drm/drm_crtc.c b/drivers/gpu/drm/drm_crtc.c index 65212ce..cf9750a 100644 --- a/drivers/gpu/drm/drm_crtc.c +++ b/drivers/gpu/drm/drm_crtc.c @@ -659,6 +659,32 @@ static unsigned int drm_num_crtcs(struct drm_device *dev) return num; }
+static const char *drm_crtc_fence_get_driver_name(struct fence *fence) +{
- struct drm_crtc *crtc = fence_to_crtc(fence);
- return crtc->dev->driver->name;
+}
+static const char *drm_crtc_fence_get_timeline_name(struct fence *fence) +{
- struct drm_crtc *crtc = fence_to_crtc(fence);
- return crtc->name;
+}
Is that exported to userspace? crtc->name is an internal thing, not meant for outside consumption.
No. However it may be exported via debugfs at some point. Maybe have drm_crtc->timeline_name which has the obj_id instead, eg., "drm_crtc19" ?
Gustavo
On Tue, Apr 26, 2016 at 11:23:06AM -0300, Gustavo Padovan wrote:
2016-04-26 Ville Syrjälä ville.syrjala@linux.intel.com:
On Mon, Apr 25, 2016 at 07:33:27PM -0300, Gustavo Padovan wrote:
From: Gustavo Padovan gustavo.padovan@collabora.co.uk
Create one timeline context for each CRTC to be able to handle out-fences and signal them. It adds a few members to struct drm_crtc: fence_context, where we store the context we get from fence_context_alloc(), the fence seqno and the fence lock, that we pass in fence_init() to be used by the fence.
Signed-off-by: Gustavo Padovan gustavo.padovan@collabora.co.uk
drivers/gpu/drm/drm_crtc.c | 29 +++++++++++++++++++++++++++++ include/drm/drm_crtc.h | 19 +++++++++++++++++++ 2 files changed, 48 insertions(+)
diff --git a/drivers/gpu/drm/drm_crtc.c b/drivers/gpu/drm/drm_crtc.c index 65212ce..cf9750a 100644 --- a/drivers/gpu/drm/drm_crtc.c +++ b/drivers/gpu/drm/drm_crtc.c @@ -659,6 +659,32 @@ static unsigned int drm_num_crtcs(struct drm_device *dev) return num; }
+static const char *drm_crtc_fence_get_driver_name(struct fence *fence) +{
- struct drm_crtc *crtc = fence_to_crtc(fence);
- return crtc->dev->driver->name;
+}
+static const char *drm_crtc_fence_get_timeline_name(struct fence *fence) +{
- struct drm_crtc *crtc = fence_to_crtc(fence);
- return crtc->name;
+}
Is that exported to userspace? crtc->name is an internal thing, not meant for outside consumption.
No. However it may be exported via debugfs at some point. Maybe have drm_crtc->timeline_name which has the obj_id instead, eg., "drm_crtc19" ?
I'm fine either way if it's an internal thing. So pick whichever makes people's life easier I guess.
Hi,
On 26 April 2016 at 00:33, Gustavo Padovan gustavo@padovan.org wrote:
+static inline struct drm_crtc *fence_to_crtc(struct fence *fence) +{
if (fence->ops != &drm_crtc_fence_ops)
return NULL;
Since this is (currently) only used before unconditional dereferences, maybe turn this into a BUG_ON instead of return NULL?
Cheers, Daniel
From: Gustavo Padovan gustavo.padovan@collabora.co.uk
Support DRM out-fences creating a sync_file with a fence for each crtc update with the DRM_MODE_ATOMIC_OUT_FENCE flag.
We then send an struct drm_out_fences array with the out-fences fds back in the drm_atomic_ioctl() as an out arg in the out_fences_ptr field.
struct drm_out_fences { __u32 crtc_id; __u32 fd; };
v2: Comment by Rob Clark: - Squash commit that adds DRM_MODE_ATOMIC_OUT_FENCE flag here.
Comment by Daniel Vetter: - Add clean up code for out_fences
Signed-off-by: Gustavo Padovan gustavo.padovan@collabora.co.uk --- drivers/gpu/drm/drm_atomic.c | 163 +++++++++++++++++++++++++++++++++++++++++-- include/drm/drm_crtc.h | 10 +++ include/uapi/drm/drm_mode.h | 11 ++- 3 files changed, 179 insertions(+), 5 deletions(-)
diff --git a/drivers/gpu/drm/drm_atomic.c b/drivers/gpu/drm/drm_atomic.c index 5f9d434..06c6007 100644 --- a/drivers/gpu/drm/drm_atomic.c +++ b/drivers/gpu/drm/drm_atomic.c @@ -1566,6 +1566,133 @@ void drm_atomic_clean_old_fb(struct drm_device *dev, } EXPORT_SYMBOL(drm_atomic_clean_old_fb);
+static struct drm_out_fence_state *get_out_fence(struct drm_device *dev, + struct drm_atomic_state *state, + uint32_t __user *out_fences_ptr, + uint64_t count_out_fences, + uint64_t user_data) +{ + struct drm_crtc *crtc; + struct drm_crtc_state *crtc_state; + struct drm_out_fences *out_fences; + struct drm_out_fence_state *fence_state; + int num_fences = 0; + int i, ret; + + if (count_out_fences > dev->mode_config.num_crtc) + return ERR_PTR(-EINVAL); + + out_fences = kcalloc(count_out_fences, sizeof(*out_fences), + GFP_KERNEL); + if (!out_fences) + return ERR_PTR(-ENOMEM); + + fence_state = kcalloc(count_out_fences, sizeof(*fence_state), + GFP_KERNEL); + if (!fence_state) { + kfree(out_fences); + return ERR_PTR(-ENOMEM); + } + + for (i = 0 ; i < count_out_fences ; i++) + fence_state[i].fd = -1; + + for_each_crtc_in_state(state, crtc, crtc_state, i) { + struct drm_pending_vblank_event *e; + struct fence *fence; + char name[32]; + + fence = kzalloc(sizeof(*fence), GFP_KERNEL); + if (!fence) { + ret = -ENOMEM; + goto out; + } + + fence_init(fence, &drm_crtc_fence_ops, &crtc->fence_lock, + crtc->fence_context, crtc->fence_seqno); + + snprintf(name, sizeof(name), "crtc-%d_%lu", + drm_crtc_index(crtc), crtc->fence_seqno++); + + fence_state[i].fd = get_unused_fd_flags(O_CLOEXEC); + if (fence_state[i].fd < 0) { + fence_put(fence); + ret = fence_state[i].fd; + goto out; + } + + fence_state[i].sync_file = sync_file_create(name, fence); + if(!fence_state[i].sync_file) { + fence_put(fence); + ret = -ENOMEM; + goto out; + } + + if (crtc_state->event) { + crtc_state->event->base.fence = fence; + } else { + e = create_vblank_event(dev, NULL, fence, user_data); + if (!e) { + ret = -ENOMEM; + goto out; + } + + crtc_state->event = e; + } + + out_fences[num_fences].crtc_id = crtc->base.id; + out_fences[num_fences].fd = fence_state[i].fd; + num_fences++; + } + + if (copy_to_user(out_fences_ptr, out_fences, + num_fences * sizeof(*out_fences))) { + ret = -EFAULT; + goto out; + } + + kfree(out_fences); + + return fence_state; + +out: + for (i = 0 ; i < count_out_fences ; i++) { + if (fence_state[i].sync_file) + sync_file_put(fence_state[i].sync_file); + if (fence_state[i].fd >= 0) + put_unused_fd(fence_state[i].fd); + } + + kfree(fence_state); + kfree(out_fences); + + return ERR_PTR(ret); +} + +static void install_out_fence(uint64_t count_out_fences, + struct drm_out_fence_state *state) +{ + int i; + + for (i = 0 ; i < count_out_fences ; i++) { + if (state[i].sync_file) + sync_file_install(state[i].sync_file, state[i].fd); + } +} + +static void release_out_fence(uint64_t count_out_fences, + struct drm_out_fence_state *state) +{ + int i; + + for (i = 0 ; i < count_out_fences ; i++) { + if (state->sync_file) + sync_file_put(state->sync_file); + if (state->fd >= 0) + put_unused_fd(state->fd); + } +} + int drm_mode_atomic_ioctl(struct drm_device *dev, void *data, struct drm_file *file_priv) { @@ -1574,12 +1701,14 @@ int drm_mode_atomic_ioctl(struct drm_device *dev, uint32_t __user *count_props_ptr = (uint32_t __user *)(unsigned long)(arg->count_props_ptr); uint32_t __user *props_ptr = (uint32_t __user *)(unsigned long)(arg->props_ptr); uint64_t __user *prop_values_ptr = (uint64_t __user *)(unsigned long)(arg->prop_values_ptr); + uint32_t __user *out_fences_ptr = (uint32_t __user *)(unsigned long)(arg->out_fences_ptr); unsigned int copied_objs, copied_props; struct drm_atomic_state *state; struct drm_modeset_acquire_ctx ctx; struct drm_plane *plane; struct drm_crtc *crtc; struct drm_crtc_state *crtc_state; + struct drm_out_fence_state *fence_state = NULL; unsigned plane_mask; int ret = 0; unsigned int i, j; @@ -1605,9 +1734,13 @@ int drm_mode_atomic_ioctl(struct drm_device *dev, !dev->mode_config.async_page_flip) return -EINVAL;
+ if ((arg->flags & DRM_MODE_ATOMIC_OUT_FENCE) && !arg->count_out_fences) + return -EINVAL; + /* can't test and expect an event at the same time. */ if ((arg->flags & DRM_MODE_ATOMIC_TEST_ONLY) && - (arg->flags & DRM_MODE_PAGE_FLIP_EVENT)) + (arg->flags & (DRM_MODE_PAGE_FLIP_EVENT + | DRM_MODE_ATOMIC_OUT_FENCE))) return -EINVAL;
drm_modeset_acquire_init(&ctx, 0); @@ -1699,16 +1832,30 @@ retry: } }
+ if (arg->flags & DRM_MODE_ATOMIC_OUT_FENCE) { + fence_state = get_out_fence(dev, state, out_fences_ptr, + arg->count_out_fences, + arg->user_data); + if (IS_ERR(fence_state)) { + ret = PTR_ERR(fence_state); + goto out; + } + } + if (arg->flags & DRM_MODE_ATOMIC_TEST_ONLY) { /* * Unlike commit, check_only does not clean up state. * Below we call drm_atomic_state_free for it. */ ret = drm_atomic_check_only(state); - } else if (arg->flags & DRM_MODE_ATOMIC_NONBLOCK) { - ret = drm_atomic_async_commit(state); } else { - ret = drm_atomic_commit(state); + if (arg->flags & DRM_MODE_ATOMIC_OUT_FENCE) + install_out_fence(arg->count_out_fences, fence_state); + + if (arg->flags & DRM_MODE_ATOMIC_NONBLOCK) + ret = drm_atomic_async_commit(state); + else + ret = drm_atomic_commit(state); }
out: @@ -1729,6 +1876,14 @@ out: } }
+ if ((arg->flags & DRM_MODE_ATOMIC_OUT_FENCE) && + !IS_ERR_OR_NULL(fence_state)) { + if (ret) + release_out_fence(arg->count_out_fences, fence_state); + + kfree(fence_state); + } + if (ret == -EDEADLK) { drm_atomic_state_clear(state); drm_modeset_backoff(&ctx); diff --git a/include/drm/drm_crtc.h b/include/drm/drm_crtc.h index d8c32c8..5f7c272 100644 --- a/include/drm/drm_crtc.h +++ b/include/drm/drm_crtc.h @@ -798,6 +798,16 @@ static inline struct drm_crtc *fence_to_crtc(struct fence *fence) }
/** + * struct drm_out_fence_state - store out_fence data for clean up + * @sync_file: sync_file related to the out_fence + * @fd: file descriptor to installed on the sync_file. + */ +struct drm_out_fence_state { + struct sync_file *sync_file; + int fd; +}; + +/** * struct drm_connector_state - mutable connector state * @connector: backpointer to the connector * @crtc: CRTC to connect connector to, NULL if disabled diff --git a/include/uapi/drm/drm_mode.h b/include/uapi/drm/drm_mode.h index 7a7856e..4cdcd22 100644 --- a/include/uapi/drm/drm_mode.h +++ b/include/uapi/drm/drm_mode.h @@ -582,13 +582,20 @@ struct drm_mode_destroy_dumb { #define DRM_MODE_ATOMIC_TEST_ONLY 0x0100 #define DRM_MODE_ATOMIC_NONBLOCK 0x0200 #define DRM_MODE_ATOMIC_ALLOW_MODESET 0x0400 +#define DRM_MODE_ATOMIC_OUT_FENCE 0x0800
#define DRM_MODE_ATOMIC_FLAGS (\ DRM_MODE_PAGE_FLIP_EVENT |\ DRM_MODE_PAGE_FLIP_ASYNC |\ DRM_MODE_ATOMIC_TEST_ONLY |\ DRM_MODE_ATOMIC_NONBLOCK |\ - DRM_MODE_ATOMIC_ALLOW_MODESET) + DRM_MODE_ATOMIC_ALLOW_MODESET |\ + DRM_MODE_ATOMIC_OUT_FENCE) + +struct drm_out_fences { + __u32 crtc_id; + __u32 fd; +};
struct drm_mode_atomic { __u32 flags; @@ -599,6 +606,8 @@ struct drm_mode_atomic { __u64 prop_values_ptr; __u64 reserved; __u64 user_data; + __u64 count_out_fences; + __u64 out_fences_ptr; };
/**
On Mon, Apr 25, 2016 at 07:33:28PM -0300, Gustavo Padovan wrote:
From: Gustavo Padovan gustavo.padovan@collabora.co.uk
Support DRM out-fences creating a sync_file with a fence for each crtc update with the DRM_MODE_ATOMIC_OUT_FENCE flag.
We then send an struct drm_out_fences array with the out-fences fds back in the drm_atomic_ioctl() as an out arg in the out_fences_ptr field.
struct drm_out_fences { __u32 crtc_id; __u32 fd; };
v2: Comment by Rob Clark:
- Squash commit that adds DRM_MODE_ATOMIC_OUT_FENCE flag here.
Comment by Daniel Vetter:
- Add clean up code for out_fences
Signed-off-by: Gustavo Padovan gustavo.padovan@collabora.co.uk
drivers/gpu/drm/drm_atomic.c | 163 +++++++++++++++++++++++++++++++++++++++++-- include/drm/drm_crtc.h | 10 +++ include/uapi/drm/drm_mode.h | 11 ++- 3 files changed, 179 insertions(+), 5 deletions(-)
diff --git a/drivers/gpu/drm/drm_atomic.c b/drivers/gpu/drm/drm_atomic.c index 5f9d434..06c6007 100644 --- a/drivers/gpu/drm/drm_atomic.c +++ b/drivers/gpu/drm/drm_atomic.c @@ -1566,6 +1566,133 @@ void drm_atomic_clean_old_fb(struct drm_device *dev, } EXPORT_SYMBOL(drm_atomic_clean_old_fb);
+static struct drm_out_fence_state *get_out_fence(struct drm_device *dev,
struct drm_atomic_state *state,
uint32_t __user *out_fences_ptr,
uint64_t count_out_fences,
uint64_t user_data)
+{
- struct drm_crtc *crtc;
- struct drm_crtc_state *crtc_state;
- struct drm_out_fences *out_fences;
- struct drm_out_fence_state *fence_state;
- int num_fences = 0;
- int i, ret;
- if (count_out_fences > dev->mode_config.num_crtc)
return ERR_PTR(-EINVAL);
- out_fences = kcalloc(count_out_fences, sizeof(*out_fences),
GFP_KERNEL);
- if (!out_fences)
return ERR_PTR(-ENOMEM);
A bit tricky, but the above kcalloc is the only thing that catches integer overflows in count_out_fences. Needs a comment imo since this could be a security exploit if we accidentally screw it up.
Also needs a testcase imo.
- fence_state = kcalloc(count_out_fences, sizeof(*fence_state),
GFP_KERNEL);
- if (!fence_state) {
kfree(out_fences);
return ERR_PTR(-ENOMEM);
- }
- for (i = 0 ; i < count_out_fences ; i++)
fence_state[i].fd = -1;
- for_each_crtc_in_state(state, crtc, crtc_state, i) {
struct drm_pending_vblank_event *e;
struct fence *fence;
char name[32];
fence = kzalloc(sizeof(*fence), GFP_KERNEL);
if (!fence) {
ret = -ENOMEM;
goto out;
}
fence_init(fence, &drm_crtc_fence_ops, &crtc->fence_lock,
crtc->fence_context, crtc->fence_seqno);
snprintf(name, sizeof(name), "crtc-%d_%lu",
drm_crtc_index(crtc), crtc->fence_seqno++);
Hm ... fence_init_with_name? I'm kinda confused why we only name fences that are exported though, and why not all of them. Debugging fence deadlocks is real hard, so giving them all names might be a good idea.
Anyway, seems like more room for a bit more sync_file/struct fence merging.
fence_state[i].fd = get_unused_fd_flags(O_CLOEXEC);
if (fence_state[i].fd < 0) {
fence_put(fence);
ret = fence_state[i].fd;
goto out;
}
fence_state[i].sync_file = sync_file_create(name, fence);
if(!fence_state[i].sync_file) {
fence_put(fence);
ret = -ENOMEM;
goto out;
}
if (crtc_state->event) {
crtc_state->event->base.fence = fence;
} else {
This looks a bit funny - I'd change the create event logic to create an event either if we have the either drm event or out-fence flag set.
e = create_vblank_event(dev, NULL, fence, user_data);
if (!e) {
ret = -ENOMEM;
goto out;
}
crtc_state->event = e;
}
out_fences[num_fences].crtc_id = crtc->base.id;
out_fences[num_fences].fd = fence_state[i].fd;
num_fences++;
- }
- if (copy_to_user(out_fences_ptr, out_fences,
num_fences * sizeof(*out_fences))) {
ret = -EFAULT;
goto out;
- }
- kfree(out_fences);
- return fence_state;
+out:
- for (i = 0 ; i < count_out_fences ; i++) {
if (fence_state[i].sync_file)
sync_file_put(fence_state[i].sync_file);
if (fence_state[i].fd >= 0)
put_unused_fd(fence_state[i].fd);
- }
- kfree(fence_state);
- kfree(out_fences);
- return ERR_PTR(ret);
+}
+static void install_out_fence(uint64_t count_out_fences,
struct drm_out_fence_state *state)
+{
- int i;
- for (i = 0 ; i < count_out_fences ; i++) {
if (state[i].sync_file)
sync_file_install(state[i].sync_file, state[i].fd);
Is sync_file_install anything more than fd_install? Imo a wrapper for just that function is overkill and just hides stuff. I'd nuke it (another sync_file patch though). In dma-buf we also don't wrap it, we only have a convenience wrapper for users who want to combine the get_unused_flags+fd_install in one go. And maybe even that is silly.
Ok, I unlazied and it's indeed just a silly wrapper. Please nuke it.
- }
+}
+static void release_out_fence(uint64_t count_out_fences,
struct drm_out_fence_state *state)
+{
- int i;
- for (i = 0 ; i < count_out_fences ; i++) {
if (state->sync_file)
sync_file_put(state->sync_file);
if (state->fd >= 0)
put_unused_fd(state->fd);
- }
+}
int drm_mode_atomic_ioctl(struct drm_device *dev, void *data, struct drm_file *file_priv) { @@ -1574,12 +1701,14 @@ int drm_mode_atomic_ioctl(struct drm_device *dev, uint32_t __user *count_props_ptr = (uint32_t __user *)(unsigned long)(arg->count_props_ptr); uint32_t __user *props_ptr = (uint32_t __user *)(unsigned long)(arg->props_ptr); uint64_t __user *prop_values_ptr = (uint64_t __user *)(unsigned long)(arg->prop_values_ptr);
- uint32_t __user *out_fences_ptr = (uint32_t __user *)(unsigned long)(arg->out_fences_ptr); unsigned int copied_objs, copied_props; struct drm_atomic_state *state; struct drm_modeset_acquire_ctx ctx; struct drm_plane *plane; struct drm_crtc *crtc; struct drm_crtc_state *crtc_state;
- struct drm_out_fence_state *fence_state = NULL; unsigned plane_mask; int ret = 0; unsigned int i, j;
@@ -1605,9 +1734,13 @@ int drm_mode_atomic_ioctl(struct drm_device *dev, !dev->mode_config.async_page_flip) return -EINVAL;
- if ((arg->flags & DRM_MODE_ATOMIC_OUT_FENCE) && !arg->count_out_fences)
return -EINVAL;
We need testcases which check that arg->count_out_fences and arg->out_fences are 0 when the OUT_FENCE flag is not set.
Definitely needs an igt testcase for this invalid input case. Ofc we also need tests that give the kernel nonsens in count_out_fences and out_fences with the flag set.
- /* can't test and expect an event at the same time. */ if ((arg->flags & DRM_MODE_ATOMIC_TEST_ONLY) &&
(arg->flags & DRM_MODE_PAGE_FLIP_EVENT))
(arg->flags & (DRM_MODE_PAGE_FLIP_EVENT
| DRM_MODE_ATOMIC_OUT_FENCE)))
If you go with my suggestion above to create the event if either is set, maybe a DRM_MODE_ATOMIC_EVENT_MASK with both? Would read easier.
return -EINVAL;
drm_modeset_acquire_init(&ctx, 0); @@ -1699,16 +1832,30 @@ retry: } }
- if (arg->flags & DRM_MODE_ATOMIC_OUT_FENCE) {
fence_state = get_out_fence(dev, state, out_fences_ptr,
arg->count_out_fences,
arg->user_data);
if (IS_ERR(fence_state)) {
ret = PTR_ERR(fence_state);
goto out;
}
- }
- if (arg->flags & DRM_MODE_ATOMIC_TEST_ONLY) { /*
*/ ret = drm_atomic_check_only(state);
- Unlike commit, check_only does not clean up state.
- Below we call drm_atomic_state_free for it.
- } else if (arg->flags & DRM_MODE_ATOMIC_NONBLOCK) {
} else {ret = drm_atomic_async_commit(state);
ret = drm_atomic_commit(state);
if (arg->flags & DRM_MODE_ATOMIC_OUT_FENCE)
install_out_fence(arg->count_out_fences, fence_state);
if (arg->flags & DRM_MODE_ATOMIC_NONBLOCK)
ret = drm_atomic_async_commit(state);
else
}ret = drm_atomic_commit(state);
out: @@ -1729,6 +1876,14 @@ out: } }
- if ((arg->flags & DRM_MODE_ATOMIC_OUT_FENCE) &&
!IS_ERR_OR_NULL(fence_state)) {
if (ret)
release_out_fence(arg->count_out_fences, fence_state);
kfree(fence_state);
- }
- if (ret == -EDEADLK) { drm_atomic_state_clear(state); drm_modeset_backoff(&ctx);
diff --git a/include/drm/drm_crtc.h b/include/drm/drm_crtc.h index d8c32c8..5f7c272 100644 --- a/include/drm/drm_crtc.h +++ b/include/drm/drm_crtc.h @@ -798,6 +798,16 @@ static inline struct drm_crtc *fence_to_crtc(struct fence *fence) }
/**
- struct drm_out_fence_state - store out_fence data for clean up
- @sync_file: sync_file related to the out_fence
- @fd: file descriptor to installed on the sync_file.
- */
+struct drm_out_fence_state {
- struct sync_file *sync_file;
- int fd;
+};
+/**
- struct drm_connector_state - mutable connector state
- @connector: backpointer to the connector
- @crtc: CRTC to connect connector to, NULL if disabled
diff --git a/include/uapi/drm/drm_mode.h b/include/uapi/drm/drm_mode.h index 7a7856e..4cdcd22 100644 --- a/include/uapi/drm/drm_mode.h +++ b/include/uapi/drm/drm_mode.h @@ -582,13 +582,20 @@ struct drm_mode_destroy_dumb { #define DRM_MODE_ATOMIC_TEST_ONLY 0x0100 #define DRM_MODE_ATOMIC_NONBLOCK 0x0200 #define DRM_MODE_ATOMIC_ALLOW_MODESET 0x0400 +#define DRM_MODE_ATOMIC_OUT_FENCE 0x0800
#define DRM_MODE_ATOMIC_FLAGS (\ DRM_MODE_PAGE_FLIP_EVENT |\ DRM_MODE_PAGE_FLIP_ASYNC |\ DRM_MODE_ATOMIC_TEST_ONLY |\ DRM_MODE_ATOMIC_NONBLOCK |\
DRM_MODE_ATOMIC_ALLOW_MODESET)
DRM_MODE_ATOMIC_ALLOW_MODESET |\
DRM_MODE_ATOMIC_OUT_FENCE)
+struct drm_out_fences {
- __u32 crtc_id;
- __u32 fd;
+};
struct drm_mode_atomic { __u32 flags; @@ -599,6 +606,8 @@ struct drm_mode_atomic { __u64 prop_values_ptr; __u64 reserved; __u64 user_data;
- __u64 count_out_fences;
- __u64 out_fences_ptr;
};
/**
2.5.5
2016-04-26 Daniel Vetter daniel@ffwll.ch:
On Mon, Apr 25, 2016 at 07:33:28PM -0300, Gustavo Padovan wrote:
From: Gustavo Padovan gustavo.padovan@collabora.co.uk
Support DRM out-fences creating a sync_file with a fence for each crtc update with the DRM_MODE_ATOMIC_OUT_FENCE flag.
We then send an struct drm_out_fences array with the out-fences fds back in the drm_atomic_ioctl() as an out arg in the out_fences_ptr field.
struct drm_out_fences { __u32 crtc_id; __u32 fd; };
v2: Comment by Rob Clark:
- Squash commit that adds DRM_MODE_ATOMIC_OUT_FENCE flag here.
Comment by Daniel Vetter:
- Add clean up code for out_fences
Signed-off-by: Gustavo Padovan gustavo.padovan@collabora.co.uk
drivers/gpu/drm/drm_atomic.c | 163 +++++++++++++++++++++++++++++++++++++++++-- include/drm/drm_crtc.h | 10 +++ include/uapi/drm/drm_mode.h | 11 ++- 3 files changed, 179 insertions(+), 5 deletions(-)
diff --git a/drivers/gpu/drm/drm_atomic.c b/drivers/gpu/drm/drm_atomic.c index 5f9d434..06c6007 100644 --- a/drivers/gpu/drm/drm_atomic.c +++ b/drivers/gpu/drm/drm_atomic.c @@ -1566,6 +1566,133 @@ void drm_atomic_clean_old_fb(struct drm_device *dev, } EXPORT_SYMBOL(drm_atomic_clean_old_fb);
+static struct drm_out_fence_state *get_out_fence(struct drm_device *dev,
struct drm_atomic_state *state,
uint32_t __user *out_fences_ptr,
uint64_t count_out_fences,
uint64_t user_data)
+{
- struct drm_crtc *crtc;
- struct drm_crtc_state *crtc_state;
- struct drm_out_fences *out_fences;
- struct drm_out_fence_state *fence_state;
- int num_fences = 0;
- int i, ret;
- if (count_out_fences > dev->mode_config.num_crtc)
return ERR_PTR(-EINVAL);
- out_fences = kcalloc(count_out_fences, sizeof(*out_fences),
GFP_KERNEL);
- if (!out_fences)
return ERR_PTR(-ENOMEM);
A bit tricky, but the above kcalloc is the only thing that catches integer overflows in count_out_fences. Needs a comment imo since this could be a security exploit if we accidentally screw it up.
The check above makes sure that count_out_fences is not bigger than num_crtc. Don't that fix this?
Also needs a testcase imo.
- fence_state = kcalloc(count_out_fences, sizeof(*fence_state),
GFP_KERNEL);
- if (!fence_state) {
kfree(out_fences);
return ERR_PTR(-ENOMEM);
- }
- for (i = 0 ; i < count_out_fences ; i++)
fence_state[i].fd = -1;
- for_each_crtc_in_state(state, crtc, crtc_state, i) {
struct drm_pending_vblank_event *e;
struct fence *fence;
char name[32];
fence = kzalloc(sizeof(*fence), GFP_KERNEL);
if (!fence) {
ret = -ENOMEM;
goto out;
}
fence_init(fence, &drm_crtc_fence_ops, &crtc->fence_lock,
crtc->fence_context, crtc->fence_seqno);
snprintf(name, sizeof(name), "crtc-%d_%lu",
drm_crtc_index(crtc), crtc->fence_seqno++);
Hm ... fence_init_with_name? I'm kinda confused why we only name fences that are exported though, and why not all of them. Debugging fence deadlocks is real hard, so giving them all names might be a good idea.
Anyway, seems like more room for a bit more sync_file/struct fence merging.
We just removed name from sync_file_create() so snprintf() is not even necessary here anymore.
fence_state[i].fd = get_unused_fd_flags(O_CLOEXEC);
if (fence_state[i].fd < 0) {
fence_put(fence);
ret = fence_state[i].fd;
goto out;
}
fence_state[i].sync_file = sync_file_create(name, fence);
if(!fence_state[i].sync_file) {
fence_put(fence);
ret = -ENOMEM;
goto out;
}
if (crtc_state->event) {
crtc_state->event->base.fence = fence;
} else {
This looks a bit funny - I'd change the create event logic to create an event either if we have the either drm event or out-fence flag set.
Ok.
e = create_vblank_event(dev, NULL, fence, user_data);
if (!e) {
ret = -ENOMEM;
goto out;
}
crtc_state->event = e;
}
out_fences[num_fences].crtc_id = crtc->base.id;
out_fences[num_fences].fd = fence_state[i].fd;
num_fences++;
- }
- if (copy_to_user(out_fences_ptr, out_fences,
num_fences * sizeof(*out_fences))) {
ret = -EFAULT;
goto out;
- }
- kfree(out_fences);
- return fence_state;
+out:
- for (i = 0 ; i < count_out_fences ; i++) {
if (fence_state[i].sync_file)
sync_file_put(fence_state[i].sync_file);
if (fence_state[i].fd >= 0)
put_unused_fd(fence_state[i].fd);
- }
- kfree(fence_state);
- kfree(out_fences);
- return ERR_PTR(ret);
+}
+static void install_out_fence(uint64_t count_out_fences,
struct drm_out_fence_state *state)
+{
- int i;
- for (i = 0 ; i < count_out_fences ; i++) {
if (state[i].sync_file)
sync_file_install(state[i].sync_file, state[i].fd);
Is sync_file_install anything more than fd_install? Imo a wrapper for just that function is overkill and just hides stuff. I'd nuke it (another sync_file patch though). In dma-buf we also don't wrap it, we only have a convenience wrapper for users who want to combine the get_unused_flags+fd_install in one go. And maybe even that is silly.
Ok, I unlazied and it's indeed just a silly wrapper. Please nuke it.
already fixed in the sync file de-stage patches.
- }
+}
+static void release_out_fence(uint64_t count_out_fences,
struct drm_out_fence_state *state)
+{
- int i;
- for (i = 0 ; i < count_out_fences ; i++) {
if (state->sync_file)
sync_file_put(state->sync_file);
if (state->fd >= 0)
put_unused_fd(state->fd);
- }
+}
int drm_mode_atomic_ioctl(struct drm_device *dev, void *data, struct drm_file *file_priv) { @@ -1574,12 +1701,14 @@ int drm_mode_atomic_ioctl(struct drm_device *dev, uint32_t __user *count_props_ptr = (uint32_t __user *)(unsigned long)(arg->count_props_ptr); uint32_t __user *props_ptr = (uint32_t __user *)(unsigned long)(arg->props_ptr); uint64_t __user *prop_values_ptr = (uint64_t __user *)(unsigned long)(arg->prop_values_ptr);
- uint32_t __user *out_fences_ptr = (uint32_t __user *)(unsigned long)(arg->out_fences_ptr); unsigned int copied_objs, copied_props; struct drm_atomic_state *state; struct drm_modeset_acquire_ctx ctx; struct drm_plane *plane; struct drm_crtc *crtc; struct drm_crtc_state *crtc_state;
- struct drm_out_fence_state *fence_state = NULL; unsigned plane_mask; int ret = 0; unsigned int i, j;
@@ -1605,9 +1734,13 @@ int drm_mode_atomic_ioctl(struct drm_device *dev, !dev->mode_config.async_page_flip) return -EINVAL;
- if ((arg->flags & DRM_MODE_ATOMIC_OUT_FENCE) && !arg->count_out_fences)
return -EINVAL;
We need testcases which check that arg->count_out_fences and arg->out_fences are 0 when the OUT_FENCE flag is not set.
Definitely needs an igt testcase for this invalid input case. Ofc we also need tests that give the kernel nonsens in count_out_fences and out_fences with the flag set.
- /* can't test and expect an event at the same time. */ if ((arg->flags & DRM_MODE_ATOMIC_TEST_ONLY) &&
(arg->flags & DRM_MODE_PAGE_FLIP_EVENT))
(arg->flags & (DRM_MODE_PAGE_FLIP_EVENT
| DRM_MODE_ATOMIC_OUT_FENCE)))
If you go with my suggestion above to create the event if either is set, maybe a DRM_MODE_ATOMIC_EVENT_MASK with both? Would read easier.
Ok.
Gustavo
Hi
Out of interest will this allow tear free with PRIME?
Thanks
Mike
On Tue, 26 Apr 2016, 12:33 a.m. Gustavo Padovan, gustavo@padovan.org wrote:
From: Gustavo Padovan gustavo.padovan@collabora.co.uk
Hi,
Currently the Linux Kernel only have an implicit fencing mechanism where the fence are attached directly to buffers and userspace is unaware of what is happening. On the other hand explicit fencing which is not supported yet by Linux but it expose fences to the userspace to handle fencing between producer/consumer explicitely.
For that we use the Android Sync Framework[1], a explicit fencing mechanism that help the userspace handles fences directly. It has the concept of sync_file (called sync_fence in Android) that expose the driver's fences to userspace via file descriptors. File descriptors are useful because we can pass them around between process.
The Sync Framework is currently in the staging tree and on the process to be de-staged[2].
With explicit fencing we have a global mechanism that optimizes the flow of buffers between consumers and producers, avoid a lot of waiting. So instead of waiting for a buffer to be processed by the GPU before sending it to DRM in an Atomic IOCTL we can get a sync_file fd from the GPU driver at the moment we submit the buffer processing. The compositor then passes these fds to DRM in a atomic commit request, that will not be displayed until the fences signal, i.e, the GPU finished processing the buffer and it is ready to display. In DRM the fences we wait on before displaying a buffer are called in-fences.
Vice-versa, we have out-fences, to sychronize the return of buffers to GPU (producer) to be processed again. When DRM receives an atomic request with a special flag set it generates one fence per-crtc and attach it to a per-crtc sync_file. It then returns the array of sync_file fds to userspace as an atomic_ioctl out arg. With the fences available userspace can forward these fences to the GPU, where it will wait the fence to signal before starting to process on buffer again.
Explicit fencing with Sync Framework allows buffer suballocation. Userspace get a large buffer and divides it into small ones and submit requests to process them, each subbuffer gets and sync_file fd and can be processed in parallel. This is not even possible with implicit fencing.
While these are out-fences in DRM (the consumer) they become in-fences once they get to the GPU (the producer).
DRM explicit fences are opt-in, as the default will still be implicit fencing. To enable explicit in-fences one just need to pass a sync_file fd in the FENCE_FD plane property. *In-fences are per-plane*, i.e., per framebuffer.
For out-fences, just enabling DRM_MODE_ATOMIC_OUT_FENCE flag is enough. *Out-fences are per-crtc*.
In-fences
In the first discussions on #dri-devel on IRC we decided to hide the Sync Framework from DRM drivers to reduce complexity, so as soon we get the fd via FENCE_FD plane property we convert the sync_file fd to a struct fence. However a sync_file might contain more than one fence, so we created the fence_collection concept. struct fence_collection is a subclass of struct fence and stores a group of fences that needs to be waited together, in other words, all the fences in the sync_file.
Then we just use the already in place fence support to wait on those fences. Once the producer calls fence_signal() for all fences on wait we can proceed with the atomic commit and display the framebuffers. DRM drivers only needs to be converted to struct fence to make use of this feature.
Out-fences
Passing the DRM_MODE_ATOMIC_OUT_FENCE flag to an atomic request enables out-fences. The kernel then creates a fence, attach it to a sync_file and install this file on a unused fd for each crtc. Userspace get the fence back as an array of per-crtc sync_file fds.
DRM core use the already in place drm_event infrastructure to help signal fences, we've added a fence pointer to struct drm_pending_event. If the atomic update received requested an PAGE_FLIP_EVENT we just use the same drm_pending_event and set our fence there, otherwise we just create an event with a NULL file_priv to set our fence. On vblank we just call fence_signal() to signal that the buffer related to this fence is *now* on the screen. Note that this is exactly the opposite behaviour from Android, where the fences are signaled when they are not on display anymore, so free to be reused.
No changes are required to DRM drivers to have out-fences support, apart from atomic support of course.
Kernel tree
For those who want all patches on this RFC are in my tree. The tree includes all sync frameworks patches needed at the moment:
https://git.kernel.org/cgit/linux/kernel/git/padovan/linux.git/log/?h=fences
I also hacked some poor some fake fences support to modetest here:
https://git.collabora.com/cgit/user/padovan/libdrm.git/log/?h=atomic
v2 changes
The changes from RFC v1 to RFC v2 are in the patches description.
TODO
- upstream Android Sync ABI changes
- de-stage Android Sync Framework
- Upstream sync selftests (Emilio Lopez)
- create tests for DRM fences
Regards,
Gustavo
[1] https://source.android.com/devices/graphics/implement.html#vsync [2] https://git.kernel.org/cgit/linux/kernel/git/padovan/linux.git/log/?h=sync
Gustavo Padovan (8): dma-buf/fence: add fence_collection fences Documentation: add fence-collection to kernel DocBook dma-buf/sync_file: add sync_file_fences_get() drm/fence: allow fence waiting to be interrupted by userspace drm/fence: add in-fences support drm/fence: add fence to drm_pending_event drm/fence: add fence timeline to drm_crtc drm/fence: add out-fences support
Documentation/DocBook/device-drivers.tmpl | 2 + drivers/dma-buf/Makefile | 2 +- drivers/dma-buf/fence-collection.c | 159 ++++++++++++++++++++++++ drivers/dma-buf/fence.c | 2 +- drivers/dma-buf/sync_file.c | 60 +++++++++ drivers/gpu/drm/Kconfig | 1 + drivers/gpu/drm/drm_atomic.c | 196 ++++++++++++++++++++++++++++-- drivers/gpu/drm/drm_atomic_helper.c | 24 +++- drivers/gpu/drm/drm_crtc.c | 36 ++++++ drivers/gpu/drm/drm_fops.c | 8 +- include/drm/drmP.h | 2 + include/drm/drm_atomic_helper.h | 4 +- include/drm/drm_crtc.h | 30 +++++ include/linux/fence-collection.h | 73 +++++++++++ include/linux/fence.h | 9 ++ include/linux/sync_file.h | 3 + include/uapi/drm/drm_mode.h | 11 +- 17 files changed, 600 insertions(+), 22 deletions(-) create mode 100644 drivers/dma-buf/fence-collection.c create mode 100644 include/linux/fence-collection.h
-- 2.5.5
dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
On Mon, Apr 25, 2016 at 11:21:18PM +0000, Mike Lothian wrote:
Hi
Out of interest will this allow tear free with PRIME?
Tear free with prime on the kernel side is already supported using reservation objects/fences attached implicitly to dma-bufs. Just needs driver support (which for the displaying side has landed in i915 in 4.6, nouveau/radeon did manage things correctly on the render side since a long time).
I think tear free needs some additional changes in X, and the patches are pending review.
This is just the explicit fencing needed by android (and wanted by others). Same thing overall idea, but different ABI in the details. -Daniel
Thanks
Mike
On Tue, 26 Apr 2016, 12:33 a.m. Gustavo Padovan, gustavo@padovan.org wrote:
From: Gustavo Padovan gustavo.padovan@collabora.co.uk
Hi,
Currently the Linux Kernel only have an implicit fencing mechanism where the fence are attached directly to buffers and userspace is unaware of what is happening. On the other hand explicit fencing which is not supported yet by Linux but it expose fences to the userspace to handle fencing between producer/consumer explicitely.
For that we use the Android Sync Framework[1], a explicit fencing mechanism that help the userspace handles fences directly. It has the concept of sync_file (called sync_fence in Android) that expose the driver's fences to userspace via file descriptors. File descriptors are useful because we can pass them around between process.
The Sync Framework is currently in the staging tree and on the process to be de-staged[2].
With explicit fencing we have a global mechanism that optimizes the flow of buffers between consumers and producers, avoid a lot of waiting. So instead of waiting for a buffer to be processed by the GPU before sending it to DRM in an Atomic IOCTL we can get a sync_file fd from the GPU driver at the moment we submit the buffer processing. The compositor then passes these fds to DRM in a atomic commit request, that will not be displayed until the fences signal, i.e, the GPU finished processing the buffer and it is ready to display. In DRM the fences we wait on before displaying a buffer are called in-fences.
Vice-versa, we have out-fences, to sychronize the return of buffers to GPU (producer) to be processed again. When DRM receives an atomic request with a special flag set it generates one fence per-crtc and attach it to a per-crtc sync_file. It then returns the array of sync_file fds to userspace as an atomic_ioctl out arg. With the fences available userspace can forward these fences to the GPU, where it will wait the fence to signal before starting to process on buffer again.
Explicit fencing with Sync Framework allows buffer suballocation. Userspace get a large buffer and divides it into small ones and submit requests to process them, each subbuffer gets and sync_file fd and can be processed in parallel. This is not even possible with implicit fencing.
While these are out-fences in DRM (the consumer) they become in-fences once they get to the GPU (the producer).
DRM explicit fences are opt-in, as the default will still be implicit fencing. To enable explicit in-fences one just need to pass a sync_file fd in the FENCE_FD plane property. *In-fences are per-plane*, i.e., per framebuffer.
For out-fences, just enabling DRM_MODE_ATOMIC_OUT_FENCE flag is enough. *Out-fences are per-crtc*.
In-fences
In the first discussions on #dri-devel on IRC we decided to hide the Sync Framework from DRM drivers to reduce complexity, so as soon we get the fd via FENCE_FD plane property we convert the sync_file fd to a struct fence. However a sync_file might contain more than one fence, so we created the fence_collection concept. struct fence_collection is a subclass of struct fence and stores a group of fences that needs to be waited together, in other words, all the fences in the sync_file.
Then we just use the already in place fence support to wait on those fences. Once the producer calls fence_signal() for all fences on wait we can proceed with the atomic commit and display the framebuffers. DRM drivers only needs to be converted to struct fence to make use of this feature.
Out-fences
Passing the DRM_MODE_ATOMIC_OUT_FENCE flag to an atomic request enables out-fences. The kernel then creates a fence, attach it to a sync_file and install this file on a unused fd for each crtc. Userspace get the fence back as an array of per-crtc sync_file fds.
DRM core use the already in place drm_event infrastructure to help signal fences, we've added a fence pointer to struct drm_pending_event. If the atomic update received requested an PAGE_FLIP_EVENT we just use the same drm_pending_event and set our fence there, otherwise we just create an event with a NULL file_priv to set our fence. On vblank we just call fence_signal() to signal that the buffer related to this fence is *now* on the screen. Note that this is exactly the opposite behaviour from Android, where the fences are signaled when they are not on display anymore, so free to be reused.
No changes are required to DRM drivers to have out-fences support, apart from atomic support of course.
Kernel tree
For those who want all patches on this RFC are in my tree. The tree includes all sync frameworks patches needed at the moment:
https://git.kernel.org/cgit/linux/kernel/git/padovan/linux.git/log/?h=fences
I also hacked some poor some fake fences support to modetest here:
https://git.collabora.com/cgit/user/padovan/libdrm.git/log/?h=atomic
v2 changes
The changes from RFC v1 to RFC v2 are in the patches description.
TODO
- upstream Android Sync ABI changes
- de-stage Android Sync Framework
- Upstream sync selftests (Emilio Lopez)
- create tests for DRM fences
Regards,
Gustavo
[1] https://source.android.com/devices/graphics/implement.html#vsync [2] https://git.kernel.org/cgit/linux/kernel/git/padovan/linux.git/log/?h=sync
Gustavo Padovan (8): dma-buf/fence: add fence_collection fences Documentation: add fence-collection to kernel DocBook dma-buf/sync_file: add sync_file_fences_get() drm/fence: allow fence waiting to be interrupted by userspace drm/fence: add in-fences support drm/fence: add fence to drm_pending_event drm/fence: add fence timeline to drm_crtc drm/fence: add out-fences support
Documentation/DocBook/device-drivers.tmpl | 2 + drivers/dma-buf/Makefile | 2 +- drivers/dma-buf/fence-collection.c | 159 ++++++++++++++++++++++++ drivers/dma-buf/fence.c | 2 +- drivers/dma-buf/sync_file.c | 60 +++++++++ drivers/gpu/drm/Kconfig | 1 + drivers/gpu/drm/drm_atomic.c | 196 ++++++++++++++++++++++++++++-- drivers/gpu/drm/drm_atomic_helper.c | 24 +++- drivers/gpu/drm/drm_crtc.c | 36 ++++++ drivers/gpu/drm/drm_fops.c | 8 +- include/drm/drmP.h | 2 + include/drm/drm_atomic_helper.h | 4 +- include/drm/drm_crtc.h | 30 +++++ include/linux/fence-collection.h | 73 +++++++++++ include/linux/fence.h | 9 ++ include/linux/sync_file.h | 3 + include/uapi/drm/drm_mode.h | 11 +- 17 files changed, 600 insertions(+), 22 deletions(-) create mode 100644 drivers/dma-buf/fence-collection.c create mode 100644 include/linux/fence-collection.h
-- 2.5.5
dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
dri-devel@lists.freedesktop.org