Lots of discussion since v4; thanks for your feedback:
http://lists.freedesktop.org/archives/dri-devel/2015-August/088259.html
The two main concerns were the range-based flush (which we decided to postpone) and making the SYNC ioctl mandatory. I need your guidance and education on the latter now. PTAL.
Tiago
Daniel Thompson (1):
  drm: prime: Honour O_RDWR during prime-handle-to-fd

Daniel Vetter (1):
  dma-buf: Add ioctls to allow userspace to flush

Tiago Vignatti (4):
  dma-buf: Remove range-based flush
  dma-buf: DRAFT: Make SYNC mandatory when userspace mmap
  drm/i915: Implement end_cpu_access
  drm/i915: Use CPU mapping for userspace dma-buf mmap()
 Documentation/dma-buf-sharing.txt         | 31 +++++++----
 drivers/dma-buf/dma-buf.c                 | 87 +++++++++++++++++++++++++++----
 drivers/gpu/drm/drm_prime.c               | 10 ++--
 drivers/gpu/drm/i915/i915_gem_dmabuf.c    | 42 ++++++++++++++-
 drivers/gpu/drm/omapdrm/omap_gem_dmabuf.c |  4 +-
 drivers/gpu/drm/udl/udl_fb.c              |  2 -
 drivers/staging/android/ion/ion.c         |  6 +--
 drivers/staging/android/ion/ion_test.c    |  4 +-
 include/linux/dma-buf.h                   | 15 +++---
 include/uapi/drm/drm.h                    |  1 +
 include/uapi/linux/dma-buf.h              | 41 +++++++++++++++
 11 files changed, 196 insertions(+), 47 deletions(-)
 create mode 100644 include/uapi/linux/dma-buf.h
From: Daniel Thompson daniel.thompson@linaro.org
Currently DRM_IOCTL_PRIME_HANDLE_TO_FD rejects all flags except (DRM|O)_CLOEXEC making it difficult (maybe impossible) for userspace to mmap() the resulting dma-buf even when this is supported by the DRM driver.
It is trivial to relax the restriction and permit read/write access. This is safe because the flags are seldom touched by drm; mostly they are passed verbatim to dma_buf calls.
v3 (Tiago): removed unused flags variable from drm_prime_handle_to_fd_ioctl.
Reviewed-by: Chris Wilson chris@chris-wilson.co.uk
Signed-off-by: Daniel Thompson daniel.thompson@linaro.org
Signed-off-by: Tiago Vignatti tiago.vignatti@intel.com
---
 drivers/gpu/drm/drm_prime.c | 10 +++-------
 include/uapi/drm/drm.h      |  1 +
 2 files changed, 4 insertions(+), 7 deletions(-)
diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c
index 27aa718..df6cdc7 100644
--- a/drivers/gpu/drm/drm_prime.c
+++ b/drivers/gpu/drm/drm_prime.c
@@ -329,7 +329,7 @@ static const struct dma_buf_ops drm_gem_prime_dmabuf_ops = {
  * drm_gem_prime_export - helper library implementation of the export callback
  * @dev: drm_device to export from
  * @obj: GEM object to export
- * @flags: flags like DRM_CLOEXEC
+ * @flags: flags like DRM_CLOEXEC and DRM_RDWR
  *
  * This is the implementation of the gem_prime_export functions for GEM drivers
  * using the PRIME helpers.
@@ -628,7 +628,6 @@ int drm_prime_handle_to_fd_ioctl(struct drm_device *dev, void *data,
				 struct drm_file *file_priv)
 {
	struct drm_prime_handle *args = data;
-	uint32_t flags;

	if (!drm_core_check_feature(dev, DRIVER_PRIME))
		return -EINVAL;
@@ -637,14 +636,11 @@ int drm_prime_handle_to_fd_ioctl(struct drm_device *dev, void *data,
		return -ENOSYS;

	/* check flags are valid */
-	if (args->flags & ~DRM_CLOEXEC)
+	if (args->flags & ~(DRM_CLOEXEC | DRM_RDWR))
		return -EINVAL;

-	/* we only want to pass DRM_CLOEXEC which is == O_CLOEXEC */
-	flags = args->flags & DRM_CLOEXEC;
-
	return dev->driver->prime_handle_to_fd(dev, file_priv,
-			args->handle, flags, &args->fd);
+			args->handle, args->flags, &args->fd);
 }

 int drm_prime_fd_to_handle_ioctl(struct drm_device *dev, void *data,
diff --git a/include/uapi/drm/drm.h b/include/uapi/drm/drm.h
index 3801584..ad8223e 100644
--- a/include/uapi/drm/drm.h
+++ b/include/uapi/drm/drm.h
@@ -668,6 +668,7 @@ struct drm_set_client_cap {
	__u64 value;
 };

+#define DRM_RDWR O_RDWR
 #define DRM_CLOEXEC O_CLOEXEC

 struct drm_prime_handle {
	__u32 handle;
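Purely as an illustration of the new flag (not part of the patch): a minimal userspace sketch that exports a GEM handle with DRM_RDWR so the resulting dma-buf fd can be mmap()ed writable. The function name and the drm_fd/handle/size parameters are made up for the example, the libdrm include path is assumed, and error handling is trimmed.

#include <fcntl.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <drm.h>	/* assumes libdrm's include path, e.g. -I/usr/include/libdrm */

/* Export "handle" as a dma-buf fd with read/write access and map it. */
static void *export_and_map(int drm_fd, uint32_t handle, size_t size, int *out_fd)
{
	struct drm_prime_handle args;

	memset(&args, 0, sizeof(args));
	args.handle = handle;
	args.flags = DRM_CLOEXEC | DRM_RDWR;	/* DRM_RDWR only passes the flag check with this patch */

	if (ioctl(drm_fd, DRM_IOCTL_PRIME_HANDLE_TO_FD, &args) < 0)
		return MAP_FAILED;

	*out_fd = args.fd;

	/* PROT_WRITE only works because the fd was exported with DRM_RDWR. */
	return mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, args.fd, 0);
}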
Hi all,
On 27 August 2015 at 00:29, Tiago Vignatti tiago.vignatti@intel.com wrote:
> From: Daniel Thompson daniel.thompson@linaro.org
>
> Currently DRM_IOCTL_PRIME_HANDLE_TO_FD rejects all flags except (DRM|O)_CLOEXEC
> making it difficult (maybe impossible) for userspace to mmap() the resulting
> dma-buf even when this is supported by the DRM driver.
>
> It is trivial to relax the restriction and permit read/write access. This is
> safe because the flags are seldom touched by drm; mostly they are passed
> verbatim to dma_buf calls.
Strictly speaking, shouldn't this patch be the last one in the series? I.e. we
should lift this restriction after the sync method (ioctl/syscall/etc.) is in
place. Or perhaps I missed something?

Thanks
Emil
On 08/27/2015 01:36 PM, Emil Velikov wrote:
> Hi all,
>
> On 27 August 2015 at 00:29, Tiago Vignatti tiago.vignatti@intel.com wrote:
>> From: Daniel Thompson daniel.thompson@linaro.org
>>
>> Currently DRM_IOCTL_PRIME_HANDLE_TO_FD rejects all flags except (DRM|O)_CLOEXEC
>> making it difficult (maybe impossible) for userspace to mmap() the resulting
>> dma-buf even when this is supported by the DRM driver.
>>
>> It is trivial to relax the restriction and permit read/write access. This is
>> safe because the flags are seldom touched by drm; mostly they are passed
>> verbatim to dma_buf calls.
>
> Strictly speaking, shouldn't this patch be the last one in the series? I.e. we
> should lift this restriction after the sync method (ioctl/syscall/etc.) is in
> place. Or perhaps I missed something?
I think you're right about it, Emil.
Thank you,
Tiago
This patch removes range-based information used for optimizations in begin_cpu_access and end_cpu_access.
We don't have any users or implementations of range-based flush. There seems to be consensus that if we ever want something like that again (or something even more robust, using 2D or 3D sub-range regions), we can build it on the upcoming dma-buf sync ioctl.
Cc: Sumit Semwal sumit.semwal@linaro.org
Cc: Daniel Vetter daniel.vetter@intel.com
Signed-off-by: Tiago Vignatti tiago.vignatti@intel.com
---
 Documentation/dma-buf-sharing.txt         | 19 ++++++++-----------
 drivers/dma-buf/dma-buf.c                 | 13 ++++---------
 drivers/gpu/drm/i915/i915_gem_dmabuf.c    |  2 +-
 drivers/gpu/drm/omapdrm/omap_gem_dmabuf.c |  4 ++--
 drivers/gpu/drm/udl/udl_fb.c              |  2 --
 drivers/staging/android/ion/ion.c         |  6 ++----
 drivers/staging/android/ion/ion_test.c    |  4 ++--
 include/linux/dma-buf.h                   | 12 +++++-------
 8 files changed, 24 insertions(+), 38 deletions(-)
diff --git a/Documentation/dma-buf-sharing.txt b/Documentation/dma-buf-sharing.txt
index 480c8de..4f4a84b 100644
--- a/Documentation/dma-buf-sharing.txt
+++ b/Documentation/dma-buf-sharing.txt
@@ -257,17 +257,15 @@ Access to a dma_buf from the kernel context involves three steps:

   Interface:
      int dma_buf_begin_cpu_access(struct dma_buf *dmabuf,
-				   size_t start, size_t len,
				   enum dma_data_direction direction)

   This allows the exporter to ensure that the memory is actually available for
   cpu access - the exporter might need to allocate or swap-in and pin the
   backing storage. The exporter also needs to ensure that cpu access is
-  coherent for the given range and access direction. The range and access
-  direction can be used by the exporter to optimize the cache flushing, i.e.
-  access outside of the range or with a different direction (read instead of
-  write) might return stale or even bogus data (e.g. when the exporter needs to
-  copy the data to temporary storage).
+  coherent for the access direction. The direction can be used by the exporter
+  to optimize the cache flushing, i.e. access with a different direction (read
+  instead of write) might return stale or even bogus data (e.g. when the
+  exporter needs to copy the data to temporary storage).

   This step might fail, e.g. in oom conditions.

@@ -322,14 +320,13 @@ Access to a dma_buf from the kernel context involves three steps:

 3. Finish access

-  When the importer is done accessing the range specified in begin_cpu_access,
-  it needs to announce this to the exporter (to facilitate cache flushing and
-  unpinning of any pinned resources). The result of any dma_buf kmap calls
-  after end_cpu_access is undefined.
+  When the importer is done accessing the CPU, it needs to announce this to
+  the exporter (to facilitate cache flushing and unpinning of any pinned
+  resources). The result of any dma_buf kmap calls after end_cpu_access is
+  undefined.

   Interface:
      void dma_buf_end_cpu_access(struct dma_buf *dma_buf,
-				  size_t start, size_t len,
				  enum dma_data_direction dir);

diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index 155c146..b2ac13b 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -539,13 +539,11 @@ EXPORT_SYMBOL_GPL(dma_buf_unmap_attachment);
  * preparations. Coherency is only guaranteed in the specified range for the
  * specified access direction.
  * @dmabuf:	[in]	buffer to prepare cpu access for.
- * @start:	[in]	start of range for cpu access.
- * @len:	[in]	length of range for cpu access.
  * @direction:	[in]	length of range for cpu access.
  *
  * Can return negative error values, returns 0 on success.
  */
-int dma_buf_begin_cpu_access(struct dma_buf *dmabuf, size_t start, size_t len,
+int dma_buf_begin_cpu_access(struct dma_buf *dmabuf,
			     enum dma_data_direction direction)
 {
	int ret = 0;
@@ -554,8 +552,7 @@ int dma_buf_begin_cpu_access(struct dma_buf *dmabuf, size_t start, size_t len,
		return -EINVAL;

	if (dmabuf->ops->begin_cpu_access)
-		ret = dmabuf->ops->begin_cpu_access(dmabuf, start,
-							len, direction);
+		ret = dmabuf->ops->begin_cpu_access(dmabuf, direction);

	return ret;
 }
@@ -567,19 +564,17 @@ EXPORT_SYMBOL_GPL(dma_buf_begin_cpu_access);
  * actions. Coherency is only guaranteed in the specified range for the
  * specified access direction.
  * @dmabuf:	[in]	buffer to complete cpu access for.
- * @start:	[in]	start of range for cpu access.
- * @len:	[in]	length of range for cpu access.
  * @direction:	[in]	length of range for cpu access.
  *
  * This call must always succeed.
  */
-void dma_buf_end_cpu_access(struct dma_buf *dmabuf, size_t start, size_t len,
+void dma_buf_end_cpu_access(struct dma_buf *dmabuf,
			    enum dma_data_direction direction)
 {
	WARN_ON(!dmabuf);

	if (dmabuf->ops->end_cpu_access)
-		dmabuf->ops->end_cpu_access(dmabuf, start, len, direction);
+		dmabuf->ops->end_cpu_access(dmabuf, direction);
 }
 EXPORT_SYMBOL_GPL(dma_buf_end_cpu_access);

diff --git a/drivers/gpu/drm/i915/i915_gem_dmabuf.c b/drivers/gpu/drm/i915/i915_gem_dmabuf.c
index e9c2bfd..65ab2bd 100644
--- a/drivers/gpu/drm/i915/i915_gem_dmabuf.c
+++ b/drivers/gpu/drm/i915/i915_gem_dmabuf.c
@@ -196,7 +196,7 @@ static int i915_gem_dmabuf_mmap(struct dma_buf *dma_buf, struct vm_area_struct *
	return -EINVAL;
 }

-static int i915_gem_begin_cpu_access(struct dma_buf *dma_buf, size_t start, size_t length, enum dma_data_direction direction)
+static int i915_gem_begin_cpu_access(struct dma_buf *dma_buf, enum dma_data_direction direction)
 {
	struct drm_i915_gem_object *obj = dma_buf_to_obj(dma_buf);
	struct drm_device *dev = obj->base.dev;
diff --git a/drivers/gpu/drm/omapdrm/omap_gem_dmabuf.c b/drivers/gpu/drm/omapdrm/omap_gem_dmabuf.c
index 0cc71c9..468a711 100644
--- a/drivers/gpu/drm/omapdrm/omap_gem_dmabuf.c
+++ b/drivers/gpu/drm/omapdrm/omap_gem_dmabuf.c
@@ -79,7 +79,7 @@ static void omap_gem_dmabuf_release(struct dma_buf *buffer)

 static int omap_gem_dmabuf_begin_cpu_access(struct dma_buf *buffer,
-		size_t start, size_t len, enum dma_data_direction dir)
+		enum dma_data_direction dir)
 {
	struct drm_gem_object *obj = buffer->priv;
	struct page **pages;
@@ -94,7 +94,7 @@ static int omap_gem_dmabuf_begin_cpu_access(struct dma_buf *buffer,
 }

 static void omap_gem_dmabuf_end_cpu_access(struct dma_buf *buffer,
-		size_t start, size_t len, enum dma_data_direction dir)
+		enum dma_data_direction dir)
 {
	struct drm_gem_object *obj = buffer->priv;
	omap_gem_put_pages(obj);
diff --git a/drivers/gpu/drm/udl/udl_fb.c b/drivers/gpu/drm/udl/udl_fb.c
index 62c7b1d..5e6c116 100644
--- a/drivers/gpu/drm/udl/udl_fb.c
+++ b/drivers/gpu/drm/udl/udl_fb.c
@@ -410,7 +410,6 @@ static int udl_user_framebuffer_dirty(struct drm_framebuffer *fb,

	if (ufb->obj->base.import_attach) {
		ret = dma_buf_begin_cpu_access(ufb->obj->base.import_attach->dmabuf,
-					       0, ufb->obj->base.size,
					       DMA_FROM_DEVICE);
		if (ret)
			goto unlock;
@@ -426,7 +425,6 @@ static int udl_user_framebuffer_dirty(struct drm_framebuffer *fb,

	if (ufb->obj->base.import_attach) {
		dma_buf_end_cpu_access(ufb->obj->base.import_attach->dmabuf,
-				       0, ufb->obj->base.size,
				       DMA_FROM_DEVICE);
	}

diff --git a/drivers/staging/android/ion/ion.c b/drivers/staging/android/ion/ion.c
index 6f48112..e5f1be0 100644
--- a/drivers/staging/android/ion/ion.c
+++ b/drivers/staging/android/ion/ion.c
@@ -1057,8 +1057,7 @@ static void ion_dma_buf_kunmap(struct dma_buf *dmabuf, unsigned long offset,
 {
 }

-static int ion_dma_buf_begin_cpu_access(struct dma_buf *dmabuf, size_t start,
-					size_t len,
+static int ion_dma_buf_begin_cpu_access(struct dma_buf *dmabuf,
					enum dma_data_direction direction)
 {
	struct ion_buffer *buffer = dmabuf->priv;
@@ -1076,8 +1075,7 @@ static int ion_dma_buf_begin_cpu_access(struct dma_buf *dmabuf, size_t start,
	return PTR_ERR_OR_ZERO(vaddr);
 }

-static void ion_dma_buf_end_cpu_access(struct dma_buf *dmabuf, size_t start,
-				       size_t len,
+static void ion_dma_buf_end_cpu_access(struct dma_buf *dmabuf,
				       enum dma_data_direction direction)
 {
	struct ion_buffer *buffer = dmabuf->priv;
diff --git a/drivers/staging/android/ion/ion_test.c b/drivers/staging/android/ion/ion_test.c
index 7d6e6b6..e97801c 100644
--- a/drivers/staging/android/ion/ion_test.c
+++ b/drivers/staging/android/ion/ion_test.c
@@ -109,7 +109,7 @@ static int ion_handle_test_kernel(struct dma_buf *dma_buf, void __user *ptr,
	if (offset > dma_buf->size || size > dma_buf->size - offset)
		return -EINVAL;

-	ret = dma_buf_begin_cpu_access(dma_buf, offset, size, dir);
+	ret = dma_buf_begin_cpu_access(dma_buf, dir);
	if (ret)
		return ret;

@@ -139,7 +139,7 @@ static int ion_handle_test_kernel(struct dma_buf *dma_buf, void __user *ptr,
		copy_offset = 0;
	}
 err:
-	dma_buf_end_cpu_access(dma_buf, offset, size, dir);
+	dma_buf_end_cpu_access(dma_buf, dir);
	return ret;
 }

diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
index f98bd70..532108e 100644
--- a/include/linux/dma-buf.h
+++ b/include/linux/dma-buf.h
@@ -54,7 +54,7 @@ struct dma_buf_attachment;
  * @release: release this buffer; to be called after the last dma_buf_put.
  * @begin_cpu_access: [optional] called before cpu access to invalidate cpu
  *		      caches and allocate backing storage (if not yet done)
- *		      respectively pin the objet into memory.
+ *		      respectively pin the object into memory.
  * @end_cpu_access: [optional] called after cpu access to flush caches.
  * @kmap_atomic: maps a page from the buffer into kernel address
  *		 space, users may not block until the subsequent unmap call.
@@ -93,10 +93,8 @@ struct dma_buf_ops {
	/* after final dma_buf_put() */
	void (*release)(struct dma_buf *);

-	int (*begin_cpu_access)(struct dma_buf *, size_t, size_t,
-				enum dma_data_direction);
-	void (*end_cpu_access)(struct dma_buf *, size_t, size_t,
-			       enum dma_data_direction);
+	int (*begin_cpu_access)(struct dma_buf *, enum dma_data_direction);
+	void (*end_cpu_access)(struct dma_buf *, enum dma_data_direction);
	void *(*kmap_atomic)(struct dma_buf *, unsigned long);
	void (*kunmap_atomic)(struct dma_buf *, unsigned long, void *);
	void *(*kmap)(struct dma_buf *, unsigned long);
@@ -224,9 +222,9 @@ struct sg_table *dma_buf_map_attachment(struct dma_buf_attachment *,
					enum dma_data_direction);
 void dma_buf_unmap_attachment(struct dma_buf_attachment *, struct sg_table *,
				enum dma_data_direction);
-int dma_buf_begin_cpu_access(struct dma_buf *dma_buf, size_t start, size_t len,
+int dma_buf_begin_cpu_access(struct dma_buf *dma_buf,
			     enum dma_data_direction dir);
-void dma_buf_end_cpu_access(struct dma_buf *dma_buf, size_t start, size_t len,
+void dma_buf_end_cpu_access(struct dma_buf *dma_buf,
			    enum dma_data_direction dir);
 void *dma_buf_kmap_atomic(struct dma_buf *, unsigned long);
 void dma_buf_kunmap_atomic(struct dma_buf *, unsigned long, void *);
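To make the new kernel-side convention concrete, here is a hypothetical importer doing a CPU read with the range-less API. This is a sketch, not taken from the series: the function name is made up, and dma_buf_kmap() is only used here to get a CPU pointer to page 0.

#include <linux/dma-buf.h>
#include <linux/mm.h>
#include <linux/string.h>

static int copy_first_page_from_dmabuf(struct dma_buf *dmabuf, void *dst, size_t len)
{
	size_t n = len < PAGE_SIZE ? len : PAGE_SIZE;
	void *vaddr;
	int ret;

	/* Make the whole buffer coherent for a CPU read; no start/len anymore. */
	ret = dma_buf_begin_cpu_access(dmabuf, DMA_FROM_DEVICE);
	if (ret)
		return ret;

	vaddr = dma_buf_kmap(dmabuf, 0);	/* CPU mapping of page 0 */
	if (vaddr) {
		memcpy(dst, vaddr, n);
		dma_buf_kunmap(dmabuf, 0, vaddr);
	}

	/* end_cpu_access is void in this series, so there is nothing to check here. */
	dma_buf_end_cpu_access(dmabuf, DMA_FROM_DEVICE);

	return vaddr ? 0 : -ENOMEM;
}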
From: Daniel Vetter daniel.vetter@ffwll.ch
The userspace might need some sort of cache coherency management e.g. when CPU
and GPU domains are being accessed through dma-buf at the same time. To
circumvent this problem there are begin/end coherency markers, that forward
directly to existing dma-buf device drivers vfunc hooks. Userspace can make use
of those markers through the DMA_BUF_IOCTL_SYNC ioctl. The sequence would be
used like following:
  - mmap dma-buf fd
  - for each drawing/upload cycle in CPU 1. SYNC_START ioctl, 2. read/write to
    mmap area 3. SYNC_END ioctl. This can be repeated as often as you want
    (with the new data being consumed by the GPU or say scanout device)
  - munmap once you don't need the buffer any more
v2 (Tiago): Fix header file type names (u64 -> __u64)
v3 (Tiago): Add documentation. Use enum dma_buf_sync_flags to the begin/end
            dma-buf functions. Check for overflows in start/length.
v4 (Tiago): use 2d regions for sync.
v5 (Tiago): forget about 2d regions (v4); use _IOW in DMA_BUF_IOCTL_SYNC and
            remove range information from struct dma_buf_sync.
Cc: Sumit Semwal sumit.semwal@linaro.org
Signed-off-by: Daniel Vetter daniel.vetter@intel.com
Signed-off-by: Tiago Vignatti tiago.vignatti@intel.com
---
 Documentation/dma-buf-sharing.txt | 12 +++++++++++
 drivers/dma-buf/dma-buf.c         | 43 +++++++++++++++++++++++++++++++++++++++
 include/uapi/linux/dma-buf.h      | 41 +++++++++++++++++++++++++++++++++++++
 3 files changed, 96 insertions(+)
 create mode 100644 include/uapi/linux/dma-buf.h
diff --git a/Documentation/dma-buf-sharing.txt b/Documentation/dma-buf-sharing.txt
index 4f4a84b..ec0ab1d 100644
--- a/Documentation/dma-buf-sharing.txt
+++ b/Documentation/dma-buf-sharing.txt
@@ -352,6 +352,18 @@ Being able to mmap an export dma-buf buffer object has 2 main use-cases:
No special interfaces, userspace simply calls mmap on the dma-buf fd.
+   Also, the userspace might need some sort of cache coherency management e.g.
+   when CPU and GPU domains are being accessed through dma-buf at the same
+   time. To circumvent this problem there are begin/end coherency markers, that
+   forward directly to existing dma-buf device drivers vfunc hooks. Userspace
+   can make use of those markers through the DMA_BUF_IOCTL_SYNC ioctl. The
+   sequence would be used like following:
+     - mmap dma-buf fd
+     - for each drawing/upload cycle in CPU 1. SYNC_START ioctl, 2. read/write
+       to mmap area 3. SYNC_END ioctl. This can be repeated as often as you
+       want (with the new data being consumed by the GPU or say scanout device)
+     - munmap once you don't need the buffer any more
+
 2. Supporting existing mmap interfaces in importers
   Similar to the motivation for kernel cpu access it is again important that
diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index b2ac13b..9a298bd 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -34,6 +34,8 @@
 #include <linux/poll.h>
 #include <linux/reservation.h>

+#include <uapi/linux/dma-buf.h>
+
 static inline int is_dma_buf_file(struct file *);

 struct dma_buf_list {
@@ -251,11 +253,52 @@ out:
	return events;
 }

+static long dma_buf_ioctl(struct file *file,
+			  unsigned int cmd, unsigned long arg)
+{
+	struct dma_buf *dmabuf;
+	struct dma_buf_sync sync;
+	enum dma_data_direction direction;
+
+	dmabuf = file->private_data;
+
+	if (!is_dma_buf_file(file))
+		return -EINVAL;
+
+	switch (cmd) {
+	case DMA_BUF_IOCTL_SYNC:
+		if (copy_from_user(&sync, (void __user *) arg, sizeof(sync)))
+			return -EFAULT;
+
+		if (sync.flags & DMA_BUF_SYNC_RW)
+			direction = DMA_BIDIRECTIONAL;
+		else if (sync.flags & DMA_BUF_SYNC_READ)
+			direction = DMA_FROM_DEVICE;
+		else if (sync.flags & DMA_BUF_SYNC_WRITE)
+			direction = DMA_TO_DEVICE;
+		else
+			return -EINVAL;
+
+		if (sync.flags & ~DMA_BUF_SYNC_VALID_FLAGS_MASK)
+			return -EINVAL;
+
+		if (sync.flags & DMA_BUF_SYNC_END)
+			dma_buf_end_cpu_access(dmabuf, direction);
+		else
+			dma_buf_begin_cpu_access(dmabuf, direction);
+
+		return 0;
+	default:
+		return -ENOTTY;
+	}
+}
+
 static const struct file_operations dma_buf_fops = {
	.release	= dma_buf_release,
	.mmap		= dma_buf_mmap_internal,
	.llseek		= dma_buf_llseek,
	.poll		= dma_buf_poll,
+	.unlocked_ioctl	= dma_buf_ioctl,
 };

 /*
diff --git a/include/uapi/linux/dma-buf.h b/include/uapi/linux/dma-buf.h
new file mode 100644
index 0000000..262f7a7
--- /dev/null
+++ b/include/uapi/linux/dma-buf.h
@@ -0,0 +1,41 @@
+/*
+ * Framework for buffer objects that can be shared across devices/subsystems.
+ *
+ * Copyright(C) 2015 Intel Ltd
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License version 2 as published by
+ * the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program. If not, see http://www.gnu.org/licenses/.
+ */
+
+#ifndef _DMA_BUF_UAPI_H_
+#define _DMA_BUF_UAPI_H_
+
+enum dma_buf_sync_flags {
+	DMA_BUF_SYNC_READ = (1 << 0),
+	DMA_BUF_SYNC_WRITE = (2 << 0),
+	DMA_BUF_SYNC_RW = (3 << 0),
+	DMA_BUF_SYNC_START = (0 << 2),
+	DMA_BUF_SYNC_END = (1 << 2),
+
+	DMA_BUF_SYNC_VALID_FLAGS_MASK = DMA_BUF_SYNC_RW |
+					DMA_BUF_SYNC_END
+};
+
+/* begin/end dma-buf functions used for userspace mmap. */
+struct dma_buf_sync {
+	enum dma_buf_sync_flags flags;
+};
+
+#define DMA_BUF_BASE		'b'
+#define DMA_BUF_IOCTL_SYNC	_IOW(DMA_BUF_BASE, 0, struct dma_buf_sync)
+
+#endif
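For illustration only (not part of the patch), this is roughly how a userspace writer would use the new ioctl. It assumes a dma-buf fd already exported with write access (e.g. via PRIME with DRM_RDWR) and the new uapi header available as <linux/dma-buf.h>; fill_buffer and dbuf_fd are made up for the example.

#include <string.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <linux/dma-buf.h>	/* the header added by this patch */

/* Write "value" into the whole buffer, bracketing the CPU access with SYNC. */
static int fill_buffer(int dbuf_fd, size_t size, unsigned char value)
{
	struct dma_buf_sync sync = { 0 };
	void *ptr;
	int ret = -1;

	ptr = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, dbuf_fd, 0);
	if (ptr == MAP_FAILED)
		return -1;

	sync.flags = DMA_BUF_SYNC_START | DMA_BUF_SYNC_WRITE;
	if (ioctl(dbuf_fd, DMA_BUF_IOCTL_SYNC, &sync))
		goto out;

	memset(ptr, value, size);	/* CPU writes happen between the markers */

	sync.flags = DMA_BUF_SYNC_END | DMA_BUF_SYNC_WRITE;
	if (ioctl(dbuf_fd, DMA_BUF_IOCTL_SYNC, &sync))
		goto out;

	ret = 0;
out:
	munmap(ptr, size);
	return ret;
}

The SYNC_START/SYNC_END pair can be repeated for every drawing/upload cycle, as the commit message describes.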
On Wed, Aug 26, 2015 at 08:29:15PM -0300, Tiago Vignatti wrote:
> +#ifndef _DMA_BUF_UAPI_H_
> +#define _DMA_BUF_UAPI_H_
> +
> +enum dma_buf_sync_flags {
> +	DMA_BUF_SYNC_READ = (1 << 0),
> +	DMA_BUF_SYNC_WRITE = (2 << 0),
> +	DMA_BUF_SYNC_RW = (3 << 0),
> +	DMA_BUF_SYNC_START = (0 << 2),
> +	DMA_BUF_SYNC_END = (1 << 2),
> +
> +	DMA_BUF_SYNC_VALID_FLAGS_MASK = DMA_BUF_SYNC_RW |
> +					DMA_BUF_SYNC_END
> +};
> +
> +/* begin/end dma-buf functions used for userspace mmap. */
> +struct dma_buf_sync {
> +	enum dma_buf_sync_flags flags;

It is better to use explicitly sized types. And since this is not 64b padded,
probably best to add that padding now.
-Chris
On 08/27/2015 05:03 AM, Chris Wilson wrote:
> On Wed, Aug 26, 2015 at 08:29:15PM -0300, Tiago Vignatti wrote:
>> +#ifndef _DMA_BUF_UAPI_H_
>> +#define _DMA_BUF_UAPI_H_
>> +
>> +enum dma_buf_sync_flags {
>> +	DMA_BUF_SYNC_READ = (1 << 0),
>> +	DMA_BUF_SYNC_WRITE = (2 << 0),
>> +	DMA_BUF_SYNC_RW = (3 << 0),
>> +	DMA_BUF_SYNC_START = (0 << 2),
>> +	DMA_BUF_SYNC_END = (1 << 2),
>> +
>> +	DMA_BUF_SYNC_VALID_FLAGS_MASK = DMA_BUF_SYNC_RW |
>> +					DMA_BUF_SYNC_END
>> +};
>> +
>> +/* begin/end dma-buf functions used for userspace mmap. */
>> +struct dma_buf_sync {
>> +	enum dma_buf_sync_flags flags;
>
> It is better to use explicitly sized types. And since this is not 64b padded,
> probably best to add that padding now.
ahh, I've changed it to use an enum instead. If I roll back to using defines, do you think that works better? Like this:
struct dma_buf_sync {
	__u64 flags;
};

#define DMA_BUF_SYNC_READ		(1 << 0)
#define DMA_BUF_SYNC_WRITE		(2 << 0)
#define DMA_BUF_SYNC_RW			(3 << 0)
#define DMA_BUF_SYNC_START		(0 << 2)
#define DMA_BUF_SYNC_END		(1 << 2)
#define DMA_BUF_SYNC_VALID_FLAGS_MASK \
	(DMA_BUF_SYNC_RW | DMA_BUF_SYNC_END)

#define DMA_BUF_BASE		'b'
#define DMA_BUF_IOCTL_SYNC	_IOW(DMA_BUF_BASE, 0, struct dma_buf_sync)
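Just to illustrate (not from the thread) why the __u64 variant addresses the sizing concern: the struct then has an explicit 64-bit, naturally aligned layout, which a compile-time check can pin down.

#include <linux/types.h>

struct dma_buf_sync {
	__u64 flags;
};

/* The struct is exactly 8 bytes and 64-bit aligned, so no extra padding field is needed. */
_Static_assert(sizeof(struct dma_buf_sync) == 8,
	       "dma_buf_sync stays 64-bit sized with no implicit padding");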
On Thu, Aug 27, 2015 at 2:29 AM, Tiago Vignatti tiago.vignatti@intel.com wrote:
> +#define DMA_BUF_BASE 'b'
'b' is already occupied by the vme and qat drivers:
https://github.com/torvalds/linux/blob/master/Documentation/ioctl/ioctl-numb...

Would it be a bad idea for drm.h to include these definitions instead?
http://lxr.free-electrons.com/source/include/uapi/drm/drm.h#L684

Br,
DS
On 08/27/2015 09:06 AM, Hwang, Dongseong wrote:
> On Thu, Aug 27, 2015 at 2:29 AM, Tiago Vignatti
>> +#define DMA_BUF_BASE 'b'
>
> 'b' is already occupied by the vme and qat drivers:
> https://github.com/torvalds/linux/blob/master/Documentation/ioctl/ioctl-numb...
I believe this is alright, as noted in that txt file: "Because of the large number of drivers, many drivers share a partial letter with other drivers".
> Would it be a bad idea for drm.h to include these definitions instead?
> http://lxr.free-electrons.com/source/include/uapi/drm/drm.h#L684
this is not DRM code, and other types of device drivers might want to use it as well.
Tiago
This is my (failed) attempt to make SYNC_* mandatory: I've tried to revoke
write access to the mapped region until begin_cpu_access is called.

The tasklet schedule order seems alright, but the whole logic is not working,
and I guess it's something related to the fs trick I'm trying to do with the
put{,get}_write_access pair...

Not sure whether I should follow this direction, though. I've already spent
much time on it! What do you think?
Cc: Thomas Hellstrom thellstrom@vmware.com
Cc: Jérôme Glisse jglisse@redhat.com
---
 drivers/dma-buf/dma-buf.c | 31 ++++++++++++++++++++++++++++++-
 include/linux/dma-buf.h   |  3 +++
 2 files changed, 33 insertions(+), 1 deletion(-)
diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index 9a298bd..06cb37b 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -75,14 +75,34 @@ static int dma_buf_release(struct inode *inode, struct file *file)
	if (dmabuf->resv == (struct reservation_object *)&dmabuf[1])
		reservation_object_fini(dmabuf->resv);

+	tasklet_kill(&dmabuf->tasklet);
+
	module_put(dmabuf->owner);
	kfree(dmabuf);
	return 0;
 }

+static void dmabuf_mmap_tasklet(unsigned long data)
+{
+	struct dma_buf *dmabuf = (struct dma_buf *) data;
+	struct inode *inode = file_inode(dmabuf->file);
+
+	if (!inode)
+		return;
+
+	/* the CPU accessing another device may put the cache in an incoherent state.
+	 * Therefore if the mmap succeeds, we forbid any further write access to the
+	 * dma-buf until SYNC_START ioctl takes place, which gets back the write
+	 * access. */
+	put_write_access(inode);
+
+	inode_dio_wait(inode);
+}
+
 static int dma_buf_mmap_internal(struct file *file, struct vm_area_struct *vma)
 {
	struct dma_buf *dmabuf;
+	int ret;

	if (!is_dma_buf_file(file))
		return -EINVAL;
@@ -94,7 +114,11 @@ static int dma_buf_mmap_internal(struct file *file, struct vm_area_struct *vma)
	    dmabuf->size >> PAGE_SHIFT)
		return -EINVAL;

-	return dmabuf->ops->mmap(dmabuf, vma);
+	ret = dmabuf->ops->mmap(dmabuf, vma);
+	if (!ret)
+		tasklet_schedule(&dmabuf->tasklet);
+
+	return ret;
 }

 static loff_t dma_buf_llseek(struct file *file, loff_t offset, int whence)
@@ -389,6 +413,8 @@ struct dma_buf *dma_buf_export(const struct dma_buf_export_info *exp_info)
	list_add(&dmabuf->list_node, &db_list.head);
	mutex_unlock(&db_list.lock);

+	tasklet_init(&dmabuf->tasklet, dmabuf_mmap_tasklet, (unsigned long) dmabuf);
+
	return dmabuf;
 }
 EXPORT_SYMBOL_GPL(dma_buf_export);
@@ -589,6 +615,7 @@ EXPORT_SYMBOL_GPL(dma_buf_unmap_attachment);
 int dma_buf_begin_cpu_access(struct dma_buf *dmabuf,
			     enum dma_data_direction direction)
 {
+	struct inode *inode = file_inode(dmabuf->file);
	int ret = 0;

	if (WARN_ON(!dmabuf))
@@ -597,6 +624,8 @@ int dma_buf_begin_cpu_access(struct dma_buf *dmabuf,
	if (dmabuf->ops->begin_cpu_access)
		ret = dmabuf->ops->begin_cpu_access(dmabuf, direction);

+	get_write_access(inode);
+
	return ret;
 }
 EXPORT_SYMBOL_GPL(dma_buf_begin_cpu_access);
diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
index 532108e..0359792 100644
--- a/include/linux/dma-buf.h
+++ b/include/linux/dma-buf.h
@@ -24,6 +24,7 @@
 #ifndef __DMA_BUF_H__
 #define __DMA_BUF_H__

+#include <linux/interrupt.h>
 #include <linux/file.h>
 #include <linux/err.h>
 #include <linux/scatterlist.h>
@@ -118,6 +119,7 @@ struct dma_buf_ops {
  * @list_node: node for dma_buf accounting and debugging.
  * @priv: exporter specific private data for this buffer object.
  * @resv: reservation object linked to this dma-buf
+ * @tasklet: tasklet for deferred mmap tasks.
  */
 struct dma_buf {
	size_t size;
@@ -133,6 +135,7 @@ struct dma_buf {
	struct list_head list_node;
	void *priv;
	struct reservation_object *resv;
+	struct tasklet_struct tasklet;

	/* poll support */
	wait_queue_head_t poll;
This function is meant to be used with dma-buf mmap, when finishing the CPU access of the mapped pointer.
The error case should be rare, though: it requires the buffer to become active during the sync period and end_cpu_access to be interrupted. So we use an uninterruptible mutex_lock to make sure the error gets reported if it ever happens.
v2: disable interruption to make sure errors are reported.
v3: update to the new end_cpu_access API.
Reviewed-by: Chris Wilson chris@chris-wilson.co.uk
Signed-off-by: Tiago Vignatti tiago.vignatti@intel.com
---
 drivers/gpu/drm/i915/i915_gem_dmabuf.c | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)
diff --git a/drivers/gpu/drm/i915/i915_gem_dmabuf.c b/drivers/gpu/drm/i915/i915_gem_dmabuf.c
index 65ab2bd..9dba876 100644
--- a/drivers/gpu/drm/i915/i915_gem_dmabuf.c
+++ b/drivers/gpu/drm/i915/i915_gem_dmabuf.c
@@ -212,6 +212,27 @@ static int i915_gem_begin_cpu_access(struct dma_buf *dma_buf, enum dma_data_dire
	return ret;
 }

+static void i915_gem_end_cpu_access(struct dma_buf *dma_buf, enum dma_data_direction direction)
+{
+	struct drm_i915_gem_object *obj = dma_buf_to_obj(dma_buf);
+	struct drm_device *dev = obj->base.dev;
+	struct drm_i915_private *dev_priv = to_i915(dev);
+	bool was_interruptible, write = (direction == DMA_BIDIRECTIONAL || direction == DMA_TO_DEVICE);
+	int ret;
+
+	mutex_lock(&dev->struct_mutex);
+	was_interruptible = dev_priv->mm.interruptible;
+	dev_priv->mm.interruptible = false;
+
+	ret = i915_gem_object_set_to_gtt_domain(obj, write);
+
+	dev_priv->mm.interruptible = was_interruptible;
+	mutex_unlock(&dev->struct_mutex);
+
+	if (unlikely(ret))
+		DRM_ERROR("unable to flush buffer following CPU access; rendering may be corrupt\n");
+}
+
 static const struct dma_buf_ops i915_dmabuf_ops = {
	.map_dma_buf = i915_gem_map_dma_buf,
	.unmap_dma_buf = i915_gem_unmap_dma_buf,
@@ -224,6 +245,7 @@ static const struct dma_buf_ops i915_dmabuf_ops = {
	.vmap = i915_gem_dmabuf_vmap,
	.vunmap = i915_gem_dmabuf_vunmap,
	.begin_cpu_access = i915_gem_begin_cpu_access,
+	.end_cpu_access = i915_gem_end_cpu_access,
 };
struct dma_buf *i915_gem_prime_export(struct drm_device *dev,
Userspace is the one in charge of flushing the CPU caches, by wrapping its mmap accesses with begin{,end}_cpu_access.
v2: Remove LLC check because we have dma-buf sync providers now. Also, fix
    return before transferring ownership when mmap fails.
v3: Fix return values.
v4: !obj->base.filp is user triggerable, so removed the WARN_ON.
Reviewed-by: Chris Wilson chris@chris-wilson.co.uk
Signed-off-by: Tiago Vignatti tiago.vignatti@intel.com
---
 drivers/gpu/drm/i915/i915_gem_dmabuf.c | 18 +++++++++++++++++-
 1 file changed, 17 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/i915_gem_dmabuf.c b/drivers/gpu/drm/i915/i915_gem_dmabuf.c
index 9dba876..b7e7a90 100644
--- a/drivers/gpu/drm/i915/i915_gem_dmabuf.c
+++ b/drivers/gpu/drm/i915/i915_gem_dmabuf.c
@@ -193,7 +193,23 @@ static void i915_gem_dmabuf_kunmap(struct dma_buf *dma_buf, unsigned long page_n

 static int i915_gem_dmabuf_mmap(struct dma_buf *dma_buf, struct vm_area_struct *vma)
 {
-	return -EINVAL;
+	struct drm_i915_gem_object *obj = dma_buf_to_obj(dma_buf);
+	int ret;
+
+	if (obj->base.size < vma->vm_end - vma->vm_start)
+		return -EINVAL;
+
+	if (!obj->base.filp)
+		return -ENODEV;
+
+	ret = obj->base.filp->f_op->mmap(obj->base.filp, vma);
+	if (ret)
+		return ret;
+
+	fput(vma->vm_file);
+	vma->vm_file = get_file(obj->base.filp);
+
+	return 0;
 }
static int i915_gem_begin_cpu_access(struct dma_buf *dma_buf, enum dma_data_direction direction)