This series adds helper functions that abstract the core parts of .gem_prime_import and .gem_prime_export so that drivers don't have to worry about the low-level details. These helpers are optional: a driver can use them by plugging drm_gem_prime_import and drm_gem_prime_export into its drm_driver structure, or it can bypass them by plugging in its own functions. The first patch adds the helpers, and the later patches switch three drivers over to using them.
Jerome, I tried to address your concerns about code sharing by making the helpers available without making them required.
I haven't yet compile-tested the Exynos change since I don't have a toolchain set up for that, but I'll try to do that today.
Aaron Plattner (4):
  drm: add prime helpers
  drm/nouveau: use prime helpers
  drm/radeon: use prime helpers
  drm/exynos: use prime helpers
 drivers/gpu/drm/drm_prime.c                | 190 ++++++++++++++++++++++++++++-
 drivers/gpu/drm/exynos/exynos_drm_dmabuf.c | 151 ++---------------------
 drivers/gpu/drm/exynos/exynos_drm_dmabuf.h |  13 +-
 drivers/gpu/drm/exynos/exynos_drm_drv.c    |   2 +
 drivers/gpu/drm/nouveau/nouveau_bo.h       |   1 -
 drivers/gpu/drm/nouveau/nouveau_drm.c      |   9 +-
 drivers/gpu/drm/nouveau/nouveau_gem.c      |   2 -
 drivers/gpu/drm/nouveau/nouveau_gem.h      |  11 +-
 drivers/gpu/drm/nouveau/nouveau_prime.c    | 172 ++++----------------------
 drivers/gpu/drm/radeon/radeon_drv.c        |  17 +--
 drivers/gpu/drm/radeon/radeon_prime.c      | 169 ++++---------------------
 include/drm/drmP.h                         |  15 +++
 12 files changed, 288 insertions(+), 464 deletions(-)
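For illustration, opting in boils down to pointing the drm_driver prime fields at the new helpers and filling in the low-level hooks. A minimal sketch, assuming a hypothetical "foo" driver (the foo_* callbacks are placeholders, not part of this series):

static struct drm_driver foo_driver = {
        /* ... other fields ... */
        .prime_handle_to_fd  = drm_gem_prime_handle_to_fd,
        .prime_fd_to_handle  = drm_gem_prime_fd_to_handle,
        .gem_prime_export    = drm_gem_prime_export,
        .gem_prime_import    = drm_gem_prime_import,
        .gem_prime_pin       = foo_gem_prime_pin,
        .gem_prime_get_pages = foo_gem_prime_get_pages,
        .gem_prime_import_sg = foo_gem_prime_import_sg,
        .gem_prime_vmap      = foo_gem_prime_vmap,
        .gem_prime_vunmap    = foo_gem_prime_vunmap,
};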
Instead of reimplementing all of the dma_buf functionality in every driver, create helpers drm_gem_prime_import and drm_gem_prime_export that implement it in terms of new, lower-level hook functions:

  gem_prime_pin: callback when a buffer is created, used to pin buffers into GTT
  gem_prime_get_pages: convert a drm_gem_object to an sg_table for export
  gem_prime_import_sg: convert an sg_table into a drm_gem_object
  gem_prime_vmap, gem_prime_vunmap: map and unmap an object
These hooks are optional; drivers can opt in by using drm_gem_prime_import and drm_gem_prime_export as the .gem_prime_import and .gem_prime_export fields of struct drm_driver.
Signed-off-by: Aaron Plattner <aplattner@nvidia.com>
---
 drivers/gpu/drm/drm_prime.c | 190 +++++++++++++++++++++++++++++++++++++++++++-
 include/drm/drmP.h          |  15 ++++
 2 files changed, 204 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c
index 7f12573..566c2ac 100644
--- a/drivers/gpu/drm/drm_prime.c
+++ b/drivers/gpu/drm/drm_prime.c
@@ -53,7 +53,8 @@
  * Self-importing: if userspace is using PRIME as a replacement for flink
  * then it will get a fd->handle request for a GEM object that it created.
  * Drivers should detect this situation and return back the gem object
- * from the dma-buf private.
+ * from the dma-buf private. Prime will do this automatically for drivers that
+ * use the drm_gem_prime_{import,export} helpers.
  */
 
 struct drm_prime_member {
@@ -62,6 +63,146 @@ struct drm_prime_member {
 	uint32_t handle;
 };
 
+static struct sg_table *drm_gem_map_dma_buf(struct dma_buf_attachment *attach,
+		enum dma_data_direction dir)
+{
+	struct drm_gem_object *obj = attach->dmabuf->priv;
+	struct sg_table *st;
+
+	mutex_lock_interruptible(&obj->dev->struct_mutex);
+
+	st = obj->dev->driver->gem_prime_get_pages(obj);
+
+	if (!IS_ERR_OR_NULL(st))
+		dma_map_sg(attach->dev, st->sgl, st->nents, dir);
+
+	mutex_unlock(&obj->dev->struct_mutex);
+	return st;
+}
+
+static void drm_gem_unmap_dma_buf(struct dma_buf_attachment *attach,
+		struct sg_table *sg, enum dma_data_direction dir)
+{
+	dma_unmap_sg(attach->dev, sg->sgl, sg->nents, dir);
+	sg_free_table(sg);
+	kfree(sg);
+}
+
+static void drm_gem_dmabuf_release(struct dma_buf *dma_buf)
+{
+	struct drm_gem_object *obj = dma_buf->priv;
+
+	if (obj->export_dma_buf == dma_buf) {
+		/* drop the reference on the export fd holds */
+		obj->export_dma_buf = NULL;
+		drm_gem_object_unreference_unlocked(obj);
+	}
+}
+
+static void *drm_gem_dmabuf_vmap(struct dma_buf *dma_buf)
+{
+	struct drm_gem_object *obj = dma_buf->priv;
+	struct drm_device *dev = obj->dev;
+	void *virtual = ERR_PTR(-EINVAL);
+
+	if (!dev->driver->gem_prime_vmap)
+		return ERR_PTR(-EINVAL);
+
+	mutex_lock(&dev->struct_mutex);
+	if (obj->vmapping_count) {
+		obj->vmapping_count++;
+		virtual = obj->vmapping_ptr;
+		goto out_unlock;
+	}
+
+	virtual = dev->driver->gem_prime_vmap(obj);
+	if (IS_ERR(virtual)) {
+		mutex_unlock(&dev->struct_mutex);
+		return virtual;
+	}
+	obj->vmapping_count = 1;
+	obj->vmapping_ptr = virtual;
+out_unlock:
+	mutex_unlock(&dev->struct_mutex);
+	return obj->vmapping_ptr;
+}
+
+static void drm_gem_dmabuf_vunmap(struct dma_buf *dma_buf, void *vaddr)
+{
+	struct drm_gem_object *obj = dma_buf->priv;
+	struct drm_device *dev = obj->dev;
+
+	mutex_lock(&dev->struct_mutex);
+	obj->vmapping_count--;
+	if (obj->vmapping_count == 0) {
+		dev->driver->gem_prime_vunmap(obj);
+		obj->vmapping_ptr = NULL;
+	}
+	mutex_unlock(&dev->struct_mutex);
+}
+
+static void *drm_gem_dmabuf_kmap_atomic(struct dma_buf *dma_buf,
+		unsigned long page_num)
+{
+	return NULL;
+}
+
+static void drm_gem_dmabuf_kunmap_atomic(struct dma_buf *dma_buf,
+		unsigned long page_num, void *addr)
+{
+
+}
+static void *drm_gem_dmabuf_kmap(struct dma_buf *dma_buf,
+		unsigned long page_num)
+{
+	return NULL;
+}
+
+static void drm_gem_dmabuf_kunmap(struct dma_buf *dma_buf,
+		unsigned long page_num, void *addr)
+{
+
+}
+
+static int drm_gem_dmabuf_mmap(struct dma_buf *dma_buf,
+		struct vm_area_struct *vma)
+{
+	return -EINVAL;
+}
+
+static int drm_gem_begin_cpu_access(struct dma_buf *dma_buf,
+		size_t start, size_t length, enum dma_data_direction direction)
+{
+	return -EINVAL;
+}
+
+static const struct dma_buf_ops drm_gem_prime_dmabuf_ops = {
+	.map_dma_buf = drm_gem_map_dma_buf,
+	.unmap_dma_buf = drm_gem_unmap_dma_buf,
+	.release = drm_gem_dmabuf_release,
+	.kmap = drm_gem_dmabuf_kmap,
+	.kmap_atomic = drm_gem_dmabuf_kmap_atomic,
+	.kunmap = drm_gem_dmabuf_kunmap,
+	.kunmap_atomic = drm_gem_dmabuf_kunmap_atomic,
+	.mmap = drm_gem_dmabuf_mmap,
+	.vmap = drm_gem_dmabuf_vmap,
+	.vunmap = drm_gem_dmabuf_vunmap,
+	.begin_cpu_access = drm_gem_begin_cpu_access,
+};
+
+struct dma_buf *drm_gem_prime_export(struct drm_device *dev,
+		struct drm_gem_object *obj, int flags)
+{
+	if (dev->driver->gem_prime_pin) {
+		int ret = dev->driver->gem_prime_pin(obj);
+		if (ret)
+			return ERR_PTR(ret);
+	}
+	return dma_buf_export(obj, &drm_gem_prime_dmabuf_ops, obj->size,
+			      0600);
+}
+EXPORT_SYMBOL(drm_gem_prime_export);
+
 int drm_gem_prime_handle_to_fd(struct drm_device *dev,
 		struct drm_file *file_priv, uint32_t handle, uint32_t flags,
 		int *prime_fd)
@@ -117,6 +258,53 @@ int drm_gem_prime_handle_to_fd(struct drm_device *dev,
 }
 EXPORT_SYMBOL(drm_gem_prime_handle_to_fd);
 
+struct drm_gem_object *drm_gem_prime_import(struct drm_device *dev,
+		struct dma_buf *dma_buf)
+{
+	struct dma_buf_attachment *attach;
+	struct sg_table *sg;
+	struct drm_gem_object *obj;
+	int ret;
+
+	if (!dev->driver->gem_prime_import_sg)
+		return ERR_PTR(-EINVAL);
+
+	if (dma_buf->ops == &drm_gem_prime_dmabuf_ops) {
+		obj = dma_buf->priv;
+		if (obj->dev == dev) {
+			drm_gem_object_reference(obj);
+			return obj;
+		}
+	}
+
+	attach = dma_buf_attach(dma_buf, dev->dev);
+	if (IS_ERR(attach))
+		return ERR_PTR(PTR_ERR(attach));
+
+	sg = dma_buf_map_attachment(attach, DMA_BIDIRECTIONAL);
+	if (IS_ERR_OR_NULL(sg)) {
+		ret = PTR_ERR(sg);
+		goto fail_detach;
+	}
+
+	obj = dev->driver->gem_prime_import_sg(dev, dma_buf->size, sg);
+	if (IS_ERR(obj)) {
+		ret = PTR_ERR(obj);
+		goto fail_unmap;
+	}
+
+	obj->import_attach = attach;
+
+	return obj;
+
+fail_unmap:
+	dma_buf_unmap_attachment(attach, sg, DMA_BIDIRECTIONAL);
+fail_detach:
+	dma_buf_detach(dma_buf, attach);
+	return ERR_PTR(ret);
+}
+EXPORT_SYMBOL(drm_gem_prime_import);
+
 int drm_gem_prime_fd_to_handle(struct drm_device *dev,
 		struct drm_file *file_priv, int prime_fd, uint32_t *handle)
 {
diff --git a/include/drm/drmP.h b/include/drm/drmP.h
index fad21c9..f65cae9 100644
--- a/include/drm/drmP.h
+++ b/include/drm/drmP.h
@@ -674,6 +674,10 @@ struct drm_gem_object {
 
 	/* dma buf attachment backing this object */
 	struct dma_buf_attachment *import_attach;
+
+	/* vmap information */
+	int vmapping_count;
+	void *vmapping_ptr;
 };
 
 #include <drm/drm_crtc.h>
@@ -919,6 +923,13 @@ struct drm_driver {
 	/* import dmabuf -> GEM */
 	struct drm_gem_object * (*gem_prime_import)(struct drm_device *dev,
 				struct dma_buf *dma_buf);
+	/* low-level interface used by drm_gem_prime_{import,export} */
+	int (*gem_prime_pin)(struct drm_gem_object *obj);
+	struct sg_table *(*gem_prime_get_pages)(struct drm_gem_object *obj);
+	struct drm_gem_object *(*gem_prime_import_sg)(struct drm_device *dev,
+				size_t size, struct sg_table *sg);
+	void *(*gem_prime_vmap)(struct drm_gem_object *obj);
+	void (*gem_prime_vunmap)(struct drm_gem_object *obj);
 
 	/* vga arb irq handler */
 	void (*vgaarb_irq)(struct drm_device *dev, bool state);
@@ -1540,9 +1551,13 @@ extern int drm_clients_info(struct seq_file *m, void* data);
 extern int drm_gem_name_info(struct seq_file *m, void *data);
 
+extern struct dma_buf *drm_gem_prime_export(struct drm_device *dev,
+		struct drm_gem_object *obj, int flags);
 extern int drm_gem_prime_handle_to_fd(struct drm_device *dev,
 		struct drm_file *file_priv, uint32_t handle, uint32_t flags,
 		int *prime_fd);
+extern struct drm_gem_object *drm_gem_prime_import(struct drm_device *dev,
+		struct dma_buf *dma_buf);
 extern int drm_gem_prime_fd_to_handle(struct drm_device *dev,
 		struct drm_file *file_priv, int prime_fd, uint32_t *handle);
On Thu, Dec 06, 2012 at 10:07:48AM -0800, Aaron Plattner wrote:
Instead of reimplementing all of the dma_buf functionality in every driver, create helpers drm_gem_prime_import and drm_gem_prime_export that implement it in terms of new, lower-level hook functions:

  gem_prime_pin: callback when a buffer is created, used to pin buffers into GTT
  gem_prime_get_pages: convert a drm_gem_object to an sg_table for export
  gem_prime_import_sg: convert an sg_table into a drm_gem_object
  gem_prime_vmap, gem_prime_vunmap: map and unmap an object
These hooks are optional; drivers can opt in by using drm_gem_prime_import and drm_gem_prime_export as the .gem_prime_import and .gem_prime_export fields of struct drm_driver.
Signed-off-by: Aaron Plattner aplattner@nvidia.com
A few comments:
- Can you please add kerneldoc entries for these new helpers? Laurent Pinchart started a nice drm kerneldoc as an overview. And since then we've at least integrated the kerneldoc api reference at a nice place there and brushed it up a bit when touching drm core or helpers in a bigger patch series. See e.g. my recent dp helper series for what I'm looking for. I think we should have kerneldoc at least for the exported functions.
- Just an idea for all the essential noop cpu helpers: Maybe just call them dma_buf*noop and shovel them as convenience helpers into the dma-buf.c code in the core, for drivers that don't want/can't/won't bother to implement the cpu access support. Related: Exporting the functions in general so that drivers could pick&choose
- s/gem_prime_get_pages/gem_prime_get_sg/: drm/i915 now uses sg_tables everywhere to manage backing storage, and we're starting to use the stolen memory, too. Stolen memory is not struct page backed, so better make this clear in the function calls. I know that the current prime/dma_buf code from Dave doesn't really work too well (*cough*) if the backing storage isn't struct page backed, at least on the ttm import side. This also links in with my 2nd comment above - dma_buf requires that the exporter map the sg into the importer's device space to support fake subdevices which share the exporter's iommu/gart, hence such an incestuous exporter might want to override some of the functions.
- To make it a first-class helper and remove any and all midlayer stints from this (I truly loathe midlayers ...) maybe shovel this into a drm_prime_helper.c file. Also, adding functions to the core vtable for pure helpers is usually not too neat, since it makes it way too damn easy to mix things up and transform the helper into a midlayer by accident. E.g. the crtc helper functions all just use a generic void * pointer for their vtables; other helpers either subclass an existing struct or use their completely independent structs (which the driver can both embed into its own object struct and then do the right upclassing in the callbacks). tbh I haven't looked too closely at the series to assess what would make sense, but we have way too much gunk in the drm vtable already ...
Dunno whether this aligns with what you want to achieve here. But on a quick hunch this code feels rather inflexible and just refactors the identical copypasta code back into one copy. Anyway, just figured I'll drop my 2 cents here, feel free to ignore (maybe save for the last one).
Cheers, Daniel
On Thu, Dec 06, 2012 at 10:46:30PM +0100, Daniel Vetter wrote:
On Thu, Dec 06, 2012 at 10:07:48AM -0800, Aaron Plattner wrote:
Instead of reimplementing all of the dma_buf functionality in every driver, create helpers drm_gem_prime_import and drm_gem_prime_export that implement it in terms of new, lower-level hook functions:

  gem_prime_pin: callback when a buffer is created, used to pin buffers into GTT
  gem_prime_get_pages: convert a drm_gem_object to an sg_table for export
  gem_prime_import_sg: convert an sg_table into a drm_gem_object
  gem_prime_vmap, gem_prime_vunmap: map and unmap an object
These hooks are optional; drivers can opt in by using drm_gem_prime_import and drm_gem_prime_export as the .gem_prime_import and .gem_prime_export fields of struct drm_driver.
Signed-off-by: Aaron Plattner aplattner@nvidia.com
A few comments:
Can you please add kerneldoc entries for these new helpers? Laurent Pinchart started a nice drm kerneldoc as an overview. And since then we've at least integrated the kerneldoc api reference at a nice place there and brushed it up a bit when touching drm core or helpers in a bigger patch series. See e.g. my recent dp helper series for what I'm looking for. I think we should have kerneldoc at least for the exported functions.
Just an idea for all the essential noop cpu helpers: Maybe just call them dma_buf*noop and shovel them as convenience helpers into the dma-buf.c code in the core, for drivers that don't want/can't/won't bother to implement the cpu access support. Related: Exporting the functions in general so that drivers could pick&choose
One more: For the cpu access noop helpers I'd vote for -ENOTTY as the more canonical "not implemented error, you're talking to the wrong thing" error instead of -EINVAL, which an exporter could throw back to the importer if e.g. the range is outside of the size of the dma_buf. With a quick dma_buf doc update we could then bless this as the official way to denounce cpu access support. -Daniel
On 06-12-12 22:50, Daniel Vetter wrote:
On Thu, Dec 06, 2012 at 10:46:30PM +0100, Daniel Vetter wrote:
On Thu, Dec 06, 2012 at 10:07:48AM -0800, Aaron Plattner wrote:
Instead of reimplementing all of the dma_buf functionality in every driver, create helpers drm_gem_prime_import and drm_gem_prime_export that implement it in terms of new, lower-level hook functions:

  gem_prime_pin: callback when a buffer is created, used to pin buffers into GTT
  gem_prime_get_pages: convert a drm_gem_object to an sg_table for export
  gem_prime_import_sg: convert an sg_table into a drm_gem_object
  gem_prime_vmap, gem_prime_vunmap: map and unmap an object
These hooks are optional; drivers can opt in by using drm_gem_prime_import and drm_gem_prime_export as the .gem_prime_import and .gem_prime_export fields of struct drm_driver.
Signed-off-by: Aaron Plattner aplattner@nvidia.com
A few comments:
Can you please add kerneldoc entries for these new helpers? Laurent Pinchart started a nice drm kerneldoc as an overview. And since then we've at least integrated the kerneldoc api reference at a nice place there and brushed it up a bit when touching drm core or helpers in a bigger patch series. See e.g. my recent dp helper series for what I'm looking for. I think we should have kerneldoc at least for the exported functions.
Just an idea for all the essential noop cpu helpers: Maybe just call them dma_buf*noop and shovel them as convenience helpers into the dma-buf.c code in the core, for drivers that don't want/can't/won't bother to implement the cpu access support. Related: Exporting the functions in general so that drivers could pick&choose
One more: For the cpu access noop helpers I'd vote for -ENOTTY as the more canonical "not implemented error, you're talking to the wrong thing" error instead of -EINVAL, which an exporter could throw back to the importer if e.g. the range is outside of the size of the dma_buf. With a quick dma_buf doc update we could then bless this as the official way to denounce cpu access support.
Yeah, let's reinvent a new error to return, and make it "not a typewriter" to confuse users, instead of using -ENODEV, which would actually be valid and most descriptive here if you read the mmap man page:
-ENODEV The underlying file system of the specified file does not support memory mapping.
~Maarten
On 12/06/2012 01:46 PM, Daniel Vetter wrote:
On Thu, Dec 06, 2012 at 10:07:48AM -0800, Aaron Plattner wrote:
Instead of reimplementing all of the dma_buf functionality in every driver, create helpers drm_gem_prime_import and drm_gem_prime_export that implement it in terms of new, lower-level hook functions:

  gem_prime_pin: callback when a buffer is created, used to pin buffers into GTT
  gem_prime_get_pages: convert a drm_gem_object to an sg_table for export
  gem_prime_import_sg: convert an sg_table into a drm_gem_object
  gem_prime_vmap, gem_prime_vunmap: map and unmap an object
These hooks are optional; drivers can opt in by using drm_gem_prime_import and drm_gem_prime_export as the .gem_prime_import and .gem_prime_export fields of struct drm_driver.
Signed-off-by: Aaron Plattner aplattner@nvidia.com
A few comments:
- Can you please add kerneldoc entries for these new helpers? Laurent Pinchart started a nice drm kerneldoc as an overview. And since then we've at least integrated the kerneldoc api reference at a nice place there and brushed it up a bit when touching drm core or helpers in a bigger patch series. See e.g. my recent dp helper series for what I'm looking for. I think we should have kerneldoc at least for the exported functions.
Good call, I'll look into that.
- Just an idea for all the essential noop cpu helpers: Maybe just call them dma_buf*noop and shovel them as convenience helpers into the dma-buf.c code in the core, for drivers that don't want/can't/won't bother to implement the cpu access support. Related: Exporting the functions in general so that drivers could pick&choose
Seems like a good idea as a follow-on change. Do you think it would make more sense to have noop helpers, or allow them to be NULL in dma-buf.c?
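Either way the callbacks in question are trivial. A minimal sketch of what shared no-op CPU-access helpers could look like if they lived in dma-buf core (the names are hypothetical, and the errno choice is the one debated earlier in the thread):

static void *dma_buf_kmap_noop(struct dma_buf *dma_buf, unsigned long page_num)
{
        return NULL;    /* no CPU access support */
}

static void dma_buf_kunmap_noop(struct dma_buf *dma_buf, unsigned long page_num,
                                void *addr)
{
}

static int dma_buf_mmap_noop(struct dma_buf *dma_buf, struct vm_area_struct *vma)
{
        /* -EINVAL today; -ENOTTY or -ENODEV per the errno discussion above */
        return -EINVAL;
}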
- s/gem_prime_get_pages/gem_prime_get_sg/: drm/i915 now uses sg_tables everywhere to manage backing storage, and we're starting to use the stolen memory, too. Stolen memory is not struct page backed, so better make this clear in the function calls. I know that the current prime/dma_buf code from Dave doesn't really work too well (*cough*) if the backing storage isn't struct page backed, at least on the ttm import side. This also links in with my 2nd comment above - dma_buf requires that the exporter map the sg into the importer's device space to support fake subdevices which share the exporter's iommu/gart, hence such an incestuous exporter might want to override some of the functions.
Will do in a v2 set.
- To make it a first-class helper and remove any and all midlayer stints from this (I truly loathe midlayers ...) maybe shovel this into a drm_prime_helper.c file. Also, adding functions to the core vtable for pure helpers is usually not too neat, since it makes it way too damn easy to mix things up and transform the helper into a midlayer by accident. E.g. the crtc helper functions all just use a generic void * pointer for their vtables; other helpers either subclass an existing struct or use their completely independent structs (which the driver can both embed into its own object struct and then do the right upclassing in the callbacks). tbh I haven't looked too closely at the series to assess what would make sense, but we have way too much gunk in the drm vtable already ...
I'm not sure I fully understand what you mean by this. Are you suggesting creating a separate struct for these functions and having a void* pointer to that in struct drm_driver?
Dunno whether this aligns with what you want to achieve here. But on a quick hunch this code feels rather inflexible and just refactors the identical copypasta code back into one copy. Anyway, just figured I'll drop my 2 cents here, feel free to ignore (maybe save for the last one).
I think uncopypasta-ing the current code is beneficial on its own. Do you think doing this now would make it harder to do the further refactoring you suggest later?
-- Aaron
On Fri, Dec 07, 2012 at 09:58:38AM -0800, Aaron Plattner wrote:
On 12/06/2012 01:46 PM, Daniel Vetter wrote:
[snip]
- Just an idea for all the essential noop cpu helpers: Maybe just call them dma_buf*noop and shovel them as convenience helpers into the dma-buf.c code in the core, for drivers that don't want/can't/won't bother to implement the cpu access support. Related: Exporting the functions in general so that drivers could pick&choose
Seems like a good idea as a follow-on change. Do you think it would make more sense to have noop helpers, or allow them to be NULL in dma-buf.c?
Yeah, sounds good. Just thought that drm drivers aren't the only ones not caring about cpu access support too much, so an officially blessed way with documentation and unified errno would be good
[snip]
- To make it a first-class helper and remove any and all midlayer stints from this (I truly loathe midlayers ...) maybe shovel this into a drm_prime_helper.c file. Also, adding functions to the core vtable for pure helpers is usually not too neat, since it makes it way too damn easy to mix things up and transform the helper into a midlayer by accident. E.g. the crtc helper functions all just use a generic void * pointer for their vtables; other helpers either subclass an existing struct or use their completely independent structs (which the driver can both embed into its own object struct and then do the right upclassing in the callbacks). tbh I haven't looked too closely at the series to assess what would make sense, but we have way too much gunk in the drm vtable already ...
I'm not sure I fully understand what you mean by this. Are you suggesting creating a separate struct for these functions and having a void* pointer to that in struct drm_driver?
Create a new struct like
struct drm_prime_helper_funcs {
        int (*gem_prime_pin)(struct drm_gem_object *obj);
        struct sg_table *(*gem_prime_get_pages)(struct drm_gem_object *obj);
        struct drm_gem_object *(*gem_prime_import_sg)(struct drm_device *dev,
                                size_t size, struct sg_table *sg);
        void *(*gem_prime_vmap)(struct drm_gem_object *obj);
        void (*gem_prime_vunmap)(struct drm_gem_object *obj);
};
struct drm_prime_helper_obj {
        /* vmap information */
        int vmapping_count;
        void *vmapping_ptr;

        /* any other data the helpers need ... */
        struct drm_prime_helper_funcs *funcs;
};
Drivers then fill out a func vtable and embed the helper obj into their own gpu bo struct, i.e.
struct radeon_bo {
        struct ttm_bo ttm_bo;
        struct gem_buffer_object gem_bo;
        struct drm_prime_helper_obj prime_bo;

        /* other stuff */
};
In the driver's prime helper callbacks, you then upcast the prime_bo to the radeon_bo instead of upcasting the gem_bo to the radeon_bo.
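A callback under that scheme might then look roughly like this (a sketch of the proposal above; the prime_bo field and the function name are hypothetical):

static struct sg_table *radeon_prime_get_pages(struct drm_prime_helper_obj *hobj)
{
        /* upcast from the embedded helper object to the driver bo */
        struct radeon_bo *bo = container_of(hobj, struct radeon_bo, prime_bo);

        /* build an sg_table from the bo's backing pages */
        return drm_prime_pages_to_sg(bo->tbo.ttm->pages, bo->tbo.num_pages);
}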
Upsides: Guaranteed to not interfere with drivers not using it, with an imo higher chance that the helper is cleanly separate from core stuff (no guarantee ofc) and so easier to refactor in the future. And drm is full of legacy stuff used by very few drivers, but highly intermingled with drm core used by other drivers (my little holy war here is to slowly refactor those things out and shovel them into drivers), hence why I prefer to have as clean a separation of optional stuff as possible ...
Also, this way allows different drivers to use completely different helper libraries for the same task, without those two helper libraries ever interfering.
Downside: Adds a pointer to each bo struct for drivers using the helpers. Given how big those tend to be, not too onerous. And maybe a bit of confusion in driver callbacks, but since the Linux container_of is typechecked by the compiler, not too onerous either.
Dunno whether this aligns with what you want to achieve here. But on a quick hunch this code feels rather inflexible and just refactors the identical copypasta code back into one copy. Anyway, just figured I'll drop my 2 cents here, feel free to ignore (maybe save for the last one).
I think uncopypasta-ing the current code is beneficial on its own. Do you think doing this now would make it harder to do the further refactoring you suggest later?
If we later on go with something like the above, it'll require going over all drivers again, doing similar mechanical changes. If no one has yet managed to intermingle the helpers with the core in the meantime, that is ;-)
Cheers, Daniel
On 12/07/2012 10:48 AM, Daniel Vetter wrote:
On Fri, Dec 07, 2012 at 09:58:38AM -0800, Aaron Plattner wrote:
On 12/06/2012 01:46 PM, Daniel Vetter wrote:
- To make it a first-class helper and remove any and all midlayer stints from this (I truly loathe midlayers ...) maybe shovel this into a drm_prime_helper.c file. Also, adding functions to the core vtable for pure helpers is usually not too neat, since it makes it way too damn easy to mix things up and transform the helper into a midlayer by accident. E.g. the crtc helper functions all just use a generic void * pointer for their vtables; other helpers either subclass an existing struct or use their completely independent structs (which the driver can both embed into its own object struct and then do the right upclassing in the callbacks). tbh I haven't looked too closely at the series to assess what would make sense, but we have way too much gunk in the drm vtable already ...
I'm not sure I fully understand what you mean by this. Are you suggesting creating a separate struct for these functions and having a void* pointer to that in struct drm_driver?
Create a new struct like
struct drm_prime_helper_funcs {
        int (*gem_prime_pin)(struct drm_gem_object *obj);
        struct sg_table *(*gem_prime_get_pages)(struct drm_gem_object *obj);
        struct drm_gem_object *(*gem_prime_import_sg)(struct drm_device *dev,
                                size_t size, struct sg_table *sg);
        void *(*gem_prime_vmap)(struct drm_gem_object *obj);
        void (*gem_prime_vunmap)(struct drm_gem_object *obj);
};
struct drm_prime_helper_obj {
        /* vmap information */
        int vmapping_count;
        void *vmapping_ptr;

        /* any other data the helpers need ... */
        struct drm_prime_helper_funcs *funcs;
};
Ah, I see what you mean. This would also need either a pointer to the drv directly so the helpers can lock drv->struct_mutex, or a helper function to convert from a drm_prime_helper_obj* to a gem_buffer_object* in a driver-specific way since the helper doesn't know the layout of the driver's bo structure.
So it would be something like
        struct drm_prime_helper_obj *helper_obj = dma_buf->priv;
        struct drm_prime_helper_funcs *funcs = helper_obj->funcs;
        struct drm_gem_object *obj = funcs->get_gem_obj(helper_obj);

        mutex_lock(&obj->dev->struct_mutex);
        funcs->whatever(obj);
        mutex_unlock(&obj->dev->struct_mutex);
and then for example, radeon_drm_prime_get_gem_obj would be
        struct radeon_bo *bo = container_of(helper_obj, struct radeon_bo,
                                            prime_helper);
        return &bo->gem_base;
?
That seems a little over-engineered to me, but I can code it up.
-- Aaron
On Fri, Dec 7, 2012 at 9:33 PM, Aaron Plattner aplattner@nvidia.com wrote:
Ah, I see what you mean. This would also need either a pointer to the drv directly so the helpers can lock drv->struct_mutex, or a helper function to convert from a drm_prime_helper_obj* to a gem_buffer_object* in a driver-specific way since the helper doesn't know the layout of the driver's bo structure.
Ah, locking, and it involves dev->struct_mutex. Another pet peeve of mine ;-) Tbh I haven't looked at locking issues when stitching together my quick proposal.
The problem with the dev->struct_mutex lock is that it's everywhere, and hence it's way too easy to create deadlocks with it. Especially with prime, where a simple rule like "take this lock first, always" suddenly gets really complicated, since gem drivers can assume both the importer and exporter role ... I haven't done a full locking review, but I expect current prime to deadlock (through driver calls) when userspace starts doing funny things.
So imo it's best to move dev->struct_mutex as far away as possible from helper functions and think really hard whether we really need it. I've noticed three functions:
drm_gem_map_dma_buf: We can remove the locking here imo, if drivers need dev->struct_mutex for internal consistency, they can take it in their callbacks. The dma_buf_attachment is very much like a fancy sg list + backing storage pointer. If multiple callers concurrently interact with the same attachment, havoc might ensue. Importers should add their own locking if concurrent access is possible (e.g. similar to how filling in the same sg table concurrently and then calling dma_map_sg concurrently would lead to chaos). Otoh creating/destroying attachments with attach/detach is protected by the dma_buf->lock.
I've checked the dma-buf docs, and this is not specced in the docs, but it was very much the intention. So I think we can solve this with a simple doc update ;-)
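Concretely, the helper could shrink to something like this sketch, with any driver-internal locking pushed down into the gem_prime_get_pages callback (illustrative only, not part of the posted series):

static struct sg_table *drm_gem_map_dma_buf(struct dma_buf_attachment *attach,
                                            enum dma_data_direction dir)
{
        struct drm_gem_object *obj = attach->dmabuf->priv;
        struct sg_table *sgt;

        /* no dev->struct_mutex here; the callback locks internally if needed */
        sgt = obj->dev->driver->gem_prime_get_pages(obj);
        if (!IS_ERR_OR_NULL(sgt))
                dma_map_sg(attach->dev, sgt->sgl, sgt->nents, dir);

        return sgt;
}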
drm_gem_dmabuf_v(un)map: Besides the fact that drivers might need to lock for internal consistency, the lock only protects the vmapping_count and pointer and avoids racing concurrent vmap/vunmap calls into the driver. Now vmap/vunmap is very much like an attachment, just that we don't have an explicit struct for each client since there's only one cpu address space.
So pretty much every driver is forced to reinvent vmap refcounting, which feels like an interface bug. What about moving this code to the dma-buf.c interface shim and protecting it with dma_buf->lock? Atm we only take that lock for attach/detach, and drivers using vmap place the vmap call at the same spot hw drivers would place the attach call, so this shouldn't introduce any nefarious locking issues. So I think this would be the right way to move forward.
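As a rough sketch of that idea, a core dma_buf_vmap() could do the refcounting under dma_buf->lock before calling into the exporter; the vmap_count/vmap_ptr fields here are hypothetical, not existing dma_buf members:

void *dma_buf_vmap(struct dma_buf *dmabuf)
{
        void *ptr = NULL;

        if (!dmabuf->ops->vmap)
                return NULL;

        mutex_lock(&dmabuf->lock);
        if (dmabuf->vmap_count) {
                /* already mapped, just bump the refcount */
                dmabuf->vmap_count++;
                ptr = dmabuf->vmap_ptr;
                goto out_unlock;
        }

        ptr = dmabuf->ops->vmap(dmabuf);
        if (IS_ERR_OR_NULL(ptr))
                goto out_unlock;

        dmabuf->vmap_ptr = ptr;
        dmabuf->vmap_count = 1;
out_unlock:
        mutex_unlock(&dmabuf->lock);
        return ptr;
}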
And with that we've weaned the prime helpers off their dependency on dev->struct_mutex, which should make them a notch more flexible and hence more useful (since fighting locking inversions is a royal pain best avoided).
Comments?
Cheers, Daniel
On 12/07/2012 01:38 PM, Daniel Vetter wrote:
On Fri, Dec 7, 2012 at 9:33 PM, Aaron Plattner aplattner@nvidia.com wrote:
Ah, I see what you mean. This would also need either a pointer to the drv directly so the helpers can lock drv->struct_mutex, or a helper function to convert from a drm_prime_helper_obj* to a gem_buffer_object* in a driver-specific way since the helper doesn't know the layout of the driver's bo structure.
Ah, locking, and it involves dev->struct_mutex. Another pet peeve of mine ;-) Tbh I haven't looked at locking issues when stitching together my quick proposal.
The problem with the dev->struct_mutex lock is that it's everywhere, and hence it's way too easy to create deadlocks with it. Especially with prime, where a simple rule like "take this lock first, always" suddenly gets really complicated, since gem drivers can assume both the importer and exporter role ... I haven't done a full locking review, but I expect current prime to deadlock (through driver calls) when userspace starts doing funny things.
So imo it's best to move dev->struct_mutex as far away as possible from helper functions and think really hard whether we really need it. I've noticed three functions:
drm_gem_map_dma_buf: We can remove the locking here imo, if drivers need dev->struct_mutex for internal consistency, they can take it in their callbacks. The dma_buf_attachment is very much like a fancy sg list + backing storage pointer. If multiple callers concurrently interact with the same attachment, havoc might ensue. Importers should add their own locking if concurrent access is possible (e.g. similar to how filling in the same sg table concurrently and then calling dma_map_sg concurrently would lead to chaos). Otoh creating/destroying attachments with attach/detach is protected by the dma_buf->lock.
I've checked the dma-buf docs, and this is not specced in the docs, but it was very much the intention. So I think we can solve this with a simple doc update ;-)
drm_gem_dmabuf_v(un)map: Besides the fact that drivers might need to lock for internal consistency, the lock only protects the vmapping_count and pointer and avoids racing concurrent vmap/vunmap calls into the driver. Now vmap/vunmap is very much like an attachment, just that we don't have an explicit struct for each client since there's only one cpu address space.
So pretty much every driver is forced to reinvent vmap refcounting, which feels like an interface bug. What about moving this code to the dma-buf.c interface shim and protecting it with dma_buf->lock? Atm we only take that lock for attach/detach, and drivers using vmap place the vmap call at the same spot hw drivers would place the attach call, so this shouldn't introduce any nefarious locking issues. So I think this would be the right way to move forward.
And with that we've weaned the prime helpers off their dependency on dev->struct_mutex, which should make them a notch more flexible and hence more useful (since fighting locking inversions is a royal pain best avoided).
Comments?
This all sounds good, but it's getting well outside the realm of what I'm comfortable doing in the short term as a kernel noob. Would you object to going with a version of these changes that leaves the new hooks where they are now and then applying these suggestions later? I don't think the risk of the helpers turning into a required mid-layer is very high given that i915 won't use them and it's one of the most-tested drivers.
-- Aaron
Simplify the Nouveau prime implementation by using the default behavior provided by drm_gem_prime_import and drm_gem_prime_export.
Signed-off-by: Aaron Plattner <aplattner@nvidia.com>
---
 drivers/gpu/drm/nouveau/nouveau_bo.h    |   1 -
 drivers/gpu/drm/nouveau/nouveau_drm.c   |   9 +-
 drivers/gpu/drm/nouveau/nouveau_gem.c   |   2 -
 drivers/gpu/drm/nouveau/nouveau_gem.h   |  11 +-
 drivers/gpu/drm/nouveau/nouveau_prime.c | 172 ++++----------------------
 5 files changed, 35 insertions(+), 160 deletions(-)
diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.h b/drivers/gpu/drm/nouveau/nouveau_bo.h index dec51b1..1793631 100644 --- a/drivers/gpu/drm/nouveau/nouveau_bo.h +++ b/drivers/gpu/drm/nouveau/nouveau_bo.h @@ -31,7 +31,6 @@ struct nouveau_bo { int pin_refcnt;
struct ttm_bo_kmap_obj dma_buf_vmap; - int vmapping_count; };
static inline struct nouveau_bo * diff --git a/drivers/gpu/drm/nouveau/nouveau_drm.c b/drivers/gpu/drm/nouveau/nouveau_drm.c index a7529b3..7bbe1c4 100644 --- a/drivers/gpu/drm/nouveau/nouveau_drm.c +++ b/drivers/gpu/drm/nouveau/nouveau_drm.c @@ -644,8 +644,13 @@ driver = {
.prime_handle_to_fd = drm_gem_prime_handle_to_fd, .prime_fd_to_handle = drm_gem_prime_fd_to_handle, - .gem_prime_export = nouveau_gem_prime_export, - .gem_prime_import = nouveau_gem_prime_import, + .gem_prime_export = drm_gem_prime_export, + .gem_prime_import = drm_gem_prime_import, + .gem_prime_pin = nouveau_gem_prime_pin, + .gem_prime_get_pages = nouveau_gem_prime_get_pages, + .gem_prime_import_sg = nouveau_gem_prime_import_sg, + .gem_prime_vmap = nouveau_gem_prime_vmap, + .gem_prime_vunmap = nouveau_gem_prime_vunmap,
.gem_init_object = nouveau_gem_object_new, .gem_free_object = nouveau_gem_object_del, diff --git a/drivers/gpu/drm/nouveau/nouveau_gem.c b/drivers/gpu/drm/nouveau/nouveau_gem.c index 5e2f521..9c01f7c 100644 --- a/drivers/gpu/drm/nouveau/nouveau_gem.c +++ b/drivers/gpu/drm/nouveau/nouveau_gem.c @@ -24,8 +24,6 @@ * */
-#include <linux/dma-buf.h> - #include <subdev/fb.h>
#include "nouveau_drm.h" diff --git a/drivers/gpu/drm/nouveau/nouveau_gem.h b/drivers/gpu/drm/nouveau/nouveau_gem.h index 5c10492..195e23c 100644 --- a/drivers/gpu/drm/nouveau/nouveau_gem.h +++ b/drivers/gpu/drm/nouveau/nouveau_gem.h @@ -35,9 +35,12 @@ extern int nouveau_gem_ioctl_cpu_fini(struct drm_device *, void *, extern int nouveau_gem_ioctl_info(struct drm_device *, void *, struct drm_file *);
-extern struct dma_buf *nouveau_gem_prime_export(struct drm_device *dev, - struct drm_gem_object *obj, int flags); -extern struct drm_gem_object *nouveau_gem_prime_import(struct drm_device *dev, - struct dma_buf *dma_buf); +extern int nouveau_gem_prime_pin(struct drm_gem_object *); +extern struct sg_table *nouveau_gem_prime_get_pages(struct drm_gem_object *); +extern struct drm_gem_object *nouveau_gem_prime_import_sg(struct drm_device *, + size_t size, + struct sg_table *); +extern void *nouveau_gem_prime_vmap(struct drm_gem_object *); +extern void nouveau_gem_prime_vunmap(struct drm_gem_object *);
#endif diff --git a/drivers/gpu/drm/nouveau/nouveau_prime.c b/drivers/gpu/drm/nouveau/nouveau_prime.c index 3543fec..c3683f0 100644 --- a/drivers/gpu/drm/nouveau/nouveau_prime.c +++ b/drivers/gpu/drm/nouveau/nouveau_prime.c @@ -22,126 +22,42 @@ * Authors: Dave Airlie */
-#include <linux/dma-buf.h> - #include <drm/drmP.h>
#include "nouveau_drm.h" #include "nouveau_gem.h"
-static struct sg_table *nouveau_gem_map_dma_buf(struct dma_buf_attachment *attachment, - enum dma_data_direction dir) +struct sg_table *nouveau_gem_prime_get_pages(struct drm_gem_object *obj) { - struct nouveau_bo *nvbo = attachment->dmabuf->priv; - struct drm_device *dev = nvbo->gem->dev; + struct nouveau_bo *nvbo = nouveau_gem_object(obj); int npages = nvbo->bo.num_pages; - struct sg_table *sg; - int nents; - - mutex_lock(&dev->struct_mutex); - sg = drm_prime_pages_to_sg(nvbo->bo.ttm->pages, npages); - nents = dma_map_sg(attachment->dev, sg->sgl, sg->nents, dir); - mutex_unlock(&dev->struct_mutex); - return sg; -} - -static void nouveau_gem_unmap_dma_buf(struct dma_buf_attachment *attachment, - struct sg_table *sg, enum dma_data_direction dir) -{ - dma_unmap_sg(attachment->dev, sg->sgl, sg->nents, dir); - sg_free_table(sg); - kfree(sg); -} - -static void nouveau_gem_dmabuf_release(struct dma_buf *dma_buf) -{ - struct nouveau_bo *nvbo = dma_buf->priv; - - if (nvbo->gem->export_dma_buf == dma_buf) { - nvbo->gem->export_dma_buf = NULL; - drm_gem_object_unreference_unlocked(nvbo->gem); - } -} - -static void *nouveau_gem_kmap_atomic(struct dma_buf *dma_buf, unsigned long page_num) -{ - return NULL; -} - -static void nouveau_gem_kunmap_atomic(struct dma_buf *dma_buf, unsigned long page_num, void *addr) -{
-} -static void *nouveau_gem_kmap(struct dma_buf *dma_buf, unsigned long page_num) -{ - return NULL; + return drm_prime_pages_to_sg(nvbo->bo.ttm->pages, npages); }
-static void nouveau_gem_kunmap(struct dma_buf *dma_buf, unsigned long page_num, void *addr) +void *nouveau_gem_prime_vmap(struct drm_gem_object *obj) { - -} - -static int nouveau_gem_prime_mmap(struct dma_buf *dma_buf, struct vm_area_struct *vma) -{ - return -EINVAL; -} - -static void *nouveau_gem_prime_vmap(struct dma_buf *dma_buf) -{ - struct nouveau_bo *nvbo = dma_buf->priv; - struct drm_device *dev = nvbo->gem->dev; + struct nouveau_bo *nvbo = nouveau_gem_object(obj); int ret;
- mutex_lock(&dev->struct_mutex); - if (nvbo->vmapping_count) { - nvbo->vmapping_count++; - goto out_unlock; - } - ret = ttm_bo_kmap(&nvbo->bo, 0, nvbo->bo.num_pages, &nvbo->dma_buf_vmap); - if (ret) { - mutex_unlock(&dev->struct_mutex); + if (ret) return ERR_PTR(ret); - } - nvbo->vmapping_count = 1; -out_unlock: - mutex_unlock(&dev->struct_mutex); + return nvbo->dma_buf_vmap.virtual; }
-static void nouveau_gem_prime_vunmap(struct dma_buf *dma_buf, void *vaddr) +void nouveau_gem_prime_vunmap(struct drm_gem_object *obj) { - struct nouveau_bo *nvbo = dma_buf->priv; - struct drm_device *dev = nvbo->gem->dev; + struct nouveau_bo *nvbo = nouveau_gem_object(obj);
- mutex_lock(&dev->struct_mutex); - nvbo->vmapping_count--; - if (nvbo->vmapping_count == 0) { - ttm_bo_kunmap(&nvbo->dma_buf_vmap); - } - mutex_unlock(&dev->struct_mutex); + ttm_bo_kunmap(&nvbo->dma_buf_vmap); }
-static const struct dma_buf_ops nouveau_dmabuf_ops = { - .map_dma_buf = nouveau_gem_map_dma_buf, - .unmap_dma_buf = nouveau_gem_unmap_dma_buf, - .release = nouveau_gem_dmabuf_release, - .kmap = nouveau_gem_kmap, - .kmap_atomic = nouveau_gem_kmap_atomic, - .kunmap = nouveau_gem_kunmap, - .kunmap_atomic = nouveau_gem_kunmap_atomic, - .mmap = nouveau_gem_prime_mmap, - .vmap = nouveau_gem_prime_vmap, - .vunmap = nouveau_gem_prime_vunmap, -}; - -static int -nouveau_prime_new(struct drm_device *dev, - size_t size, - struct sg_table *sg, - struct nouveau_bo **pnvbo) +struct drm_gem_object *nouveau_gem_prime_import_sg(struct drm_device *dev, + size_t size, + struct sg_table *sg) { struct nouveau_bo *nvbo; u32 flags = 0; @@ -150,24 +66,22 @@ nouveau_prime_new(struct drm_device *dev, flags = TTM_PL_FLAG_TT;
ret = nouveau_bo_new(dev, size, 0, flags, 0, 0, - sg, pnvbo); + sg, &nvbo); if (ret) - return ret; - nvbo = *pnvbo; + return ERR_PTR(ret);
nvbo->valid_domains = NOUVEAU_GEM_DOMAIN_GART; nvbo->gem = drm_gem_object_alloc(dev, nvbo->bo.mem.size); if (!nvbo->gem) { - nouveau_bo_ref(NULL, pnvbo); - return -ENOMEM; + nouveau_bo_ref(NULL, &nvbo); + return ERR_PTR(-ENOMEM); }
nvbo->gem->driver_private = nvbo; - return 0; + return nvbo->gem; }
-struct dma_buf *nouveau_gem_prime_export(struct drm_device *dev, - struct drm_gem_object *obj, int flags) +int nouveau_gem_prime_pin(struct drm_gem_object *obj) { struct nouveau_bo *nvbo = nouveau_gem_object(obj); int ret = 0; @@ -175,51 +89,7 @@ struct dma_buf *nouveau_gem_prime_export(struct drm_device *dev, /* pin buffer into GTT */ ret = nouveau_bo_pin(nvbo, TTM_PL_FLAG_TT); if (ret) - return ERR_PTR(-EINVAL); - - return dma_buf_export(nvbo, &nouveau_dmabuf_ops, obj->size, flags); -} - -struct drm_gem_object *nouveau_gem_prime_import(struct drm_device *dev, - struct dma_buf *dma_buf) -{ - struct dma_buf_attachment *attach; - struct sg_table *sg; - struct nouveau_bo *nvbo; - int ret; - - if (dma_buf->ops == &nouveau_dmabuf_ops) { - nvbo = dma_buf->priv; - if (nvbo->gem) { - if (nvbo->gem->dev == dev) { - drm_gem_object_reference(nvbo->gem); - return nvbo->gem; - } - } - } - /* need to attach */ - attach = dma_buf_attach(dma_buf, dev->dev); - if (IS_ERR(attach)) - return ERR_PTR(PTR_ERR(attach)); - - sg = dma_buf_map_attachment(attach, DMA_BIDIRECTIONAL); - if (IS_ERR(sg)) { - ret = PTR_ERR(sg); - goto fail_detach; - } - - ret = nouveau_prime_new(dev, dma_buf->size, sg, &nvbo); - if (ret) - goto fail_unmap; - - nvbo->gem->import_attach = attach; - - return nvbo->gem; + return -EINVAL;
-fail_unmap: - dma_buf_unmap_attachment(attach, sg, DMA_BIDIRECTIONAL); -fail_detach: - dma_buf_detach(dma_buf, attach); - return ERR_PTR(ret); + return 0; } -
Simplify the Radeon prime implementation by using the default behavior provided by drm_gem_prime_import and drm_gem_prime_export.
Signed-off-by: Aaron Plattner <aplattner@nvidia.com>
---
 drivers/gpu/drm/radeon/radeon_drv.c   |  17 ++--
 drivers/gpu/drm/radeon/radeon_prime.c | 169 +++++-----------------------
 2 files changed, 31 insertions(+), 155 deletions(-)
diff --git a/drivers/gpu/drm/radeon/radeon_drv.c b/drivers/gpu/drm/radeon/radeon_drv.c index 8c1a83c..a724c3c 100644 --- a/drivers/gpu/drm/radeon/radeon_drv.c +++ b/drivers/gpu/drm/radeon/radeon_drv.c @@ -113,11 +113,11 @@ int radeon_mode_dumb_create(struct drm_file *file_priv, int radeon_mode_dumb_destroy(struct drm_file *file_priv, struct drm_device *dev, uint32_t handle); -struct dma_buf *radeon_gem_prime_export(struct drm_device *dev, - struct drm_gem_object *obj, - int flags); -struct drm_gem_object *radeon_gem_prime_import(struct drm_device *dev, - struct dma_buf *dma_buf); +struct sg_table *radeon_gem_prime_get_pages(struct drm_gem_object *obj); +struct drm_gem_object *radeon_gem_prime_import_sg(struct drm_device *dev, + size_t size, + struct sg_table *sg); +int radeon_gem_prime_pin(struct drm_gem_object *obj);
#if defined(CONFIG_DEBUG_FS) int radeon_debugfs_init(struct drm_minor *minor); @@ -392,8 +392,11 @@ static struct drm_driver kms_driver = {
.prime_handle_to_fd = drm_gem_prime_handle_to_fd, .prime_fd_to_handle = drm_gem_prime_fd_to_handle, - .gem_prime_export = radeon_gem_prime_export, - .gem_prime_import = radeon_gem_prime_import, + .gem_prime_export = drm_gem_prime_export, + .gem_prime_import = drm_gem_prime_import, + .gem_prime_pin = radeon_gem_prime_pin, + .gem_prime_get_pages = radeon_gem_prime_get_pages, + .gem_prime_import_sg = radeon_gem_prime_import_sg,
.name = DRIVER_NAME, .desc = DRIVER_DESC, diff --git a/drivers/gpu/drm/radeon/radeon_prime.c b/drivers/gpu/drm/radeon/radeon_prime.c index e095218..3a047c7 100644 --- a/drivers/gpu/drm/radeon/radeon_prime.c +++ b/drivers/gpu/drm/radeon/radeon_prime.c @@ -28,198 +28,71 @@ #include "radeon.h" #include <drm/radeon_drm.h>
-#include <linux/dma-buf.h> - -static struct sg_table *radeon_gem_map_dma_buf(struct dma_buf_attachment *attachment, - enum dma_data_direction dir) +struct sg_table *radeon_gem_prime_get_pages(struct drm_gem_object *obj) { - struct radeon_bo *bo = attachment->dmabuf->priv; - struct drm_device *dev = bo->rdev->ddev; + struct radeon_bo *bo = gem_to_radeon_bo(obj); int npages = bo->tbo.num_pages; - struct sg_table *sg; - int nents; - - mutex_lock(&dev->struct_mutex); - sg = drm_prime_pages_to_sg(bo->tbo.ttm->pages, npages); - nents = dma_map_sg(attachment->dev, sg->sgl, sg->nents, dir); - mutex_unlock(&dev->struct_mutex); - return sg; -} - -static void radeon_gem_unmap_dma_buf(struct dma_buf_attachment *attachment, - struct sg_table *sg, enum dma_data_direction dir) -{ - dma_unmap_sg(attachment->dev, sg->sgl, sg->nents, dir); - sg_free_table(sg); - kfree(sg); -} - -static void radeon_gem_dmabuf_release(struct dma_buf *dma_buf) -{ - struct radeon_bo *bo = dma_buf->priv; - - if (bo->gem_base.export_dma_buf == dma_buf) { - DRM_ERROR("unreference dmabuf %p\n", &bo->gem_base); - bo->gem_base.export_dma_buf = NULL; - drm_gem_object_unreference_unlocked(&bo->gem_base); - } -} - -static void *radeon_gem_kmap_atomic(struct dma_buf *dma_buf, unsigned long page_num) -{ - return NULL; -} - -static void radeon_gem_kunmap_atomic(struct dma_buf *dma_buf, unsigned long page_num, void *addr) -{ - -} -static void *radeon_gem_kmap(struct dma_buf *dma_buf, unsigned long page_num) -{ - return NULL; -} - -static void radeon_gem_kunmap(struct dma_buf *dma_buf, unsigned long page_num, void *addr) -{
+ return drm_prime_pages_to_sg(bo->tbo.ttm->pages, npages); }
-static int radeon_gem_prime_mmap(struct dma_buf *dma_buf, struct vm_area_struct *vma) +void *radeon_gem_prime_vmap(struct drm_gem_object *obj) { - return -EINVAL; -} - -static void *radeon_gem_prime_vmap(struct dma_buf *dma_buf) -{ - struct radeon_bo *bo = dma_buf->priv; - struct drm_device *dev = bo->rdev->ddev; + struct radeon_bo *bo = gem_to_radeon_bo(obj); int ret;
- mutex_lock(&dev->struct_mutex); - if (bo->vmapping_count) { - bo->vmapping_count++; - goto out_unlock; - } - ret = ttm_bo_kmap(&bo->tbo, 0, bo->tbo.num_pages, &bo->dma_buf_vmap); - if (ret) { - mutex_unlock(&dev->struct_mutex); + if (ret) return ERR_PTR(ret); - } - bo->vmapping_count = 1; -out_unlock: - mutex_unlock(&dev->struct_mutex); + return bo->dma_buf_vmap.virtual; }
-static void radeon_gem_prime_vunmap(struct dma_buf *dma_buf, void *vaddr) +void radeon_gem_prime_vunmap(struct drm_gem_object *obj) { - struct radeon_bo *bo = dma_buf->priv; - struct drm_device *dev = bo->rdev->ddev; + struct radeon_bo *bo = gem_to_radeon_bo(obj);
- mutex_lock(&dev->struct_mutex); - bo->vmapping_count--; - if (bo->vmapping_count == 0) { - ttm_bo_kunmap(&bo->dma_buf_vmap); - } - mutex_unlock(&dev->struct_mutex); + ttm_bo_kunmap(&bo->dma_buf_vmap); } -const static struct dma_buf_ops radeon_dmabuf_ops = { - .map_dma_buf = radeon_gem_map_dma_buf, - .unmap_dma_buf = radeon_gem_unmap_dma_buf, - .release = radeon_gem_dmabuf_release, - .kmap = radeon_gem_kmap, - .kmap_atomic = radeon_gem_kmap_atomic, - .kunmap = radeon_gem_kunmap, - .kunmap_atomic = radeon_gem_kunmap_atomic, - .mmap = radeon_gem_prime_mmap, - .vmap = radeon_gem_prime_vmap, - .vunmap = radeon_gem_prime_vunmap, -}; - -static int radeon_prime_create(struct drm_device *dev, - size_t size, - struct sg_table *sg, - struct radeon_bo **pbo) + +struct drm_gem_object *radeon_gem_prime_import_sg(struct drm_device *dev, + size_t size, + struct sg_table *sg) { struct radeon_device *rdev = dev->dev_private; struct radeon_bo *bo; int ret;
ret = radeon_bo_create(rdev, size, PAGE_SIZE, false, - RADEON_GEM_DOMAIN_GTT, sg, pbo); + RADEON_GEM_DOMAIN_GTT, sg, &bo); if (ret) - return ret; - bo = *pbo; + return ERR_PTR(ret); bo->gem_base.driver_private = bo;
mutex_lock(&rdev->gem.mutex); list_add_tail(&bo->list, &rdev->gem.objects); mutex_unlock(&rdev->gem.mutex);
- return 0; + return &bo->gem_base; }
-struct dma_buf *radeon_gem_prime_export(struct drm_device *dev, - struct drm_gem_object *obj, - int flags) +int radeon_gem_prime_pin(struct drm_gem_object *obj) { struct radeon_bo *bo = gem_to_radeon_bo(obj); int ret = 0;
ret = radeon_bo_reserve(bo, false); if (unlikely(ret != 0)) - return ERR_PTR(ret); + return ret;
/* pin buffer into GTT */ ret = radeon_bo_pin(bo, RADEON_GEM_DOMAIN_GTT, NULL); if (ret) { radeon_bo_unreserve(bo); - return ERR_PTR(ret); + return ret; } radeon_bo_unreserve(bo); - return dma_buf_export(bo, &radeon_dmabuf_ops, obj->size, flags); -}
-struct drm_gem_object *radeon_gem_prime_import(struct drm_device *dev, - struct dma_buf *dma_buf) -{ - struct dma_buf_attachment *attach; - struct sg_table *sg; - struct radeon_bo *bo; - int ret; - - if (dma_buf->ops == &radeon_dmabuf_ops) { - bo = dma_buf->priv; - if (bo->gem_base.dev == dev) { - drm_gem_object_reference(&bo->gem_base); - return &bo->gem_base; - } - } - - /* need to attach */ - attach = dma_buf_attach(dma_buf, dev->dev); - if (IS_ERR(attach)) - return ERR_CAST(attach); - - sg = dma_buf_map_attachment(attach, DMA_BIDIRECTIONAL); - if (IS_ERR(sg)) { - ret = PTR_ERR(sg); - goto fail_detach; - } - - ret = radeon_prime_create(dev, dma_buf->size, sg, &bo); - if (ret) - goto fail_unmap; - - bo->gem_base.import_attach = attach; - - return &bo->gem_base; - -fail_unmap: - dma_buf_unmap_attachment(attach, sg, DMA_BIDIRECTIONAL); -fail_detach: - dma_buf_detach(dma_buf, attach); - return ERR_PTR(ret); + return 0; }
Simplify the Exynos prime implementation by using the default behavior provided by drm_gem_prime_import and drm_gem_prime_export.
Signed-off-by: Aaron Plattner aplattner@nvidia.com --- I still need to find a system to test this.
drivers/gpu/drm/exynos/exynos_drm_dmabuf.c | 151 ++--------------------------- drivers/gpu/drm/exynos/exynos_drm_dmabuf.h | 13 ++- drivers/gpu/drm/exynos/exynos_drm_drv.c | 2 + 3 files changed, 18 insertions(+), 148 deletions(-)
diff --git a/drivers/gpu/drm/exynos/exynos_drm_dmabuf.c b/drivers/gpu/drm/exynos/exynos_drm_dmabuf.c index 539da9f..7f6610d 100644 --- a/drivers/gpu/drm/exynos/exynos_drm_dmabuf.c +++ b/drivers/gpu/drm/exynos/exynos_drm_dmabuf.c @@ -28,8 +28,6 @@ #include "exynos_drm_drv.h" #include "exynos_drm_gem.h"
-#include <linux/dma-buf.h> - static struct sg_table *exynos_get_sgt(struct drm_device *drm_dev, struct exynos_drm_gem_buf *buf) { @@ -56,15 +54,12 @@ out: return NULL; }
-static struct sg_table * - exynos_gem_map_dma_buf(struct dma_buf_attachment *attach, - enum dma_data_direction dir) +struct sg_table *exynos_gem_prime_get_pages(struct drm_gem_object *obj) { - struct exynos_drm_gem_obj *gem_obj = attach->dmabuf->priv; + struct exynos_drm_gem_obj *gem_obj = to_exynos_gem_obj(obj); struct drm_device *dev = gem_obj->base.dev; struct exynos_drm_gem_buf *buf; struct sg_table *sgt = NULL; - int nents;
DRM_DEBUG_PRIME("%s\n", __FILE__);
@@ -74,119 +69,20 @@ static struct sg_table * return sgt; }
- mutex_lock(&dev->struct_mutex); - sgt = exynos_get_sgt(dev, buf); if (!sgt) - goto err_unlock; - - nents = dma_map_sg(attach->dev, sgt->sgl, sgt->nents, dir); - if (!nents) { - DRM_ERROR("failed to map sgl with iommu.\n"); - sgt = NULL; - goto err_unlock; - } + goto err;
DRM_DEBUG_PRIME("buffer size = 0x%lx\n", buf->size);
-err_unlock: - mutex_unlock(&dev->struct_mutex); +err: return sgt; }
-static void exynos_gem_unmap_dma_buf(struct dma_buf_attachment *attach, - struct sg_table *sgt, - enum dma_data_direction dir) -{ - dma_unmap_sg(attach->dev, sgt->sgl, sgt->nents, dir); - - sg_free_table(sgt); - kfree(sgt); - sgt = NULL; -} - -static void exynos_dmabuf_release(struct dma_buf *dmabuf) +struct drm_gem_object *exynos_gem_prime_import_sg(struct drm_device *dev, + size_t size, + struct sg_table *sg) { - struct exynos_drm_gem_obj *exynos_gem_obj = dmabuf->priv; - - DRM_DEBUG_PRIME("%s\n", __FILE__); - - /* - * exynos_dmabuf_release() call means that file object's - * f_count is 0 and it calls drm_gem_object_handle_unreference() - * to drop the references that these values had been increased - * at drm_prime_handle_to_fd() - */ - if (exynos_gem_obj->base.export_dma_buf == dmabuf) { - exynos_gem_obj->base.export_dma_buf = NULL; - - /* - * drop this gem object refcount to release allocated buffer - * and resources. - */ - drm_gem_object_unreference_unlocked(&exynos_gem_obj->base); - } -} - -static void *exynos_gem_dmabuf_kmap_atomic(struct dma_buf *dma_buf, - unsigned long page_num) -{ - /* TODO */ - - return NULL; -} - -static void exynos_gem_dmabuf_kunmap_atomic(struct dma_buf *dma_buf, - unsigned long page_num, - void *addr) -{ - /* TODO */ -} - -static void *exynos_gem_dmabuf_kmap(struct dma_buf *dma_buf, - unsigned long page_num) -{ - /* TODO */ - - return NULL; -} - -static void exynos_gem_dmabuf_kunmap(struct dma_buf *dma_buf, - unsigned long page_num, void *addr) -{ - /* TODO */ -} - -static int exynos_gem_dmabuf_mmap(struct dma_buf *dma_buf, - struct vm_area_struct *vma) -{ - return -ENOTTY; -} - -static struct dma_buf_ops exynos_dmabuf_ops = { - .map_dma_buf = exynos_gem_map_dma_buf, - .unmap_dma_buf = exynos_gem_unmap_dma_buf, - .kmap = exynos_gem_dmabuf_kmap, - .kmap_atomic = exynos_gem_dmabuf_kmap_atomic, - .kunmap = exynos_gem_dmabuf_kunmap, - .kunmap_atomic = exynos_gem_dmabuf_kunmap_atomic, - .mmap = exynos_gem_dmabuf_mmap, - .release = exynos_dmabuf_release, -}; - -struct dma_buf *exynos_dmabuf_prime_export(struct drm_device *drm_dev, - struct drm_gem_object *obj, int flags) -{ - struct exynos_drm_gem_obj *exynos_gem_obj = to_exynos_gem_obj(obj); - - return dma_buf_export(exynos_gem_obj, &exynos_dmabuf_ops, - exynos_gem_obj->base.size, 0600); -} - -struct drm_gem_object *exynos_dmabuf_prime_import(struct drm_device *drm_dev, - struct dma_buf *dma_buf) -{ - struct dma_buf_attachment *attach; struct sg_table *sgt; struct scatterlist *sgl; struct exynos_drm_gem_obj *exynos_gem_obj; @@ -195,36 +91,10 @@ struct drm_gem_object *exynos_dmabuf_prime_import(struct drm_device *drm_dev,
DRM_DEBUG_PRIME("%s\n", __FILE__);
- /* is this one of own objects? */ - if (dma_buf->ops == &exynos_dmabuf_ops) { - struct drm_gem_object *obj; - - exynos_gem_obj = dma_buf->priv; - obj = &exynos_gem_obj->base; - - /* is it from our device? */ - if (obj->dev == drm_dev) { - drm_gem_object_reference(obj); - return obj; - } - } - - attach = dma_buf_attach(dma_buf, drm_dev->dev); - if (IS_ERR(attach)) - return ERR_PTR(-EINVAL); - - - sgt = dma_buf_map_attachment(attach, DMA_BIDIRECTIONAL); - if (IS_ERR_OR_NULL(sgt)) { - ret = PTR_ERR(sgt); - goto err_buf_detach; - } - buffer = kzalloc(sizeof(*buffer), GFP_KERNEL); if (!buffer) { DRM_ERROR("failed to allocate exynos_drm_gem_buf.\n"); - ret = -ENOMEM; - goto err_unmap_attach; + return ERR_PTR(-ENOMEM); }
exynos_gem_obj = exynos_drm_gem_init(drm_dev, dma_buf->size); @@ -253,7 +123,6 @@ struct drm_gem_object *exynos_dmabuf_prime_import(struct drm_device *drm_dev,
exynos_gem_obj->buffer = buffer; buffer->sgt = sgt; - exynos_gem_obj->base.import_attach = attach;
DRM_DEBUG_PRIME("dma_addr = 0x%x, size = 0x%lx\n", buffer->dma_addr, buffer->size); @@ -263,10 +132,6 @@ struct drm_gem_object *exynos_dmabuf_prime_import(struct drm_device *drm_dev, err_free_buffer: kfree(buffer); buffer = NULL; -err_unmap_attach: - dma_buf_unmap_attachment(attach, sgt, DMA_BIDIRECTIONAL); -err_buf_detach: - dma_buf_detach(dma_buf, attach); return ERR_PTR(ret); }
diff --git a/drivers/gpu/drm/exynos/exynos_drm_dmabuf.h b/drivers/gpu/drm/exynos/exynos_drm_dmabuf.h index 662a8f9..f198183 100644 --- a/drivers/gpu/drm/exynos/exynos_drm_dmabuf.h +++ b/drivers/gpu/drm/exynos/exynos_drm_dmabuf.h @@ -27,13 +27,16 @@ #define _EXYNOS_DRM_DMABUF_H_
#ifdef CONFIG_DRM_EXYNOS_DMABUF -struct dma_buf *exynos_dmabuf_prime_export(struct drm_device *drm_dev, - struct drm_gem_object *obj, int flags); - -struct drm_gem_object *exynos_dmabuf_prime_import(struct drm_device *drm_dev, - struct dma_buf *dma_buf); +#define exynos_dmabuf_prime_export drm_gem_prime_export +#define exynos_dmabuf_prime_import drm_gem_prime_import +struct sg_table *exynos_gem_prime_get_pages(struct drm_gem_object *obj); +struct drm_gem_object *exynos_gem_prime_import_sg(struct drm_device *dev, + size_t size, + struct sg_table *sg); #else #define exynos_dmabuf_prime_export NULL #define exynos_dmabuf_prime_import NULL +#define exynos_gem_prime_get_pages NULL +#define exynos_gem_prime_import_sg NULL #endif #endif diff --git a/drivers/gpu/drm/exynos/exynos_drm_drv.c b/drivers/gpu/drm/exynos/exynos_drm_drv.c index 2b287d2..2a04e97 100644 --- a/drivers/gpu/drm/exynos/exynos_drm_drv.c +++ b/drivers/gpu/drm/exynos/exynos_drm_drv.c @@ -283,6 +283,8 @@ static struct drm_driver exynos_drm_driver = { .prime_fd_to_handle = drm_gem_prime_fd_to_handle, .gem_prime_export = exynos_dmabuf_prime_export, .gem_prime_import = exynos_dmabuf_prime_import, + .gem_prime_get_pages = exynos_gem_prime_get_pages, + .gem_prime_import_sg = exynos_gem_prime_import_sg, .ioctls = exynos_ioctls, .fops = &exynos_drm_driver_fops, .name = DRIVER_NAME,
Simplify the Exynos prime implementation by using the default behavior provided by drm_gem_prime_import and drm_gem_prime_export.
Signed-off-by: Aaron Plattner aplattner@nvidia.com --- v2: fix dumb mistakes now that I have a toolchain that can compile-test this
drivers/gpu/drm/exynos/exynos_drm_dmabuf.c | 156 ++--------------------------- drivers/gpu/drm/exynos/exynos_drm_dmabuf.h | 13 ++- drivers/gpu/drm/exynos/exynos_drm_drv.c | 2 + 3 files changed, 20 insertions(+), 151 deletions(-)
diff --git a/drivers/gpu/drm/exynos/exynos_drm_dmabuf.c b/drivers/gpu/drm/exynos/exynos_drm_dmabuf.c index 539da9f..71b40f4 100644 --- a/drivers/gpu/drm/exynos/exynos_drm_dmabuf.c +++ b/drivers/gpu/drm/exynos/exynos_drm_dmabuf.c @@ -28,8 +28,6 @@ #include "exynos_drm_drv.h" #include "exynos_drm_gem.h"
-#include <linux/dma-buf.h> - static struct sg_table *exynos_get_sgt(struct drm_device *drm_dev, struct exynos_drm_gem_buf *buf) { @@ -56,15 +54,12 @@ out: return NULL; }
-static struct sg_table * - exynos_gem_map_dma_buf(struct dma_buf_attachment *attach, - enum dma_data_direction dir) +struct sg_table *exynos_gem_prime_get_pages(struct drm_gem_object *obj) { - struct exynos_drm_gem_obj *gem_obj = attach->dmabuf->priv; + struct exynos_drm_gem_obj *gem_obj = to_exynos_gem_obj(obj); struct drm_device *dev = gem_obj->base.dev; struct exynos_drm_gem_buf *buf; struct sg_table *sgt = NULL; - int nents;
DRM_DEBUG_PRIME("%s\n", __FILE__);
@@ -74,120 +69,20 @@ static struct sg_table * return sgt; }
- mutex_lock(&dev->struct_mutex); - sgt = exynos_get_sgt(dev, buf); if (!sgt) - goto err_unlock; - - nents = dma_map_sg(attach->dev, sgt->sgl, sgt->nents, dir); - if (!nents) { - DRM_ERROR("failed to map sgl with iommu.\n"); - sgt = NULL; - goto err_unlock; - } + goto err;
DRM_DEBUG_PRIME("buffer size = 0x%lx\n", buf->size);
-err_unlock: - mutex_unlock(&dev->struct_mutex); +err: return sgt; }
-static void exynos_gem_unmap_dma_buf(struct dma_buf_attachment *attach, - struct sg_table *sgt, - enum dma_data_direction dir) -{ - dma_unmap_sg(attach->dev, sgt->sgl, sgt->nents, dir); - - sg_free_table(sgt); - kfree(sgt); - sgt = NULL; -} - -static void exynos_dmabuf_release(struct dma_buf *dmabuf) +struct drm_gem_object *exynos_gem_prime_import_sg(struct drm_device *dev, + size_t size, + struct sg_table *sgt) { - struct exynos_drm_gem_obj *exynos_gem_obj = dmabuf->priv; - - DRM_DEBUG_PRIME("%s\n", __FILE__); - - /* - * exynos_dmabuf_release() call means that file object's - * f_count is 0 and it calls drm_gem_object_handle_unreference() - * to drop the references that these values had been increased - * at drm_prime_handle_to_fd() - */ - if (exynos_gem_obj->base.export_dma_buf == dmabuf) { - exynos_gem_obj->base.export_dma_buf = NULL; - - /* - * drop this gem object refcount to release allocated buffer - * and resources. - */ - drm_gem_object_unreference_unlocked(&exynos_gem_obj->base); - } -} - -static void *exynos_gem_dmabuf_kmap_atomic(struct dma_buf *dma_buf, - unsigned long page_num) -{ - /* TODO */ - - return NULL; -} - -static void exynos_gem_dmabuf_kunmap_atomic(struct dma_buf *dma_buf, - unsigned long page_num, - void *addr) -{ - /* TODO */ -} - -static void *exynos_gem_dmabuf_kmap(struct dma_buf *dma_buf, - unsigned long page_num) -{ - /* TODO */ - - return NULL; -} - -static void exynos_gem_dmabuf_kunmap(struct dma_buf *dma_buf, - unsigned long page_num, void *addr) -{ - /* TODO */ -} - -static int exynos_gem_dmabuf_mmap(struct dma_buf *dma_buf, - struct vm_area_struct *vma) -{ - return -ENOTTY; -} - -static struct dma_buf_ops exynos_dmabuf_ops = { - .map_dma_buf = exynos_gem_map_dma_buf, - .unmap_dma_buf = exynos_gem_unmap_dma_buf, - .kmap = exynos_gem_dmabuf_kmap, - .kmap_atomic = exynos_gem_dmabuf_kmap_atomic, - .kunmap = exynos_gem_dmabuf_kunmap, - .kunmap_atomic = exynos_gem_dmabuf_kunmap_atomic, - .mmap = exynos_gem_dmabuf_mmap, - .release = exynos_dmabuf_release, -}; - -struct dma_buf *exynos_dmabuf_prime_export(struct drm_device *drm_dev, - struct drm_gem_object *obj, int flags) -{ - struct exynos_drm_gem_obj *exynos_gem_obj = to_exynos_gem_obj(obj); - - return dma_buf_export(exynos_gem_obj, &exynos_dmabuf_ops, - exynos_gem_obj->base.size, 0600); -} - -struct drm_gem_object *exynos_dmabuf_prime_import(struct drm_device *drm_dev, - struct dma_buf *dma_buf) -{ - struct dma_buf_attachment *attach; - struct sg_table *sgt; struct scatterlist *sgl; struct exynos_drm_gem_obj *exynos_gem_obj; struct exynos_drm_gem_buf *buffer; @@ -195,39 +90,13 @@ struct drm_gem_object *exynos_dmabuf_prime_import(struct drm_device *drm_dev,
DRM_DEBUG_PRIME("%s\n", __FILE__);
- /* is this one of own objects? */ - if (dma_buf->ops == &exynos_dmabuf_ops) { - struct drm_gem_object *obj; - - exynos_gem_obj = dma_buf->priv; - obj = &exynos_gem_obj->base; - - /* is it from our device? */ - if (obj->dev == drm_dev) { - drm_gem_object_reference(obj); - return obj; - } - } - - attach = dma_buf_attach(dma_buf, drm_dev->dev); - if (IS_ERR(attach)) - return ERR_PTR(-EINVAL); - - - sgt = dma_buf_map_attachment(attach, DMA_BIDIRECTIONAL); - if (IS_ERR_OR_NULL(sgt)) { - ret = PTR_ERR(sgt); - goto err_buf_detach; - } - buffer = kzalloc(sizeof(*buffer), GFP_KERNEL); if (!buffer) { DRM_ERROR("failed to allocate exynos_drm_gem_buf.\n"); - ret = -ENOMEM; - goto err_unmap_attach; + return ERR_PTR(-ENOMEM); }
- exynos_gem_obj = exynos_drm_gem_init(drm_dev, dma_buf->size); + exynos_gem_obj = exynos_drm_gem_init(dev, size); if (!exynos_gem_obj) { ret = -ENOMEM; goto err_free_buffer; @@ -235,7 +104,7 @@ struct drm_gem_object *exynos_dmabuf_prime_import(struct drm_device *drm_dev,
sgl = sgt->sgl;
- buffer->size = dma_buf->size; + buffer->size = size; buffer->dma_addr = sg_dma_address(sgl);
if (sgt->nents == 1) { @@ -253,7 +122,6 @@ struct drm_gem_object *exynos_dmabuf_prime_import(struct drm_device *drm_dev,
exynos_gem_obj->buffer = buffer; buffer->sgt = sgt; - exynos_gem_obj->base.import_attach = attach;
DRM_DEBUG_PRIME("dma_addr = 0x%x, size = 0x%lx\n", buffer->dma_addr, buffer->size); @@ -263,10 +131,6 @@ struct drm_gem_object *exynos_dmabuf_prime_import(struct drm_device *drm_dev, err_free_buffer: kfree(buffer); buffer = NULL; -err_unmap_attach: - dma_buf_unmap_attachment(attach, sgt, DMA_BIDIRECTIONAL); -err_buf_detach: - dma_buf_detach(dma_buf, attach); return ERR_PTR(ret); }
diff --git a/drivers/gpu/drm/exynos/exynos_drm_dmabuf.h b/drivers/gpu/drm/exynos/exynos_drm_dmabuf.h index 662a8f9..f198183 100644 --- a/drivers/gpu/drm/exynos/exynos_drm_dmabuf.h +++ b/drivers/gpu/drm/exynos/exynos_drm_dmabuf.h @@ -27,13 +27,16 @@ #define _EXYNOS_DRM_DMABUF_H_
#ifdef CONFIG_DRM_EXYNOS_DMABUF -struct dma_buf *exynos_dmabuf_prime_export(struct drm_device *drm_dev, - struct drm_gem_object *obj, int flags); - -struct drm_gem_object *exynos_dmabuf_prime_import(struct drm_device *drm_dev, - struct dma_buf *dma_buf); +#define exynos_dmabuf_prime_export drm_gem_prime_export +#define exynos_dmabuf_prime_import drm_gem_prime_import +struct sg_table *exynos_gem_prime_get_pages(struct drm_gem_object *obj); +struct drm_gem_object *exynos_gem_prime_import_sg(struct drm_device *dev, + size_t size, + struct sg_table *sg); #else #define exynos_dmabuf_prime_export NULL #define exynos_dmabuf_prime_import NULL +#define exynos_gem_prime_get_pages NULL +#define exynos_gem_prime_import_sg NULL #endif #endif diff --git a/drivers/gpu/drm/exynos/exynos_drm_drv.c b/drivers/gpu/drm/exynos/exynos_drm_drv.c index 2b287d2..2a04e97 100644 --- a/drivers/gpu/drm/exynos/exynos_drm_drv.c +++ b/drivers/gpu/drm/exynos/exynos_drm_drv.c @@ -283,6 +283,8 @@ static struct drm_driver exynos_drm_driver = { .prime_fd_to_handle = drm_gem_prime_fd_to_handle, .gem_prime_export = exynos_dmabuf_prime_export, .gem_prime_import = exynos_dmabuf_prime_import, + .gem_prime_get_pages = exynos_gem_prime_get_pages, + .gem_prime_import_sg = exynos_gem_prime_import_sg, .ioctls = exynos_ioctls, .fops = &exynos_drm_driver_fops, .name = DRIVER_NAME,
Hi,
CCing media guys.
I agree with you, but we should consider one issue related to v4l2.

As you may know, V4L2-based drivers use vb2 as their buffer manager, and vb2 includes dma-buf support (import and export). V4L2 also uses a streaming model (qbuf and dqbuf). With dma-buf and an IOMMU, qbuf generally imports an fd into its own buffer and maps it into its own IOMMU table by calling dma_buf_map_attachment(), and dqbuf calls dma_buf_unmap_attachment() to unmap that buffer from its IOMMU table. But right now vb2's unmap_dma_buf callback does nothing, and I think the reason is the issue below.

If qbuf maps the buffer into the IOMMU table and dqbuf unmaps it again, performance suffers because qbuf and dqbuf are called repeatedly, which means the map/unmap is repeated as well. So I think the media developers moved the dma_unmap_sg() call from their unmap_dma_buf callback into the detach callback instead; for reference, see the vb2_dc_dmabuf_ops_unmap and vb2_dc_dmabuf_ops_detach functions.
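Roughly, the idea looks like this (an illustrative sketch only: the attachment struct and the function bodies approximate videobuf2-dma-contig, they are not the exact code):

#include <linux/dma-buf.h>
#include <linux/dma-mapping.h>
#include <linux/scatterlist.h>
#include <linux/slab.h>

/* Illustrative attachment-private state; dma_dir starts as DMA_NONE at attach. */
struct vb2_dc_attachment {
	struct sg_table *sgt;		/* built at attach time from the buffer pages */
	enum dma_data_direction dma_dir;
};

static struct sg_table *vb2_dc_dmabuf_ops_map(struct dma_buf_attachment *db_attach,
					      enum dma_data_direction dma_dir)
{
	struct vb2_dc_attachment *attach = db_attach->priv;

	/* qbuf after qbuf: reuse the existing mapping instead of remapping. */
	if (attach->dma_dir == dma_dir)
		return attach->sgt;

	/* first map for this attachment */
	dma_map_sg(db_attach->dev, attach->sgt->sgl, attach->sgt->nents, dma_dir);
	attach->dma_dir = dma_dir;
	return attach->sgt;
}

static void vb2_dc_dmabuf_ops_unmap(struct dma_buf_attachment *db_attach,
				    struct sg_table *sgt,
				    enum dma_data_direction dma_dir)
{
	/* dqbuf: nothing to do, the mapping is kept until detach. */
}

static void vb2_dc_dmabuf_ops_detach(struct dma_buf *dbuf,
				      struct dma_buf_attachment *db_attach)
{
	struct vb2_dc_attachment *attach = db_attach->priv;

	/* The deferred dma_unmap_sg() happens exactly once, at detach time. */
	if (attach->dma_dir != DMA_NONE)
		dma_unmap_sg(db_attach->dev, attach->sgt->sgl, attach->sgt->nents,
			     attach->dma_dir);
	sg_free_table(attach->sgt);
	kfree(attach->sgt);
	kfree(attach);
	db_attach->priv = NULL;
}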
So I added the patch below to avoid that performance hit and am testing it now (the patch is derived from videobuf2-dma-contig.c):
http://git.kernel.org/?p=linux/kernel/git/daeinki/drm-exynos.git;a=commit;h=...
Thus, I'm not sure that your common set can cover all the cases, including other frameworks. Please share your opinions.
Thanks, Inki Dae
On Fri, Dec 7, 2012 at 7:36 AM, Inki Dae inki.dae@samsung.com wrote:
Thus, I'm not sure that your common set can cover all the cases, including other frameworks. Please share your opinions.
I don't think that's required - as long as it is fairly usable by many drivers, a helper library is good enough. And since it's not a midlayer, there's no requirement to use it; you're always free to do your own special sauce in your driver. But I think the current RFC could be made a bit more flexible so that more drivers could use at least parts of it - see my other mail. -Daniel
On 12/06/2012 10:36 PM, Inki Dae wrote:
Hi,
CCing media guys.
I agree with you, but we should consider one issue related to v4l2.

As you may know, V4L2-based drivers use vb2 as their buffer manager, and vb2 includes dma-buf support (import and export). V4L2 also uses a streaming model (qbuf and dqbuf). With dma-buf and an IOMMU, qbuf generally imports an fd into its own buffer and maps it into its own IOMMU table by calling dma_buf_map_attachment(), and dqbuf calls dma_buf_unmap_attachment() to unmap that buffer from its IOMMU table. But right now vb2's unmap_dma_buf callback does nothing, and I think the reason is the issue below.

If qbuf maps the buffer into the IOMMU table and dqbuf unmaps it again, performance suffers because qbuf and dqbuf are called repeatedly, which means the map/unmap is repeated as well. So I think the media developers moved the dma_unmap_sg() call from their unmap_dma_buf callback into the detach callback instead; for reference, see the vb2_dc_dmabuf_ops_unmap and vb2_dc_dmabuf_ops_detach functions.
So I added the patch below to avoid that performance hit and am testing it now (the patch is derived from videobuf2-dma-contig.c): http://git.kernel.org/?p=linux/kernel/git/daeinki/drm-exynos.git;a=commit;h=...
Thus, I'm not sure that your common set can cover all the cases, including other frameworks. Please share your opinions.
It seems like this adjustment would make perfect sense to add to the helper layer I suggested. E.g., instead of having an exynos_attach structure that caches the sgt, there'd be a struct drm_gem_prime_attach that would do the same thing, and save the sgt it gets from driver->gem_prime_get_sg. Then it would benefit nouveau and radeon, too.
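Something like this minimal sketch, purely hypothetical: struct drm_gem_prime_attach and the attach/detach hooks don't exist yet, locking is elided, and I'm using the gem_prime_get_pages hook from patch 1 as the sg_table source:

#include <drm/drmP.h>
#include <linux/dma-buf.h>
#include <linux/dma-mapping.h>
#include <linux/slab.h>

/* Hypothetical attachment-private cache for the prime helpers. */
struct drm_gem_prime_attach {
	struct sg_table *sgt;
	enum dma_data_direction dir;
};

static int drm_gem_map_attach(struct dma_buf *dma_buf,
			      struct dma_buf_attachment *attach)
{
	struct drm_gem_prime_attach *prime_attach;

	prime_attach = kzalloc(sizeof(*prime_attach), GFP_KERNEL);
	if (!prime_attach)
		return -ENOMEM;

	prime_attach->dir = DMA_NONE;
	attach->priv = prime_attach;
	return 0;
}

/* The existing drm_gem_map_dma_buf() would grow a cache check like this. */
static struct sg_table *drm_gem_map_dma_buf(struct dma_buf_attachment *attach,
					    enum dma_data_direction dir)
{
	struct drm_gem_prime_attach *prime_attach = attach->priv;
	struct drm_gem_object *obj = attach->dmabuf->priv;
	struct sg_table *sgt;

	/* Hand back the cached mapping if this attachment already has one. */
	if (prime_attach->dir == dir)
		return prime_attach->sgt;

	sgt = obj->dev->driver->gem_prime_get_pages(obj);
	if (!IS_ERR_OR_NULL(sgt)) {
		dma_map_sg(attach->dev, sgt->sgl, sgt->nents, dir);
		prime_attach->sgt = sgt;
		prime_attach->dir = dir;
	}
	return sgt;
}

static void drm_gem_map_detach(struct dma_buf *dma_buf,
			       struct dma_buf_attachment *attach)
{
	struct drm_gem_prime_attach *prime_attach = attach->priv;
	struct sg_table *sgt = prime_attach ? prime_attach->sgt : NULL;

	/* The deferred dma_unmap_sg() happens once, when the importer detaches. */
	if (sgt) {
		dma_unmap_sg(attach->dev, sgt->sgl, sgt->nents, prime_attach->dir);
		sg_free_table(sgt);
		kfree(sgt);
	}
	kfree(prime_attach);
	attach->priv = NULL;
}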
Alternatively, patch #4 could be dropped and Exynos can continue to reimplement all of this core functionality, since the helpers are optional, but I don't see anything about this change that should make it Exynos-specific, unless I'm missing something.
-- Aaron
2012/12/8 Aaron Plattner aplattner@nvidia.com
On 12/06/2012 10:36 PM, Inki Dae wrote:
Hi,
CCing media guys.
I agree with you, but we should consider one issue related to v4l2.

As you may know, V4L2-based drivers use vb2 as their buffer manager, and vb2 includes dma-buf support (import and export). V4L2 also uses a streaming model (qbuf and dqbuf). With dma-buf and an IOMMU, qbuf generally imports an fd into its own buffer and maps it into its own IOMMU table by calling dma_buf_map_attachment(), and dqbuf calls dma_buf_unmap_attachment() to unmap that buffer from its IOMMU table. But right now vb2's unmap_dma_buf callback does nothing, and I think the reason is the issue below.

If qbuf maps the buffer into the IOMMU table and dqbuf unmaps it again, performance suffers because qbuf and dqbuf are called repeatedly, which means the map/unmap is repeated as well. So I think the media developers moved the dma_unmap_sg() call from their unmap_dma_buf callback into the detach callback instead; for reference, see the vb2_dc_dmabuf_ops_unmap and vb2_dc_dmabuf_ops_detach functions.
So I added the patch below to avoid that performance hit and am testing it now (the patch is derived from videobuf2-dma-contig.c): http://git.kernel.org/?p=linux/kernel/git/daeinki/drm-exynos.git;a=commit;h=576b1c3de8b90cf1570b8418b60afd1edaae4e30
Thus, I'm not sure that your common set can cover all the cases, including other frameworks. Please share your opinions.
It seems like this adjustment would make perfect sense to add to the helper layer I suggested. E.g., instead of having an exynos_attach structure that caches the sgt, there'd be a struct drm_gem_prime_attach that would do the same thing, and save the sgt it gets from driver->gem_prime_get_sg. Then it would benefit nouveau and radeon, too.
If this change is applied to the common helper and it can be accepted by the other maintainers, then I think it's better to use the common helper instead of a driver-specific one.
Alternatively, patch #4 could be dropped and Exynos can continue to reimplement all of this core functionality, since the helpers are optional, but I don't see anything about this change that should make it Exynos-specific,
I agree with you. I also think this change isn't specific to Exynos, but you need to check whether it's a reasonable change for the other drivers as well.
Thanks, Inki Dae
unless I'm missing something.
--
Aaron