A first version of this work had been sent by Tomi Valkeinen in may 2017 (https://www.spinics.net/lists/dri-devel/msg140663.html).
This series adds a few new OMAP_BO flags to help the userspace manage situations where it needs to use lots of buffers, and would currently run out of TILER space. The TILER space is limited to mapping 128MB at any given time and some applications might need more.
This seres is also the opportunity to do some cleanup in the flags and improve the comments describing them.
The user-space patches for libdrm, although ready, haven't been posted yet. It will be be done when this series have been discussed and hopefully in the process of getting merged.
JJ
changes in v4: - fixed incremental build issue introduced by patch #1 and later fixed by patch #2. - Added my reviewed-by to Tomis's patch
changes in v3: - rebase on top of v5.4-rc2 - Document omap_gem_new() and the new flags using the kernel-doc format
changes in v2: - fixed build error that crept in during rebase before sending (sorry about that) - rework the refcount part a bit.
Jean-Jacques Hiblot (1): drm/omap: use refcount API to track the number of users of dma_addr
Tomi Valkeinen (7): drm/omap: add omap_gem_unpin_locked() drm/omap: accept NULL for dma_addr in omap_gem_pin drm/omap: cleanup OMAP_BO flags drm/omap: remove OMAP_BO_TILED define drm/omap: cleanup OMAP_BO_SCANOUT use drm/omap: add omap_gem_validate_flags() drm/omap: add OMAP_BO flags to affect buffer allocation
drivers/gpu/drm/omapdrm/omap_dmm_tiler.h | 2 +- drivers/gpu/drm/omapdrm/omap_fb.c | 6 +- drivers/gpu/drm/omapdrm/omap_gem.c | 191 ++++++++++++++++------ drivers/gpu/drm/omapdrm/omap_gem_dmabuf.c | 2 +- include/uapi/drm/omap_drm.h | 27 ++- 5 files changed, 164 insertions(+), 64 deletions(-)
This would give us a WARN_ON() if the pin/unpin calls are unbalanced.
Proposed-by: Laurent Pinchart laurent.pinchart@ideasonboard.com Signed-off-by: Jean-Jacques Hiblot jjhiblot@ti.com Reviewed-by: Tomi Valkeinen tomi.valkeinen@ti.com --- drivers/gpu/drm/omapdrm/omap_gem.c | 44 +++++++++++++++--------------- 1 file changed, 22 insertions(+), 22 deletions(-)
diff --git a/drivers/gpu/drm/omapdrm/omap_gem.c b/drivers/gpu/drm/omapdrm/omap_gem.c index 37378dbc50d0..5c97ff810b5d 100644 --- a/drivers/gpu/drm/omapdrm/omap_gem.c +++ b/drivers/gpu/drm/omapdrm/omap_gem.c @@ -65,7 +65,7 @@ struct omap_gem_object { /** * # of users of dma_addr */ - u32 dma_addr_cnt; + refcount_t dma_addr_cnt;
/** * If the buffer has been imported from a dmabuf the OMAP_DB_DMABUF flag @@ -771,13 +771,15 @@ int omap_gem_pin(struct drm_gem_object *obj, dma_addr_t *dma_addr) mutex_lock(&omap_obj->lock);
if (!omap_gem_is_contiguous(omap_obj) && priv->has_dmm) { - if (omap_obj->dma_addr_cnt == 0) { + if (refcount_read(&omap_obj->dma_addr_cnt) == 0) { u32 npages = obj->size >> PAGE_SHIFT; enum tiler_fmt fmt = gem2fmt(omap_obj->flags); struct tiler_block *block;
BUG_ON(omap_obj->block);
+ refcount_set(&omap_obj->dma_addr_cnt, 1); + ret = omap_gem_attach_pages(obj); if (ret) goto fail; @@ -811,10 +813,10 @@ int omap_gem_pin(struct drm_gem_object *obj, dma_addr_t *dma_addr) omap_obj->block = block;
DBG("got dma address: %pad", &omap_obj->dma_addr); + } else { + refcount_inc(&omap_obj->dma_addr_cnt); }
- omap_obj->dma_addr_cnt++; - *dma_addr = omap_obj->dma_addr; } else if (omap_gem_is_contiguous(omap_obj)) { *dma_addr = omap_obj->dma_addr; @@ -844,22 +846,19 @@ void omap_gem_unpin(struct drm_gem_object *obj)
mutex_lock(&omap_obj->lock);
- if (omap_obj->dma_addr_cnt > 0) { - omap_obj->dma_addr_cnt--; - if (omap_obj->dma_addr_cnt == 0) { - ret = tiler_unpin(omap_obj->block); - if (ret) { - dev_err(obj->dev->dev, - "could not unpin pages: %d\n", ret); - } - ret = tiler_release(omap_obj->block); - if (ret) { - dev_err(obj->dev->dev, - "could not release unmap: %d\n", ret); - } - omap_obj->dma_addr = 0; - omap_obj->block = NULL; + if (refcount_dec_and_test(&omap_obj->dma_addr_cnt)) { + ret = tiler_unpin(omap_obj->block); + if (ret) { + dev_err(obj->dev->dev, + "could not unpin pages: %d\n", ret); + } + ret = tiler_release(omap_obj->block); + if (ret) { + dev_err(obj->dev->dev, + "could not release unmap: %d\n", ret); } + omap_obj->dma_addr = 0; + omap_obj->block = NULL; }
mutex_unlock(&omap_obj->lock); @@ -877,7 +876,7 @@ int omap_gem_rotated_dma_addr(struct drm_gem_object *obj, u32 orient,
mutex_lock(&omap_obj->lock);
- if ((omap_obj->dma_addr_cnt > 0) && omap_obj->block && + if ((refcount_read(&omap_obj->dma_addr_cnt) > 0) && omap_obj->block && (omap_obj->flags & OMAP_BO_TILED)) { *dma_addr = tiler_tsptr(omap_obj->block, orient, x, y); ret = 0; @@ -1028,7 +1027,8 @@ void omap_gem_describe(struct drm_gem_object *obj, struct seq_file *m)
seq_printf(m, "%08x: %2d (%2d) %08llx %pad (%2d) %p %4d", omap_obj->flags, obj->name, kref_read(&obj->refcount), - off, &omap_obj->dma_addr, omap_obj->dma_addr_cnt, + off, &omap_obj->dma_addr, + refcount_read(&omap_obj->dma_addr_cnt), omap_obj->vaddr, omap_obj->roll);
if (omap_obj->flags & OMAP_BO_TILED) { @@ -1091,7 +1091,7 @@ void omap_gem_free_object(struct drm_gem_object *obj) mutex_lock(&omap_obj->lock);
/* The object should not be pinned. */ - WARN_ON(omap_obj->dma_addr_cnt > 0); + WARN_ON(refcount_read(&omap_obj->dma_addr_cnt) > 0);
if (omap_obj->pages) { if (omap_obj->flags & OMAP_BO_MEM_DMABUF)
From: Tomi Valkeinen tomi.valkeinen@ti.com
Add omap_gem_unpin_locked() which is a version of omap_gem_unpin() that expects the caller to hold the omap_obj lock.
Signed-off-by: Tomi Valkeinen tomi.valkeinen@ti.com Reviewed-by: Jean-Jacques Hiblot jjhiblot@ti.com --- drivers/gpu/drm/omapdrm/omap_gem.c | 25 ++++++++++++++++++------- 1 file changed, 18 insertions(+), 7 deletions(-)
diff --git a/drivers/gpu/drm/omapdrm/omap_gem.c b/drivers/gpu/drm/omapdrm/omap_gem.c index 5c97ff810b5d..3071e815da73 100644 --- a/drivers/gpu/drm/omapdrm/omap_gem.c +++ b/drivers/gpu/drm/omapdrm/omap_gem.c @@ -832,20 +832,16 @@ int omap_gem_pin(struct drm_gem_object *obj, dma_addr_t *dma_addr) }
/** - * omap_gem_unpin() - Unpin a GEM object from memory + * omap_gem_unpin_locked() - Unpin a GEM object from memory * @obj: the GEM object * - * Unpin the given GEM object previously pinned with omap_gem_pin(). Pins are - * reference-counted, the actualy unpin will only be performed when the number - * of calls to this function matches the number of calls to omap_gem_pin(). + * omap_gem_unpin() without locking. */ -void omap_gem_unpin(struct drm_gem_object *obj) +static void omap_gem_unpin_locked(struct drm_gem_object *obj) { struct omap_gem_object *omap_obj = to_omap_bo(obj); int ret;
- mutex_lock(&omap_obj->lock); - if (refcount_dec_and_test(&omap_obj->dma_addr_cnt)) { ret = tiler_unpin(omap_obj->block); if (ret) { @@ -860,7 +856,22 @@ void omap_gem_unpin(struct drm_gem_object *obj) omap_obj->dma_addr = 0; omap_obj->block = NULL; } +}
+/** + * omap_gem_unpin() - Unpin a GEM object from memory + * @obj: the GEM object + * + * Unpin the given GEM object previously pinned with omap_gem_pin(). Pins are + * reference-counted, the actual unpin will only be performed when the number + * of calls to this function matches the number of calls to omap_gem_pin(). + */ +void omap_gem_unpin(struct drm_gem_object *obj) +{ + struct omap_gem_object *omap_obj = to_omap_bo(obj); + + mutex_lock(&omap_obj->lock); + omap_gem_unpin_locked(obj); mutex_unlock(&omap_obj->lock); }
From: Tomi Valkeinen tomi.valkeinen@ti.com
Allow NULL to be passed in 'dma_addr' for omap_gem_pin(), in case the caller does not need the dma_addr.
Signed-off-by: Tomi Valkeinen tomi.valkeinen@ti.com Reviewed-by: Jean-Jacques Hiblot jjhiblot@ti.com --- drivers/gpu/drm/omapdrm/omap_gem.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/omapdrm/omap_gem.c b/drivers/gpu/drm/omapdrm/omap_gem.c index 3071e815da73..2ac83cdbb15a 100644 --- a/drivers/gpu/drm/omapdrm/omap_gem.c +++ b/drivers/gpu/drm/omapdrm/omap_gem.c @@ -817,9 +817,11 @@ int omap_gem_pin(struct drm_gem_object *obj, dma_addr_t *dma_addr) refcount_inc(&omap_obj->dma_addr_cnt); }
- *dma_addr = omap_obj->dma_addr; + if (dma_addr) + *dma_addr = omap_obj->dma_addr; } else if (omap_gem_is_contiguous(omap_obj)) { - *dma_addr = omap_obj->dma_addr; + if (dma_addr) + *dma_addr = omap_obj->dma_addr; } else { ret = -EINVAL; goto fail;
From: Tomi Valkeinen tomi.valkeinen@ti.com
Reorder OMAP_BO flags and improve the comments.
Signed-off-by: Tomi Valkeinen tomi.valkeinen@ti.com Reviewed-by: Jean-Jacques Hiblot jjhiblot@ti.com --- include/uapi/drm/omap_drm.h | 17 +++++++++-------- 1 file changed, 9 insertions(+), 8 deletions(-)
diff --git a/include/uapi/drm/omap_drm.h b/include/uapi/drm/omap_drm.h index 1fccffef9e27..d8ee2f840697 100644 --- a/include/uapi/drm/omap_drm.h +++ b/include/uapi/drm/omap_drm.h @@ -38,19 +38,20 @@ struct drm_omap_param { __u64 value; /* in (set_param), out (get_param) */ };
-#define OMAP_BO_SCANOUT 0x00000001 /* scanout capable (phys contiguous) */ -#define OMAP_BO_CACHE_MASK 0x00000006 /* cache type mask, see cache modes */ -#define OMAP_BO_TILED_MASK 0x00000f00 /* tiled mapping mask, see tiled modes */ +/* Scanout buffer, consumable by DSS */ +#define OMAP_BO_SCANOUT 0x00000001
-/* cache modes */ -#define OMAP_BO_CACHED 0x00000000 /* default */ -#define OMAP_BO_WC 0x00000002 /* write-combine */ -#define OMAP_BO_UNCACHED 0x00000004 /* strongly-ordered (uncached) */ +/* Buffer CPU caching mode: cached, write-combining or uncached. */ +#define OMAP_BO_CACHED 0x00000000 +#define OMAP_BO_WC 0x00000002 +#define OMAP_BO_UNCACHED 0x00000004 +#define OMAP_BO_CACHE_MASK 0x00000006
-/* tiled modes */ +/* Use TILER for the buffer. The TILER container unit can be 8, 16 or 32 bits. */ #define OMAP_BO_TILED_8 0x00000100 #define OMAP_BO_TILED_16 0x00000200 #define OMAP_BO_TILED_32 0x00000300 +#define OMAP_BO_TILED_MASK 0x00000f00 #define OMAP_BO_TILED (OMAP_BO_TILED_8 | OMAP_BO_TILED_16 | OMAP_BO_TILED_32)
union omap_gem_size {
From: Tomi Valkeinen tomi.valkeinen@ti.com
OMAP_BO_TILED does not make sense, as OMAP_BO_TILED_* values are not bitmasks but normal values. As we already have OMAP_BO_TILED_MASK for the mask, we can remove OMAP_BO_TILED and use OMAP_BO_TILED_MASK instead.
Signed-off-by: Tomi Valkeinen tomi.valkeinen@ti.com Reviewed-by: Jean-Jacques Hiblot jjhiblot@ti.com --- drivers/gpu/drm/omapdrm/omap_dmm_tiler.h | 2 +- drivers/gpu/drm/omapdrm/omap_fb.c | 6 +++--- drivers/gpu/drm/omapdrm/omap_gem.c | 18 +++++++++--------- drivers/gpu/drm/omapdrm/omap_gem_dmabuf.c | 2 +- include/uapi/drm/omap_drm.h | 1 - 5 files changed, 14 insertions(+), 15 deletions(-)
diff --git a/drivers/gpu/drm/omapdrm/omap_dmm_tiler.h b/drivers/gpu/drm/omapdrm/omap_dmm_tiler.h index 835e6654fa82..43c1d096b021 100644 --- a/drivers/gpu/drm/omapdrm/omap_dmm_tiler.h +++ b/drivers/gpu/drm/omapdrm/omap_dmm_tiler.h @@ -113,7 +113,7 @@ extern struct platform_driver omap_dmm_driver; /* GEM bo flags -> tiler fmt */ static inline enum tiler_fmt gem2fmt(u32 flags) { - switch (flags & OMAP_BO_TILED) { + switch (flags & OMAP_BO_TILED_MASK) { case OMAP_BO_TILED_8: return TILFMT_8BIT; case OMAP_BO_TILED_16: diff --git a/drivers/gpu/drm/omapdrm/omap_fb.c b/drivers/gpu/drm/omapdrm/omap_fb.c index 7e89e5cb4068..8daf46fd5e05 100644 --- a/drivers/gpu/drm/omapdrm/omap_fb.c +++ b/drivers/gpu/drm/omapdrm/omap_fb.c @@ -95,7 +95,7 @@ static u32 get_linear_addr(struct drm_framebuffer *fb,
bool omap_framebuffer_supports_rotation(struct drm_framebuffer *fb) { - return omap_gem_flags(fb->obj[0]) & OMAP_BO_TILED; + return omap_gem_flags(fb->obj[0]) & OMAP_BO_TILED_MASK; }
/* Note: DRM rotates counter-clockwise, TILER & DSS rotates clockwise */ @@ -154,7 +154,7 @@ void omap_framebuffer_update_scanout(struct drm_framebuffer *fb, x = state->src_x >> 16; y = state->src_y >> 16;
- if (omap_gem_flags(fb->obj[0]) & OMAP_BO_TILED) { + if (omap_gem_flags(fb->obj[0]) & OMAP_BO_TILED_MASK) { u32 w = state->src_w >> 16; u32 h = state->src_h >> 16;
@@ -212,7 +212,7 @@ void omap_framebuffer_update_scanout(struct drm_framebuffer *fb, plane = &omap_fb->planes[1];
if (info->rotation_type == OMAP_DSS_ROT_TILER) { - WARN_ON(!(omap_gem_flags(fb->obj[1]) & OMAP_BO_TILED)); + WARN_ON(!(omap_gem_flags(fb->obj[1]) & OMAP_BO_TILED_MASK)); omap_gem_rotated_dma_addr(fb->obj[1], orient, x/2, y/2, &info->p_uv_addr); } else { diff --git a/drivers/gpu/drm/omapdrm/omap_gem.c b/drivers/gpu/drm/omapdrm/omap_gem.c index 2ac83cdbb15a..07dba300ec07 100644 --- a/drivers/gpu/drm/omapdrm/omap_gem.c +++ b/drivers/gpu/drm/omapdrm/omap_gem.c @@ -194,7 +194,7 @@ static void omap_gem_evict(struct drm_gem_object *obj) struct omap_gem_object *omap_obj = to_omap_bo(obj); struct omap_drm_private *priv = obj->dev->dev_private;
- if (omap_obj->flags & OMAP_BO_TILED) { + if (omap_obj->flags & OMAP_BO_TILED_MASK) { enum tiler_fmt fmt = gem2fmt(omap_obj->flags); int i;
@@ -322,7 +322,7 @@ size_t omap_gem_mmap_size(struct drm_gem_object *obj) struct omap_gem_object *omap_obj = to_omap_bo(obj); size_t size = obj->size;
- if (omap_obj->flags & OMAP_BO_TILED) { + if (omap_obj->flags & OMAP_BO_TILED_MASK) { /* for tiled buffers, the virtual size has stride rounded up * to 4kb.. (to hide the fact that row n+1 might start 16kb or * 32kb later!). But we don't back the entire buffer with @@ -511,7 +511,7 @@ vm_fault_t omap_gem_fault(struct vm_fault *vmf) * probably trigger put_pages()? */
- if (omap_obj->flags & OMAP_BO_TILED) + if (omap_obj->flags & OMAP_BO_TILED_MASK) ret = omap_gem_fault_2d(obj, vma, vmf); else ret = omap_gem_fault_1d(obj, vma, vmf); @@ -784,7 +784,7 @@ int omap_gem_pin(struct drm_gem_object *obj, dma_addr_t *dma_addr) if (ret) goto fail;
- if (omap_obj->flags & OMAP_BO_TILED) { + if (omap_obj->flags & OMAP_BO_TILED_MASK) { block = tiler_reserve_2d(fmt, omap_obj->width, omap_obj->height, 0); @@ -890,7 +890,7 @@ int omap_gem_rotated_dma_addr(struct drm_gem_object *obj, u32 orient, mutex_lock(&omap_obj->lock);
if ((refcount_read(&omap_obj->dma_addr_cnt) > 0) && omap_obj->block && - (omap_obj->flags & OMAP_BO_TILED)) { + (omap_obj->flags & OMAP_BO_TILED_MASK)) { *dma_addr = tiler_tsptr(omap_obj->block, orient, x, y); ret = 0; } @@ -905,7 +905,7 @@ int omap_gem_tiled_stride(struct drm_gem_object *obj, u32 orient) { struct omap_gem_object *omap_obj = to_omap_bo(obj); int ret = -EINVAL; - if (omap_obj->flags & OMAP_BO_TILED) + if (omap_obj->flags & OMAP_BO_TILED_MASK) ret = tiler_stride(gem2fmt(omap_obj->flags), orient); return ret; } @@ -1044,7 +1044,7 @@ void omap_gem_describe(struct drm_gem_object *obj, struct seq_file *m) refcount_read(&omap_obj->dma_addr_cnt), omap_obj->vaddr, omap_obj->roll);
- if (omap_obj->flags & OMAP_BO_TILED) { + if (omap_obj->flags & OMAP_BO_TILED_MASK) { seq_printf(m, " %dx%d", omap_obj->width, omap_obj->height); if (omap_obj->block) { struct tcm_area *area = &omap_obj->block->area; @@ -1143,7 +1143,7 @@ struct drm_gem_object *omap_gem_new(struct drm_device *dev, int ret;
/* Validate the flags and compute the memory and cache flags. */ - if (flags & OMAP_BO_TILED) { + if (flags & OMAP_BO_TILED_MASK) { if (!priv->usergart) { dev_err(dev->dev, "Tiled buffers require DMM\n"); return NULL; @@ -1185,7 +1185,7 @@ struct drm_gem_object *omap_gem_new(struct drm_device *dev, omap_obj->flags = flags; mutex_init(&omap_obj->lock);
- if (flags & OMAP_BO_TILED) { + if (flags & OMAP_BO_TILED_MASK) { /* * For tiled buffers align dimensions to slot boundaries and * calculate size based on aligned dimensions. diff --git a/drivers/gpu/drm/omapdrm/omap_gem_dmabuf.c b/drivers/gpu/drm/omapdrm/omap_gem_dmabuf.c index 07c0b1b486f7..a4a4415e4666 100644 --- a/drivers/gpu/drm/omapdrm/omap_gem_dmabuf.c +++ b/drivers/gpu/drm/omapdrm/omap_gem_dmabuf.c @@ -64,7 +64,7 @@ static int omap_gem_dmabuf_begin_cpu_access(struct dma_buf *buffer, { struct drm_gem_object *obj = buffer->priv; struct page **pages; - if (omap_gem_flags(obj) & OMAP_BO_TILED) { + if (omap_gem_flags(obj) & OMAP_BO_TILED_MASK) { /* TODO we would need to pin at least part of the buffer to * get de-tiled view. For now just reject it. */ diff --git a/include/uapi/drm/omap_drm.h b/include/uapi/drm/omap_drm.h index d8ee2f840697..5a142fad473c 100644 --- a/include/uapi/drm/omap_drm.h +++ b/include/uapi/drm/omap_drm.h @@ -52,7 +52,6 @@ struct drm_omap_param { #define OMAP_BO_TILED_16 0x00000200 #define OMAP_BO_TILED_32 0x00000300 #define OMAP_BO_TILED_MASK 0x00000f00 -#define OMAP_BO_TILED (OMAP_BO_TILED_8 | OMAP_BO_TILED_16 | OMAP_BO_TILED_32)
union omap_gem_size { __u32 bytes; /* (for non-tiled formats) */
From: Tomi Valkeinen tomi.valkeinen@ti.com
omap_gem_new() has a comment about OMAP_BO_SCANOUT which does not make sense. Also, for the TILER case, we drop OMAP_BO_SCANOUT flag for some reason.
It's not clear what the original purpose of OMAP_BO_SCANOUT is, but presuming it means "scanout buffer, something that can be consumed by DSS", this patch cleans up the above issues.
Signed-off-by: Tomi Valkeinen tomi.valkeinen@ti.com Reviewed-by: Jean-Jacques Hiblot jjhiblot@ti.com --- drivers/gpu/drm/omapdrm/omap_gem.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-)
diff --git a/drivers/gpu/drm/omapdrm/omap_gem.c b/drivers/gpu/drm/omapdrm/omap_gem.c index 07dba300ec07..5c4cdf618347 100644 --- a/drivers/gpu/drm/omapdrm/omap_gem.c +++ b/drivers/gpu/drm/omapdrm/omap_gem.c @@ -1153,7 +1153,6 @@ struct drm_gem_object *omap_gem_new(struct drm_device *dev, * Tiled buffers are always shmem paged backed. When they are * scanned out, they are remapped into DMM/TILER. */ - flags &= ~OMAP_BO_SCANOUT; flags |= OMAP_BO_MEM_SHMEM;
/* @@ -1164,9 +1163,8 @@ struct drm_gem_object *omap_gem_new(struct drm_device *dev, flags |= tiler_get_cpu_cache_flags(); } else if ((flags & OMAP_BO_SCANOUT) && !priv->has_dmm) { /* - * OMAP_BO_SCANOUT hints that the buffer doesn't need to be - * tiled. However, to lower the pressure on memory allocation, - * use contiguous memory only if no TILER is available. + * If we don't have DMM, we must allocate scanout buffers + * from contiguous DMA memory. */ flags |= OMAP_BO_MEM_DMA_API; } else if (!(flags & OMAP_BO_MEM_DMABUF)) {
From: Tomi Valkeinen tomi.valkeinen@ti.com
Add a helper function omap_gem_validate_flags() which validates the omap_bo flags passed from the userspace.
Also drop the dev_err() message, as the userspace can cause that at will.
Signed-off-by: Tomi Valkeinen tomi.valkeinen@ti.com Reviewed-by: Jean-Jacques Hiblot jjhiblot@ti.com --- drivers/gpu/drm/omapdrm/omap_gem.c | 40 ++++++++++++++++++++++++++---- 1 file changed, 35 insertions(+), 5 deletions(-)
diff --git a/drivers/gpu/drm/omapdrm/omap_gem.c b/drivers/gpu/drm/omapdrm/omap_gem.c index 5c4cdf618347..cafa9d0bf3c1 100644 --- a/drivers/gpu/drm/omapdrm/omap_gem.c +++ b/drivers/gpu/drm/omapdrm/omap_gem.c @@ -1131,6 +1131,38 @@ void omap_gem_free_object(struct drm_gem_object *obj) kfree(omap_obj); }
+static bool omap_gem_validate_flags(struct drm_device *dev, u32 flags) +{ + struct omap_drm_private *priv = dev->dev_private; + + switch (flags & OMAP_BO_CACHE_MASK) { + case OMAP_BO_CACHED: + case OMAP_BO_WC: + case OMAP_BO_CACHE_MASK: + break; + + default: + return false; + } + + if (flags & OMAP_BO_TILED_MASK) { + if (!priv->usergart) + return false; + + switch (flags & OMAP_BO_TILED_MASK) { + case OMAP_BO_TILED_8: + case OMAP_BO_TILED_16: + case OMAP_BO_TILED_32: + break; + + default: + return false; + } + } + + return true; +} + /* GEM buffer object constructor */ struct drm_gem_object *omap_gem_new(struct drm_device *dev, union omap_gem_size gsize, u32 flags) @@ -1142,13 +1174,11 @@ struct drm_gem_object *omap_gem_new(struct drm_device *dev, size_t size; int ret;
+ if (!omap_gem_validate_flags(dev, flags)) + return NULL; + /* Validate the flags and compute the memory and cache flags. */ if (flags & OMAP_BO_TILED_MASK) { - if (!priv->usergart) { - dev_err(dev->dev, "Tiled buffers require DMM\n"); - return NULL; - } - /* * Tiled buffers are always shmem paged backed. When they are * scanned out, they are remapped into DMM/TILER.
From: Tomi Valkeinen tomi.valkeinen@ti.com
On SoCs with DMM/TILER, we have two ways to allocate buffers: normal dma_alloc or via DMM (which basically functions as an IOMMU). DMM can map 128MB at a time, and we only map the DMM buffers when they are used (i.e. not at alloc time). If DMM is present, omapdrm always uses DMM.
There are use cases that require lots of big buffers that are being used at the same time by different IPs. At the moment the userspace has a hard maximum of 128MB.
This patch adds three new flags that can be used by the userspace to solve the situation:
OMAP_BO_MEM_CONTIG: The driver will use dma_alloc to get the memory. This can be used to avoid DMM if the userspace knows it needs more than 128M of memory at the same time.
OMAP_BO_MEM_DMM: The driver will use DMM to get the memory. There's not much use for this flag at the moment, as on platforms with DMM it is used by default, but it's here for completeness.
OMAP_BO_MEM_PIN: The driver will pin the memory at alloc time, and keep it pinned. This can be used to 1) get an error at alloc time if DMM space is full, and 2) get rid of the constant pin/unpin operations which may have some effect on performance.
If none of the flags are given, the behavior is the same as currently.
Signed-off-by: Tomi Valkeinen tomi.valkeinen@ti.com Reviewed-by: Jean-Jacques Hiblot jjhiblot@ti.com --- drivers/gpu/drm/omapdrm/omap_gem.c | 54 ++++++++++++++++++++++++++++-- include/uapi/drm/omap_drm.h | 9 +++++ 2 files changed, 61 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/omapdrm/omap_gem.c b/drivers/gpu/drm/omapdrm/omap_gem.c index cafa9d0bf3c1..9af2c9a4bd28 100644 --- a/drivers/gpu/drm/omapdrm/omap_gem.c +++ b/drivers/gpu/drm/omapdrm/omap_gem.c @@ -1095,6 +1095,9 @@ void omap_gem_free_object(struct drm_gem_object *obj) list_del(&omap_obj->mm_list); mutex_unlock(&priv->list_lock);
+ if (omap_obj->flags & OMAP_BO_MEM_PIN) + omap_gem_unpin_locked(obj); + /* * We own the sole reference to the object at this point, but to keep * lockdep happy, we must still take the omap_obj_lock to call @@ -1145,10 +1148,19 @@ static bool omap_gem_validate_flags(struct drm_device *dev, u32 flags) return false; }
+ if ((flags & OMAP_BO_MEM_CONTIG) && (flags & OMAP_BO_MEM_DMM)) + return false; + + if ((flags & OMAP_BO_MEM_DMM) && !priv->usergart) + return false; + if (flags & OMAP_BO_TILED_MASK) { if (!priv->usergart) return false;
+ if (flags & OMAP_BO_MEM_CONTIG) + return false; + switch (flags & OMAP_BO_TILED_MASK) { case OMAP_BO_TILED_8: case OMAP_BO_TILED_16: @@ -1163,7 +1175,34 @@ static bool omap_gem_validate_flags(struct drm_device *dev, u32 flags) return true; }
-/* GEM buffer object constructor */ +/** + * omap_gem_new() - Create a new GEM buffer + * @dev: The DRM device + * @gsize: The requested size for the GEM buffer. If the buffer is tiled + * (2D buffer), the size is a pair of values: height and width + * expressed in pixels. If the buffers is not tiled, it is expressed + * in bytes. + * @flags: Flags give additionnal information about the allocation: + * OMAP_BO_TILED_x: use the TILER (2D buffers). The TILER container + * unit can be 8, 16 or 32 bits. Cache is always disabled for + * tiled buffers. + * OMAP_BO_SCANOUT: Scannout buffer, consummable by the DSS + * OMAP_BO_CACHED: Buffer CPU caching mode: cached + * OMAP_BO_WC: Buffer CPU caching mode: write-combined + * OMAP_BO_UNCACHED: Buffer CPU caching mode: uncached + * OMAP_BO_MEM_CONTIG: The driver will use dma_alloc to get the memory. + * This can be used to avoid DMM if the userspace knows it needs + * more than 128M of memory at the same time. + * OMAP_BO_MEM_DMM: The driver will use DMM to get the memory. There's + * not much use for this flag at the moment, as on platforms with + * DMM it is used by default, but it's here for completeness. + * OMAP_BO_MEM_PIN: The driver will pin the memory at alloc time, and + * keep it pinned. This can be used to 1) get an error at alloc + * time if DMM space is full, and 2) get rid of the constant + * pin/unpin operations which may have some effect on performance. + * + * Return: The GEM buffer or NULL if the allocation failed + */ struct drm_gem_object *omap_gem_new(struct drm_device *dev, union omap_gem_size gsize, u32 flags) { @@ -1191,7 +1230,8 @@ struct drm_gem_object *omap_gem_new(struct drm_device *dev, */ flags &= ~(OMAP_BO_CACHED|OMAP_BO_WC|OMAP_BO_UNCACHED); flags |= tiler_get_cpu_cache_flags(); - } else if ((flags & OMAP_BO_SCANOUT) && !priv->has_dmm) { + } else if ((flags & OMAP_BO_MEM_CONTIG) || + ((flags & OMAP_BO_SCANOUT) && !priv->has_dmm)) { /* * If we don't have DMM, we must allocate scanout buffers * from contiguous DMA memory. @@ -1251,12 +1291,22 @@ struct drm_gem_object *omap_gem_new(struct drm_device *dev, goto err_release; }
+ if (flags & OMAP_BO_MEM_PIN) { + ret = omap_gem_pin(obj, NULL); + if (ret) + goto err_free_dma; + } + mutex_lock(&priv->list_lock); list_add(&omap_obj->mm_list, &priv->obj_list); mutex_unlock(&priv->list_lock);
return obj;
+err_free_dma: + if (flags & OMAP_BO_MEM_DMA_API) + dma_free_writecombine(dev->dev, size, + omap_obj->vaddr, omap_obj->dma_addr); err_release: drm_gem_object_release(obj); err_free: diff --git a/include/uapi/drm/omap_drm.h b/include/uapi/drm/omap_drm.h index 5a142fad473c..842d3180a442 100644 --- a/include/uapi/drm/omap_drm.h +++ b/include/uapi/drm/omap_drm.h @@ -47,6 +47,15 @@ struct drm_omap_param { #define OMAP_BO_UNCACHED 0x00000004 #define OMAP_BO_CACHE_MASK 0x00000006
+/* Force allocation from contiguous DMA memory */ +#define OMAP_BO_MEM_CONTIG 0x00000008 + +/* Force allocation via DMM */ +#define OMAP_BO_MEM_DMM 0x00000010 + +/* Pin the buffer when allocating and keep pinned */ +#define OMAP_BO_MEM_PIN 0x00000020 + /* Use TILER for the buffer. The TILER container unit can be 8, 16 or 32 bits. */ #define OMAP_BO_TILED_8 0x00000100 #define OMAP_BO_TILED_16 0x00000200
Hi Jean-Jacques,
I love your patch! Yet something to improve:
[auto build test ERROR on linus/master] [cannot apply to v5.4-rc2 next-20191010] [if your patch is applied to the wrong git tree, please drop us a note to help improve the system. BTW, we also suggest to use '--base' option to specify the base tree in git format-patch, please see https://stackoverflow.com/a/37406982]
url: https://github.com/0day-ci/linux/commits/Jean-Jacques-Hiblot/drm-omap-OMAP_B... config: arm-allmodconfig (attached as .config) compiler: arm-linux-gnueabi-gcc (GCC) 7.4.0 reproduce: wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # save the attached .config to linux build tree GCC_VERSION=7.4.0 make.cross ARCH=arm
If you fix the issue, kindly add following tag Reported-by: kbuild test robot lkp@intel.com
All errors (new ones prefixed by >>):
drivers/gpu/drm/omapdrm/omap_gem.c: In function 'omap_gem_new':
drivers/gpu/drm/omapdrm/omap_gem.c:1310:3: error: implicit declaration of function 'dma_free_writecombine'; did you mean 'pgprot_writecombine'? [-Werror=implicit-function-declaration]
dma_free_writecombine(dev->dev, size, ^~~~~~~~~~~~~~~~~~~~~ pgprot_writecombine cc1: some warnings being treated as errors
vim +1310 drivers/gpu/drm/omapdrm/omap_gem.c
1179 1180 /** 1181 * omap_gem_new() - Create a new GEM buffer 1182 * @dev: The DRM device 1183 * @gsize: The requested size for the GEM buffer. If the buffer is tiled 1184 * (2D buffer), the size is a pair of values: height and width 1185 * expressed in pixels. If the buffers is not tiled, it is expressed 1186 * in bytes. 1187 * @flags: Flags give additionnal information about the allocation: 1188 * OMAP_BO_TILED_x: use the TILER (2D buffers). The TILER container 1189 * unit can be 8, 16 or 32 bits. Cache is always disabled for 1190 * tiled buffers. 1191 * OMAP_BO_SCANOUT: Scannout buffer, consummable by the DSS 1192 * OMAP_BO_CACHED: Buffer CPU caching mode: cached 1193 * OMAP_BO_WC: Buffer CPU caching mode: write-combined 1194 * OMAP_BO_UNCACHED: Buffer CPU caching mode: uncached 1195 * OMAP_BO_MEM_CONTIG: The driver will use dma_alloc to get the memory. 1196 * This can be used to avoid DMM if the userspace knows it needs 1197 * more than 128M of memory at the same time. 1198 * OMAP_BO_MEM_DMM: The driver will use DMM to get the memory. There's 1199 * not much use for this flag at the moment, as on platforms with 1200 * DMM it is used by default, but it's here for completeness. 1201 * OMAP_BO_MEM_PIN: The driver will pin the memory at alloc time, and 1202 * keep it pinned. This can be used to 1) get an error at alloc 1203 * time if DMM space is full, and 2) get rid of the constant 1204 * pin/unpin operations which may have some effect on performance. 1205 * 1206 * Return: The GEM buffer or NULL if the allocation failed 1207 */ 1208 struct drm_gem_object *omap_gem_new(struct drm_device *dev, 1209 union omap_gem_size gsize, u32 flags) 1210 { 1211 struct omap_drm_private *priv = dev->dev_private; 1212 struct omap_gem_object *omap_obj; 1213 struct drm_gem_object *obj; 1214 struct address_space *mapping; 1215 size_t size; 1216 int ret; 1217 1218 if (!omap_gem_validate_flags(dev, flags)) 1219 return NULL; 1220 1221 /* Validate the flags and compute the memory and cache flags. */ 1222 if (flags & OMAP_BO_TILED_MASK) { 1223 /* 1224 * Tiled buffers are always shmem paged backed. When they are 1225 * scanned out, they are remapped into DMM/TILER. 1226 */ 1227 flags |= OMAP_BO_MEM_SHMEM; 1228 1229 /* 1230 * Currently don't allow cached buffers. There is some caching 1231 * stuff that needs to be handled better. 1232 */ 1233 flags &= ~(OMAP_BO_CACHED|OMAP_BO_WC|OMAP_BO_UNCACHED); 1234 flags |= tiler_get_cpu_cache_flags(); 1235 } else if ((flags & OMAP_BO_MEM_CONTIG) || 1236 ((flags & OMAP_BO_SCANOUT) && !priv->has_dmm)) { 1237 /* 1238 * If we don't have DMM, we must allocate scanout buffers 1239 * from contiguous DMA memory. 1240 */ 1241 flags |= OMAP_BO_MEM_DMA_API; 1242 } else if (!(flags & OMAP_BO_MEM_DMABUF)) { 1243 /* 1244 * All other buffers not backed by dma_buf are shmem-backed. 1245 */ 1246 flags |= OMAP_BO_MEM_SHMEM; 1247 } 1248 1249 /* Allocate the initialize the OMAP GEM object. */ 1250 omap_obj = kzalloc(sizeof(*omap_obj), GFP_KERNEL); 1251 if (!omap_obj) 1252 return NULL; 1253 1254 obj = &omap_obj->base; 1255 omap_obj->flags = flags; 1256 mutex_init(&omap_obj->lock); 1257 1258 if (flags & OMAP_BO_TILED_MASK) { 1259 /* 1260 * For tiled buffers align dimensions to slot boundaries and 1261 * calculate size based on aligned dimensions. 1262 */ 1263 tiler_align(gem2fmt(flags), &gsize.tiled.width, 1264 &gsize.tiled.height); 1265 1266 size = tiler_size(gem2fmt(flags), gsize.tiled.width, 1267 gsize.tiled.height); 1268 1269 omap_obj->width = gsize.tiled.width; 1270 omap_obj->height = gsize.tiled.height; 1271 } else { 1272 size = PAGE_ALIGN(gsize.bytes); 1273 } 1274 1275 /* Initialize the GEM object. */ 1276 if (!(flags & OMAP_BO_MEM_SHMEM)) { 1277 drm_gem_private_object_init(dev, obj, size); 1278 } else { 1279 ret = drm_gem_object_init(dev, obj, size); 1280 if (ret) 1281 goto err_free; 1282 1283 mapping = obj->filp->f_mapping; 1284 mapping_set_gfp_mask(mapping, GFP_USER | __GFP_DMA32); 1285 } 1286 1287 /* Allocate memory if needed. */ 1288 if (flags & OMAP_BO_MEM_DMA_API) { 1289 omap_obj->vaddr = dma_alloc_wc(dev->dev, size, 1290 &omap_obj->dma_addr, 1291 GFP_KERNEL); 1292 if (!omap_obj->vaddr) 1293 goto err_release; 1294 } 1295 1296 if (flags & OMAP_BO_MEM_PIN) { 1297 ret = omap_gem_pin(obj, NULL); 1298 if (ret) 1299 goto err_free_dma; 1300 } 1301 1302 mutex_lock(&priv->list_lock); 1303 list_add(&omap_obj->mm_list, &priv->obj_list); 1304 mutex_unlock(&priv->list_lock); 1305 1306 return obj; 1307 1308 err_free_dma: 1309 if (flags & OMAP_BO_MEM_DMA_API)
1310 dma_free_writecombine(dev->dev, size,
1311 omap_obj->vaddr, omap_obj->dma_addr); 1312 err_release: 1313 drm_gem_object_release(obj); 1314 err_free: 1315 kfree(omap_obj); 1316 return NULL; 1317 } 1318
--- 0-DAY kernel test infrastructure Open Source Technology Center https://lists.01.org/pipermail/kbuild-all Intel Corporation
Hi JJ,
On 10/10/2019 12:34, Jean-Jacques Hiblot wrote:
A first version of this work had been sent by Tomi Valkeinen in may 2017 (https://www.spinics.net/lists/dri-devel/msg140663.html).
This series adds a few new OMAP_BO flags to help the userspace manage situations where it needs to use lots of buffers, and would currently run out of TILER space. The TILER space is limited to mapping 128MB at any given time and some applications might need more.
This seres is also the opportunity to do some cleanup in the flags and improve the comments describing them.
The user-space patches for libdrm, although ready, haven't been posted yet. It will be be done when this series have been discussed and hopefully in the process of getting merged.
JJ
changes in v4:
- fixed incremental build issue introduced by patch #1 and later fixed by patch #2.
- Added my reviewed-by to Tomis's patch
This doesn't compile on top of 5.4 as the last patch is using dma_free_writecombine instead of dma_free_wc. In v3, it was correct, but the changenotes don't mention the change.
Was there some mix up? What kernel are your patches based on?
Tomi
On 10/10/2019 11:45, Tomi Valkeinen wrote:
Hi JJ,
On 10/10/2019 12:34, Jean-Jacques Hiblot wrote:
A first version of this work had been sent by Tomi Valkeinen in may 2017 (https://www.spinics.net/lists/dri-devel/msg140663.html).
This series adds a few new OMAP_BO flags to help the userspace manage situations where it needs to use lots of buffers, and would currently run out of TILER space. The TILER space is limited to mapping 128MB at any given time and some applications might need more.
This seres is also the opportunity to do some cleanup in the flags and improve the comments describing them.
The user-space patches for libdrm, although ready, haven't been posted yet. It will be be done when this series have been discussed and hopefully in the process of getting merged.
JJ
changes in v4:
- fixed incremental build issue introduced by patch #1 and later
fixed by patch #2.
- Added my reviewed-by to Tomis's patch
This doesn't compile on top of 5.4 as the last patch is using dma_free_writecombine instead of dma_free_wc. In v3, it was correct, but the changenotes don't mention the change.
Was there some mix up? What kernel are your patches based on?
Yes I rushed an then got things mixed up, doing other stuff in parallel. Sorry about that.
Expect a fixed version based on v5.4-rc2 soon.
JJ
Tomi
dri-devel@lists.freedesktop.org