[PATCH 00/22] Tegra DRM fixes

List overview All Threads
Download

newer

older

[radeon-alex:amd-staging-4.11...

[PATCH 0/6] drm/msm: support...

Dmitry Osipenko

23 May 2017 23 May '17

12:14 a.m.

Hello,

I have already sent some of the patches contained in this series and some of them got reviews. Later I added couple more patches and the dependencies started to form, so please ignore all the patches I sent before this series. The patches without r-b signatures require a review.

Dmitry Osipenko (21): drm/tegra: Fix lockup on a use of staging API drm/tegra: Correct idr_alloc() minimum id drm/tegra: Check whether page belongs to BO in tegra_bo_kmap() drm/tegra: Check for malformed offsets and sizes in the 'submit' IOCTL drm/tegra: Correct copying of waitchecks and disable them in the 'submit' IOCTL drm/tegra: Check syncpoint ID in the 'submit' IOCTL drm/tegra: Remove module ownership from the tegra_fb_ops drm/tegra: dc: Drop the reset asserts to workaround a bug drm/tegra: dc: Apply clipping to the plane drm/tegra: Disable plane if it is invisible gpu: host1x: Initialize firewall class to the jobs one gpu: host1x: Correct host1x_job_pin() error handling gpu: host1x: Do not leak BO's phys address to userspace gpu: host1x: Forbid relocation address shifting in the firewall gpu: host1x: Forbid RESTART opcode in the firewall gpu: host1x: Forbid unrelated SETCLASS opcode in the firewall gpu: host1x: Check waits in the firewall gpu: host1x: Remove unused 'struct host1x_cmdbuf' gpu: host1x: Remove unused host1x_cdma_stop() definition drm/tegra: Don't use IOMMU on Tegra20 Revert "iommu/tegra: gart: Do not register with bus"

Mikko Perttunen (1): gpu: host1x: Refactor channel allocation code

-- 2.13.0

Show replies by date

Dmitry Osipenko

23 May 23 May

12:14 a.m.

New subject: [PATCH 01/22] drm/tegra: Fix lockup on a use of staging API

Commit bdd2f9cd ("Don't leak kernel pointer to userspace") added a mutex around staging IOCTL's, some of those mutexes are taken twice.

Fixes: bdd2f9cd10eb ("drm/tegra: Don't leak kernel pointer to userspace") Signed-off-by: Dmitry Osipenko digetx@gmail.com Reviewed-by: Mikko Perttunen mperttunen@nvidia.com --- drivers/gpu/drm/tegra/drm.c | 20 ++++---------------- 1 file changed, 4 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/tegra/drm.c b/drivers/gpu/drm/tegra/drm.c index 51c48a8e00ec..b49d32a89f40 100644 --- a/drivers/gpu/drm/tegra/drm.c +++ b/drivers/gpu/drm/tegra/drm.c @@ -451,18 +451,6 @@ int tegra_drm_submit(struct tegra_drm_context *context,

#ifdef CONFIG_DRM_TEGRA_STAGING -static struct tegra_drm_context * -tegra_drm_file_get_context(struct tegra_drm_file *file, u32 id) -{ - struct tegra_drm_context *context; - - mutex_lock(&file->lock); - context = idr_find(&file->contexts, id); - mutex_unlock(&file->lock); - - return context; -} - static int tegra_gem_create(struct drm_device *drm, void *data, struct drm_file *file) { @@ -606,7 +594,7 @@ static int tegra_close_channel(struct drm_device *drm, void *data,

mutex_lock(&fpriv->lock);

- context = tegra_drm_file_get_context(fpriv, args->context); + context = idr_find(&fpriv->contexts, args->context); if (!context) { err = -EINVAL; goto unlock; @@ -631,7 +619,7 @@ static int tegra_get_syncpt(struct drm_device *drm, void *data,

mutex_lock(&fpriv->lock);

- context = tegra_drm_file_get_context(fpriv, args->context); + context = idr_find(&fpriv->contexts, args->context); if (!context) { err = -ENODEV; goto unlock; @@ -660,7 +648,7 @@ static int tegra_submit(struct drm_device *drm, void *data,

mutex_lock(&fpriv->lock);

- context = tegra_drm_file_get_context(fpriv, args->context); + context = idr_find(&fpriv->contexts, args->context); if (!context) { err = -ENODEV; goto unlock; @@ -685,7 +673,7 @@ static int tegra_get_syncpt_base(struct drm_device *drm, void *data,

mutex_lock(&fpriv->lock);

- context = tegra_drm_file_get_context(fpriv, args->context); + context = idr_find(&fpriv->contexts, args->context); if (!context) { err = -ENODEV; goto unlock;

-- 2.13.0

Dmitry Osipenko

12:14 a.m.

New subject: [PATCH 02/22] drm/tegra: Correct idr_alloc() minimum id

The client ID 0 is reserved by the host1x/cdma to mark the timeout timer work as already been scheduled and context ID is used as the clients one. This fixes spurious CDMA timeouts.

diff --git a/drivers/gpu/drm/tegra/drm.c b/drivers/gpu/drm/tegra/drm.c index b49d32a89f40..f9282da94a1f 100644 --- a/drivers/gpu/drm/tegra/drm.c +++ b/drivers/gpu/drm/tegra/drm.c @@ -539,7 +539,7 @@ static int tegra_client_open(struct tegra_drm_file *fpriv, if (err < 0) return err;

- err = idr_alloc(&fpriv->contexts, context, 0, 0, GFP_KERNEL); + err = idr_alloc(&fpriv->contexts, context, 1, 0, GFP_KERNEL); if (err < 0) { client->ops->close_channel(context); return err;

-- 2.13.0

Dmitry Osipenko

12:14 a.m.

New subject: [PATCH 03/22] drm/tegra: Check whether page belongs to BO in tegra_bo_kmap()

This fixes an OOPS in case of out-of-bounds accessing of a kmap'ed cmdbuf (non-IOMMU allocation) while patching the relocations in do_relocs().

Signed-off-by: Dmitry Osipenko digetx@gmail.com --- drivers/gpu/drm/tegra/gem.c | 3 +++ 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/tegra/gem.c b/drivers/gpu/drm/tegra/gem.c index 424569b53e57..ca0d4439e97b 100644 --- a/drivers/gpu/drm/tegra/gem.c +++ b/drivers/gpu/drm/tegra/gem.c @@ -74,6 +74,9 @@ static void *tegra_bo_kmap(struct host1x_bo *bo, unsigned int page) { struct tegra_bo *obj = host1x_to_tegra_bo(bo);

+ if (page * PAGE_SIZE >= obj->gem.size) + return NULL; + if (obj->vaddr) return obj->vaddr + page * PAGE_SIZE; else if (obj->gem.import_attach)

-- 2.13.0

Erik Faye-Lund

12:21 a.m.

New subject: [PATCH 03/22] drm/tegra: Check whether page belongs to BO in tegra_bo_kmap()

On Tue, May 23, 2017 at 2:14 AM, Dmitry Osipenko digetx@gmail.com wrote:

...

This fixes an OOPS in case of out-of-bounds accessing of a kmap'ed cmdbuf (non-IOMMU allocation) while patching the relocations in do_relocs().

Signed-off-by: Dmitry Osipenko digetx@gmail.com

drivers/gpu/drm/tegra/gem.c | 3 +++ 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/tegra/gem.c b/drivers/gpu/drm/tegra/gem.c index 424569b53e57..ca0d4439e97b 100644 --- a/drivers/gpu/drm/tegra/gem.c +++ b/drivers/gpu/drm/tegra/gem.c @@ -74,6 +74,9 @@ static void *tegra_bo_kmap(struct host1x_bo *bo, unsigned int page) { struct tegra_bo *obj = host1x_to_tegra_bo(bo);
  if (page * PAGE_SIZE >= obj->gem.size)
          return NULL;
  if (obj->vaddr)
          return obj->vaddr + page * PAGE_SIZE;
  else if (obj->gem.import_attach)
-- 2.13.0

Reviewed-by: Erik Faye-Lund kusmabite@gmail.com

Mikko Perttunen

1 Jun 1 Jun

6:01 p.m.

New subject: [PATCH 03/22] drm/tegra: Check whether page belongs to BO in tegra_bo_kmap()

On 05/23/2017 03:14 AM, Dmitry Osipenko wrote:

...

This fixes an OOPS in case of out-of-bounds accessing of a kmap'ed cmdbuf (non-IOMMU allocation) while patching the relocations in do_relocs().

Signed-off-by: Dmitry Osipenko digetx@gmail.com

drivers/gpu/drm/tegra/gem.c | 3 +++ 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/tegra/gem.c b/drivers/gpu/drm/tegra/gem.c index 424569b53e57..ca0d4439e97b 100644 --- a/drivers/gpu/drm/tegra/gem.c +++ b/drivers/gpu/drm/tegra/gem.c @@ -74,6 +74,9 @@ static void *tegra_bo_kmap(struct host1x_bo *bo, unsigned int page) { struct tegra_bo *obj = host1x_to_tegra_bo(bo);
if (page * PAGE_SIZE >= obj->gem.size)
return NULL;

The multiplication here could overflow, so it needs the same u64 treatment to catch all problem situations. I'm not sure if this is required, though, with the other bounds check patches in this series.

...

if (obj->vaddr) return obj->vaddr + page * PAGE_SIZE; else if (obj->gem.import_attach)

Dmitry Osipenko

6:32 p.m.

New subject: [PATCH 03/22] drm/tegra: Check whether page belongs to BO in tegra_bo_kmap()

On 01.06.2017 21:01, Mikko Perttunen wrote:

...

On 05/23/2017 03:14 AM, Dmitry Osipenko wrote:

...
This fixes an OOPS in case of out-of-bounds accessing of a kmap'ed cmdbuf (non-IOMMU allocation) while patching the relocations in do_relocs().

Signed-off-by: Dmitry Osipenko digetx@gmail.com

drivers/gpu/drm/tegra/gem.c | 3 +++ 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/tegra/gem.c b/drivers/gpu/drm/tegra/gem.c index 424569b53e57..ca0d4439e97b 100644 --- a/drivers/gpu/drm/tegra/gem.c +++ b/drivers/gpu/drm/tegra/gem.c @@ -74,6 +74,9 @@ static void *tegra_bo_kmap(struct host1x_bo *bo, unsigned int page) { struct tegra_bo *obj = host1x_to_tegra_bo(bo);
if (page * PAGE_SIZE >= obj->gem.size)
   return NULL;
The multiplication here could overflow, so it needs the same u64 treatment to catch all problem situations. I'm not sure if this is required, though, with the other bounds check patches in this series.

Right, I'll checks once more if this patch is still needed, thank you.

...

  if (obj->vaddr)
      return obj->vaddr + page * PAGE_SIZE;
  else if (obj->gem.import_attach)

-- Dmitry

Dmitry Osipenko

23 May 23 May

12:14 a.m.

New subject: [PATCH 04/22] drm/tegra: Check for malformed offsets and sizes in the 'submit' IOCTL

If commands buffer claims a number of words that is higher than its BO can fit, a kernel OOPS will be fired on the out-of-bounds BO access. This was triggered by an opentegra Xorg driver that erroneously pushed too many commands to the pushbuf.

The CMDA commands buffer address is 4 bytes aligned, so check its alignment.

The maximum number of the CDMA gather fetches is 16383, add a check for it.

Add a sanity check for the relocations in a same way.

[ 46.829393] Unable to handle kernel paging request at virtual address f09b2000 ... [<c04a3ba4>] (host1x_job_pin) from [<c04dfcd0>] (tegra_drm_submit+0x474/0x510) [<c04dfcd0>] (tegra_drm_submit) from [<c04deea0>] (tegra_submit+0x50/0x6c) [<c04deea0>] (tegra_submit) from [<c04c07c0>] (drm_ioctl+0x1e4/0x3ec) [<c04c07c0>] (drm_ioctl) from [<c02541a0>] (do_vfs_ioctl+0x9c/0x8e4) [<c02541a0>] (do_vfs_ioctl) from [<c0254a1c>] (SyS_ioctl+0x34/0x5c) [<c0254a1c>] (SyS_ioctl) from [<c0107640>] (ret_fast_syscall+0x0/0x3c)

Signed-off-by: Dmitry Osipenko digetx@gmail.com Reviewed-by: Erik Faye-Lund kusmabite@gmail.com --- drivers/gpu/drm/tegra/drm.c | 48 +++++++++++++++++++++++++++++++++++++++++++++ drivers/gpu/drm/tegra/gem.c | 5 ----- drivers/gpu/drm/tegra/gem.h | 5 +++++ 3 files changed, 53 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/tegra/drm.c b/drivers/gpu/drm/tegra/drm.c index f9282da94a1f..7e4559ec824d 100644 --- a/drivers/gpu/drm/tegra/drm.c +++ b/drivers/gpu/drm/tegra/drm.c @@ -26,6 +26,7 @@ #define DRIVER_PATCHLEVEL 0

#define CARVEOUT_SZ SZ_64M +#define CDMA_GATHER_FETCHES_MAX_NB 16383

struct tegra_drm_file { struct idr contexts; @@ -383,18 +384,41 @@ int tegra_drm_submit(struct tegra_drm_context *context, while (num_cmdbufs) { struct drm_tegra_cmdbuf cmdbuf; struct host1x_bo *bo; + struct tegra_bo *obj; + u64 offset;

if (copy_from_user(&cmdbuf, cmdbufs, sizeof(cmdbuf))) { err = -EFAULT; goto fail; }

+ /* + * The maximum number of CDMA gather fetches is 16383, a higher + * value means the words count is malformed. + */ + if (cmdbuf.words > CDMA_GATHER_FETCHES_MAX_NB) { + err = -EINVAL; + goto fail; + } + bo = host1x_bo_lookup(file, cmdbuf.handle); if (!bo) { err = -ENOENT; goto fail; }

+ offset = (u64)cmdbuf.offset + (u64)cmdbuf.words * sizeof(u32); + obj = host1x_to_tegra_bo(bo); + + /* + * The CDMA base address if 4-bytes aligned, unaligned offset + * is malformed. + */ + if (offset & 3 || offset >= obj->gem.size) { + err = -EINVAL; + goto fail; + } + host1x_job_add_gather(job, bo, cmdbuf.words, cmdbuf.offset); num_cmdbufs--; cmdbufs++; @@ -402,11 +426,35 @@ int tegra_drm_submit(struct tegra_drm_context *context,

/* copy and resolve relocations from submit */ while (num_relocs--) { + struct host1x_reloc *reloc; + struct tegra_bo *obj; + err = host1x_reloc_copy_from_user(&job->relocarray[num_relocs], &relocs[num_relocs], drm, file); if (err < 0) goto fail; + + reloc = &job->relocarray[num_relocs]; + obj = host1x_to_tegra_bo(reloc->cmdbuf.bo); + + /* + * The unaligned cmdbuf offset will cause an unaligned write + * during of the relocations patching, corrupting the commands + * stream. + */ + if (reloc->cmdbuf.offset & 3 || + reloc->cmdbuf.offset >= obj->gem.size) { + err = -EINVAL; + goto fail; + } + + obj = host1x_to_tegra_bo(reloc->target.bo); + + if (reloc->target.offset >= obj->gem.size) { + err = -EINVAL; + goto fail; + } }

if (copy_from_user(job->waitchk, waitchks, diff --git a/drivers/gpu/drm/tegra/gem.c b/drivers/gpu/drm/tegra/gem.c index ca0d4439e97b..6a855b2f07fb 100644 --- a/drivers/gpu/drm/tegra/gem.c +++ b/drivers/gpu/drm/tegra/gem.c @@ -20,11 +20,6 @@ #include "drm.h" #include "gem.h"

-static inline struct tegra_bo *host1x_to_tegra_bo(struct host1x_bo *bo) -{ - return container_of(bo, struct tegra_bo, base); -} - static void tegra_bo_put(struct host1x_bo *bo) { struct tegra_bo *obj = host1x_to_tegra_bo(bo); diff --git a/drivers/gpu/drm/tegra/gem.h b/drivers/gpu/drm/tegra/gem.h index 6c5f12ac0087..8b32a6fd586d 100644 --- a/drivers/gpu/drm/tegra/gem.h +++ b/drivers/gpu/drm/tegra/gem.h @@ -52,6 +52,11 @@ static inline struct tegra_bo *to_tegra_bo(struct drm_gem_object *gem) return container_of(gem, struct tegra_bo, gem); }

+static inline struct tegra_bo *host1x_to_tegra_bo(struct host1x_bo *bo) +{ + return container_of(bo, struct tegra_bo, base); +} + struct tegra_bo *tegra_bo_create(struct drm_device *drm, size_t size, unsigned long flags); struct tegra_bo *tegra_bo_create_with_handle(struct drm_file *file,

-- 2.13.0

Mikko Perttunen

31 May 31 May

6:39 p.m.

New subject: [PATCH 04/22] drm/tegra: Check for malformed offsets and sizes in the 'submit' IOCTL

Just some comments on comments (?!). With those fixed,

Reviewed-by: Mikko Perttunen mperttunen@nvidia.com

On 05/23/2017 03:14 AM, Dmitry Osipenko wrote:

...

If commands buffer claims a number of words that is higher than its BO can fit, a kernel OOPS will be fired on the out-of-bounds BO access. This was triggered by an opentegra Xorg driver that erroneously pushed too many commands to the pushbuf.

The CMDA commands buffer address is 4 bytes aligned, so check its alignment.

CDMA

...

The maximum number of the CDMA gather fetches is 16383, add a check for it.

Add a sanity check for the relocations in a same way.

[ 46.829393] Unable to handle kernel paging request at virtual address f09b2000 ... [<c04a3ba4>] (host1x_job_pin) from [<c04dfcd0>] (tegra_drm_submit+0x474/0x510) [<c04dfcd0>] (tegra_drm_submit) from [<c04deea0>] (tegra_submit+0x50/0x6c) [<c04deea0>] (tegra_submit) from [<c04c07c0>] (drm_ioctl+0x1e4/0x3ec) [<c04c07c0>] (drm_ioctl) from [<c02541a0>] (do_vfs_ioctl+0x9c/0x8e4) [<c02541a0>] (do_vfs_ioctl) from [<c0254a1c>] (SyS_ioctl+0x34/0x5c) [<c0254a1c>] (SyS_ioctl) from [<c0107640>] (ret_fast_syscall+0x0/0x3c)

Signed-off-by: Dmitry Osipenko digetx@gmail.com Reviewed-by: Erik Faye-Lund kusmabite@gmail.com

drivers/gpu/drm/tegra/drm.c | 48 +++++++++++++++++++++++++++++++++++++++++++++ drivers/gpu/drm/tegra/gem.c | 5 ----- drivers/gpu/drm/tegra/gem.h | 5 +++++ 3 files changed, 53 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/tegra/drm.c b/drivers/gpu/drm/tegra/drm.c index f9282da94a1f..7e4559ec824d 100644 --- a/drivers/gpu/drm/tegra/drm.c +++ b/drivers/gpu/drm/tegra/drm.c @@ -26,6 +26,7 @@ #define DRIVER_PATCHLEVEL 0

#define CARVEOUT_SZ SZ_64M +#define CDMA_GATHER_FETCHES_MAX_NB 16383

struct tegra_drm_file { struct idr contexts; @@ -383,18 +384,41 @@ int tegra_drm_submit(struct tegra_drm_context *context, while (num_cmdbufs) { struct drm_tegra_cmdbuf cmdbuf; struct host1x_bo *bo;
struct tegra_bo *obj;
u64 offset;
if (copy_from_user(&cmdbuf, cmdbufs, sizeof(cmdbuf))) { err = -EFAULT; goto fail; }
/*
 * The maximum number of CDMA gather fetches is 16383, a higher
 * value means the words count is malformed.
 */
if (cmdbuf.words > CDMA_GATHER_FETCHES_MAX_NB) {
	err = -EINVAL;
	goto fail;
}
bo = host1x_bo_lookup(file, cmdbuf.handle); if (!bo) { err = -ENOENT; goto fail; }
offset = (u64)cmdbuf.offset + (u64)cmdbuf.words * sizeof(u32);
obj = host1x_to_tegra_bo(bo);
/*
 * The CDMA base address if 4-bytes aligned, unaligned offset

"Gather buffer base address must be", perhaps, as cmdbufs aren't used directly as CDMA buffers.

...

 * is malformed.
 */
if (offset & 3 || offset >= obj->gem.size) {
	err = -EINVAL;
	goto fail;
}
host1x_job_add_gather(job, bo, cmdbuf.words, cmdbuf.offset); num_cmdbufs--; cmdbufs++;
@@ -402,11 +426,35 @@ int tegra_drm_submit(struct tegra_drm_context *context,

/* copy and resolve relocations from submit */ while (num_relocs--) {
struct host1x_reloc *reloc;
struct tegra_bo *obj;
err = host1x_reloc_copy_from_user(&job->relocarray[num_relocs], &relocs[num_relocs], drm, file); if (err < 0) goto fail;
reloc = &job->relocarray[num_relocs];
obj = host1x_to_tegra_bo(reloc->cmdbuf.bo);
/*
 * The unaligned cmdbuf offset will cause an unaligned write

"An unaligned ..."

...

 * during of the relocations patching, corrupting the commands
 * stream.
 */
if (reloc->cmdbuf.offset & 3 ||
    reloc->cmdbuf.offset >= obj->gem.size) {
	err = -EINVAL;
	goto fail;
}
obj = host1x_to_tegra_bo(reloc->target.bo);
if (reloc->target.offset >= obj->gem.size) {
	err = -EINVAL;
	goto fail;
}
}

if (copy_from_user(job->waitchk, waitchks,
diff --git a/drivers/gpu/drm/tegra/gem.c b/drivers/gpu/drm/tegra/gem.c index ca0d4439e97b..6a855b2f07fb 100644 --- a/drivers/gpu/drm/tegra/gem.c +++ b/drivers/gpu/drm/tegra/gem.c @@ -20,11 +20,6 @@ #include "drm.h" #include "gem.h"

-static inline struct tegra_bo *host1x_to_tegra_bo(struct host1x_bo *bo) -{

return container_of(bo, struct tegra_bo, base);

-}

static void tegra_bo_put(struct host1x_bo *bo) { struct tegra_bo *obj = host1x_to_tegra_bo(bo);

diff --git a/drivers/gpu/drm/tegra/gem.h b/drivers/gpu/drm/tegra/gem.h index 6c5f12ac0087..8b32a6fd586d 100644 --- a/drivers/gpu/drm/tegra/gem.h +++ b/drivers/gpu/drm/tegra/gem.h @@ -52,6 +52,11 @@ static inline struct tegra_bo *to_tegra_bo(struct drm_gem_object *gem) return container_of(gem, struct tegra_bo, gem); }

+static inline struct tegra_bo *host1x_to_tegra_bo(struct host1x_bo *bo) +{

return container_of(bo, struct tegra_bo, base);

+}

struct tegra_bo *tegra_bo_create(struct drm_device *drm, size_t size, unsigned long flags); struct tegra_bo *tegra_bo_create_with_handle(struct drm_file *file,

Dmitry Osipenko

23 May 23 May

12:14 a.m.

New subject: [PATCH 05/22] drm/tegra: Correct copying of waitchecks and disable them in the 'submit' IOCTL

The waitchecks along with multiple syncpoints per submit are not ready for use yet, let's forbid them for now.

Signed-off-by: Dmitry Osipenko digetx@gmail.com --- drivers/gpu/drm/tegra/drm.c | 60 ++++++++++++++++++++++++++++++++++++++++++--- drivers/gpu/host1x/job.h | 7 ------ include/linux/host1x.h | 7 ++++++ 3 files changed, 63 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/tegra/drm.c b/drivers/gpu/drm/tegra/drm.c index 7e4559ec824d..eae0c1512ab0 100644 --- a/drivers/gpu/drm/tegra/drm.c +++ b/drivers/gpu/drm/tegra/drm.c @@ -349,6 +349,36 @@ static int host1x_reloc_copy_from_user(struct host1x_reloc *dest, return 0; }

+static int host1x_waitchk_copy_from_user(struct host1x_waitchk *dest, + struct drm_tegra_waitchk __user *src, + struct drm_file *file) +{ + u32 cmdbuf; + int err; + + err = get_user(cmdbuf, &src->handle); + if (err < 0) + return err; + + err = get_user(dest->offset, &src->offset); + if (err < 0) + return err; + + err = get_user(dest->syncpt_id, &src->syncpt); + if (err < 0) + return err; + + err = get_user(dest->thresh, &src->thresh); + if (err < 0) + return err; + + dest->bo = host1x_bo_lookup(file, cmdbuf); + if (!dest->bo) + return -ENOENT; + + return 0; +} + int tegra_drm_submit(struct tegra_drm_context *context, struct drm_tegra_submit *args, struct drm_device *drm, struct drm_file *file) @@ -370,6 +400,10 @@ int tegra_drm_submit(struct tegra_drm_context *context, if (args->num_syncpts != 1) return -EINVAL;

+ /* We don't yet support waitchks */ + if (args->num_waitchks != 0) + return -EINVAL; + job = host1x_job_alloc(context->channel, args->num_cmdbufs, args->num_relocs, args->num_waitchks); if (!job) @@ -457,10 +491,28 @@ int tegra_drm_submit(struct tegra_drm_context *context, } }

- if (copy_from_user(job->waitchk, waitchks, - sizeof(*waitchks) * num_waitchks)) { - err = -EFAULT; - goto fail; + /* copy and resolve waitchks from submit */ + while (num_waitchks--) { + struct host1x_waitchk *wait = &job->waitchk[num_waitchks]; + struct tegra_bo *obj; + + err = host1x_waitchk_copy_from_user(wait, + &waitchks[num_waitchks], + file); + if (err < 0) + goto fail; + + obj = host1x_to_tegra_bo(wait->bo); + + /* + * The unaligned offset will cause an unaligned write during + * of the waitchks patching, corrupting the commands stream. + */ + if (wait->offset & 3 || + wait->offset >= obj->gem.size) { + err = -EINVAL; + goto fail; + } }

if (copy_from_user(&syncpt, (void __user *)(uintptr_t)args->syncpts, diff --git a/drivers/gpu/host1x/job.h b/drivers/gpu/host1x/job.h index 878239c476d2..0debd93a1849 100644 --- a/drivers/gpu/host1x/job.h +++ b/drivers/gpu/host1x/job.h @@ -34,13 +34,6 @@ struct host1x_cmdbuf { u32 pad; };

-struct host1x_waitchk { - struct host1x_bo *bo; - u32 offset; - u32 syncpt_id; - u32 thresh; -}; - struct host1x_job_unpin_data { struct host1x_bo *bo; struct sg_table *sgt; diff --git a/include/linux/host1x.h b/include/linux/host1x.h index 3d04aa1dc83e..aa323e43ae4e 100644 --- a/include/linux/host1x.h +++ b/include/linux/host1x.h @@ -177,6 +177,13 @@ struct host1x_reloc { unsigned long shift; };

+struct host1x_waitchk { + struct host1x_bo *bo; + u32 offset; + u32 syncpt_id; + u32 thresh; +}; + struct host1x_job { /* When refcount goes to zero, job can be freed */ struct kref ref;

-- 2.13.0

Mikko Perttunen

31 May 31 May

6:43 p.m.

New subject: [PATCH 05/22] drm/tegra: Correct copying of waitchecks and disable them in the 'submit' IOCTL

Reviewed-by: Mikko Perttunen mperttunen@nvidia.com

On 05/23/2017 03:14 AM, Dmitry Osipenko wrote:

...

The waitchecks along with multiple syncpoints per submit are not ready for use yet, let's forbid them for now.

Signed-off-by: Dmitry Osipenko digetx@gmail.com

drivers/gpu/drm/tegra/drm.c | 60 ++++++++++++++++++++++++++++++++++++++++++--- drivers/gpu/host1x/job.h | 7 ------ include/linux/host1x.h | 7 ++++++ 3 files changed, 63 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/tegra/drm.c b/drivers/gpu/drm/tegra/drm.c index 7e4559ec824d..eae0c1512ab0 100644 --- a/drivers/gpu/drm/tegra/drm.c +++ b/drivers/gpu/drm/tegra/drm.c @@ -349,6 +349,36 @@ static int host1x_reloc_copy_from_user(struct host1x_reloc *dest, return 0; }

+static int host1x_waitchk_copy_from_user(struct host1x_waitchk *dest,
			 struct drm_tegra_waitchk __user *src,
			 struct drm_file *file)
+{
u32 cmdbuf;

int err;

err = get_user(cmdbuf, &src->handle);

if (err < 0)
return err;
err = get_user(dest->offset, &src->offset);

if (err < 0)
return err;
err = get_user(dest->syncpt_id, &src->syncpt);

if (err < 0)
return err;
err = get_user(dest->thresh, &src->thresh);

if (err < 0)
return err;
dest->bo = host1x_bo_lookup(file, cmdbuf);

if (!dest->bo)
return -ENOENT;
return 0;
+}

int tegra_drm_submit(struct tegra_drm_context *context, struct drm_tegra_submit *args, struct drm_device *drm, struct drm_file *file)

@@ -370,6 +400,10 @@ int tegra_drm_submit(struct tegra_drm_context *context, if (args->num_syncpts != 1) return -EINVAL;
/* We don't yet support waitchks */

if (args->num_waitchks != 0)
return -EINVAL;
job = host1x_job_alloc(context->channel, args->num_cmdbufs, args->num_relocs, args->num_waitchks); if (!job)
@@ -457,10 +491,28 @@ int tegra_drm_submit(struct tegra_drm_context *context, } }
if (copy_from_user(job->waitchk, waitchks,
	   sizeof(*waitchks) * num_waitchks)) {
err = -EFAULT;
goto fail;
/* copy and resolve waitchks from submit */

while (num_waitchks--) {
struct host1x_waitchk *wait = &job->waitchk[num_waitchks];
struct tegra_bo *obj;
err = host1x_waitchk_copy_from_user(wait,
				    &waitchks[num_waitchks],
				    file);
if (err < 0)
	goto fail;
obj = host1x_to_tegra_bo(wait->bo);
/*
 * The unaligned offset will cause an unaligned write during
 * of the waitchks patching, corrupting the commands stream.
 */
if (wait->offset & 3 ||
    wait->offset >= obj->gem.size) {
	err = -EINVAL;
	goto fail;
}
}

if (copy_from_user(&syncpt, (void __user *)(uintptr_t)args->syncpts,
diff --git a/drivers/gpu/host1x/job.h b/drivers/gpu/host1x/job.h index 878239c476d2..0debd93a1849 100644 --- a/drivers/gpu/host1x/job.h +++ b/drivers/gpu/host1x/job.h @@ -34,13 +34,6 @@ struct host1x_cmdbuf { u32 pad; };

-struct host1x_waitchk {

struct host1x_bo *bo;

u32 offset;

u32 syncpt_id;

u32 thresh;

-};

struct host1x_job_unpin_data { struct host1x_bo *bo; struct sg_table *sgt;

diff --git a/include/linux/host1x.h b/include/linux/host1x.h index 3d04aa1dc83e..aa323e43ae4e 100644 --- a/include/linux/host1x.h +++ b/include/linux/host1x.h @@ -177,6 +177,13 @@ struct host1x_reloc { unsigned long shift; };

+struct host1x_waitchk {

struct host1x_bo *bo;

u32 offset;

u32 syncpt_id;

u32 thresh;

+};

struct host1x_job { /* When refcount goes to zero, job can be freed */ struct kref ref;

Dmitry Osipenko

23 May 23 May

12:14 a.m.

New subject: [PATCH 06/22] drm/tegra: Check syncpoint ID in the 'submit' IOCTL

In case of invalid syncpoint ID, the host1x_syncpt_get() returns NULL and none of its users perform a check of the returned pointer later. Let's bail out until it's too late.

Signed-off-by: Dmitry Osipenko digetx@gmail.com --- drivers/gpu/drm/tegra/drm.c | 9 +++++++++ 1 file changed, 9 insertions(+)

diff --git a/drivers/gpu/drm/tegra/drm.c b/drivers/gpu/drm/tegra/drm.c index eae0c1512ab0..cdb05d6efde4 100644 --- a/drivers/gpu/drm/tegra/drm.c +++ b/drivers/gpu/drm/tegra/drm.c @@ -393,6 +393,8 @@ int tegra_drm_submit(struct tegra_drm_context *context, struct drm_tegra_waitchk __user *waitchks = (void __user *)(uintptr_t)args->waitchks; struct drm_tegra_syncpt syncpt; + struct host1x *host1x = dev_get_drvdata(drm->dev->parent); + struct host1x_syncpt *sp; struct host1x_job *job; int err;

@@ -521,6 +523,13 @@ int tegra_drm_submit(struct tegra_drm_context *context, goto fail; }

+ /* check whether syncpoint ID is valid */ + sp = host1x_syncpt_get(host1x, syncpt.id); + if (!sp) { + err = -ENOENT; + goto fail; + } + job->is_addr_reg = context->client->ops->is_addr_reg; job->syncpt_incrs = syncpt.incrs; job->syncpt_id = syncpt.id;

-- 2.13.0

Mikko Perttunen

31 May 31 May

6:46 p.m.

New subject: [PATCH 06/22] drm/tegra: Check syncpoint ID in the 'submit' IOCTL

Reviewed-by: Mikko Perttunen mperttunen@nvidia.com

On 05/23/2017 03:14 AM, Dmitry Osipenko wrote:

...

In case of invalid syncpoint ID, the host1x_syncpt_get() returns NULL and none of its users perform a check of the returned pointer later. Let's bail out until it's too late.

Signed-off-by: Dmitry Osipenko digetx@gmail.com

drivers/gpu/drm/tegra/drm.c | 9 +++++++++ 1 file changed, 9 insertions(+)

diff --git a/drivers/gpu/drm/tegra/drm.c b/drivers/gpu/drm/tegra/drm.c index eae0c1512ab0..cdb05d6efde4 100644 --- a/drivers/gpu/drm/tegra/drm.c +++ b/drivers/gpu/drm/tegra/drm.c @@ -393,6 +393,8 @@ int tegra_drm_submit(struct tegra_drm_context *context, struct drm_tegra_waitchk __user *waitchks = (void __user *)(uintptr_t)args->waitchks; struct drm_tegra_syncpt syncpt;

struct host1x *host1x = dev_get_drvdata(drm->dev->parent);

struct host1x_syncpt *sp; struct host1x_job *job; int err;

@@ -521,6 +523,13 @@ int tegra_drm_submit(struct tegra_drm_context *context, goto fail; }
/* check whether syncpoint ID is valid */

sp = host1x_syncpt_get(host1x, syncpt.id);

if (!sp) {
err = -ENOENT;
goto fail;
}

job->is_addr_reg = context->client->ops->is_addr_reg; job->syncpt_incrs = syncpt.incrs; job->syncpt_id = syncpt.id;

Dmitry Osipenko

23 May 23 May

12:14 a.m.

New subject: [PATCH 07/22] drm/tegra: Remove module ownership from the tegra_fb_ops

The framebuffers console fbcon_startup() increments the tegra_drm module 'use' refcount via try_module_get(), causing an interlock of the DRM subsys and the tegra_drm modules. In result, the tegra_drm module can't be unloaded using rmmod.

Signed-off-by: Dmitry Osipenko digetx@gmail.com --- drivers/gpu/drm/tegra/fb.c | 1 - 1 file changed, 1 deletion(-)

diff --git a/drivers/gpu/drm/tegra/fb.c b/drivers/gpu/drm/tegra/fb.c index 25acb73ee728..75e102ffdc86 100644 --- a/drivers/gpu/drm/tegra/fb.c +++ b/drivers/gpu/drm/tegra/fb.c @@ -202,7 +202,6 @@ struct drm_framebuffer *tegra_fb_create(struct drm_device *drm,

#ifdef CONFIG_DRM_FBDEV_EMULATION static struct fb_ops tegra_fb_ops = { - .owner = THIS_MODULE, DRM_FB_HELPER_DEFAULT_OPS, .fb_fillrect = drm_fb_helper_sys_fillrect, .fb_copyarea = drm_fb_helper_sys_copyarea,

-- 2.13.0

Thierry Reding

13 Jun 13 Jun

1:43 p.m.

New subject: [PATCH 07/22] drm/tegra: Remove module ownership from the tegra_fb_ops

On Tue, May 23, 2017 at 03:14:22AM +0300, Dmitry Osipenko wrote:

...

The framebuffers console fbcon_startup() increments the tegra_drm module 'use' refcount via try_module_get(), causing an interlock of the DRM subsys and the tegra_drm modules. In result, the tegra_drm module can't be unloaded using rmmod.

Signed-off-by: Dmitry Osipenko digetx@gmail.com

drivers/gpu/drm/tegra/fb.c | 1 - 1 file changed, 1 deletion(-)

That's done on purpose because otherwise you could just rip out the driver from under the framebuffer emulation and things would crash.

My understanding is that the right way to unload a module is to unbind the driver first (which will cause the framebuffer to be removed and hence the reference to be dropped) before the rmmod.

Thierry

Dmitry Osipenko

2 p.m.

New subject: [PATCH 07/22] drm/tegra: Remove module ownership from the tegra_fb_ops

On 13.06.2017 16:43, Thierry Reding wrote:

...

On Tue, May 23, 2017 at 03:14:22AM +0300, Dmitry Osipenko wrote:

...
The framebuffers console fbcon_startup() increments the tegra_drm module 'use' refcount via try_module_get(), causing an interlock of the DRM subsys and the tegra_drm modules. In result, the tegra_drm module can't be unloaded using rmmod.

Signed-off-by: Dmitry Osipenko digetx@gmail.com

drivers/gpu/drm/tegra/fb.c | 1 - 1 file changed, 1 deletion(-)

That's done on purpose because otherwise you could just rip out the driver from under the framebuffer emulation and things would crash.

My understanding is that the right way to unload a module is to unbind the driver first (which will cause the framebuffer to be removed and hence the reference to be dropped) before the rmmod.

Thierry

Aha, interesting. I'll try the unbinding and will drop this patch from the series, thank you for the clarification. I haven't observed any crashes on a module reloading (framebuffer detached/attached just fine), maybe I was lucky then.

-- Dmitry

Thierry Reding

3:07 p.m.

New subject: [PATCH 07/22] drm/tegra: Remove module ownership from the tegra_fb_ops

On Tue, Jun 13, 2017 at 05:00:28PM +0300, Dmitry Osipenko wrote:

...

On 13.06.2017 16:43, Thierry Reding wrote:

...
On Tue, May 23, 2017 at 03:14:22AM +0300, Dmitry Osipenko wrote:

...
The framebuffers console fbcon_startup() increments the tegra_drm module 'use' refcount via try_module_get(), causing an interlock of the DRM subsys and the tegra_drm modules. In result, the tegra_drm module can't be unloaded using rmmod.

Signed-off-by: Dmitry Osipenko digetx@gmail.com

drivers/gpu/drm/tegra/fb.c | 1 - 1 file changed, 1 deletion(-)

That's done on purpose because otherwise you could just rip out the driver from under the framebuffer emulation and things would crash.

My understanding is that the right way to unload a module is to unbind the driver first (which will cause the framebuffer to be removed and hence the reference to be dropped) before the rmmod.

Thierry

Aha, interesting. I'll try the unbinding and will drop this patch from the series, thank you for the clarification. I haven't observed any crashes on a module reloading (framebuffer detached/attached just fine), maybe I was lucky then.

It's possible that it works by accident. The driver removes the framebuffer as part of the driver removal process, so technically nothing should crash. However, if, for any reason, anyone was holding on to a reference to the framebuffer (not sure if that is even possible) the module needs to stay around long enough as well, otherwise the function pointers would become dangling. The module reference makes sure that this doesn't happen (as long as a framebuffer exists, the ops will stick around).

So even if there isn't a way to make it crash today, the code is still correct in taking the reference.

Thierry

Dmitry Osipenko

5:39 p.m.

New subject: [PATCH 07/22] drm/tegra: Remove module ownership from the tegra_fb_ops

On 13.06.2017 18:07, Thierry Reding wrote:

...

On Tue, Jun 13, 2017 at 05:00:28PM +0300, Dmitry Osipenko wrote:

...
On 13.06.2017 16:43, Thierry Reding wrote:

...
On Tue, May 23, 2017 at 03:14:22AM +0300, Dmitry Osipenko wrote:

...
The framebuffers console fbcon_startup() increments the tegra_drm module 'use' refcount via try_module_get(), causing an interlock of the DRM subsys and the tegra_drm modules. In result, the tegra_drm module can't be unloaded using rmmod.

Signed-off-by: Dmitry Osipenko digetx@gmail.com

drivers/gpu/drm/tegra/fb.c | 1 - 1 file changed, 1 deletion(-)

That's done on purpose because otherwise you could just rip out the driver from under the framebuffer emulation and things would crash.

My understanding is that the right way to unload a module is to unbind the driver first (which will cause the framebuffer to be removed and hence the reference to be dropped) before the rmmod.

Thierry

Aha, interesting. I'll try the unbinding and will drop this patch from the series, thank you for the clarification. I haven't observed any crashes on a module reloading (framebuffer detached/attached just fine), maybe I was lucky then.

It's possible that it works by accident. The driver removes the framebuffer as part of the driver removal process, so technically nothing should crash. However, if, for any reason, anyone was holding on to a reference to the framebuffer (not sure if that is even possible) the module needs to stay around long enough as well, otherwise the function pointers would become dangling. The module reference makes sure that this doesn't happen (as long as a framebuffer exists, the ops will stick around).

So even if there isn't a way to make it crash today, the code is still correct in taking the reference.

Thank you very much for a more detailed explanation. I've successfully unloaded the tegra_drm module after unbinding the VT's, the patch is dropped from the series.

echo 0 | tee /sys/devices/virtual/vtconsole/vtcon*/bind

-- Dmitry

Dmitry Osipenko

23 May 23 May

12:14 a.m.

New subject: [PATCH 08/22] drm/tegra: dc: Drop the reset asserts to workaround a bug

Commit 33a8eb8 ("Implement runtime PM") introduced HW reset control. It causes a hang on Tegra20 if both display controllers are utilized (RGB panel and HDMI). The TRM suggests that each display controller has its own reset control, apparently it is not correct. Let's remove the interaction with the resets for now as a workaround.

Fixes: 33a8eb8d40ee ("drm/tegra: dc: Implement runtime PM") Signed-off-by: Dmitry Osipenko digetx@gmail.com --- drivers/gpu/drm/tegra/dc.c | 15 --------------- 1 file changed, 15 deletions(-)

diff --git a/drivers/gpu/drm/tegra/dc.c b/drivers/gpu/drm/tegra/dc.c index 95b373f739f2..5c9b93981af2 100644 --- a/drivers/gpu/drm/tegra/dc.c +++ b/drivers/gpu/drm/tegra/dc.c @@ -1989,8 +1989,6 @@ static int tegra_dc_probe(struct platform_device *pdev) return PTR_ERR(dc->rst); }

- reset_control_assert(dc->rst); - if (dc->soc->has_powergate) { if (dc->pipe == 0) dc->powergate = TEGRA_POWERGATE_DIS; @@ -2061,13 +2059,6 @@ static int tegra_dc_remove(struct platform_device *pdev) static int tegra_dc_suspend(struct device *dev) { struct tegra_dc *dc = dev_get_drvdata(dev); - int err; - - err = reset_control_assert(dc->rst); - if (err < 0) { - dev_err(dev, "failed to assert reset: %d\n", err); - return err; - }

if (dc->soc->has_powergate) tegra_powergate_power_off(dc->powergate); @@ -2095,12 +2086,6 @@ static int tegra_dc_resume(struct device *dev) dev_err(dev, "failed to enable clock: %d\n", err); return err; } - - err = reset_control_deassert(dc->rst); - if (err < 0) { - dev_err(dev, "failed to deassert reset: %d\n", err); - return err; - } }

return 0;

-- 2.13.0

Thierry Reding

13 Jun 13 Jun

1:45 p.m.

New subject: [PATCH 08/22] drm/tegra: dc: Drop the reset asserts to workaround a bug

On Tue, May 23, 2017 at 03:14:23AM +0300, Dmitry Osipenko wrote:

...

Commit 33a8eb8 ("Implement runtime PM") introduced HW reset control. It causes a hang on Tegra20 if both display controllers are utilized (RGB panel and HDMI). The TRM suggests that each display controller has its own reset control, apparently it is not correct. Let's remove the interaction with the resets for now as a workaround.

Fixes: 33a8eb8d40ee ("drm/tegra: dc: Implement runtime PM") Signed-off-by: Dmitry Osipenko digetx@gmail.com

drivers/gpu/drm/tegra/dc.c | 15 --------------- 1 file changed, 15 deletions(-)

Do you mind if I parameterize this to restrict omission of the assert and deassert to Tegra20? I'm fairly sure that these resets are important on later chips in order to properly reset the display controllers.

My proposal would be to add a field to struct tegra_dc_soc_info (maybe .broken_reset) that will be set only on Tegra20 to still allow the DC to be reset on later generations.

Thierry

Dmitry Osipenko

2:18 p.m.

New subject: [PATCH 08/22] drm/tegra: dc: Drop the reset asserts to workaround a bug

On 13.06.2017 16:45, Thierry Reding wrote:

...

On Tue, May 23, 2017 at 03:14:23AM +0300, Dmitry Osipenko wrote:

...
Commit 33a8eb8 ("Implement runtime PM") introduced HW reset control. It causes a hang on Tegra20 if both display controllers are utilized (RGB panel and HDMI). The TRM suggests that each display controller has its own reset control, apparently it is not correct. Let's remove the interaction with the resets for now as a workaround.

Fixes: 33a8eb8d40ee ("drm/tegra: dc: Implement runtime PM") Signed-off-by: Dmitry Osipenko digetx@gmail.com

drivers/gpu/drm/tegra/dc.c | 15 --------------- 1 file changed, 15 deletions(-)

Do you mind if I parameterize this to restrict omission of the assert and deassert to Tegra20? I'm fairly sure that these resets are important on later chips in order to properly reset the display controllers.

I don't mind at all, I can parameterize it in the next rev myself if you wish.

...

My proposal would be to add a field to struct tegra_dc_soc_info (maybe .broken_reset) that will be set only on Tegra20 to still allow the DC to be reset on later generations.

Sounds good.

-- Dmitry

Thierry Reding

3:07 p.m.

New subject: [PATCH 08/22] drm/tegra: dc: Drop the reset asserts to workaround a bug

On Tue, Jun 13, 2017 at 05:18:37PM +0300, Dmitry Osipenko wrote:

...

On 13.06.2017 16:45, Thierry Reding wrote:

...
On Tue, May 23, 2017 at 03:14:23AM +0300, Dmitry Osipenko wrote:

...
Commit 33a8eb8 ("Implement runtime PM") introduced HW reset control. It causes a hang on Tegra20 if both display controllers are utilized (RGB panel and HDMI). The TRM suggests that each display controller has its own reset control, apparently it is not correct. Let's remove the interaction with the resets for now as a workaround.

Fixes: 33a8eb8d40ee ("drm/tegra: dc: Implement runtime PM") Signed-off-by: Dmitry Osipenko digetx@gmail.com

drivers/gpu/drm/tegra/dc.c | 15 --------------- 1 file changed, 15 deletions(-)

Do you mind if I parameterize this to restrict omission of the assert and deassert to Tegra20? I'm fairly sure that these resets are important on later chips in order to properly reset the display controllers.

I don't mind at all, I can parameterize it in the next rev myself if you wish.

Sure, why not.

Thierry

Dmitry Osipenko

23 May 23 May

12:14 a.m.

New subject: [PATCH 09/22] drm/tegra: dc: Apply clipping to the plane

On Tegra20 an overlay plane should be clipped, otherwise its output is distorted once plane crosses display boundary.

Signed-off-by: Dmitry Osipenko digetx@gmail.com --- drivers/gpu/drm/tegra/dc.c | 29 +++++++++++++++++++++-------- 1 file changed, 21 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/tegra/dc.c b/drivers/gpu/drm/tegra/dc.c index 5c9b93981af2..2e01a2dfc5cd 100644 --- a/drivers/gpu/drm/tegra/dc.c +++ b/drivers/gpu/drm/tegra/dc.c @@ -485,12 +485,25 @@ static int tegra_plane_state_add(struct tegra_plane *plane, { struct drm_crtc_state *crtc_state; struct tegra_dc_state *tegra; + struct drm_rect clip; + int err;

/* Propagate errors from allocation or locking failures. */ crtc_state = drm_atomic_get_crtc_state(state->state, state->crtc); if (IS_ERR(crtc_state)) return PTR_ERR(crtc_state);

+ clip.x1 = 0; + clip.y1 = 0; + clip.x2 = crtc_state->mode.hdisplay; + clip.y2 = crtc_state->mode.vdisplay; + + /* Check plane state for visibility and calculate clipping bounds */ + err = drm_plane_helper_check_state(state, &clip, 0, INT_MAX, + true, true); + if (err < 0) + return err; + tegra = to_dc_state(crtc_state);

tegra->planes |= WIN_A_ACT_REQ << plane->index; @@ -560,14 +573,14 @@ static void tegra_plane_atomic_update(struct drm_plane *plane, return;

memset(&window, 0, sizeof(window)); - window.src.x = plane->state->src_x >> 16; - window.src.y = plane->state->src_y >> 16; - window.src.w = plane->state->src_w >> 16; - window.src.h = plane->state->src_h >> 16; - window.dst.x = plane->state->crtc_x; - window.dst.y = plane->state->crtc_y; - window.dst.w = plane->state->crtc_w; - window.dst.h = plane->state->crtc_h; + window.src.x = plane->state->src.x1 >> 16; + window.src.y = plane->state->src.y1 >> 16; + window.src.w = drm_rect_width(&plane->state->src) >> 16; + window.src.h = drm_rect_height(&plane->state->src) >> 16; + window.dst.x = plane->state->dst.x1; + window.dst.y = plane->state->dst.y1; + window.dst.w = drm_rect_width(&plane->state->dst); + window.dst.h = drm_rect_height(&plane->state->dst); window.bits_per_pixel = fb->format->cpp[0] * 8; window.bottom_up = tegra_fb_is_bottom_up(fb);

-- 2.13.0

Dmitry Osipenko

12:14 a.m.

New subject: [PATCH 10/22] drm/tegra: Disable plane if it is invisible

On Tegra20 if plane has width or height equal to 0, it will be infinitely wide or tall. Let's disable the plane if it is invisible on atomic state committing to fix the issue. The Rockchip DRM driver does the same.

Signed-off-by: Dmitry Osipenko digetx@gmail.com --- drivers/gpu/drm/tegra/dc.c | 33 +++++++++++++++++++++------------ 1 file changed, 21 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/tegra/dc.c b/drivers/gpu/drm/tegra/dc.c index 2e01a2dfc5cd..1f9d8d43bec3 100644 --- a/drivers/gpu/drm/tegra/dc.c +++ b/drivers/gpu/drm/tegra/dc.c @@ -558,6 +558,23 @@ static int tegra_plane_atomic_check(struct drm_plane *plane, return 0; }

+static void tegra_dc_disable_window(struct tegra_dc *dc, int index) +{ + unsigned long flags; + u32 value; + + spin_lock_irqsave(&dc->lock, flags); + + value = WINDOW_A_SELECT << index; + tegra_dc_writel(dc, value, DC_CMD_DISPLAY_WINDOW_HEADER); + + value = tegra_dc_readl(dc, DC_WIN_WIN_OPTIONS); + value &= ~WIN_ENABLE; + tegra_dc_writel(dc, value, DC_WIN_WIN_OPTIONS); + + spin_unlock_irqrestore(&dc->lock, flags); +} + static void tegra_plane_atomic_update(struct drm_plane *plane, struct drm_plane_state *old_state) { @@ -572,6 +589,9 @@ static void tegra_plane_atomic_update(struct drm_plane *plane, if (!plane->state->crtc || !plane->state->fb) return;

+ if (!plane->state->visible) + return tegra_dc_disable_window(dc, p->index); + memset(&window, 0, sizeof(window)); window.src.x = plane->state->src.x1 >> 16; window.src.y = plane->state->src.y1 >> 16; @@ -611,8 +631,6 @@ static void tegra_plane_atomic_disable(struct drm_plane *plane, { struct tegra_plane *p = to_tegra_plane(plane); struct tegra_dc *dc; - unsigned long flags; - u32 value;

/* rien ne va plus */ if (!old_state || !old_state->crtc) @@ -620,16 +638,7 @@ static void tegra_plane_atomic_disable(struct drm_plane *plane,

dc = to_tegra_dc(old_state->crtc);

- spin_lock_irqsave(&dc->lock, flags); - - value = WINDOW_A_SELECT << p->index; - tegra_dc_writel(dc, value, DC_CMD_DISPLAY_WINDOW_HEADER); - - value = tegra_dc_readl(dc, DC_WIN_WIN_OPTIONS); - value &= ~WIN_ENABLE; - tegra_dc_writel(dc, value, DC_WIN_WIN_OPTIONS); - - spin_unlock_irqrestore(&dc->lock, flags); + tegra_dc_disable_window(dc, p->index); }

static const struct drm_plane_helper_funcs tegra_primary_plane_helper_funcs = {

-- 2.13.0

Dmitry Osipenko

12:14 a.m.

New subject: [PATCH 11/22] gpu: host1x: Initialize firewall class to the jobs one

The commands stream is prepended by the jobs class on the CDMA submission, so that explicitly setting a module class in the commands stream isn't necessary. The firewall initializes its class to 0 and the command stream that doesn't explicitly specify the class effectively bypasses the firewall.

Signed-off-by: Dmitry Osipenko digetx@gmail.com Reviewed-by: Mikko Perttunen mperttunen@nvidia.com --- drivers/gpu/host1x/job.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/host1x/job.c b/drivers/gpu/host1x/job.c index 5f5f8ee6143d..d9933828fe87 100644 --- a/drivers/gpu/host1x/job.c +++ b/drivers/gpu/host1x/job.c @@ -504,7 +504,7 @@ static inline int copy_gathers(struct host1x_job *job, struct device *dev) fw.dev = dev; fw.reloc = job->relocarray; fw.num_relocs = job->num_relocs; - fw.class = 0; + fw.class = job->class;

for (i = 0; i < job->num_gathers; i++) { struct host1x_job_gather *g = &job->gathers[i];

-- 2.13.0

Dmitry Osipenko

12:14 a.m.

New subject: [PATCH 12/22] gpu: host1x: Correct host1x_job_pin() error handling

In case of relocations / waitchecks patching failure the jobs pins stay referenced till DRM file get closed, wasting memory. Add the missed unpinning.

Signed-off-by: Dmitry Osipenko digetx@gmail.com --- drivers/gpu/host1x/job.c | 16 +++++++--------- 1 file changed, 7 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/host1x/job.c b/drivers/gpu/host1x/job.c index d9933828fe87..14f3f957ffab 100644 --- a/drivers/gpu/host1x/job.c +++ b/drivers/gpu/host1x/job.c @@ -592,22 +592,20 @@ int host1x_job_pin(struct host1x_job *job, struct device *dev)

err = do_relocs(job, g->bo); if (err) - break; + goto out;

err = do_waitchks(job, host, g->bo); if (err) - break; + goto out; }

- if (IS_ENABLED(CONFIG_TEGRA_HOST1X_FIREWALL) && !err) { - err = copy_gathers(job, dev); - if (err) { - host1x_job_unpin(job); - return err; - } - } + if (!IS_ENABLED(CONFIG_TEGRA_HOST1X_FIREWALL)) + goto out;

+ err = copy_gathers(job, dev); out: + if (err) + host1x_job_unpin(job); wmb();

return err;

-- 2.13.0

Erik Faye-Lund

12:32 a.m.

New subject: [PATCH 12/22] gpu: host1x: Correct host1x_job_pin() error handling

On Tue, May 23, 2017 at 2:14 AM, Dmitry Osipenko digetx@gmail.com wrote:

...

In case of relocations / waitchecks patching failure the jobs pins stay referenced till DRM file get closed, wasting memory. Add the missed unpinning.

Signed-off-by: Dmitry Osipenko digetx@gmail.com

drivers/gpu/host1x/job.c | 16 +++++++--------- 1 file changed, 7 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/host1x/job.c b/drivers/gpu/host1x/job.c index d9933828fe87..14f3f957ffab 100644 --- a/drivers/gpu/host1x/job.c +++ b/drivers/gpu/host1x/job.c @@ -592,22 +592,20 @@ int host1x_job_pin(struct host1x_job *job, struct device *dev)
            err = do_relocs(job, g->bo);
            if (err)
                  break;
                  goto out;

          err = do_waitchks(job, host, g->bo);
          if (err)
                  break;
                  goto out;
  }
  if (IS_ENABLED(CONFIG_TEGRA_HOST1X_FIREWALL) && !err) {
          err = copy_gathers(job, dev);
          if (err) {
                  host1x_job_unpin(job);
                  return err;
          }
  }
  if (!IS_ENABLED(CONFIG_TEGRA_HOST1X_FIREWALL))
          goto out;
  err = copy_gathers(job, dev);
out:
  if (err)
          host1x_job_unpin(job);
  wmb();

  return err;
-- 2.13.0

One subtle undocumented change here, is that wmb() now gets called in the copy_gathers()-error case (just like it already does for the other error-cases). That's seems like an OK change, so:

Reviewed-by: Erik Faye-Lund kusmabite@gmail.com

Mikko Perttunen

31 May 31 May

6:50 p.m.

New subject: [PATCH 12/22] gpu: host1x: Correct host1x_job_pin() error handling

Reviewed-by: Mikko Perttunen mperttunen@nvidia.com

On 05/23/2017 03:14 AM, Dmitry Osipenko wrote:

...

In case of relocations / waitchecks patching failure the jobs pins stay referenced till DRM file get closed, wasting memory. Add the missed unpinning.

Signed-off-by: Dmitry Osipenko digetx@gmail.com

drivers/gpu/host1x/job.c | 16 +++++++--------- 1 file changed, 7 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/host1x/job.c b/drivers/gpu/host1x/job.c index d9933828fe87..14f3f957ffab 100644 --- a/drivers/gpu/host1x/job.c +++ b/drivers/gpu/host1x/job.c @@ -592,22 +592,20 @@ int host1x_job_pin(struct host1x_job *job, struct device *dev)
err = do_relocs(job, g->bo);
if (err)
	break;
	goto out;
err = do_waitchks(job, host, g->bo); if (err)
	break;
	goto out;
}
if (IS_ENABLED(CONFIG_TEGRA_HOST1X_FIREWALL) && !err) {
err = copy_gathers(job, dev);
if (err) {
	host1x_job_unpin(job);
	return err;
}
}
if (!IS_ENABLED(CONFIG_TEGRA_HOST1X_FIREWALL))
goto out;
err = copy_gathers(job, dev); out:

if (err)
host1x_job_unpin(job);
wmb();

return err;

Dmitry Osipenko

23 May 23 May

12:14 a.m.

New subject: [PATCH 13/22] gpu: host1x: Do not leak BO's phys address to userspace

Do gathers coping before patching them, so the original gathers are left untouched. That's not as bad as leaking a kernel addresses, but still doesn't feel right.

Signed-off-by: Dmitry Osipenko digetx@gmail.com --- drivers/gpu/host1x/job.c | 44 ++++++++++++++++++++++++++++++-------------- 1 file changed, 30 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/host1x/job.c b/drivers/gpu/host1x/job.c index 14f3f957ffab..190353944d23 100644 --- a/drivers/gpu/host1x/job.c +++ b/drivers/gpu/host1x/job.c @@ -137,8 +137,9 @@ static void host1x_syncpt_patch_offset(struct host1x_syncpt *sp, * avoid a wrap condition in the HW). */ static int do_waitchks(struct host1x_job *job, struct host1x *host, - struct host1x_bo *patch) + struct host1x_job_gather *g) { + struct host1x_bo *patch = g->bo; int i;

/* compare syncpt vs wait threshold */ @@ -165,7 +166,8 @@ static int do_waitchks(struct host1x_job *job, struct host1x *host, wait->syncpt_id, sp->name, wait->thresh, host1x_syncpt_read_min(sp));

- host1x_syncpt_patch_offset(sp, patch, wait->offset); + host1x_syncpt_patch_offset(sp, patch, + g->offset + wait->offset); }

wait->bo = NULL; @@ -269,11 +271,12 @@ static unsigned int pin_job(struct host1x *host, struct host1x_job *job) return err; }

-static int do_relocs(struct host1x_job *job, struct host1x_bo *cmdbuf) +static int do_relocs(struct host1x_job *job, struct host1x_job_gather *g) { int i = 0; u32 last_page = ~0; void *cmdbuf_page_addr = NULL; + struct host1x_bo *cmdbuf = g->bo;

/* pin & patch the relocs for one gather */ for (i = 0; i < job->num_relocs; i++) { @@ -286,7 +289,8 @@ static int do_relocs(struct host1x_job *job, struct host1x_bo *cmdbuf) if (cmdbuf != reloc->cmdbuf.bo) continue;

- if (last_page != reloc->cmdbuf.offset >> PAGE_SHIFT) { + if (!IS_ENABLED(CONFIG_TEGRA_HOST1X_FIREWALL) && + last_page != reloc->cmdbuf.offset >> PAGE_SHIFT) { if (cmdbuf_page_addr) host1x_bo_kunmap(cmdbuf, last_page, cmdbuf_page_addr); @@ -301,11 +305,20 @@ static int do_relocs(struct host1x_job *job, struct host1x_bo *cmdbuf) } }

+ if (IS_ENABLED(CONFIG_TEGRA_HOST1X_FIREWALL)) { + cmdbuf_page_addr = job->gather_copy_mapped; + cmdbuf_page_addr += g->offset; + } + target = cmdbuf_page_addr + (reloc->cmdbuf.offset & ~PAGE_MASK); + + if (IS_ENABLED(CONFIG_TEGRA_HOST1X_FIREWALL)) + target += (reloc->cmdbuf.offset & PAGE_MASK) >> 2; + *target = reloc_addr; }

- if (cmdbuf_page_addr) + if (!IS_ENABLED(CONFIG_TEGRA_HOST1X_FIREWALL) && cmdbuf_page_addr) host1x_bo_kunmap(cmdbuf, last_page, cmdbuf_page_addr);

return 0; @@ -573,6 +586,12 @@ int host1x_job_pin(struct host1x_job *job, struct device *dev) if (err) goto out;

+ if (IS_ENABLED(CONFIG_TEGRA_HOST1X_FIREWALL)) { + err = copy_gathers(job, dev); + if (err) + goto out; + } + /* patch gathers */ for (i = 0; i < job->num_gathers; i++) { struct host1x_job_gather *g = &job->gathers[i]; @@ -581,7 +600,8 @@ int host1x_job_pin(struct host1x_job *job, struct device *dev) if (g->handled) continue;

- g->base = job->gather_addr_phys[i]; + if (!IS_ENABLED(CONFIG_TEGRA_HOST1X_FIREWALL)) + g->base = job->gather_addr_phys[i];

for (j = i + 1; j < job->num_gathers; j++) { if (job->gathers[j].bo == g->bo) { @@ -590,19 +610,15 @@ int host1x_job_pin(struct host1x_job *job, struct device *dev) } }

- err = do_relocs(job, g->bo); + err = do_relocs(job, g); if (err) - goto out; + break;

- err = do_waitchks(job, host, g->bo); + err = do_waitchks(job, host, g); if (err) - goto out; + break; }

- if (!IS_ENABLED(CONFIG_TEGRA_HOST1X_FIREWALL)) - goto out; - - err = copy_gathers(job, dev); out: if (err) host1x_job_unpin(job);

-- 2.13.0

Mikko Perttunen

13 Jun 13 Jun

5:31 p.m.

New subject: [PATCH 13/22] gpu: host1x: Do not leak BO's phys address to userspace

On 05/23/2017 03:14 AM, Dmitry Osipenko wrote:

...

Do gathers coping before patching them, so the original gathers are left untouched. That's not as bad as leaking a kernel addresses, but still doesn't feel right.

Signed-off-by: Dmitry Osipenko digetx@gmail.com

drivers/gpu/host1x/job.c | 44 ++++++++++++++++++++++++++++++-------------- 1 file changed, 30 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/host1x/job.c b/drivers/gpu/host1x/job.c index 14f3f957ffab..190353944d23 100644 --- a/drivers/gpu/host1x/job.c +++ b/drivers/gpu/host1x/job.c @@ -137,8 +137,9 @@ static void host1x_syncpt_patch_offset(struct host1x_syncpt *sp,

avoid a wrap condition in the HW).

*/ static int do_waitchks(struct host1x_job *job, struct host1x *host,
       struct host1x_bo *patch)
       struct host1x_job_gather *g)
{
struct host1x_bo *patch = g->bo; int i;

/* compare syncpt vs wait threshold */
@@ -165,7 +166,8 @@ static int do_waitchks(struct host1x_job *job, struct host1x *host, wait->syncpt_id, sp->name, wait->thresh, host1x_syncpt_read_min(sp));
	host1x_syncpt_patch_offset(sp, patch, wait->offset);
	host1x_syncpt_patch_offset(sp, patch,
				   g->offset + wait->offset);
}

wait->bo = NULL;
@@ -269,11 +271,12 @@ static unsigned int pin_job(struct host1x *host, struct host1x_job *job) return err; }

-static int do_relocs(struct host1x_job *job, struct host1x_bo *cmdbuf) +static int do_relocs(struct host1x_job *job, struct host1x_job_gather *g) { int i = 0; u32 last_page = ~0; void *cmdbuf_page_addr = NULL;

struct host1x_bo *cmdbuf = g->bo;

/* pin & patch the relocs for one gather */ for (i = 0; i < job->num_relocs; i++) {

@@ -286,7 +289,8 @@ static int do_relocs(struct host1x_job *job, struct host1x_bo *cmdbuf) if (cmdbuf != reloc->cmdbuf.bo) continue;
if (last_page != reloc->cmdbuf.offset >> PAGE_SHIFT) {
if (!IS_ENABLED(CONFIG_TEGRA_HOST1X_FIREWALL) &&
    last_page != reloc->cmdbuf.offset >> PAGE_SHIFT) {
if (cmdbuf_page_addr)
	host1x_bo_kunmap(cmdbuf, last_page,
			 cmdbuf_page_addr);
@@ -301,11 +305,20 @@ static int do_relocs(struct host1x_job *job, struct host1x_bo *cmdbuf) } }
if (IS_ENABLED(CONFIG_TEGRA_HOST1X_FIREWALL)) {
	cmdbuf_page_addr = job->gather_copy_mapped;
	cmdbuf_page_addr += g->offset;
}
target = cmdbuf_page_addr + (reloc->cmdbuf.offset & ~PAGE_MASK);
if (IS_ENABLED(CONFIG_TEGRA_HOST1X_FIREWALL))
	target += (reloc->cmdbuf.offset & PAGE_MASK) >> 2;
*target = reloc_addr;

I think this logic would be cleaner like ..

if (IS_ENABLED(CONFIG_TEGRA_HOST1X_FIREWALL)) { target = job->gather_copy_mapped + g->offset + reloc->cmdbuf.offset; ... } else { if (last_page != reloc->cmdbuf.offset >> PAGE_SHIFT) { ... } target = cmdbuf_page_addr + (reloc->cmdbuf.offset & ~PAGE_MASK); } *target = reloc_addr

...

}

if (cmdbuf_page_addr)

if (!IS_ENABLED(CONFIG_TEGRA_HOST1X_FIREWALL) && cmdbuf_page_addr) host1x_bo_kunmap(cmdbuf, last_page, cmdbuf_page_addr);

return 0;

@@ -573,6 +586,12 @@ int host1x_job_pin(struct host1x_job *job, struct device *dev) if (err) goto out;
if (IS_ENABLED(CONFIG_TEGRA_HOST1X_FIREWALL)) {
err = copy_gathers(job, dev);
if (err)
	goto out;
}

/* patch gathers */ for (i = 0; i < job->num_gathers; i++) { struct host1x_job_gather *g = &job->gathers[i];
@@ -581,7 +600,8 @@ int host1x_job_pin(struct host1x_job *job, struct device *dev) if (g->handled) continue;
g->base = job->gather_addr_phys[i];
if (!IS_ENABLED(CONFIG_TEGRA_HOST1X_FIREWALL))
	g->base = job->gather_addr_phys[i];

Perhaps add a comment here like "copy_gathers sets base when firewall is enabled"

...

for (j = i + 1; j < job->num_gathers; j++) {
	if (job->gathers[j].bo == g->bo) {
@@ -590,19 +610,15 @@ int host1x_job_pin(struct host1x_job *job, struct device *dev) } }
err = do_relocs(job, g->bo);
err = do_relocs(job, g);
if (err)
	goto out;
	break;
err = do_waitchks(job, host, g->bo);
err = do_waitchks(job, host, g);
if (err)
	goto out;
	break;
}
if (!IS_ENABLED(CONFIG_TEGRA_HOST1X_FIREWALL))
goto out;
err = copy_gathers(job, dev); out: if (err) host1x_job_unpin(job);

Dmitry Osipenko

6:21 p.m.

New subject: [PATCH 13/22] gpu: host1x: Do not leak BO's phys address to userspace

On 13.06.2017 20:31, Mikko Perttunen wrote:

...

On 05/23/2017 03:14 AM, Dmitry Osipenko wrote:

...
Do gathers coping before patching them, so the original gathers are left untouched. That's not as bad as leaking a kernel addresses, but still doesn't feel right.

Signed-off-by: Dmitry Osipenko digetx@gmail.com

drivers/gpu/host1x/job.c | 44 ++++++++++++++++++++++++++++++-------------- 1 file changed, 30 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/host1x/job.c b/drivers/gpu/host1x/job.c index 14f3f957ffab..190353944d23 100644 --- a/drivers/gpu/host1x/job.c +++ b/drivers/gpu/host1x/job.c @@ -137,8 +137,9 @@ static void host1x_syncpt_patch_offset(struct host1x_syncpt *sp,

avoid a wrap condition in the HW).

*/ static int do_waitchks(struct host1x_job *job, struct host1x *host,
          struct host1x_bo *patch)
          struct host1x_job_gather *g)
{
struct host1x_bo *patch = g->bo; int i; /* compare syncpt vs wait threshold */
@@ -165,7 +166,8 @@ static int do_waitchks(struct host1x_job *job, struct host1x *host, wait->syncpt_id, sp->name, wait->thresh, host1x_syncpt_read_min(sp));
       host1x_syncpt_patch_offset(sp, patch, wait->offset);
       host1x_syncpt_patch_offset(sp, patch,
                      g->offset + wait->offset);
    }
      wait->bo = NULL;
@@ -269,11 +271,12 @@ static unsigned int pin_job(struct host1x *host, struct host1x_job *job) return err; } -static int do_relocs(struct host1x_job *job, struct host1x_bo *cmdbuf) +static int do_relocs(struct host1x_job *job, struct host1x_job_gather *g) { int i = 0; u32 last_page = ~0; void *cmdbuf_page_addr = NULL;

struct host1x_bo *cmdbuf = g->bo; /* pin & patch the relocs for one gather */ for (i = 0; i < job->num_relocs; i++) {

@@ -286,7 +289,8 @@ static int do_relocs(struct host1x_job *job, struct host1x_bo *cmdbuf) if (cmdbuf != reloc->cmdbuf.bo) continue;
   if (last_page != reloc->cmdbuf.offset >> PAGE_SHIFT) {
   if (!IS_ENABLED(CONFIG_TEGRA_HOST1X_FIREWALL) &&
       last_page != reloc->cmdbuf.offset >> PAGE_SHIFT) {
        if (cmdbuf_page_addr)
            host1x_bo_kunmap(cmdbuf, last_page,
                     cmdbuf_page_addr);
@@ -301,11 +305,20 @@ static int do_relocs(struct host1x_job *job, struct host1x_bo *cmdbuf) } }
   if (IS_ENABLED(CONFIG_TEGRA_HOST1X_FIREWALL)) {
       cmdbuf_page_addr = job->gather_copy_mapped;
       cmdbuf_page_addr += g->offset;
   }
    target = cmdbuf_page_addr + (reloc->cmdbuf.offset & ~PAGE_MASK);
   if (IS_ENABLED(CONFIG_TEGRA_HOST1X_FIREWALL))
       target += (reloc->cmdbuf.offset & PAGE_MASK) >> 2;
    *target = reloc_addr;
I think this logic would be cleaner like ..

if (IS_ENABLED(CONFIG_TEGRA_HOST1X_FIREWALL)) { target = job->gather_copy_mapped + g->offset + reloc->cmdbuf.offset; ... } else { if (last_page != reloc->cmdbuf.offset >> PAGE_SHIFT) { ... } target = cmdbuf_page_addr + (reloc->cmdbuf.offset & ~PAGE_MASK); } *target = reloc_addr

What about this variant?

/* pin & patch the relocs for one gather */ for (i = 0; i < job->num_relocs; i++) { @@ -286,6 +289,12 @@ static int do_relocs(struct host1x_job *job, struct host1x_bo *cmdbuf) if (cmdbuf != reloc->cmdbuf.bo) continue;

+ if (IS_ENABLED(CONFIG_TEGRA_HOST1X_FIREWALL)) { + cmdbuf_page_addr = job->gather_copy_mapped + g->offset; + target = cmdbuf_page_addr + reloc->cmdbuf.offset; + goto patch_reloc; + } + if (last_page != reloc->cmdbuf.offset >> PAGE_SHIFT) { if (cmdbuf_page_addr) host1x_bo_kunmap(cmdbuf, last_page, @@ -302,10 +311,11 @@ static int do_relocs(struct host1x_job *job, struct host1x_bo *cmdbuf) }

target = cmdbuf_page_addr + (reloc->cmdbuf.offset & ~PAGE_MASK); +patch_reloc: *target = reloc_addr; }

- if (cmdbuf_page_addr) + if (!IS_ENABLED(CONFIG_TEGRA_HOST1X_FIREWALL) && cmdbuf_page_addr) host1x_bo_kunmap(cmdbuf, last_page, cmdbuf_page_addr);

return 0;

...

...
  }
if (cmdbuf_page_addr)

if (!IS_ENABLED(CONFIG_TEGRA_HOST1X_FIREWALL) && cmdbuf_page_addr) host1x_bo_kunmap(cmdbuf, last_page, cmdbuf_page_addr); return 0;

@@ -573,6 +586,12 @@ int host1x_job_pin(struct host1x_job *job, struct device *dev) if (err) goto out;
if (IS_ENABLED(CONFIG_TEGRA_HOST1X_FIREWALL)) {
   err = copy_gathers(job, dev);
   if (err)
       goto out;
}
/* patch gathers */
for (i = 0; i < job->num_gathers; i++) {
    struct host1x_job_gather *g = &job->gathers[i];
@@ -581,7 +600,8 @@ int host1x_job_pin(struct host1x_job *job, struct device *dev) if (g->handled) continue;
   g->base = job->gather_addr_phys[i];
   if (!IS_ENABLED(CONFIG_TEGRA_HOST1X_FIREWALL))
       g->base = job->gather_addr_phys[i];
Perhaps add a comment here like "copy_gathers sets base when firewall is enabled"

Okay, added. Thank you for the review comments!

-- Dmitry

Mikko Perttunen

7:03 p.m.

New subject: [PATCH 13/22] gpu: host1x: Do not leak BO's phys address to userspace

On 06/13/2017 09:21 PM, Dmitry Osipenko wrote:

...

On 13.06.2017 20:31, Mikko Perttunen wrote:

...
On 05/23/2017 03:14 AM, Dmitry Osipenko wrote:

...
Do gathers coping before patching them, so the original gathers are left untouched. That's not as bad as leaking a kernel addresses, but still doesn't feel right.

Signed-off-by: Dmitry Osipenko digetx@gmail.com

drivers/gpu/host1x/job.c | 44 ++++++++++++++++++++++++++++++-------------- 1 file changed, 30 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/host1x/job.c b/drivers/gpu/host1x/job.c index 14f3f957ffab..190353944d23 100644 --- a/drivers/gpu/host1x/job.c +++ b/drivers/gpu/host1x/job.c @@ -137,8 +137,9 @@ static void host1x_syncpt_patch_offset(struct host1x_syncpt *sp, * avoid a wrap condition in the HW). */ static int do_waitchks(struct host1x_job *job, struct host1x *host,
          struct host1x_bo *patch)
          struct host1x_job_gather *g)
{
struct host1x_bo *patch = g->bo; int i; /* compare syncpt vs wait threshold */
@@ -165,7 +166,8 @@ static int do_waitchks(struct host1x_job *job, struct host1x *host, wait->syncpt_id, sp->name, wait->thresh, host1x_syncpt_read_min(sp));
       host1x_syncpt_patch_offset(sp, patch, wait->offset);
       host1x_syncpt_patch_offset(sp, patch,
                      g->offset + wait->offset);
     }
       wait->bo = NULL;
@@ -269,11 +271,12 @@ static unsigned int pin_job(struct host1x *host, struct host1x_job *job) return err; } -static int do_relocs(struct host1x_job *job, struct host1x_bo *cmdbuf) +static int do_relocs(struct host1x_job *job, struct host1x_job_gather *g) { int i = 0; u32 last_page = ~0; void *cmdbuf_page_addr = NULL;

struct host1x_bo *cmdbuf = g->bo; /* pin & patch the relocs for one gather */ for (i = 0; i < job->num_relocs; i++) {

@@ -286,7 +289,8 @@ static int do_relocs(struct host1x_job *job, struct host1x_bo *cmdbuf) if (cmdbuf != reloc->cmdbuf.bo) continue;
   if (last_page != reloc->cmdbuf.offset >> PAGE_SHIFT) {
   if (!IS_ENABLED(CONFIG_TEGRA_HOST1X_FIREWALL) &&
       last_page != reloc->cmdbuf.offset >> PAGE_SHIFT) {
         if (cmdbuf_page_addr)
             host1x_bo_kunmap(cmdbuf, last_page,
                      cmdbuf_page_addr);
@@ -301,11 +305,20 @@ static int do_relocs(struct host1x_job *job, struct host1x_bo *cmdbuf) } }
   if (IS_ENABLED(CONFIG_TEGRA_HOST1X_FIREWALL)) {
       cmdbuf_page_addr = job->gather_copy_mapped;
       cmdbuf_page_addr += g->offset;
   }
     target = cmdbuf_page_addr + (reloc->cmdbuf.offset & ~PAGE_MASK);
   if (IS_ENABLED(CONFIG_TEGRA_HOST1X_FIREWALL))
       target += (reloc->cmdbuf.offset & PAGE_MASK) >> 2;
     *target = reloc_addr;
I think this logic would be cleaner like ..

if (IS_ENABLED(CONFIG_TEGRA_HOST1X_FIREWALL)) { target = job->gather_copy_mapped + g->offset + reloc->cmdbuf.offset; ... } else { if (last_page != reloc->cmdbuf.offset >> PAGE_SHIFT) { ... } target = cmdbuf_page_addr + (reloc->cmdbuf.offset & ~PAGE_MASK); } *target = reloc_addr
What about this variant?

-static int do_relocs(struct host1x_job *job, struct host1x_bo *cmdbuf) +static int do_relocs(struct host1x_job *job, struct host1x_job_gather *g) { int i = 0; u32 last_page = ~0; void *cmdbuf_page_addr = NULL;

struct host1x_bo *cmdbuf = g->bo;

/* pin & patch the relocs for one gather */ for (i = 0; i < job->num_relocs; i++) {

@@ -286,6 +289,12 @@ static int do_relocs(struct host1x_job *job, struct host1x_bo *cmdbuf) if (cmdbuf != reloc->cmdbuf.bo) continue;
if (IS_ENABLED(CONFIG_TEGRA_HOST1X_FIREWALL)) {
	cmdbuf_page_addr = job->gather_copy_mapped + g->offset;

This no longer represents the address of a page, so I think it would be better to just assign the final value to target.

...

	target = cmdbuf_page_addr + reloc->cmdbuf.offset;
	goto patch_reloc;
}
if (last_page != reloc->cmdbuf.offset >> PAGE_SHIFT) { if (cmdbuf_page_addr) host1x_bo_kunmap(cmdbuf, last_page,
@@ -302,10 +311,11 @@ static int do_relocs(struct host1x_job *job, struct host1x_bo *cmdbuf) }
target = cmdbuf_page_addr + (reloc->cmdbuf.offset & ~PAGE_MASK);
+patch_reloc: *target = reloc_addr; }

if (cmdbuf_page_addr)

if (!IS_ENABLED(CONFIG_TEGRA_HOST1X_FIREWALL) && cmdbuf_page_addr)

And then we wouldn't need the IS_ENABLED here

...

host1x_bo_kunmap(cmdbuf, last_page, cmdbuf_page_addr);

return 0;

with that change, looks good to me

...

...
...
   }
if (cmdbuf_page_addr)

if (!IS_ENABLED(CONFIG_TEGRA_HOST1X_FIREWALL) && cmdbuf_page_addr) host1x_bo_kunmap(cmdbuf, last_page, cmdbuf_page_addr); return 0;

@@ -573,6 +586,12 @@ int host1x_job_pin(struct host1x_job *job, struct device *dev) if (err) goto out;
if (IS_ENABLED(CONFIG_TEGRA_HOST1X_FIREWALL)) {
   err = copy_gathers(job, dev);
   if (err)
       goto out;
}
 /* patch gathers */
 for (i = 0; i < job->num_gathers; i++) {
     struct host1x_job_gather *g = &job->gathers[i];
@@ -581,7 +600,8 @@ int host1x_job_pin(struct host1x_job *job, struct device *dev) if (g->handled) continue;
   g->base = job->gather_addr_phys[i];
   if (!IS_ENABLED(CONFIG_TEGRA_HOST1X_FIREWALL))
       g->base = job->gather_addr_phys[i];
Perhaps add a comment here like "copy_gathers sets base when firewall is enabled"
Okay, added. Thank you for the review comments!

Dmitry Osipenko

7:56 p.m.

New subject: [PATCH 13/22] gpu: host1x: Do not leak BO's phys address to userspace

On 13.06.2017 22:03, Mikko Perttunen wrote:

...

On 06/13/2017 09:21 PM, Dmitry Osipenko wrote:

...
On 13.06.2017 20:31, Mikko Perttunen wrote:

...
On 05/23/2017 03:14 AM, Dmitry Osipenko wrote:

...
Do gathers coping before patching them, so the original gathers are left untouched. That's not as bad as leaking a kernel addresses, but still doesn't feel right.

Signed-off-by: Dmitry Osipenko digetx@gmail.com

drivers/gpu/host1x/job.c | 44 ++++++++++++++++++++++++++++++-------------- 1 file changed, 30 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/host1x/job.c b/drivers/gpu/host1x/job.c index 14f3f957ffab..190353944d23 100644 --- a/drivers/gpu/host1x/job.c +++ b/drivers/gpu/host1x/job.c @@ -137,8 +137,9 @@ static void host1x_syncpt_patch_offset(struct host1x_syncpt *sp, * avoid a wrap condition in the HW). */ static int do_waitchks(struct host1x_job *job, struct host1x *host,
          struct host1x_bo *patch)
          struct host1x_job_gather *g)
{
struct host1x_bo *patch = g->bo; int i; /* compare syncpt vs wait threshold */
@@ -165,7 +166,8 @@ static int do_waitchks(struct host1x_job *job, struct host1x *host, wait->syncpt_id, sp->name, wait->thresh, host1x_syncpt_read_min(sp));
       host1x_syncpt_patch_offset(sp, patch, wait->offset);
       host1x_syncpt_patch_offset(sp, patch,
                      g->offset + wait->offset);
     }
       wait->bo = NULL;
@@ -269,11 +271,12 @@ static unsigned int pin_job(struct host1x *host, struct host1x_job *job) return err; } -static int do_relocs(struct host1x_job *job, struct host1x_bo *cmdbuf) +static int do_relocs(struct host1x_job *job, struct host1x_job_gather *g) { int i = 0; u32 last_page = ~0; void *cmdbuf_page_addr = NULL;

struct host1x_bo *cmdbuf = g->bo; /* pin & patch the relocs for one gather */ for (i = 0; i < job->num_relocs; i++) {

@@ -286,7 +289,8 @@ static int do_relocs(struct host1x_job *job, struct host1x_bo *cmdbuf) if (cmdbuf != reloc->cmdbuf.bo) continue;
   if (last_page != reloc->cmdbuf.offset >> PAGE_SHIFT) {
   if (!IS_ENABLED(CONFIG_TEGRA_HOST1X_FIREWALL) &&
       last_page != reloc->cmdbuf.offset >> PAGE_SHIFT) {
         if (cmdbuf_page_addr)
             host1x_bo_kunmap(cmdbuf, last_page,
                      cmdbuf_page_addr);
@@ -301,11 +305,20 @@ static int do_relocs(struct host1x_job *job, struct host1x_bo *cmdbuf) } }
   if (IS_ENABLED(CONFIG_TEGRA_HOST1X_FIREWALL)) {
       cmdbuf_page_addr = job->gather_copy_mapped;
       cmdbuf_page_addr += g->offset;
   }
     target = cmdbuf_page_addr + (reloc->cmdbuf.offset & ~PAGE_MASK);
   if (IS_ENABLED(CONFIG_TEGRA_HOST1X_FIREWALL))
       target += (reloc->cmdbuf.offset & PAGE_MASK) >> 2;
     *target = reloc_addr;
I think this logic would be cleaner like ..

if (IS_ENABLED(CONFIG_TEGRA_HOST1X_FIREWALL)) { target = job->gather_copy_mapped + g->offset + reloc->cmdbuf.offset; ... } else { if (last_page != reloc->cmdbuf.offset >> PAGE_SHIFT) { ... } target = cmdbuf_page_addr + (reloc->cmdbuf.offset & ~PAGE_MASK); } *target = reloc_addr
What about this variant?

-static int do_relocs(struct host1x_job *job, struct host1x_bo *cmdbuf) +static int do_relocs(struct host1x_job *job, struct host1x_job_gather *g) { int i = 0; u32 last_page = ~0; void *cmdbuf_page_addr = NULL;

struct host1x_bo *cmdbuf = g->bo;

/* pin & patch the relocs for one gather */ for (i = 0; i < job->num_relocs; i++) {

@@ -286,6 +289,12 @@ static int do_relocs(struct host1x_job *job, struct host1x_bo *cmdbuf) if (cmdbuf != reloc->cmdbuf.bo) continue;
   if (IS_ENABLED(CONFIG_TEGRA_HOST1X_FIREWALL)) {
       cmdbuf_page_addr = job->gather_copy_mapped + g->offset;
This no longer represents the address of a page, so I think it would be better to just assign the final value to target.

...
       target = cmdbuf_page_addr + reloc->cmdbuf.offset;
       goto patch_reloc;
   }
    if (last_page != reloc->cmdbuf.offset >> PAGE_SHIFT) {
        if (cmdbuf_page_addr)
            host1x_bo_kunmap(cmdbuf, last_page,
@@ -302,10 +311,11 @@ static int do_relocs(struct host1x_job *job, struct host1x_bo *cmdbuf) }
      target = cmdbuf_page_addr + (reloc->cmdbuf.offset & ~PAGE_MASK);
+patch_reloc: *target = reloc_addr; }

if (cmdbuf_page_addr)

if (!IS_ENABLED(CONFIG_TEGRA_HOST1X_FIREWALL) && cmdbuf_page_addr)
And then we wouldn't need the IS_ENABLED here

...
      host1x_bo_kunmap(cmdbuf, last_page, cmdbuf_page_addr);

  return 0;
with that change, looks good to me

Thank you very much for the suggestion.

...

...
...
...
   }
if (cmdbuf_page_addr)

if (!IS_ENABLED(CONFIG_TEGRA_HOST1X_FIREWALL) && cmdbuf_page_addr) host1x_bo_kunmap(cmdbuf, last_page, cmdbuf_page_addr); return 0;

@@ -573,6 +586,12 @@ int host1x_job_pin(struct host1x_job *job, struct device *dev) if (err) goto out;
if (IS_ENABLED(CONFIG_TEGRA_HOST1X_FIREWALL)) {
   err = copy_gathers(job, dev);
   if (err)
       goto out;
}
 /* patch gathers */
 for (i = 0; i < job->num_gathers; i++) {
     struct host1x_job_gather *g = &job->gathers[i];
@@ -581,7 +600,8 @@ int host1x_job_pin(struct host1x_job *job, struct device *dev) if (g->handled) continue;
   g->base = job->gather_addr_phys[i];
   if (!IS_ENABLED(CONFIG_TEGRA_HOST1X_FIREWALL))
       g->base = job->gather_addr_phys[i];
Perhaps add a comment here like "copy_gathers sets base when firewall is enabled"
Okay, added. Thank you for the review comments!

-- Dmitry

Dmitry Osipenko

23 May 23 May

12:14 a.m.

New subject: [PATCH 14/22] gpu: host1x: Forbid relocation address shifting in the firewall

Incorrectly shifted relocation address will cause a lower memory corruption and likely a hang on a write or a read of an arbitrary data in case of IOMMU absent. As of now there is no use for the address shifting (at least on Tegra20) and adding a proper shifting / sizes validation is much more work. Let's forbid it in the firewall till a proper validation is implemented.

Signed-off-by: Dmitry Osipenko digetx@gmail.com --- drivers/gpu/host1x/job.c | 4 ++++ 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/host1x/job.c b/drivers/gpu/host1x/job.c index 190353944d23..1a1568e64ba8 100644 --- a/drivers/gpu/host1x/job.c +++ b/drivers/gpu/host1x/job.c @@ -332,6 +332,10 @@ static bool check_reloc(struct host1x_reloc *reloc, struct host1x_bo *cmdbuf, if (reloc->cmdbuf.bo != cmdbuf || reloc->cmdbuf.offset != offset) return false;

+ /* relocation shift value validation isn't implemented yet */ + if (reloc->shift) + return false; + return true; }

-- 2.13.0

Erik Faye-Lund

12:33 a.m.

New subject: [PATCH 14/22] gpu: host1x: Forbid relocation address shifting in the firewall

On Tue, May 23, 2017 at 2:14 AM, Dmitry Osipenko digetx@gmail.com wrote:

...

Incorrectly shifted relocation address will cause a lower memory corruption and likely a hang on a write or a read of an arbitrary data in case of IOMMU absent. As of now there is no use for the address shifting (at least on Tegra20) and adding a proper shifting / sizes validation is much more work. Let's forbid it in the firewall till a proper validation is implemented.

Signed-off-by: Dmitry Osipenko digetx@gmail.com

drivers/gpu/host1x/job.c | 4 ++++ 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/host1x/job.c b/drivers/gpu/host1x/job.c index 190353944d23..1a1568e64ba8 100644 --- a/drivers/gpu/host1x/job.c +++ b/drivers/gpu/host1x/job.c @@ -332,6 +332,10 @@ static bool check_reloc(struct host1x_reloc *reloc, struct host1x_bo *cmdbuf, if (reloc->cmdbuf.bo != cmdbuf || reloc->cmdbuf.offset != offset) return false;
  /* relocation shift value validation isn't implemented yet */
  if (reloc->shift)
          return false;
  return true;
}

-- 2.13.0

Good call.

Reviewed-by: Erik Faye-Lund kusmabite@gmail.com

Mikko Perttunen

10:14 a.m.

New subject: [PATCH 14/22] gpu: host1x: Forbid relocation address shifting in the firewall

Reloc shifts are implemented (see assignment of u32 reloc_addr in do_relocs), and they are required for VIC job submissions.

On 23.05.2017 03:14, Dmitry Osipenko wrote:

...

Incorrectly shifted relocation address will cause a lower memory corruption and likely a hang on a write or a read of an arbitrary data in case of IOMMU absent. As of now there is no use for the address shifting (at least on Tegra20) and adding a proper shifting / sizes validation is much more work. Let's forbid it in the firewall till a proper validation is implemented.

Signed-off-by: Dmitry Osipenko digetx@gmail.com

drivers/gpu/host1x/job.c | 4 ++++ 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/host1x/job.c b/drivers/gpu/host1x/job.c index 190353944d23..1a1568e64ba8 100644 --- a/drivers/gpu/host1x/job.c +++ b/drivers/gpu/host1x/job.c @@ -332,6 +332,10 @@ static bool check_reloc(struct host1x_reloc *reloc, struct host1x_bo *cmdbuf, if (reloc->cmdbuf.bo != cmdbuf || reloc->cmdbuf.offset != offset) return false;
/* relocation shift value validation isn't implemented yet */

if (reloc->shift)
return false;
return true;
}

Dmitry Osipenko

10:58 a.m.

New subject: [PATCH 14/22] gpu: host1x: Forbid relocation address shifting in the firewall

On 23.05.2017 13:14, Mikko Perttunen wrote:

...

Reloc shifts are implemented (see assignment of u32 reloc_addr in do_relocs), and they are required for VIC job submissions.

The *validation* is unimplemented. Since VIC uses the shifts, we may add a validation for it in a way it is done for the registers / class checks, i.e. compare the per address register shift value. I suppose those registers are documented somewhere(TRM), could you please point me to them (TRM page, reg name)?

...

On 23.05.2017 03:14, Dmitry Osipenko wrote:

...
Incorrectly shifted relocation address will cause a lower memory corruption and likely a hang on a write or a read of an arbitrary data in case of IOMMU absent. As of now there is no use for the address shifting (at least on Tegra20) and adding a proper shifting / sizes validation is much more work. Let's forbid it in the firewall till a proper validation is implemented.

Signed-off-by: Dmitry Osipenko digetx@gmail.com

drivers/gpu/host1x/job.c | 4 ++++ 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/host1x/job.c b/drivers/gpu/host1x/job.c index 190353944d23..1a1568e64ba8 100644 --- a/drivers/gpu/host1x/job.c +++ b/drivers/gpu/host1x/job.c @@ -332,6 +332,10 @@ static bool check_reloc(struct host1x_reloc *reloc, struct host1x_bo *cmdbuf, if (reloc->cmdbuf.bo != cmdbuf || reloc->cmdbuf.offset != offset) return false;
/* relocation shift value validation isn't implemented yet */

if (reloc->shift)
   return false;
return true;
}

-- Dmitry

Mikko Perttunen

11:19 a.m.

New subject: [PATCH 14/22] gpu: host1x: Forbid relocation address shifting in the firewall

On 23.05.2017 13:58, Dmitry Osipenko wrote:

...

On 23.05.2017 13:14, Mikko Perttunen wrote:

...
Reloc shifts are implemented (see assignment of u32 reloc_addr in do_relocs), and they are required for VIC job submissions.

The *validation* is unimplemented. Since VIC uses the shifts, we may add a validation for it in a way it is done for the registers / class checks, i.e. compare the per address register shift value. I suppose those registers are documented somewhere(TRM), could you please point me to them (TRM page, reg name)?

Ah, indeed, sorry. I'm not sure if it's worth implementing validation for e.g. VIC, since it would be quite a bit of work and require many per-chip and per-unit tables or ranges in the kernel. I think it's a better solution to just forbid everything in the firewall for VIC (and eventually other units), since on systems with those units the normal configuration will be IOMMU anyway.

Anyway, For VIC, everything happens using two registers (method index and method parameter), so you would need to first detect the method index, save that, then wait for a method parameter write and then check that against the written method. The TRM currently has a list methods, but doesn't list their parameter - at least most ones taking a pointer are called *_OFFSET, but not sure if all. In any case, I don't think it's worth implementing.

Thanks, Mikko

...

...
On 23.05.2017 03:14, Dmitry Osipenko wrote:

...
Incorrectly shifted relocation address will cause a lower memory corruption and likely a hang on a write or a read of an arbitrary data in case of IOMMU absent. As of now there is no use for the address shifting (at least on Tegra20) and adding a proper shifting / sizes validation is much more work. Let's forbid it in the firewall till a proper validation is implemented.

Signed-off-by: Dmitry Osipenko digetx@gmail.com

drivers/gpu/host1x/job.c | 4 ++++ 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/host1x/job.c b/drivers/gpu/host1x/job.c index 190353944d23..1a1568e64ba8 100644 --- a/drivers/gpu/host1x/job.c +++ b/drivers/gpu/host1x/job.c @@ -332,6 +332,10 @@ static bool check_reloc(struct host1x_reloc *reloc, struct host1x_bo *cmdbuf, if (reloc->cmdbuf.bo != cmdbuf || reloc->cmdbuf.offset != offset) return false;
/* relocation shift value validation isn't implemented yet */

if (reloc->shift)
   return false;
return true;
}

Dmitry Osipenko

11:28 a.m.

New subject: [PATCH 14/22] gpu: host1x: Forbid relocation address shifting in the firewall

On 23.05.2017 14:19, Mikko Perttunen wrote:

...

On 23.05.2017 13:58, Dmitry Osipenko wrote:

...
On 23.05.2017 13:14, Mikko Perttunen wrote:

...
Reloc shifts are implemented (see assignment of u32 reloc_addr in do_relocs), and they are required for VIC job submissions.

The *validation* is unimplemented. Since VIC uses the shifts, we may add a validation for it in a way it is done for the registers / class checks, i.e. compare the per address register shift value. I suppose those registers are documented somewhere(TRM), could you please point me to them (TRM page, reg name)?

Ah, indeed, sorry. I'm not sure if it's worth implementing validation for e.g. VIC, since it would be quite a bit of work and require many per-chip and per-unit tables or ranges in the kernel. I think it's a better solution to just forbid everything in the firewall for VIC (and eventually other units), since on systems with those units the normal configuration will be IOMMU anyway.

Anyway, For VIC, everything happens using two registers (method index and method parameter), so you would need to first detect the method index, save that, then wait for a method parameter write and then check that against the written method. The TRM currently has a list methods, but doesn't list their parameter - at least most ones taking a pointer are called *_OFFSET, but not sure if all. In any case, I don't think it's worth implementing.

Yes, in IOMMU case firewall will be only useful as a debug feature. We may check for the IOMMU presence and bypass the firewall if it's present, or we may bypass the shifts validation only.

...

...
...
On 23.05.2017 03:14, Dmitry Osipenko wrote:

...
Incorrectly shifted relocation address will cause a lower memory corruption and likely a hang on a write or a read of an arbitrary data in case of IOMMU absent. As of now there is no use for the address shifting (at least on Tegra20) and adding a proper shifting / sizes validation is much more work. Let's forbid it in the firewall till a proper validation is implemented.

Signed-off-by: Dmitry Osipenko digetx@gmail.com

drivers/gpu/host1x/job.c | 4 ++++ 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/host1x/job.c b/drivers/gpu/host1x/job.c index 190353944d23..1a1568e64ba8 100644 --- a/drivers/gpu/host1x/job.c +++ b/drivers/gpu/host1x/job.c @@ -332,6 +332,10 @@ static bool check_reloc(struct host1x_reloc *reloc, struct host1x_bo *cmdbuf, if (reloc->cmdbuf.bo != cmdbuf || reloc->cmdbuf.offset != offset) return false;
/* relocation shift value validation isn't implemented yet */

if (reloc->shift)
   return false;
return true;
}

-- Dmitry

Mikko Perttunen

1 Jun 1 Jun

5:39 p.m.

New subject: [PATCH 14/22] gpu: host1x: Forbid relocation address shifting in the firewall

Reviewed-by: Mikko Perttunen mperttunen@nvidia.com

On 05/23/2017 03:14 AM, Dmitry Osipenko wrote:

...

Incorrectly shifted relocation address will cause a lower memory corruption and likely a hang on a write or a read of an arbitrary data in case of IOMMU absent. As of now there is no use for the address shifting (at least on Tegra20) and adding a proper shifting / sizes validation is much more work.

Perhaps change to "As of now there is no use for the address shifting on Tegra20" if you post another revision.

...

Let's forbid it in the firewall till a proper validation is implemented.

Signed-off-by: Dmitry Osipenko digetx@gmail.com

drivers/gpu/host1x/job.c | 4 ++++ 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/host1x/job.c b/drivers/gpu/host1x/job.c index 190353944d23..1a1568e64ba8 100644 --- a/drivers/gpu/host1x/job.c +++ b/drivers/gpu/host1x/job.c @@ -332,6 +332,10 @@ static bool check_reloc(struct host1x_reloc *reloc, struct host1x_bo *cmdbuf, if (reloc->cmdbuf.bo != cmdbuf || reloc->cmdbuf.offset != offset) return false;
/* relocation shift value validation isn't implemented yet */

if (reloc->shift)
return false;
return true; }

Dmitry Osipenko

6:37 p.m.

New subject: [PATCH 14/22] gpu: host1x: Forbid relocation address shifting in the firewall

On 01.06.2017 20:39, Mikko Perttunen wrote:

...

Reviewed-by: Mikko Perttunen mperttunen@nvidia.com

On 05/23/2017 03:14 AM, Dmitry Osipenko wrote:

...
Incorrectly shifted relocation address will cause a lower memory corruption and likely a hang on a write or a read of an arbitrary data in case of IOMMU absent. As of now there is no use for the address shifting (at least on Tegra20) and adding a proper shifting / sizes validation is much more work.

Perhaps change to "As of now there is no use for the address shifting on Tegra20" if you post another revision.

I'll post a new revision of the series after getting comments to the all patches, to not churn the ML. Thank you very much for the reviews!

-- Dmitry

Dmitry Osipenko

6:44 p.m.

New subject: [PATCH 14/22] gpu: host1x: Forbid relocation address shifting in the firewall

On 01.06.2017 21:37, Dmitry Osipenko wrote:

...

On 01.06.2017 20:39, Mikko Perttunen wrote:

...
Reviewed-by: Mikko Perttunen mperttunen@nvidia.com

On 05/23/2017 03:14 AM, Dmitry Osipenko wrote:

...
Incorrectly shifted relocation address will cause a lower memory corruption and likely a hang on a write or a read of an arbitrary data in case of IOMMU absent. As of now there is no use for the address shifting (at least on Tegra20) and adding a proper shifting / sizes validation is much more work.

Perhaps change to "As of now there is no use for the address shifting on Tegra20" if you post another revision.

I'll post a new revision of the series after getting comments to the all patches, to not churn the ML. Thank you very much for the reviews!

However, given your previous comments to this patch, I'll probably add a bypass of the shit checking in case of IOMMU presence.

-- Dmitry

Mikko Perttunen

6:51 p.m.

New subject: [PATCH 14/22] gpu: host1x: Forbid relocation address shifting in the firewall

On 06/01/2017 09:44 PM, Dmitry Osipenko wrote:

...

On 01.06.2017 21:37, Dmitry Osipenko wrote:

...
On 01.06.2017 20:39, Mikko Perttunen wrote:

...
Reviewed-by: Mikko Perttunen mperttunen@nvidia.com

On 05/23/2017 03:14 AM, Dmitry Osipenko wrote:

...
Incorrectly shifted relocation address will cause a lower memory corruption and likely a hang on a write or a read of an arbitrary data in case of IOMMU absent. As of now there is no use for the address shifting (at least on Tegra20) and adding a proper shifting / sizes validation is much more work.

Perhaps change to "As of now there is no use for the address shifting on Tegra20" if you post another revision.

I'll post a new revision of the series after getting comments to the all patches, to not churn the ML. Thank you very much for the reviews!

However, given your previous comments to this patch, I'll probably add a bypass of the shit checking in case of IOMMU presence.

I don't think that's needed - the firewall will deny pretty much all VIC submissions due to is_addr_reg not being implemented so it cannot reasonably be used on modern Tegras anyway.

Dmitry Osipenko

7:15 p.m.

New subject: [PATCH 14/22] gpu: host1x: Forbid relocation address shifting in the firewall

On 01.06.2017 21:51, Mikko Perttunen wrote:

...

On 06/01/2017 09:44 PM, Dmitry Osipenko wrote:

...
On 01.06.2017 21:37, Dmitry Osipenko wrote:

...
On 01.06.2017 20:39, Mikko Perttunen wrote:

...
Reviewed-by: Mikko Perttunen mperttunen@nvidia.com

On 05/23/2017 03:14 AM, Dmitry Osipenko wrote:

...
Incorrectly shifted relocation address will cause a lower memory corruption and likely a hang on a write or a read of an arbitrary data in case of IOMMU absent. As of now there is no use for the address shifting (at least on Tegra20) and adding a proper shifting / sizes validation is much more work.

Perhaps change to "As of now there is no use for the address shifting on Tegra20" if you post another revision.

I'll post a new revision of the series after getting comments to the all patches, to not churn the ML. Thank you very much for the reviews!

However, given your previous comments to this patch, I'll probably add a bypass of the shit checking in case of IOMMU presence.

The IOMMU presence checking probably wouldn't be enough. Better to check the Host1x version instead, to not break the non-IOMMU case on modern Tegras.

...

I don't think that's needed - the firewall will deny pretty much all VIC submissions due to is_addr_reg not being implemented so it cannot reasonably be used on modern Tegras anyway.

Either firewall should be completely avoided on newer Tegras or it should perform at least some checks and not break the newer Tegras.

-- Dmitry

Dmitry Osipenko

23 May 23 May

12:14 a.m.

New subject: [PATCH 15/22] gpu: host1x: Forbid RESTART opcode in the firewall

The RESTART opcode terminates the gather and restarts the CDMA fetching from a specified word << 2 relative to the CDMA start address. That shouldn't be allowed to be done by userspace.

Signed-off-by: Dmitry Osipenko digetx@gmail.com Reviewed-by: Erik Faye-Lund kusmabite@gmail.com --- drivers/gpu/host1x/job.c | 1 - 1 file changed, 1 deletion(-)

diff --git a/drivers/gpu/host1x/job.c b/drivers/gpu/host1x/job.c index 1a1568e64ba8..cf335c5979e2 100644 --- a/drivers/gpu/host1x/job.c +++ b/drivers/gpu/host1x/job.c @@ -497,7 +497,6 @@ static int validate(struct host1x_firewall *fw, struct host1x_job_gather *g) goto out; break; case 4: - case 5: case 14: break; default:

-- 2.13.0

Mikko Perttunen

1 Jun 1 Jun

5:41 p.m.

New subject: [PATCH 15/22] gpu: host1x: Forbid RESTART opcode in the firewall

Reviewed-by: Mikko Perttunen mperttunen@nvidia.com

On 05/23/2017 03:14 AM, Dmitry Osipenko wrote:

...

The RESTART opcode terminates the gather and restarts the CDMA fetching from a specified word << 2 relative to the CDMA start address. That shouldn't be allowed to be done by userspace.

Signed-off-by: Dmitry Osipenko digetx@gmail.com Reviewed-by: Erik Faye-Lund kusmabite@gmail.com

drivers/gpu/host1x/job.c | 1 - 1 file changed, 1 deletion(-)

diff --git a/drivers/gpu/host1x/job.c b/drivers/gpu/host1x/job.c index 1a1568e64ba8..cf335c5979e2 100644 --- a/drivers/gpu/host1x/job.c +++ b/drivers/gpu/host1x/job.c @@ -497,7 +497,6 @@ static int validate(struct host1x_firewall *fw, struct host1x_job_gather *g) goto out; break; case 4:
case 5:
case 14: break; default:

Dmitry Osipenko

23 May 23 May

12:14 a.m.

New subject: [PATCH 16/22] gpu: host1x: Forbid unrelated SETCLASS opcode in the firewall

Several channels could be made to write the same unit concurrently via the SETCLASS opcode, trusting userspace is a bad idea. It should be possible to drop the per-client channel reservation and add a per-unit locking by inserting MLOCK's to the command stream to re-allow the SETCLASS opcode, but it will be much more work. Let's forbid the unit-unrelated class changes for now.

Signed-off-by: Dmitry Osipenko digetx@gmail.com --- drivers/gpu/drm/tegra/drm.c | 1 + drivers/gpu/drm/tegra/drm.h | 1 + drivers/gpu/drm/tegra/gr2d.c | 12 ++++++++++++ drivers/gpu/host1x/job.c | 24 ++++++++++++++++++++---- include/linux/host1x.h | 5 ++++- 5 files changed, 38 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/tegra/drm.c b/drivers/gpu/drm/tegra/drm.c index cdb05d6efde4..17416e1c219a 100644 --- a/drivers/gpu/drm/tegra/drm.c +++ b/drivers/gpu/drm/tegra/drm.c @@ -531,6 +531,7 @@ int tegra_drm_submit(struct tegra_drm_context *context, }

job->is_addr_reg = context->client->ops->is_addr_reg; + job->is_valid_class = context->client->ops->is_valid_class; job->syncpt_incrs = syncpt.incrs; job->syncpt_id = syncpt.id; job->timeout = 10000; diff --git a/drivers/gpu/drm/tegra/drm.h b/drivers/gpu/drm/tegra/drm.h index 85aa2e3d9d4e..6d6da01282f3 100644 --- a/drivers/gpu/drm/tegra/drm.h +++ b/drivers/gpu/drm/tegra/drm.h @@ -83,6 +83,7 @@ struct tegra_drm_client_ops { struct tegra_drm_context *context); void (*close_channel)(struct tegra_drm_context *context); int (*is_addr_reg)(struct device *dev, u32 class, u32 offset); + int (*is_valid_class)(u32 class); int (*submit)(struct tegra_drm_context *context, struct drm_tegra_submit *args, struct drm_device *drm, struct drm_file *file); diff --git a/drivers/gpu/drm/tegra/gr2d.c b/drivers/gpu/drm/tegra/gr2d.c index 02cd3e37a6ec..782231c41a1a 100644 --- a/drivers/gpu/drm/tegra/gr2d.c +++ b/drivers/gpu/drm/tegra/gr2d.c @@ -109,10 +109,22 @@ static int gr2d_is_addr_reg(struct device *dev, u32 class, u32 offset) return 0; }

+static int gr2d_is_valid_class(u32 class) +{ + switch (class) { + case HOST1X_CLASS_GR2D: + case HOST1X_CLASS_GR2D_SB: + return 1; + } + + return 0; +} + static const struct tegra_drm_client_ops gr2d_ops = { .open_channel = gr2d_open_channel, .close_channel = gr2d_close_channel, .is_addr_reg = gr2d_is_addr_reg, + .is_valid_class = gr2d_is_valid_class, .submit = tegra_drm_submit, };

diff --git a/drivers/gpu/host1x/job.c b/drivers/gpu/host1x/job.c index cf335c5979e2..65e12219405a 100644 --- a/drivers/gpu/host1x/job.c +++ b/drivers/gpu/host1x/job.c @@ -358,6 +358,9 @@ struct host1x_firewall {

static int check_register(struct host1x_firewall *fw, unsigned long offset) { + if (!fw->job->is_addr_reg) + return 0; + if (fw->job->is_addr_reg(fw->dev, fw->class, offset)) { if (!fw->num_relocs) return -EINVAL; @@ -372,6 +375,19 @@ static int check_register(struct host1x_firewall *fw, unsigned long offset) return 0; }

+static int check_class(struct host1x_firewall *fw, u32 class) +{ + if (!fw->job->is_valid_class) { + if (fw->class != class) + return -EINVAL; + } else { + if (!fw->job->is_valid_class(fw->class)) + return -EINVAL; + } + + return 0; +} + static int check_mask(struct host1x_firewall *fw) { u32 mask = fw->mask; @@ -445,11 +461,9 @@ static int validate(struct host1x_firewall *fw, struct host1x_job_gather *g) { u32 *cmdbuf_base = (u32 *)fw->job->gather_copy_mapped + (g->offset / sizeof(u32)); + u32 job_class = fw->class; int err = 0;

- if (!fw->job->is_addr_reg) - return 0; - fw->words = g->words; fw->cmdbuf = g->bo; fw->offset = 0; @@ -469,7 +483,9 @@ static int validate(struct host1x_firewall *fw, struct host1x_job_gather *g) fw->class = word >> 6 & 0x3ff; fw->mask = word & 0x3f; fw->reg = word >> 16 & 0xfff; - err = check_mask(fw); + err = check_class(fw, job_class); + if (!err) + err = check_mask(fw); if (err) goto out; break; diff --git a/include/linux/host1x.h b/include/linux/host1x.h index aa323e43ae4e..561d6bb6580d 100644 --- a/include/linux/host1x.h +++ b/include/linux/host1x.h @@ -233,7 +233,10 @@ struct host1x_job { u8 *gather_copy_mapped;

/* Check if register is marked as an address reg */ - int (*is_addr_reg)(struct device *dev, u32 reg, u32 class); + int (*is_addr_reg)(struct device *dev, u32 class, u32 reg); + + /* Check if class belongs to the unit */ + int (*is_valid_class)(u32 class);

/* Request a SETCLASS to this class */ u32 class;

-- 2.13.0

Erik Faye-Lund

12:39 a.m.

New subject: [PATCH 16/22] gpu: host1x: Forbid unrelated SETCLASS opcode in the firewall

On Tue, May 23, 2017 at 2:14 AM, Dmitry Osipenko digetx@gmail.com wrote:

...

Several channels could be made to write the same unit concurrently via the SETCLASS opcode, trusting userspace is a bad idea. It should be possible to drop the per-client channel reservation and add a per-unit locking by inserting MLOCK's to the command stream to re-allow the SETCLASS opcode, but it will be much more work. Let's forbid the unit-unrelated class changes for now.

Signed-off-by: Dmitry Osipenko digetx@gmail.com

drivers/gpu/drm/tegra/drm.c | 1 + drivers/gpu/drm/tegra/drm.h | 1 + drivers/gpu/drm/tegra/gr2d.c | 12 ++++++++++++ drivers/gpu/host1x/job.c | 24 ++++++++++++++++++++---- include/linux/host1x.h | 5 ++++- 5 files changed, 38 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/tegra/drm.c b/drivers/gpu/drm/tegra/drm.c index cdb05d6efde4..17416e1c219a 100644 --- a/drivers/gpu/drm/tegra/drm.c +++ b/drivers/gpu/drm/tegra/drm.c @@ -531,6 +531,7 @@ int tegra_drm_submit(struct tegra_drm_context *context, }
    job->is_addr_reg = context->client->ops->is_addr_reg;
  job->is_valid_class = context->client->ops->is_valid_class;
  job->syncpt_incrs = syncpt.incrs;
  job->syncpt_id = syncpt.id;
  job->timeout = 10000;
diff --git a/drivers/gpu/drm/tegra/drm.h b/drivers/gpu/drm/tegra/drm.h index 85aa2e3d9d4e..6d6da01282f3 100644 --- a/drivers/gpu/drm/tegra/drm.h +++ b/drivers/gpu/drm/tegra/drm.h @@ -83,6 +83,7 @@ struct tegra_drm_client_ops { struct tegra_drm_context *context); void (*close_channel)(struct tegra_drm_context *context); int (*is_addr_reg)(struct device *dev, u32 class, u32 offset);
  int (*is_valid_class)(u32 class);
  int (*submit)(struct tegra_drm_context *context,
                struct drm_tegra_submit *args, struct drm_device *drm,
                struct drm_file *file);
diff --git a/drivers/gpu/drm/tegra/gr2d.c b/drivers/gpu/drm/tegra/gr2d.c index 02cd3e37a6ec..782231c41a1a 100644 --- a/drivers/gpu/drm/tegra/gr2d.c +++ b/drivers/gpu/drm/tegra/gr2d.c @@ -109,10 +109,22 @@ static int gr2d_is_addr_reg(struct device *dev, u32 class, u32 offset) return 0; }

+static int gr2d_is_valid_class(u32 class) +{
  switch (class) {
  case HOST1X_CLASS_GR2D:
  case HOST1X_CLASS_GR2D_SB:
          return 1;
  }
  return 0;
+}

static const struct tegra_drm_client_ops gr2d_ops = { .open_channel = gr2d_open_channel, .close_channel = gr2d_close_channel, .is_addr_reg = gr2d_is_addr_reg,
  .is_valid_class = gr2d_is_valid_class,
  .submit = tegra_drm_submit,
};

diff --git a/drivers/gpu/host1x/job.c b/drivers/gpu/host1x/job.c index cf335c5979e2..65e12219405a 100644 --- a/drivers/gpu/host1x/job.c +++ b/drivers/gpu/host1x/job.c @@ -358,6 +358,9 @@ struct host1x_firewall {

static int check_register(struct host1x_firewall *fw, unsigned long offset) {
  if (!fw->job->is_addr_reg)
          return 0;
  if (fw->job->is_addr_reg(fw->dev, fw->class, offset)) {
          if (!fw->num_relocs)
                  return -EINVAL;
@@ -372,6 +375,19 @@ static int check_register(struct host1x_firewall *fw, unsigned long offset) return 0; }

+static int check_class(struct host1x_firewall *fw, u32 class) +{
  if (!fw->job->is_valid_class) {
          if (fw->class != class)
                  return -EINVAL;
  } else {
          if (!fw->job->is_valid_class(fw->class))
                  return -EINVAL;
  }
  return 0;
+}

static int check_mask(struct host1x_firewall *fw) { u32 mask = fw->mask; @@ -445,11 +461,9 @@ static int validate(struct host1x_firewall *fw, struct host1x_job_gather *g) { u32 *cmdbuf_base = (u32 *)fw->job->gather_copy_mapped + (g->offset / sizeof(u32));
  u32 job_class = fw->class;
  int err = 0;
  if (!fw->job->is_addr_reg)
          return 0;
  fw->words = g->words;
  fw->cmdbuf = g->bo;
  fw->offset = 0;
@@ -469,7 +483,9 @@ static int validate(struct host1x_firewall *fw, struct host1x_job_gather *g) fw->class = word >> 6 & 0x3ff; fw->mask = word & 0x3f; fw->reg = word >> 16 & 0xfff;
                  err = check_mask(fw);
                  err = check_class(fw, job_class);
                  if (!err)
                          err = check_mask(fw);
                  if (err)
                          goto out;
                  break;
diff --git a/include/linux/host1x.h b/include/linux/host1x.h index aa323e43ae4e..561d6bb6580d 100644 --- a/include/linux/host1x.h +++ b/include/linux/host1x.h @@ -233,7 +233,10 @@ struct host1x_job { u8 *gather_copy_mapped;
    /* Check if register is marked as an address reg */
  int (*is_addr_reg)(struct device *dev, u32 reg, u32 class);
  int (*is_addr_reg)(struct device *dev, u32 class, u32 reg);

This seems like an unrelated fix, you might want to mention it in the commit message at least.

...

  /* Check if class belongs to the unit */

  int (*is_valid_class)(u32 class);

  /* Request a SETCLASS to this class */
  u32 class;

-- 2.13.0

Reviewed-by: Erik Faye-Lund kusmabite@gmail.com

Dmitry Osipenko

12:45 a.m.

New subject: [PATCH 16/22] gpu: host1x: Forbid unrelated SETCLASS opcode in the firewall

On 23.05.2017 03:39, Erik Faye-Lund wrote:

...

On Tue, May 23, 2017 at 2:14 AM, Dmitry Osipenko digetx@gmail.com wrote:

...
Several channels could be made to write the same unit concurrently via the SETCLASS opcode, trusting userspace is a bad idea. It should be possible to drop the per-client channel reservation and add a per-unit locking by inserting MLOCK's to the command stream to re-allow the SETCLASS opcode, but it will be much more work. Let's forbid the unit-unrelated class changes for now.

Signed-off-by: Dmitry Osipenko digetx@gmail.com

drivers/gpu/drm/tegra/drm.c | 1 + drivers/gpu/drm/tegra/drm.h | 1 + drivers/gpu/drm/tegra/gr2d.c | 12 ++++++++++++ drivers/gpu/host1x/job.c | 24 ++++++++++++++++++++---- include/linux/host1x.h | 5 ++++- 5 files changed, 38 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/tegra/drm.c b/drivers/gpu/drm/tegra/drm.c index cdb05d6efde4..17416e1c219a 100644 --- a/drivers/gpu/drm/tegra/drm.c +++ b/drivers/gpu/drm/tegra/drm.c @@ -531,6 +531,7 @@ int tegra_drm_submit(struct tegra_drm_context *context, }
    job->is_addr_reg = context->client->ops->is_addr_reg;
  job->is_valid_class = context->client->ops->is_valid_class;
  job->syncpt_incrs = syncpt.incrs;
  job->syncpt_id = syncpt.id;
  job->timeout = 10000;
diff --git a/drivers/gpu/drm/tegra/drm.h b/drivers/gpu/drm/tegra/drm.h index 85aa2e3d9d4e..6d6da01282f3 100644 --- a/drivers/gpu/drm/tegra/drm.h +++ b/drivers/gpu/drm/tegra/drm.h @@ -83,6 +83,7 @@ struct tegra_drm_client_ops { struct tegra_drm_context *context); void (*close_channel)(struct tegra_drm_context *context); int (*is_addr_reg)(struct device *dev, u32 class, u32 offset);
  int (*is_valid_class)(u32 class);
  int (*submit)(struct tegra_drm_context *context,
                struct drm_tegra_submit *args, struct drm_device *drm,
                struct drm_file *file);
diff --git a/drivers/gpu/drm/tegra/gr2d.c b/drivers/gpu/drm/tegra/gr2d.c index 02cd3e37a6ec..782231c41a1a 100644 --- a/drivers/gpu/drm/tegra/gr2d.c +++ b/drivers/gpu/drm/tegra/gr2d.c @@ -109,10 +109,22 @@ static int gr2d_is_addr_reg(struct device *dev, u32 class, u32 offset) return 0; }

+static int gr2d_is_valid_class(u32 class) +{
  switch (class) {
  case HOST1X_CLASS_GR2D:
  case HOST1X_CLASS_GR2D_SB:
          return 1;
  }
  return 0;
+}

static const struct tegra_drm_client_ops gr2d_ops = { .open_channel = gr2d_open_channel, .close_channel = gr2d_close_channel, .is_addr_reg = gr2d_is_addr_reg,
  .is_valid_class = gr2d_is_valid_class,
  .submit = tegra_drm_submit,
};

diff --git a/drivers/gpu/host1x/job.c b/drivers/gpu/host1x/job.c index cf335c5979e2..65e12219405a 100644 --- a/drivers/gpu/host1x/job.c +++ b/drivers/gpu/host1x/job.c @@ -358,6 +358,9 @@ struct host1x_firewall {

static int check_register(struct host1x_firewall *fw, unsigned long offset) {
  if (!fw->job->is_addr_reg)
          return 0;
  if (fw->job->is_addr_reg(fw->dev, fw->class, offset)) {
          if (!fw->num_relocs)
                  return -EINVAL;
@@ -372,6 +375,19 @@ static int check_register(struct host1x_firewall *fw, unsigned long offset) return 0; }

+static int check_class(struct host1x_firewall *fw, u32 class) +{
  if (!fw->job->is_valid_class) {
          if (fw->class != class)
                  return -EINVAL;
  } else {
          if (!fw->job->is_valid_class(fw->class))
                  return -EINVAL;
  }
  return 0;
+}

static int check_mask(struct host1x_firewall *fw) { u32 mask = fw->mask; @@ -445,11 +461,9 @@ static int validate(struct host1x_firewall *fw, struct host1x_job_gather *g) { u32 *cmdbuf_base = (u32 *)fw->job->gather_copy_mapped + (g->offset / sizeof(u32));
  u32 job_class = fw->class;
  int err = 0;
  if (!fw->job->is_addr_reg)
          return 0;
  fw->words = g->words;
  fw->cmdbuf = g->bo;
  fw->offset = 0;
@@ -469,7 +483,9 @@ static int validate(struct host1x_firewall *fw, struct host1x_job_gather *g) fw->class = word >> 6 & 0x3ff; fw->mask = word & 0x3f; fw->reg = word >> 16 & 0xfff;
                  err = check_mask(fw);
                  err = check_class(fw, job_class);
                  if (!err)
                          err = check_mask(fw);
                  if (err)
                          goto out;
                  break;
diff --git a/include/linux/host1x.h b/include/linux/host1x.h index aa323e43ae4e..561d6bb6580d 100644 --- a/include/linux/host1x.h +++ b/include/linux/host1x.h @@ -233,7 +233,10 @@ struct host1x_job { u8 *gather_copy_mapped;
    /* Check if register is marked as an address reg */
  int (*is_addr_reg)(struct device *dev, u32 reg, u32 class);
  int (*is_addr_reg)(struct device *dev, u32 class, u32 reg);
This seems like an unrelated fix, you might want to mention it in the commit message at least.

Yeah, I slipped in this typo fix here as I got confused by the swapped class-reg in the function proto while was writing this patch.

-- Dmitry

Thierry Reding

13 Jun 13 Jun

2:06 p.m.

New subject: [PATCH 16/22] gpu: host1x: Forbid unrelated SETCLASS opcode in the firewall

On Tue, May 23, 2017 at 02:39:33AM +0200, Erik Faye-Lund wrote:

...

On Tue, May 23, 2017 at 2:14 AM, Dmitry Osipenko digetx@gmail.com wrote:

...
Several channels could be made to write the same unit concurrently via the SETCLASS opcode, trusting userspace is a bad idea. It should be possible to drop the per-client channel reservation and add a per-unit locking by inserting MLOCK's to the command stream to re-allow the SETCLASS opcode, but it will be much more work. Let's forbid the unit-unrelated class changes for now.

Signed-off-by: Dmitry Osipenko digetx@gmail.com

drivers/gpu/drm/tegra/drm.c | 1 + drivers/gpu/drm/tegra/drm.h | 1 + drivers/gpu/drm/tegra/gr2d.c | 12 ++++++++++++ drivers/gpu/host1x/job.c | 24 ++++++++++++++++++++---- include/linux/host1x.h | 5 ++++- 5 files changed, 38 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/tegra/drm.c b/drivers/gpu/drm/tegra/drm.c index cdb05d6efde4..17416e1c219a 100644 --- a/drivers/gpu/drm/tegra/drm.c +++ b/drivers/gpu/drm/tegra/drm.c @@ -531,6 +531,7 @@ int tegra_drm_submit(struct tegra_drm_context *context, }
    job->is_addr_reg = context->client->ops->is_addr_reg;
  job->is_valid_class = context->client->ops->is_valid_class;
  job->syncpt_incrs = syncpt.incrs;
  job->syncpt_id = syncpt.id;
  job->timeout = 10000;
diff --git a/drivers/gpu/drm/tegra/drm.h b/drivers/gpu/drm/tegra/drm.h index 85aa2e3d9d4e..6d6da01282f3 100644 --- a/drivers/gpu/drm/tegra/drm.h +++ b/drivers/gpu/drm/tegra/drm.h @@ -83,6 +83,7 @@ struct tegra_drm_client_ops { struct tegra_drm_context *context); void (*close_channel)(struct tegra_drm_context *context); int (*is_addr_reg)(struct device *dev, u32 class, u32 offset);
  int (*is_valid_class)(u32 class);
  int (*submit)(struct tegra_drm_context *context,
                struct drm_tegra_submit *args, struct drm_device *drm,
                struct drm_file *file);
diff --git a/drivers/gpu/drm/tegra/gr2d.c b/drivers/gpu/drm/tegra/gr2d.c index 02cd3e37a6ec..782231c41a1a 100644 --- a/drivers/gpu/drm/tegra/gr2d.c +++ b/drivers/gpu/drm/tegra/gr2d.c @@ -109,10 +109,22 @@ static int gr2d_is_addr_reg(struct device *dev, u32 class, u32 offset) return 0; }

+static int gr2d_is_valid_class(u32 class) +{
  switch (class) {
  case HOST1X_CLASS_GR2D:
  case HOST1X_CLASS_GR2D_SB:
          return 1;
  }
  return 0;
+}

static const struct tegra_drm_client_ops gr2d_ops = { .open_channel = gr2d_open_channel, .close_channel = gr2d_close_channel, .is_addr_reg = gr2d_is_addr_reg,
  .is_valid_class = gr2d_is_valid_class,
  .submit = tegra_drm_submit,
};

diff --git a/drivers/gpu/host1x/job.c b/drivers/gpu/host1x/job.c index cf335c5979e2..65e12219405a 100644 --- a/drivers/gpu/host1x/job.c +++ b/drivers/gpu/host1x/job.c @@ -358,6 +358,9 @@ struct host1x_firewall {

static int check_register(struct host1x_firewall *fw, unsigned long offset) {
  if (!fw->job->is_addr_reg)
          return 0;
  if (fw->job->is_addr_reg(fw->dev, fw->class, offset)) {
          if (!fw->num_relocs)
                  return -EINVAL;
@@ -372,6 +375,19 @@ static int check_register(struct host1x_firewall *fw, unsigned long offset) return 0; }

+static int check_class(struct host1x_firewall *fw, u32 class) +{
  if (!fw->job->is_valid_class) {
          if (fw->class != class)
                  return -EINVAL;
  } else {
          if (!fw->job->is_valid_class(fw->class))
                  return -EINVAL;
  }
  return 0;
+}

static int check_mask(struct host1x_firewall *fw) { u32 mask = fw->mask; @@ -445,11 +461,9 @@ static int validate(struct host1x_firewall *fw, struct host1x_job_gather *g) { u32 *cmdbuf_base = (u32 *)fw->job->gather_copy_mapped + (g->offset / sizeof(u32));
  u32 job_class = fw->class;
  int err = 0;
  if (!fw->job->is_addr_reg)
          return 0;
  fw->words = g->words;
  fw->cmdbuf = g->bo;
  fw->offset = 0;
@@ -469,7 +483,9 @@ static int validate(struct host1x_firewall *fw, struct host1x_job_gather *g) fw->class = word >> 6 & 0x3ff; fw->mask = word & 0x3f; fw->reg = word >> 16 & 0xfff;
                  err = check_mask(fw);
                  err = check_class(fw, job_class);
                  if (!err)
                          err = check_mask(fw);
                  if (err)
                          goto out;
                  break;
diff --git a/include/linux/host1x.h b/include/linux/host1x.h index aa323e43ae4e..561d6bb6580d 100644 --- a/include/linux/host1x.h +++ b/include/linux/host1x.h @@ -233,7 +233,10 @@ struct host1x_job { u8 *gather_copy_mapped;
    /* Check if register is marked as an address reg */
  int (*is_addr_reg)(struct device *dev, u32 reg, u32 class);
  int (*is_addr_reg)(struct device *dev, u32 class, u32 reg);
This seems like an unrelated fix, you might want to mention it in the commit message at least.

If you're going to rev the series anyway, might be worth splitting this off into a separate commit to make it stand out more.

Either way is fine with me.

Thierry

Dmitry Osipenko

2:15 p.m.

New subject: [PATCH 16/22] gpu: host1x: Forbid unrelated SETCLASS opcode in the firewall

On 13.06.2017 17:06, Thierry Reding wrote:

...

On Tue, May 23, 2017 at 02:39:33AM +0200, Erik Faye-Lund wrote:

...
On Tue, May 23, 2017 at 2:14 AM, Dmitry Osipenko digetx@gmail.com wrote:

...
Several channels could be made to write the same unit concurrently via the SETCLASS opcode, trusting userspace is a bad idea. It should be possible to drop the per-client channel reservation and add a per-unit locking by inserting MLOCK's to the command stream to re-allow the SETCLASS opcode, but it will be much more work. Let's forbid the unit-unrelated class changes for now.

Signed-off-by: Dmitry Osipenko digetx@gmail.com

drivers/gpu/drm/tegra/drm.c | 1 + drivers/gpu/drm/tegra/drm.h | 1 + drivers/gpu/drm/tegra/gr2d.c | 12 ++++++++++++ drivers/gpu/host1x/job.c | 24 ++++++++++++++++++++---- include/linux/host1x.h | 5 ++++- 5 files changed, 38 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/tegra/drm.c b/drivers/gpu/drm/tegra/drm.c index cdb05d6efde4..17416e1c219a 100644 --- a/drivers/gpu/drm/tegra/drm.c +++ b/drivers/gpu/drm/tegra/drm.c @@ -531,6 +531,7 @@ int tegra_drm_submit(struct tegra_drm_context *context, }
    job->is_addr_reg = context->client->ops->is_addr_reg;
  job->is_valid_class = context->client->ops->is_valid_class;
  job->syncpt_incrs = syncpt.incrs;
  job->syncpt_id = syncpt.id;
  job->timeout = 10000;
diff --git a/drivers/gpu/drm/tegra/drm.h b/drivers/gpu/drm/tegra/drm.h index 85aa2e3d9d4e..6d6da01282f3 100644 --- a/drivers/gpu/drm/tegra/drm.h +++ b/drivers/gpu/drm/tegra/drm.h @@ -83,6 +83,7 @@ struct tegra_drm_client_ops { struct tegra_drm_context *context); void (*close_channel)(struct tegra_drm_context *context); int (*is_addr_reg)(struct device *dev, u32 class, u32 offset);
  int (*is_valid_class)(u32 class);
  int (*submit)(struct tegra_drm_context *context,
                struct drm_tegra_submit *args, struct drm_device *drm,
                struct drm_file *file);
diff --git a/drivers/gpu/drm/tegra/gr2d.c b/drivers/gpu/drm/tegra/gr2d.c index 02cd3e37a6ec..782231c41a1a 100644 --- a/drivers/gpu/drm/tegra/gr2d.c +++ b/drivers/gpu/drm/tegra/gr2d.c @@ -109,10 +109,22 @@ static int gr2d_is_addr_reg(struct device *dev, u32 class, u32 offset) return 0; }

+static int gr2d_is_valid_class(u32 class) +{
  switch (class) {
  case HOST1X_CLASS_GR2D:
  case HOST1X_CLASS_GR2D_SB:
          return 1;
  }
  return 0;
+}

static const struct tegra_drm_client_ops gr2d_ops = { .open_channel = gr2d_open_channel, .close_channel = gr2d_close_channel, .is_addr_reg = gr2d_is_addr_reg,
  .is_valid_class = gr2d_is_valid_class,
  .submit = tegra_drm_submit,
};

diff --git a/drivers/gpu/host1x/job.c b/drivers/gpu/host1x/job.c index cf335c5979e2..65e12219405a 100644 --- a/drivers/gpu/host1x/job.c +++ b/drivers/gpu/host1x/job.c @@ -358,6 +358,9 @@ struct host1x_firewall {

static int check_register(struct host1x_firewall *fw, unsigned long offset) {
  if (!fw->job->is_addr_reg)
          return 0;
  if (fw->job->is_addr_reg(fw->dev, fw->class, offset)) {
          if (!fw->num_relocs)
                  return -EINVAL;
@@ -372,6 +375,19 @@ static int check_register(struct host1x_firewall *fw, unsigned long offset) return 0; }

+static int check_class(struct host1x_firewall *fw, u32 class) +{
  if (!fw->job->is_valid_class) {
          if (fw->class != class)
                  return -EINVAL;
  } else {
          if (!fw->job->is_valid_class(fw->class))
                  return -EINVAL;
  }
  return 0;
+}

static int check_mask(struct host1x_firewall *fw) { u32 mask = fw->mask; @@ -445,11 +461,9 @@ static int validate(struct host1x_firewall *fw, struct host1x_job_gather *g) { u32 *cmdbuf_base = (u32 *)fw->job->gather_copy_mapped + (g->offset / sizeof(u32));
  u32 job_class = fw->class;
  int err = 0;
  if (!fw->job->is_addr_reg)
          return 0;
  fw->words = g->words;
  fw->cmdbuf = g->bo;
  fw->offset = 0;
@@ -469,7 +483,9 @@ static int validate(struct host1x_firewall *fw, struct host1x_job_gather *g) fw->class = word >> 6 & 0x3ff; fw->mask = word & 0x3f; fw->reg = word >> 16 & 0xfff;
                  err = check_mask(fw);
                  err = check_class(fw, job_class);
                  if (!err)
                          err = check_mask(fw);
                  if (err)
                          goto out;
                  break;
diff --git a/include/linux/host1x.h b/include/linux/host1x.h index aa323e43ae4e..561d6bb6580d 100644 --- a/include/linux/host1x.h +++ b/include/linux/host1x.h @@ -233,7 +233,10 @@ struct host1x_job { u8 *gather_copy_mapped;
    /* Check if register is marked as an address reg */
  int (*is_addr_reg)(struct device *dev, u32 reg, u32 class);
  int (*is_addr_reg)(struct device *dev, u32 class, u32 reg);
This seems like an unrelated fix, you might want to mention it in the commit message at least.
If you're going to rev the series anyway, might be worth splitting this off into a separate commit to make it stand out more.

Either way is fine with me.

Thierry

Yes, factoring out that change is reasonable.

-- Dmitry

Mikko Perttunen

1 Jun 1 Jun

5:46 p.m.

New subject: [PATCH 16/22] gpu: host1x: Forbid unrelated SETCLASS opcode in the firewall

With the simplification below and the is_addr_reg change mentioned,

Reviewed-by: Mikko Perttunen mperttunen@nvidia.com

On 05/23/2017 03:14 AM, Dmitry Osipenko wrote:

...

Several channels could be made to write the same unit concurrently via the SETCLASS opcode, trusting userspace is a bad idea. It should be possible to drop the per-client channel reservation and add a per-unit locking by inserting MLOCK's to the command stream to re-allow the SETCLASS opcode, but it will be much more work. Let's forbid the unit-unrelated class changes for now.

Signed-off-by: Dmitry Osipenko digetx@gmail.com

drivers/gpu/drm/tegra/drm.c | 1 + drivers/gpu/drm/tegra/drm.h | 1 + drivers/gpu/drm/tegra/gr2d.c | 12 ++++++++++++ drivers/gpu/host1x/job.c | 24 ++++++++++++++++++++---- include/linux/host1x.h | 5 ++++- 5 files changed, 38 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/tegra/drm.c b/drivers/gpu/drm/tegra/drm.c index cdb05d6efde4..17416e1c219a 100644 --- a/drivers/gpu/drm/tegra/drm.c +++ b/drivers/gpu/drm/tegra/drm.c @@ -531,6 +531,7 @@ int tegra_drm_submit(struct tegra_drm_context *context, }

job->is_addr_reg = context->client->ops->is_addr_reg;

job->is_valid_class = context->client->ops->is_valid_class; job->syncpt_incrs = syncpt.incrs; job->syncpt_id = syncpt.id; job->timeout = 10000;

diff --git a/drivers/gpu/drm/tegra/drm.h b/drivers/gpu/drm/tegra/drm.h index 85aa2e3d9d4e..6d6da01282f3 100644 --- a/drivers/gpu/drm/tegra/drm.h +++ b/drivers/gpu/drm/tegra/drm.h @@ -83,6 +83,7 @@ struct tegra_drm_client_ops { struct tegra_drm_context *context); void (*close_channel)(struct tegra_drm_context *context); int (*is_addr_reg)(struct device *dev, u32 class, u32 offset);

int (*is_valid_class)(u32 class); int (*submit)(struct tegra_drm_context *context, struct drm_tegra_submit *args, struct drm_device *drm, struct drm_file *file);

diff --git a/drivers/gpu/drm/tegra/gr2d.c b/drivers/gpu/drm/tegra/gr2d.c index 02cd3e37a6ec..782231c41a1a 100644 --- a/drivers/gpu/drm/tegra/gr2d.c +++ b/drivers/gpu/drm/tegra/gr2d.c @@ -109,10 +109,22 @@ static int gr2d_is_addr_reg(struct device *dev, u32 class, u32 offset) return 0; }

+static int gr2d_is_valid_class(u32 class) +{
switch (class) {

case HOST1X_CLASS_GR2D:

case HOST1X_CLASS_GR2D_SB:
return 1;
}

return 0;
+}

This could be just

return (class == HOST1X_CLASS_GR2D || class == HOST1X_CLASS_GR2D_SB);

...

static const struct tegra_drm_client_ops gr2d_ops = { .open_channel = gr2d_open_channel, .close_channel = gr2d_close_channel, .is_addr_reg = gr2d_is_addr_reg,

.is_valid_class = gr2d_is_valid_class, .submit = tegra_drm_submit, };

diff --git a/drivers/gpu/host1x/job.c b/drivers/gpu/host1x/job.c index cf335c5979e2..65e12219405a 100644 --- a/drivers/gpu/host1x/job.c +++ b/drivers/gpu/host1x/job.c @@ -358,6 +358,9 @@ struct host1x_firewall {

static int check_register(struct host1x_firewall *fw, unsigned long offset) {
if (!fw->job->is_addr_reg)
return 0;
if (fw->job->is_addr_reg(fw->dev, fw->class, offset)) { if (!fw->num_relocs) return -EINVAL;
@@ -372,6 +375,19 @@ static int check_register(struct host1x_firewall *fw, unsigned long offset) return 0; }

+static int check_class(struct host1x_firewall *fw, u32 class) +{
if (!fw->job->is_valid_class) {
if (fw->class != class)
	return -EINVAL;
} else {
if (!fw->job->is_valid_class(fw->class))
	return -EINVAL;
}

return 0;
+}

static int check_mask(struct host1x_firewall *fw) { u32 mask = fw->mask;

@@ -445,11 +461,9 @@ static int validate(struct host1x_firewall *fw, struct host1x_job_gather *g) { u32 *cmdbuf_base = (u32 *)fw->job->gather_copy_mapped + (g->offset / sizeof(u32));

u32 job_class = fw->class; int err = 0;
if (!fw->job->is_addr_reg)
return 0;
fw->words = g->words; fw->cmdbuf = g->bo; fw->offset = 0;
@@ -469,7 +483,9 @@ static int validate(struct host1x_firewall *fw, struct host1x_job_gather *g) fw->class = word >> 6 & 0x3ff; fw->mask = word & 0x3f; fw->reg = word >> 16 & 0xfff;
	err = check_mask(fw);
	err = check_class(fw, job_class);
	if (!err)
		err = check_mask(fw);
if (err)
	goto out;
break;
diff --git a/include/linux/host1x.h b/include/linux/host1x.h index aa323e43ae4e..561d6bb6580d 100644 --- a/include/linux/host1x.h +++ b/include/linux/host1x.h @@ -233,7 +233,10 @@ struct host1x_job { u8 *gather_copy_mapped;

/* Check if register is marked as an address reg */

int (*is_addr_reg)(struct device *dev, u32 reg, u32 class);

int (*is_addr_reg)(struct device *dev, u32 class, u32 reg);

/* Check if class belongs to the unit */

int (*is_valid_class)(u32 class);

/* Request a SETCLASS to this class */ u32 class;

Dmitry Osipenko

6:36 p.m.

New subject: [PATCH 16/22] gpu: host1x: Forbid unrelated SETCLASS opcode in the firewall

On 01.06.2017 20:46, Mikko Perttunen wrote:

...

With the simplification below and the is_addr_reg change mentioned,

Reviewed-by: Mikko Perttunen mperttunen@nvidia.com

On 05/23/2017 03:14 AM, Dmitry Osipenko wrote:

...
Several channels could be made to write the same unit concurrently via the SETCLASS opcode, trusting userspace is a bad idea. It should be possible to drop the per-client channel reservation and add a per-unit locking by inserting MLOCK's to the command stream to re-allow the SETCLASS opcode, but it will be much more work. Let's forbid the unit-unrelated class changes for now.

Signed-off-by: Dmitry Osipenko digetx@gmail.com

drivers/gpu/drm/tegra/drm.c | 1 + drivers/gpu/drm/tegra/drm.h | 1 + drivers/gpu/drm/tegra/gr2d.c | 12 ++++++++++++ drivers/gpu/host1x/job.c | 24 ++++++++++++++++++++---- include/linux/host1x.h | 5 ++++- 5 files changed, 38 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/tegra/drm.c b/drivers/gpu/drm/tegra/drm.c index cdb05d6efde4..17416e1c219a 100644 --- a/drivers/gpu/drm/tegra/drm.c +++ b/drivers/gpu/drm/tegra/drm.c @@ -531,6 +531,7 @@ int tegra_drm_submit(struct tegra_drm_context *context, } job->is_addr_reg = context->client->ops->is_addr_reg;

job->is_valid_class = context->client->ops->is_valid_class; job->syncpt_incrs = syncpt.incrs; job->syncpt_id = syncpt.id; job->timeout = 10000;

diff --git a/drivers/gpu/drm/tegra/drm.h b/drivers/gpu/drm/tegra/drm.h index 85aa2e3d9d4e..6d6da01282f3 100644 --- a/drivers/gpu/drm/tegra/drm.h +++ b/drivers/gpu/drm/tegra/drm.h @@ -83,6 +83,7 @@ struct tegra_drm_client_ops { struct tegra_drm_context *context); void (*close_channel)(struct tegra_drm_context *context); int (*is_addr_reg)(struct device *dev, u32 class, u32 offset);

int (*is_valid_class)(u32 class); int (*submit)(struct tegra_drm_context *context, struct drm_tegra_submit *args, struct drm_device *drm, struct drm_file *file);

diff --git a/drivers/gpu/drm/tegra/gr2d.c b/drivers/gpu/drm/tegra/gr2d.c index 02cd3e37a6ec..782231c41a1a 100644 --- a/drivers/gpu/drm/tegra/gr2d.c +++ b/drivers/gpu/drm/tegra/gr2d.c @@ -109,10 +109,22 @@ static int gr2d_is_addr_reg(struct device *dev, u32 class, u32 offset) return 0; } +static int gr2d_is_valid_class(u32 class) +{
switch (class) {

case HOST1X_CLASS_GR2D:

case HOST1X_CLASS_GR2D_SB:
   return 1;
}

return 0;
+}
This could be just

return (class == HOST1X_CLASS_GR2D || class == HOST1X_CLASS_GR2D_SB);

That's a matter of a taste as both variants result in a same assembly being generated. I don't mind your variant ;)

...

...
static const struct tegra_drm_client_ops gr2d_ops = { .open_channel = gr2d_open_channel, .close_channel = gr2d_close_channel, .is_addr_reg = gr2d_is_addr_reg,

.is_valid_class = gr2d_is_valid_class, .submit = tegra_drm_submit, }; diff --git a/drivers/gpu/host1x/job.c b/drivers/gpu/host1x/job.c

index cf335c5979e2..65e12219405a 100644 --- a/drivers/gpu/host1x/job.c +++ b/drivers/gpu/host1x/job.c @@ -358,6 +358,9 @@ struct host1x_firewall { static int check_register(struct host1x_firewall *fw, unsigned long offset) {
if (!fw->job->is_addr_reg)
   return 0;
if (fw->job->is_addr_reg(fw->dev, fw->class, offset)) {
    if (!fw->num_relocs)
        return -EINVAL;
@@ -372,6 +375,19 @@ static int check_register(struct host1x_firewall *fw, unsigned long offset) return 0; } +static int check_class(struct host1x_firewall *fw, u32 class) +{
if (!fw->job->is_valid_class) {
   if (fw->class != class)
       return -EINVAL;
} else {
   if (!fw->job->is_valid_class(fw->class))
       return -EINVAL;
}

return 0;
+}

static int check_mask(struct host1x_firewall *fw) { u32 mask = fw->mask;

@@ -445,11 +461,9 @@ static int validate(struct host1x_firewall *fw, struct host1x_job_gather *g) { u32 *cmdbuf_base = (u32 *)fw->job->gather_copy_mapped + (g->offset / sizeof(u32));

u32 job_class = fw->class; int err = 0;

if (!fw->job->is_addr_reg)
   return 0;
fw->words = g->words;
fw->cmdbuf = g->bo;
fw->offset = 0;
@@ -469,7 +483,9 @@ static int validate(struct host1x_firewall *fw, struct host1x_job_gather *g) fw->class = word >> 6 & 0x3ff; fw->mask = word & 0x3f; fw->reg = word >> 16 & 0xfff;
       err = check_mask(fw);
       err = check_class(fw, job_class);
       if (!err)
           err = check_mask(fw);
        if (err)
            goto out;
        break;
diff --git a/include/linux/host1x.h b/include/linux/host1x.h index aa323e43ae4e..561d6bb6580d 100644 --- a/include/linux/host1x.h +++ b/include/linux/host1x.h @@ -233,7 +233,10 @@ struct host1x_job { u8 *gather_copy_mapped; /* Check if register is marked as an address reg */

int (*is_addr_reg)(struct device *dev, u32 reg, u32 class);

int (*is_addr_reg)(struct device *dev, u32 class, u32 reg);

/* Check if class belongs to the unit */

int (*is_valid_class)(u32 class); /* Request a SETCLASS to this class */ u32 class;

-- Dmitry

Dmitry Osipenko

23 May 23 May

12:14 a.m.

New subject: [PATCH 17/22] gpu: host1x: Check waits in the firewall

Check waits in the firewall in a way it is done for relocations.

Signed-off-by: Dmitry Osipenko digetx@gmail.com --- drivers/gpu/host1x/job.c | 36 ++++++++++++++++++++++++++++++++++-- 1 file changed, 34 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/host1x/job.c b/drivers/gpu/host1x/job.c index 65e12219405a..7bc7d0c64559 100644 --- a/drivers/gpu/host1x/job.c +++ b/drivers/gpu/host1x/job.c @@ -31,6 +31,8 @@ #include "job.h" #include "syncpt.h"

+#define HOST1X_WAIT_SYNCPT_OFFSET 0x8 + struct host1x_job *host1x_job_alloc(struct host1x_channel *ch, u32 num_cmdbufs, u32 num_relocs, u32 num_waitchks) @@ -339,6 +341,17 @@ static bool check_reloc(struct host1x_reloc *reloc, struct host1x_bo *cmdbuf, return true; }

+static bool check_wait(struct host1x_waitchk *wait, struct host1x_bo *cmdbuf, + unsigned int offset) +{ + offset *= sizeof(u32); + + if (wait->bo != cmdbuf || wait->offset != offset) + return false; + + return true; +} + struct host1x_firewall { struct host1x_job *job; struct device *dev; @@ -346,6 +359,9 @@ struct host1x_firewall { unsigned int num_relocs; struct host1x_reloc *reloc;

+ unsigned int num_waitchks; + struct host1x_waitchk *waitchk; + struct host1x_bo *cmdbuf; unsigned int offset;

@@ -372,6 +388,20 @@ static int check_register(struct host1x_firewall *fw, unsigned long offset) fw->reloc++; }

+ if (offset == HOST1X_WAIT_SYNCPT_OFFSET) { + if (fw->class != HOST1X_CLASS_HOST1X) + return -EINVAL; + + if (!fw->num_waitchks) + return -EINVAL; + + if (!check_wait(fw->waitchk, fw->cmdbuf, fw->offset)) + return -EINVAL; + + fw->num_waitchks--; + fw->waitchk++; + } + return 0; }

@@ -536,6 +566,8 @@ static inline int copy_gathers(struct host1x_job *job, struct device *dev) fw.dev = dev; fw.reloc = job->relocarray; fw.num_relocs = job->num_relocs; + fw.waitchk = job->waitchk; + fw.num_waitchks = job->num_waitchk; fw.class = job->class;

for (i = 0; i < job->num_gathers; i++) { @@ -574,8 +606,8 @@ static inline int copy_gathers(struct host1x_job *job, struct device *dev) offset += g->words * sizeof(u32); }

- /* No relocs should remain at this point */ - if (fw.num_relocs) + /* No relocs and waitchks should remain at this point */ + if (fw.num_relocs || fw.num_waitchks) return -EINVAL;

return 0;

-- 2.13.0

Mikko Perttunen

1 Jun 1 Jun

5:51 p.m.

New subject: [PATCH 17/22] gpu: host1x: Check waits in the firewall

Reviewed-by: Mikko Perttunen mperttunen@nvidia.com

On 05/23/2017 03:14 AM, Dmitry Osipenko wrote:

...

Check waits in the firewall in a way it is done for relocations.

Signed-off-by: Dmitry Osipenko digetx@gmail.com

drivers/gpu/host1x/job.c | 36 ++++++++++++++++++++++++++++++++++-- 1 file changed, 34 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/host1x/job.c b/drivers/gpu/host1x/job.c index 65e12219405a..7bc7d0c64559 100644 --- a/drivers/gpu/host1x/job.c +++ b/drivers/gpu/host1x/job.c @@ -31,6 +31,8 @@ #include "job.h" #include "syncpt.h"

+#define HOST1X_WAIT_SYNCPT_OFFSET 0x8

struct host1x_job *host1x_job_alloc(struct host1x_channel *ch, u32 num_cmdbufs, u32 num_relocs, u32 num_waitchks)

@@ -339,6 +341,17 @@ static bool check_reloc(struct host1x_reloc *reloc, struct host1x_bo *cmdbuf, return true; }

+static bool check_wait(struct host1x_waitchk *wait, struct host1x_bo *cmdbuf,
       unsigned int offset)
+{
offset *= sizeof(u32);

if (wait->bo != cmdbuf || wait->offset != offset)
return false > +
return true;
+}

struct host1x_firewall { struct host1x_job *job; struct device *dev;

@@ -346,6 +359,9 @@ struct host1x_firewall { unsigned int num_relocs; struct host1x_reloc *reloc;

unsigned int num_waitchks;

struct host1x_waitchk *waitchk;

struct host1x_bo *cmdbuf; unsigned int offset;

@@ -372,6 +388,20 @@ static int check_register(struct host1x_firewall *fw, unsigned long offset) fw->reloc++; }
if (offset == HOST1X_WAIT_SYNCPT_OFFSET) {
if (fw->class != HOST1X_CLASS_HOST1X)
	return -EINVAL;
if (!fw->num_waitchks)
	return -EINVAL;
if (!check_wait(fw->waitchk, fw->cmdbuf, fw->offset))
	return -EINVAL;
fw->num_waitchks--;
fw->waitchk++;
}

return 0; }
@@ -536,6 +566,8 @@ static inline int copy_gathers(struct host1x_job *job, struct device *dev) fw.dev = dev; fw.reloc = job->relocarray; fw.num_relocs = job->num_relocs;

fw.waitchk = job->waitchk;

fw.num_waitchks = job->num_waitchk; fw.class = job->class;

for (i = 0; i < job->num_gathers; i++) {

@@ -574,8 +606,8 @@ static inline int copy_gathers(struct host1x_job *job, struct device *dev) offset += g->words * sizeof(u32); }

/* No relocs should remain at this point */

if (fw.num_relocs)

/* No relocs and waitchks should remain at this point */

if (fw.num_relocs || fw.num_waitchks) return -EINVAL;

return 0;

Dmitry Osipenko

23 May 23 May

12:14 a.m.

New subject: [PATCH 18/22] gpu: host1x: Remove unused 'struct host1x_cmdbuf'

The struct host1x_cmdbuf is unused, let's remove it.

Signed-off-by: Dmitry Osipenko digetx@gmail.com --- drivers/gpu/host1x/job.h | 7 ------- 1 file changed, 7 deletions(-)

diff --git a/drivers/gpu/host1x/job.h b/drivers/gpu/host1x/job.h index 0debd93a1849..4bda51d503ec 100644 --- a/drivers/gpu/host1x/job.h +++ b/drivers/gpu/host1x/job.h @@ -27,13 +27,6 @@ struct host1x_job_gather { bool handled; };

-struct host1x_cmdbuf { - u32 handle; - u32 offset; - u32 words; - u32 pad; -}; - struct host1x_job_unpin_data { struct host1x_bo *bo; struct sg_table *sgt;

-- 2.13.0

Erik Faye-Lund

12:42 a.m.

New subject: [PATCH 18/22] gpu: host1x: Remove unused 'struct host1x_cmdbuf'

On Tue, May 23, 2017 at 2:14 AM, Dmitry Osipenko digetx@gmail.com wrote:

...

The struct host1x_cmdbuf is unused, let's remove it.

Signed-off-by: Dmitry Osipenko digetx@gmail.com

drivers/gpu/host1x/job.h | 7 ------- 1 file changed, 7 deletions(-)

diff --git a/drivers/gpu/host1x/job.h b/drivers/gpu/host1x/job.h index 0debd93a1849..4bda51d503ec 100644 --- a/drivers/gpu/host1x/job.h +++ b/drivers/gpu/host1x/job.h @@ -27,13 +27,6 @@ struct host1x_job_gather { bool handled; };

-struct host1x_cmdbuf {
  u32 handle;
  u32 offset;
  u32 words;
  u32 pad;
-};

struct host1x_job_unpin_data { struct host1x_bo *bo; struct sg_table *sgt; -- 2.13.0

Reviewed-by: Erik Faye-Lund kusmabite@gmail.com

Mikko Perttunen

1 Jun 1 Jun

5:52 p.m.

New subject: [PATCH 18/22] gpu: host1x: Remove unused 'struct host1x_cmdbuf'

Reviewed-by: Mikko Perttunen mperttunen@nvidia.com

On 05/23/2017 03:14 AM, Dmitry Osipenko wrote:

...

The struct host1x_cmdbuf is unused, let's remove it.

Signed-off-by: Dmitry Osipenko digetx@gmail.com

drivers/gpu/host1x/job.h | 7 ------- 1 file changed, 7 deletions(-)

diff --git a/drivers/gpu/host1x/job.h b/drivers/gpu/host1x/job.h index 0debd93a1849..4bda51d503ec 100644 --- a/drivers/gpu/host1x/job.h +++ b/drivers/gpu/host1x/job.h @@ -27,13 +27,6 @@ struct host1x_job_gather { bool handled; };

-struct host1x_cmdbuf {

u32 handle;

u32 offset;

u32 words;

u32 pad;

-};

struct host1x_job_unpin_data { struct host1x_bo *bo; struct sg_table *sgt;

Dmitry Osipenko

23 May 23 May

12:14 a.m.

New subject: [PATCH 19/22] gpu: host1x: Remove unused host1x_cdma_stop() definition

There is no host1x_cdma_stop() in the code, let's remove its definition from the header file.

Signed-off-by: Dmitry Osipenko digetx@gmail.com --- drivers/gpu/host1x/cdma.h | 1 - 1 file changed, 1 deletion(-)

diff --git a/drivers/gpu/host1x/cdma.h b/drivers/gpu/host1x/cdma.h index ec170a78f4e1..286d49386be9 100644 --- a/drivers/gpu/host1x/cdma.h +++ b/drivers/gpu/host1x/cdma.h @@ -88,7 +88,6 @@ struct host1x_cdma {

int host1x_cdma_init(struct host1x_cdma *cdma); int host1x_cdma_deinit(struct host1x_cdma *cdma); -void host1x_cdma_stop(struct host1x_cdma *cdma); int host1x_cdma_begin(struct host1x_cdma *cdma, struct host1x_job *job); void host1x_cdma_push(struct host1x_cdma *cdma, u32 op1, u32 op2); void host1x_cdma_end(struct host1x_cdma *cdma, struct host1x_job *job);

-- 2.13.0

Erik Faye-Lund

12:43 a.m.

New subject: [PATCH 19/22] gpu: host1x: Remove unused host1x_cdma_stop() definition

On Tue, May 23, 2017 at 2:14 AM, Dmitry Osipenko digetx@gmail.com wrote:

...

There is no host1x_cdma_stop() in the code, let's remove its definition from the header file.

Signed-off-by: Dmitry Osipenko digetx@gmail.com

drivers/gpu/host1x/cdma.h | 1 - 1 file changed, 1 deletion(-)

diff --git a/drivers/gpu/host1x/cdma.h b/drivers/gpu/host1x/cdma.h index ec170a78f4e1..286d49386be9 100644 --- a/drivers/gpu/host1x/cdma.h +++ b/drivers/gpu/host1x/cdma.h @@ -88,7 +88,6 @@ struct host1x_cdma {

int host1x_cdma_init(struct host1x_cdma *cdma); int host1x_cdma_deinit(struct host1x_cdma *cdma); -void host1x_cdma_stop(struct host1x_cdma *cdma); int host1x_cdma_begin(struct host1x_cdma *cdma, struct host1x_job *job); void host1x_cdma_push(struct host1x_cdma *cdma, u32 op1, u32 op2); void host1x_cdma_end(struct host1x_cdma *cdma, struct host1x_job *job); -- 2.13.0

Reviewed-by: Erik Faye-Lund kusmabite@gmail.com

Mikko Perttunen

1 Jun 1 Jun

5:52 p.m.

New subject: [PATCH 19/22] gpu: host1x: Remove unused host1x_cdma_stop() definition

Reviewed-by: Mikko Perttunen mperttunen@nvidia.com

On 05/23/2017 03:14 AM, Dmitry Osipenko wrote:

...

There is no host1x_cdma_stop() in the code, let's remove its definition from the header file.

Signed-off-by: Dmitry Osipenko digetx@gmail.com

drivers/gpu/host1x/cdma.h | 1 - 1 file changed, 1 deletion(-)

diff --git a/drivers/gpu/host1x/cdma.h b/drivers/gpu/host1x/cdma.h index ec170a78f4e1..286d49386be9 100644 --- a/drivers/gpu/host1x/cdma.h +++ b/drivers/gpu/host1x/cdma.h @@ -88,7 +88,6 @@ struct host1x_cdma {

int host1x_cdma_init(struct host1x_cdma *cdma); int host1x_cdma_deinit(struct host1x_cdma *cdma); -void host1x_cdma_stop(struct host1x_cdma *cdma); int host1x_cdma_begin(struct host1x_cdma *cdma, struct host1x_job *job); void host1x_cdma_push(struct host1x_cdma *cdma, u32 op1, u32 op2); void host1x_cdma_end(struct host1x_cdma *cdma, struct host1x_job *job);

Dmitry Osipenko

23 May 23 May

12:14 a.m.

New subject: [PATCH 20/22] gpu: host1x: Refactor channel allocation code

From: Mikko Perttunen mperttunen@nvidia.com

This is largely a rewrite of the Host1x channel allocation code, bringing several changes:

- The previous code could deadlock due to an interaction between the 'reflock' mutex and CDMA timeout handling. This gets rid of the mutex. - Support for more than 32 channels, required for Tegra186 - General refactoring, including better encapsulation of channel ownership handling into channel.c

Signed-off-by: Mikko Perttunen mperttunen@nvidia.com Reviewed-by: Dmitry Osipenko digetx@gmail.com Tested-by: Dmitry Osipenko digetx@gmail.com --- drivers/gpu/drm/tegra/gr2d.c | 4 +- drivers/gpu/drm/tegra/gr3d.c | 4 +- drivers/gpu/drm/tegra/vic.c | 4 +- drivers/gpu/host1x/channel.c | 147 +++++++++++++++++++++++-------------- drivers/gpu/host1x/channel.h | 21 ++++-- drivers/gpu/host1x/debug.c | 47 +++++------- drivers/gpu/host1x/dev.c | 7 +- drivers/gpu/host1x/dev.h | 6 +- drivers/gpu/host1x/hw/channel_hw.c | 4 - include/linux/host1x.h | 1 - 10 files changed, 135 insertions(+), 110 deletions(-)

diff --git a/drivers/gpu/drm/tegra/gr2d.c b/drivers/gpu/drm/tegra/gr2d.c index 782231c41a1a..693fd2b3aef6 100644 --- a/drivers/gpu/drm/tegra/gr2d.c +++ b/drivers/gpu/drm/tegra/gr2d.c @@ -38,7 +38,7 @@ static int gr2d_init(struct host1x_client *client)

client->syncpts[0] = host1x_syncpt_request(client->dev, flags); if (!client->syncpts[0]) { - host1x_channel_free(gr2d->channel); + host1x_channel_put(gr2d->channel); return -ENOMEM; }

@@ -57,7 +57,7 @@ static int gr2d_exit(struct host1x_client *client) return err;

host1x_syncpt_free(client->syncpts[0]); - host1x_channel_free(gr2d->channel); + host1x_channel_put(gr2d->channel);

return 0; } diff --git a/drivers/gpu/drm/tegra/gr3d.c b/drivers/gpu/drm/tegra/gr3d.c index 13f0d1b7cd98..cee2ab645cde 100644 --- a/drivers/gpu/drm/tegra/gr3d.c +++ b/drivers/gpu/drm/tegra/gr3d.c @@ -48,7 +48,7 @@ static int gr3d_init(struct host1x_client *client)

client->syncpts[0] = host1x_syncpt_request(client->dev, flags); if (!client->syncpts[0]) { - host1x_channel_free(gr3d->channel); + host1x_channel_put(gr3d->channel); return -ENOMEM; }

@@ -67,7 +67,7 @@ static int gr3d_exit(struct host1x_client *client) return err;

host1x_syncpt_free(client->syncpts[0]); - host1x_channel_free(gr3d->channel); + host1x_channel_put(gr3d->channel);

return 0; } diff --git a/drivers/gpu/drm/tegra/vic.c b/drivers/gpu/drm/tegra/vic.c index cd804e404a11..47cb1aaa58b1 100644 --- a/drivers/gpu/drm/tegra/vic.c +++ b/drivers/gpu/drm/tegra/vic.c @@ -182,7 +182,7 @@ static int vic_init(struct host1x_client *client) free_syncpt: host1x_syncpt_free(client->syncpts[0]); free_channel: - host1x_channel_free(vic->channel); + host1x_channel_put(vic->channel); detach_device: if (tegra->domain) iommu_detach_device(tegra->domain, vic->dev); @@ -203,7 +203,7 @@ static int vic_exit(struct host1x_client *client) return err;

host1x_syncpt_free(client->syncpts[0]); - host1x_channel_free(vic->channel); + host1x_channel_put(vic->channel);

if (vic->domain) { iommu_detach_device(vic->domain, vic->dev); diff --git a/drivers/gpu/host1x/channel.c b/drivers/gpu/host1x/channel.c index 8f437d924c10..db9b91d1384c 100644 --- a/drivers/gpu/host1x/channel.c +++ b/drivers/gpu/host1x/channel.c @@ -24,19 +24,33 @@ #include "job.h"

/* Constructor for the host1x device list */ -int host1x_channel_list_init(struct host1x *host) +int host1x_channel_list_init(struct host1x_channel_list *chlist, + unsigned int num_channels) { - INIT_LIST_HEAD(&host->chlist.list); - mutex_init(&host->chlist_mutex); - - if (host->info->nb_channels > BITS_PER_LONG) { - WARN(1, "host1x hardware has more channels than supported by the driver\n"); - return -ENOSYS; + chlist->channels = kcalloc(num_channels, sizeof(struct host1x_channel), + GFP_KERNEL); + if (!chlist->channels) + return -ENOMEM; + + chlist->allocated_channels = + kcalloc(BITS_TO_LONGS(num_channels), sizeof(unsigned long), + GFP_KERNEL); + if (!chlist->allocated_channels) { + kfree(chlist->channels); + return -ENOMEM; }

+ bitmap_zero(chlist->allocated_channels, num_channels); + return 0; }

+void host1x_channel_list_free(struct host1x_channel_list *chlist) +{ + kfree(chlist->allocated_channels); + kfree(chlist->channels); +} + int host1x_job_submit(struct host1x_job *job) { struct host1x *host = dev_get_drvdata(job->channel->dev->parent); @@ -47,86 +61,107 @@ EXPORT_SYMBOL(host1x_job_submit);

struct host1x_channel *host1x_channel_get(struct host1x_channel *channel) { - int err = 0; + kref_get(&channel->refcount);

- mutex_lock(&channel->reflock); + return channel; +} +EXPORT_SYMBOL(host1x_channel_get);

- if (channel->refcount == 0) - err = host1x_cdma_init(&channel->cdma); +/** + * host1x_channel_get_index() - Attempt to get channel reference by index + * @host: Host1x device object + * @index: Index of channel + * + * If channel number @index is currently allocated, increase its refcount + * and return a pointer to it. Otherwise, return NULL. + */ +struct host1x_channel *host1x_channel_get_index(struct host1x *host, + unsigned int index) +{ + struct host1x_channel *ch = &host->channel_list.channels[index];

- if (!err) - channel->refcount++; + if (!kref_get_unless_zero(&ch->refcount)) + return NULL;

- mutex_unlock(&channel->reflock); + return ch; +} + +static void release_channel(struct kref *kref) +{ + struct host1x_channel *channel = + container_of(kref, struct host1x_channel, refcount); + struct host1x *host = dev_get_drvdata(channel->dev->parent); + struct host1x_channel_list *chlist = &host->channel_list; + + host1x_hw_cdma_stop(host, &channel->cdma); + host1x_cdma_deinit(&channel->cdma);

- return err ? NULL : channel; + clear_bit(channel->id, chlist->allocated_channels); } -EXPORT_SYMBOL(host1x_channel_get);

void host1x_channel_put(struct host1x_channel *channel) { - mutex_lock(&channel->reflock); + kref_put(&channel->refcount, release_channel); +} +EXPORT_SYMBOL(host1x_channel_put);

- if (channel->refcount == 1) { - struct host1x *host = dev_get_drvdata(channel->dev->parent); +static struct host1x_channel *acquire_unused_channel(struct host1x *host) +{ + struct host1x_channel_list *chlist = &host->channel_list; + unsigned int max_channels = host->info->nb_channels; + unsigned int index;

- host1x_hw_cdma_stop(host, &channel->cdma); - host1x_cdma_deinit(&channel->cdma); + index = find_first_zero_bit(chlist->allocated_channels, max_channels); + if (index >= max_channels) { + dev_err(host->dev, "failed to find free channel\n"); + return NULL; }

- channel->refcount--; + chlist->channels[index].id = index;

- mutex_unlock(&channel->reflock); + set_bit(index, chlist->allocated_channels); + + return &chlist->channels[index]; } -EXPORT_SYMBOL(host1x_channel_put);

+/** + * host1x_channel_request() - Allocate a channel + * @device: Host1x unit this channel will be used to send commands to + * + * Allocates a new host1x channel for @device. If there are no free channels, + * this will sleep until one becomes available. May return NULL if CDMA + * initialization fails. + */ struct host1x_channel *host1x_channel_request(struct device *dev) { struct host1x *host = dev_get_drvdata(dev->parent); - unsigned int max_channels = host->info->nb_channels; - struct host1x_channel *channel = NULL; - unsigned long index; + struct host1x_channel_list *chlist = &host->channel_list; + struct host1x_channel *channel; int err;

- mutex_lock(&host->chlist_mutex); + channel = acquire_unused_channel(host); + if (!channel) + return NULL;

- index = find_first_zero_bit(&host->allocated_channels, max_channels); - if (index >= max_channels) - goto fail; + kref_init(&channel->refcount); + mutex_init(&channel->submitlock); + channel->dev = dev;

- channel = kzalloc(sizeof(*channel), GFP_KERNEL); - if (!channel) + err = host1x_hw_channel_init(host, channel, channel->id); + if (err < 0) goto fail;

- err = host1x_hw_channel_init(host, channel, index); + err = host1x_cdma_init(&channel->cdma); if (err < 0) goto fail;

- /* Link device to host1x_channel */ - channel->dev = dev; - - /* Add to channel list */ - list_add_tail(&channel->list, &host->chlist.list); - - host->allocated_channels |= BIT(index); - - mutex_unlock(&host->chlist_mutex); return channel;

fail: - dev_err(dev, "failed to init channel\n"); - kfree(channel); - mutex_unlock(&host->chlist_mutex); - return NULL; -} -EXPORT_SYMBOL(host1x_channel_request); + clear_bit(channel->id, chlist->allocated_channels);

-void host1x_channel_free(struct host1x_channel *channel) -{ - struct host1x *host = dev_get_drvdata(channel->dev->parent); + dev_err(dev, "failed to initialize channel\n");

- host->allocated_channels &= ~BIT(channel->id); - list_del(&channel->list); - kfree(channel); + return NULL; } -EXPORT_SYMBOL(host1x_channel_free); +EXPORT_SYMBOL(host1x_channel_request); diff --git a/drivers/gpu/host1x/channel.h b/drivers/gpu/host1x/channel.h index df767cf90d51..7068e42d42df 100644 --- a/drivers/gpu/host1x/channel.h +++ b/drivers/gpu/host1x/channel.h @@ -20,17 +20,21 @@ #define __HOST1X_CHANNEL_H

#include <linux/io.h> +#include <linux/kref.h>

#include "cdma.h"

struct host1x; +struct host1x_channel;

-struct host1x_channel { - struct list_head list; +struct host1x_channel_list { + struct host1x_channel *channels; + unsigned long *allocated_channels; +};

- unsigned int refcount; +struct host1x_channel { + struct kref refcount; unsigned int id; - struct mutex reflock; struct mutex submitlock; void __iomem *regs; struct device *dev; @@ -38,9 +42,10 @@ struct host1x_channel { };

/* channel list operations */ -int host1x_channel_list_init(struct host1x *host); - -#define host1x_for_each_channel(host, channel) \ - list_for_each_entry(channel, &host->chlist.list, list) +int host1x_channel_list_init(struct host1x_channel_list *chlist, + unsigned int num_channels); +void host1x_channel_list_free(struct host1x_channel_list *chlist); +struct host1x_channel *host1x_channel_get_index(struct host1x *host, + unsigned int index);

#endif diff --git a/drivers/gpu/host1x/debug.c b/drivers/gpu/host1x/debug.c index d9330fcc62ad..2aae0e63214c 100644 --- a/drivers/gpu/host1x/debug.c +++ b/drivers/gpu/host1x/debug.c @@ -43,24 +43,19 @@ void host1x_debug_output(struct output *o, const char *fmt, ...) o->fn(o->ctx, o->buf, len); }

-static int show_channels(struct host1x_channel *ch, void *data, bool show_fifo) +static int show_channel(struct host1x_channel *ch, void *data, bool show_fifo) { struct host1x *m = dev_get_drvdata(ch->dev->parent); struct output *o = data;

- mutex_lock(&ch->reflock); + mutex_lock(&ch->cdma.lock);

- if (ch->refcount) { - mutex_lock(&ch->cdma.lock); + if (show_fifo) + host1x_hw_show_channel_fifo(m, ch, o);

- if (show_fifo) - host1x_hw_show_channel_fifo(m, ch, o); + host1x_hw_show_channel_cdma(m, ch, o);

- host1x_hw_show_channel_cdma(m, ch, o); - mutex_unlock(&ch->cdma.lock); - } - - mutex_unlock(&ch->reflock); + mutex_unlock(&ch->cdma.lock);

return 0; } @@ -94,28 +89,22 @@ static void show_syncpts(struct host1x *m, struct output *o) host1x_debug_output(o, "\n"); }

-static void show_all(struct host1x *m, struct output *o) +static void show_all(struct host1x *m, struct output *o, bool show_fifo) { - struct host1x_channel *ch; + int i;

host1x_hw_show_mlocks(m, o); show_syncpts(m, o); host1x_debug_output(o, "---- channels ----\n");

- host1x_for_each_channel(m, ch) - show_channels(ch, o, true); -} - -static void show_all_no_fifo(struct host1x *host1x, struct output *o) -{ - struct host1x_channel *ch; - - host1x_hw_show_mlocks(host1x, o); - show_syncpts(host1x, o); - host1x_debug_output(o, "---- channels ----\n"); + for (i = 0; i < m->info->nb_channels; ++i) { + struct host1x_channel *ch = host1x_channel_get_index(m, i);

- host1x_for_each_channel(host1x, ch) - show_channels(ch, o, false); + if (ch) { + show_channel(ch, o, show_fifo); + host1x_channel_put(ch); + } + } }

static int host1x_debug_show_all(struct seq_file *s, void *unused) @@ -125,7 +114,7 @@ static int host1x_debug_show_all(struct seq_file *s, void *unused) .ctx = s };

- show_all(s->private, &o); + show_all(s->private, &o, true);

return 0; } @@ -137,7 +126,7 @@ static int host1x_debug_show(struct seq_file *s, void *unused) .ctx = s };

- show_all_no_fifo(s->private, &o); + show_all(s->private, &o, false);

return 0; } @@ -216,7 +205,7 @@ void host1x_debug_dump(struct host1x *host1x) .fn = write_to_printk };

- show_all(host1x, &o); + show_all(host1x, &o, true); }

void host1x_debug_dump_syncpts(struct host1x *host1x) diff --git a/drivers/gpu/host1x/dev.c b/drivers/gpu/host1x/dev.c index f05ebb14fa63..5c1c711a21af 100644 --- a/drivers/gpu/host1x/dev.c +++ b/drivers/gpu/host1x/dev.c @@ -198,7 +198,8 @@ static int host1x_probe(struct platform_device *pdev) host->iova_end = geometry->aperture_end; }

- err = host1x_channel_list_init(host); + err = host1x_channel_list_init(&host->channel_list, + host->info->nb_channels); if (err) { dev_err(&pdev->dev, "failed to initialize channel list\n"); goto fail_detach_device; @@ -207,7 +208,7 @@ static int host1x_probe(struct platform_device *pdev) err = clk_prepare_enable(host->clk); if (err < 0) { dev_err(&pdev->dev, "failed to enable clock\n"); - goto fail_detach_device; + goto fail_free_channels; }

err = reset_control_deassert(host->rst); @@ -244,6 +245,8 @@ static int host1x_probe(struct platform_device *pdev) reset_control_assert(host->rst); fail_unprepare_disable: clk_disable_unprepare(host->clk); +fail_free_channels: + host1x_channel_list_free(&host->channel_list); fail_detach_device: if (host->domain) { put_iova_domain(&host->iova); diff --git a/drivers/gpu/host1x/dev.h b/drivers/gpu/host1x/dev.h index 229d08b6a45e..ffdbc15b749b 100644 --- a/drivers/gpu/host1x/dev.h +++ b/drivers/gpu/host1x/dev.h @@ -129,10 +129,8 @@ struct host1x { struct host1x_syncpt *nop_sp;

struct mutex syncpt_mutex; - struct mutex chlist_mutex; - struct host1x_channel chlist; - unsigned long allocated_channels; - unsigned int num_allocated_channels; + + struct host1x_channel_list channel_list;

struct dentry *debugfs;

diff --git a/drivers/gpu/host1x/hw/channel_hw.c b/drivers/gpu/host1x/hw/channel_hw.c index 5e8df78b7acd..8447a56c41ca 100644 --- a/drivers/gpu/host1x/hw/channel_hw.c +++ b/drivers/gpu/host1x/hw/channel_hw.c @@ -181,10 +181,6 @@ static int channel_submit(struct host1x_job *job) static int host1x_channel_init(struct host1x_channel *ch, struct host1x *dev, unsigned int index) { - ch->id = index; - mutex_init(&ch->reflock); - mutex_init(&ch->submitlock); - ch->regs = dev->regs + index * HOST1X_CHANNEL_SIZE; return 0; } diff --git a/include/linux/host1x.h b/include/linux/host1x.h index 561d6bb6580d..5dd8df2f8810 100644 --- a/include/linux/host1x.h +++ b/include/linux/host1x.h @@ -156,7 +156,6 @@ struct host1x_channel; struct host1x_job;

struct host1x_channel *host1x_channel_request(struct device *dev); -void host1x_channel_free(struct host1x_channel *channel); struct host1x_channel *host1x_channel_get(struct host1x_channel *channel); void host1x_channel_put(struct host1x_channel *channel); int host1x_job_submit(struct host1x_job *job);

-- 2.13.0

Dmitry Osipenko

12:16 a.m.

New subject: [PATCH 21/22] drm/tegra: Don't use IOMMU on Tegra20

There is no IOMMU on Tegra20, instead a GART would be picked as an IOMMU provider.

Signed-off-by: Dmitry Osipenko digetx@gmail.com --- drivers/gpu/drm/tegra/drm.c | 5 ++++- drivers/gpu/host1x/dev.c | 5 ++++- 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/tegra/drm.c b/drivers/gpu/drm/tegra/drm.c index 17416e1c219a..ac8f76d9475f 100644 --- a/drivers/gpu/drm/tegra/drm.c +++ b/drivers/gpu/drm/tegra/drm.c @@ -15,6 +15,8 @@ #include <drm/drm_atomic.h> #include <drm/drm_atomic_helper.h>

+#include <soc/tegra/fuse.h> + #include "drm.h" #include "gem.h"

@@ -131,7 +133,8 @@ static int tegra_drm_load(struct drm_device *drm, unsigned long flags) if (!tegra) return -ENOMEM;

- if (iommu_present(&platform_bus_type)) { + if (iommu_present(&platform_bus_type) && + tegra_get_chip_id() != TEGRA20) { u64 carveout_start, carveout_end, gem_start, gem_end; struct iommu_domain_geometry *geometry; unsigned long order; diff --git a/drivers/gpu/host1x/dev.c b/drivers/gpu/host1x/dev.c index 5c1c711a21af..56b7e6d51894 100644 --- a/drivers/gpu/host1x/dev.c +++ b/drivers/gpu/host1x/dev.c @@ -25,6 +25,8 @@ #include <linux/of.h> #include <linux/slab.h>

+#include <soc/tegra/fuse.h> + #define CREATE_TRACE_POINTS #include <trace/events/host1x.h> #undef CREATE_TRACE_POINTS @@ -177,7 +179,8 @@ static int host1x_probe(struct platform_device *pdev) return err; }

- if (iommu_present(&platform_bus_type)) { + if (iommu_present(&platform_bus_type) && + tegra_get_chip_id() != TEGRA20) { struct iommu_domain_geometry *geometry; unsigned long order;

-- 2.13.0

Dmitry Osipenko

12:16 a.m.

New subject: [PATCH 22/22] Revert "iommu/tegra: gart: Do not register with bus"

The GART driver was disabled because it was picked by as an IOMMU provider for the DRM driver on Tegra20, which is not the purpose of the GART. Now DRM driver avoids to use IOMMU on Tegra20, so the GART driver can be re-enabled. Potentially there are interesting use cases of the GART for Tegra20, like mmap'ing of a fallback memory allocation and using its GART aperture for an accelerated graphics operation.

This reverts commit c7e3ca515e784a82223f447b79eb2090bdb91378.

Signed-off-by: Dmitry Osipenko digetx@gmail.com --- drivers/iommu/tegra-gart.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/tegra-gart.c b/drivers/iommu/tegra-gart.c index 37e708fdbb5a..430d13bf4f95 100644 --- a/drivers/iommu/tegra-gart.c +++ b/drivers/iommu/tegra-gart.c @@ -422,7 +422,7 @@ static int tegra_gart_probe(struct platform_device *pdev) do_gart_setup(gart, NULL);

gart_handle = gart; - + bus_set_iommu(&platform_bus_type, &gart_iommu_ops); return 0; }

-- 2.13.0

Joerg Roedel

30 May 30 May

9:21 a.m.

New subject: [PATCH 22/22] Revert "iommu/tegra: gart: Do not register with bus"

On Tue, May 23, 2017 at 03:16:33AM +0300, Dmitry Osipenko wrote:

...

The GART driver was disabled because it was picked by as an IOMMU provider for the DRM driver on Tegra20, which is not the purpose of the GART. Now DRM driver avoids to use IOMMU on Tegra20, so the GART driver can be re-enabled. Potentially there are interesting use cases of the GART for Tegra20, like mmap'ing of a fallback memory allocation and using its GART aperture for an accelerated graphics operation.

This reverts commit c7e3ca515e784a82223f447b79eb2090bdb91378.

Signed-off-by: Dmitry Osipenko digetx@gmail.com

drivers/iommu/tegra-gart.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)

Acked-by: Joerg Roedel jroedel@suse.de

Joerg Roedel

9:21 a.m.

New subject: [PATCH 21/22] drm/tegra: Don't use IOMMU on Tegra20

On Tue, May 23, 2017 at 03:16:32AM +0300, Dmitry Osipenko wrote:

...

There is no IOMMU on Tegra20, instead a GART would be picked as an IOMMU provider.

Signed-off-by: Dmitry Osipenko digetx@gmail.com

drivers/gpu/drm/tegra/drm.c | 5 ++++- drivers/gpu/host1x/dev.c | 5 ++++- 2 files changed, 8 insertions(+), 2 deletions(-)

Not sure what the context here is or if the patches need to merged with the rest of the patch-set, so:

Acked-by: Joerg Roedel jroedel@suse.de

Dmitry Osipenko

10:08 a.m.

New subject: [PATCH 21/22] drm/tegra: Don't use IOMMU on Tegra20

On 30.05.2017 12:21, Joerg Roedel wrote:

...

On Tue, May 23, 2017 at 03:16:32AM +0300, Dmitry Osipenko wrote:

...
There is no IOMMU on Tegra20, instead a GART would be picked as an IOMMU provider.

Signed-off-by: Dmitry Osipenko digetx@gmail.com

drivers/gpu/drm/tegra/drm.c | 5 ++++- drivers/gpu/host1x/dev.c | 5 ++++- 2 files changed, 8 insertions(+), 2 deletions(-)

Not sure what the context here is or if the patches need to merged with the rest of the patch-set, so:

Acked-by: Joerg Roedel jroedel@suse.de

Probably I would have to add you on the CC to the cover latter, sorry. These two patches do not depend on the rest of the patchset, I just aggregated all the patches into a series for ease of managing them for me and maintainers. Thank you for the ack!

-- Dmitry

Dmitry Osipenko

1 Jun 1 Jun

8:36 a.m.

New subject: [PATCH 21/22] drm/tegra: Don't use IOMMU on Tegra20

On 23.05.2017 03:16, Dmitry Osipenko wrote:

...

There is no IOMMU on Tegra20, instead a GART would be picked as an IOMMU provider.

Signed-off-by: Dmitry Osipenko digetx@gmail.com

drivers/gpu/drm/tegra/drm.c | 5 ++++- drivers/gpu/host1x/dev.c | 5 ++++- 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/tegra/drm.c b/drivers/gpu/drm/tegra/drm.c index 17416e1c219a..ac8f76d9475f 100644 --- a/drivers/gpu/drm/tegra/drm.c +++ b/drivers/gpu/drm/tegra/drm.c @@ -15,6 +15,8 @@ #include <drm/drm_atomic.h> #include <drm/drm_atomic_helper.h>

+#include <soc/tegra/fuse.h>

#include "drm.h" #include "gem.h"

@@ -131,7 +133,8 @@ static int tegra_drm_load(struct drm_device *drm, unsigned long flags) if (!tegra) return -ENOMEM;

if (iommu_present(&platform_bus_type)) {
if (iommu_present(&platform_bus_type) &&
   tegra_get_chip_id() != TEGRA20) {
u64 carveout_start, carveout_end, gem_start, gem_end; struct iommu_domain_geometry *geometry; unsigned long order;
diff --git a/drivers/gpu/host1x/dev.c b/drivers/gpu/host1x/dev.c index 5c1c711a21af..56b7e6d51894 100644 --- a/drivers/gpu/host1x/dev.c +++ b/drivers/gpu/host1x/dev.c @@ -25,6 +25,8 @@ #include <linux/of.h> #include <linux/slab.h>

+#include <soc/tegra/fuse.h>

#define CREATE_TRACE_POINTS #include <trace/events/host1x.h> #undef CREATE_TRACE_POINTS @@ -177,7 +179,8 @@ static int host1x_probe(struct platform_device *pdev) return err; }

if (iommu_present(&platform_bus_type)) {
if (iommu_present(&platform_bus_type) &&
   tegra_get_chip_id() != TEGRA20) {
struct iommu_domain_geometry *geometry; unsigned long order;

Nicolas noticed that this breaks driver module compilation, as the `tegra_get_chip_id` is not an exported symbol. I'll consider alternatives to symbol exporting and will fix it in the new revision of the patch.

-- Dmitry

Mikko Perttunen

5:57 p.m.

New subject: [PATCH 21/22] drm/tegra: Don't use IOMMU on Tegra20

On 06/01/2017 11:36 AM, Dmitry Osipenko wrote:

...

On 23.05.2017 03:16, Dmitry Osipenko wrote:

...
There is no IOMMU on Tegra20, instead a GART would be picked as an IOMMU provider.

Signed-off-by: Dmitry Osipenko digetx@gmail.com

drivers/gpu/drm/tegra/drm.c | 5 ++++- drivers/gpu/host1x/dev.c | 5 ++++- 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/tegra/drm.c b/drivers/gpu/drm/tegra/drm.c index 17416e1c219a..ac8f76d9475f 100644 --- a/drivers/gpu/drm/tegra/drm.c +++ b/drivers/gpu/drm/tegra/drm.c @@ -15,6 +15,8 @@ #include <drm/drm_atomic.h> #include <drm/drm_atomic_helper.h>

+#include <soc/tegra/fuse.h>

#include "drm.h" #include "gem.h"

@@ -131,7 +133,8 @@ static int tegra_drm_load(struct drm_device *drm, unsigned long flags) if (!tegra) return -ENOMEM;

if (iommu_present(&platform_bus_type)) {
if (iommu_present(&platform_bus_type) &&
   tegra_get_chip_id() != TEGRA20) {
u64 carveout_start, carveout_end, gem_start, gem_end; struct iommu_domain_geometry *geometry; unsigned long order;
diff --git a/drivers/gpu/host1x/dev.c b/drivers/gpu/host1x/dev.c index 5c1c711a21af..56b7e6d51894 100644 --- a/drivers/gpu/host1x/dev.c +++ b/drivers/gpu/host1x/dev.c @@ -25,6 +25,8 @@ #include <linux/of.h> #include <linux/slab.h>

+#include <soc/tegra/fuse.h>

#define CREATE_TRACE_POINTS #include <trace/events/host1x.h> #undef CREATE_TRACE_POINTS

@@ -177,7 +179,8 @@ static int host1x_probe(struct platform_device *pdev) return err; }

if (iommu_present(&platform_bus_type)) {
if (iommu_present(&platform_bus_type) &&
   tegra_get_chip_id() != TEGRA20) {
struct iommu_domain_geometry *geometry; unsigned long order;
Nicolas noticed that this breaks driver module compilation, as the `tegra_get_chip_id` is not an exported symbol. I'll consider alternatives to symbol exporting and will fix it in the new revision of the patch.

My recommendation is of_machine_is_compatible()

Cheers, Mikko

2889

Age (days ago)

2910

Last active (days ago)

dri-devel@lists.freedesktop.org

68 comments

5 participants

tags (0)

participants (5)

Dmitry Osipenko
Erik Faye-Lund
Joerg Roedel
Mikko Perttunen
Thierry Reding